December 12, 2014

Turning Data into Information by Using Context

Posted in Content Management, Databases, Digital Libraries tagged at 5:16 pm by mknight1130

How can you search data across multiple databases when the terminology differs from one to the next? For example, suppose you type “Python” into Google. Are you a techie interested in the Python programming language, or a parent who wants to know when the python exhibit opens at the local zoo? How can a computer tell the two contexts apart? Below is a look at how this issue is being addressed.

The World Wide Web Consortium (W3C) is a community that develops standards, mainly for the web. The W3C has formed a Data Activity (http://www.w3.org/2013/data/) group to develop standards and vocabularies, in part to improve interoperability between database terms. Since the W3C Data Activity group addresses how to connect terms from different sources, I think its work also applies to your issue.

Several frameworks allow for interchange between many different databases by describing concepts and the relationships between them. I think these frameworks also allow for defining contextual constraints. They include RDF (Resource Description Framework) and RDF Schema, SKOS (Simple Knowledge Organization System), OWL (Web Ontology Language), and RIF (Rule Interchange Format).
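To make the idea concrete, here is a minimal sketch in plain Python of how describing concepts and their relationships can separate the two meanings of “Python”. It only imitates the subject-predicate-object shape of an RDF statement and borrows two SKOS property names (skos:prefLabel and skos:broader); the ex: identifiers and the helper functions are made up for illustration and are not part of any W3C tool.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple,
# mirroring the shape of an RDF statement. All ex: identifiers are hypothetical.
TRIPLES = [
    ("ex:PythonLanguage", "skos:prefLabel", "Python"),
    ("ex:PythonLanguage", "skos:broader", "ex:ProgrammingLanguage"),
    ("ex:PythonSnake", "skos:prefLabel", "Python"),
    ("ex:PythonSnake", "skos:broader", "ex:Reptile"),
    ("ex:ProgrammingLanguage", "skos:prefLabel", "Programming language"),
    ("ex:Reptile", "skos:prefLabel", "Reptile"),
]


def concepts_for_label(label):
    """Return every concept whose preferred label equals the search term."""
    return sorted(s for s, p, o in TRIPLES if p == "skos:prefLabel" and o == label)


def broader_of(concept):
    """Return the broader concepts (the context) recorded for a concept."""
    return sorted(o for s, p, o in TRIPLES if s == concept and p == "skos:broader")


def disambiguate(label, context):
    """Keep only the concepts with this label that sit under the given context."""
    return [c for c in concepts_for_label(label) if context in broader_of(c)]


if __name__ == "__main__":
    # The bare label is ambiguous: it matches two concepts.
    print(concepts_for_label("Python"))
    # Adding a context narrows the match to one concept.
    print(disambiguate("Python", "ex:Reptile"))
```

The label alone matches both concepts; it is the recorded relationship to a broader concept that lets the computer tell the zoo visitor from the programmer.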

While a standard vocabulary is a starting place, development tools are needed to make different database terminologies work together. You can search a list of helpful tools on the W3C's Tools - Semantic Web Standards site (http://www.w3.org/2001/sw/wiki/Tools), which lists over 300 tools. One technology that has had some positive press, and was introduced at an O’Reilly conference, is SPARQL (http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/), a query language for RDF data.
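As a rough illustration of what a SPARQL request looks like, the sketch below writes a small query and encodes it as an HTTP GET URL using the query parameter described in the SPARQL 1.1 Protocol linked above. The endpoint address is a placeholder I made up, and the query assumes the data source labels its concepts with SKOS; nothing is actually sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the SPARQL endpoint of a real data source.
ENDPOINT = "http://example.org/sparql"

# Ask for every concept labelled "Python" together with its broader concept.
# Assumes the data uses SKOS labels, which a given source may not.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?context WHERE {
  ?concept skos:prefLabel "Python"@en .
  ?concept skos:broader ?context .
}
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET URL using the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
```

Pasting a URL like this into a browser, with a real endpoint in place of the placeholder, is one simple way to try a query before picking a full tool from the W3C list.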

Let us say that you don’t have the time or resources to connect databases from scratch. I recommend that you look at DSpace (http://www.dspace.org/introducing or http://www.dspace.org/). The MIT Libraries have developed a digital archive of materials and, in part, have come up with a way to make connecting search vocabularies and contexts easier. You can also find services to help you get going. DuraSpace is an open technology project that provides guidance in using DSpace. If your goal is to foster research and your work is open source, DuraSpace.org (http://duraspace.org/about) may be an option. I think these folks offer free consulting in using DSpace.

For the average user who wants to make searching the web easier, Context Miner (http://contextminer.org/about.php) is worth a try. Funded by a National Science Foundation grant, Context Miner automatically crawls certain websites and collects contextual data. You set the context and the terms through a campaign, and Context Miner uses these parameters to retrieve relevant information, looking beyond the literal search terms.

As you can see, several tools can make searching and database integration easier by putting the context into the data. Also, stay tuned. The future looks exciting for making data more meaningful and searching more integrated.
