December 12, 2014

Turning Data into Information by Using Context

Posted in Content Management, Databases, Digital Libraries tagged at 5:16 pm by mknight1130

How can you search data from multiple databases when the terminology may be different? For example, you may type “Python” into Google. Are you a techie who is interested in the Python programming language, or a parent who would like to find out when the python exhibit opens at the local zoo? How can a computer tell the difference between the two contexts? Below, you will find how this issue is being addressed.

A community called the World Wide Web Consortium (W3C) develops standards, mainly for the web. The W3C has set up a Data Activity group (http://www.w3.org/2013/data/) to develop standards and vocabularies, in part to improve interoperability between database terms. Since the W3C Data Activity group addresses how to connect terms from different sources, I think this will also apply to your issue.

Several frameworks allow for interchange between many different databases by describing concepts and the relationships between those concepts. I think these schemas also allow for defining contextual constraints. The frameworks include RDF and RDF Schema, SKOS, OWL, and RIF.
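To make the idea of "concepts and relationships" concrete, here is a minimal sketch in plain Python. It is not a real RDF library. It only mimics the subject-predicate-object shape of RDF statements and borrows two SKOS property names (skos:prefLabel and skos:broader). Every identifier starting with ex: is made up for illustration.

```python
# Each fact is a (subject, predicate, object) triple, the basic shape of an
# RDF statement. The ex: identifiers are hypothetical; skos:prefLabel and
# skos:broader are property names taken from the SKOS vocabulary.
TRIPLES = [
    ("ex:PythonLanguage", "skos:prefLabel", "Python"),
    ("ex:PythonLanguage", "skos:broader", "ex:ProgrammingLanguage"),
    ("ex:PythonSnake", "skos:prefLabel", "Python"),
    ("ex:PythonSnake", "skos:broader", "ex:Reptile"),
]


def concepts_for(label, broader=None):
    """Return concepts carrying `label`, optionally limited to one context."""
    labelled = {s for s, p, o in TRIPLES if p == "skos:prefLabel" and o == label}
    if broader is None:
        return sorted(labelled)
    in_context = {s for s, p, o in TRIPLES if p == "skos:broader" and o == broader}
    return sorted(labelled & in_context)


# The bare word is ambiguous; adding a broader concept picks one meaning.
print(concepts_for("Python"))
print(concepts_for("Python", broader="ex:Reptile"))
```

The same label points to two different concepts, and the broader concept is what supplies the context: the zoo visitor and the programmer can type the same word and still be told apart.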

While a standard vocabulary is a starting place, development tools are of course needed to allow different database terminologies to work together. A list of helpful tools can be searched from the W3C Tools - Semantic Web Standards site (http://www.w3.org/2001/sw/wiki/Tools). There are over 300 tools. One technology that has had some positive press, and was introduced at an O’Reilly conference, is SPARQL (http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/).
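The SPARQL link above points to the protocol document, which describes how a query is sent to a server over HTTP. One of the options it describes is an HTTP GET with the query text URL-encoded in a parameter named query. A minimal sketch of building such a URL is below. The endpoint address is a placeholder I made up, and the code only builds the URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
ENDPOINT = "http://example.org/sparql"

# A small SELECT query asking for things and their SKOS preferred labels.
QUERY = """
SELECT ?s ?label
WHERE { ?s <http://www.w3.org/2004/02/skos/core#prefLabel> ?label }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

A real client would then fetch this URL and read the results in whatever format the server returns.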

Let us say that you don’t have the time or resources to connect databases from scratch. I recommend that you look at DSpace (http://www.dspace.org/introducing or http://www.dspace.org/). The MIT Libraries have developed a digital archive of materials. In part, the institute has come up with an easier way of connecting search vocabularies and contexts. You can also find services to help you get going. DuraSpace is an open technology project that provides guidance in using DSpace. If you need to foster research and want to stay open source, DuraSpace.org (http://duraspace.org/about) may be an option. I think these folks offer free consulting in using DSpace.

For the average user who wants to make searching the web easier, you may wish to try Context Miner (http://contextminer.org/about.php). Funded by a National Science Foundation grant, Context Miner automatically crawls some specific websites and gathers contextual data. You use Context Miner to set the context and the terms through a campaign. Context Miner uses these parameters to get you the relevant information and looks beyond the literal terms.

As you can see from the above, several tools can make your searching and database integration easier. These put the context into data. Also, stay tuned. You will find an exciting future in making data more meaningful and searching more integrated.

July 5, 2011

SPARQL: a flexible querying language.

Posted in Databases, software testing at 5:19 pm by mknight1130

As I have been using SQL to query databases, I have wondered about alternatives for querying across multiple data sets. Is there a more intuitive way to retrieve information that is not so tied down to a particular meaning of a field? Say I was looking for a new mobile phone and wished to query different vendors for the best price. Do I search for “cellphone”, “cell phone”, “mobile phone”, “mobile device”, or “phone”? All these terms can refer to a cell phone. It would be much easier if I could query one term and have results close in meaning to that term appear as well. I would then get what I meant by the search. SPARQL is a query language that has such power. SPARQL queries RDF (Resource Description Framework) data, matching patterns against a graph of related terms, so that a search can bring back results that are closer to what was meant. As I am investigating this language, I have found several resources that I would like to share.
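Here is a rough sketch of the mobile phone example in plain Python. It is not SPARQL and does not use a SPARQL engine. It only imitates the underlying idea: facts are stored as triples, a query is a pattern with blanks, and the different spellings are linked to one concept in the data. The vendors, prices, and ex: names are all invented for illustration.

```python
# Toy graph: three labels point at one concept, and two offers point at it too.
# All ex: names, vendors, and prices are hypothetical.
TRIPLES = [
    ("ex:MobilePhone", "skos:prefLabel", "mobile phone"),
    ("ex:MobilePhone", "skos:altLabel", "cell phone"),
    ("ex:MobilePhone", "skos:altLabel", "cellphone"),
    ("ex:offer1", "ex:productType", "ex:MobilePhone"),
    ("ex:offer1", "ex:vendor", "Vendor A"),
    ("ex:offer1", "ex:price", 199),
    ("ex:offer2", "ex:productType", "ex:MobilePhone"),
    ("ex:offer2", "ex:vendor", "Vendor B"),
    ("ex:offer2", "ex:price", 149),
]


def match(pattern):
    """Yield triples matching a (s, p, o) pattern; None acts as a wildcard."""
    for triple in TRIPLES:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


def offers_for(term):
    """Return (price, vendor) pairs for any label of the concept, cheapest first."""
    labels = ("skos:prefLabel", "skos:altLabel")
    concepts = {s for p in labels for s, _, _ in match((None, p, term))}
    results = []
    for concept in concepts:
        for offer, _, _ in match((None, "ex:productType", concept)):
            vendor = next(match((offer, "ex:vendor", None)))[2]
            price = next(match((offer, "ex:price", None)))[2]
            results.append((price, vendor))
    return sorted(results)


# Either spelling reaches the same offers because the data links the labels.
print(offers_for("cellphone"))
print(offers_for("mobile phone"))
```

Note that the matching of “cellphone” to “mobile phone” comes from the labels recorded in the data, not from the query mechanism guessing at meaning. A term that nobody has linked, such as “handset” here, would return nothing.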

The W3C, or World Wide Web Consortium, provides an excellent SPARQL technical document and recommendation at SPARQL Query Language for RDF (http://www.w3.org/TR/rdf-sparql-query/). The W3C is a group of web experts, pioneers, and interested contributors that develops web standards. The group is well regarded and its standards are used as a reference when developing web pages. This technical document starts by introducing SPARQL and giving some sample queries, then provides details on the syntax and constraints and defines the testing framework.

For the newbie, XML.com provides a down-to-earth introduction in Introducing SPARQL: Querying the Semantic Web (http://www.xml.com/lpt/a/2005/11/16/introducing-sparql-querying-semantic-web-tutorial.html). XML.com is a resource published by O’Reilly, a publisher of technical training materials. The article covers the main points associated with SPARQL, simple queries, and other query forms such as CONSTRUCT, DESCRIBE, and ASK. It also describes the background needed to understand SPARQL, the context, the tools, and how to use patterns.

SPARQL is receiving attention from developers. Microsoft is looking for ways to implement this robust query language. See SPARQL-DL (http://academic.research.microsoft.com/Publication/5476728/sparql-dl-implementation-experience). This language is also making its way into the clinical space; see Zynx Health Incorporated (http://www.zynxhealth.com and https://trak.baiworks.com/application/jobdescription.aspx?q=leSEDqZdwZ4gKo4Ayjbxnfq6W3IaFJTL4ysCRnjIn8wgDur%2fwJfrM72BIrQ5%2b2NeybHA4dEkh0U%3d). SPARQL shows promise for querying a wide range of information and bringing back better matches in search results. The next time I look for a mobile phone or do any complex query, I may just use SPARQL.