A few years ago, we published an article that drew a formal analogy between chemistry and iTunes (sic). iTunes was the first really large commercial digital music library, and one under-the-skin feature was its use of meta-data to aid the discoverability of any of the 10 million or so (26M in 2013) individual items in the store.‡ The analogy to digital chemistry, and to the discoverability of the 70 million or so known molecules, is, we argued, a good one.
The digital photography revolution is very similar; I just checked my personal digital photo library and found that it contains almost 14,000 photos dating back ten years. It is not easy to find a particular photograph! The reason I am posting here is to bring to your attention the first 6 minutes or so of an item in the BBC collection.† It is a very nice, accessible explanation of the importance of meta-data for photography, and of some of the innovative things being done both to acquire and to manipulate this data. As I listened, my thought was: for photograph, think molecule! And think of all the innovative things that could be done there as well.
Actually, you might reasonably ask how (or whether) molecular metadata is deployed here on this blog. It certainly is on Steve Bachrach’s site (see for example this recent post, where you will find an InChI key for every molecule displayed; thus InChIKey=GOOHAUXETOMSMM-GSVOUGTGSA-N). I don’t do that on this blog (perhaps I should); instead I provide URL links to a digital repository where the keys are displayed. Thus follow http://dx.doi.org/10.6084/m9.figshare.706756 and you will find InChIKey=USGIFUSOUDIDJL-UHFFFAOYSA-N, which can be used as a search term to find any other instances of the same molecule at the site.
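To make concrete why an InChIKey works as a search term, here is a minimal Python sketch. It assumes the standard 27-character InChIKey layout (14 letters, then 10, then 1, separated by hyphens) and uses only the two keys quoted above. The names `parse_inchikey` and `find_instances`, and the record titles, are hypothetical and purely illustrative; this is not how figshare or any other repository actually implements its search.

```python
import re
from typing import NamedTuple

# Assumed layout of a standard InChIKey (27 characters):
#   14 letters: hash of the molecular skeleton (connectivity)
#    8 letters: hash of the remaining layers (stereochemistry, isotopes, ...)
#    1 letter : 'S' for a standard InChI, 'N' for a non-standard one
#    1 letter : InChI version ('A' for version 1)
#    1 letter : protonation indicator ('N' for neutral)
INCHIKEY_RE = re.compile(
    r"(?:InChIKey=)?([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])"
)


class InChIKeyParts(NamedTuple):
    skeleton: str
    other_layers: str
    is_standard: bool
    version: str
    protonation: str


def parse_inchikey(text: str) -> InChIKeyParts:
    """Split an InChIKey (with or without the 'InChIKey=' prefix) into its parts."""
    match = INCHIKEY_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {text!r}")
    skeleton, other_layers, flag, version, protonation = match.groups()
    return InChIKeyParts(skeleton, other_layers, flag == "S", version, protonation)


def find_instances(query: str, records: dict[str, str]) -> list[str]:
    """Return the titles of records whose key is identical to the query key."""
    wanted = parse_inchikey(query)
    return [title for title, key in records.items() if parse_inchikey(key) == wanted]


# Hypothetical record titles; the keys are the two quoted in this post.
records = {
    "molecule from Steve Bachrach's post": "InChIKey=GOOHAUXETOMSMM-GSVOUGTGSA-N",
    "figshare deposition": "InChIKey=USGIFUSOUDIDJL-UHFFFAOYSA-N",
}

print(parse_inchikey("USGIFUSOUDIDJL-UHFFFAOYSA-N"))
print(find_instances("USGIFUSOUDIDJL-UHFFFAOYSA-N", records))
```

Because the key is a fixed-length hash, matching it is a plain string comparison, which is what makes it convenient as meta-data: any site that records the key alongside a molecule can be searched without understanding any chemistry.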
‡ Historical note: In 1997, we produced a CD-ROM containing the proceedings of the Electronic Conference on Trends in Heterocyclic Chemistry (ECHET96), H. S. Rzepa, J. Snyder and C. Leach (Eds), ISBN 0-85404-894-4. Because it was entirely digital, we were able to include an “app” which created a visual navigation point derived from analysing the meta-data present (the entire contents had been expressed in HTML, so it was relatively easy to gather this meta-data). The software we used was called Hotsauce and was based on MCF (Meta Content Framework), developed by Apple engineer Ramanathan V. Guha for an internal experiment (we sometimes forget that in those days Apple was the Google of its day!). Guha left Apple and joined Netscape, and MCF became RDF. The rest, as they say, is history. But you can see an early deployment on the CD-ROM I refer to above (these are NOT yet collectors' items. Hint!).
† This being the BBC iPlayer collection, it is quite possible that it is not accessible outside the UK; indeed, even within the UK it may be available for only 8 days after broadcast. Which would be a shame.
- O. Casher and H. S. Rzepa, "SemanticEye: A Semantic Web Application to Rationalize and Enhance Chemical Electronic Publishing", Journal of Chemical Information and Modeling, vol. 46, pp. 2396-2411, 2006. http://dx.doi.org/10.1021/ci060139e