Archive for the ‘Chemical IT’ Category

A cascading tutorial in finding rich NMR data using the Datacite datasearch engine.

Saturday, April 11th, 2020

In the previous post, I introduced three of a new generation of search engines specialising in the discovery of data. Data has some special features which make its properties slightly different from the conceptual (or natural language) searches we are used to performing for general information and so a search engine specifically for data is invariably going to reflect this. At the simplest level, the data search can retain much of the generic simplicity of a regular search, but to exploit the unique features of data, one really does have to move on to an advanced mode. Here, by introducing a set of search definitions that gradually increase in specificity and power, I hope to convey some of the flavour of one way in which this could be done.


New generations of globally aggregating search engines – for (chemical) data.

Tuesday, April 7th, 2020

Chemists have long been familiar with search engines that aspire to index a large proportion of the chemical literature. Think, for example, of the old-generation (and commercial) SciFinder (Scholar) and Reaxys, or those that arrived in the 1990s in the online era, such as the non-commercial Pubchem or ChemSpider (there are more). But you may not be as familiar with the latest generation of global search engines, and here I will focus on three relatively new ones that specialise in tracking down data rather than just publications.


The Persistent Identifier ecosystem expands – to instruments!

Saturday, March 21st, 2020

A PID, or persistent identifier, has been in common use in scientific publishing for around 20 years now. It was introduced as a DOI (Digital Object Identifier), and the digital object in this case was the journal article. From 2000 onwards, DOIs started appearing for most journal articles, journals having obtained them from a registration agency, CrossRef. This is a not-for-profit organisation set up by a publishers' association for the purpose. Most readers of journal articles started to use this DOI as an easier way of navigating through the invariably different and sometimes confusing metaphors set up by any given journal for moving through its issues. Readers slowly learnt to prepend a resolver URL to the DOI to “resolve” it directly to what is known as the “landing page” of the article. More recently, the prefix recommendation has changed to a slightly shorter form. Few readers are aware, however, that the DOI can serve a much more interesting purpose than just taking you to the article landing page. This post will explore a few of these extras.
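As a small illustration of the "prepend and resolve" step described above, here is a minimal Python sketch. It assumes the resolver prefix https://doi.org/ (the excerpt does not spell the prefix out), and the function name doi_to_url is purely hypothetical. The DOI used in the example, 10.1000/182, is the one assigned to the DOI Handbook.

```python
from urllib.parse import quote

# Assumed resolver prefix; the excerpt above does not spell it out.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a URL that a browser can resolve (hypothetical helper)."""
    doi = doi.strip()
    # Tolerate the common "doi:" label that often precedes an identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are unsafe in a URL path,
    # but keep the "/" that separates the DOI prefix from its suffix.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.1000/182"))  # https://doi.org/10.1000/182
```

Following the resulting URL in a browser should redirect to the landing page registered for that DOI.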


A Non-nitrogen Containing Morpholine Isostere; an application of FAIR data principles.

Sunday, August 4th, 2019

The blog In the Pipeline reports on an intriguing new ring system acting as an isostere for morpholine. I was interested in how the conformation of this ring system might be rationalised electronically and so I delved into the article.[1] Here I recount what I found.



  1. H. Hobbs, G. Bravi, I. Campbell, M. Convery, H. Davies, G. Inglis, S. Pal, S. Peace, J. Redmond, and D. Summers, "Discovery of 3-Oxabicyclo[4.1.0]heptane, a Non-nitrogen Containing Morpholine Isostere, and Its Application in Novel Inhibitors of the PI3K-AKT-mTOR Pathway", Journal of Medicinal Chemistry, vol. 62, pp. 6972-6984, 2019.

Metadata. Why?

Tuesday, July 2nd, 2019

I have had some interesting discussions recently regarding metadata. What emerges is that it can be quite a broadly defined concept and it is clear that a variety of answers might be obtained when asking the simple question “what is it useful for?” Here I set out some of my answers to that question.


A search of some major chemistry publishers for FAIR data records.

Friday, April 12th, 2019

In recent years, findable data has become ever more important (the F in FAIR). Here I test that F using the DataCite search service.


Impossible molecules.

Monday, April 1st, 2019

Members of the chemical FAIR data community have just met in Orlando (with help from the NSF, the American National Science Foundation) to discuss how such data is progressing in chemistry. There are a lot of themes converging at the moment. Thus this article[1] extols the virtues of having raw NMR data available in natural product research, to which we added that such raw data should also be made FAIR (Findable, Accessible, Interoperable and Reusable) by virtue of adding rich metadata and then properly registering it so that it can be searched. These themes are combined in another article which made a recent appearance.[2]



  1. J.B. McAlpine, S. Chen, A. Kutateladze, J.B. MacMillan, G. Appendino, A. Barison, M.A. Beniddir, M.W. Biavatti, S. Bluml, A. Boufridi, M.S. Butler, R.J. Capon, Y.H. Choi, D. Coppage, P. Crews, M.T. Crimmins, M. Csete, P. Dewapriya, J.M. Egan, M.J. Garson, G. Genta-Jouve, W.H. Gerwick, H. Gross, M.K. Harper, P. Hermanto, J.M. Hook, L. Hunter, D. Jeannerat, N. Ji, T.A. Johnson, D.G.I. Kingston, H. Koshino, H. Lee, G. Lewin, J. Li, R.G. Linington, M. Liu, K.L. McPhail, T.F. Molinski, B.S. Moore, J. Nam, R.P. Neupane, M. Niemitz, J. Nuzillard, N.H. Oberlies, F.M.M. Ocampos, G. Pan, R.J. Quinn, D.S. Reddy, J. Renault, J. Rivera-Chávez, W. Robien, C.M. Saunders, T.J. Schmidt, C. Seger, B. Shen, C. Steinbeck, H. Stuppner, S. Sturm, O. Taglialatela-Scafati, D.J. Tantillo, R. Verpoorte, B. Wang, C.M. Williams, P.G. Williams, J. Wist, J. Yue, C. Zhang, Z. Xu, C. Simmler, D.C. Lankin, J. Bisson, and G.F. Pauli, "The value of universally available raw NMR data for transparency, reproducibility, and integrity in natural product research", Natural Product Reports, vol. 36, pp. 35-107, 2019.
  2. A. Barba, S. Dominguez, C. Cobas, D.P. Martinsen, C. Romain, H.S. Rzepa, and F. Seoane, "Workflows Allowing Creation of Journal Article Supporting Information and Findable, Accessible, Interoperable, and Reusable (FAIR)-Enabled Publication of Spectroscopic Data", ACS Omega, vol. 4, pp. 3280-3286, 2019.

Free energy relationships and their linearity: a test example.

Sunday, January 13th, 2019

Linear free energy relationships (LFER) are associated with the dawn of physical organic chemistry in the late 1930s and its objectives in understanding chemical reactivity as measured by reaction rates and equilibria.