Archive for March 28th, 2022

Raw data and the evolution of crystallographic FAIR data. Journals, processed and raw structure data.

Monday, March 28th, 2022

In my previous post on the topic, I introduced the concept that data can come in several forms, most commonly as “raw” or primary data and as a “processed” version of this data that has added value. In crystallography, the chemist is interested in this processed version, carried by a CIF file. However, on the rare occasions when a query arises about the processed component, this can, in principle at least, be resolved by taking a look at the original raw data, expressed as diffraction images. I established, with much appreciated help from CCDC, that since 2016 around 65 datasets in the CSD (Cambridge Structural Database) have appeared with such associated raw data. The problem is how to reconcile the two sets of data easily (the raw data is not stored in the CSD), and one way of doing this is via the metadata associated with the datasets. In turn, if this metadata is suitably registered, one can query the metadata store for such associations, as was illustrated in the previous post on the topic. Here I explore the metadata records for five of these 65 sets to find out their properties, selected to illustrate the five data repositories that thus far host such data for compounds in the CSD.
