Archive for March 1st, 2022

Raw data: the evolution of FAIR data and crystallography.

Tuesday, March 1st, 2022

Scientific data in chemistry has come a long way in the last few decades. Originally entangled in scientific articles in the form of tables of numbers or diagrams, it was (partially) disentangled into supporting information when journals became electronic in the late 1990s.[1] The next phase was the introduction of data repositories in the early noughties. Now associated with innovative commercial companies such as Figshare, and later the non-commercial Zenodo, such repositories have also spread to institutional forms, such as the earlier SPECTRa project of 2006[2], and are still evolving.[3] Perhaps the best known, and certainly one of the oldest, examples of curated structural data in chemistry is the CSD (Cambridge Structural Database) of the CCDC (Cambridge Crystallographic Data Centre), which has been operating for more than 55 years, since well before the online era! Curation is the important context here, since there you will find crystal diffraction data which has been refined into a structural model, firstly by the authors reporting the structure and then by the CSD, who, amongst other operations, validate the associated data using a utility called CheckCIF.[4] What is perhaps not realised by most users of this data source is that the original or “raw” data, as obtained from an X-ray diffractometer and from which the CSD data is derived, is not actually available from the CSD. This primary form of crystallographic data is the topic of this post.


References

  1. A.M. Hunter and A.B. Smith, "Review of Supporting Information at Organic Letters", Organic Letters, vol. 17, pp. 2867-2869, 2015. http://dx.doi.org/10.1021/acs.orglett.5b01700
  2. J. Downing, P. Murray-Rust, A.P. Tonge, P. Morgan, H.S. Rzepa, F. Cotterill, N. Day, and M.J. Harvey, "SPECTRa: The Deposition and Validation of Primary Chemistry Research Data in Digital Repositories", Journal of Chemical Information and Modeling, vol. 48, pp. 1571-1581, 2008. http://dx.doi.org/10.1021/ci7004737
  3. M.J. Harvey, A. McLean, and H.S. Rzepa, "A metadata-driven approach to data repository design", Journal of Cheminformatics, vol. 9, 2017. http://dx.doi.org/10.1186/s13321-017-0190-6
  4. A.L. Spek, "Structure validation in chemical crystallography", Acta Crystallographica Section D Biological Crystallography, vol. 65, pp. 148-155, 2009. http://dx.doi.org/10.1107/s090744490804362x