In recent years, findable data has become ever more important (the F in FAIR). Here I test that F using the DataCite search service.
First, an introduction to this service. It is a metadata database describing datasets and other research objects. One of its properties is relatedIdentifier, which records other identifiers associated with the dataset. This might be the DOI of a published article associated with the data, but it could also be a pointer to a related dataset.
One can query thus:
I have performed searches 2 and 3 for some popular publishers of chemistry (the same set that was analysed here).
| Publisher | Search 2 | Search 3 |
|---|---|---|
| ACS | 210,240 | 14,213 |
| RSC | 138,147 | 1,279 |
| Elsevier | 185,351 | 56,373 |
| Nature | 12,316 | 8,104 |
| Wiley | 135,874 | 9,283 |
| Science | 3,384 | 2,343 |
These publishers all have significant numbers of datasets which at least accord with the F of FAIR. Many datasets may lack metadata pointing back to a published article, since that link can only be added once the DOI of the article exists, in other words AFTER the publication of the dataset. So these numbers are probably underestimates rather than overestimates.
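Counts like those in the table above could also be gathered programmatically. What follows is only a minimal sketch, assuming the public DataCite REST API at `api.datacite.org`, a query field named `relatedIdentifiers.relatedIdentifier`, and a hit count reported under `meta.total`. These are my assumptions and not necessarily the exact searches behind the table; the function names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public DataCite REST API endpoint (not quoted in the post).
DATACITE_DOIS = "https://api.datacite.org/dois"


def related_identifier_url(doi_prefix: str) -> str:
    """Build a query URL for records whose relatedIdentifier starts with doi_prefix."""
    params = {
        # Assumption: field name and wildcard syntax as accepted by the API.
        "query": f"relatedIdentifiers.relatedIdentifier:{doi_prefix}*",
        "page[size]": 1,  # only the total count is wanted, not the records
    }
    return f"{DATACITE_DOIS}?{urlencode(params)}"


def total_from_response(payload: dict) -> int:
    """Read the hit count, assuming it is reported under meta.total."""
    return int(payload["meta"]["total"])


def count_related(doi_prefix: str) -> int:
    """Perform the request (needs network access) and return the hit count."""
    with urlopen(related_identifier_url(doi_prefix)) as response:
        return total_from_response(json.load(response))
```

Under these assumptions, `count_related("10.1021")` would count records pointing at DOIs carrying the ACS prefix. Any number obtained this way will differ from the table as the database grows.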
How about the other way around? Rather than datasets that have a journal article as a related identifier, could we search for articles that have a dataset as a related identifier?
It will also be of interest to show how these numbers change over time. Is there an exponential increase? We shall see.
Finally, we have not really explored adherence to, e.g., the AIR of FAIR. That is for another post.
I noted above the asymmetry between pointers from data to related identifiers such as articles compared to the reverse direction of pointers from articles to data.
Ian Bruno has kindly sent me three links which highlight or start to address this issue:
It seems lots is starting to happen!