Henry Rzepa's Blog

First, Open Access, then Open (and FAIR) Data, now Open Citations.

The topic of open citations was presented at the PIDapalooza conference and represents a third component in the increasing corpus of open scientific information.

David Shotton gave us an update on Citations as First Class data objects – Citation Identifiers and introduced me to the blog where he discusses this topic. The citation list or bibliography has long been regarded as an essential, and until recently inseparable, component at the end of a scientific article. It is also a component easily susceptible to “game play”. Authors can be tempted to cite themselves, possibly to excess, and perhaps worse, to cite their friends and colleagues for other than purely scientific reasons. There are other issues. To infer the context of any particular citation, one has to read the text where it is cited, and this too can be subjected to game play. One may have to “read between the lines” to judge whether the work is being cited favourably, as supporting the case being made, or instead to indicate disagreement with the cited authors. An article that is cited because one disagrees with its conclusions may still go on to contribute to the cited author’s “h-index” of esteem. So there are various aspects of citations that deserve improvement, or certainly development and evolution.
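
To make the h-index point concrete, here is a minimal sketch in Python of the usual definition (the largest h such that h of an author's papers each have at least h citations). The citation counts are invented purely for illustration; the point is only that the calculation counts every citation equally, whatever the citing author's intent.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers each have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
before = [10, 8, 5, 3, 3]
# The same author after one further citation to the fourth paper,
# made by someone who disagrees with its conclusions.
after = [10, 8, 5, 4, 3]

print(h_index(before), h_index(after))
```

In this invented case the single dissenting citation is enough to raise the index by one, since nothing in the count records why the paper was cited.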

Shotton told us that many publishers are now releasing article citations as open (CC0) data in their own right, as the Initiative for Open Citations site urges them to do. A corpus of some 13 million of these is now available as RDF triples with a SPARQL end-point. The latter means that semantic searches of the corpus can be undertaken. So what are the benefits? Worthy aspirations such as exploring connections between knowledge fields and following the evolution of ideas and scholarly disciplines (similar in fact to the new Dimensions product I discussed in the previous post). When I probed into the various sites linked above, I had in mind to identify some clear scientific outcomes of making citations available in this manner, perchance even in the field of chemistry. When I succeed I will follow up on this post, but at the moment I am not yet in a position to illustrate these benefits with chemical stories. If anyone reading this post has such stories, please let us know!
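
As a rough sketch of what a semantic search against such an end-point might look like, the Python below builds a SPARQL query asking which works cite a given article, and the URL that would submit it. The `cito:cites` property comes from the CiTO citation ontology; the end-point address and the article IRI are assumptions made for illustration and should be checked against the OpenCitations documentation before use. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Assumed end-point address; check the OpenCitations site for the current one.
ENDPOINT = "http://opencitations.net/sparql"


def citing_works_query(cited_iri, limit=10):
    """Build a SPARQL query for the works that cite the resource at cited_iri."""
    return (
        "PREFIX cito: <http://purl.org/spar/cito/>\n"
        f"SELECT ?citing WHERE {{ ?citing cito:cites <{cited_iri}> }} "
        f"LIMIT {limit}"
    )


def request_url(query):
    """Encode the query as a GET request URL, per the SPARQL protocol."""
    return ENDPOINT + "?" + urlencode({"query": query})


# Hypothetical IRI standing in for a real article record in the corpus.
query = citing_works_query("https://example.org/article/1", limit=5)
print(query)
print(request_url(query))
```

Fetching the resulting URL, with an `Accept` header asking for JSON results, would be the next step; chaining such queries is how one might begin to trace connections between fields.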

I will conclude here by noting much discussion at universities about the future of the scientific article itself: whether it should be increasingly mandated as GOLD Open Access (made so by payment of an article processing charge, or APC, by its authors), or whether journals should retain the hybrid publishing model, where only a proportion of articles are GOLD and the remainder are paid for by subscription fees for licensing access to the non-GOLD articles in the journal. Meanwhile, in what sometimes seems a separate conversation, the article itself is being disassembled into components such as open and/or FAIR data, open citations, infographics, social media and yes, even blogs. Are these two evolutions headed in different directions? Certainly, I think the future is not what it used to be!

https://orcid.org/0000-0002-8635-8390

Henry Rzepa

Henry Rzepa is Emeritus Professor of Computational Chemistry at Imperial College London.


David Shotton says:

February 5, 2018 at 9:36 am

Hi Henry! Thanks for the heads up on my presentation, now online at https://figshare.com/articles/Shotton_CitationIDs_PIDapalooza_24-01-18_REVISED_pptx/5844972. Without careful reading, the opening of paragraph two of your post could be misinterpreted as conflating the Initiative for Open Citations (https://i4oc.org/), which is a group campaigning for publishers to open their references at Crossref (but which does not hold or publish data), and the OpenCitations Corpus (http://opencitations.net/), of which I am co-director, which publishes citation data in RDF. Your readers should see https://opencitations.wordpress.com/2018/01/29/opencitations-and-the-initiative-for-open-citations-a-clarification/ for more details of that distinction.

Henry Rzepa says:

February 5, 2018 at 9:44 am

Thanks for making clear the distinction David!

Can you point to any scientific use-cases in which the (semantic) analysis of citations has led to the discovery of new connections between knowledge fields, or perhaps revealed the evolution of ideas, either within or across subject domains?
