PIDapalooza 2018. A conference like no other!

Another occasional conference report (day 1). So why is a conference about “persistent identifiers” (PIDs) important, particularly to the chemistry domain?

The PID most familiar to most chemists is the DOI (digital object identifier). In fact there are many types of PID; some 60 have been collected by ORCID (themselves purveyors of researcher identifiers). They sometimes even go by different names; in the life sciences they tend to be known instead as accession numbers. One theme common to many (probably not all) is that they represent sources of metadata about the object being identified: further information which allows you (or a machine) to decide whether acquiring the full object is worthwhile. So, in no particular order, here are some of the things I learnt today.
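To make the "source of metadata" idea concrete, here is a minimal sketch of how a machine might ask for the metadata behind a DOI rather than the object itself. It assumes the DOI's registration agency supports content negotiation with the CSL JSON media type (Crossref and DataCite do; others may not). The DOI suffix used here is a placeholder, not a real record, and the request is only built, not sent.

```python
from urllib.request import Request

DOI_RESOLVER = "https://doi.org/"
# Media type for citation metadata in CSL JSON form.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str) -> Request:
    """Build (but do not send) a request for the metadata of a DOI."""
    # The Accept header asks the resolver for metadata instead of
    # redirecting to the landing page of the object.
    return Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


# "10.14469/example" is a hypothetical DOI used purely for illustration.
req = metadata_request("10.14469/example")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing such a request to urllib.request.urlopen with a real DOI should return a JSON record (title, authors, publisher and so on) from which a person or program can judge whether the full object is worth fetching.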

Mark Hahnel noted the recent launch of the Dimensions resource, which links research data with other research activities; I have not yet had a chance to learn its capabilities, but it seems an interesting alternative to stalwarts such as Google Scholar.
You can try this example: https://app.dimensions.ai/discover/publication?search_text=10.6084&search_type=kws&full_search=true which retrieves articles in which the data repository with prefix 10.6084 (Figshare) is cited. Try also the prefix 10.14469 which is the Imperial College repository.
Andy Mabbett talked about the deployment and use of persistent identifiers (the Q numbers) in Wikidata, which increasingly underpin the various flavours of Wikipedia. He also noted their use of some 50 different identifiers.
Johanna McEntyre noted some 5M published articles in life sciences which reference 1M+ ORCID identifiers, making life sciences easily the domain with the fastest uptake of this type of identifier. Also noted was the new FREYA project, which aims to connect open identifiers for discovery, access and use of research resources.
Tom Gillespie talked about RRIDs, or Research Resource Identifiers. These include hardware such as instruments, with around 6000 RRIDs systematized so far. They argue this area promotes both the A and I of FAIR (accessible and interoperable). Of course A and I mean many things to many people.
Several other presentations talked about the finer detail of metadata, such as sub-classifications into e.g. descriptive/admin/technical, but I did rather miss demos showing how search queries of such fine-grained metadata could be constructed.
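
The Dimensions query quoted above follows a simple pattern, so trying other repository prefixes is easy to script. The sketch below only rebuilds the URL already given in this post; the function name is my own, and the query parameters are copied from that URL rather than from any documented Dimensions API.

```python
from urllib.parse import urlencode

DIMENSIONS_SEARCH = "https://app.dimensions.ai/discover/publication"


def dimensions_prefix_url(prefix: str) -> str:
    """Return a Dimensions search URL for articles mentioning a DOI prefix."""
    # Parameter names and values are taken from the example URL in the post.
    params = {
        "search_text": prefix,
        "search_type": "kws",
        "full_search": "true",
    }
    return f"{DIMENSIONS_SEARCH}?{urlencode(params)}"


# Figshare and the Imperial College repository, as mentioned above.
for prefix in ("10.6084", "10.14469"):
    print(dimensions_prefix_url(prefix))
```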

Apart from the presentations themselves, PIDapalooza is unusual for some other activities. Thus you could go get your PIDnails done, with a selection of 8 or so tasteful logos to choose from. There will be tattoos tomorrow (this is a conference for younger people after all). I may grab a photo or two to provide evidence!

Author

Henry Rzepa

Henry Rzepa is Emeritus Professor of Computational Chemistry at Imperial College London.

Tags: Academic publishing, Andy Mabbett, Digital Object Identifier, Identifiers, Imperial College, Index, Information science, Johanna McEntyre, Knowledge, Mark Hahnel, ORCiD, Persistent identifier, Publishing, Quotation, researcher, Scholarly communication, SciCrunch, search engines, Technical communication, Technology/Internet, Tom Gillespie

This entry was posted on Tuesday, January 23rd, 2018 at 7:03 pm and is filed under Chemical IT.

One Response to “PIDapalooza 2018. A conference like no other!”

Henry Rzepa says:

February 9, 2018 at 1:40 pm

The datacite-allusers forum has been discussing the topic of “Tracking citations to datasets”. Three examples have been quoted:

https://scholar.google.com/scholar?q=10.14469 returns 72 matches. Although relatively few are false positives, there is clearly something amiss if e.g. http://search.datacite.org/ui?q=prefix:10.14469 returns 195,434 matches! That is a success rate of about 0.04% for Google!

http://europepmc.org/search?query=(REF%3A%2710.14469%27) gives 540 matches, of which visual inspection suggests a high proportion are false positives.

https://app.dimensions.ai/discover/publication?search_text=10.14469 returns 24 hits, which are publications citing the prefix 10.14469 (a data repository). Again, I have gone through these 24, and 7 are false positives. We are also aware of around 10 false negatives.

So there is some way to go yet before connecting data with articles becomes reliable.
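
For what it is worth, the counts above can be turned into the usual precision and recall figures. This is only a rough sketch: the false-negative count is "around 10", so the recall is approximate, and the Google figure simply reproduces the ratio of 72 matches to 195,434 DataCite records.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of returned hits that are genuine."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of genuine items that were actually returned."""
    return true_pos / (true_pos + false_neg)


# Dimensions counts quoted above.
hits = 24
false_positives = 7
false_negatives = 10  # "around 10", so treat the recall as approximate
true_positives = hits - false_positives

dimensions_precision = precision(true_positives, false_positives)
dimensions_recall = recall(true_positives, false_negatives)

# Google Scholar matches relative to DataCite records for the same prefix.
google_ratio = 72 / 195_434

print(f"Dimensions precision: {dimensions_precision:.2f}")
print(f"Dimensions recall (approx.): {dimensions_recall:.2f}")
print(f"Google Scholar / DataCite: {google_ratio:.2%}")
```

On these numbers, roughly seven in ten of the Dimensions hits are genuine, and it finds something like six in ten of the citations we know about.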

Henry Rzepa's Blog