ChemRxiv. Why?

In August 2016, the launch of a chemistry pre-print service, ChemRxiv, was announced. I was phoned a day or so later by a staff journalist at C&E News for my opinion. The only comment that was retained for their report was my instantaneous feeling that “the community needed a chemistry pre-print server like one needed a hole in the head”. I had been there before, you see, recollecting a pre-print server launched by the ChemWeb service around 1996 or 1997, which lasted only about two years before being withdrawn due to the low quality of the preprints. So what do I think of ChemRxiv now in 2019?

Let me set the scene first. Nowadays, many journals offer open access options, most upon payment of an APC (article processing charge). One can sometimes get a grant for this fee from institutional libraries. Mine for example has a policy that to apply for an APC, one has to deposit a “final author version” (FAV) of a manuscript in our local institutional repository (Spiral). Thus the final outcome is two versions of open access articles, one the FAV and then a version-of-record (VOR) held by the publisher. ChemRxiv can now add a third version to the process, since the expectation is that after some life as a pre-print, the manuscript can then be submitted to a peer-reviewed journal. Because the pre-print is allocated a persistent identifier (a DOI), the expectation is that the pre-print will indeed be persistent, with no expiration. Three versions of any given article are therefore now likely to be around, in effect permanently (or what passes for permanence nowadays). Importantly, there is no clear protocol for indicating how these three versions might differ, if they do. Even the FAV and the VOR may contain differences, such as corrections of errors found in galley-proofing, which will appear in the VOR but may not be propagated to the FAV. The congruence between the pre-print and any VOR is even less obvious.

All this came to a head as a result of the pre-print I noted in my previous two posts.[1] Unlike the topic of an earlier post of mine, where the VOR article[2] (not a preprint) allows readers to comment (see e.g. https://www.nature.com/articles/s41586-019-1059-9#article-comments), I have not been able to identify a mechanism to post any comment about pre-prints. After all, that did seem to me to be a primary reason for exposing a pre-print, which is to invite insights from the community, perchance to improve the science or make suggestions related to it. What I have spotted, however, is an altmetric index. Hover over that and you get social media metrics. For this pre-print[1], these put it in the top 5% of all outputs, so it is clearly attracting much interest. This interest includes (currently) 1955 views, 539 downloads and commentary via two blog posts (www.altmetric.com/details/59250193/blogs) and 40 tweets (www.altmetric.com/details/59250193/twitter). You would have to work quite hard to visit all the blog posts and read all the tweets to assess overall how the community was responding to any specific pre-print.

So what is the purpose of posting (or should I use the term publishing?) a ChemRxiv pre-print? Is it primarily to gather commentary via social media such as blog and Twitter posts and to use this to improve the final VOR based on such feedback? A colleague I discussed this with suggested that in some very competitive areas of science/chemistry, it might also serve to acquire a date-stamp for the research (part of the metadata associated with a DOI) and hence to claim priority, a stamp which would thus pre-date that obtained from VOR publication by a few months. This might be perceived as making all the difference in a competitive area in terms of gathering evidence of esteem, inclusion in grant proposals etc, especially for early career researchers. There may be other reasons which I have not thought of and comments here for these are most welcome.

I will end by noting the following project: en.wikiversity.org/wiki/WikiJournal_of_Science,[3] which is part of Wikiversity. Here, the APC is dispensed with (no publication costs, at least to the authors), a DOI is again allocated, and each article is subjected to public peer review (en.wikiversity.org/wiki/WikiJournal_of_Science/Peer_reviewers) and can also carry post-publication review comments and even direct edits in the manner of Wikipedia. The other infrastructure of the Wiki ecosystem is available, including access to Wikidata, a source of high-quality reference data.

So I think there is going to be an interesting debate about how the publication of primary research articles will evolve. Is a triad of articles (the pre-print, the FAV and the VOR) the future? Or could e.g. the WikiJournal of Science (extended perchance in the future to a WikiJournal of Chemistry?) show an interesting alternative way? Or is it all just getting too fragmented and confusing?

References

  1. K. Miyamoto, S. Narita, Y. Masumoto, T. Hashishin, M. Kimura, M. Ochiai, and M. Uchiyama, "Room-Temperature Chemical Synthesis of C2", 2019. http://dx.doi.org/10.26434/chemrxiv.8009633.v1
  2. J. Lee, K.T. Crampton, N. Tallarida, and V.A. Apkarian, "Visualizing vibrational normal modes of a single molecule with atomically confined light", Nature, vol. 568, pp. 78-82, 2019. http://dx.doi.org/10.1038/s41586-019-1059-9
  3. T. Shafee, "The aims and scope of WikiJournal of Science", WikiJournal of Science, vol. 1, p. 1, 2018. http://dx.doi.org/10.15347/wjs/2018.001
Henry Rzepa

Henry Rzepa is Emeritus Professor of Computational Chemistry at Imperial College London.

View Comments

  • So what is the purpose of posting [...] a ChemRxiv pre-print? […] There may be other reasons which I have not thought of and comments here for these are most welcome.

    Dear Prof. Rzepa,
    Recently, I posted on ChemRxiv solely to comment on an article.

    At first, I had prepared a letter to the editor, but it was rejected with the recommendation that I either write a private communication to the authors or try my luck submitting elsewhere.

    However, I needed to publicly expose contradictions between this article and a recently published article of mine. I couldn't submit a private communication but I didn't see any point in publishing my commentary in another journal.

    In the middle of my frustration, ChemRxiv appeared and in less than 3 days my comment had a DOI. I invited the authors of the article to respond to the comment and they kindly agreed. After their reply, I updated my comment (ChemRxiv uses a suffix v1, v2, etc. in the DOI to distinguish between versions) and after this exchange of points of view, we have probably reached consensus on the discrepancies.

    My comment will remain a preprint forever and will never go to any journal except where the commented article remains, but I don't care: I was able to set out my point of view and now it can be cited.

    Best regards,
    Emilio

  • Thanks for this really interesting insight into how modern discourse can (and apparently cannot) be conducted, including private vs open.

    I would be interested in the DOIs of your original article, the other article, and the DOI of your commentary.

    When you write "I invited the authors of the article to make a response to the comment and they kindly agreed", was that response made to you privately or in the open?

    Certainly if ChemRxiv facilitates this sort of scientific discourse, I would change my mind about it!


    Postscript

    I have now located the relevant articles:

    1. Original article, DOI: 10.1039/C8SE00358K

    2. Comment: 10.26434/chemrxiv.7295585.v1

    3. Response: 10.26434/chemrxiv.7649897.v1

    4. Revised comment: 10.26434/chemrxiv.7295585.v2

  • The previous comment noted that a continuing commentary on ChemRxiv can take place in the form of DOIs with version numbers, i.e. ...85.v1 and ...85.v2

    Whenever I have heard versioning discussed at conferences such as PIDapalooza, there are some who strongly deprecate it. Thus I encountered this at https://help.zenodo.org/#versioning

    Q: Why don’t the DOIs have a version number suffix like “.v1”?
    A: Including semantic information such as the version number in a DOI is bad practice, because this information may change over time, while DOIs must remain persistent and should not change.

    Zenodo's solution is to issue a new DOI for each updated item and then to link them all using what they call a Concept DOI. We have used a similar idea in our repository, calling it a Collection which has the individual items as members of the collection. But this too could get very ungainly.

    So I should note that the technical expedient that ChemRxiv have adopted to allow the kind of discourse discussed above may not turn out to be "persistent". A work in progress, I fancy.
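
The version-suffix convention discussed above can be illustrated with a minimal Python sketch. It assumes only the pattern visible in the DOIs quoted in this thread (a trailing ".vN"); the function names split_versioned_doi and same_work are hypothetical and are not part of any ChemRxiv or DOI tooling.

```python
import re

# Matches a trailing ".v<digits>" suffix, the pattern seen in the
# ChemRxiv DOIs quoted in this thread (an assumption, not a DOI standard).
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)\.v(?P<version>\d+)$")


def split_versioned_doi(doi):
    """Return (base, version) for a DOI ending in '.vN', else (doi, None)."""
    cleaned = doi.strip()
    match = _VERSION_SUFFIX.match(cleaned)
    if match is None:
        return cleaned, None
    return match.group("base"), int(match.group("version"))


def same_work(doi_a, doi_b):
    """Guess whether two DOIs are versions of one item by comparing bases."""
    base_a = split_versioned_doi(doi_a)[0].lower()
    base_b = split_versioned_doi(doi_b)[0].lower()
    return base_a == base_b
```

Note that this works only by reading meaning out of the identifier string itself, which is the practice the Zenodo answer quoted above warns against; a relation recorded in the metadata would be the more robust way to link versions.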

  • I remember following that discussion on ChemRxiv with great interest. It is an intangible benefit, but having such material available to those who can access only a fraction, or even none, of the published literature encourages them to maintain their interest in up to date research, and in turn pass the story on to others. If open access makes a small fraction of the public pro-chemistry rather than ignorant or suspicious of chemistry, that is a useful result.

  • I worked at ACS Publications from 2013-17 and helped conceive and launch ChemRxiv. We were inspired by the #ASAPbio initiative spearheaded by Ron Vale (HHMI/UCSF) and the promising growth of bioRxiv. The principal benefits are 1) the chance to share and receive feedback on research prior to publication; 2) to provide a timestamp of research; and 3) to disseminate potentially important findings many months before formal publication, analogous to a presentation of unpublished data at a conference.

    It is to ACS' great credit that it launched ChemRxiv without unanimous backing from its >50 journal editors -- the flagship JACS was not a supporter initially, but has changed policies (presumably in response to pressure from authors and EAB members).

    The spread of multiple versions of a paper may be a distraction but is not the point; other preprint servers allow authors to post revisions of their preprint. The posting of a preprint provides authors with some peace of mind as they endure the peer-review process, and in some cases have to try their luck with multiple journals.

    The altmetrics data are a frill that might provide some interesting information but weren't a consideration when the server was launched.

    -- Kevin Davies

  • Thanks Kevin for that useful perspective from the point of view of how a publisher sees things.

    I found your analysis that "The posting of a preprint provides authors with some peace of mind as they endure the peer-review process, and in some cases have to try their luck with multiple journals" perhaps the most salient point.

    That could at face value be taken as an indictment of the peer-review process itself rather than as necessarily a powerful case for a pre-print server. Your use of the word "luck" also implies that peer review is almost a random process, one perhaps driven by personal human motivations other than the quality of the science. In which case perhaps fixing peer review rather than using pre-print servers should be the real priority? What I think emerged from the ChemWeb preprint experiment >20 years ago was that very few of the posted items attracted any comment at all (whether positive or negative) and were in effect treated with a big yawn.

    As a devil's advocate, I might also suggest that if authors need to try multiple journals to get into print, that could also indicate real issues with their science rather than having unlucky reviewers? Or that their science is simply too routine? I remain worried that pre-prints may drive quality down and especially noise up. There are also many who say we all publish far too much already, and pre-prints are unlikely to address that aspect.

    Finally, preprints could be an opportunity to address the quality of the data associated with research, ideally in the form of FAIR data. I do not currently see ChemRxiv pre-prints as directly addressing this issue, related to quality and replicability.

  • I asked a colleague in life science why they favoured pre-prints so strongly. He responded that the funders loved them (indeed sometimes mandated them) because it made their funding appear to produce twice as many outputs! Authors, especially perhaps early career researchers (?), probably also appreciate this.

    I also remember that a few decades back research "fragmentation" was strongly discouraged. Whilst pre-prints are not necessarily such fragmentation, since they are not peer reviewed as such, there is nothing to prevent authors from using them to increase their outputs and, in effect, to fragment covertly.

    As alluded to in my last comment, if pre-prints were to come to be regarded as a "data rich" version of any given research output, or even just a useful pointer to FAIR data in repositories, I might support such a role. I also noted that Zenodo uses concept DOIs to link a story together and perhaps that might also be a role that pre-print servers take on?

  • In my previous comment, I noted that a link from an article was inserted by the depositor to the data on which the article was in part based. I was curious how this link might have been exposed in the metadata about the article. This metadata can be acquired using eg

    https://data.datacite.org/application/vnd.datacite.datacite+xml/10.26434/chemrxiv.8267837.v1 and what it contains (or does not contain) is quite interesting.

    1. It lists the DOI identifier for the article
    2. It lists the authors, two of which are identified by their ORCID.
    3. It gives a title and a description.

    But it does not include the reference https://doi.org/10.14469/hpc/5737, which means there is no easily automated procedure for mining the data starting from the article.

    So can I urge ChemRxiv to extend their metadata record to include such information?
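
To make the request above concrete, here is a minimal Python sketch of the kind of automated check that becomes possible once such links are exposed. It assumes the record follows the DataCite XML layout, where links to other objects appear as relatedIdentifier elements; the function names are hypothetical, and the SAMPLE record is an invented, heavily trimmed illustration of what an extended record might look like, not the actual ChemRxiv metadata (which, as noted above, lacks the link).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def related_identifiers(datacite_xml):
    """List (relationType, identifierType, value) for each relatedIdentifier."""
    root = ET.fromstring(datacite_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) == "relatedIdentifier":
            found.append((
                element.get("relationType"),
                element.get("relatedIdentifierType"),
                (element.text or "").strip(),
            ))
    return found


# Hypothetical record for illustration only: it shows how the data DOI
# could be declared, and is NOT what the server currently returns.
SAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.26434/chemrxiv.8267837.v1</identifier>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI"
        relationType="IsSupplementedBy">10.14469/hpc/5737</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""

for relation, id_type, value in related_identifiers(SAMPLE):
    print(relation, id_type, value)
```

Run against a record without any relatedIdentifier elements, the function returns an empty list, which is the situation described above: nothing for a machine to follow from the article to its data.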
