5 Ways Preprint Servers Could Improve
Let's explore how a better preprint server might be built, and why that matters
I’ve been critical of preprint servers. I worry they are being started and funded without regard for larger issues of information legitimacy and public responsibility. I think the model for preprint servers hasn’t been thought through. I don’t get the oxymoron of “unpublished preprints” that are publicly available. I don’t understand how a preprint can be abandoned after 2-3 years, yet remain up on a preprint server as if it’s a legitimate undertaking. Assuming we’ll be feeding published papers into AI systems that may govern our lives to an increasing degree, preprint servers seem lacking in foresight, as well.
So, in the spirit of spotting a problem and proposing a solution, here are a few ways I would improve current approaches to preprint servers:
- Open preprint servers only to users who know the relevant topics and are far closer to peers than drive-by readers.
- Remove preprints if a peer-reviewed paper is not published in a journal within 2-3 years.
- Create a sustainable business that doesn’t depend upon patronage.
- Stop giving preprints DOIs.
- Create a uniform, voluntary pre-posting review standard across all preprint servers, so screening practices are consistent and equivalent across fields.
Given these improvements, what might a better preprint server look like?
First, it wouldn’t be open to anyone. Preprints never were before, so why should they be now? Instead, a preprint could be viewed only by people with some qualification to evaluate papers in the relevant subject areas.
Why? This would be more in the spirit of preprints. Preprints are shared so authors can gain early feedback on preliminary papers from trusted colleagues. Sure, the Internet amplifies everything, so sharing a preprint with qualified strangers or relevant sixth-degree associates would remain an amplification, but one that retains the spirit. Sharing preprints with Google, Facebook, and the public takes us into new and uncharted waters, ones not in keeping with the spirit of preprints, not monitored, and potentially risky in an age when information is easily weaponized against audiences.
How could this be accomplished? Use ORCID as a sign-in protocol for registration, process the papers of the registrant, map them to a subject area, and if there’s a match, the person can access and comment on preprints in those areas. This is trivial computer programming, and the data exist. No ORCID? Try Google Scholar, and use the same process.
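To show how little machinery this requires, here is a minimal sketch in Python. The ORCID public works endpoint is real; the keyword table is a hypothetical stand-in for a proper subject classifier, and a Google Scholar fallback would follow the same fetch-map-match pattern.

```python
import requests

ORCID_API = "https://pub.orcid.org/v3.0"  # ORCID's public, read-only API

# Hypothetical keyword table: a stand-in for a real subject-classification
# service that maps a publication record to subject areas.
SUBJECT_KEYWORDS = {
    "oncology": {"tumor", "carcinoma", "oncology", "chemotherapy"},
    "machine-learning": {"neural", "classifier", "transformer", "learning"},
}

def fetch_work_titles(orcid_id: str) -> list[str]:
    """Pull the titles of a registrant's works from the ORCID public API."""
    resp = requests.get(f"{ORCID_API}/{orcid_id}/works",
                        headers={"Accept": "application/json"}, timeout=10)
    resp.raise_for_status()
    titles = []
    for group in resp.json().get("group", []):
        for summary in group.get("work-summary", []):
            value = ((summary.get("title") or {}).get("title") or {}).get("value")
            if value:
                titles.append(value)
    return titles

def qualified_subjects(orcid_id: str) -> set[str]:
    """Map the registrant's publication record onto subject areas."""
    words = {w.lower().strip(".,:;()") for title in fetch_work_titles(orcid_id)
             for w in title.split()}
    return {subj for subj, kws in SUBJECT_KEYWORDS.items() if words & kws}

def may_access(orcid_id: str, preprint_subject: str) -> bool:
    """Grant viewing and commenting rights only where the record matches."""
    return preprint_subject in qualified_subjects(orcid_id)
```

A production version would swap the keyword table for a real classifier, but the gate itself is a few dozen lines against data that already exist.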
What about patients or non-scholarly experts — EMTs, accountants, non-traditional scholars, or end-of-life caretakers? Simply leave room for some case-by-case situations to be handled by a human, and let people register with an email address and explain why they want to see the preprint.
In total, these approaches would mean that most users of a preprint server would be peers to the extent that they would be familiar with peer review and have benefited from it at some point. A few would be let in based on other valid criteria.
But that doesn’t make preprints “open,” some would say. Exactly. Why should they be? That presumption needs to be tested. Presuming that “free” and “open” are better has not served us well. Curation, qualification, and care work better, and are growing in importance. I think the “open” assertion around preprint servers needs to justify itself at this point, and not the other way around.
Second, preprints would expire. After 2-3 years, if there is no record of a preprint author successfully publishing a final version of the paper in a peer-reviewed journal, the preprint would be suppressed or removed from the preprint server.
Why? Because it’s apparently been abandoned, the thread has been lost, or there’s some other problem with it. The authors aren’t seeking feedback anymore, anyhow. It’s either in a pre-publication state, or they’ve moved on in some way.
How could this be accomplished? Again, it’s simple — set a timer, and then remove the paper from search and display layers once the time is up.
I also think a preprint should be removed once the peer-reviewed paper is published. Leaving a preprint up wastes time and attention, and has virtually no intrinsic value.
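Both removal rules reduce to a single predicate run on a schedule. A minimal sketch, assuming each preprint record carries a posting timestamp and a nullable DOI for its final published version (hypothetical field names):

```python
from datetime import datetime, timedelta, timezone

EXPIRY = timedelta(days=3 * 365)  # upper end of the 2-3 year window

def should_suppress(posted_on: datetime,
                    published_doi: str | None,
                    now: datetime | None = None) -> bool:
    """True when a preprint should leave the search and display layers:
    either its final version has been published, or the expiry window
    has lapsed with no peer-reviewed version on record.
    `posted_on` is expected to be timezone-aware."""
    now = now or datetime.now(timezone.utc)
    if published_doi is not None:    # final version exists: retire the draft
        return True
    return now - posted_on > EXPIRY  # apparently abandoned: the timer is up
```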
Third, a viable business for preprint servers would be created and promoted, one that doesn’t depend on patronage of some sort, whether from the publishers of the best preprints (ChemRxiv) or funders who may or may not have the best interests of humanity in mind (bioRxiv).
As it is now, arXiv, the granddaddy of preprint servers, is barely able to hold on despite robust support from Cornell University, which I’ve projected will be supporting arXiv well into six figures in a few years. Others have whispers of business models, but these are either not promoted or amount to returning for more patronage.
If preprints actually serve a valuable purpose for authors and researchers, a bilateral business model seems entirely feasible — authors pay a submission charge, and researchers pay a modest annual subscription price for alerts and access. Without this kind of self-service business model, the value proposition isn’t clear. If the advocates of preprint servers are correct, there is a business here. That’s a good thing. Businesses respond to user needs, and adapt to maintain value and loyalty. Without a business model, not much happens. Look at how little arXiv has changed over its long history. An actual business could be operationalized so that those being served have a large say over its future.
Fourth, preprint servers could stop giving preprints DOIs. URLs are sufficient for the transitory nature of preprints, which I’d argue should be removed once a peer-reviewed publication exists or the preprint is abandoned or expires (after, say, 2-3 years of inactivity in the peer-reviewed literature).
The problem with granting a DOI to a preliminary publication is that doing so creates what looks like a citable, DOI-rich version of a paper. URLs can still be cited in grant applications, but without a DOI there is less chance of a draft being mistaken for something that has passed peer review and editorial review. We should reduce the chance that the public finds these papers and believes that, because they have valid DOIs, they are somehow vetted and approved. URLs work, and don’t convey a message of permanence. Why would you want a draft to be a permanent entry in the literature? Contradictions like this around preprint servers keep showing up. Maybe it’s time to resolve them.
Fifth, the way preprint servers keep edging toward full-on peer review could be better addressed with a standard approach. The boundaries of preprint servers have been explored, and it’s time to condense that knowledge into a voluntary standard.
Currently, those behind preprint servers dispute the idea that editorial review or peer review is going on, but it is happening in various ways — a medRxiv preprint has undergone far more review than an arXiv or ChemRxiv preprint, while bioRxiv is somewhere in the middle. Certainly, these positions are defensible in some ways, or they wouldn’t have been established, but they are confusing for users who cross fields, something we all want to see happen more often. Creating a uniform pre-posting process — checking for libel, slander, and perhaps plagiarism — and leaving it at that, for instance, might clarify what a preprint is and what it means for users.
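As a sketch of what that uniform gate could look like, here are the three checks named above expressed as one shared report format, with placeholder predicates standing in for whatever detection services a given server already runs:

```python
from dataclasses import dataclass

# Placeholder predicates: in practice, each would call a moderation service,
# a similarity checker, or a human screener. They are stubs here.
def flags_libel(text: str) -> bool:
    return False

def flags_slander(text: str) -> bool:
    return False

def flags_plagiarism(text: str) -> bool:
    return False

@dataclass
class ScreeningReport:
    """One uniform pre-posting report, identical across servers and fields."""
    libel: bool
    slander: bool
    plagiarism: bool

    @property
    def may_post(self) -> bool:
        return not (self.libel or self.slander or self.plagiarism)

def screen(manuscript_text: str) -> ScreeningReport:
    return ScreeningReport(flags_libel(manuscript_text),
                           flags_slander(manuscript_text),
                           flags_plagiarism(manuscript_text))
```

The point is not the stubs but the shared shape: if every server emitted the same report, a reader crossing fields would know exactly what screening a preprint had received.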
Summing up, preprints are intended to be disposable, preliminary drafts of final papers, promulgated so the authors can get feedback from people qualified in some way — professional experience, relevant education, similar research interests — to provide it. Broadcasting preprints into search engines, social media sites, and open web resources doesn’t seem to fit the intent or spirit of preprints, nor does avoiding the proposition that preprint server users — authors and reviewers — should determine the value and path forward via a supportive business model.
Blending these five improvements would make preprint servers more functional for authors, as authors would be sharing their preprints with qualified parties devoted to improving papers in relevant areas; would eliminate potential confusion over what is a published paper and what is preliminary, especially for the public but for others as well (students, grant-making bodies); and would size preprint servers appropriately based on market needs and actual utility, because direct revenues have a way of doing that.
Preprint servers are currently preliminary themselves — as nearly everything in life is, since nothing is final or perfect. The current rough draft of preprint servers can get better, as peer reviewers are finding problems and proposing improvements. Will preprint server organizations respond to helpful review? Or is a preliminary version enough for them, too?