Reviewing the Peer Review

24 Jul

Update: Current version is posted here.

Science is a process. And for a good deal of time, peer review has been an essential part of that process. Looked at independently, by people with no experience of it, it makes a fair bit of sense. For there is only one well-known way of increasing the quality of an academic paper: additional independent thinking. And who better to provide it than engaged, trained colleagues?

But this seemingly sound part of the process is creaking. Today, you can’t bring two academics together without them venting their frustration about the broken review system. The plaint is that the current system is a lose-lose-lose. All the parties (the authors, the editors, and the reviewers) lose lots and lots of time. And the change in quality as a result of the suggested revisions is variable, generally small, and sometimes negative. Given how central peer review is to scientific production, it deserves closer attention, preferably with good data.

But data on peer review aren’t available to be analyzed. Thus, some anecdotal data. Of the 80 or so reviews that I have filed and for which editors have been kind enough to share the other reviewers’ comments, two things have jumped out at me: a) hefty variation in the quality of reviews, and b) equally hefty variation in recommendations for the final disposition. It would be good to quantify both. The latter is easy enough to quantify.
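One way to do so, assuming journals were to release per-manuscript reviewer recommendations, is a chance-corrected agreement statistic such as Fleiss’ kappa. A minimal sketch in Python; the recommendations below are invented purely for illustration.

```python
from collections import Counter

# Hypothetical recommendations from three reviewers per manuscript.
# These data are invented purely for illustration.
recommendations = [
    ["reject", "major", "reject"],
    ["minor", "major", "major"],
    ["accept", "minor", "reject"],
    ["major", "major", "major"],
    ["reject", "reject", "major"],
]
categories = ["accept", "minor", "major", "reject"]

def fleiss_kappa(recs, cats):
    """Chance-corrected agreement for a fixed number of raters per subject."""
    n_subjects = len(recs)
    n_raters = len(recs[0])
    # counts[i][j]: number of reviewers placing manuscript i in category j
    counts = [[Counter(r)[c] for c in cats] for r in recs]
    # Observed agreement per manuscript, then averaged
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance, from the marginal category proportions
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(cats))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

print(round(fleiss_kappa(recommendations, categories), 3))
```

A kappa near zero would mean reviewers agree on dispositions little more than chance. Journals could report the statistic each year without revealing anything about individual manuscripts or reviewers.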

Reliability of the review process has implications for how many reviewers we need to reliably accept or reject the same article. Counter-intuitively, increasing the number of reviewers per manuscript may not increase the overall burden of reviewing. Partly because everyone knows that the review process is so noisy, there is an incentive to submit articles that people know aren’t good enough; some submitters likely reason that there is a reasonable chance of a ‘low quality’ article being accepted at a top journal. Thus, low-reliability peer review may actually increase the number of submissions. A greater number of submissions, in turn, increases editors’ and reviewers’ load, which reduces the quality of reviews and lowers the reliability of recommendations still further. It is a vicious cycle. And the answer may be as simple as making the peer review process more reliable: if more reviewers per manuscript mean more reliable decisions, and more reliable decisions discourage lottery submissions, the total number of reviews may not grow at all. The small simulation below illustrates the first step of that argument. At any rate, these data ought to be publicly released. Alongside, editors should consider experimenting with the number of reviewers to collect more data on the point.
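To make the reliability point concrete, here is a toy simulation; every number in it is made up. Assume each reviewer independently gets the “right” call on a manuscript with some fixed probability, and the editor follows the majority.

```python
import random

random.seed(42)

def majority_correct_rate(p_correct, n_reviewers, n_manuscripts=100_000):
    """Share of simulated manuscripts on which a majority of reviewers reaches
    the 'right' call, assuming each reviewer is independently correct with
    probability p_correct."""
    correct = 0
    for _ in range(n_manuscripts):
        right_votes = sum(random.random() < p_correct for _ in range(n_reviewers))
        if right_votes > n_reviewers / 2:
            correct += 1
    return correct / n_manuscripts

# Assume each reviewer gets the call right 70% of the time (a made-up figure),
# and compare odd-sized panels so majorities are always decisive.
for panel in (1, 3, 5, 7):
    print(panel, round(majority_correct_rate(0.7, panel), 3))
```

Under these invented numbers, a lone reviewer gets it right about 70% of the time, while a five-person panel voting by majority gets it right roughly 84% of the time. Whether the resulting drop in lottery submissions offsets the extra reviews is an empirical question, which is precisely why the data should be released.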

Quantifying the quality of reviews is a much harder problem. What do we mean by a good review? A review that points to important problems in the manuscript and, where possible, suggests solutions? Likely so. But that is much trickier to code. And perhaps there isn’t as much of a point to quantifying it. What is needed, perhaps, is guidance. Much like child-rearing, there is no manual for reviewing. There really should be. What should reviewers attend to? What are they missing? And most critically, how do we incentivize the process?

When thinking about incentives, there are three parties whose incentives we need to restructure: the author, the editor, and the reviewer. Authors’ incentives can be restructured by making the process less noisy, as discussed above, and by making submissions costly. All editors know this: electronic submission has greatly increased the number of submissions. (It would be useful to study what the consequence of the move to electronic submission has been for the quality of submitted articles.) As for the editors: if they are not blinded to the author’s identity (and the author knows this), they are likely to factor in the author’s status in choosing reviewers, in deciding whether to defer to the reviewers’ recommendations, and in making the final call. Thus we need triple-blind pipelines.

Whether or not the reviewer’s identity is known to the editor reading the comments also likely affects reviewers’ contributions, in both good and bad ways. For instance, there is every chance that junior scholars, in trying to impress editors, file more negative reviews than they would if they knew that the editor had no way of tying the reviewer’s identity to the review. Beyond altering anonymity, one way to incentivize reviewers would be to publish the reviews, perhaps as part of the paper. Just like online appendices, we could have a set of reviews published online with each article.

With that, some concrete suggestions beyond the ones already discussed. Expectedly — given they come from a quantitative social scientist — they fall into two broad brackets: releasing and learning from the data already available, and collecting more data.

Existing Data

A fair bit of data can be potentially released without violating anonymity. For instance,

  • Whether manuscript was desk rejected or not
  • How many reviewers were invited
  • Time taken by each reviewer to accept the invitation (NA for those from whom you never heard)
  • Total time in review for each article (until R&R or rejection), with a separate set of columns for each revision
  • Time taken by each reviewer to return the review
  • Recommendation by each reviewer
  • Length of each review
  • Number of reviewers suggested by the author(s)
  • How often the suggested reviewers were followed up on

In fact, much of the data submitted in multiple-choice question format can probably be released easily. If editors are hesitant, a group of scholars can come together and crowdsource the collection of review data: people can deposit their reviews and the associated manuscripts in a specific format to a server. And to maintain confidentiality, we can sandbox these data, allowing scholars to run a variety of pre-screened scripts on them. Or else journals can institute similar mechanisms. A sketch of what a deposited record might look like follows.
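Purely as an illustration, and not a settled standard, a deposited record could look something like this. Every field name here is hypothetical, and identifiers would be hashed to protect anonymity.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ReviewRecord:
    # All field names are hypothetical; identifiers are hashed to protect anonymity.
    manuscript_hash: str                  # salted hash of the manuscript ID/title
    journal: str
    round: int                            # 1 = initial submission, 2 = first revision, ...
    desk_rejected: bool
    reviewers_invited: int
    days_to_accept_invite: Optional[int]  # None if the reviewer never responded
    days_to_return_review: Optional[int]
    recommendation: Optional[str]         # e.g., "accept", "minor", "major", "reject"
    review_word_count: Optional[int]

record = ReviewRecord(
    manuscript_hash="a3f9...",
    journal="Journal of Made-Up Studies",
    round=1,
    desk_rejected=False,
    reviewers_invited=4,
    days_to_accept_invite=6,
    days_to_return_review=42,
    recommendation="major",
    review_word_count=780,
)
print(json.dumps(asdict(record), indent=2))
```

A flat, per-review record along these lines would be enough to compute everything in the list above: desk-rejection rates, time in review, reviewer response rates, and agreement across reviewers.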

Collecting More Data

  • In economics, journals have experimented with shorter deadlines for reviewers with the aim of reducing review times. We can try that out elsewhere (a sketch of how such an experiment could be analyzed follows this list).
  • In terms of incentives, it may be a good idea to try out cash, but also perhaps to experiment with a system in which reviewers are told that their comments will be made public. I, for one, think it would lead to more responsible reviewing. It would also be good to experiment with triple-blind reviewing.
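A minimal sketch of how such a deadline experiment could be analyzed, assuming reviewers are randomly assigned to a shorter or a standard deadline; the turnaround times below are invented for illustration.

```python
import random

random.seed(7)

# Hypothetical review turnaround times in days, one number per reviewer.
short_deadline = [21, 34, 28, 19, 40, 25, 31, 22, 27, 36]
standard_deadline = [45, 38, 52, 41, 60, 33, 47, 55, 39, 49]

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(standard_deadline) - mean(short_deadline)

# Permutation test: under the null of no deadline effect, group labels are exchangeable.
pooled = short_deadline + standard_deadline
n_short = len(short_deadline)
n_perm = 10_000
extreme = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = mean(pooled[n_short:]) - mean(pooled[:n_short])
    if diff >= observed:
        extreme += 1

print(f"observed difference: {observed:.1f} days")
print(f"one-sided permutation p-value: {extreme / n_perm:.3f}")
```

With real data, one would also want to check whether shorter deadlines change who agrees to review and how thorough the reviews are, not just how fast they come back.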

If you have additional thoughts on the issue, please propose them at: https://gist.github.com/soodoku/b20e6d31d21e83ed5e39

Here’s to making advances in the production of science and our pursuit of truth.