September 08, 2006

Academic publishing, tomorrow

Imagine a world where academic publishing is handled purely by academics, rather than ruthless, greedy corporate entities. [1] Imagine a world where hiring decisions were made on the technical merit of your work, rather than the coterie of journals associated with your c.v. Imagine a world where papers are living documents, actively discussed and modified (wikified?) by the relevant community of interested intellectuals. This, and a bit more, is the future, according to Adam Rogers, a senior associate editor at "Wired" magazine. (tip to The Geomblog)

The gist of Rogers' argument is that the Web will change academic publishing into this utopian paradise of open information. I seriously doubt things will be like he predicts, but he does raise some excellent points about how the Web is facilitating new ways of communicating technical results. For instance, he mentions a couple of on-going experiments in this area:

In other quarters, traditional peer review has already been abandoned. Physicists and mathematicians today mainly communicate via a Web site called arXiv. (The X is supposed to be the Greek letter chi; it's pronounced "archive." If you were a physicist, you'd find that hilarious.) Since 1991, arXiv has been allowing researchers to post prepublication papers for their colleagues to read. The online journal Biology Direct publishes any article for which the author can find three members of its editorial board to write reviews. (The journal also posts the reviews – author names attached.) And when PLoS ONE launches later this year, the papers on its site will have been evaluated only for technical merit – do the work right and acceptance is guaranteed.

It's a bit hasty to claim that peer review has been "abandoned", but the arxiv has certainly almost completely supplanted some journals in their role of disseminating new research [2]. This is probably most true for physicists, since they're the ones who started the arxiv; other fields, like biology, don't have a pre-print archive (that I know of), but they seem to be moving toward open access journals for the same purpose. In computer science, we already have something like this, since the primary venue for publication is conferences (which are peer reviewed, unlike conferences in just about every other discipline), whose papers are typically picked up by CiteSeer.

It seems that a lot of people are thinking or talking about open access this week. The Chronicle of Higher Education has a piece on the momentum for greater open access journals. Its main message is the new letter, signed by 53 presidents of liberal arts colleges (including my own Haverford College) in support of the bill currently in Congress (although unlikely to pass this year) that would mandate that all federally funded research be eventually made publicly available. The comments from the publishing industry are unsurprisingly self-interested and uninspiring, but they also betray a great deal of arrogance and greed. I wholeheartedly support more open access to articles - publicly funded research should be free to the public, just like public roads are free for everyone to use.

But the bigger question here is, Could any of these various alternatives to the pay-for-access model really replace journals? I'm less sure of the future here, as journals also serve a couple of other roles that things like the arxiv were never intended to fill. That is, journals run the peer review process, which, at its best, prevents erroneous research from getting a stamp of "community approval" and thereby distracting researchers for a while as they a) figure out that it's mistaken, and b) write new papers to correct it. This is why, I think, there is a lot of crap on the arxiv. A lot of authors police themselves quite well, and end up submitting nearly error-free and highly competent work to journals, but the error-checking process is crucial, I think. Sure, peer review does miss a lot of errors (and frauds), but, to paraphrase Mason Porter paraphrasing Churchill on democracy, peer review is the worst form of quality control for research, except for all the others. The real point here is that until something comes along that can replace journals as being the "community approved" body of work, I doubt they'll disappear. I do hope, though, that they'll morph into more benign organizations. PNAS and PLoS are excellent role models for the future, I think. And, they also happen to publish really great research.

Another point Rogers makes about the changes the Web is encouraging is a social one.

[...] Today’s undergrads have ... never functioned without IM and Wikipedia and arXiv, and they’re going to demand different kinds of review for different kinds of papers.

It's certainly true that I conduct my research very differently because I have access to Wikipedia, arxiv, email, etc. In fact, I would say that the real change these technologies will have on the world of research will be to decentralize it a little. It's now much easier to be a productive, contributing member of a research community without being down the hall from your colleagues and collaborators than it was 20 years ago. These electronic modes of communication just make it easier for information to flow freely, and I think that ultimately has a very positive effect on research itself. Taking that role away from the journals suggests that they will become more about getting that stamp of approval than anything else. With the increased relative importance of that stamp, who knows, perhaps journals will do a better job at running the peer review process (they could certainly use the Web, etc. to do a better job at picking reviewers...).

[1] Actually, computer science conferences, impressively, are a reasonable approximation to this, although they have their own fair share of issues.

[2] A side effect of the arXiv is that it presents tricky issues regarding citation, timing and proper attribution. For instance, if a research article becomes a "living" document, proper citation becomes rather problematic. Which version of an article do you cite? (Surely not all of them!) And, if you revise your article after someone posts a derivative work, are you obligated to cite it in your revision?

posted September 8, 2006


your footnote on the arXiv brings up a good point: although the arXiv is given a lot of respect in terms of dissemination, its ability to act as a tech report server (creating time stamping) is problematic. And this is primarily why I agree that journals aren't going away any time soon, at least in their peer-review role.

p.s here's a related question I had posted. See the comments.

Posted by: Suresh at September 9, 2006 07:18 AM

Exactly. The time-stamping issue is a huge problem for concurrent work, and it's very easy for people to get bent out of shape over attribution there. I'm not sure what the right solution is here (some thoughts in a moment, though), but some journals / conferences are not helping matters. For instance, Science and Nature claim that they will not consider papers that have been posted online (e.g., on the arxiv), which encourages people to sit on their results, running the risk of being time-stamp-scooped on the arxiv. To me, this seems like Nature and Science trying to protect their pre-Web role, but on the other hand, they've been burned before by people going to the public with results first, and to the journal second (I'm thinking of the polywater and cold fusion examples).

In my mind, I think that the question of whether to cite something that is on the arxiv, but which has not yet been peer-reviewed, should go a little like this. If your new results depend, in a non-trivial way, on the results of the arxiv report (two quick examples: you use methods that were developed in the arxiv report, or the significance of your results depends on the arxiv report), then yes, you should cite it (although perhaps in a way that notes that the arxiv results have not passed peer review yet). Otherwise, it should be merely a courtesy to cite the arxiv report - something you do to point readers to related work, or perhaps something you do to extend some graciousness to the authors of the arxiv study - but again in a way that perhaps does not lend the arxiv report as much weight as a peer reviewed publication.

The tricky part is the cultural use of the arxiv, and I think this is what's led to some problems with time-stamping things by posting there. Science writers clearly trawl the arxiv for things to write about, and they typically present the results in the same way that peer-reviewed results are presented. This is highly problematic. Separately, if by time-stamping something on the arxiv, I instantly gain ownership (in terms of attribution rights for subsequent papers) of the presented ideas (regardless of their veracity), then the arxiv can easily become a mine field for future research, and lead to fights between authors over proper attribution. There doesn't seem to be an accepted understanding about what it even means to post something as a tech report (as your post points out) / arxiv posting, so maybe these issues will never be resolved. In the meantime at least, I think authors will continue to manage these case-by-case, until the culture changes. This isn't a pretty way to do it -- some people will get their feelings hurt and some people will lose ownership when they rightly deserve it -- but I guess that's the way it goes.

Posted by: Aaron at September 9, 2006 12:58 PM

I'm glad you posted our earlier discussion. In my take on Weil, I don't have time to write my "proof" here.

The comment about undergrads and the arXiv is strange. The comment should instead be about researchers and the arXiv. Undergrads typically find out about the arXiv when somebody like us tells them about it. (If they replace 'arXiv' by 'Facebook', then we suddenly get an accurate comment.) Comments about their not knowing things like IM are amusing, though, because the non-GUI technology has been there for a long time. The program 'talk' is really old. It dates to, what, the 70s? (For me, that is old. :) ) Anyway, it had existed for a long time when I did things like using it from my experimental economics class to ask my friend to go into my room to see if a certain call of mine had been returned. (The advantage I had was that the professor would never know to check for things like IMing that I could be doing instead of paying attention. Ah, the good old days... now I've joined the Dark Side.)

In terms of time-stamping, that issue shows up especially in subjects like math where the refereeing process takes very long. For example, I have a paper that will appear in The Monthly that cites a Notices paper (an expository paper; the Notices is the AMS analog of Physics Today) that cites the original arXiv version of that paper from several months before the Notices paper was even conceived.

My practice is to cite the latest version of an arxiv paper because the site for that already includes all the links to prior versions. An exception to this, that I have yet to need to use in practice, would be if something was dropped (not due to lack of correctness but for other reasons) from the older version and I wanted to cite that specific item. (I once had a referee on a paper that had both localized and extended solutions ask me to focus on the extended part and save the localized stuff to be expanded on in a future paper so that the paper wouldn't appear as two disjoint papers which were stapled together, which was his take on it. I indeed followed that advice, and I am glad for it because, in retrospect, I ended up deciding that he was right even though I didn't agree with it at first.)

I need to go to a baseball game now.

Posted by: Mason at September 9, 2006 06:03 PM