Nov 13, 2006

A couple of days ago I blogged on a new survey by Chris Beckett and Simon Inger of Scholarly Information Strategies looking at whether or not self-archiving of authors’ journal articles on open archives would be likely to lead to cancellation of journal subscriptions by libraries.

I had also conducted a survey of librarians’ opinion on the same topic earlier this year which received responses from 340 librarians (report available on the ALPSP website here; free summary article here). Although my methodology was much less sophisticated than this study’s (and there was some criticism on one of the lists that some of my questionnaire begged a key question), there are nevertheless some puzzling differences between the findings in the two surveys:

(1) Librarians in this new survey expressed no preference for the publisher’s final version over the author’s refereed post-print. I have no particular argument with the librarians on this but they appear to say something different in my survey. When we asked “What freely available versions would you consider an acceptable substitute for the journal?”, 97% chose the final journal pdf but only 39% the author’s post-print. The recent finding does seem anomalous, though: as the authors say, it is not concurrent with current observed behaviour. If it is a true finding, then it’s a concern for those who think there is value in the copy-editing, linking, formatting etc. that publishers do.

(2) This survey finds that a 6-month embargo had little impact (on librarians’ preference for (delayed) OA material rather than the paid-for version), but that longer periods (12/24 months) had larger effects. Overall, the direction of the preference for more immediate material is hardly surprising, but the key point for many publishers is where the tipping point lies, and on this there appears to be a conflict with our earlier findings. The very different methodologies make it hard to compare reliably, but my data showed only 18% of librarians regarded material embargoed for 4 or more months as an acceptable substitute for a subscription.


Some other points:

(3) I have some doubts about the methodology (which others seem to share), although it was obviously done very carefully by experienced people. It’s not fully clear to me, though, that the matrices of options presented in the conjoint analysis really capture the choices faced by librarians. From a librarian’s perspective, for instance, there are important factors such as supplier support, ease of integration with library systems, etc. that were not represented in the study. Unlike Stevan Harnad, however, I do think that it is reasonable to think about “acquisition” of OA content – the librarians’ role is to help library patrons with their searching and other information needs, and this includes using freely available tools. In a modern digital library, online resources have to be integrated with the library system and other resources, which takes time (& money). From a librarian’s perspective, therefore, it’s as reasonable to talk about acquiring free materials as for an organisation to talk about acquiring open source software.

(4) On a perhaps minor point of methodology, I thought “reliability of access” was an ambiguous term, especially as “reliable” is used in a different sense in the second part of the survey (“Content on OA archives is reliable”). I suppose librarians would have seen it in this context as related to uncertainty as to whether content available on one day would be there the next, or have the same link, analogous to the criticism of aggregation products as being not reliable because publishers can withdraw their content. But the results on this factor don’t seem to tell us much.

(5) One reason that librarians may be slow to substitute OA archives for journal subscriptions is that they have limited knowledge of the degree of overlap between their holdings and the archives. The authors don’t mention this as a factor but in my survey we found only 16% of librarians had estimates of this overlap, and only 31% had plans to introduce systems to measure this overlap. This may change with the integration of archive records into e.g. Thomson’s Web of Science, which will make the archive content (even) more visible, add (additional) citation linking and perhaps crucially also “validate” the content by inclusion in a trusted source. (Incidentally, on a related point, I noted that the slides from Thomson’s Reynold Guida’s presentation on this at the Charleston conference included the point that “[WoS/arXiv integration] Provides links and citation data at article level as an incentive for every researcher to post work on IRs”.)

(6) Looking at the headline question of the new study, there’s enough in just the Part 2 findings to suggest, at the very least, that OA archives will be a factor in cancellations. With some 38% of librarians disagreeing with the statement that publishers should not worry about OA archive causing cancellation, and 40% thinking that libraries that continue to subscribe when the content is freely available, it is surely rational for publishers to worry about this! As Stevan Harnad says, though, the key issue is one of timing and extent – we don’t know exactly when and how many. But am I right to think Stevan is changing his position slightly on this, from saying there’s no evidence that OA will cause cancellation and that arXiv suggests the opposite in physics, to saying that it is “possible, even probable that self-archiving will cause some cancellations” but that this is a (much) lower-priority issue than the gains to research (and hence society) that would flow from OA? I think the latter emphasis is much more coherent and one that crystallises the issue for publishers.