
Learned Publishing (2006), 19, 85–95
Self-archiving practice and the influence of publisher policies in the social sciences
Kristin Antelman
North Carolina State University
© Kristin Antelman 2006
LEARNED PUBLISHING VOL. 19 NO. 2 APRIL 2006

ABSTRACT: Authors in different disciplines exhibit very different behaviours on the so-called ‘green’ road to open access, i.e. self-archiving. This study looks at the self-archiving behaviour of authors publishing in leading journals in six social science disciplines. It tests the hypothesis that authors are self-archiving according to the norms of their respective disciplines rather than following self-archiving policies of publishers, and that, as a result, they are self-archiving significant numbers of publisher PDF versions. It finds significant levels of self-archiving, as well as significant self-archiving of the publisher PDF version, in all the disciplines investigated. Publishers’ self-archiving policies have no influence on author self-archiving practice.

Introduction

While repositories are transforming select disciplines, the more widespread impact of networked technologies has been to facilitate informal scholarly communication. Formal communication remains unchanged, at least for the time being; the peer-reviewed journal still constitutes the primary channel for official dissemination of scholarship. Where in the past scholars would have phoned a colleague to request a reprint, most of them will now email the colleague and receive an electronic copy of the article in reply. Some authors are choosing to take the next step in making their articles available by self-archiving a copy on a personal website or in a repository. There are several important consequences of this behaviour. One is that, by self-archiving, an author is contributing (knowingly or not) to the body of open access scholarship. Another is that, as these copies are distributed on the Internet, the reader is more likely to find versions of articles that may differ from the final published version.

Publishers and others have discussed the importance of understanding and controlling document versions from the perspective of scholarly communication.¹ At the heart of this issue is integrity, or trust. In our current system publishers facilitate peer review, provide a guaranty of authenticity and definitiveness, and brand an article. These quality filters are important both to authors and readers. When choosing where to submit a manuscript, authors attach the highest value to the journal’s reputation,² and readers similarly rely on that context as affirming the quality of an article. This context is easy to assess with electronic journals, in which articles typically have the same look and feel as their print versions. But context is much more variable when looking at an article discovered through a search engine. The greatest context will be present with the document version that looks exactly like the article retrieved from a publisher’s website: the typeset and branded published article in PDF format.

While some publishers do not permit any self-archiving, most now permit authors to archive a version of their article, typically the author’s final version. But because of its inherent substitutability for an article on the publisher’s website, publishers who permit self-archiving typically do not grant authors permission to self-archive the publisher PDF version. If authors are self-archiving for scholarly communication, however, one would expect that they would archive the best and easiest version possible, which is the publisher PDF. There is therefore a tension between publishers’ desire to control their content and some authors’ desire to make it available in the best and easiest-to-use format. But authors may not even be aware that this tension exists. Studies have shown that authors are confused by, and generally uninterested in, their copyright agreements with publishers where they are most likely to learn about their self-archiving rights.³ While several studies have explored various aspects of self-archiving behaviour, surveying authors on their practices and opinions or quantifying rates of open access, none have specifically focused on examining self-archiving of publisher PDFs.⁴

The study reported here looks at several aspects of self-archiving behaviour by authors publishing in leading journals in six social science disciplines. It tests the hypothesis that authors are self-archiving according to the norms of their respective disciplines rather than publisher self-archiving policies, and that, as a result, they are self-archiving significant numbers of publisher PDF versions.

Background

Versions

Mabe and Amin, and Guédon describe the two hats a scholar wears, that of researcher-as-author and that of author-as-reader.⁵ The researcher-as-author cares relatively more about journals, while the author-as-reader cares relatively more about articles. As 75% of all authors search Google for articles, there are more readers of open access scholarship than there are authors practising self-archiving.⁶ What factors into reader satisfaction with open access? Clearly, document findability and identifiability are fundamental. Guédon points out that we have a long way to go on findability, even for articles archived in disciplinary or institutional repositories (currently the minority).⁷ Identifiability is a similarly confusing picture thanks to multiple versions.

Authors, publishers, open access advocates, and others have assigned labels to different versions of articles, but often these labels (e.g. preprint, postprint) do not describe the same objects. In identifying an article, a reader would take into account some combination of the document’s content and who wrote it, attributes of the document’s appearance such as formatting, notations about its provenance, and the context in which it is found. For example, one relatively well-defined, author-controlled version is the technical report or working paper preprint, usually so-labelled and located in the context of a series or collection. Institutional branding informs the reader that there has been some measure of quality control, while at the same time the reader knows from disciplinary acculturation that the text is likely to differ substantially from that of a final published article. Other preprints are much less well defined and could be a draft at any stage up to and including the final text. The SHERPA project website defines ‘preprint’ as ‘the version of the paper before peer review’ and the ‘author postprint’ as ‘the version of the paper after peer-review with revisions having been made’.⁸ This definition of preprint is not universally adopted, however, as the site notes: ‘Another use of the term pre-print is for the finished article, reviewed and amended, ready and accepted for publication – but separate from the version that is type-set or formatted by the publisher. This use is more common amongst publishers.’⁹

In 2000, in a report entitled ‘Defining and Certifying Electronic Publication in Science’, a working group of publishers pointed to the importance of understanding and identifying article versions.¹⁰ They opened the report with the statement, ‘The peer-reviewed article will continue to play a crucial part in the certification, communication and recording of scientific research. However, in the electronic environment it represents one point on a potential continuum of communication.’ In fact, in the context of open access, the peer-reviewed article represents at least two points on that continuum: the author’s final version (the so-called author postprint), and the copy-edited, formatted, and branded publisher version. The author postprint is not only termed a preprint by some but it looks like one and can only definitively be identified as a postprint by comparing its text with the published article. While an author might make declarations of postprint status, such as ‘accepted for publication in . . .’, readers know at a glance that such a document is an author-controlled document and scholars trust publishers to assert this provenance, not individual authors. Without the contextual branding of a journal or pagination, such a document is not, according to the norms of most disciplines, citable.

Work on formal version control is still in the early stages. The report of the Research Councils of the UK on open access emphasizes the importance of distinguishing between pre- and postprints and states their intention to work with repository managers ‘to develop a common and recognizable standard to ensure that the distinction between pre-prints and post-prints is clear to all users’.¹¹ In late 2005 JISC announced it would fund a scoping study on repository version identification, and NISO is also intending to explore the issue.¹² The NIH defines what they mean by postprint, and their policy for replacing it with the publisher’s version if the publisher chooses, but they are not validating submissions as postprints.¹³ In any event, such formal version control mechanisms are not likely to be adopted by authors posting documents to personal web pages.

Publisher self-archiving policies

Many, if not most, copyright agreements are silent and/or ambiguous about self-archiving rights and on the question of version. Even where the agreement addresses self-archiving, its use of ambiguous terms to describe the versions (e.g. ‘the work’, ‘the paper’, ‘the contribution’) leaves the author without enough information to know what version he or she can self-archive. In addition, publishers require copyright assignment at different points in the review process depending on the journal, and cover different – or multiple – versions.¹⁴ Ambiguity also exists in the more standardized self-archiving conditions in the SHERPA database. A survey of those conditions showed that, of 34 conceptually distinct conditions across all publishers, 12 could be associated with either preprints, postprints, or both, either because the restriction is unspecified or it makes sense for both (e.g. ‘on author’s or employer’s website only’).¹⁵

Very few authors’ self-archiving behaviour is driven by what they find in SHERPA because few are aware of it.¹⁶ More surprising, many authors are not aware of the terms of their copyright agreements and many believe they hold copyright to their own works.¹⁷ At the same time, authors who know their rights or publisher requirements do not necessarily abide by them. Pinfield, and Swan and Brown found that there is clear evidence that a significant percentage of authors are self-archiving irrespective of the terms of their agreements, or self-archiving without pursuing clarification from publishers.¹⁸ Lack of awareness of, or real interest in, the importance of these agreements is evidenced by how few authors attempt to modify them.¹⁹

A study of author behaviour in six social science disciplines

Methodology

Author self-archiving practice was surveyed across six social science disciplines: sociology, anthropology, economics, political science, geography, and psychology. Where possible (sociology, anthropology, and economics), journals with different author self-archiving policies were selected. The policies were ‘nothing allowed’ (SHERPA ‘white’ journals) and ‘preprint and author postprint allowed’ (SHERPA ‘green’ journals). Publishers whose policies were not known to have changed since 2003, although they may have been codified since then, were selected. Eleven publishers are represented in the sample.²⁰ It is important to note that, despite what we know about publishers’ policies and aggregate author behaviour, without asking individual authors it is impossible to know which rights they had, or believed they had, when they self-archived a given article, and whether they heeded or disregarded them.

Approximately 1,400 articles from 22 journals published between January 2003 and June 2004 were sampled. Google was searched to identify self-archived open access articles.²¹ Because author behaviour, rather than overall rate of open access, is of interest, only those articles evidently posted by authors were counted. Open access copies found were coded by version: working paper, preprint, author postprint, or publisher PDF.²² A separate sample of approximately 550 articles from the first half of 2005 was also taken to answer a related question, namely do authors self-archive at the same rate within six months of publication? That sample showed that the overall rate of self-archiving was virtually identical between the two time periods. Therefore, what follows reflects a combined data set of approximately 2,000 articles published between 2003 and 2005.

Figure 1 Self-archiving by discipline and article version.

The author postprint is the version closest to the publisher’s that most green publishers permit authors to self-archive, and is therefore a critical version from the perspective of open access. But, as Goodman points out, little is known about the actual prevalence of author postprints.²³ Acknowledging that it is impossible to identify definitively an author postprint without comparing texts, it remains of interest to approximate how many self-archived articles a reader would be likely to identify as author postprints. To understand how authors self-identify postprints, standard phrases were collected from title pages of documents examined in this study. The following designations were selected as likely identifying postprints: ‘I would like to thank anonymous referees’ (the most prevalent); ‘published in/is forthcoming in/to appear in/accepted for publication in [journal name]’; or ‘final version’. Using this admittedly approximate method to identify postprints, it was found that, within these six social sciences, the author postprint is overall quite rare (5% of all articles, constituting 16% of all self-archived articles). Pinfield has found that apparent preprints in arXiv were almost always in fact author postprints.²⁴ So to get a very rough sense of how many author postprints were missed using this approach, ten preprints were sampled across disciplines and compared with the publisher’s version. Of those ten, five were indeed author postprints and five were preprints. So the actual number of author postprints in this sample is likely to be greater (and, therefore, preprints fewer).
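The phrase-based identification of likely postprints described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in the study, which the article does not describe beyond the list of phrases. The function name `looks_like_postprint` and the regular expressions are assumptions introduced here; only the phrases themselves come from the text.

```python
import re

# Marker phrases taken from the study's list of designations that likely
# identify an author postprint. The regex wording is an assumption.
POSTPRINT_PATTERNS = [
    r"i would like to thank anonymous referees",
    r"\b(published|is forthcoming|to appear|accepted for publication) in\b",
    r"\bfinal version\b",
]


def looks_like_postprint(title_page_text: str) -> bool:
    """Return True if the title-page text contains any postprint marker."""
    # Lower-case and collapse whitespace so phrases split across lines still match.
    text = " ".join(title_page_text.lower().split())
    return any(re.search(pattern, text) for pattern in POSTPRINT_PATTERNS)


if __name__ == "__main__":
    print(looks_like_postprint("Accepted for publication in Journal X"))
    print(looks_like_postprint("Draft. Please do not cite."))
```

As the article itself notes, such matching is approximate: a document without any of these phrases may still be a postprint, which is why a sample of apparent preprints was compared by hand with the published versions.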

Self-archiving rates by version

Self-archiving rates varied considerably across disciplines in the sample, from a low of 14% in geography to a high of 60% in economics (see Figure 1). It is important to note that the self-archiving rate in this sample cannot be generalized to the disciplines as a whole, since the sample contained disproportionately high-impact journals in each field, and there is evidence that, at least in biomedicine, authors publishing in prestigious journals archive at a greater rate.²⁵

The version authors chose to self-archive varied significantly as well. Looking only at the self-archived articles (n = 575), authors overall posted approximately the same number of publisher PDFs (49%) as preprints/author postprints (51%). (See Figure 2.) This overall balance does not carry down to the individual disciplines, however: the percentage of self-archived articles that were preprints ranged from 3% in anthropology to 61% in economics; author postprints ranged from 3% in anthropology to 22% in economics; and publisher PDFs ranged from 17% in economics to 94% in anthropology. Interestingly, economics, with the highest overall self-archiving rate by a factor of two, has the lowest relative rate of self-archiving of the publisher PDF. Publisher PDFs made up between 52% and 74% of self-archived articles in sociology, political science, geography, and psychology. Excepting economics, the balance is 68% publisher PDF/32% pre- and author postprint.

Figure 2 Versions of self-archived articles by discipline.

The ease of working with the publisher PDF, for both authors and readers, surely contributes to its prevalence. Posting a publisher PDF is technically straightforward, particularly if the publisher provides the author with an electronic file of the version or final proof as part of the publication process. As Watkinson points out, in addition to ease of printing (appealing to readers), ‘the other advantage of PDF was, and is, the simple built-in security’.²⁶ Posting an author postprint can be much more complicated, particularly when an author has multiple versions and, with charts or graphics, multiple files. Putting them all together into a single document some time after an article was submitted for publication can be a non-trivial process. That authors see these advantages of the PDF format apart from the branding advantage is evidenced by the fact that over 90% of the author versions in this study were converted to PDF format before self-archiving.

Self-archiving rates and publisher policies

This study finds no relationship between publisher policy and self-archiving behaviour. In fact, the overall self-archiving rate for the white journals examined in this study is significantly higher than the self-archiving rate for the green journals (36% versus 27%, χ² = 11.62, P < 0.001, n = 6 [white], n = 16 [green]).²⁷ Since white journals prohibit all self-archiving, clearly their policies are not influencing author behaviour. The same absence of correlation can be seen in the rate of self-archiving of publisher PDFs. There is no significant difference when the six disciplines are looked at in combination: authors posted publisher PDFs of 15% of all articles from white journals and 14% from green journals. (See Figure 3.) Since more authors publishing in white journals self-archived, however, the percentage of self-archived articles represented by publisher PDFs is lower (42% for white journals versus 51% for green journals).
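The last step in the paragraph above, going from publisher PDFs as a share of all articles to publisher PDFs as a share of self-archived articles, is a simple ratio. The sketch below reproduces it from the rounded percentages quoted in the text. The function name `share_of_archived` is an assumption introduced for illustration.

```python
def share_of_archived(version_rate: float, overall_rate: float) -> float:
    """Percentage of self-archived articles that are a given version.

    Both arguments are percentages of ALL articles in the journal group.
    """
    if overall_rate <= 0:
        raise ValueError("overall_rate must be positive")
    return 100.0 * version_rate / overall_rate


# Rounded figures quoted in the text: publisher-PDF rate and overall rate.
white = share_of_archived(15, 36)  # white journals
green = share_of_archived(14, 27)  # green journals
print(f"white: {white:.1f}%  green: {green:.1f}%")
```

Working from the rounded inputs gives roughly 42% and 52%. The article reports 42% and 51%, presumably because its figures were computed from the underlying counts rather than from rounded percentages.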

Figure 3 Self-archiving by journal policy and article version.

Figure 4 Self-archiving by discipline, publisher policy, and version.

The correlation of publisher policy with self-archiving practice was examined within the three disciplines where journals with different policies were sampled (sociology, anthropology, and economics). (See Figure 4.) In economics, journal policy has little or no effect, with identical postprint and publisher PDF self-archiving rates for articles published in white and green journals and one-third more preprints in green journals. The possibility that policies have an effect in anthropology remains open: in that discipline, authors of articles in white journals self-archive significantly fewer articles overall (χ² = 4.78, P < 0.05). In sociology the picture is reversed, with significantly more overall self-archiving, and more self-archiving of publisher PDFs, of articles published in white journals (this could be explained in part by the fact that the flagship journal of the American Sociological Association, the American Journal of Sociology, is a white journal and has a self-archiving rate of 48%).
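For readers unfamiliar with the χ² comparisons reported above, the following sketch shows how such a statistic can be computed for a 2 × 2 table of journal policy against archived/not archived. It assumes a Pearson chi-square without continuity correction, since the article does not state which variant was used. The article counts are HYPOTHETICAL, chosen only so that the rates match the reported 36% and 27%; they are not the study's data.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# HYPOTHETICAL counts (not from the study), matching only the reported rates.
white_archived, white_not = 180, 320    # 180 / 500  = 36%
green_archived, green_not = 405, 1095   # 405 / 1500 = 27%

stat = chi_square_2x2(white_archived, white_not, green_archived, green_not)
# With 1 degree of freedom, values above 3.84 are significant at P = 0.05.
print(f"chi-square = {stat:.2f}")
```

Because the counts are made up, the result will not equal the 11.62 reported in the article. The sketch only illustrates why a nine-point difference in rates across samples of this size comes out as statistically significant.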

A disciplinary norm for sharing versions appears to be consistent across publisher policies. (See Figure 5.) The publisher PDF as a percentage of all self-archiving is virtually the same for white and green journals within each discipline. At the least, one can conclude that distribution of preprints is important in economics, less so in sociology, and much less so in anthropology.

Underscoring the tentative nature of these differences between disciplines, it was observed that there is more variation between the journals within the three disciplines than there is between the disciplines. (See Table 1.) Significant variation is also seen between journals published by the same publisher. For instance, the four journals published by the University of Chicago Press had overall self-archiving rates between 16% and 67% and ranged between 16% and 91% in publisher PDF as a percentage of all self-archiving. Author self-archiving rates in the two University of Chicago Press titles in economics were also significantly different (67% and 41%).

To look at the question of whether self-archiving practices were consistent within the same journal over time, articles from 2003 were compared with articles from 2004–5 for each journal. The similar overall self-archiving rates (28% in 2003, 29% in 2004–5), and relatively consistent self-archiving rates at the discipline level, concealed great variability at the journal level. Seven journals had close to or the same self-archiving rates between the time periods, 11 increased, and seven decreased, ranging from a 115% increase to 100% decrease in self-archiving. This fluctuation reflects small sample sizes at the individual journal level (averaging 46 articles for each time period). Given this great variability at the journal level, it is intriguing that the rates remained relatively constant at the discipline level and overall.


Figure 5 Self-archiving by discipline and publisher policy.

Table 1 Self-archiving rates by journal

                Self-archiving rate (%)                          Overall
Discipline      Journal A   Journal B   Journal C   Journal D    self-archiving rate (%)
Economics       79 (g)      67 (w)      56 (g)      41 (w)       60
Sociology       48 (w)      26 (w)      12 (g)      11 (g)       23
Anthropology    24 (g)      16 (g)      16 (w)      5 (w)        17

w, white; g, green.


Discussion

This study’s findings that the self-archiving profiles of each discipline examined are very different, and that there is an evident lack of influence of publisher policies across the board, both point to the fact that disciplinary norms for scholarly communication are driving author self-archiving behaviour. There are several conceptual models that can help frame these findings. Whitley proposed that differences between disciplines can be characterized in terms of the degree of mutual dependence between researchers and the degree of task uncertainty in defining shared problems, goals and procedures.²⁸

Disciplines with high mutual dependence and low task uncertainty move very quickly to answer problems that have agreed-upon importance, and they place a high value on establishing priority (high-energy physics, the first field to self-archive and establish a disciplinary repository, is the prime example). As Whitley points out, the humanities and social sciences tend to be characterized by a low degree of mutual dependence and high task uncertainty.²⁹ Researchers in these disciplines address a wide range of research questions and lack shared problem definitions, theoretical goals, and work procedures.³⁰ As a result, they tend to rely less on exchange of pre-published research.

Becher proposes an analogous model for describing the social dimensions of intellectual fields using two properties, convergent/divergent and urban/rural.³¹ In divergent disciplines, just as in disciplines with low mutual dependence, there is a less stable élite with less intellectual control over the field than in convergent disciplines. While economics is characterized as convergent, sociology and geography are characterized as divergent.³² While Becher does not address them specifically, political science, anthropology, and psychology would also be more divergent than convergent. The six social sciences examined in this study, with the exception of much of economics, are also considered rural disciplines, characterized by low people-to-problem ratios and research that addresses a wide range of problems without agreed-upon priority or importance. Rural disciplines rely less on sharing preprints, as rapid dissemination of results and establishment of priority are of lesser importance than in urban disciplines.

This study confirms the profile of the rural, low mutual dependence/high task uncertainty discipline in finding relatively few self-archived preprints in anthropology, geography, sociology, and psychology. This was less the case in political science (11% preprints) and not the case at all in economics (36% preprints). The relatively high number of working papers in economics (more than one-third of all preprints) shows that economics has characteristics of a discipline with mutual dependence in engaging in informal sharing of research results in symposia and working paper series. These disciplines, with economics and political science again being most anomalous, exhibit another characteristic of rural disciplines in most frequently self-archiving postprints.

Another characteristic of rural fields is porous disciplinary boundaries. In economics and political science in particular, and to a lesser extent sociology, psychology, and geography, authors have – and desire to have – readers from related disciplines and even outside academia, for instance in government. This pursuit of a broad audience could be an incentive for some authors to self-archive. Since fields that are most concerned with establishing priority have the shortest publication delays,³³ the presence of self-archiving in the social sciences may not be due so much to authors’ desire to establish priority, but to the desire of a significant minority to use readily available technologies to mitigate the detrimental effects of particularly long publication delays. At least one study has shown authors are quite concerned with such delays.³⁴ On the other hand, the relative absence of disciplinary repositories in the social sciences indicates that informal publication on personal or departmental websites, however incompletely indexed such websites are, seems to suffice to meet authors’ needs.

Conclusions and further study

This study finds that social scientists are self-archiving at a significant rate and that publisher self-archiving policies do not influence their self-archiving practices. The likelihood that an author will self-archive at all varies between disciplines but remains fairly constant between different publisher policies (‘green’ and ‘white’) in the three disciplines where journals with different policies were examined. Similarly, the rate at which authors self-archive the publisher PDF version, a practice that none of the surveyed journals permits, varies significantly between disciplines but not by journal policy.

Many authors as well as publishers consider posting an article to a website or repository to be a form of publication. In King’s framework, self-archiving would be considered ‘weak’ publishing, while the published, indexed version in the journal would be ‘strong’ publishing.³⁵ From the perspective of the author-as-reader searching on Google for desired articles, the strongest publications are those that are both discoverable and useful, i.e. have accessible full text and are a citable version. A publisher copy that the reader does not have access to, while discoverable, is not accessible and so not as useful as an open access copy. Similarly, an open access copy located on a website is not as discoverable as it could be if it were in a repository or a journal. On the utility side, a preprint or author postprint is less useful than the publisher’s version because, while it supports knowledge discovery, it cannot be cited. The self-archived publisher PDF, while not as discoverable as the publisher’s copy, is just as useful because it bundles knowledge discovery with certification and citability. The publisher PDF version allows the author to transfer the value of the publisher’s copy to one he controls, and allows readers to apply what they already know about identifying scholarly publications. Thus, it is not surprising that most authors who choose to self-archive a postprint, self-archive the publisher PDF version, the version that is both easiest to post and that they know will provide readers with the information they want.

As ‘rural’ disciplines, and therefore with a less strong sense of collective identity, the social sciences (excepting economics) self-archive at a moderate rate, nothing like what is seen in more ‘urban’ fields such as physics, mathematics, and computer science that have developed disciplinary repositories to meet the demands for rapid communication. A corollary of this is that which journals one selects to examine is particularly consequential when seeking to understand behaviour at a discipline level. As Guédon notes, journals in the social sciences and humanities ‘often tend to incarnate a theoretical position or even a particular group, rather than a segment of knowledge.’36 The great variability found between journals within disciplines in this study underscores that fields with high task uncertainty tend to create journals that occupy distinct intellectual niches. As Fry and others have demonstrated, intellectual fields within a single discipline can vary to a great extent, and a given intellectual field may have more in common with an intellectual field in another discipline than its own parent discipline.37 In fact, both Whitley and Becher prefer to make comparisons at the level of intellectual fields (or specialisms) rather than disciplines. Future studies selecting intellectual fields as the unit of analysis rather than disciplines will be better positioned to identify and interpret differences in self-archiving practice. Further light could be shed on the question of the influence of publisher policy on self-archiving behaviour if authors were the unit of analysis, and the actual status of their copyright agreement, as well as their knowledge and stated motivation, were correlated with their self-archiving behaviour at the article level.

While it is clear that by self-archiving an author is contributing to the body of open access scholarship, his action may or may not reflect self-aware endorsement of open access principles. This study shows that, for many reasons, one should be careful about characterizing a given rate of self-archiving as a rate of ‘adoption’ of open access, and authors as ‘complying with’ or ‘violating’ publishers’ policies. Self-archiving one’s own research may be simply a logical extension of existing scholarly communication practices into the digital realm. Just as it is authors and not publishers who self-archive, it is discipline-based norms and practices that shape self-archiving behaviour, not the terms of copyright transfer agreements.

Acknowledgements

I thank David Goodman, Janice Pilch, Greg Raschke, and Anthony Watkinson for their helpful comments and suggestions. Some of this data was reported at the Charleston Conference ‘Issues in Book and Serial Acquisitions’, 2–5 Nov 2005.

References

1. Watkinson, A. Securing authenticity of scholarly paternity and integrity, Book Industry Communication, 2003. (http://www.bic.org.uk/securing%20authenticity.pdf). Frankel, M. S., et al. Defining and certifying electronic publication in science: a proposal to the International Association of STM Publishers. Learned Publishing, 2000:13(4), 251–8. URL: www.ingentaselect.com/rpsv/catchword/alpsp/09531513/v13n4/s8/p251. Lynch, C. When documents deceive: trust and provenance as new factors for information retrieval in a tangled web. Journal of the American Society for Information Science and Technology, 2001:52(1), 12–17.

2. Rowlands, I. and Nicholas, D. New journal publishing models: an international survey of senior researchers. A CIBER report for the Publishers Association and the International Association of STM Publishers, 2005, p. 17 (http://www.slais.ucl.ac.uk/papers/dni-20050925.pdf).

3. Swan, A. and Brown, S. Open access self-archiving: an author study. Key Perspectives Limited, 2005, p. 56 (http://eprints.ecs.soton.ac.uk/11006/).

4. Swan and Brown take a somewhat different approach and find that 49% of survey respondents had self-archived or published in an open access journal in the previous three years (Open access self-archiving, 30). Harnad, S. and Brody, T. Comparing the impact of open access (OA) vs. non-OA articles in the same journals. D-Lib Magazine, 2004: June. (http://www.dlib.org/dlib/june04/harnad/06harnad.html). Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C., Demleitner, M., and Murray, S. S. Worldwide use and impact of the NASA Astrophysics Data System digital library, Journal of the American Society for Information Science and Technology, 2005:56(1), 36–45. Antelman, K. Do open access articles have a greater research impact? College & Research Libraries, 2004:65(5), 372–82. Wren, J. D. Open access and openly accessible: a study of scientific publications shared via the internet. BMJ, 2005:330, 1–5. Wren’s study did focus on PDFs, but the documents found were not necessarily self-archived by authors.

5. Mabe, M. A. and Amin, M. Dr Jekyll and Dr Hyde: author–reader asymmetries in scholarly publishing. Aslib Proceedings, 2002:54(3). Guédon, J. C. The ‘green’ and ‘gold’ roads to open access: the case for mixing and matching. Serials Review, 2004:30(4), 316.

6. Swan and Brown (Open access self-archiving, 25) found that 72% of researchers surveyed reported searching for articles in Google. Gadd et al. found that 75% of authors who had not self-archived (and 95% of those who had) had used others’ self-archived papers. Gadd, E., Oppenheim, C., and Probets, S. RoMEO Studies 3: how academics expect to use open-access research papers. Journal of Librarianship and Information Science, 2003:35(3), 7.

7. Guédon, J. C. The ‘green’ and ‘gold’ roads to open access: the case for mixing and matching. Serials Review, 2004:30(4), 316.

8. Definitions and terms: pre-print and post-print, SHERPA Publisher Copyright Policies & Self-archiving (http://www.sherpa.ac.uk/romeoinfo.html).

9. Ibid.

10. Frankel, M. S., Elliott, R., Blume, M., Bourgois, J. M., Hugenholtz, B., Lindquist, M. G., Morris, S., and Sandewell, E. Defining and certifying electronic publication in science. Learned Publishing, 2000:13(4), 251. www.ingentaselect.com/rpsv/catchword/alpsp/09531513/v13n4/s8/p251

11. RCUK Position statement on access to research outputs, Research Councils UK, 7–8 (http://www.rcuk.ac.uk/access).

12. JISC ITT: Scoping study on repository version identification. (http://www.jisc.ac.uk/index.cfm?name=funding_versionidentification). NISO/ALPSP working group on versions of journal articles. (http://www.niso.org/committees/Journal_versioning/JournalVer_comm.html).

13. Policy on enhancing public access to archived publications resulting from NIH-funded research. US Department of Health and Human Services National Institutes of Health (http://grants.nih.gov/grants/guide/notice-files/NOT-OD-05-022.html).

14. Gadd et al. found that 69% of publishers asked for copyright transfer prior to refereeing the paper. Gadd, E., Oppenheim, C., Probets, S. RoMEO Studies 4: an analysis of journal publishers’ copyright agreements. Learned Publishing, 2003:16(4), 293. www.ingentaselect.com/rpsv/catchword/alpsp/09531513/v16n4/s9/p293

15. This survey of the SHERPA database was conducted on 13 June 2005. It is also noted that a quantitative view of restrictions and conditions does not reflect the true impact on authors, because a publisher-level view treats small and large publishers equally. Also, not all conditions are equivalent in their burden on the author: Pinfield found that authors follow reasonable ones that are also in their own interest, e.g. citing the published version on a postprint, but not unreasonable ones, e.g. copying specified clauses from the copyright agreement. Pinfield, S. How do physicists use an e-print archive? Implications for institutional e-print services. D-Lib Magazine, 2001: Dec, 5. (http://www.dlib.org/dlib/december01/pinfield/12pinfield.html).

16. Swan and Brown (Open access self-archiving, 48) report that only 10% of authors are currently aware of SHERPA.

17. JISC Disciplinary Differences Report, Sparks, S., Rightscom Ltd, 2005, 49. (http://www.jisc.ac.uk/uploaded_documents/Disciplinary%20Differences%20and%20Needs.doc). Swan and Brown (Open access self-archiving, 56) report 35% believe they hold such rights. Gadd and Oppenheim found 61% of authors surveyed believed they held copyright. Gadd, E. and Oppenheim, C. RoMEO Studies 1: the impact of copyright ownership on academic author self-archiving. Journal of Documentation, 2003:59(3), 256.

18. Gadd et al. reported 12% of respondents to their survey would ignore publisher policies forbidding them from making articles freely available online. RoMEO 4, 13. Swan and Brown (Open access self-archiving, p. 56) reported that 84% of all authors did not ask for permission and that one-third did not know if permission was required. Pinfield randomly sampled ten articles in arXiv that were published by publishers forbidding posting of the postprint and found that all were the same as the published version in content (How do physicists use an e-print archive?, pp. 5–6).

19. JISC Disciplinary Differences Report, 49. Swan and Brown, Open access self-archiving, p. 56. A University of California faculty survey reported by John Ober indicated that 18% of respondents have at one time modified the terms of a copyright agreement or contract. Ober, J. Postprint Repository Services: Context and Feasibility at the University of California, Final Draft, 2005, 23 (http://osc.universityofcalifornia.edu/responses/materials/UC_postprintstudy_final.pdf). Gadd et al. reported that only 3% of authors insisted on retaining copyright, RoMEO Studies 1, 259.

20. ‘White’ journals surveyed were: American Journal of Sociology, Current Anthropology, Economic Development and Cultural Change, and Journal of Political Economy (all University of Chicago Press); American Sociological Review (American Sociological Association); American Ethnologist (American Anthropological Association). ‘Green’ journals surveyed were: Journal of Social Issues, Social Problems (University of California Press), Annals of the Association of American Geographers, Psychological Science, Econometrica, and American Journal of Political Science (all Blackwell), Political Behavior and Theory and Society (both Kluwer), American Journal of Physical Anthropology (Wiley), Journal of Human Evolution, Political Geography, and Journal of Econometrics (all Elsevier), Journal of Conflict Resolution and Comparative Political Studies (both Sage), Progress in Human Geography (Arnold), and Journal of Applied Psychology (American Psychological Association). Data were collected between March and July 2005.

21. Only articles labeled by journals as ‘research’ or ‘original’ were selected. Typically, full titles were searched as phrases in Google. If there were no results, or there was ambiguity between documents with the same title, the author’s last name was added to the query. Since Google does not consistently index parenthetical phrases or certain characters (e.g., ?, a), where present those were omitted from the queries.

22. When multiple self-archived versions were found, the “highest” (i.e., closest to the final publisher version) was coded as the self-archived version.

23. Goodman, D. The criteria for open access. Serials Review, 2004:30(4), 260.

24. Pinfield, S. How do physicists use an e-print archive? Implications for institutional e-print services, D-Lib Magazine, 2001: Dec, 5. (http://www.dlib.org/dlib/december01/pinfield/12pinfield.html).

25. These numbers do not represent an accurate estimation of open access in these disciplines for several other reasons: non-Google search engines were not used, unindexed author homepages were not sought out, and only author-archived copies were counted.

26. Wren, J. D. Open access and openly accessible: a study of scientific publications shared via the internet. BMJ, 2005:330, 1–5.

27. Watkinson, A. Securing authenticity of scholarly paternity and integrity, Book Industry Communication, 2003. (http://www.bic.org.uk/securing%20authenticity.pdf).

28. Whitley, R. The Intellectual and Social Organization of the Sciences, New York, Oxford University Press, 2000, chapter 5.

29. Ibid., 127.

30. Ibid., 119–20.

31. Becher, T. and Trowler, P. R. Academic Tribes and Territories, The Society for Research into Higher Education & Open University Press, 2001, chapter 9.

32. Ibid., 187–88.

33. Ibid., 189.

34. ‘What Authors Want’: the ALPSP research study on the motivations and concerns of contributors to learned journals. Learned Publishing, 1999:12(3), 171. www.ingentaselect.com/rpsv/catchword/alpsp/09531513/v12n3/s1/p170

35. Kling, R., Spector, L., and McKim, G. The guild model. Journal of Electronic Publishing 2002:8(1). (http://www.press.umich.edu/jep/08-01/kling.html).

36. Guédon, J. C. In Oldenburg’s long shadow: librarians, research scientists, publishers, and the control of scientific publishing. In ARL (Association of Research Libraries) Proceedings, 138. May 2001 Membership Meeting, 22. (http://www.arl.org/arl/proceedings/138/guedon.html).

37. Fry, J. Scholarly research and information practices: a domain analytic approach. Information Processing & Management 2006:42 (available online Nov 2004), 299–316. Becher and Trowler.

Kristin Antelman
Associate Director for Information Technology
NCSU Libraries
Box 7111
Raleigh, NC 27696-7111, USA
Email: [email protected]