Georgia Institute of Technology From the SelectedWorks of Jan Youtie 2014 The use of citation speed to understand the effects of a multi-institutional science center Jan Youtie, Georgia Institute of Technology - Main Campus Available at: https://works.bepress.com/jan_youtie/59/
The Use of Citation Speed to Understand the Effects of a Multi-institutional Science Center

Jan Youtie
Enterprise Innovation Institute and School of Public Policy, Georgia Institute of Technology
Atlanta, Georgia USA 30308
jan.youtie@innovate.gatech.edu
+1 404-894-6111 (voice)
+1 404-894-8194 (fax)

Appearing in Scientometrics
Abstract

The extent to which an article attracts citations has long been of interest. However, recent research has emphasized not just the receipt but also the pacing of citation. Citation speed has been shown to be affected by journal prestige and self-citation but also by public funding of research. Amidst these viewpoints, this paper explores the speed of article citation of a multi-institutional, multi-disciplinary publicly funded research center relative to that of a comparison group of articles. Results indicate that articles by authors affiliated with the center are significantly more likely to be cited within the year of publication than articles in the random comparison group, with controls by field also being significant. Implications for the ability of a publicly funded center to attract attention toward articles are discussed.

Keywords: citation; research center; speed; self-citation

JEL C25
Introduction

Citations are commonly used by research sponsors and policy makers to assess science and technology investment. This use of citations occurs in spite of decades of research on the nuances of applying citations. The use of citations to measure science quality has been discussed as providing incentives for self-citation (Glanzel et al., 2006). Interpretations of the meaning of citations have been found to depend on several factors and characteristics: time, field, journal, article, author/reader, publication availability, and technical problems in correctly citing references (Garfield 1973). Rafols et al. (2012) demonstrate that citations are biased toward single-discipline research.

Citations may be viewed from two perspectives on how they can be used: as cognitive scientific credit giving versus as rhetorical influencing of colleagues (Merton 1973, Cozzens 1989). Analyses of the semantic positioning of citations in an article add further interpretation to these nuances. Citations may be central or peripheral to the article; negative citations may be used; highly cited works may consistently be cited; and citations may appear in different sections of the article to bolster background, relevant work, theory, data source, or methodology (Bornmann and Daniel 2008).

Surveys have also been used to ask about motivations for citation. The extent to which references emphasize exemplary citations versus creative works, and reviews of prior work versus new concepts, has been explored, for example, in a study of psychology journals by Shadish et al. (1995). Surveys also have been used to understand why certain works do not get cited. White and Wang (2000) found that the relevance of the research to the topic was the most important factor in whether or not it was cited. These authors and MacRoberts and MacRoberts (1996) emphasize that many influences on research do not end up in the cited reference list.
This paper proposes the usefulness of conceiving of citations as a method of knowledge diffusion (Leydesdorff, 1998). To this end, it is reasonable to consider the extent and speed of knowledge diffusion in scientific investments.

In examining the diffusion of research through citations, several factors should be considered. Bornmann and Daniel (2010) find that articles published in a prestigious journal are likely to receive their first citations more quickly than those rejected by this journal and published elsewhere. In contrast, Rogers (2010) finds that highly cited nanotechnology articles received both quick and lagged citations.

Does receipt of public sponsorship lead to quicker citation? Lewison and Cunningham (1991) observe that papers from European Community research programs in biotechnology and environmental chemicals were more likely to receive citations in the five years after publication than other papers in the same journals. This early citation rate was particularly evident for biotechnology, which was more likely to have a higher share of its articles in earlier years than the comparison group.

In looking at the timing of receipt of citations, self-citation is an issue. The influence of authors' citing of their own work has been shown to diminish as time passes, suggesting that self-citations are apt to play a greater role in studies with shorter rather than lengthier time horizons. Moreover, the prevalence of self-citation is liable to be greater for articles with more authors than for those with fewer authors (Rousseau 1999; Aksnes 2003; Schubert et al. 2006).

This paper explores the pacing of citation of publications to assess the effects of research organization. Specifically, the paper examines the influence of a multi-institutional, multidisciplinary research center's authored articles in terms of speed of article citation. Centers may provide an institutional framework for faster research diffusion (Youtie et al., 2012) although
contrasting findings suggest that centers are important because they involve excellent researchers, rather than because of their organizational resources per se (Rogers 2012).

Amidst these perspectives, this work proposes that the timing of citations is an important consideration in understanding the value of investments in science centers. Specifically, it posits that center-affiliated authors' articles are more likely to be cited within a year of publication than a comparison group of articles written by authors not located in a given center. This proposition is related to the work on immediacy, which is commonly studied as a feature of journal citation, for example, in the Thomson Reuters Immediacy Index (http://admin-apps.webofknowledge.com/jcr/help/h_immedindex.htm, accessed on March 7, 2013).

Method

Citation distributions over time of 87 articles authored by members of a multi-institution, multidisciplinary center are contrasted with those of 88 articles authored by a comparison group. The data in this paper draw on the experience of a given publicly funded multidisciplinary center (hereafter "Center" or "the center"). The Center's name and identifying attributes cannot be disclosed because of the nature of the author's relationship to the center. Even so, the data presented here are valid and genuine data assembled from articles that Center researchers authored.

This analysis uses articles authored by Center members as indicated in the center's annual reports to its sponsor. As with many centers, Center submits annual reports that include lists of publications. The focal dataset for this analysis is 177 peer-reviewed articles listed in the Center's annual reports for its first five years of operation. In December 2009, researchers extracted the articles listed, removed duplicates, and searched for the resulting articles in three
indexed datasets: (1) Web of Science (WOS), Science and Social Science Citation Index, 2005-2009; (2) ERIC; and (3) PubMed. Searching for Center articles in an indexed source allows the same source to be used to create an appropriate comparison group. WOS was selected as the primary dataset because fewer Center articles were available in ERIC and PubMed. Only 49 articles were indexed in PubMed and 53 in ERIC, whereas WOS had 87 articles. Most of the articles that were in ERIC or PubMed were also in WOS. WOS articles accounted for 67 percent of all Center publications after excluding books, items under review or in press, and non-article publications such as rejoinders or technical reports. This article coverage rate by WOS is comparable to what was reported in a parallel study of 15 nanoscale science and engineering centers (Rogers et al., 2012).

The citations of the center are assessed relative to a comparison group. The comparison group was developed by extracting articles from the same journals as the top Center journal subject categories for the same 2005-2009 time period, then selecting a random sample of 88 articles matched on journal and publication year. The aim of this dataset is to provide a non-center framework against which to understand the pacing of citations of articles associated with the center. This random sample comparison group has characteristics similar to those of the center focal dataset. For example, the publication year distribution is similar for the center focal dataset and the random comparison group, albeit with a somewhat higher concentration of publications in 2006 and a lesser concentration of publications in 2008. In addition, all journal titles in the comparison group are also in the center focal dataset. The center by design is more interdisciplinary than the comparison group, however.
Indeed, it has higher interdisciplinarity measures as represented in specialization and integration scores (specialization=.27 for the Center versus .40 for the comparison group, where 1=highly specialized and 0=not specialized;
integration=.7 for the Center versus .63 for the comparison, where 1=highly integrated and 0=not integrated) (Porter et al. 2007).

An initial examination of the citation patterns of the center and comparison group suggests that the latter forms a reasonable comparison group for assessing citations of center publications. Small distributional differences in citation patterns are evident: the center has somewhat more papers that have attracted at least 50 citations (mean=9.6, median=2), whereas the comparison group has slightly fewer zero-cited papers (mean=8.3, median=3) (Figure 1). Nonetheless, plots of the citation distributions for the center and comparison group appear similar (Mann-Whitney U test, p>.10).

INSERT FIGURE 1 NEAR HERE

This paper posits a model that conceptualizes quick citations as a function of center-related authorship, along with the number of authors, year of publication, and field dummy variables. The analysis controls for field effects given that the center comprises three main fields: psychology, neuroscience, and educational research. It also controls for the number of authors and year of publication as guided by previous research. In addition, self-citations are considered given previous work on the role of self-citation in quick first citations (Rousseau 1999; Aksnes 2003; Schubert et al. 2006). This model is represented as:

QUICKCITES = f(CENTER, NUMBEROFAUTHORS, PUBYEAR, FIELD),

where QUICKCITES represents the share of articles with a citation in the year in which the article was published. CENTER takes the value of 1 for publications authored by center-affiliated investigators and zero for publications in the comparison group. NUMBEROFAUTHORS represents the total number of authors of a given publication. PUBYEAR represents the year of publication. Field controls are represented through a series of dummies: PSYCHOLOGY=1 for articles in journals in psychology-related subject categories, zero otherwise; NEUROSCIENCE=1 for articles in journals in neuroscience-related subject categories, zero otherwise; and EDUCATION serves as the reference category.

A variable QUICKNOCITES proxies the effect of self-citations. QUICKNOCITES takes the difference between the share of all citations in the first year of publication and the share of self-citations in the same period. The results form the basis for a second dependent variable in which a given publication that is cited in the same year of publication is not considered to be quickly cited when that citation came from one or more of the authors of the given publication.

Results

Descriptive information about the variables indicates some noteworthy bivariate differences between center and non-center articles (Table 1). Publications authored by center-affiliated investigators have a slightly higher proportion of early citations (.44 for Center versus .36 for the comparison group). In contrast, the comparison group has slightly more authors per paper (3.16 for Center versus 4.17 for the comparison group). The comparison group also has a lower share of non-self cited articles (.37 for Center versus .24 for the comparison group).

A correlation matrix was developed to examine the extent of association between the covariates in the model (see Table 2). Correlation coefficients were found to be relatively small, although a moderate relationship between two of the field control variables is observed.

INSERT TABLES 1 AND 2 NEAR HERE
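To make the two dependent variables defined in the Method section concrete, the following minimal sketch builds same-year citation indicators from a toy set of citation records. The record layout, the helper name `quick_indicators`, and the example data are illustrative assumptions and are not drawn from the Center dataset. Treating a citing paper as a self-citation when it shares at least one author with the cited paper is one reading of the definition above, not a confirmed description of the author's procedure.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for a cited or citing paper."""
    paper_id: str
    pub_year: int
    authors: FrozenSet[str]


def quick_indicators(cited: Paper, citing: List[Paper]) -> Tuple[int, int]:
    """Return (quick, quick_no_self) as 0/1 indicators for one cited paper.

    quick         = 1 if any citing paper appeared in the cited paper's
                    publication year.
    quick_no_self = 1 if at least one of those same-year citing papers
                    shares no author with the cited paper (assumed rule).
    """
    same_year = [c for c in citing if c.pub_year == cited.pub_year]
    quick = 1 if same_year else 0
    non_self = [c for c in same_year if not (c.authors & cited.authors)]
    quick_no_self = 1 if non_self else 0
    return quick, quick_no_self


# Toy data (hypothetical): paper A has only a same-year self-citation,
# paper B has a same-year citation from an unrelated author.
paper_a = Paper("A", 2007, frozenset({"Lee", "Kim"}))
citing_a = [
    Paper("X", 2007, frozenset({"Kim", "Ortiz"})),  # shares author Kim
    Paper("Y", 2008, frozenset({"Chen"})),          # later year
]
paper_b = Paper("B", 2007, frozenset({"Shah"}))
citing_b = [Paper("Z", 2007, frozenset({"Novak"}))]

result_a = quick_indicators(paper_a, citing_a)
result_b = quick_indicators(paper_b, citing_b)
print(result_a)  # (1, 0)
print(result_b)  # (1, 1)
```

Averaging each indicator over a set of papers would give shares comparable in form to the early-citation proportions reported in Table 1.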
Two probit models are estimated. The first model estimates the probability of QUICKCITES (labeled QUICKYEAR in Tables 1 to 3) as a function of CENTER and other covariates. The second model estimates the probability of QUICKNOCITES, also as a function of CENTER and other covariates (Table 3). The likelihood ratio chi square for these models is statistically significant, and McFadden's pseudo R2 is 0.09 for the first model and 0.13 for the second. Sixty-four percent of the QUICKYEAR observations are correctly classified in the first model based on an average of positive and negative predictive value percentages, and 68% of the observations are correctly classified by the second model, which controls for quick self-citations (QUICKNOCITES). Although these results suggest that the likelihood of achieving citations in the first year is weakly predicted by this model, the models' likelihood ratio chi squares are still statistically significant.

INSERT TABLE 3 NEAR HERE

The results indicate that articles by authors affiliated with the center are significantly more likely to be cited within the year of publication than articles in the comparison group. The covariates number of co-authors, field, and publication year are also significant. This relationship between center affiliation and likelihood of quick citations is not diminished by controlling for quick self-citations in the second model, although the number of authors is no longer significant in the second model.

The marginal effects of the probit models, computed at the means of the independent variables, give further indication of the relationship between center affiliation and quick citations. The probability of a quick citation is 18% higher for publications authored by center-affiliated investigators. If self-citations are accounted for (i.e., the second model with
QUICKNOCITES as the dependent variable), the probability of a quick citation is 25% higher for publications authored by center-affiliated investigators.

It could be argued that the comparison group includes authors who are part of science centers, though not the one under investigation. Put another way, the comparison group could still have a center effect. This possibility was explored by looking up information from author web sites and published curricula vitae. Seventeen of the 88 comparison group articles were found to have authors affiliated with science centers, none of which were with the same science center. This feature was incorporated into the model as an explanatory variable, COMPARISONCENTER. Ten of these 17 articles (59%) attracted citations in the first year. When the COMPARISONCENTER variable is included in the model, both COMPARISONCENTER and CENTER are statistically significant along with the other covariates, although COMPARISONCENTER is not significant when self-citations are accounted for (Table 3). This finding suggests that the relationship between publications from authors in a research center and quick citation is not just a function of the center in question, but also of other centers operating in similar research domains.

Conclusions

This work examined the relationship between authors associated with a multi-institutional, multidisciplinary science center and the likelihood that their publications will be quickly cited. The results have shown that articles by authors affiliated with the center are significantly more likely to be cited within the year of publication than articles in the comparison group, even taking into account controls such as number of authors, year of publication, and field effects. This relationship is not diminished by taking quick self-citations into consideration. As such it
builds on existing citation timing studies and the work of Lewison and Cunningham (1991) concerning the effect of public investments on this timing.

It could be argued that this relationship between center affiliation and quick citation is to be expected because of the agglomeration of researchers involved in a center. The rationale could be that centers have more researchers who, although not authors, could become aware of and therefore cite articles written by their center colleagues. Normal center communications such as regular meetings, retreats, all-hands meetings, and reporting requirements could result in center colleagues becoming aware of and consequently citing their colleagues' papers. This is not necessarily the case, however. Only three of the 38 center-authored articles cited in the year they were published were cited by center colleagues' articles.

Centers are able to make available several mechanisms for disseminating research papers of affiliated investigators. Centers typically have websites that promote new research results, often on the home page. Center funding sponsors often promote findings of their investigators in news releases on the sponsor's websites and through gatherings of principal investigators who have received funding from the sponsors and who typically are asked to make brief presentations or posters summarizing results. The particular center in this study also made an effort to disseminate its work in popular journals and magazines, policy documents and congressional testimony, newspapers, radio, and television newscasts. Blogs were shown to pick up information about center research as well. Centers also have funding to allow for travel of their affiliated faculty and students to other locales and to conferences.
The extent to which these mechanisms were useful in encouraging a quick citation of a paper authored by center-affiliated investigators could be looked into in future research through surveys of or interviews with citing authors.
Although the models presented in this paper are statistically significant, they also suggest that additional factors beyond those considered in this paper have an effect on the likelihood that a paper will receive a speedy citation by another paper. The ability of a journal to post articles online first could be one such factor in the ease of a paper's obtaining an early citation. In this vein, the growing use of open access policies for dissemination of research could be significant. Van Raan's "sleeping beauties" thesis further suggests the possibility that a few of the articles that did not attract many citations in the short term may well attract more when reassessed over a longer period (van Raan 2004). Future studies should consider investigating the significance of these and other factors for the pacing of research dissemination and usage.

Policymakers and funders commonly use citations to assess science and technology research investments. If citations were used to assess the effect of the center versus a comparison group as was illustrated in Figure 1, an observer would conclude that the investment in the center had no effect from a citation standpoint over and above what is evidenced in the randomly selected set of publications. Yet Glanzel and colleagues point out, "If the citation rate of a given paper set is low, bibliometrics cannot immediately conclude on the quality of underlying research" (Glanzel et al., 2006, p. 268). This research indicates that the center did have a nuanced effect in that its work flowed more quickly to and was used by the relevant research community. What's more, self-citations were less of a factor in this faster flow of research knowledge. Policy makers acknowledge that research is a long-range investment not to be expected to yield significant results in the near term. Yet they often express a need for rapid outcomes to justify their research investments.
To the extent that short-term results are necessitated, citation
pacing indicators can provide value.
References

Aksnes, D. (2003). A macro study of self-citation. Scientometrics, 56(2), 235-246.

Bornmann, L. & Daniel, H.-D. (2008). What do citation counts measure? A review of studies on citing behavior. Journal of Documentation, 64(1), 45-80.

Bornmann, L. & Daniel, H.-D. (2010). Citation speed as a measure to predict the attention an article receives: An investigation of the validity of editorial decisions at Angewandte Chemie International Edition. Journal of Informetrics, 4, 83-88.

Cozzens, S. (1989). What Do Citations Count? The Rhetoric-First Model. Scientometrics, 15(5-6), 437-447.

Garfield, E. (1973). Citation Analysis as a Tool in Journal Evaluation. Science, New Series, 178, 471-479.

Glanzel, W. & Thijs, B. (2004). The influence of author self-citations on bibliometric macro indicators. Scientometrics, 59(3), 281-310.

Hargens, L.L. (2000). Using the Literature: Reference Networks, Reference Contexts, and the Social Structure of Scholarship. American Sociological Review, 65(6), 846-865.

Leydesdorff, L. (1998). Theories of citation? Scientometrics, 43(1), 5-25.

Lewison, G. & Cunningham, P. (1991). Bibliometric Studies for the Evaluation of Trans-National Research. Scientometrics, 21, 223-244.

MacRoberts, M. & MacRoberts, B. (1996). Problems of Citation Analysis. Scientometrics, 36(3), 435-444.

Merton, R.K. (1973). The Matthew Effect in Science. In Merton, R.K. (Ed.), The Sociology of Science: Theoretical and Empirical Investigations. Chicago: Chicago University Press, 439-459.

Porter, A.L., Cohen, A.S., Roessner, J.D. & Perreault, M. (2007). Measuring researcher interdisciplinarity. Scientometrics, 72(1), 117-147.

Rafols, I., Leydesdorff, L., O'Hare, A., Nightingale, P. & Stirling, A. (2012). How journal rankings can suppress interdisciplinary research: A comparison between Innovation Studies and Business & Management. Research Policy, 41(7), 1262-1282.

Rogers, J. (2010). Citation analysis of nanotechnology at the field level: implications of R&D evaluation. Research Evaluation, 19(4), 281-290.

Rogers, J. (2012). Is the difference in research center productivity real?: The effect of concentration of human resources in centers. Presentation at the Innovative Methods for Innovation Management and Policy Conference, Beijing, China.

Rogers, J., Youtie, J. & Kay, L. (2012). Program-level assessment of research centers: Contribution of Nanoscale Science and Engineering Centers to US Nanotechnology National Initiative goals. Research Evaluation, 21(5), 368-380.

Rousseau, R. (1999). Temporal differences in self-citation rates of scientific journals. Scientometrics, 44(3), 521-531.

Schubert, A., Glanzel, W. & Thijs, B. (2006). The weight of author self-citations. A fractional approach to self-citation counting. Scientometrics, 67(3), 503-514.

Shadish, W., Tolliver, D., Gray, M. & Gupta, S. (1995). Author Judgements about Works They Cite: Three Studies from Psychology Journals. Social Studies of Science, 25(3), 477-498.

Van Raan (2004). Sleeping Beauties in science. Scientometrics, 59(3), 467-472.

White, M. & Wang, P. (2000). A qualitative study of citing behavior: contributions, criteria, and meta level documentation concerns. Library Quarterly, 67, 122-154.

Youtie, J., Kay, L. & Melkers, J. (2012). Bibliographic Coupling and Network Analysis to Assess Knowledge Coalescence in a Research Center Environment. Atlanta, Georgia: White paper.
Table 1. Descriptive Statistics

Variable               Observations   Mean      Std. Dev.   Min    Max
Center
  QUICKYEAR            87             0.44      0.50        0      1
  NUMBER OF AUTHORS    87             3.16      1.58        1      9
  PSYCHOLOGY           87             0.39      0.49        0      1
  NEUROSCIENCE         87             0.21      0.41        0      1
  PUBYEAR              87             2007.38   1.37        2004   2009
  QUICKNOCITES         87             0.37      0.49        0      1
Comparison
  QUICKYEAR            88             0.36      0.48        0      1
  NUMBER OF AUTHORS    88             4.17      2.47        1      16
  PSYCHOLOGY           88             0.17      0.38        0      1
  NEUROSCIENCE         88             0.63      0.49        0      1
  PUBYEAR              88             2007.14   1.36        2005   2009
  QUICKNOCITES         88             0.24      0.43        0      1
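As a worked illustration of the bivariate difference in Table 1, the sketch below converts the QUICKYEAR shares into approximate article counts and computes the probit coefficient that a model with CENTER as its only regressor would yield. With a single binary regressor, the probit maximum likelihood fit reproduces the two group shares exactly, so the coefficients follow from the inverse standard normal CDF. This is an unadjusted calculation on rounded published shares, offered as a sketch; it is not the Table 3 specification, which also includes number of authors, field, and publication year, and the variable names are illustrative.

```python
from statistics import NormalDist

# Group sizes and QUICKYEAR shares as reported in Table 1 (rounded values).
N_CENTER, N_COMPARISON = 87, 88
SHARE_CENTER, SHARE_COMPARISON = 0.44, 0.36

# Approximate counts of articles cited in their publication year.
implied_center = round(SHARE_CENTER * N_CENTER)
implied_comparison = round(SHARE_COMPARISON * N_COMPARISON)

# Probit with an intercept and a CENTER dummy only:
#   P(quick | comparison) = Phi(b0)
#   P(quick | center)     = Phi(b0 + b_center)
std_normal = NormalDist()
b0 = std_normal.inv_cdf(SHARE_COMPARISON)
b_center = std_normal.inv_cdf(SHARE_CENTER) - b0

print(f"implied quick-cited articles: center={implied_center}, "
      f"comparison={implied_comparison}")
print(f"unadjusted probit: intercept={b0:.3f}, CENTER={b_center:.3f}")
```

The implied center count of 38 matches the 38 center-authored articles cited in their publication year that are mentioned in the Conclusions. The unadjusted CENTER coefficient is smaller than the 0.47 reported in Table 3, which is estimated with the additional covariates.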
Table 2. Correlation Matrix

                   QUICKYEAR   CENTER    NUMBEROFAUTHORS   PSYCHOLOGY   NEUROSCIENCE   PUBYEAR
QUICKYEAR          1
CENTER             0.0747      1
NUMBEROFAUTHORS    0.1929      -0.2372   1
PSYCHOLOGY         0.0623      0.2454    -0.1839           1
NEUROSCIENCE       0.1372      -0.424    0.4475            -0.5276      1
PUBYEAR            0.1287      0.0894    0.0176            -0.0431      -0.1003        1
Table 3. Probit Model of the Relationship between Center Affiliation and Likelihood of Quick Citations

                                                     With COMPARISONCENTER
Variable            QUICKYEAR     QUICKNOCITES      QUICKYEAR     QUICKNOCITES
CENTER              0.47          0.75              0.59          0.85
                    (0.23)**      (0.25)**          (0.24)**      (0.26)***
COMPARISONCENTER                                    0.66          0.55
                                                    (0.37)*       (0.38)
NUMBEROFAUTHORS     0.10          0.08              0.10          0.07
                    (0.05)*       (0.05)            (0.06)*       (0.05)
PSYCHOLOGY          0.63          0.54              0.62          0.53
                    (0.27)**      (0.29)*           (0.27)**      (0.29)**
NEUROSCIENCE        0.76          0.88              0.75          0.87
                    (0.29)***     (0.31)***         (0.29)***     (0.31)***
PUBYEAR             0.15          0.17              0.15          0.16
                    (0.08)**      (0.08)**          (0.08)*       (0.08)*
Constant            -309.69       -334.79           -295.11       -327.32
                    (151.59)**    (158.03)**        (152.56)**    (152.56)**
Pseudo R2           .09           .13               .10           .11
Log Likelihood      -107.6***     -102.5**          -105.98       -95.95
Observations        175           175               175           175

Note: Standard errors in parentheses. 64% correctly classified (QUICKYEAR), 68% correctly classified (QUICKNOCITES); 66% correctly classified (QUICKYEAR WITH COMPARISON), 71% correctly classified (QUICKNOCITES WITH COMPARISON).
* significant at 10%; ** significant at 5%; *** significant at 1%
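The marginal effects cited in the Results section (18% and 25% for CENTER) can be given a rough plausibility check against the Table 3 coefficients. In a probit model, the effect of a regressor on the predicted probability equals its coefficient times a standard normal density value, and that density never exceeds its peak at zero (about 0.399). The sketch below checks that each reported effect stays within this bound. It is a sanity check under that general property only: the underlying data are not available, the exact procedure used for the dummy variable CENTER is not stated in the paper, and the dictionary layout is an illustrative assumption.

```python
from statistics import NormalDist

# Peak of the standard normal density, phi(0), roughly 0.3989.
PHI_MAX = NormalDist().pdf(0.0)

# (CENTER coefficient from Table 3, marginal effect reported in the text)
reported = {
    "QUICKYEAR": (0.47, 0.18),
    "QUICKNOCITES": (0.75, 0.25),
}

within_bound = {}
for outcome, (beta, effect) in reported.items():
    upper_bound = beta * PHI_MAX      # largest effect this coefficient allows
    implied_density = effect / beta   # density value implied by the report
    within_bound[outcome] = effect <= upper_bound
    print(f"{outcome}: bound={upper_bound:.3f}, reported={effect:.2f}, "
          f"implied density={implied_density:.3f}")
```

Both reported effects fall below their bounds, so the published coefficients and marginal effects are mutually consistent under this check. Passing the check does not reproduce the author's calculation.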
Figure 1. Center and Comparison Group by Times Cited

[Figure: two histogram panels, "Center" and "Comparison Group"; x-axis: Times Cited (0 to 250); y-axis: Number Papers (0 to 8).]

Source: Author analysis of 87 center publications and 88 comparison group publications.
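The Method section compares the two citation distributions shown in Figure 1 with a Mann-Whitney U test. The following is a minimal sketch of how such a test could be computed, using midranks for ties and a two-sided normal approximation with a tie correction and no continuity correction. The paper does not state which variant was used, and the citation counts below are hypothetical toy values, not the study data.

```python
from math import sqrt
from statistics import NormalDist
from typing import Dict, List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts: Dict[float, int] = {}
    for value in combined:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    n = n1 + n2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / sqrt(variance)  # z <= 0 because u is the minimum
    p_value = min(1.0, 2 * NormalDist().cdf(z))
    return u, p_value


# Hypothetical times-cited values for illustration only.
center_cites = [0, 0, 2, 2, 5, 9, 54]
comparison_cites = [0, 1, 3, 3, 4, 8, 12]

u_stat, p_value = mann_whitney_u(center_cites, comparison_cites)
print(f"U={u_stat}, two-sided p={p_value:.3f}")
```

With samples as small as this toy example, an exact test would normally be preferred over the normal approximation; the approximation is better suited to groups of the size used in the paper (87 and 88 articles).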