The problems of field-normalization of bibliometric data and comparison among research institutions: Recent Developments

The problems of field-normalization of bibliometric data and comparison among research institutions: Recent Developments Domenico MAISANO

Evaluating research output
1. scientific publications (e.g. journal papers, conference proceedings, book chapters, monographs, etc.) addressed to the scientific community;
2. technology transfer applications (e.g. patents, university spin-offs, consulting services, etc.) addressed to industry and the whole socio-economic system;
3. specialized courses and research seminars, popularization of science.
D. Maisano, The problems of field-normalization of bibliometric data, 12th March 2012

Several objects of evaluation
Groups of scientific publications concerning:
- single scientists (competitive examinations for research positions/promotions or funding at universities);
- groups of scientists (e.g. of the same dept., school/faculty, university or research institution);
- scientific journals.

Why journals?
For (at least) two fundamental reasons:
- supporting the subscription and selection decisions of librarians;
- helping authors to see more clearly the characteristics of the showcases (journals) in which the products (papers) of their work have been (or will be) exhibited.

Typical approaches
The ideal solution is probably given by Peer Review committees.
- However, peer review is not always objective and transparent;
- Also, it is not practicable for large-scale evaluations (i.e., those concerning hundreds or even thousands of articles).
Citation Analysis is now used in almost all nations around the globe with a sizeable science enterprise.
- It has been known for more than a century;
- A citation index for science was first described in 1955 by Eugene Garfield;
- It has grown significantly and continuously over the last 25 years (evolution of database systems, the Internet, etc.).

What exactly are citations?
[Figure: a directed graph of eleven papers, P_1 (c_1, r_1) to P_11 (c_11, r_11), connected by citation arrows.]
Fig. 1. (a) Representation of the scientific literature by a graph. (b) Nodes represent scientific papers while arcs represent mutual citations. (c) Regarding the i-th paper (P_i), c_i denotes the total citations received (incoming arrows), while r_i denotes the total citations made, or references (outgoing arrows).
$$C = \sum_{i=1}^{P} c_i = R = \sum_{i=1}^{P} r_i$$
$$\bar{c} = \frac{1}{P}\sum_{i=1}^{P} c_i = \bar{r} = \frac{1}{P}\sum_{i=1}^{P} r_i$$
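The identity C = R can be checked on a small closed set of papers. The sketch below uses a hypothetical list of citation arcs (it is not the graph of Fig. 1) and counts incoming and outgoing arrows per paper; the names `arcs`, `c` and `r` are illustrative only.

```python
from collections import Counter

# Hypothetical citation arcs (citing paper, cited paper); NOT the graph of Fig. 1.
arcs = [(2, 1), (3, 1), (4, 1), (4, 2), (5, 2), (5, 3), (6, 3)]
papers = range(1, 7)

received = Counter(cited for _, cited in arcs)   # incoming arrows per paper
made = Counter(citing for citing, _ in arcs)     # outgoing arrows per paper

c = {p: received[p] for p in papers}  # c_i: citations received by paper i
r = {p: made[p] for p in papers}      # r_i: references made by paper i

C = sum(c.values())
R = sum(r.values())
print(C, R)  # 7 7: every arc is one citation received and one reference made
```

The equality holds only because every citing and cited paper belongs to the same set; in a real database some references point outside the indexed literature.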

Bibliometric Indicators
They generally concern:
- scientific journals;
- single researchers;
- groups of researchers (depts., scientific disciplinary sectors, universities, etc.).

Typical analysis dimensions
- Productivity;
- Impact/Diffusion;
- Regularity;
- Affinities between documents;
- Co-author networks.

Brief excursus of the most popular indicators
1) P: total number of publications (indicator of productivity)
2) C: total number of citations received (indicator of overall impact/diffusion)
3) CPP (= C/P): citations per paper (indicator of average impact/diffusion)
3.1) Impact Factor
3.2) Immediacy Index

CPP: variations on a theme (1)
3.1) (Journal) Impact Factor (JIF)
The JIF is computed over a specific time window. Example with reference year 2003:
- B = 70: number of articles published during the two preceding years (i.e., 2001 and 2002);
- A = 81: number of times these articles were cited by indexed journals during 2003.
JIF = A/B = 81/70 = 1.16
Garfield (1972) showed that this 2-year citation time window is wide enough to include a significant number of citations and dynamic enough to measure the evolution of scientific journals.

CPP: variations on a theme (2)
3.2) Immediacy Index (II)
In a given year, the II of a journal is the average number of citations to the papers it published during that same year. It measures how quickly the average article in a journal is cited.
Example with reference year 2003:
- B = 30: number of articles published during 2003;
- A = 11: number of times these articles were cited by indexed journals during 2003.
II = A/B = 11/30 = 0.37
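The JIF and the Immediacy Index are the same ratio A/B computed over different time windows. A minimal sketch reproducing the two examples (the function name `citations_per_paper` is an illustrative choice, not a standard API):

```python
def citations_per_paper(citations: int, papers: int) -> float:
    """Ratio A/B shared by the JIF and the Immediacy Index."""
    if papers <= 0:
        raise ValueError("the number of papers must be positive")
    return citations / papers

# JIF for 2003: citations in 2003 to the articles published in 2001-2002.
jif_2003 = citations_per_paper(81, 70)
# Immediacy Index for 2003: citations in 2003 to the articles published in 2003.
ii_2003 = citations_per_paper(11, 30)

print(f"JIF = {jif_2003:.2f}")  # JIF = 1.16
print(f"II  = {ii_2003:.2f}")   # II  = 0.37
```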

4) Hirsch (h) index
h is defined as the number such that, for a general group of papers, h papers received at least h citations while the other papers received no more than h citations (Hirsch, 2005).

rank   citations for each publication
1      6
2      4
3      4
4      2
5      1

The h-core consists of the publications ranked 1 to 3 (h = 3).
Two dimensions: publications (productivity) and citations (diffusion/impact).
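The definition translates directly into a ranking procedure: sort the papers by decreasing citations and find the last rank whose citation count is at least equal to the rank. A minimal sketch, applied to the five-paper example (the function name is illustrative):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the paper at this rank still belongs to the h-core
        else:
            break         # every later paper has even fewer citations
    return h

print(h_index([6, 4, 4, 2, 1]))  # 3
```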

Something on the h-index scale of measurement
h is a natural number ($h \in \mathbb{N}$) defined over an ordinal scale, with only equivalence and ordering properties.
The distance, in terms of average effort, between two consecutive classes gradually increases.
[Figure: the h scale, with values 1 to 7.]

The h-index does not allow compositions

Author (a)
paper id   c_i   rank
A          7     1
B          5     2
C          5     3
D          3     4
h = 3

Author (b)
paper id   c_i   rank
E          18    1
F          16    2
G          8     3
H          6     4
I          1     5
h = 4

Author (a) + (b)
paper id   c_i   rank
E          18    1
F          16    2
G          8     3
A          7     4
H          6     5
B          5     6
C          5     7
D          3     8
I          1     9
h = 5
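The lack of composition can be verified numerically: the h-index of the merged portfolio cannot be obtained from the two individual h values. A minimal sketch using the citation counts of the two authors (the helper `h_index` is an illustrative implementation, not taken from the slides):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)

author_a = [7, 5, 5, 3]        # papers A-D
author_b = [18, 16, 8, 6, 1]   # papers E-I

h_a = h_index(author_a)
h_b = h_index(author_b)
h_ab = h_index(author_a + author_b)

print(h_a, h_b, h_ab)  # 3 4 5: the merged value is not the sum 3 + 4 = 7
```

Because the citation counts are sorted in decreasing order, the condition `cites >= rank` holds for a prefix of the ranking only, so counting the ranks that satisfy it yields h.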

5) Indicators considering the reputation of citations
According to the indicators seen so far, every citation has exactly the same prestige as any other.
Other, more complex indicators weigh citations depending on the impact of the citing papers: a paper is important if it is endorsed by important papers.
PageRank-inspired indicators rest on the analogy: websites correspond to scientific papers, links correspond to citations.

Simplifying hypotheses for the previous indicators
- Role of citation.
- Self-citations.
- Co-authorship.
- Citation importance.
- Article types.
- Same scientific field.
- Database reliability.

The need for normalization
- Journals with different productivity (P);
- Scientists with different seniority (Y);
- Research institutions with different staff numbers (N);
- Papers with a different number of (co-)authors (A);
- Articles from (sub-)fields with different citation propensity.

The problem of field-normalization
All the previous indicators must be applied to a set of papers within the same field/discipline.
Different scientific fields may have:
1. different propensities to the practice of citation (citation rates);
2. different maturation times of citation impact (citation accumulation).

Average citations of papers issued in 2000
[Figure: average citations ($\bar{c}$) of papers issued in 2000 for four groups of papers: 49, 13, 7 and 19.]
Warning: do not compare apples with oranges!

Field-based groups of papers
[Figure: three groups of papers, discipline 1 (c_1, r_1), discipline 2 (c_2, r_2) and discipline 3 (c_3, r_3); in general, discipline d (c_d, r_d).]
On average: more references, more citations received (and vice versa).

A common feature of field-normalized indicators
Field-normalized indicators are based on the comparison between:
(1) the number of citations received by the group of publications examined;
(2) a comparison term (CT) given by the expected number of citations received/made by analogous publications in the specific discipline(s) of interest.
CT should represent the citation propensity of the publications of interest.

How to determine CT?
- 1st issue: selection of a reference sample of publications with similar citation propensity (e.g., same journal, superimposed classifications, neighborhood).
- 2nd issue: use of an indicator based on references or on citations received.
- 3rd issue: choice of the central tendency indicator (e.g., mean, median, harmonic mean, etc.).

Sample selection: an unstable balance point!
[Figure: goodness of estimation plotted against sample size, bounded by low representativeness on one side and by confusion among different (sub-)fields on the other.]

CTs for some popular field-normalized indicators

Indicator                                                   1st issue: sample selection technique      2nd issue: refs or cites?   3rd issue: central tendency indicator
MNCS (Waltman et al., 2010)                                 Classification dependent                   Cites                       Mean value
Fractional Citation Counting (Leydesdorff & Opthof, 2010)   Neighborhood (citing papers)               Refs                        Harmonic mean (of the references)
Audience Factor (Zitt & Small, 2008)                        Neighborhood (journals of citing papers)   Refs                        Harmonic mean (of the average references)
SNIP (Moed, 2009)                                           Neighborhood                               Refs                        Mean value

Secondary aspects: (1) the size of the time window for counting the citations received (or made) by the publications examined; (2) the way of determining the neighborhood of a publication, etc.

Some popular field-normalized indicators
Mean Normalized Citation Score (MNCS, alias New Crown Indicator)
$$\mathrm{MNCS} = \frac{1}{P}\sum_{i=1}^{P}\frac{c_i}{e_i}$$
being:
- P the number of publications of interest, denoted by i = 1, ..., P;
- c_i the number of citations of publication i;
- e_i the expected number of citations of publication i, given the field in which publication i has been published.

MNCS (New Crown Indicator)
Example with two publications:
- publication 1, in Mech. Eng. (journals J1, J2, J3): c_1 = 2, e_1 = CPP of Mech. Eng. = 1.2;
- publication 2, in Ind. Eng. (journals J4, J5): c_2 = 1, e_2 = CPP of Ind. Eng. = 1.5.
$$\mathrm{MNCS} = \frac{1}{P}\sum_{i=1}^{P}\frac{c_i}{e_i} = \frac{1}{2}\left(\frac{2}{1.2} + \frac{1}{1.5}\right) = 1.17$$
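The MNCS arithmetic can be sketched as follows, assuming the expected citation rates e_i (here the field CPPs 1.2 and 1.5 of the example) are already known; the function name `mncs` is illustrative.

```python
def mncs(citations, expected):
    """Mean of the per-paper ratios c_i / e_i."""
    if len(citations) != len(expected) or not citations:
        raise ValueError("need one expected value per publication")
    return sum(c / e for c, e in zip(citations, expected)) / len(citations)

# Publication 1 (Mech. Eng.): c = 2, e = 1.2; publication 2 (Ind. Eng.): c = 1, e = 1.5.
score = mncs([2, 1], [1.2, 1.5])
print(f"{score:.2f}")  # 1.17
```

Each ratio is computed per paper before averaging, so a value above 1 means the set is cited more than expected for its fields.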

Some popular field-normalized indicators
Overlaps are very frequent!


Some popular field-normalized indicators
Indicators based on Fractional Citation Counting (FCC)
$$\mathrm{FCC}_i = \sum_{j=1}^{m}\frac{1}{r_{i,j}} = \frac{m}{m \big/ \sum_{j=1}^{m} \frac{1}{r_{i,j}}} = \frac{m}{HM_i}$$
being:
- FCC_i the field-normalized indicator relating to the i-th publication of interest (i = 1, ..., P);
- j the order number of a citation (j = 1, ..., m);
- r_{i,j} the citations made (i.e., references) by the j-th citing document;
- HM_i the harmonic mean of the r_{i,j} values relating to the i-th publication.

FCC (Indicator based on Fractional Citation Counting)

Publication 1 (c_1 = 3)
citing paper   references   fractional contribution
1st            5            1/5
2nd            7            1/7
3rd            6            1/6

Publication 2 (c_2 = 4)
citing paper   references   fractional contribution
1st            3            1/3
2nd            3            1/3
3rd            4            1/4
4th            8            1/8
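The fractional contributions of the example can be summed exactly with rational arithmetic. This is a minimal sketch of the counting rule only (each citation weighs 1/r, where r is the number of references of the citing paper); the function names are illustrative.

```python
from fractions import Fraction

def fcc(refs_of_citing_papers):
    """Sum of 1/r over the citing papers of one publication."""
    return sum(Fraction(1, r) for r in refs_of_citing_papers)

def harmonic_mean(values):
    """m divided by the sum of reciprocals, so that FCC = m / HM."""
    return len(values) / sum(Fraction(1, v) for v in values)

paper_1 = [5, 7, 6]      # references of the three citing papers
paper_2 = [3, 3, 4, 8]   # references of the four citing papers

fcc_1 = fcc(paper_1)
fcc_2 = fcc(paper_2)
print(fcc_1, float(fcc_1))  # 107/210, about 0.51
print(fcc_2, float(fcc_2))  # 25/24, about 1.04
```

Although publication 1 receives three citations and publication 2 four, their fractional scores differ by a factor of about two, because the citers of publication 2 have shorter reference lists.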

Some popular field-normalized indicators
Indicators based on Fractional Citation Counting
- The citations received are normalized before being joined into an aggregated indicator (a priori normalization).
- The technique is adaptive, since the CT is calculated considering the neighborhood of the publication(s) of interest (typically the set of publications citing or being cited by them).

Some popular field-normalized indicators
SNIP (Source Normalized Impact per Paper)
- Applicable to journals
- Implemented by SCOPUS

Publications of interest (and corresponding cites): c_1 = 3, c_2 = 4, c_3 = 2.

Citing publications (and corresponding refs):
citing paper   cited publication   references
1st            1                   r_1 = 5
2nd            1                   r_2 = 7
3rd            1                   r_3 = 6
4th            2                   r_4 = 3
5th            2                   r_5 = 4
6th            2                   r_6 = 8
7th            3                   r_7 = 5
8th            3                   r_8 = 7

RIP = Avg(c_i) = 3
DCP = Avg(r_j) = 5.6
$$\mathrm{SNIP} = K\,\frac{\mathrm{RIP}}{\mathrm{DCP}} = K\,\frac{3}{5.6} = 0.53\,K$$
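The RIP/DCP ratio of the example can be reproduced as below. This sketch follows the simplified slide version only; it is not the procedure used by Scopus, and the constant K is left as a parameter because the slide does not give its value.

```python
from statistics import mean

def snip_sketch(cites_per_paper, refs_of_citing_papers, k=1.0):
    """Simplified SNIP = K * RIP / DCP, with both terms taken as plain averages."""
    rip = mean(cites_per_paper)          # raw impact per paper
    dcp = mean(refs_of_citing_papers)    # citation potential of the citing papers
    return k * rip / dcp

cites = [3, 4, 2]
refs = [5, 7, 6, 3, 4, 8, 5, 7]

ratio = snip_sketch(cites, refs)
print(f"{ratio:.2f} K")  # 0.53 K
```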

A common feature
At the risk of oversimplifying, most of the previous field-normalized indicators can be expressed as CPP/CT, being:
- CPP the citations per paper relating to a set of publications of interest;
- CT a comparison term estimating the citation propensity of citing papers.
Most of the previous indicators apply to homogeneous groups of articles, i.e., articles from the same (sub-)field.
Sometimes the articles to be analyzed are heterogeneous!

Success-index
Requisites:
- simplicity and immediacy of meaning (like the h-index);
- applicability to groups of heterogeneous papers (in terms of citation propensity).

Basic principle of Success-index
The growth of impact/diffusion of a paper, seen as a tree:
- tree: the paper;
- soil: the literature (refs);
- crop: the citations.

The expected crop (CT) will depend on the type of tree (discipline).
[Figure: three types of trees with their expected crops: CT(Oranges), CT(Bananas), CT(Lemons).]

A successful tree (paper) will produce more fruit than expected (CT).
[Figure: six trees, Tree 1 (O), Tree 2 (O), Tree 3 (L), Tree 4 (B), Tree 5 (B) and Tree 6 (O), compared with CT(Oranges), CT(Bananas) and CT(Lemons).]

Example

paper id   c_i   CT_i
A          7     4
B          7     5
C          6     6
D          4     3
E          3     5
F          2     5
G          1     4

success-index = 3
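The count in the example can be sketched as follows. The sketch assumes that a paper is successful when c_i is strictly greater than CT_i: this is the reading that reproduces the value 3 from the tabulated data (paper C, with c_i = CT_i = 6, is then not counted), and the treatment of ties is an assumption, not something the slide states.

```python
def success_core(papers):
    """Ids of the papers whose citations exceed their comparison term.

    Assumption: strict inequality (c_i > CT_i).
    """
    return [pid for pid, cites, ct in papers if cites > ct]

# (paper id, c_i, CT_i) as listed in the example.
example = [
    ("A", 7, 4), ("B", 7, 5), ("C", 6, 6), ("D", 4, 3),
    ("E", 3, 5), ("F", 2, 5), ("G", 1, 4),
]

core = success_core(example)
print(core, len(core))  # ['A', 'B', 'D'] 3
```

Unlike the h-index, each paper is judged against its own threshold, so no ranking of the papers is needed.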

Philosophy of Success-index
Developed from an intuition of Kosmulski (2011), the Success-index makes it possible to isolate a subset of publications, defined as successful papers, within a group of publications examined (e.g., those associated with a scientist or a journal).
The (non-)success state of a paper is relative to the citation propensity of homologous papers.

How to determine CT_i terms?
CT_i is an estimate of the number of citations that the i-th publication, in a certain scientific context and period of time, should potentially achieve.
It is given by the mean/median number of citations made/received by articles close to the publication of interest.

How to determine CT_i terms?
[Figure: goodness of estimation plotted against sample size, bounded by low representativeness on one side and by confusion among different (sub-)fields on the other.]

1st Example: 2 scientists from different disciplines
A Chemist vs. a Computer Scientist
For simplicity and practicality, CT_i will be calculated as the median number of references made by the articles published in the same journal and year as the i-th publication concerned.

1st Example: 2 scientists from different disciplines

S1 (anonymous Chemist)
Paper No.   Year   c_i    CT_i
1           2002   117    27.0
2           2004   109    20.0
3           2004   106    28.0
4           2002   101    58.5
5           2000   52     22.0
6           2003   43     26.0
...
35          2005   1      27.0
36          2004   0      22.0
37          2005   0      27.0
Total (P = 37)     C = 833
Mean               22.5   28.1
Median             9.0    27.0
St.dev.            32.6   9.6
h-index = 15, success-index = 6

S2 (anonymous Computer Scientist)
Paper No.   Year   c_i    CT_i
1           2000   53     18.0
2           2000   46     17.0
3           2000   22     12.0
4           2002   21     17.0
5           2000   20     17.0
6           2000   19     17.0
...
35          2003   1      14.0
36          2004   1      20.0
37          2005   1      16.0
38          2005   1      21.0
39          2001   0      15.0
40          2004   0      25.5
41          2004   0      19.0
42          2005   0      21.0
Total (P = 42)     C = 416
Mean               9.9    21.5
Median             6.5    19.0
St.dev.            11.0   7.1
h-index = 13, success-index = 7

Important properties of Success-index
Scale type
[Figure: the h scale (h ∈ N, values 1 to 7) and the Success scale (Success ∈ N, values 1 to 13), both plotted against effort.]

Important properties of Success-index
Composition of publication portfolios
The success state of a paper originates from (1) its received citations and (2) its specific CT. It does not depend on the statistics of the other publications.
Thanks to the ratio property of the scale, it is possible to aggregate blocks of publications (even from several sub-fields), e.g., papers from different journals, years, scientists, departments, etc.

Important properties of Success-index
Composition of publication portfolios
In the tables below, $CT_i = (\tilde{r}_i)_{JY}$.

Author (a)
paper id   c_i   rank   CT_i
A          7     1      3
B          5     2      4
C          5     3      5
D          3     4      4
h = 3, Success-index = 2

Author (b)
paper id   c_i   rank   CT_i
E          18    1      9
F          16    2      8
G          8     3      6
H          6     4      5
I          1     5      6
h = 4, Success-index = 5

Author (a) + (b)
paper id   c_i   rank   CT_i
E          18    1      9
F          16    2      8
G          8     3      6
A          7     4      3
H          6     5      5
B          5     6      4
C          5     7      5
D          3     8      4
I          1     9      6
h = 5, Success-index = 7

Practical implications
Assuming the same capacity for producing successful papers, the Success-index is expected to increase proportionally with the (different types of) resources used.
It is very versatile for different types of normalization:

Resources                         Normalization
N (staff No.)                     Success/N
Y (years of activity)             Success/Y
P (No. of papers of a journal)    Success/P
I (unit of investment)            Success/I

A (potential) drawback of the Success-index
Excess citations are not taken into account.

Condition 1
paper id   c_i    CT_i
A          7      4
B          7      5
C          6      6
D          4      3
E          3      5
F          2      5
G          1      4
success-index = 3

Condition 2
paper id   c_i    CT_i
A          1000   4
B          700    5
C          6      6
D          4      3
E          3      5
F          2      5
G          1      4
success-index = 3

However, a partial patch for this issue is to use variable CTs (sensitivity analysis).
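The insensitivity to excess citations can be shown by running the same count on both conditions. As before, the sketch assumes the strict criterion c_i > CT_i, which is the reading that reproduces the value 3; the function name is illustrative.

```python
def success_index(papers):
    """Number of papers with c_i strictly above CT_i (assumed criterion)."""
    return sum(1 for cites, ct in papers if cites > ct)

# (c_i, CT_i) pairs for papers A-G under the two conditions.
condition_1 = [(7, 4), (7, 5), (6, 6), (4, 3), (3, 5), (2, 5), (1, 4)]
condition_2 = [(1000, 4), (700, 5), (6, 6), (4, 3), (3, 5), (2, 5), (1, 4)]

for name, papers in (("Condition 1", condition_1), ("Condition 2", condition_2)):
    total = sum(cites for cites, _ in papers)
    print(name, "C =", total, "success-index =", success_index(papers))
# Condition 1 C = 30 success-index = 3
# Condition 2 C = 1716 success-index = 3
```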

Example of sensitivity analysis
Goal: comparison between two publication portfolios (A and B).
CT calculation approach: the CT_i terms relating to individual papers are determined considering the distribution of the references made by the articles published in the same journal and year as the i-th publication concerned.
Sensitivity analysis: apart from the median (5th decile), additional CTs are introduced with reference to other points of the distribution (e.g., the 4th and 6th deciles):
- $CT_i^{(L)}$: 4th decile;
- $CT_i^{(M)} = (\tilde{r}_i)_{JY}$: median (5th decile);
- $CT_i^{(H)}$: 6th decile.
[Figure: distribution of the references, with the three CTs marked.]
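A minimal sketch of this sensitivity analysis, under several assumptions: the reference counts and the portfolio below are invented for illustration, deciles are computed with Python's `statistics.quantiles` using its "inclusive" interpolation (the slides do not specify a quantile method), and a paper counts as successful when c_i is strictly above the threshold.

```python
from statistics import quantiles

# Hypothetical reference counts of the articles in each (journal, year).
refs_by_journal_year = {
    ("J1", 2006): [12, 18, 20, 22, 25, 27, 30, 33, 40, 45],
    ("J2", 2007): [5, 8, 9, 10, 12, 14, 15, 18, 21, 30],
}

# Hypothetical portfolio: (citations received, journal-year of publication).
portfolio = [
    (28, ("J1", 2006)), (24, ("J1", 2006)),
    (13, ("J2", 2007)), (11, ("J2", 2007)), (3, ("J2", 2007)),
]

def comparison_terms(refs):
    """CT at the 4th (L), 5th (M, the median) and 6th (H) decile."""
    deciles = quantiles(refs, n=10, method="inclusive")  # nine cut points
    return {"L": deciles[3], "M": deciles[4], "H": deciles[5]}

def success_index(papers, level):
    """Papers whose citations exceed the CT of their journal-year at this level."""
    return sum(
        1 for cites, key in papers
        if cites > comparison_terms(refs_by_journal_year[key])[level]
    )

profile = {level: success_index(portfolio, level) for level in ("L", "M", "H")}
print(profile)
```

Since the thresholds grow from L to H, the Success-index can only stay equal or decrease along the profile; two portfolios are then compared over the whole profile rather than at a single threshold.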

How to deal with multiple CTs?
[Figure: Success-index of Portfolio A and Portfolio B plotted against the CT threshold (CT^(L), CT^(M), CT^(H)), in order of increasing severity.]

2nd Example: groups of scientists in the same field
Comparison among four groups of scientists from four different Italian universities, within the same disciplinary sector (SSD).
Analysis of their overall scientific production over the period from 2006 to 2008. Let's start with 2006:

2006
Univ.   N    P    C     CPP    h
A       20   11   64    5.8    5
B       15   19   88    4.6    6
C       12   14   166   11.9   6
D       13   29   345   11.9   12

Citation statistics were retrieved from SCOPUS.

2nd Example: groups of scientists in the same field
For each article, a CT is calculated following an approach inspired by Moed's DCP. Next, the Success-indices of each group are calculated:

2006
Univ.   N    P    C     CPP    h    Success-index
A       20   11   64    5.8    5    5
B       15   19   88    4.6    6    5
C       12   14   166   11.9   6    7
D       13   29   345   11.9   12   16

Results concerning 2006
[Figure: two bar charts for universities A, B, C and D: Success (in 2006) and Success/N (in 2006), i.e. the Success-index per capita.]

Analysis in the whole 2006-to-2008 period
Results relating to individual years can be aggregated (property of composition) in order to calculate the Success-indices for the whole 2006-to-2008 period.
[Figure: the Success-indices in 2006, 2007 and 2008 combine into the Success-indices for the whole 2006-to-2008 period.]

Composition
Success-index by university and year:

Year           A    B    C    D
2006           5    5    7    16
2007           2    5    7    12
2008           2    5    4    6
2006 to 2008   9    15   18   34

[Figure: bar charts of Success and Success/N (Success-index per capita) for universities A, B, C and D in 2006, 2007 and 2008.]
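The composition property means that the three-year values are plain sums of the yearly values. A short check on the reported figures (the dictionary layout is an illustrative choice):

```python
# Success-indices per year for universities A-D, as reported above.
yearly = {
    2006: {"A": 5, "B": 5, "C": 7, "D": 16},
    2007: {"A": 2, "B": 5, "C": 7, "D": 12},
    2008: {"A": 2, "B": 5, "C": 4, "D": 6},
}

# Aggregate by summing each university's yearly values.
overall = {
    univ: sum(year_values[univ] for year_values in yearly.values())
    for univ in ("A", "B", "C", "D")
}
print(overall)  # {'A': 9, 'B': 15, 'C': 18, 'D': 34}
```

The sum is valid because the yearly blocks contain disjoint sets of papers, each judged against its own CT.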

Analysis in the whole 2006-to-2008 period
Comparison among four groups of researchers within the same disciplinary sector (SSD) from different Italian universities over the whole period from 2006 to 2008:

2006 to 2008
Univ.   N    P    C     CPP   h    Success-index
A       20   35   158   4.5   8    9
B       15   68   254   3.7   10   15
C       12   58   364   6.3   11   18
D       13   81   678   8.4   13   34
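The per-capita normalization Success/N follows directly from the table. A minimal sketch using the 2006-to-2008 values (variable names are illustrative):

```python
# University: (staff number N, Success-index over 2006 to 2008).
groups = {"A": (20, 9), "B": (15, 15), "C": (12, 18), "D": (13, 34)}

# Success-index per capita: Success / N.
per_capita = {univ: success / n for univ, (n, success) in groups.items()}

for univ, value in sorted(per_capita.items(), key=lambda item: item[1], reverse=True):
    print(f"{univ}: {value:.2f}")
# D: 2.62
# C: 1.50
# B: 1.00
# A: 0.45
```

Dividing by N changes the picture with respect to the raw counts: university A has the largest staff of the four but the lowest per-capita value.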

Results concerning the 2006-to-2008 period
[Figure: two bar charts for universities A, B, C and D: Success (2006 to 2008) and Success/N (2006 to 2008), i.e. the Success-index per capita.]

What about CT distributions?
Statistics relating to the CT distributions in the whole 2006-to-2008 period:

2006 to 2008
Univ.   Mean   Median   St.Dev.   IQR
A       5.2    4.8      1.8       3.4
B       4.6    4.1      2.8       0.9
C       8.0    5.9      12.4      3.9
D       6.1    4.4      3.0       4.6

3rd Example: university depts. within different fields
Comparison between two university departments of PoliTo, within different disciplines.
Analysis of their overall scientific production over the period from 2006 to 2008. Let's start with 2007:

2007
Dept.   N    P     C      CPP   h
D1      88   143   1392   9.7   21
D2      60   47    467    9.9   11

Citation statistics were retrieved from SCOPUS.

3rd Example: university depts. within different fields
For each article, a CT is calculated following an approach inspired by Moed's DCP. Next, the Success-indices of each department are calculated:

2007
Dept.   N    P     C      CPP   h    Success-index
D1      88   143   1392   9.7   21   53
D2      60   47    467    9.9   11   15

What about CT distributions?

Statistics relating to the CT distributions in 2007:

2007
Dept.    Mean    Median    St.Dev.    IQR
D1       10.0    8.7       12.9       3.2
D2       4.8     4.3       2.3        1.9

[Figure: CT distribution in 2007 for D1 and D2; horizontal axis: CT, from 2.7 to 17.8; vertical axis: share of papers, from 0% to 40%]

Results concerning 2007

[Figure: bar charts of Success (in 2007) and Success / N (in 2007), i.e. Success-index per capita, for departments D1 and D2]

Results concerning the 2006-to-2008 period

[Figure: bar charts of Success (2006 to 2008) and Success / N (2006 to 2008), i.e. Success-index per capita, for departments D1 and D2]

4th Example: academic institutions

Comparison among three Italian universities. Analysis of their overall scientific production over the period from 2006 to 2008. Let's start with 2006:

2006
Univ.    N       P      C       CPP     h
U1       837     606    6524    10.8    35
U2       1388    854    9932    11.6    42
U3       316     185    2227    12.0    22

Citation statistics were retrieved from SCOPUS.

4th Example: academic institutions

For each article, a CT is calculated following an approach inspired by Moed's DCP. Next, Success-indices concerning each university are calculated:

2006
Univ.    N       P      C       CPP     h     Success-index
U1       837     606    6524    10.8    35    296
U2       1388    854    9932    11.6    42    429
U3       316     185    2227    12.0    22    92

Results concerning 2006

[Figure: bar charts of Success (in 2006) and Success / N (in 2006), i.e. Success-index per capita, for universities U1, U2 and U3]
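To see how much the choice of indicator matters, the 2006 figures for U1, U2 and U3 can be ranked under several indicators. The sketch below uses only the numbers in the 2006 table; the helper `ranking` and the dictionary layout are illustrative.

```python
# 2006 data for the three universities, from the table.
universities = {
    "U1": {"N": 837, "P": 606, "C": 6524, "h": 35, "success": 296},
    "U2": {"N": 1388, "P": 854, "C": 9932, "h": 42, "success": 429},
    "U3": {"N": 316, "P": 185, "C": 2227, "h": 22, "success": 92},
}


def ranking(score):
    """Return university labels sorted from best to worst under `score`."""
    return sorted(universities, key=lambda u: score(universities[u]), reverse=True)


by_cpp = ranking(lambda d: d["C"] / d["P"])            # citations per paper
by_h = ranking(lambda d: d["h"])                       # h-index
by_success = ranking(lambda d: d["success"])           # Success-index
by_success_per_capita = ranking(lambda d: d["success"] / d["N"])

print("CPP:        ", by_cpp)                 # ['U3', 'U2', 'U1']
print("h:          ", by_h)                   # ['U2', 'U1', 'U3']
print("Success:    ", by_success)             # ['U2', 'U1', 'U3']
print("Success / N:", by_success_per_capita)  # ['U1', 'U2', 'U3']
```

The three universities swap places depending on the indicator: U3 leads on CPP, U2 on h and on the raw Success-index, and U1 on the Success-index per capita.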

Results concerning the 2006-to-2008 period

[Figure: bar charts of Success (2006 to 2008) and Success / N (2006 to 2008), i.e. Success-index per capita, for universities U1, U2 and U3]

What about CT distributions?

Statistics relating to the CT distributions in the whole 2006-to-2008 period:

2006 to 2008
Univ.    Mean    Median    St.Dev.    IQR
U1       6.8     6.2       5.2        4.6
U2       7.0     6.2       4.0        4.6
U3       6.3     5.4       3.6        3.7

[Figure: CT distributions for U1, U2 and U3; horizontal axis: CT, from 0.8 to 21.8; vertical axis: share of papers, from 0% to 30%]

The rush to publish implies more successful papers (?)

[Figure: P and Success for U1 by year, from 2003 to 2009]

Concluding remarks (1)

- Field normalization is often neglected.
- The problem of sub-field normalization when comparing scientific publications within the same field is even subtler and trickier.
- Field normalization is a sine-qua-non condition when evaluating multidisciplinary research institutions.
- In the literature, most of the field-normalized indicators concern journals.

Concluding remarks (2)

Success-index is an indicator:
- of immediate meaning;
- able to deal with multidisciplinary groups of papers;
- applicable to the publication output of entire research institutions.

It is also flexible enough to support other kinds of normalization (e.g., by years of activity, number of staff, etc.).

Open Issues

- A crucial point is the determination of CT, i.e., the estimator of the citing propensity within a specific (sub-)field.
- Probably, the best approaches are those based on the citation propensity of the articles citing a journal.
- An intensive empirical study should be performed to evaluate how robust the possible estimation approaches are.
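The robustness question raised here can be illustrated with a toy estimator. The sketch below assumes, purely for illustration, that the CT of a journal is estimated from the number of references made by each article citing that journal. This is one possible reading of "citation propensity of citing articles", not the procedure used in the presentation, and both the data and the function `estimate_ct` are hypothetical.

```python
from statistics import mean, median


def estimate_ct(reference_counts, estimator=mean):
    """Estimate a CT from the reference counts of the citing articles.

    `estimator` is any function mapping a list of numbers to one number.
    """
    if not reference_counts:
        raise ValueError("at least one citing article is required")
    return estimator(reference_counts)


# Hypothetical reference counts of six articles citing the same journal;
# the last one stands for a review article with a very long reference list.
citing_article_refs = [18, 22, 25, 30, 35, 120]

ct_mean = estimate_ct(citing_article_refs)
ct_median = estimate_ct(citing_article_refs, estimator=median)

print(f"CT (mean):   {ct_mean:.1f}")    # CT (mean):   41.7
print(f"CT (median): {ct_median:.1f}")  # CT (median): 27.5
```

In this toy case a single review-like article moves the mean-based CT well above the median-based one, which in turn would change how many papers count as "successful". This is the kind of sensitivity an empirical robustness study would need to quantify.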

Thank you for your attention

e-mail: domenico.maisano@polito.it
website: http://staff.polito.it/fiorenzo.franceschini/maisano_eng.htm