Scientific and technical foundation for altmetrics in the US William Gunn, Ph.D. Head of Academic Outreach Mendeley @mrgunn https://orcid.org/0000-0002-3555-2054
Why altmetrics?
http://www.stm-assoc.org/2009_10_13_mwc_stm_report.pdf
978 data repositories
19 funder policies
16 data journals
New forms of scholarship need new metrics.
Problems with Impact Factor
Problems with Impact Factor

| Country        | Documents | Citable documents | Citations   | Self-citations | Citations per document | H index |
|----------------|-----------|-------------------|-------------|----------------|------------------------|---------|
| United States  | 7,063,329 | 6,672,307         | 129,540,193 | 62,480,425     | 20.45                  | 1,380   |
| China          | 2,680,395 | 2,655,272         | 11,253,119  | 6,127,507      | 6.17                   | 385     |
| United Kingdom | 1,918,650 | 1,763,766         | 31,393,290  | 7,513,112      | 18.29                  | 851     |
| Germany        | 1,782,920 | 1,704,566         | 25,848,738  | 6,852,785      | 16.16                  | 740     |
| Japan          | 1,776,473 | 1,734,289         | 20,347,377  | 6,073,934      | 12.11                  | 635     |
| France         | 1,283,370 | 1,229,376         | 17,870,597  | 4,151,730      | 15.6                   | 681     |
| Canada         | 993,461   | 946,493           | 15,696,168  | 3,050,504      | 18.5                   | 658     |
| Italy          | 959,688   | 909,701           | 12,719,572  | 2,976,533      | 15.26                  | 588     |
| Spain          | 759,811   | 715,452           | 8,688,942   | 2,212,008      | 13.89                  | 476     |
| India          | 750,777   | 716,232           | 4,528,302   | 1,585,248      | 7.99                   | 301     |
Problems with Impact Factor During discussions with Thomson Scientific over which article types in PLoS Medicine the company deems as citable, it became clear that the process of determining a journal's impact factor is unscientific and arbitrary. http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0030291
There is no correlation between the number of citations an article receives and the impact factor of the journal.
http://www.bmj.com/content/314/7079/497.1.full
The higher the impact factor, the more likely the research is to be retracted, partly due to intense competition.
http://bjoern.brembs.net/news766.html
What matters is who is reading your work!
Adams, Jonathan. "Collaborations: the fourth age of research." Nature 497.7451 (2013): 557-560.
King, Christopher (2012) Thomson Reuters Annual Report http://ar.thomsonreuters.com/_files/pdf/multiauthorpapers_chrisking.pdf
What are altmetrics? Research has impacts beyond authors.
http://dx.doi.org/10.3789/isqv25no2.2013.04
Citations are slow
Research is fast
Readership vs. citations:
- it comes with a payload of metadata
- it accrues faster
- it illuminates previously hidden impact
Install Mendeley Desktop: Mendeley extracts research data and aggregates it in the cloud, collecting rich signals from domain experts.
Defining readership
Each document addition is a read, stamped with metadata describing the context of the read event. A read is like a citation, but it is faster and captures more.
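The read-event definition above can be sketched as a simple record. This is a hypothetical illustration, not Mendeley's actual schema; field names such as `academic_status` and `discipline`, and the sample values, are assumptions for the sake of the example.

```python
from datetime import datetime, timezone

# Hypothetical sketch of a "read event": a document addition stamped
# with metadata describing the context of the read. Field names and
# values are illustrative, not Mendeley's real schema.
read_event = {
    "doi": "10.1371/journal.pmed.0030291",  # which document was added
    "timestamp": datetime(2014, 5, 1, tzinfo=timezone.utc).isoformat(),
    "reader": {
        "academic_status": "PhD Student",    # context a citation never carries
        "discipline": "Biological Sciences",
        "country": "US",
    },
}

# Unlike a citation, the event records who read the paper and when,
# and it is captured at read time rather than years later.
print(read_event["reader"]["discipline"])
```

The point of the sketch is the payload: each read arrives with contextual metadata, which is what makes readership faster and richer than citation counts.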
Professors on Mendeley tend to be in applied math, stats, and physics. http://dx.doi.org/10.6084/m9.figshare.1041819 Graduate students on Mendeley tend to be in engineering disciplines.
Cell Biology and Neuroscience are highly active disciplines, relative to their output. http://dx.doi.org/10.6084/m9.figshare.1041819 Social sciences are highly active relative to their citations/paper
altmetrics show broader impact http://arxiv.org/html/1203.4745v1
Issues To Be Addressed:
- Identity
- Privacy
- Attribution
- Gaming
- Standards / best practice
- Filtration
Consistency is key http://www.niso.org/publications/isq/2013/v25no2/chamberlain/
NISO Altmetrics Standards: participating organizations include the Alfred P. Sloan Foundation, American Library Association, California Institute of Technology, Center for Research Libraries, EBSCO, Elsevier, Harvard, Internet Archive, Wiley, Library and Information Technology Association, Library of Congress, Los Alamos National Laboratory, National Institutes of Health, National Library of Medicine, OCLC, Princeton, Columbia, Smithsonian, and Stanford.
NISO Altmetrics Standards
Types of sources: Mendeley, Twitter, views, downloads, GitHub
Quality of sources: collection, reporting, and aggregation methods; provenance; availability
Use cases: discovery and assessment (of people and objects)
Types of sources
Most altmetrics providers use the following:
- Page views or downloads
- Mendeley readers (articles only, for now)
- Tweets
- Comments: blog posts, PubMed Commons
- GitHub
Quality of sources
Collection methods vary and counts are inconsistent; further study is needed. For reporting, transparency is key: show the raw data, not just a derived number. Aggregation of raw data is generally done by the recipient (institution, funder, publisher, author, etc.) according to their needs, instead of relying on one central source.
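The recipient-side aggregation described above can be sketched as follows. The event list, source names, and weights are invented for illustration; the point is that two recipients can derive different summaries from the same raw data.

```python
from collections import Counter

# Hypothetical raw per-article events, as a provider might expose them.
# Source names and event types are made up for this sketch.
raw_events = [
    {"source": "mendeley",   "type": "reader"},
    {"source": "mendeley",   "type": "reader"},
    {"source": "twitter",    "type": "tweet"},
    {"source": "repository", "type": "download"},
    {"source": "repository", "type": "download"},
    {"source": "repository", "type": "download"},
]

# One recipient may only want raw counts per source...
counts = Counter(event["source"] for event in raw_events)

# ...while another weights scholarly signals more heavily.
# These weights are arbitrary, chosen only to show the idea.
weights = {"mendeley": 3, "twitter": 1, "repository": 2}
score = sum(weights[event["source"]] for event in raw_events)

print(counts)  # per-source tallies
print(score)   # one recipient's weighted view of the same data
```

This is why the slide argues for exposing raw data rather than a single derived number: the weighting is a policy decision that belongs with the recipient, not the provider.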
Quality of sources
Understanding and open reporting of provenance is important for community buy-in and long-term stability. Raw data should be available under an open license, via API, with identifiers. Identifiers include DOI (object), ORCID (person), and ISNI/Ringgold (institution). Example: this person (ORCID), at this institution (ISNI), released this object (DOI).
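The example assertion on this slide can be modeled as a small linked record. This is a sketch only: the ISNI value is a placeholder, and the ORCID and DOI are reused from elsewhere in this deck purely for illustration.

```python
# Hypothetical model of the slide's provenance assertion:
# "This person (ORCID), at this institution (ISNI), released this object (DOI)."
assertion = {
    "person":      {"scheme": "ORCID", "id": "0000-0002-3555-2054"},
    "institution": {"scheme": "ISNI",  "id": "0000 0000 0000 0000"},  # placeholder, not a real ISNI
    "object":      {"scheme": "DOI",   "id": "10.3789/isqv25no2.2013.04"},
    "relation":    "released",
}

def describe(a):
    """Render the identifier-linked assertion as a readable statement."""
    return (f"Person {a['person']['id']} ({a['person']['scheme']}) at "
            f"institution {a['institution']['id']} ({a['institution']['scheme']}) "
            f"{a['relation']} object {a['object']['id']} ({a['object']['scheme']}).")

print(describe(assertion))
```

Because each party in the statement carries a scheme plus identifier rather than a bare name, the assertion stays machine-resolvable, which is what makes open, API-accessible raw data auditable.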
Use cases
Two main use cases exist: discovery and evaluation. Even though the data sources remain variable, discovery can be done now, because accuracy of the numbers matters less in recommendation than in assessment. Precision is important for both.
Next Steps NISO white paper will be in public comment period soon. Working Groups will be established to develop best practices and standards. Pending approval, NISO will issue recommended practice or published standard. NISO to develop training to implement and adopt any recommended standards.
There is no gold standard
Amgen: 47 of 53 landmark oncology publications could not be reproduced.
Bayer: 43 of 67 oncology and cardiovascular projects were based on contradictory results.
Dr. John Ioannidis: of 432 publications purporting to show sex differences in hypertension, multiple sclerosis, or lung cancer, only one data set was reproducible.
http://reproducibilityinitiative.org
www.mendeley.com william.gunn@mendeley.com @mrgunn https://orcid.org/0000-0002-3555-2054
Cloud Library: Home, Work, Mobile
Shared Folder
Mendeley Research Catalog
Read papers and keep track of notes: 470M documents
Taking some misery out of writing
Information Extraction
We are publishing this data to the LOD cloud http://code-research.eu/