Data Citation Principles Workshop, May 16-17, 2011, IQSS at Harvard University


Deep Data Citation Mechanism and Service for Scientific Data: Defining Framework for Biodiversity Data Publishers
Vishwas Chavan, Global Biodiversity Information Facility (GBIF) Secretariat
May 16, 2011

Outline
1. About GBIF
2. GBIF Data Publishing Framework
3. Data Citation
4. Data Citation formulations
5. Waterfall Model of Data Citation

About GBIF

GBIF: Vision and Mission
Vision: A world in which biodiversity information is freely and universally available for science, society, and a sustainable future.
Mission: To be the foremost global resource for biodiversity information, and engender smart solutions for environmental and human well-being.

GBIF Country Participants 2011
Voting Participants: 33
Associate Participants: 23
Associate Participating Organisations: 46
Last updated: 2011-04-11

Growth in Data Records
[Chart: millions of primary biodiversity records; vertical axis from 80 to 290]
Last updated: 2011-04-12

Data Publishing Framework

Why should I publish data? Recognition, Opportunities, Investment

Data Publishing Framework
Cultural change towards free and open access to biodiversity data
Addresses social, technical, and policy concerns
Answers "What is there for me?" for ALL
The Data Publishing Framework is defined as an environment conducive to enabling free and open access to the world's primary biodiversity data. The core purpose of the framework is to overcome socio-political, technical-infrastructural, policy-political, legal, and economic-investment barriers or impediments affecting the discovery and publishing of data.

Infrastructure and Technical; Legal; Policy and Political; Economic; Socio-Cultural
Chavan and Ingwersen (2009), BMC Bioinformatics, 10 (Suppl. 14): S2

DPF: Core Technical Components
Data Citation Mechanisms
Persistent Identifiers
Data Usage Index
Chavan and Ingwersen (2009), BMC Bioinformatics, 10 (Suppl. 14): S2

Data Citation

Data Citations today: Example
Source: GBIF Data Portal, data.gbif.net
Search string: Panthera tigris
Search results: 696 records, from 37 datasets, published by 31 Data Publishers
Date: Thursday, 4 November 2010, Time: 10.03.30
Existing Data Citation style
Please cite this data as follows:
(accessed through GBIF data portal, Mammal specimens, http://data.gbif.org/datasets/resource/559)
(accessed through GBIF data portal, Vertebrate specimens, http://data.gbif.org/datasets/resource/541)
(accessed through GBIF data portal, Natural History Museum Rotterdam, http://data.gbif.org/datasets/resource/693)
(accessed through GBIF data portal, Database Schema for UC Davis Wildlife museum, http://data.gbif.org/datasets/resource/736)
(accessed through GBIF data portal, UNSM Vertebrate Specimens, http://data.gbif.org/datasets/resource/812)
...
Unanswered facts
What was the search string?
How many records were retrieved?
How many Data Publishers contributed to the data?
When was the search carried out?
Who is the original contributor of the data?
Who played what role from collection to publishing?
How can I retrieve the same result?

Data Citation: What is needed?
Deep data citation mechanism
  Recognise ALL with their roles
  Multilayer citation: producer, publisher, aggregator, curator
  Cascading citations: citations within citations
Data Citation Service
  Resolve citation any time
  Discover the underlying data

Data Citation: Challenges
Dealing with dynamic streaming data?
Resolving to human or machine interpretable description of object?
Need for registry of name spaces?
Can metadata standards support multiple GUIDs?
Failure to enforce data citation as a mandatory step in the publishing cycle

Waterfall model of data citation

Data Citation formulations
Types of Publishers: Publisher (individual); Publisher (group of individuals); Institution or Research Group or Consortium
Release / Update frequency: One time release; Frequent updates

Data Citation formulations... Publisher (individual), one time data release
Publisher (YEAR), <Title of the data resource>, <total nos. of records>, published <modes of publishing>, <Primary access point>, released on <release date>, <Persistent Identifier>.
Rumble KJ (1998). Cephalopods of North America. 10023 records, published online, http://www.rumblejk.org/cephna/, released on 31/12/1998, doi:10.4000/iisc.0.00.36.
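As a rough illustration of how the individual-publisher, one time release formulation could be generated from structured metadata, the sketch below fills the template fields in the order shown on the slide. The class name `OneTimeRelease`, its field names, and the function `format_one_time_release` are hypothetical and chosen only for this example; the punctuation follows the Rumble example rather than any published specification.

```python
from dataclasses import dataclass


@dataclass
class OneTimeRelease:
    """Fields of the individual-publisher, one time release template (names are assumptions)."""
    publisher: str
    year: int
    title: str
    record_count: int
    mode: str            # mode of publishing, e.g. "online"
    access_point: str    # primary access point (URL)
    release_date: str    # kept as text, e.g. "31/12/1998"
    persistent_id: str   # any persistent identifier, e.g. a DOI string


def format_one_time_release(c: OneTimeRelease) -> str:
    """Render the fields in the order given by the slide's template."""
    return (
        f"{c.publisher} ({c.year}). {c.title}. "
        f"{c.record_count} records, published {c.mode}, "
        f"{c.access_point}, released on {c.release_date}, "
        f"{c.persistent_id}."
    )


# The slide's own example, re-entered as structured fields.
rumble = OneTimeRelease(
    publisher="Rumble KJ",
    year=1998,
    title="Cephalopods of North America",
    record_count=10023,
    mode="online",
    access_point="http://www.rumblejk.org/cephna/",
    release_date="31/12/1998",
    persistent_id="doi:10.4000/iisc.0.00.36",
)
print(format_one_time_release(rumble))
```

Keeping the fields separate, rather than storing only the finished string, is what would let a citation service later answer questions such as how many records a cited resource held at release.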

Data Citation formulations... Publisher (group of individuals), frequent updates
Publisher 1,... and Publisher n (YEAR). <Title of the data resource>, <total nos. of records>, published <modes of publishing>, <Primary access point>, first released on <release date>, <current version no. or last updated/released on (date)>, <Persistent Identifier>.
Remsen D, Bello J, Sheldon S, Raymond M, and AJK Arino (2005 -). Fishes of the Cape Cod Region, MA, USA. 70089 records, published online, http://www.remsen.net/capecodfishes/, first released on 17/05/2005, last updated on 10/10/2010, doi: 11.3389/mbl.1.11.131.

Data Citation formulations... Institute/Research Group/Consortium, frequent updates
<Publisher as Institution / Research Group / Consortium> <YEAR (Year first published / released -)>, <Title of the data resource>, <total nos. of records>, <Contributed by contributor 1 (role), contributor 2 (role)... contributor n (role)>, <published (modes of publishing)>, <Primary access point>, <version no., or last updated/released on (date)>, <Persistent Identifier>.
Smithsonian National Museum of Natural History (2002 -), Museum Collection Records: Mammals. 579257 records. Contributed by Helgen KM (Principal Investigator, curator, author), Gordon LK (manager, author, curator), Peurach SC (author, manager), Potter CW (manager, author), Carleton MD (curator), Maldonado JE (author, developer), Wilson DE (curator, author), Thorington Jr RW (curator, author, validator), Ludwig CA (manager, developer, author), Lunde DP (author). Published online, http://collections.nmnh.si.edu/search/mammals/, first released on 12/02/2002, last updated on 15/09/2010, doi:17.3377/smi.8.57.965.
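The institutional formulation differs from the others mainly in its "Contributed by" clause, which credits each person with one or more roles. A minimal sketch of that clause, assuming contributors are held as (name, roles) pairs, might look as follows. The type alias `Contributor` and both function names are hypothetical, and only three of the Smithsonian example's contributors are reproduced to keep the example short.

```python
from typing import List, Tuple

# (name, roles played in the data management life cycle); an assumed representation.
Contributor = Tuple[str, List[str]]


def format_contributors(contributors: List[Contributor]) -> str:
    """Render 'Contributed by Name (role, role), Name (role)'."""
    parts = [f"{name} ({', '.join(roles)})" for name, roles in contributors]
    return "Contributed by " + ", ".join(parts)


def names_with_role(contributors: List[Contributor], role: str) -> List[str]:
    """List everyone credited with a given role, e.g. all curators."""
    return [name for name, roles in contributors if role in roles]


# A subset of the contributors named in the Smithsonian example.
nmnh = [
    ("Helgen KM", ["Principal Investigator", "curator", "author"]),
    ("Gordon LK", ["manager", "author", "curator"]),
    ("Carleton MD", ["curator"]),
]
print(format_contributors(nmnh))
print(names_with_role(nmnh, "manager"))
```

Storing roles as data rather than as free text is one way the "recognise ALL with their roles" requirement could be made queryable.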

Waterfall model for Data Citation
Cascading citations
  Citations within citations
  Recognising roles in data management life cycle
Three types of citations
  Publisher determined citations
  User driven citations
  Composite citations
Use of Persistent Identifiers (PI)
  Persistent Identifiers for each citation type
  Support multiple types of PIs: Handles, ARK, PURL, URN, LSID, DOI etc.
Data Citation service
  Registration service: assign PI to citations
  Resolver service: resolve citations from PIs
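The two halves of the proposed Data Citation service, registration and resolution, can be pictured with a small in-memory stand-in. This is only a sketch under stated assumptions: the class `CitationService`, its methods, and the `example-pi` identifier prefix are invented for illustration, and a real service would mint DOIs, Handles, ARKs or similar through an external registry rather than a local counter.

```python
import itertools
from typing import Dict


class CitationService:
    """In-memory stand-in for a registration and resolver service (hypothetical)."""

    def __init__(self, prefix: str = "example-pi") -> None:
        self._prefix = prefix                      # made-up identifier scheme
        self._counter = itertools.count(1)         # next serial number to assign
        self._registry: Dict[str, str] = {}        # PI -> citation text

    def register(self, citation: str) -> str:
        """Registration service: assign a new PI to a citation and store it."""
        pi = f"{self._prefix}:{next(self._counter)}"
        self._registry[pi] = citation
        return pi

    def resolve(self, pi: str) -> str:
        """Resolver service: return the citation registered under a PI."""
        if pi not in self._registry:
            raise LookupError(f"unknown persistent identifier: {pi}")
        return self._registry[pi]


service = CitationService()
pi = service.register("Rumble KJ (1998). Cephalopods of North America.")
print(pi)
print(service.resolve(pi))
```

The same `register` call would serve publisher determined, user driven, and composite citations alike, since each citation type is meant to receive its own persistent identifier.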

Waterfall model for Data Citation: exemplification
Source: GBIF Data Portal, data.gbif.net
Search string: Panthera tigris
Search results: 696 records, from 37 datasets, published by 31 Data Publishers
Date: Thursday, 4 November 2010, Time: 10.03.30
User Search
User driven citation using Persistent Identifier (resolves to the full composite citation, and/or a snapshot of the resultant data can be accessed):
http://data.gbif.net (2010). user doi:09.1111/gbif.9.11.444.
Medium sized composite citation (resolves to the full composite citation, including detailed Publisher determined citations in cascading manner):
http://data.gbif.net (2010). Search string: panthera tigris, 696 records, contributed by 37 data resources, user doi: 09.1111/gbif.9.11.444, accessed on 04/11/2010, 10:03:30. (data resources: doi: 09.1111/lsu.9.11.559, urn:lsid:msu.org:observation:541, http://nhmr.nl/ark:/1205/693xz693, http://hdl.loc.gov/ucd/736, http://purl.unsm.org/unsm/812, urn:gnhm:0-486-1047, ...).

Waterfall model for Data Citation: exemplification
Full length composite citation
User driven citation: http://data.gbif.net (2010). Search string: panthera tigris, 696 records, contributed by 37 data resources, user doi: 09.1111/gbif.9.11.444, accessed on 04/11/2010, 10:03:30.
1. [Institutional dataset, one time release, DOI] Louisiana State University (2007), Museum of Natural Science: Collection of Mammal, 36000 records. Contributed by Patterson DN (Principal Investigator, architect, author), Sandeep PK (author, curator), Fieldman LN (author, developer), Remsen D (curator, validator), published online http://www.museum.lsu.edu/mns/mammcoll.hml, released on October 2007, doi:09.1111/lsu.9.11.559.
2. [Institutional dataset, frequent update, LSID] Michigan State University (2001 -), MSU Vertebrate Collection, 76523 records. Contributed by Cook DK (Principal Investigator, author, curator, validator), Hirsh L (author, architect, developer), Lane MP (manager, author, curator)..., Morris JH (curator), published online http://musuem.msu.edu/researchandcollections/dvnh, first released on 01/10/2001, last updated on 18/01/2010, urn:lsid:msu.org:observation:541.
3. [Multiple authors, frequent update, ARK] Cursada PK, Bello J, and AJK Moelicker (2006), Natural History Museum Rotterdam: Mammal collection, 1123 records, published online, http://www.nlbif.nl/nhmr_mc/, released on 7 July 2006, http://nhmr.nl/ark:/1205/693xz693.
...
37. [Single author, frequent update, Handle] Rumble KJ (1998 -). Vertebrate collection of Rumble 1960-1999. 786 records, published online, http://www.sbnature.org/rumble_collection/, first released on 13/09/1998, last updated on 27/01/2010, http://hdl.oclc.gov/sbnature/5678.
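The cascading step, in which a user driven citation expands into a full length composite citation, can be sketched as a lookup over the persistent identifiers of the contributing data resources. The function `expand_composite` and the dictionary standing in for the resolver are assumptions made for this example, and the publisher citations are abbreviated versions of the slide's entries.

```python
from typing import Dict, List


def expand_composite(user_citation: str,
                     resource_pis: List[str],
                     publisher_citations: Dict[str, str]) -> str:
    """Expand a user driven citation into a numbered full length composite citation.

    Each persistent identifier is looked up in turn; identifiers that cannot
    be resolved are kept visible rather than silently dropped.
    """
    lines = [user_citation]
    for number, pi in enumerate(resource_pis, start=1):
        detail = publisher_citations.get(pi, f"[unresolved: {pi}]")
        lines.append(f"{number}. {detail}")
    return "\n".join(lines)


# Abbreviated publisher determined citations, keyed by persistent identifier.
resolver = {
    "urn:lsid:msu.org:observation:541":
        "Michigan State University (2001 -), MSU Vertebrate Collection, 76523 records.",
    "http://nhmr.nl/ark:/1205/693xz693":
        "Cursada PK, Bello J, and AJK Moelicker (2006), "
        "Natural History Museum Rotterdam: Mammal collection, 1123 records.",
}

user_citation = ("http://data.gbif.net (2010). Search string: panthera tigris, "
                 "696 records, user doi: 09.1111/gbif.9.11.444.")
pis = ["urn:lsid:msu.org:observation:541",
       "http://nhmr.nl/ark:/1205/693xz693",
       "http://purl.unsm.org/unsm/812"]
print(expand_composite(user_citation, pis, resolver))
```

Because the lookup is keyed on the identifier string alone, the same expansion works whether a resource is identified by a DOI, LSID, ARK, Handle, or PURL.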

Waterfall model for Data Citation: How will this happen?
Publisher determined citations
  Detailed citation as part of metadata document, and/or
  Register citation at Citation Service
  Persistent Identifier is assigned to metadata document and/or citations alone
User driven citations
  Search data through Publisher access point
  Single dataset: search result together with Publisher determined citation
  Multiple datasets: search result together with all datasets
  User writes user driven string of citation
  User registers citation with Citation Service
  User archives snapshot of search linked to user driven citation

Process for Publisher determined citations
[Diagram labels: Database Administrator(s); Citation Service(s); Persistent Identifier; Metadata Catalogue(s); Discovery Platforms; Aggregators; Networks; Information Systems]

Process for User driven citations
[Diagram labels: Archive dataset used; User access data; Citation Service returns PI; Finalise dataset for use; Use Citation Service]

Wish List for Data Citation
Best practice guide for data citation
Persistent identifiers to datasets
Credit to all players from data producers to publishers, aggregators etc.
All levels of granularity and combinations
With or without annotations
Link between traditional literature and data
Coordinated citation support for ALL
Research metrics for datasets

Impact of Data Citation
[Diagram labels: Data Use; Data Citation; Data Preservation; Data Discovery; Data Publishing]

Data Publishing = Scholarly Publishing! Email: vchavan@gbif.org