Mass digitization and digitization projects at the National Library of Florence. Giovanni Bergamin, Biblioteca Nazionale Centrale Firenze

Some definitions. Mass digitization of books (MDB) = conversion of materials (books) on an industrial scale (not just on a large scale); conversion of whole libraries without selecting individual materials. Source: Karen Coyle

Two main MDB projects: Google Books and the Internet Archive (Open Content Alliance)

Some notes on Google Books_1: started in 2004; planned end of the project: 2020. The Google Books aim is the Google aim: to organize the world's information and make it universally accessible and usable. So the content of all published books has to be searchable together with the content of all web pages.

Some notes on Google Books_2: Just how many books are out there? How many books have already been digitized by Google Books? 25-30 million (there are no official statistics)

Numbers...

A famous debate on GB in 2005_1

A famous debate on GB in 2005_2: Jean-Noël Jeanneney, historian and former President of the National Library of France, wrote in 2005: "The promise of Google is enchanting [...]: everyone with access to the Internet can soon view the recorded memory of the ages in the palm of their hand and search this universe in a fraction of a second." However...

A famous debate on GB in 2005_3: We are faced with several possible dangers with respect to: works of various cultural heritages that have fallen into the public domain (the list of priorities will likely weigh in favor of Anglo-Saxon culture); works still under copyright, of which only excerpts, or "snippets," will be offered for the time being (the weight of American publishers may be overwhelming); journals and books disseminating ongoing research (the dominance of work from the United States may become even greater than it is today).

11 years later... According to reliable sources, English accounts for the largest share of the digitized books (close to 50%, out of 450 languages of books in GB). Google's interest in non-English languages is growing.

E.g. Ngram Viewer: the Ngram service is now also available for texts in German, French, Italian, Spanish, Russian, Hebrew and Chinese. Since 2009, new languages have been added year after year.

Some notes on Internet Archive (OCA)_1: The Open Content Alliance (OCA) is a consortium of organizations contributing to a permanent, publicly accessible archive of digitized texts. Its creation was announced in October 2005 by Yahoo!, the Internet Archive, the University of California, the University of Toronto and others. Scanning for the OCA is administered by the Internet Archive, which also provides permanent storage and access through its website.

Some notes on Internet Archive (OCA)_2: More than 8.7 million texts available so far

Some differences
Google Books: privately owned (1 company); huge amount of resources available for digitization research and development
Internet Archive (OCA): consortium between companies and nonprofit institutions; depends on donations and self-financing (limited resources available)

Digitization projects at BNCF (DPB): started in the early 1990s, when a hard disk held 40 megabytes (and there was no WWW)

DPB: faithful copy or searchable text? - 1. Aim of the early projects: enrichment of bibliographic records through the digitization of title pages, tables of contents, etc. (OCR)

DPB: faithful copy or searchable text? - 2. Aim of the following projects: faithful copy for manuscripts, ancient books, maps, etc.

DPB results

Google Books and ProQuest EEB at BNCF - 1
GB: range 1701-1875; ebooks with liquid and searchable text; national project; costs: book circulation (inside the library)
EEB: range up to 1700; faithful copy; BNCF project; costs: none
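
The date ranges on this slide can be read as a simple routing rule by publication year. A minimal sketch, assuming the ranges are inclusive and that the EEB range means everything up to and including 1700; the function name, constants and return labels are hypothetical and not part of either project's tooling:

```python
from typing import Optional

# Year boundaries as given on the slide (assumed inclusive).
EEB_LAST_YEAR = 1700   # EEB: up to 1700
GB_FIRST_YEAR = 1701   # GB: 1701-1875
GB_LAST_YEAR = 1875


def project_for_year(year: int) -> Optional[str]:
    """Return the project whose range covers a publication year, or None."""
    if year <= EEB_LAST_YEAR:
        return "EEB"
    if GB_FIRST_YEAR <= year <= GB_LAST_YEAR:
        return "GB"
    # Later than 1875: outside both ranges stated on the slide.
    return None
```

Under these assumptions the two ranges do not overlap, so each publication year maps to at most one project.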

Google Books and ProQuest EEB at BNCF - 2
GB: scanning location outside the library (in Italy); outcomes: GRIN and free worldwide accessibility
EEB: scanning location inside the library; outcomes: master files and free access from Italian IP addresses, access fee outside Italy, and royalties for BNCF

MD problems (Google): limitations, e.g. size of books, foldouts (possible from 2016). Note: MPOB, Modified Process for Older Books, pre-1700 (color and text)

MD and copyright (orphan works etc.)

BNCF and Wikisource: 2014 agreement between BNCF and Wikimedia Italia for Wikisource. Starting point: public domain books digitized by BNCF. Aim: improve access to digitized books. Results: crowdsourced text correction in Wikisource (the free library that anyone can improve).

How it works

Closing remarks and open questions. MDB: is there an alternative to Google Books? Cooperation with IA and Wikisource. 140-year buffer (orphan works and cooperation with publishers).

Thank you for your patience giovanni.bergamin@gmail.com