CLARIN - NL. Language Resources and Technology Infrastructure for the Humanities in the Netherlands. Jan Odijk NO-CLARIN Meeting Oslo 18 June 2010

Overview
- The CLARIN-NL Project
- CLARIN Infrastructure
- Targeted Users
- Subprojects
- Relation with CLARIN-EU
- Future
- Conclusions

CLARIN-NL Project
- Project in the Netherlands
- Aims to play a leading role in the creation of the European CLARIN technical infrastructure
- Budget: 9.01M Euro, 2009-2014
- Coordinated by Utrecht University
- 23 participants
- http://www.clarin.nl/

CLARIN-NL: Infrastructure
The CLARIN Infrastructure:
- Will make data and tools at different locations easily accessible via web interfaces and services (CLARIN portal(s) with intelligent searching, browsing, viewing and querying services)
- Will make it possible for non-technical researchers to extract, combine and enrich data (supported by dissemination and training)
- Will make available interoperable data and tools based on existing standards and best practices

CLARIN-NL: For whom?
For researchers who work with language data:
- Humanities
- Linguistics (broadly construed)
- Literary and Theatrical Studies
- Media and Culture
- History
- Political Sciences

CLARIN-NL: For whom?
Current Partners (23)
- Targeted Users
  - Linguistics (10)
  - Culture (2)
  - Lexicography (2)
  - Social History (4)
  - Literature (2)
- Technology Providers
  - Language technology (6)
  - Speech technology (2)
- Data Centres and Service Providers
  - Data Centres (5)
  - Libraries (2)

CLARIN-NL: Subprojects
Infrastructure Implementation
- General
  - Partners: Candidate CLARIN Centres (MPI, MI, INL, DANS)
  - Directly assigned subprojects
  - Provide guidelines / training for others
- Metadata project (.5 yr)
  - Testing CMDI against existing national data
  - Create initial set of required metadata components

CLARIN-NL: Subprojects
Infrastructure Implementation (cont.)
- Infrastructure Implementation (3 yrs)
  - infrastructure services, an open archiving service, registries, federation of centres
  - set up a schema registry, profile matching, ISOCAT maintenance, add relation registry RELCAT
  - coordinate and give guidance for work on web services, wrapper and service bus specification and implementation
  - select workflow tools and experiment with them
- Search&Develop (3 yrs)
  - centralized metadata search
  - distributed content search
  - text-based and structured search

CLARIN-NL: Subprojects
User Survey & Base Line (.5 yr)
- Directly assigned subproject
- User survey
- Interactive interviews
- Current use/non-use of digital data and tools
- Identify causes for non-use
- Identify obstacles for (wider) use

CLARIN-NL: Subprojects
Data Curation & Demonstrator Projects
- Data Curation project
  - adapt an existing resource, making it visible, uniquely referable, accessible via the web, and properly documented
- Demonstrator projects
  - create a documented web application, starting from an existing tool or application, that can be used as a demonstrator
  - it can function as a showcase of functionality CLARIN will support

CLARIN-NL: Subprojects
Data Curation & Demonstrator Projects: Common Goals
- apply standards and best practices and make use of the suggested CLARIN architecture and agreements, esp. CMDI & ISOCAT, to understand their limitations and the requirements for extensions
- establish requirements and desiderata for the CLARIN infrastructure

CLARIN-NL: Subprojects
Data Curation & Demonstrator Projects
- Must involve a targeted user and address the user's research questions
- Open call for subprojects
- Small subprojects (.5 yr / 60k Euro)
- 17 projects submitted, 11 received funding
- Will make available:
  - a range of curated resources
  - a range of showcases of CLARIN functionality
  - evidence-based requirements and desiderata for the CLARIN infrastructure and for supported standards and best practices

CLARIN-NL Subprojects
- INTER-VIEWS project
  - Data curation and search functionality for (spoken) interviews with veteran soldiers (Veteraneninstituut)
- AAM-LR
  - Annotation tool for (field) linguists
  - Mark speech/non-speech
  - Mark different speakers

CLARIN-NL Subprojects
- TTNWW (speech)
  - Implement user-friendly workflow services for indexing and search of (a limited set of) audio and video data
  - For social historians (Aletta, KDC, KADOC, M2P)
- TICClops (Tilburg)
  - Text cleaning, spelling correction and normalisation

CLARIN-NL Subprojects
- Adelheid (Nijmegen)
  - Text cleaning, PoS tagging and lemmatisation of historical Dutch texts (13th century)
  - For historical linguistic research
- Geleerdenbrievenproject (CKCC)
  - Selected in the CLARIN-EU call for humanities and social sciences projects as "the project proposal that [would] best demonstrate the use of LRT and would show the potential of a research infrastructure in the humanities"
  - Enriching scholars' letters with syntactic and semantic annotations
  - In accordance with CLARIN standards
  - For research into the circulation of knowledge in scholars' letters in NL in the 17th century

CLARIN-NL Subprojects
- (LASSY demo)
  - Simple ("Google-style") search interface to automatically parsed text corpora
- TTNWW (text)
  - Implement user-friendly workflow services for enriching text corpora with annotations
  - For literature researchers (Huygens) and archeologists (Salagassos)

CLARIN-NL Subprojects
Standardisation and integration of linguistic data and tools (for linguistic research)
- En Garde/DUELME-LMF (UU)
  - DUELME database of multi-word expressions
- WFT-GTB (Fryske Akademy)
  - Integration of Wurdboek fan 'e Fryske Taal with Integrated Language Data Base
- ADEPT (UG)
  - Adaptation of edit-distance tool for dialect and historical linguistic research

CLARIN-NL Subprojects
Standardisation and integration of linguistic data and tools (for linguistic research)
- MIMORE (MI, UU)
  - Microcomparative Morphosyntax Research Tool
- TDS-Curator (UU)
  - Curation of the Typological Database System
- TQE (RU)
  - Transcription Quality Evaluation
- Sign-LinC (RU)
  - Links lexical databases and annotated corpora of sign languages

CLARIN-NL: Subprojects
Education, Training & Awareness
- Organize conferences / workshops / meetings
- Attend and give presentations at events
- Support events (logistically and financially)
- Support individual researchers in visiting events
- Tutorials and lectures (ISOCAT, CMDI, PIDs, ...), presence at Summer and Winter schools
- Website (with Web 2.0 functionality)
- Newsflashes, newsletters, etc.

CLARIN-NL v. CLARIN-EU
Organizationally
- A CLARIN ERIC is being set up
- NL aims to host the CLARIN ERIC
- Dutch Minister of Education, Culture and Sciences invited his colleagues to join the CLARIN ERIC
- CLARIN-NL has funds to fulfil a leading role of NL in the CLARIN ERIC

CLARIN-NL v. CLARIN-EU
Content-wise
- Complementary to the EU preparatory project
  - Covers not only the preparatory phase but also the implementation phase and the first part of the exploitation phase
  - Carries out activities such as the metadata project and the data curation and demonstrator projects
- Focusing on data and tools from the Netherlands
  - Not covered by the European preparatory project

CLARIN-NL: Future
- Working on priorities for next subprojects (2010-2011)
  - Analyzing current situation, identifying gaps
  - Proposals are being worked out on:
    - form (open call, tender, direct assignment, mix)
    - focus on topics/disciplines
    - budget and timing
  - Decisions expected mid 2010
- Centres of Expertise
  - Centres of expertise are physical or virtual centres that possess a specific type of knowledge and expertise on a topic that is relevant to CLARIN and that have sufficient mass to guarantee the sustainability of this knowledge and expertise.
  - Identify candidates
  - Plan of activities

CLARIN-NL: Future
- Embedding
  - start work on ensuring the longer-term existence of the CLARIN infrastructure
  - embed it in the normal research activities
  - prepare both a governance structure and structural financing
  - in close cooperation with CLARIN-EU
- Continue and intensify educational, training and awareness activities
  - In particular, get the CLARIN infrastructure, and working in a CLARIN-compatible manner, into the regular curricula of universities

CLARIN-NL: Conclusions
- The CLARIN-NL project is an excellent example for other national CLARIN projects
- The mix between a programme and a project
  - provides flexibility (new developments, new players)
  - offers opportunities for defining a few longer-term projects in selected areas to sustain knowledge and expertise built up in the participating institutes
- The data curation and demonstrator projects
  - offer opportunities for testing standards, best practices and the CLARIN architecture
  - will strengthen these and also show their limitations
  - will provide evidence-based arguments for modifications or extensions
  - provide opportunities for influencing a selection of standards and best practices compatible with the existing national data
  - will yield curated data and a range of showcases for explaining and demonstrating the advantages of the CLARIN infrastructure and the new possibilities it will offer

CLARIN-NL: Conclusions
Involvement of targeted users
- cooperation between the targeted users and the technology providers is required, with a central role for the users' research questions
- bringing these different communities together in concrete cooperation projects
- so that the CLARIN infrastructure will provide the functionality that is actually needed by the researchers

CLARIN-NL
Thanks for your attention!
http://www.clarin.nl/

CLARIN-NL Governance
- Executive Board (4)
- National Advisory Panel (17)
- International Advisory Panel (6)
- Board (8)