Contextualizing Subject Access Across Digital Collections. The "See Also" Problem

Joseph B. Dalton
The New York Public Library, Research Libraries, Digital Library Program
DLF Fall Forum 2006, Nov. 9, 2006

What We'll Cover
- Overview of Problem
- Some Approaches
- Expanding Subject-Access
- Problems, Opportunities and Challenges

Background: Some Numbers
- NYPL Digital Gallery collections
  - 524,000 images
  - 318,000 bibliographic (item-level) records
- NYPL Digital Library Program uses several subject thesauri
  - LCTGM, LCSH, LCNAF, AAT, GMGPC, etc.
- Number of records containing at least 1 subject heading: 260,000 (81% of total)

The Problem
- 58,000 subject headings indexed for searching NYPL Digital Gallery
- Browsing a list this size is like trying to find a needle in a haystack
  - Lincoln: Lincoln, Abraham
  - Posters depicting circuses: Posters -- Circus or Circus Posters?

Possible Approaches for Subject Browsing
- Parse or index list by subject facets
  - 19th century - French - Posters
  - French - Posters - 19th century
  - Posters - French - 19th century
- Map subjects to selected thesauri and provide cross-references
- Build hierarchical browsing on top of some taxonomy
  - Printing & Graphics > Circus Posters
  - Culture & Society > Posters - French - 19th century
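The first approach, parsing the list by facets, can be illustrated with a minimal sketch. It assumes headings use " - " as the facet separator, as written on the slide (production headings may use a different delimiter), and the helper names `split_facets` and `build_facet_index` are hypothetical, not part of any NYPL system.

```python
from collections import defaultdict


def split_facets(heading):
    """Split a heading on the ' - ' separator (an assumption taken from the slide)."""
    return [part.strip() for part in heading.split(" - ") if part.strip()]


def build_facet_index(headings):
    """Map each facet to the set of full headings that contain it."""
    index = defaultdict(set)
    for heading in headings:
        for facet in split_facets(heading):
            index[facet].add(heading)
    return index


headings = [
    "19th century - French - Posters",
    "French - Posters - 19th century",
    "Posters - French - 19th century",
    "Circus posters",
]

facet_index = build_facet_index(headings)

# All three orderings are reachable from the single facet "Posters".
posters = sorted(facet_index["Posters"])

# An order-independent key shows the three headings carry the same facets.
distinct_facet_sets = {frozenset(split_facets(h)) for h in headings[:3]}
```

Indexing by facet makes the three orderings equivalent for browsing, but a pre-coordinated heading such as "Circus posters" stays a single facet, which is where the cross-reference approach would still be needed.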

Our Approach
- Create separate index of subjects in Lucene
- Index the pointers from those subjects to their associated objects or containers
- Provide front-end context lists, like "subjects in [Collection Title]", by filtering on the objects
- Provide free-text retrieval of subjects through some kind of "subject finder"
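The shape of this approach can be sketched in a few lines. The sketch below uses plain in-memory Python structures rather than Lucene, and the record layout, collection names, and function names are all hypothetical stand-ins for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRecord:
    object_id: str
    collection: str
    subjects: tuple


def build_subject_index(records):
    """Map each subject heading to the IDs of the objects that carry it."""
    index = defaultdict(set)
    for rec in records:
        for subject in rec.subjects:
            index[subject].add(rec.object_id)
    return index


def subjects_in_collection(index, records, collection):
    """Context list: subjects with at least one object in the given collection."""
    in_collection = {r.object_id for r in records if r.collection == collection}
    return sorted(s for s, ids in index.items() if ids & in_collection)


def find_subjects(index, text):
    """Naive 'subject finder': case-insensitive substring match on headings."""
    needle = text.lower()
    return sorted(s for s in index if needle in s.lower())


# Hypothetical sample data.
records = [
    ObjectRecord("img-1", "Circus Collection", ("Circus posters", "Clowns")),
    ObjectRecord("img-2", "Circus Collection", ("Circus posters", "Elephants")),
    ObjectRecord("img-3", "Portraits", ("Lincoln, Abraham",)),
]

index = build_subject_index(records)
circus_subjects = subjects_in_collection(index, records, "Circus Collection")
lincoln_hits = find_subjects(index, "lincoln")
```

Keeping the subject index separate from the object records means a context list is just a set intersection between a subject's pointers and the objects in scope.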

Indexing Field Relationships at Object-Level
- First goal: provide object-based lists
- Gather all subjects and some associated bibliographic identifiers: item-level, title-level, collection-level, etc.

Subject Display by Parent-Title Object

Subject Display by Group of Collections

Indexing Subjects A-Z
- Early assumption: 1 row = 1 subject field per object ID
- Reality: subjects are indexed as multiple fields
- First test searches for quaker* returned:
  - Book jackets
  - Quakers
  - Abolitionists
  - Reformers
  - Baseball
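One plausible way such results arise (an assumption; the slides do not spell out the mechanism) is that the wildcard matches at the object level, and every value in the matching object's multi-valued subject field is then listed. The sketch below contrasts that with matching each subject value on its own row. The data and function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical objects, each with a multi-valued subject field.
objects = {
    "img-10": ["Book jackets", "Quakers", "Abolitionists"],
    "img-11": ["Quakers", "Reformers"],
    "img-12": ["Quakers", "Baseball"],
}


def object_level_search(objects, pattern):
    """Match whole objects, then list every subject on the matching objects."""
    pattern = pattern.lower()
    found = set()
    for subjects in objects.values():
        if any(fnmatchcase(s.lower(), pattern) for s in subjects):
            found.update(subjects)
    return sorted(found)


def subject_level_search(objects, pattern):
    """Flatten to one (subject, object ID) row each, then match the subject value."""
    pattern = pattern.lower()
    rows = [(s, oid) for oid, subjects in objects.items() for s in subjects]
    return sorted({s for s, _ in rows if fnmatchcase(s.lower(), pattern)})


noisy = object_level_search(objects, "quaker*")
clean = subject_level_search(objects, "quaker*")
```

With this sample data, the object-level search returns all five headings while the row-per-subject search returns only "Quakers".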


New Opportunity: Related Subjects
- As we examined this problem, some opportunities emerged:
  - Results might be expanded by each subject's association to its related (item-level) subjects
  - These results resemble a reverse-mapping of NYPL Digital Gallery subjects, derived not from an applied top-level taxonomy but from the objects' descriptions of themselves
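The expansion idea can be sketched as a co-occurrence lookup: two subjects count as "related" when they appear on the same item-level record. This is a minimal illustration with hypothetical data and function names, not the Lucene query used for the Digital Gallery.

```python
from collections import defaultdict

# Hypothetical item-level records and their subject headings.
objects = {
    "img-1": ["Posters", "Circus"],
    "img-2": ["Circus posters", "Clowns"],
    "img-3": ["Sailboats", "Children blowing bubbles"],
}


def build_cooccurrence(objects):
    """Map each subject to the other subjects sharing an object with it."""
    related = defaultdict(set)
    for subjects in objects.values():
        for subject in subjects:
            related[subject].update(s for s in subjects if s != subject)
    return related


def expand(matches, related):
    """Add every co-occurring subject to the basic term-matching results."""
    expanded = set(matches)
    for subject in matches:
        expanded |= related.get(subject, set())
    return sorted(expanded)


related = build_cooccurrence(objects)

# Basic term matching: headings that contain the search term.
basic = sorted(s for s in related if "posters" in s.lower())

# Related-subjects query: the basic matches plus their co-occurring subjects.
expanded = expand(basic, related)
```

Because the relation comes only from shared records, the expanded list grows quickly and inherits whatever the records group together, whether or not the subjects are topically close.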

Example of Basic Term-Matching Results
- 25 subjects for "Posters"

Related Subjects Query Expands Results
- 781 subjects for "Posters"

Example of Related Subjects Query

The Scrapbook Problem
- Frame-of-reference implicitly tied to single images
- In a single object (one image) a dog and cat are considered "related"
- In printer's proofs, scrapbooks of illustrations, multiple plates, etc. this notion can be problematic
- Subjects may share 1 bibliographic reference, but are they "related"?

Related Subjects: "sailboats"
- Children blowing bubbles

Publisher's Proofs

Publisher's Proofs - Detail

The 80/20 Problem
- How much metadata is enough metadata?
- 260,000 out of 318,000 images contain subject headings; however, 50,000 items are virtually invisible to our interface
- If user-experience proves the utility of leveraging subject headings, more staff and $ could be allocated

Expanding Subjects: The UI Problem
- Expanded ("related") subject list doesn't fall easily into familiar categories:
  - Faceted browse display
  - "More like this" context
  - Top-down hierarchical menus
- Additional user-testing needed on GUI
- Front-end processing, though lightweight, is sometimes expensive: more can be done to optimize index and query

The Relevance Problem: Do Subjects Matter?
- Outside of specialized domains (Medline), do researchers still need subjects?
- The "Big Indexers" don't care so much about subjects, they want to index all of your data: scale is where the big gains are in search now, right?
- Good subject-analysis is expensive
- Folksonomies, tags, etc. attempt to describe things the way people think of them

Flickr: "Bad Day, 1445"

NYPL: "Rubric and full-page miniature of..."

"Bad Day, 1445"

Search: violence or torture

Search: "mutilation"


The Answer: [Maybe No Yes]?
- Images, at least, need description, and likely will for the foreseeable future(?)
- Subjects and other controlled vocabularies are good at describing while minimizing noise
- Tags are subjects, just loosely typed?

Some Further References & Acknowledgments
- Krowne, Aaron and Martin Halbert. An Initial Evaluation of Automated Organization for Digital Library Browsing (JCDL '05). 2005.
- Lagoze, Carl, Dean B. Krafft, Sandy Payette, Susan Jesuroga. What is a Digital Library Anymore, Anyway? (D-Lib Magazine). November 2005.
- NISO. A Framework of Guidance for Building Good Digital Collections (2nd Edition). 2004.
- Thanks to: Lee Horowitz (NYPL DGTL Oracle database consulting), Janet Murray (NYPL DGTL Metadata Coordinator), Tom Robertson (Lucene consulting, Stanford), Barbara Taranto (NYPL DGTL Director)
- Contact: jdalton@nypl.org