Exploiting user interactions to support complex book search tasks


Marijn Koolen, Huygens ING
Search Engines Amsterdam, 29-09-2016, Spui25, Amsterdam

LibraryThing Forums


Observations
- Book searchers struggle with existing systems (search engines, recommender systems)
- Requests are highly complex: example docs + textual query + personal profile + context of use
- Need for models dealing with complex relevance aspects, personal interests, preferences, background knowledge
- Need for interfaces to support such tasks

Observations
- User-generated content covers quality aspects, textual characteristics, opinions & perspectives
- Unstructured, noisy, diverse
- Skewed towards popular books
- Often kept out of search index
- May require NLP/Text Mining

Overview
1. Complex Search Tasks
2. (Social) User Interactions
3. System Support
4. Conclusions

1. Complex Search Tasks

Complex Requests
- Many textual relevance aspects
- Examples of known books and authors
- Context of use
- Search criteria vs. selection criteria (i.e. searching within information of relevant/interesting books)

Search Stages
- Information search process models, e.g. pre-focus, focus formulation, and post-focus (Vakkari, 2001); Kuhlthau's six-stage model (Kuhlthau, 1991)
  - focused on search as part of academic research
- Decision stages in book selection, e.g. browsing, selecting, judging, sampling, and sustained reading (Goodall, 1989)

Textual Aspects
- Non-topical aspects: writing style, humour, characters, plot, setting, pace, engagement
- Modelling relevance: standard retrieval models on the full text of books or on metadata are not enough

Reading Experience
- Many information needs are based on previous reading experience
- 36% of requests explicitly reference previous reading experience, often with examples
- 15% mention authors: looking for similar authors or order in oeuvre

Reading Order
- Readers often want advice on where to start reading:
  - a prolific author's oeuvre
  - a set of books on a topic
- Also a common issue in the scholarly domain

Selection Tasks
- Select Best: "I want to know what the best books on this topic are, I'm not (yet) interested in the rest"
- Select Start: "I want to know what books are good to start reading on this topic"
- Select Next: "I've read X, Y and Z and want to know what books are good for further reading, to explore the topic"
- Select Order: "Given a set of books on this topic, I want to know what the best order is to read them in"

2. (Social) User Interactions

Interaction Types
- Cataloguing, reviewing, discussing
  - forms of citation analysis (i.e. bibliometrics)
  - each with its own characteristics
  - each has advantages and disadvantages
- General issue: crowd interactions tend to be highly skewed, heavy users dominate, Harry Potter effect

Social Book Search Lab
- Amazon/LibraryThing collection: curated metadata + user tags and reviews for 2.8M books
- 45M catalogue entries by 170k users
- 11M Amazon reviews by 1.8M users
- 1.6M forum mentions by 16k users in 132k threads

User Catalogues
- Catalogue reveals connections between books (co-citation) and reading order (citation order)
- Advantages: many users, many transactions per user, very long tails
- Disadvantages: noisy, based on a variety of interests (which also vary over time)

Cataloguing Order

Bulk Cataloguing

Book Reviews
- Review as mega-citation (Zuccala & Bod, 2012)
  - formal reviews in journals
- Amazon/GoodReads reviews
  - informal, written for a variety of reasons
  - reviewing order as proxy for reading order (again, noisy)

Forum Discussions
- Citations and co-citations in online discussions
- Advantages: contributions from multiple readers (crowd wisdom)
- Disadvantages: topic drift, game-like threads, sparse data
- Complexity: when are 2 mentions co-citations?
  - Levels: post, thread, user, discussion group, or a combination of these
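To make the question of co-citation levels concrete, here is a minimal sketch that counts book pairs mentioned together within the same post, thread, or user. The mention tuples and the `cocitations` helper are hypothetical illustrations, not the data format or method used in the talk.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each mention is (thread_id, post_id, user_id, book_id).
mentions = [
    ("t1", "p1", "u1", "A"),
    ("t1", "p1", "u1", "B"),
    ("t1", "p2", "u2", "C"),
    ("t2", "p3", "u1", "A"),
    ("t2", "p4", "u3", "C"),
]

# Position of the grouping key inside a mention tuple, per level.
LEVELS = {"thread": 0, "post": 1, "user": 2}


def cocitations(mentions, level):
    """Count unordered book pairs mentioned within the same unit at `level`."""
    key_index = LEVELS[level]
    books_per_unit = {}
    for mention in mentions:
        books_per_unit.setdefault(mention[key_index], set()).add(mention[3])
    counts = Counter()
    for books in books_per_unit.values():
        # Each pair is counted once per unit, regardless of repeated mentions.
        for pair in combinations(sorted(books), 2):
            counts[pair] += 1
    return counts


post_level = cocitations(mentions, "post")
thread_level = cocitations(mentions, "thread")
user_level = cocitations(mentions, "user")
```

In this toy data the pair (A, C) is never co-cited at post level but is co-cited twice at thread level, which shows how strongly the chosen level changes the resulting co-citation network.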

Citations in Book Discussions

Differences In Patterns
- Often co-cited in catalogues, rarely in discussions:
  - later books in series, books by same author, management books
  - discussions avoid obvious connections?
- Often co-cited in discussions, rarely in catalogues/reviews:
  - nominees of Man Booker & Orange prizes
  - literary praise leads to discussion, less to reading

3. System Support

Supporting Stages
- How can systems support complex search tasks?
  - look at what other users do, e.g. what they read, review, discuss and in which order
- Support different sub-tasks with different interfaces
  - multistage search systems (Huurdeman & Kamps, 2014)
- Disclaimer: the interface concepts you are about to see are very primitive!

Shortlists
- Query by document: paradigm using a document's content as query
- What about multiple documents representing an information need?
- Shortlists reduce cognitive effort of exploration and selection, improve recommendation performance (Schnabel et al., 2015)

Search By Shortlist
- Model relevance with multiple example documents
- Approach: represent the information need through overlapping terms/descriptors or citations (Boomerang effect, Larsen (2002))
- Challenge: with rich user-generated content, how to select useful overlapping terms?

Shortlist vs. Feedback
- Shortlist search is similar to:
  - query-by-document (but with multiple docs)
  - relevance feedback (but text query-independent)
  - recommendation (but ad hoc, interactive, focused)
  - list completion (but open-ended)
  - personalised IR (but not necessarily personalised)

Similar How?
- Single item has many aspects
  - user may not want exactly similar
- Multiple items may overlap in certain aspects
  - better reflection of relevant aspects?
  - show overlap to user, let her choose aspects
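The idea of representing an information need through overlapping descriptors can be sketched as follows. This is a minimal illustration under stated assumptions: the tag data, the `min_share` threshold, and the overlap-count scoring are all hypothetical choices, not the model used in the talk.

```python
from collections import Counter

# Hypothetical tag sets per book; real data would come from user tags or categories.
book_tags = {
    "book1": {"fantasy", "humour", "series"},
    "book2": {"fantasy", "humour", "dragons"},
    "book3": {"fantasy", "humour", "wizards"},
    "book4": {"history", "biography"},
    "book5": {"fantasy", "romance"},
}


def shared_descriptors(shortlist, book_tags, min_share=1.0):
    """Return descriptors present in at least `min_share` of the shortlisted books."""
    counts = Counter(tag for book in shortlist for tag in book_tags[book])
    threshold = min_share * len(shortlist)
    return {tag for tag, n in counts.items() if n >= threshold}


def rank_by_shortlist(shortlist, book_tags, min_share=1.0):
    """Rank non-shortlisted books by how many shared descriptors they carry."""
    profile = shared_descriptors(shortlist, book_tags, min_share)
    scores = {
        book: len(tags & profile)
        for book, tags in book_tags.items()
        if book not in shortlist
    }
    # Drop books with no overlap; break ties alphabetically for a stable order.
    return sorted(
        ((book, score) for book, score in scores.items() if score > 0),
        key=lambda item: (-item[1], item[0]),
    )


shortlist = ["book1", "book2"]
profile = shared_descriptors(shortlist, book_tags)
ranking = rank_by_shortlist(shortlist, book_tags)
```

Lowering `min_share` admits descriptors that only some shortlist items carry, which is one simple way to explore the trade-off between a narrow and a broad reading of the shortlist.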

Compare Shortlist Items: Tag Overlap

Compare Shortlist Items: Amazon Category Overlap

Citation-Supported Search
- Citation context: non-topical aspects in reviews and discussions
- Citation order: proxy for reading order
- Co-citation: relationships between shortlist items and collection

Citation Context
- Textual context of citations can improve retrieval in scientific literature search (Ritchie et al., 2008)
- Book reviews and user tags also improve many book retrieval tasks (Koolen et al., 2012; Koolen, 2014)
- Reviews can capture many aspects that curated metadata rarely does: style, humour, characterisation, recency, comprehensiveness, engagement

Revealing Reading Order
- Signals revealing reading order: popularity, order of interaction, co-citation
- In what order do Steven Brust fans read his Taltos series?
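One way to turn order of interaction into a reading-order signal is to aggregate, over many users, how often one book was catalogued or reviewed before another. The sketch below assumes hypothetical per-user sequences with placeholder titles and uses a simple net-precedence score; it is an illustration, not the analysis behind the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-user interaction sequences (e.g. cataloguing or review order).
# Each book is assumed to appear at most once per user.
user_sequences = {
    "u1": ["Book A", "Book B", "Book C"],
    "u2": ["Book A", "Book C", "Book B"],
    "u3": ["Book B", "Book A", "Book C"],
    "u4": ["Book A", "Book B"],
}


def pairwise_precedence(user_sequences):
    """Count how often each book precedes each other book across users."""
    before = Counter()
    for sequence in user_sequences.values():
        # combinations() keeps input order, so `earlier` always precedes `later`.
        for earlier, later in combinations(sequence, 2):
            before[(earlier, later)] += 1
    return before


def consensus_order(user_sequences):
    """Order books by net precedence: times first minus times second."""
    before = pairwise_precedence(user_sequences)
    books = {book for sequence in user_sequences.values() for book in sequence}

    def net_wins(book):
        return sum(
            before[(book, other)] - before[(other, book)]
            for other in books
            if other != book
        )

    return sorted(books, key=lambda book: (-net_wins(book), book))


order = consensus_order(user_sequences)
```

Because heavy users and bulk cataloguing add noise, a real analysis would likely need to filter or weight sequences before aggregating them.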

Reading Order Distribution

Reading Order and Ratings

Co-citations

4. Conclusions
- Many tasks beyond finding relevant items
  - shortlist search, selection tasks (reading order)
- Many interactions provide relevant information beyond topical aspects
  - can be summarised and aggregated in interesting ways to reveal relevant usage info
- Many ways to support complex search tasks
  - challenge to provide support in an intuitive way that doesn't lead to overly complex interfaces (reduce cognitive effort)

References (1/2)
- Goodall, Deborah. Browsing in public libraries. Library and Information Statistics Unit (LISU), Loughborough, UK, 1989.
- Huurdeman, Hugo, Jaap Kamps. From multistage information-seeking models to multistage search systems. IIiX 2014.
- Koolen, Marijn, Jaap Kamps, Gabriella Kazai. Social book search: comparing topical relevance judgements and book suggestions for evaluation. CIKM 2012.
- Koolen, Marijn. User reviews in the search index? That'll never work! ECIR 2014.
- Kuhlthau, Carol. Inside the search process: Information seeking from the user's perspective. JASIS, Volume 42(5), 1991.

References (2/2)
- Larsen, Birger. Exploiting citation overlaps for information retrieval: Generating a boomerang effect from the network of scientific papers. Scientometrics, Volume 54(2), 2002.
- Ritchie, Anna, Simone Teufel, Stephen Robertson. Comparing citation contexts for information retrieval. CIKM 2008.
- Schnabel, T., Paul N. Bennett, Susan Dumais, Thorsten Joachims. Using Shortlists to Support Decision Making and Improve Recommender System Performance. WWW 2016.
- Vakkari, Pertti. A theory of the task-based information retrieval process: a summary and generalisation of a longitudinal study. Journal of Documentation, Volume 57(1), 2001.
- Zuccala, Alesia, Rens Bod. Book reviews as mega-citations: A fresh look at citation theory. STI 2012.

Questions? Thank You!

Multistage interface: Browse view

Multistage interface: Search view

Multistage interface: Book-bag view

Pennant Diagrams (White & Mayr, 2013)

Pennants and Order
- Top-left region holds descriptors or books with a low citation count but a relatively high co-citation count
  - tend to be more specific subjects, less obvious connections
- Can pennant regions help determine reading order?
  - can they be used with multiple seeds?
  - how can this be usefully incorporated in interfaces?
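As a rough illustration of how items could be placed in a pennant diagram, the sketch below computes two coordinates per item relative to a seed: a log-scaled co-citation count (tf) and a log-scaled inverse overall citation count (idf). The exact weighting, the function name, and the toy counts are assumptions made for illustration, not figures from the talk.

```python
import math


def pennant_coordinates(cocitations_with_seed, citation_counts, collection_size):
    """Map each item to (tf, idf) coordinates relative to a seed item.

    Assumed weighting: tf = 1 + log10(co-citations with the seed),
    idf = log10(collection size / overall citation count).
    """
    coordinates = {}
    for item, cocited in cocitations_with_seed.items():
        if cocited <= 0:
            continue  # items never co-cited with the seed are not plotted
        tf = 1 + math.log10(cocited)
        idf = math.log10(collection_size / citation_counts[item])
        coordinates[item] = (tf, idf)
    return coordinates


# Hypothetical counts for three books co-cited with one seed book.
cocitations_with_seed = {"popular": 10, "specific": 10, "unrelated": 0}
citation_counts = {"popular": 1000, "specific": 100, "unrelated": 50}
coords = pennant_coordinates(cocitations_with_seed, citation_counts, 10000)
```

With equal co-citation counts, the rarely cited "specific" item gets a higher idf than the "popular" one, which matches the observation that less-cited items with relatively many co-citations point to more specific connections.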