Lecture Notes in Artificial Intelligence 7250


Lecture Notes in Artificial Intelligence 7250
Subseries of Lecture Notes in Computer Science

LNAI Series Editors
Randy Goebel, University of Alberta, Edmonton, Canada
Yuzuru Tanaka, Hokkaido University, Sapporo, Japan
Wolfgang Wahlster, DFKI and Saarland University, Saarbrücken, Germany

LNAI Founding Series Editor
Joerg Siekmann, DFKI and Saarland University, Saarbrücken, Germany

Michael R. Berthold (Ed.)
Bisociative Knowledge Discovery
An Introduction to Concept, Algorithms, Tools, and Applications

Series Editors
Randy Goebel, University of Alberta, Edmonton, Canada
Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany

Volume Editor
Michael R. Berthold
University of Konstanz
Department of Computer and Information Science
Konstanz, Germany
E-mail: michael.berthold@uni-konstanz.de

Acknowledgement and Disclaimer
The work reported in this book was funded by the European Commission in the 7th Framework Programme (FP7-ICT-2007-C FET-Open, contract no. BISON-211898).

ISSN 0302-9743, e-ISSN 1611-3349
ISBN 978-3-642-31829-0, e-ISBN 978-3-642-31830-6
DOI 10.1007/978-3-642-31830-6
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2012941862
CR Subject Classification (1998): I.2, H.3, H.2.8, H.4, C.2, F.1
LNCS Sublibrary: SL 7 Artificial Intelligence

© The Editor(s) (if applicable) and the Author(s) 2012. The book is published with open access at SpringerLink.com.

Open Access. This book is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

All commercial rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for commercial use must always be obtained from Springer. Permissions for commercial use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Foreword

We have all heard of the success story of the discovery of a link between the mental problems of children and the chemical pollutants in their drinking water. Similarly, we have heard of the 1854 Broad Street cholera outbreak in London, and the linking of it to a contaminated public water pump. These are two high-profile examples of bisociation, the combination of information from two different sources.

This is exactly the focus of the BISON project and this book. Instead of attempting to keep up with the meaningful annotation of the data floods we are facing, the BISON group pursued a network-based integration of various types of data repositories and the development of new ways to analyze and explore the resulting gigantic information networks. Instead of finding well-defined global or local patterns, they wanted to find domain-bridging associations which are, by definition, not well defined, since they will be especially interesting if they are sparse and have not been encountered before.

The present volume now collects the highlights of the BISON project. Not only did the consortium succeed in formalizing the concept of bisociation and proposing a number of types of bisociation and measures to rank their bisociativeness, but they also developed a series of new algorithms, and extended several of the existing algorithms, to find bisociation in large bisociative information networks.

From a personal point of view, I was delighted to see that some of our own work on finding structurally similar pieces in large networks actually fit into that framework very well: random walks, and related diffusion-based methods, can help find correlated nodes in bisociative networks. The concept of bisociative knowledge discovery formalizes an aspect of data mining that people have been aware of to some degree but were unable to formally pin down. The present volume serves as a great basis for future work in this direction.

May 2012
Christos Faloutsos

Table of Contents

Part I: Bisociation

Towards Bisociative Knowledge Discovery
    Michael R. Berthold
Towards Creative Information Exploration Based on Koestler's Concept of Bisociation
    Werner Dubitzky, Tobias Kötter, Oliver Schmidt, and Michael R. Berthold
From Information Networks to Bisociative Information Networks
    Tobias Kötter and Michael R. Berthold

Part II: Representation and Network Creation

Network Creation: Overview
    Christian Borgelt
Selecting the Links in BisoNets Generated from Document Collections
    Marc Segond and Christian Borgelt
Bridging Concept Identification for Constructing Information Networks from Text Documents
    Matjaž Juršič, Borut Sluban, Bojan Cestnik, Miha Grčar, and Nada Lavrač
Discovery of Novel Term Associations in a Document Collection
    Teemu Hynönen, Sébastien Mahler, and Hannu Toivonen
Cover Similarity Based Item Set Mining
    Marc Segond and Christian Borgelt
Patterns and Logic for Reasoning with Networks
    Angelika Kimmig, Esther Galbrun, Hannu Toivonen, and Luc De Raedt

Part III: Network Analysis

Network Analysis: Overview
    Hannu Toivonen
BiQL: A Query Language for Analyzing Information Networks
    Anton Dries, Siegfried Nijssen, and Luc De Raedt
Review of BisoNet Abstraction Techniques
    Fang Zhou, Sébastien Mahler, and Hannu Toivonen
Simplification of Networks by Edge Pruning
    Fang Zhou, Sébastien Mahler, and Hannu Toivonen
Network Compression by Node and Edge Mergers
    Hannu Toivonen, Fang Zhou, Aleksi Hartikainen, and Atte Hinkka
Finding Representative Nodes in Probabilistic Graphs
    Laura Langohr and Hannu Toivonen
(Missing) Concept Discovery in Heterogeneous Information Networks
    Tobias Kötter and Michael R. Berthold
Node Similarities from Spreading Activation
    Kilian Thiel and Michael R. Berthold
Towards Discovery of Subgraph Bisociations
    Uwe Nagel, Kilian Thiel, Tobias Kötter, Dawid Piatek, and Michael R. Berthold

Part IV: Exploration

Exploration: Overview
    Andreas Nürnberger
Data Exploration for Bisociative Knowledge Discovery: A Brief Overview of Tools and Evaluation Methods
    Tatiana Gossen, Marcus Nitsche, Stefan Haun, and Andreas Nürnberger
On the Integration of Graph Exploration and Data Analysis: The Creative Exploration Toolkit
    Stefan Haun, Tatiana Gossen, Andreas Nürnberger, Tobias Kötter, Kilian Thiel, and Michael R. Berthold
Bisociative Knowledge Discovery by Literature Outlier Detection
    Ingrid Petrič, Bojan Cestnik, Nada Lavrač, and Tanja Urbančič
Exploring the Power of Outliers for Cross-Domain Literature Mining
    Borut Sluban, Matjaž Juršič, Bojan Cestnik, and Nada Lavrač
Bisociative Literature Mining by Ensemble Heuristics
    Matjaž Juršič, Bojan Cestnik, Tanja Urbančič, and Nada Lavrač

Part V: Applications and Evaluation

Applications and Evaluation: Overview
    Igor Mozetič and Nada Lavrač
Biomine: A Network-Structured Resource of Biological Entities for Link Prediction
    Lauri Eronen, Petteri Hintsanen, and Hannu Toivonen
Semantic Subgroup Discovery and Cross-Context Linking for Microarray Data Analysis
    Igor Mozetič, Nada Lavrač, Vid Podpečan, Petra Kralj Novak, Helena Motaln, Marko Petek, Kristina Gruden, Hannu Toivonen, and Kimmo Kulovesi
Contrast Mining from Interesting Subgroups
    Laura Langohr, Vid Podpečan, Marko Petek, Igor Mozetič, and Kristina Gruden
Link and Node Prediction in Metabolic Networks with Probabilistic Logic
    Angelika Kimmig and Fabrizio Costa
Modelling a Biological System: Network Creation by Triplet Extraction from Biological Literature
    Dragana Miljkovic, Vid Podpečan, Miha Grčar, Kristina Gruden, Tjaša Stare, Marko Petek, Igor Mozetič, and Nada Lavrač
Bisociative Exploration of Biological and Financial Literature Using Clustering
    Oliver Schmidt, Janez Kranjc, Igor Mozetič, Paul Thompson, and Werner Dubitzky
Bisociative Discovery in Business Process Models
    Trevor Martin and Hongmei He
Bisociative Music Discovery and Recommendation
    Sebastian Stober, Stefan Haun, and Andreas Nürnberger

Author Index