Tradeoffs in information graphics¹
Andrew Gelman² and Antony Unwin³
27 Oct 2012

¹ Rejoinder to discussion of "Infovis and Statistical Graphics: Different Goals, Different Looks," for the Journal of Computational and Graphical Statistics. We thank Richard Levine for organizing the discussion, the Institute of Education Sciences for grants R305D090006-09A and ED-GRANTS-032309-005, and the National Science Foundation for grants SES-1023189 and SES-1023176.
² Department of Statistics and Department of Political Science, Columbia University, New York, gelman@stat.columbia.edu, http://www.stat.columbia.edu/~gelman/
³ Department of Mathematics, University of Augsburg, unwin@math.uni-augsburg.de

The visual display of quantitative information (to use Edward Tufte's wonderful term) is a diverse field or set of fields, and its practitioners have different goals. The goals of software designers, applied statisticians, biologists, graphic designers, and journalists (to list just a few of the important creators of data graphics) often overlap but not completely. One of our aims in writing our article was to emphasize the diversity of graphical goals, as it seems to us that even experts tend to consider one aspect of a graph and not others.

Our main practical suggestion was that, in the internet age, we should not have to choose between attractive graphs and informational graphs: it should be possible to display both, via interactive displays. But to follow this suggestion, one must first accept that not every beautiful graph is informative, and not every informative graph is beautiful.

Our favorite example along those lines is the Crimean War data, displayed first in an exciting and innovative form by Florence Nightingale as an attractive graphic that still inspires modern designers, and then replotted in a boring but transparent (or perhaps we should say, transparent but boring) pair of graphs in Figure 16 of our paper. Our display is not better or worse than Nightingale's; it is different. The displays serve different purposes. Our graphs are better for revealing trends and comparisons in the data; hers is better for attracting interest in the underlying story.

As we and several discussants wrote, it would indeed be better if designers and statisticians could collaborate (in this case, by drawing the Nightingale-style graph in a way that did not obscure the real patterns in the data, or by adapting our time series plots to be more lively, or maybe by some different, outside-the-box solution) but, in the meantime, we are happy to have Nightingale's graph on the front page and ours on the click-through. But the data graphics communities will only get to that point if we recognize the inherent contradictions between the multiple goals involved in graphical communication.

Yes, it can sometimes be possible for a graph to be both beautiful and informative, as in Minard's famous Napoleon-in-Russia map, or more recently the Baby Name Wizard, which we featured in our article. But such synergy is not always possible, and we believe that an approach to data graphics that focuses on celebrating such wonderful examples can mislead people by obscuring the tradeoffs between the goals of visual appeal to outsiders and statistical communication to experts.

Several discussants pointed out a crucial mistake in our paper, which was our depiction of a tension between graphic design on one side and statistics on the other. As noted above, we do see a tension between the goal of statistical communication and the more general goal of communicating the qualitative sense of a dataset. But graphic design is not on one side or another of this divide. Rather, design is involved at all stages, especially when several graphics are combined to contribute to the overall picture, something we would like to see more of. If statistical graphs (even our own!) often show uninspired design, much of this is our fault in that we have not reached out to the design community.

Response to specific points

We particularly appreciate the comments by Robert Kosara and Stephen Few, as they are coming from different places than we are but have views similar to ours about the value of informative and elegant graphical display.
They point out some mistakes we have made, arising from our imperfect knowledge of the broad graphics community that exists outside the field of statistics.

Kosara emphasizes that many of the examples that we discuss would not be considered information visualizations by anybody in the field. Yau's choices were one person's subjective and perhaps idiosyncratic view. We used his provocative selection as a starting point to develop ideas about an area that interests us in our role as practicing statisticians. Since then there have been more formal selections of visualizations, and it is interesting to see whether those confirm or refute the ideas we put forward in our paper. If you look at the prizewinners of two more recent, more official competitions, you can see that many of our points are still highly relevant. We refer to the Data Journalism awards from May 2012 (datajournalismawards.org), where, of course, the prizes are awarded primarily for the journalism, not for the visualizations, but the visualizations play a major role in the presentations. The topics are important, the data gathering is impressive, and the projects look like worthy winners. However, while the graphics may look good, they generally present individual cases rather than statistical information. The results of the second competition, the Information is Beautiful Awards (www.informationisbeautifulawards.com), include some elegant graphics, though whether they should be considered as data visualizations under Kosara's criteria is another matter.

Sad to say, Kosara may be right when he comments, "it seems that statistical graphics is much more centered on the statistical properties first, with the visual appearance and ability to see patterns more of a side product." And he is right to be critical of that attitude. But researchers and practitioners in statistical graphics have long been moving away from that outmoded approach. Hence when he defines the division between infovis and statistical graphics as "do I want to quickly dig into my data or do I care about precise statistical properties," this seems to us an inaccurate and divisive distinction.
From Tukey and Cleveland in the 1970s and 1980s to researchers today, statisticians who do graphs are not generally focused on precise statistical properties but rather have the goal of creating images that communicate the data in a way that makes it possible for the human visual system to recognize patterns, including correlation, clusters, and randomness (in Kosara's words). If nothing else, we hope this discussion is useful in conveying to Kosara and others that statistical graphics is indeed about exploration, as well as in pointing out to us and other statisticians the many aspects of information visualization of which we had been unaware.

Kosara picks out three recent papers in InfoVis, presumably as praiseworthy examples from the field. The first one, on how well people can assess correlation coefficients from graphical displays, seems a poor choice to us. Correlation coefficients are notoriously unsatisfactory summaries of the information in a scatterplot: they only measure linear association and they miss all the other features that might be present. Estimating correlation coefficients from parallel coordinate plots is even worse. The strength of parallel coordinate plots lies in their multivariate capabilities. Thus, this discussion reveals the potential value in collaboration between graphics researchers (who are skilled at measuring and understanding human perception) and statisticians (who have a sense of effective data summaries).

Kosara writes, "This makes InfoVis a very human-centered field, which cares first and foremost about being easy to understand and informative." This is a worthy aim and we recommend it to statisticians and to all scientists. Having an aim and attaining it are quite different matters. At the end of his comments on our paper, Kosara recommends where we can find the Real InfoVis. We looked at the five articles in the latest issue of the Information Visualization Journal (issue of October 2012). The gratuitous 3-d piecharts in one article and the old-fashioned barchart with error bars in another caught our attention, while none of what we presume might be examples of good information visualization seemed both very easy to understand and informative. (Mind you, we shudder to think what Kosara might be able to unearth in the latest issues of statistics journals.) We also took a look at Kosara's splendid blog, eagereyes. One entry is called "The Fascinating World of (Good) Infographics." The parentheses are well chosen; take a look for yourselves.
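
The limitation of correlation coefficients noted above can be illustrated with a small sketch in Python. The data are invented purely for illustration (a straight line and a symmetric parabola), and the helper name `pearson` is hypothetical; none of this comes from the papers under discussion. Both relationships are perfectly deterministic, yet only the linear one yields a coefficient near 1, while the parabola's coefficient is near 0.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Illustrative data (an assumption): x values symmetric around zero.
xs = [x / 10 for x in range(-50, 51)]
linear = [2 * x + 1 for x in xs]   # exact linear relationship
parabola = [x * x for x in xs]     # exact, but nonlinear, relationship

r_linear = pearson(xs, linear)
r_parabola = pearson(xs, parabola)

print(f"linear:   r = {r_linear:.3f}")
print(f"parabola: r = {r_parabola:.3f}")
```

A scatterplot of the second dataset would show the curve immediately, which is the kind of feature that a single coefficient cannot convey.
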
Stephen Few writes that statisticians can do a lot more graphic design on their own without the collaboration of professionals, that what you need to present data effectively is easy to learn given the right resources and a little practice. This message is appealing but we fear that Few is selling his own expertise short! We agree with all his principles of informative display, yet something is lacking in the graphs we make. They're clean but a bit too drab. Individually, many of the graphs in Gelman's Red State Blue State are appealing and have been influential among observers of American politics; but in aggregate we believe the hundred or so graphs in that book could have benefited from collaboration with a professional designer. We agree with Few that every statistician should have the skills to make clean graphs but, as statistical educators, we are sorry to report that these skills are not so easy to learn or, more to the point, that many intelligent and capable statisticians, economists, political scientists, etc. do not yet seem motivated to learn them. We hope this will change.

Few lists five errors we have made. The first one, equating information visualization with infographics, is the most serious and we will not make it again in a hurry. The fourth seems minor by comparison. We claim that, paradoxically, difficult-to-follow graphics can be appealing because the effort put in by the reader to follow the graph represents a commitment to the topic being displayed. Few disputes our claim. Our view may be counterintuitive, but there is research supporting what we wrote (Hullman et al., 2011), and although we are skeptical of the possible importance of this factor, it should not be ignored.

Few's redrawing of the Nightingale displays is informative and definitely better in terms of color and other layout features than ours. Unfortunately he appears to have read the mortality rates as absolute figures. We see no advantage in adding a fourth line to the top graph, and his second graph compares periods where the mortality rates were dramatically different, which could result in misleading conclusions being drawn. We prefer our own second graphic, which provides additional information not available in either of Few's graphics. Again, this back-and-forth indicates the potential benefit of direct collaboration between statisticians and designers.

Unsurprisingly, statisticians Hadley Wickham and Paul Murrell offer comments that are more closely in agreement with our article, recognizing our goal of visualizations that both illuminate the unknown and unexpected, while also helping to engage interest and effect change.
In his historical review, Wickham demonstrates that this is not a new concern, even though it has become more visible given the increasing number of graphs in the media. (Incidentally, for those wishing to track down the original Bertillon text quoted in translation by Wickham, it is useful to know that it appears in the 1903 edition of the Bulletin, pp. 313-318.) In reading Wickham's stories, we were reminded of our own experiences in the era that preceded widespread availability of computers, when we drew data on graph paper using pencils and colored pens. Graphs were much more difficult to make, but we always had a sense of every point we were plotting.

But we are puzzled by one of Wickham's remarks, when he writes that we seem a little confused between design and decoration: designers do more than put lipstick on a pig. We never wrote that design is decoration, nor did we refer to graphs that we did not like as pigs. On the contrary, we emphasized that there is much in infographics (which we unfortunately mislabeled as information visualization) that we value. As noted in the title of this rejoinder, all creators of graphics face tradeoffs and multiple goals, and it would be foolish for statisticians to place the goals of clarity and statistical discovery above the goals of creating a visually appealing display. We would like our displays to communicate, to be intuitive and visually appealing, to clearly display statistical patterns, to have the potential to reveal the unexpected, and to be faithful to the data. In some cases, with much effort, it is possible to satisfy several of these goals at once, but they are different goals. We should respect successes in each of these dimensions and work on dynamic displays that allow multiple views of quantitative information from different perspectives.

Murrell writes, "The main points of Gelman and Unwin's discussion article are uncontroversial. Yes, statistical graphics and infographics have different goals." If only it were true that there is universal acceptance of different, often competing goals! But we are not so sure. Our impression is that practitioners, statisticians and non-statisticians alike, have the habit of celebrating or criticizing particular graphs without considering the tradeoffs.
A case in point is statistician/designer Nathan Yau's presentation of his nominees for 5 best data visualizations of the year, where there was lots of praise but no consideration of the tradeoffs, of what had to be given up in clarity to achieve the striking visual effects. We certainly would not claim that our article is perfect, but we did try to move the ball forward by exploring the strengths and weaknesses of most of the displays under discussion.

We very much enjoyed the witty way in which Murrell expressed his points and we are obviously in close agreement, though there are differences. We would like to see statisticians being encouraged to combine several graphics together, not to think only in terms of single graphics, and we would like to emphasize the importance of goals, content, and context. Both in his list of seven basic principles and in his description of the tools available in R for controlling details of graphics, Murrell writes solely of technical features. He may well assume that statisticians don't have to be reminded that the goals, content, and context of their data should be kept in mind when preparing graphics. Is that a safe assumption? Murrell asks for some simple guidelines to help statisticians avoid the worst design errors. Few's contribution to this discussion (and naturally his other writings) would make an excellent starting point.

We would like to thank all the discussants for engaging in the discussion so positively and constructively, and we hope this is an example of the better communication we all seek among different graph makers, as this was one of our goals in inviting this discussion.

References

Hullman, J., Adar, E., and Shah, P. (2011). Benefitting InfoVis with visual difficulties. IEEE Transactions on Visualization and Computer Graphics 17, 2213-2222.
Kosara, R. (2010). The fascinating world of (good) infographics. Eagereyes blog, 16 May, http://eagereyes.org/criticism/fascinating-world-of-good-infographics