Washington Metropolitan Area Transit Authority (WMATA) Ridership

CMSC734 Homework #2
Allan Fong, PhD Computer Science

Introduction and Background

Every day thousands of individuals in the greater DC area use the WMATA Metro system to get from place to place. Metro has collected a great deal of information on riders, including when and where they pass through turnstiles. In the spirit of open data and collaboration, Metro has released data on station-to-station rider counts (1). The release was an open invitation for the public to analyze and visualize the data to find interesting insights and patterns.

Data Source

The data used for this assignment were released by WMATA on October 31, 2012 (1). The data contained 86 vertices (one for every Metro station) and over 110,000 edges. Below are some technical notes about the data from the website:

The data shows average ridership for each day of May 2012, excluding Memorial Day. (May is typically used as an average month, since it falls in the middle of seasonal swings, is relatively unaffected by extreme weather, etc.)
Time period shows the time the rider entered (not the time they exited).
AM Peak = opening to 9:30am
Midday = 9:30am to 3:00pm
PM Peak = 3:00pm to 7:00pm
Evening = 7:00pm to midnight
Late-Night = Friday and Saturday nights only, midnight to closing

Overview

I was very excited to analyze this dataset with NodeXL because I thought there might be a lot of interesting insights and patterns. I had also been living in the DC area for over four years and often wondered about the Metro patterns. My first attempt to visualize all of the data in NodeXL failed because there were over 110,000 edges. I had to parse, sort, and aggregate the data in various ways to use it in NodeXL. I also color-coded each station by its line or lines, choosing a related color for stations with two lines. Stations with more than two lines were set to black.
I used the default disc shape for stations on only one line and a square for stations on two or more lines. I tried grouping by vertex attributes with the standard box layouts, but the resulting layouts were not intuitive. I found that, to make the visualization more meaningful and understandable, I had to maintain some of the geographical information for the stations. As a result, I modified a Fruchterman-Reingold layout by manually grouping the vertices with similar attributes and arranging them to reflect their geographical locations. I tried to maintain some of the geographical clustering while still allowing users to read the names of the vertices and the interesting edges. I decided to use this layout for most of the remaining visualizations because a common layout allows for easier comparison and analysis. Although I had to slightly shift some of the vertices in the various graphs to better show interesting edges and patterns, the overall spatial consistency of the vertices was kept. It is also important to note that in some graphs, vertices that did not have enough edge connections were hidden to simplify the visualization.

Headline 1: Farragut North, Farragut West, Metro Center, and L'Enfant Plaza are the work destinations for many who take part in the DC commuter's rat race

I divided the weekly data into four time periods (AM Peak, Midday, PM Peak, and Evening) and displayed them in the figures below as small multiples to visualize and compare the different travel patterns on a typical weekday. The edge width varies between 1 and 5, while edge opacity varies between 5 and 100. Both correspond to the number of riders traveling between each pair of stations (between 100 and 1900). I had to filter out edges because, when they were all included, the visualization was too crowded and difficult to understand. A threshold of 100 average riders was chosen because edges with fewer than 100 average riders accounted for more than half of the data but less than 5% of the total riders. As a result of these changes, the graphs are a lot cleaner and easier to understand.
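
The filtering and scaling choices described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rider counts are invented, the helper names (filter_edges, dropped_share, scale) are hypothetical, and the linear mapping from riders to width and opacity is an assumption, since the report does not say which mapping NodeXL applied.

```python
# Hypothetical sample of origin-destination edges: (origin, destination, average riders).
# The station names are real, but the rider counts are invented for illustration.
edges = [
    ("Shady Grove", "Farragut North", 1900.0),
    ("Vienna", "Farragut West", 850.0),
    ("Union Station", "Metro Center", 100.0),
    ("Takoma", "Glenmont", 12.0),
    ("Clarendon", "Greenbelt", 3.5),
    ("Bethesda", "Congress Heights", 0.8),
]

MIN_RIDERS = 100.0   # filter threshold stated in the text
MAX_RIDERS = 1900.0  # upper end of the rider range stated in the text


def filter_edges(edges, min_riders=MIN_RIDERS):
    """Keep only edges with at least `min_riders` average riders."""
    return [e for e in edges if e[2] >= min_riders]


def dropped_share(edges, min_riders=MIN_RIDERS):
    """Return (fraction of edges dropped, fraction of riders dropped)."""
    dropped = [e for e in edges if e[2] < min_riders]
    total_riders = sum(e[2] for e in edges)
    return len(dropped) / len(edges), sum(e[2] for e in dropped) / total_riders


def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map `value` from [lo, hi] to [out_lo, out_hi], clamping at the ends.

    The linear form is an assumption; the report only gives the two ranges.
    """
    clamped = max(lo, min(hi, value))
    return out_lo + (clamped - lo) * (out_hi - out_lo) / (hi - lo)


kept = filter_edges(edges)
edge_frac, rider_frac = dropped_share(edges)
print(f"dropped {edge_frac:.0%} of edges carrying {rider_frac:.1%} of riders")

for origin, dest, riders in kept:
    width = scale(riders, MIN_RIDERS, MAX_RIDERS, 1, 5)
    opacity = scale(riders, MIN_RIDERS, MAX_RIDERS, 5, 100)
    print(f"{origin} -> {dest}: width={width:.2f}, opacity={opacity:.1f}")
```

Running the same dropped_share check on the real edge list is one way to confirm that a chosen threshold removes many edges while discarding only a small share of total riders.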

The XY plot below shows the In-Degree versus Out-Degree metrics of the weekday AM-Peak travel patterns. I chose the AM-Peak rather than the PM-Peak hours to understand typical commuters' work patterns because people are less likely to be traveling for leisure during the AM Peak.
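
For readers unfamiliar with the two metrics, the following minimal Python sketch shows how In-Degree and Out-Degree can be counted from a directed edge list. It is not the NodeXL computation itself: the rider counts are invented, the function name degree_metrics is hypothetical, and each (origin, destination) pair is assumed to appear at most once.

```python
from collections import defaultdict

# Hypothetical AM-Peak edges (origin, destination, average riders) left after filtering.
# Rider counts are invented for illustration.
am_peak_edges = [
    ("Shady Grove", "Metro Center", 640.0),
    ("Shady Grove", "Farragut North", 910.0),
    ("Vienna", "Farragut West", 780.0),
    ("Vienna", "Metro Center", 420.0),
    ("Union Station", "Metro Center", 350.0),
    ("Union Station", "Farragut North", 510.0),
]


def degree_metrics(edges):
    """Return {station: (in_degree, out_degree)}, counting edges and ignoring riders."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for origin, dest, _riders in edges:
        out_deg[origin] += 1  # one more destination reached from this origin
        in_deg[dest] += 1     # one more origin feeding this destination
    stations = set(in_deg) | set(out_deg)
    return {s: (in_deg[s], out_deg[s]) for s in stations}


metrics = degree_metrics(am_peak_edges)
for station, (in_degree, out_degree) in sorted(metrics.items()):
    print(f"{station}: in={in_degree}, out={out_degree}")
```

Because these counts ignore the rider totals on each edge, a station fed by many small flows can score higher than one fed by a few very large flows.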

Although I expected Metro Center, L'Enfant Plaza, Farragut West, and Farragut North to have high In-Degrees (48, 45, 36, and 33, respectively), I was surprised by how high Union Station, Gallery Place, and McPherson Square scored. These scores reflect the number of edges only; they do not account for edge weights. The stations with the highest Out-Degrees, and thus the major commuter origins, are Shady Grove, Union Station, and Vienna (24, 22, and 21, respectively).

Another interesting observation was that Farragut North, Metro Center, and Union Station tend to be used more frequently during the midday period. These midday excursions could be for lunch meetings, work-related activities, or other reasons. Lunch trips to Union Station, Gallery Place, and Dupont Circle may also help explain the slightly darker and thicker lines, because these stations have good restaurant options. However, with the increasing popularity of food trucks in DC, it makes sense that a large number of people may not use the Metro to travel for lunch.

Evening usage of the Metro on weekdays is low, except between Gallery Place and Columbia Heights. Columbia Heights has developed a lot over the past two years, with a growing number of high-rise apartment buildings and new shopping options. There has also been an increase in young adults moving into the area. This can help explain the edge that shows evening commutes between Gallery Place and Columbia Heights.

Headline 2: Gallery Place-Chinatown, Dupont Circle, U Street, and Clarendon are Night Life Locations

The following network visualization shows the travel pattern of people using the Metro late at night (between midnight and closing). Because significantly fewer people travel during the late-night hours, I adjusted the edge width and opacity to better visualize interesting travel patterns. I also calculated the Out-Degree centrality to better understand these travel patterns. Gallery Place-Chinatown, U Street, Dupont Circle, and Clarendon have the highest Out-Degree metrics, suggesting that these places have the most night life in the city. This trend is very evident in the visualization. Furthermore, those spending time in Dupont Circle will usually return to one of the stations on the Red line. Similarly, people from Gallery Place-Chinatown will typically disperse to Columbia Heights, Silver Spring, and Fort Totten. Fort Totten was initially a surprising destination for late-night wanderers, but on reflection this may be because it is the first northeast station on the Green/Yellow line that has parking for Metro riders.

It was also quite interesting to see a divide between VA and DC/MD. One might hypothesize that people living in DC prefer to gather in DC and people living in VA prefer to stay in VA. This is especially true at the Clarendon stop, which is the location of the social/bar scene for Arlington. There is a clear contingent of people living in and around Vienna who like to stay out late in Clarendon. Although there is a separation between VA and DC/MD, the edges from Gallery Place to Crystal City and Pentagon City are still fairly strong. This is not surprising because it is easier for people living in Pentagon City and Crystal City to reach Gallery Place than Clarendon via Metro.

Headline 3: To avoid crowded metros while sightseeing and shopping on the weekends, go out on Sundays

The figures below display all the Metro travel patterns for Saturday and Sunday. I used the same edge width and edge opacity scaling for both days to better visualize and compare changes between Saturday and Sunday. It was really interesting to see the contrast between the two days, primarily the decrease in riders at almost all the stations. This could reflect free street parking on Sundays. Furthermore, the number of people traveling to the Smithsonian from Virginia Metro stations such as Pentagon City, Crystal City, and Vienna remained consistent on Saturday and Sunday, while other connecting edges, such as the one from New Carrollton, decreased drastically. Pentagon City, Foggy Bottom, Dupont, Metro Center, Gallery Place, Columbia Heights, and Union Station are also heavily traveled, most likely because of the large number of shopping malls and dining options around these stations.

Not surprisingly, the Smithsonian and Navy Yard stations have a much larger percentage of riders on the weekends than on the weekdays. This makes sense because the Smithsonian station is a central location with easy access to a lot of museums, and the Navy Yard station is the closest one to the Nationals' baseball stadium. It is surprising, however, to see the relatively low number of visitors to both the L'Enfant and Archives stops, even though they are as close to some of the popular museums as the Smithsonian station, or even closer.

NodeXL Experience and Critiques

Overall, I thought NodeXL was an extremely useful and helpful tool. With some data preprocessing, I was able to upload my data and start exploring the data set fairly quickly. I was able to color-code the vertices and arrange them based on geographical location and attributes. This additional spatial coding helped me understand the data much better and faster. I've tried visualizing networks before using UCINET. NodeXL is far better than the free UCINET version I have used. I was very excited to see a free network visualization tool with so many capabilities and functionalities. Many of the interfaces were intuitive and easy to use. Most importantly, I am thankful that I can download it as an Excel template. While working for the government, it was extremely difficult to install software on the computers. Even when I found helpful, free network visualization programs, it was extremely difficult to obtain permission to install them. That is why I was quite excited when I learned about NodeXL.

Even though NodeXL has many benefits, I do have a few critiques and suggestions for improvements beyond those already discussed in class. My data started off with 110,000+ edges, and when I tried running some of the layout algorithms, Excel would freeze. Other times, I would run the clustering algorithms and the same thing would happen. I wasn't sure if the program had crashed or if it was still processing, so I didn't know if I should wait or restart. It would be helpful to give the user some indication of the program's progress or the expected calculation time.

It would also be helpful to have a filter option to hide or remove self-referencing edges. For this project, I wrote a short macro that did most of this cleaning.
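
The original macro is not reproduced in this report. As a rough stand-in written in Python rather than as an Excel macro, self-referencing edges (trips that enter and exit at the same station) could be removed as follows. The sample rows and the function name remove_self_loops are hypothetical, and the rider counts are invented.

```python
# Hypothetical sample of (origin, destination, average riders) rows.
raw_edges = [
    ("Gallery Place-Chinatown", "Columbia Heights", 410.0),
    ("Metro Center", "Metro Center", 35.0),      # self-referencing edge
    ("Clarendon", "Vienna", 120.0),
    ("Union Station", "Union Station", 18.0),    # self-referencing edge
]


def remove_self_loops(edges):
    """Drop edges whose origin and destination are the same station."""
    return [(origin, dest, riders) for origin, dest, riders in edges if origin != dest]


clean_edges = remove_self_loops(raw_edges)
print(f"removed {len(raw_edges) - len(clean_edges)} self-referencing edges")
```

This sketch compares station names exactly, so it assumes that each station is spelled the same way in the origin and destination columns.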
However, it would be beneficial to the user if such an option existed in NodeXL.

Often during this assignment, I wanted to compare multiple layout visualizations of the data. I had to save my current layout as an image before applying any changes. It would be helpful to be able to create different coordinated layout windows of the same data. However, this would probably take a lot of work, and I would consider the benefits to be moderate.

The legend options are also very limited and difficult to edit and manipulate. It would be helpful to have more legend controls available, such as changing the size and location of the legend or adding and removing descriptions in the legend.

References

(1) http://planitmetro.com/2012/10/31/data-download-metrorail-ridership-by-origin-and-destination/
(2) http://www.wmata.com/
(3) http://nodexl.codeplex.com/