This post concerns an interactive visualization I put together that, for technical reasons, isn’t directly embedded here. Go to http://bigbytes.mobyus.com/commute.aspx to see and play with the visualization.
A month or two ago, I ran across a compelling visualization of commuter data done by Alasdair Rae, a geographer and urban planner based at the University of Sheffield. He runs a very cool blog called Stats, Maps n Pix that focuses on geographically oriented data and related visualizations. His GIFs were hypnotic, and provided a great sense of where workers live and work in metropolitan areas. At 4.1 million rows, the data source is not even remotely in the ballpark of what people consider “big data”, but I’m interested both in visualizations and in publicly available data that I could use to practice “big data” style analyses. Alasdair mentions the idea of creating a version for the whole US in one of his blog entries, so I thought I would take a stab at it.
The commuter data is supplied as part of the American Community Survey (ACS), an ongoing survey that replaced the US Census long form. The government makes this data available to the public to help non-profits, businesses, and government agencies better understand and target their plans and funding geographically.
Part of the ACS includes commuting information for those respondents in the workforce, detailing where the respondent lives, where they work, and their method of transportation. The lowest level of geographical detail for the ACS is the census tract, which according to the US Census Bureau defines an area of about 4,000 residents who have similar population characteristics. The data is available for download, either as subsets using filters or as the entire 4.1 million rows in an Access table.
There’s also a file of information about the individual census tracts that includes latitude and longitude (I assume the center of the tract) that can be used for mapping. I noticed a few missing tracts and I’m not sure why, but didn’t go back to figure it out.
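As a rough illustration of how the two files fit together, here is a minimal Python sketch that attaches home and work coordinates to each commuter flow and sets aside any flow whose tract is missing from the tract file. This is not the code behind the visualization (which uses SQL Server and C#). The field names `home_tract`, `work_tract`, and `commuters`, and the sample tract IDs and coordinates, are made up for the example.

```python
def attach_coordinates(flows, tracts):
    """Join each flow to its home and work tract coordinates.

    flows:  list of dicts with hypothetical keys home_tract, work_tract, commuters
    tracts: dict mapping tract ID -> (latitude, longitude)
    Returns (joined, skipped); skipped holds flows with a missing tract.
    """
    joined, skipped = [], []
    for flow in flows:
        home = tracts.get(flow["home_tract"])
        work = tracts.get(flow["work_tract"])
        if home is None or work is None:
            # One end of the flow has no entry in the tract file.
            skipped.append(flow)
            continue
        joined.append({**flow, "home_latlon": home, "work_latlon": work})
    return joined, skipped


# Illustrative sample data only, not real ACS records.
tracts = {
    "26161400100": (42.28, -83.74),
    "26163520700": (42.33, -83.05),
}
flows = [
    {"home_tract": "26161400100", "work_tract": "26163520700", "commuters": 120},
    {"home_tract": "26161400100", "work_tract": "26999999999", "commuters": 15},
]
joined, skipped = attach_coordinates(flows, tracts)
```

Keeping the skipped flows in their own list makes it easy to count how many rows are affected by the missing tracts.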
These are the data I used to create the SQL Server tables that drive the visualization. The web app itself is built in C#/.NET.
The resulting animations are somewhat hypnotic (even my dog seemed to go into a trance watching them, leading to minutes of human amusement) but also provide a quick visual way of seeing how workers are distributed as they commute into a given city. The points are sized based on the number of commuters, so a large dot indicates a relatively high number of commuters moving from the same home tract to the same work tract. The dots are also color coded by county, so you can see which counties are most represented in the commuter sample.
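For readers curious how dots like these might be sized and moved, here is a small Python sketch. It is only one plausible approach, not the logic the site actually uses. The square-root scaling, the radius limits, and the straight-line interpolation are all my assumptions for the example. The county lookup relies on the standard 11-digit tract GEOID, whose first five digits are the state and county FIPS codes.

```python
import math


def dot_radius(commuters, max_commuters, min_radius=2.0, max_radius=20.0):
    """Radius grows with the square root of the flow's share of the largest flow,
    so big flows stand out without swamping the map (assumed scaling)."""
    if max_commuters <= 0:
        return min_radius
    share = max(0.0, min(1.0, commuters / max_commuters))
    return min_radius + (max_radius - min_radius) * math.sqrt(share)


def dot_position(home, work, t):
    """Straight-line interpolation between home and work (lat, lon), t in [0, 1]."""
    t = max(0.0, min(1.0, t))
    return (
        home[0] + (work[0] - home[0]) * t,
        home[1] + (work[1] - home[1]) * t,
    )


def county_of(tract_geoid):
    """State + county FIPS prefix of an 11-digit tract GEOID, usable as a color key."""
    return tract_geoid[:5]
```

Calling `dot_position(home, work, t)` with `t` stepping from 0 to 1 gives the frames of a single commute, and `county_of` gives a key to look up each dot's color.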
To get more information on a given flow or workplace/home, pause the animation using the PAUSE/PLAY toggle and click on a dot. Options will appear for opening a tab in Google Maps to see either the home tract, the work tract, or the Google Maps driving directions between the two tracts. This can sometimes help to see that a large dot is focused on an airport or university as a workplace.
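Links like these can be built from nothing more than the tract coordinates. The sketch below uses the documented Google Maps URLs format (`api=1` with `query`, `origin`, `destination`, and `travelmode` parameters). I am not claiming this is how the site builds its links. The function names and sample coordinates are made up for the example.

```python
from urllib.parse import urlencode


def place_url(lat, lon):
    """Google Maps search link centered on a single tract coordinate."""
    query = urlencode({"api": 1, "query": f"{lat},{lon}"})
    return "https://www.google.com/maps/search/?" + query


def directions_url(home, work):
    """Google Maps driving directions from the home tract to the work tract."""
    params = {
        "api": 1,
        "origin": f"{home[0]},{home[1]}",
        "destination": f"{work[0]},{work[1]}",
        "travelmode": "driving",
    }
    return "https://www.google.com/maps/dir/?" + urlencode(params)


# Illustrative coordinates only.
home_link = place_url(42.28, -83.74)
route_link = directions_url((42.28, -83.74), (42.33, -83.05))
```

`urlencode` escapes the comma in each coordinate pair as `%2C`, which Google Maps accepts.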
You also have the option to launch a tab in a great site called Loveland, based in Detroit near my hometown of Ann Arbor. It has detailed information on census tracts and will bring up a map of an individual tract so its size, shape, and contents can be examined.
More can be added, and please feel free to email me if you have any questions or suggestions. I appreciate you stopping by and taking a look.