Playing with Public (Transportation) Data

As promised a while back, once we had our fill of playing with the data, I would release it to the public to see if you could come up with something interesting. I’d leaked the JSON file to Kars and he applied his skills at visualizing things in Processing [1] to the dataset.

Then, after some more back and forth, I retrieved a similar dataset from the ANWB site [2]: the time to travel a similar distance at a similar time, but this time by car [3].

The juxtaposition of those two datasets made for some interesting results and some nice applications of interactive filtering. Kars has a full writeup of his process.

So without further ado, here is the dataset under a Creative Commons Attribution license. It’s a JSON file, Traveltimes, whose keys are four-digit postal code strings. Each value is a dictionary with this key-value mapping:

lat: Latitude GPS coordinate
lng: Longitude GPS coordinate
place: Inferred name of the location
time: The time it takes using public transportation
carTime: The time it takes by car (null for the small number of postal codes where the time could not be retrieved)

All times are measured from the center of the four-digit postal code area, as well as it can be determined, to Dam Square in Amsterdam around noon on a given day.
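For anyone who wants to poke at the file, here is a minimal Python sketch of reading that structure. The sample entries and their values are made up for illustration, and the helper names load_traveltimes and missing_car_times are hypothetical; with the real file you would read its contents from disk instead of using the inline string.

```python
import json

# Made-up entries in the shape described above (not real values from the dataset).
SAMPLE = """
{
  "1000": {"lat": 52.0, "lng": 4.0, "place": "Example A", "time": 10, "carTime": 5},
  "2000": {"lat": 52.5, "lng": 5.0, "place": "Example B", "time": 45, "carTime": null}
}
"""

def load_traveltimes(text):
    """Parse the JSON text into a dict keyed by four-digit postal code string."""
    data = json.loads(text)
    for code in data:
        # Keys are strings, not integers, and should be exactly four digits.
        if len(code) != 4 or not code.isdigit():
            raise ValueError(f"unexpected postal code key: {code!r}")
    return data

def missing_car_times(data):
    """Return the postal codes whose carTime is null (None in Python)."""
    return sorted(code for code, entry in data.items() if entry.get("carTime") is None)

traveltimes = load_traveltimes(SAMPLE)
print(len(traveltimes), "postal codes; no car time for:", missing_car_times(traveltimes))
```

The null check matters because any arithmetic on carTime will fail for the entries where the car time could not be retrieved.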

I find it interesting (and somewhat appalling) to see how large the difference is between taking the car and going by public transportation. Sampling for 08:00 on a Monday morning, during rush hour, might somewhat equalize this, but I think it’s safe to say that car owners will remain at an advantage.
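The difference mentioned above can be computed per postal code with a few lines of Python. This is only a sketch: the entries are made up, the function name time_differences is hypothetical, and it assumes time and carTime are plain numbers in the same unit, which the description of the file does not spell out.

```python
# Made-up entries in the documented shape; times are assumed to be numbers in one shared unit.
sample = {
    "1000": {"place": "Example A", "time": 10, "carTime": 5},
    "2000": {"place": "Example B", "time": 45, "carTime": 30},
    "3000": {"place": "Example C", "time": 200, "carTime": None},
}

def time_differences(data):
    """Map postal code -> (public transport time - car time), skipping null times."""
    return {
        code: entry["time"] - entry["carTime"]
        for code, entry in data.items()
        if entry.get("time") is not None and entry.get("carTime") is not None
    }

diffs = time_differences(sample)
# List the postal codes with the largest car advantage first.
for code, diff in sorted(diffs.items(), key=lambda item: item[1], reverse=True):
    print(code, sample[code]["place"], diff)
```

A positive difference means the car is faster; entries without a car time are simply left out rather than guessed at.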

So next up is e-mailing GroenLinks and Rover to see if they can use this data or these visualizations.

  1. With some help from Ben Fry.
  2. These site operators are so friendly and accommodating.
  3. With historical traffic congestion information added, but because the sampled time was around noon the effect should be negligible.