Mapping Taxi Routes in London

A major part of my research looks at how people choose their routes around the city.  To aid me in this, I managed to acquire a massive dataset of taxi GPS data from a private hire firm in London.  I’ve spent the last few months cleaning up the data, removing errors, deriving probable routes from the point data and extracting route properties.

It’s been a big job, but worth it.  I now have the route data of over 700,000 taxi journeys, from exact origin to destination, over the months of December, January and February 2010-11.  I’m now moving on to the actual analysis of this data, and am beginning to answer some of the questions concerning real-world route choice.  In the meantime, I thought I’d share one striking image that I extracted through this work.

The image below represents an aggregate of journeys on each segment of road on the London road network.  The higher levels of flow are illustrated in red, falling to orange, yellow, then white, with the lowest flow values shown in grey.
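As a rough illustration of the aggregation behind the image, here is a minimal Python sketch.  It is not the code I used: the route format (a list of road segment names per journey), the rule that a journey counts once per segment, and the equal-width colour bands are all assumptions made for the example.

```python
from collections import Counter

# Colour ramp described in the post, highest flow first.
COLOURS = ["red", "orange", "yellow", "white", "grey"]


def segment_flows(routes):
    """Count how many journeys traverse each road segment.

    `routes` is a hypothetical format: one list of segment names per journey.
    """
    flows = Counter()
    for route in routes:
        # Assumption: a journey that revisits a segment is counted only once.
        flows.update(set(route))
    return flows


def colour_for(flow, max_flow):
    """Map a flow to a colour using equal-width bands of the maximum flow.

    The banding scheme is an assumption; the real class breaks may differ.
    Integer arithmetic keeps the band edges free of rounding surprises.
    """
    band = ((max_flow - flow) * len(COLOURS)) // max_flow
    return COLOURS[min(band, len(COLOURS) - 1)]


# Toy journeys, invented purely for illustration.
routes = [
    ["Euston Road", "Park Lane"],
    ["Euston Road", "The Highway"],
    ["Euston Road", "Park Lane", "Embankment"],
]
flows = segment_flows(routes)
busiest = max(flows.values())
colours = {segment: colour_for(count, busiest) for segment, count in flows.items()}
```

With real data the band edges would probably need tuning, since flows on a road network tend to be heavily skewed towards a handful of busy streets.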

The most popular routes are along Euston Road, Park Lane and Embankment, which may be somewhat expected, but make for a stark contrast with the flow of most traffic in London.  The connection with Canary Wharf comes out strongly, an indication of the company’s portfolio, though route choice here is interesting, with The Highway proving more popular than Commercial Road.

Real insight will come with the full analysis of the route data, something that should be completed in January.  Until then, though, I’ll just leave you with this pretty something to look at.

ArcGIS10: Spatial Indices

Handling large datasets in ArcGIS can be a truly painful process.  When you are up against a deadline, the seconds wasted waiting for ArcGIS to update its display or run a query can be excruciating.

That is, until you discover spatial indices!  It is like a new world, where querying data is (almost) fun, and not a reason to go and make a cup of tea.  Plus it is incredibly simple.

To apply a spatial index, first find your troublesome dataset in Catalog.  Right-click it and go to Properties, then the Indexes tab.  Find Spatial Index in the bottom pane of this window and click Add.  This will take a few seconds, but once in place it will significantly improve the speed of your redrawing and querying.
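For anyone who prefers scripting to clicking, the same step can be sketched in Python.  This assumes an ArcGIS installation that provides the arcpy package, and it calls the Add Spatial Index geoprocessing tool with its default grid settings.  The function name and the feature class path are hypothetical, and the optional arcpy_module argument is only there so the function can be tried out on a machine without ArcGIS.

```python
def add_spatial_index(feature_class, arcpy_module=None):
    """Add a spatial index to a feature class via the Add Spatial Index tool.

    `feature_class` is the path to the dataset (hypothetical in this sketch).
    `arcpy_module` lets a stand-in object be supplied; by default the real
    arcpy package is imported, which requires ArcGIS to be installed.
    """
    if arcpy_module is None:
        import arcpy as arcpy_module  # only available with ArcGIS
    # Leaving the grid sizes unset lets ArcGIS choose them.
    arcpy_module.AddSpatialIndex_management(feature_class)
    return feature_class


# Example call (hypothetical path, needs ArcGIS to run):
# add_spatial_index(r"C:\data\london.gdb\taxi_routes")
```

This does the same job as the Add button on the Indexes tab, which is handy when several datasets need indexing in one go.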

Having said all this, I’m not sure this will be new to many – but I found this very useful.  More information is available from ESRI themselves here.