I’ve always had a problem with the pervasive assumption in transportation research that everyone takes the shortest metric distance path when travelling between A and B. This idea doesn’t seem to have any solid foundations in research, and intuitively it doesn’t make much sense – how do you even know what the shortest distance path is anyway?

So a good deal of my research has looked into what people really do. I’m not going to reveal all here – journal papers are generally more important than blogs in assuring future employment – but I’ll share one interesting finding.

The data I have used relates to 700,000 taxi routes through London (you might remember I blogged about this dataset previously). For each of these routes, between origin and destination, I have also calculated an optimum path, according to a range of metrics, one being distance. Then, as far as this blog post goes, I have compared each route and calculated the percentage match between the real route and the optimum shortest distance journey.
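A minimal sketch of how such a percentage match could be computed, and not the code behind the analysis. It assumes a toy undirected road graph with made-up edge lengths, and it assumes "match" means the share of the real route's length that runs along edges also used by the shortest distance path. The post does not spell out the exact definition, so the helper names `shortest_path`, `path_edges` and `percentage_match` are all hypothetical.

```python
import heapq

# Toy undirected road network: {node: {neighbour: length}} (made-up values).
GRAPH = {
    "A": {"B": 2, "C": 4},
    "B": {"A": 2, "C": 1, "D": 2},
    "C": {"A": 4, "B": 1, "D": 2},
    "D": {"B": 2, "C": 2},
}

def shortest_path(graph, origin, destination):
    """Dijkstra's algorithm; returns (list of nodes, total length)."""
    queue = [(0.0, origin, [origin])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == destination:
            return path, dist
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (dist + length, neighbour, path + [neighbour]))
    raise ValueError("no path between origin and destination")

def path_edges(path):
    """Undirected edges of a path, as a set of frozensets."""
    return {frozenset(pair) for pair in zip(path, path[1:])}

def percentage_match(graph, real_route, optimal_route):
    """Share of the real route's length that lies on the optimal route (assumed definition)."""
    optimal = path_edges(optimal_route)
    total = 0.0
    shared = 0.0
    for a, b in zip(real_route, real_route[1:]):
        length = graph[a][b]
        total += length
        if frozenset((a, b)) in optimal:
            shared += length
    return 100.0 * shared / total

real_route = ["A", "B", "C", "D"]  # hypothetical observed taxi route
optimal_route, optimal_length = shortest_path(GRAPH, "A", "D")
match = percentage_match(GRAPH, real_route, optimal_route)
```

Weighting by edge length rather than counting edges stops a long shared stretch and a short one from counting equally, though either choice would be defensible.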

**Realistic?**

So is the shortest distance path a decent representation of reality? No.

On average, the shortest distance path is able to estimate only 39.8% of each route. Pretty poor when you consider that it is often the sole basis for predicting the behaviour of many individuals.

Not only this, the data shows that the shortest distance path is followed in its entirety only on very rare occasions. Only 5% of real journeys match at least 90% of their equivalent shortest distance path, with this value rising only to 13% when that threshold is dropped to 75%.
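Summary figures of this kind (a mean match, and the share of journeys at or above a threshold) can be derived from a list of per-route match percentages. The sketch below uses made-up values, not the taxi data, and `share_at_or_above` is a hypothetical helper name.

```python
from statistics import mean

def share_at_or_above(matches, threshold):
    """Percentage of routes whose match is at least `threshold` percent."""
    return 100.0 * sum(m >= threshold for m in matches) / len(matches)

# Made-up per-route match percentages, purely for illustration.
matches = [95.0, 80.0, 40.0, 10.0]

average_match = mean(matches)
share_90 = share_at_or_above(matches, 90)
share_75 = share_at_or_above(matches, 75)
```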

**Minimising Distance**

So, do people have no consideration for distance when they route through the city? Well, no, that isn’t quite the case.

The graph below shows a scatter plot of real distances against optimal distances. As you can see, the relationship and resulting R-squared is pretty good.

Note: Overly long routes (three times optimal distance) have been removed.
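As a rough illustration of this comparison, the sketch below drops routes longer than three times their optimal distance and then computes R-squared for a simple linear fit (which equals the squared Pearson correlation). The distance pairs are made up, the cut-off is assumed to be inclusive at exactly three times, and `r_squared` is a hypothetical helper, not the code behind the plot.

```python
def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Made-up (optimal distance, real distance) pairs in km, for illustration only.
pairs = [(1.0, 1.2), (2.0, 2.5), (3.0, 3.3), (1.0, 5.0)]

# Remove overly long routes: keep those at most three times the optimal distance.
kept = [(opt, real) for opt, real in pairs if real <= 3 * opt]

optimal_km = [opt for opt, _ in kept]
real_km = [real for _, real in kept]
fit_quality = r_squared(optimal_km, real_km)
```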

It therefore appears that people *minimise* distance – or at least do not stray extremely far from the minimum – but do **not** generally take the optimal shortest distance path.

This is research I’m still pulling together, but I hope this post is of interest to the wider community. For anyone who is interested, do get in touch and I’ll let you know when the paper on this may be out.

Really interesting, thanks. I suspect the data is not available but one thing I would love to see some figures on is the extent to which cyclists avoid taking the shortest route due to a desire to avoid the heaviest/fastest traffic. It seems to me that safety concerns are probably more salient for cyclists than other road users and their behaviour is therefore likely to be somewhat different.

Ed – it’s really interesting you’re working with this dataset, as I work with a similar (live) dataset from tracked fleet vehicles in Sydney. Seems like you’re doing really interesting work! Stumbled across your work/blog through a retweet from Toby (went to Uni with him). I’m curious as to whether you’ve considered there may be inherent bias in the dataset for this analysis – in that the taxi drivers have a vested interest in not taking the shortest route? I’d certainly be interested to read more in the completed paper when you’ve finished it.

Hi Rich, apologies for the slow reply, I just saw this. You’re right that in principle these biases may exist, but I haven’t spotted them in the data (in the form of high use of cut-throughs, shortcuts etc). The drivers involved in this dataset are not on the meter, as the company charges a fixed fee for any given origin-destination. You’d expect they’d like to complete the job quickly, in order to move on to the next job, but I wouldn’t expect them to be significantly more incentivised than the average driver. The work is just about complete but I’m also moving on to my PhD write-up, so it probably won’t be out until next year. Can certainly let you know when it is though. I’d be interested to know more about your work too, if you’ve written anything up. My email is ucesejm (at) ucl (dot) ac (dot) uk.

Nice. This is specifically distance? Is there any way of looking at shortest time routes? The hunch would be that taxi drivers know which routes are quickest, but those won’t necessarily be the shortest distance (and they may well be using their own knowledge about what routes are faster at certain times of day etc – can you track individuals to see how much their routes vary and if there’s a time of day correlation?)

Cf. Rich’s point about being incentivised to go longer routes, I suspect you’re right – better to get on the next job ASAP, hence me wondering if their optimising of time might show a stronger correlation than distance alone (dunno whether ITN speed limit data would be enough – any other way of getting data on average time for the edges?)