Identifying Communities in Traffic Flow

One recent bit of research I have been working on has been looking at the application of community detection algorithms to traffic flow in London.

The idea is that within the traffic system exist a number of sub-systems of highly interconnected roads.  To a certain extent, these sub-systems are engineered into the system.  Transport for London, for example, specifically manage and maintain 23 key routes into and around central London, known as ‘corridors’.  However, to what extent do further systems exist outside of these defined zones?

Community detection algorithms were developed to identify clusters within a network dataset.  These methods are most often applied to social networks, in the identification of cliques, where a cluster demonstrates high internal connectivity and lower connectivity with the rest of the network.  My thinking behind this bit of work was that we might be able to identify similar characteristics in traffic flow, where we can observe high coupling within clusters of nodes.

The map below visualises the modules (distinguished by colour) identified through the application of community detection methods to a topological representation of the road network.  Node connectivity is established using a dataset of 1.5 million private hire cab routes through London.
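For anyone wanting to play with the idea, here is a minimal sketch of the ingredients.  It assumes each route is simply a list of node ids, builds edge weights by counting how often consecutive nodes are traversed, and scores a given grouping of nodes with modularity, one common quality measure in community detection.  The toy routes, node names and function names are all made up for illustration, and this only scores a partition rather than searching for the best one.

```python
from collections import defaultdict

def build_flow_graph(routes):
    """Count how often each undirected node pair is traversed consecutively."""
    weights = defaultdict(float)
    for route in routes:
        for a, b in zip(route, route[1:]):
            if a != b:
                weights[frozenset((a, b))] += 1.0
    return weights

def modularity(weights, community_of):
    """Weighted modularity Q for a given node -> community mapping."""
    total = sum(weights.values())      # m: total edge weight
    internal = defaultdict(float)      # edge weight inside each community
    degree = defaultdict(float)        # summed weighted degree per community
    for edge, w in weights.items():
        a, b = tuple(edge)
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)

# Hypothetical toy routes: two well-used triangles joined by one link (C-D).
routes = [
    ["A", "B", "C", "A"],
    ["D", "E", "F", "D"],
    ["C", "D"],
]
flow = build_flow_graph(routes)

# A sensible grouping (one module per triangle) and a deliberately poor one.
good = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
bad = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 0, "F": 1}
q_good = modularity(flow, good)
q_bad = modularity(flow, bad)
```

The grouping that follows the two triangles scores higher than the scrambled one, which is the property a detection algorithm tries to maximise when it searches over groupings.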

[Image: NodeModularity_GrLondon_3_1k_newcred]

The resulting visualisation, apart from being quite pretty (thank Gephi for that), reveals some interesting trends.  To a certain extent, a number of expected patterns in traffic flow are prevalent, with some of the ‘corridors’ into central London, such as the M3, M4 and A2, clearly defined as distinct clusters.  Yet the image also shows how both the M25, the ring road around London, and the North Circular, usually considered as single entities, can be segmented into modules defined by their usage.

We also see interesting patterns in central London, where certain regions – specifically Knightsbridge, Soho, Shoreditch, the City and Hyde Park – are clearly defined as distinct modules.  These would appear to be areas of high internal movement, and thus a clear product of cab usage patterns.

These results, while presented only in their initial stages, demonstrate how measures of network characteristics can help us to understand dynamic patterns of movement in the city.

 

Edit

Thanks to all for the interest in this work!

Just by way of follow up, the image below shows a zoom in on Central London, demonstrating more clearly some of the regions mentioned above.  I’ve annotated this version for people who may not be familiar with London.

[Image: CentralLondonModularity_02_annotated]

 

Navigating the City: Minimising Distance but NOT Minimal Distance

I’ve always had a problem with the pervasive assumption in transportation research that everyone takes the shortest metric distance path when travelling between A and B.  This idea doesn’t seem to have any solid foundations in research, and intuitively it doesn’t make much sense – how do you even know what the shortest distance path is anyway?

So a good deal of my research has looked into what people really do. I’m not going to reveal all here – journal papers are generally more important than blogs in assuring future employment – but I’ll share one interesting finding.

The data I have used relates to 700,000 taxi routes through London (you might remember I blogged about this dataset previously).  For each of these routes, between origin and destination, I have also calculated an optimum path, according to a range of metrics, one being distance.  Then, as far as this blog post goes, I have compared each route and calculated the percentage match between the real route and the optimum shortest distance journey.
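To make the comparison concrete, here is a rough sketch of how such a match could be computed.  It assumes the road network is a dictionary of node-to-neighbour lengths, finds the shortest distance path with Dijkstra's algorithm, and then takes the share of the real route's length that runs along edges also on that shortest path.  The toy network, the function names and that particular definition of "match" are my assumptions for illustration, not necessarily the measure behind the figures in this post.

```python
import heapq

def shortest_path(graph, origin, destination):
    """Dijkstra over graph[node] = {neighbour: length}; assumes a connected network."""
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph[node].items():
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1]

def edges_of(path):
    """Undirected edges along a path of node ids."""
    return {frozenset(pair) for pair in zip(path, path[1:])}

def percentage_match(graph, real_route):
    """Percentage of the real route's length lying on the shortest distance path."""
    optimal = shortest_path(graph, real_route[0], real_route[-1])
    shared = edges_of(optimal)
    total = matched = 0.0
    for a, b in zip(real_route, real_route[1:]):
        total += graph[a][b]
        if frozenset((a, b)) in shared:
            matched += graph[a][b]
    return 100.0 * matched / total

# Hypothetical toy network, lengths in metres.
graph = {
    "A": {"B": 100.0},
    "B": {"A": 100.0, "C": 100.0, "E": 120.0},
    "C": {"B": 100.0, "D": 100.0},
    "D": {"C": 100.0, "E": 120.0},
    "E": {"B": 120.0, "D": 120.0},
}
optimal = shortest_path(graph, "A", "D")
match = percentage_match(graph, ["A", "B", "E", "D"])
```

In this toy case the driver shares only the first link with the shortest path, so the match is well under half of the journey even though the route is only a little longer.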

Realistic?

So is the shortest distance path a decent representation of reality?  No.

On average, the shortest distance path is able to estimate only 39.8% of each route.  Pretty poor when you consider that it is often the sole basis for predicting the behaviour of many individuals.

Not only this, the data shows that the shortest distance path is followed in its entirety only on very rare occasions.  Only 5% of real journeys match at least 90% of their equivalent shortest distance path, with this value only rising to 13% when that threshold is dropped to 75%.

Minimising Distance

So, do people have no consideration for distance when they route through the city?  Well, no, that isn’t quite the case.

The graph below shows a scatter plot of real route distances against optimal (shortest path) distances.  As you can see, the relationship is strong and the resulting R-squared is pretty good.

[Image: DistanceVsOptimalAll]

Note: Overly long routes (three times optimal distance) have been removed.

It therefore appears that people do try to minimise distance – or at least do not stray extremely far from the minimum – but do not generally take the optimal shortest distance path.
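For the curious, the distance comparison can be sketched in a few lines.  This assumes a list of (optimal distance, real distance) pairs, drops anything at three times the optimal distance or more, and computes the R-squared of a straight-line fit.  The sample values are invented purely to show the mechanics and are not taken from the taxi data.

```python
def filter_routes(pairs, max_ratio=3.0):
    """Drop routes whose real distance is max_ratio times the optimal or more."""
    return [(opt, real) for opt, real in pairs if real < max_ratio * opt]

def r_squared(pairs):
    """R-squared of an ordinary least-squares line fitted to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    # For a single predictor, R-squared equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)

# Invented (optimal_km, real_km) pairs; the last one is an overly long route.
journeys = [(1.0, 1.2), (2.0, 2.3), (3.0, 3.9), (4.0, 4.4), (2.0, 7.0)]
kept = filter_routes(journeys)
fit = r_squared(kept)
```

A high value here only says that real distance tracks optimal distance closely; it says nothing about whether the same streets were used, which is the point of the comparison above.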

This is research I’m still pulling together, but I hope this post is of interest to the wider community.  For anyone that is interested, do get in touch and I’ll let you know when the paper on this may be out.

Modelling Movement in the City: The Influence of Individuals

‘Modelling Movement in the City: The Influence of Individuals’ was the title of a talk I gave at the AGILE conference in Avignon, France last week.  For the conference I actually initially prepared a poster that never ended up seeing the light of day – except for now that is.

The poster presents some recent work I carried out through agent-based simulation, demonstrating how different behavioural models influence the formation of macroscopic patterns.  As you can see from the results, even basic assumptions have a significant impact upon the unfolding network picture.

[slideshare id=19430885&doc=agileposter-130421150321-phpapp02]

Probably now going to write this up as a journal paper, but hopefully putting the poster up here won’t mess with any copyright stuff – please let me know if it might!

LongLatMe: Location-embedded SMS

So, let’s say you want to meet up with your friends.  You text – “Where are you?”.  “We’re at the Bar Bar on 59th Street”, they reply.  Now you need to look the place up, and navigate your way there.

Instead, why can’t your friend just send you a location object within the SMS, encoding their current coordinates?  The ability to locate exists; all that needs to be developed is a generic method for integration with all current mapping applications, allowing you to easily route your way to their location.

Does this exist?  And, if it doesn’t, then why not?  I’d be amazed if Google haven’t thought to implement this with Android.

Edit: This exists (well, of course it does!) though without as good a name.  You can find out about ‘GeoSMS’ at this Wikipedia page…
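As a rough illustration of what a "location object" in a text message might look like, the sketch below embeds coordinates using the `geo:` URI scheme defined in RFC 5870 and pulls them back out at the other end.  To be clear, this is not the GeoSMS format itself, just a simple stand-in; the function names and the coordinates are made up for the example.

```python
import re

def to_geo_uri(latitude, longitude):
    """Encode WGS84 coordinates as an RFC 5870 'geo:' URI."""
    if not (-90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0):
        raise ValueError("coordinates out of range")
    return f"geo:{latitude:.6f},{longitude:.6f}"

# Matches "geo:<lat>,<lon>" with optional sign and decimal part.
GEO_PATTERN = re.compile(r"geo:(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")

def find_location(message):
    """Return (latitude, longitude) from the first geo URI in a message, or None."""
    match = GEO_PATTERN.search(message)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

# Made-up coordinates standing in for the friend's position.
sms = "We're at the Bar Bar: " + to_geo_uri(40.764, -73.973)
location = find_location(sms)
```

The receiving phone would then only need to hand the parsed coordinates to whichever mapping application the user prefers.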

A Simple Idea for Making Route Directions More Human

For many, route planners are vital in finding your way around the city.  Type your destination into Google Maps or one of the many other websites or apps available, and you’ll be returned a list of directions from your location.  Simple, right?

Hmm well, let’s have a look at an example.  Taking two well known locations in London, we’ll have a look at the walking directions provided by Google Maps – Buckingham Palace to the Tate Modern – here we go.  Great George Street, fine, Bridge Street, ok, follow the A302, errr, something about the Millennium Bridge, and we’re there, maybe.

OK, if you’re a Londoner, how would you describe the route to someone?  I suspect it might go something like this…

Right, so from Buckingham Palace, head down towards Parliament, keep left of Parliament and go over the bridge.  At the end of the bridge, turn left, go past the Millennium Wheel, carry on along the river.  You’ll pass the National Theatre and the OXO Tower, then the Tate Modern is opposite St Paul’s.

So why can’t Google Maps or anyone else include these instructions?  They have the data on the locations of these places.  They have the direction of movement of the individual, so can have an idea of what is in front of them…

“Yes, but what about obstacles stopping people from seeing these places?!”, I hear the perceptive reader ask.

Well, Google and Flickr hold ample amounts of georeferenced photography that would allow them to calculate viewsheds of these locations.  The locations and groupings of these photos show that St Paul’s cannot be seen from Parliament, for example, and indicate the places where these locations are viewed best.  Furthermore, the volume of photos provides an indication of the popularity or salience of the location, and could even be provided with directions so that even the least familiar tourist knows what to look for.
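A very crude version of the photo idea might look like the sketch below.  It assumes each photo is a coordinate plus a landmark tag, treats a landmark as probably visible from a spot if enough photos of it were taken nearby, and ranks the results by overall photo count as a stand-in for salience.  The radius, the photo threshold and the coordinates are all invented for illustration, and a real viewshed would need far more care than this.

```python
import math
from collections import Counter

def distance_m(a, b):
    """Approximate great-circle distance in metres between (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def visible_landmarks(position, photos, radius_m=150, min_photos=2):
    """Landmarks photographed at least min_photos times within radius_m of
    position, ordered by overall photo count (a rough proxy for salience)."""
    nearby = Counter(tag for point, tag in photos
                     if distance_m(position, point) <= radius_m)
    overall = Counter(tag for _, tag in photos)
    seen = [tag for tag, count in nearby.items() if count >= min_photos]
    return sorted(seen, key=lambda tag: (-overall[tag], tag))

# Invented photo points; these are not the landmarks' real positions.
photos = [
    ((51.5001, -0.1201), "OXO Tower"),
    ((51.5002, -0.1199), "OXO Tower"),
    ((51.4999, -0.1202), "National Theatre"),
    ((51.5003, -0.1200), "National Theatre"),
    ((51.5000, -0.1198), "National Theatre"),
    ((51.5100, -0.1000), "St Paul's"),
    ((51.5101, -0.1001), "St Paul's"),
    ((51.5102, -0.0999), "St Paul's"),
]
landmarks = visible_landmarks((51.5000, -0.1200), photos)
```

Here the walker would be told to look out for the two landmarks photographed from around their position, while the one only ever photographed from a kilometre away is left out of the directions.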

Considering the volumes of crowdsourced data they hold, I feel like Google are missing a pretty simple trick here.  So, come on, Google, why not improve this feature and make a walk through the city more interesting to everyone.

Mapping Taxi Routes in London

A major part of my research looks into how people choose their routes around the city.  To aid me in this, I managed to acquire a massive dataset of taxi GPS data from a private hire firm in London.  I’ve spent the last few months cleaning up the data, removing errors, deriving probable routes from the point data and extracting route properties.

It’s been a big job, but worth it.  I now have the route data of over 700,000 taxi journeys, from exact origin to destination, over the months of December, January and February 2010-11.  I’m now moving on to the actual analysis of this data, and am beginning to answer some of these questions concerning real-world route choice.  In the meantime, I thought I’d share one striking image that I extracted through this work.

The image below represents an aggregate of journeys on each segment of road on the London road network.  The higher levels of flow are illustrated in red, falling to orange, yellow, then white, with the lowest flow values shown in grey.
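The aggregation itself is simple enough to sketch.  This assumes each journey has already been turned into a list of road segment ids, counts how many journeys use each segment, and maps the count onto the colour ramp described above.  The class breaks and segment ids are placeholders I have made up, not the values used for the map.

```python
from collections import Counter

def segment_flows(routes):
    """Count journeys per road segment; each route is a list of segment ids."""
    flows = Counter()
    for route in routes:
        flows.update(set(route))  # assumption: count a segment once per journey
    return flows

# Hypothetical class breaks as (minimum flow, colour), highest first.
BREAKS = [(1000, "red"), (500, "orange"), (100, "yellow"), (10, "white")]

def colour_for(flow):
    """Colour for a flow value; anything below the lowest break is grey."""
    for minimum, colour in BREAKS:
        if flow >= minimum:
            return colour
    return "grey"

# Tiny made-up example of three journeys.
routes = [["s1", "s2"], ["s2", "s3"], ["s2", "s2", "s4"]]
flows = segment_flows(routes)
colours = {segment: colour_for(count) for segment, count in flows.items()}
```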

The most popular routes are along Euston Road, Park Lane and Embankment, which may be somewhat expected, but make for a stark contrast with respect to the flow of most traffic in London.  The connection with Canary Wharf comes out strongly, an indication of the company’s portfolio, though route choice here is interesting with selection of The Highway more popular than Commercial Road.

Real insight will come with the full analysis of the route data, something that should be completed in January.  Until then, though, I’ll just leave you with this pretty something to look at.

Space Syntax to GIS Integration: A Roadmap

Something I have been thinking about recently is the possibility of integration between GIS and space syntax.  The motivations are very clear.  Space syntax represents a compelling quantitative model of human behaviour and movement.  Meanwhile, the understanding of human systems is one of the most important areas of GIScience research (I may be slightly biased).  And with the ever increasing availability of movement data on a range of levels, the development of a model underlying this behaviour is ever more important.  So why can’t these two just get it on?

Representation

Well, the old argument has been that axial maps – the fundamental representation of space syntax – are simply not compatible with GIS.  Axial lines represent lines of sight, while GIS data segments are supposedly geographically accurate – at the level of network measures this difference is highly significant.  However, developments in space syntax – notably the development of Angular Segment Analysis by the brilliant Alasdair Turner, who very sadly died last week – mean that GIS integration is very much a possibility.

Turner’s approach was to measure the angular deviation between road segments on a GIS layer, assigning a score of zero for straight-ahead travel.  The greater the movement away from the straight line the higher the score, effectively yielding a new axial line.  Running angular betweenness (aka ‘choice’ in space syntax circles) calculations on the network yields some interesting results that I have discussed previously.  The story is clearly much more complex than this (and more can be read on this here).  But essentially this could be viewed as a new link between the traditional view of space syntax and GIScience.
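The angular scoring described above can be sketched roughly as follows.  This assumes planar (x, y) coordinates and scales the turn so that straight on costs 0, a right angle costs 1 and a complete reversal costs 2, which is the scaling I understand to be usual in angular analysis; treat both the scaling and the function names as my assumptions rather than a statement of Turner's implementation.

```python
import math

def bearing(start, end):
    """Direction of travel from start to end, in degrees, for planar points."""
    return math.degrees(math.atan2(end[1] - start[1], end[0] - start[0]))

def angular_cost(a, junction, b):
    """Turn cost for travelling a -> junction -> b: 0 straight on, 1 for 90°, 2 for 180°."""
    turn = abs(bearing(junction, b) - bearing(a, junction)) % 360.0
    if turn > 180.0:
        turn = 360.0 - turn  # left and right turns are treated alike
    return turn / 90.0

def angular_path_cost(points):
    """Sum of turn costs along a path of at least two points."""
    return sum(angular_cost(a, j, b)
               for a, j, b in zip(points, points[1:], points[2:]))

# A made-up dog-leg: east, then north, then east again (two right-angle turns).
dog_leg = [(0, 0), (1, 0), (1, 1), (2, 1)]
cost = angular_path_cost(dog_leg)
```

Using these turn costs in place of metric length when searching for least-cost paths is what turns an ordinary betweenness calculation into an angular one.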

ASA to the rescue?

However, some recent work I’ve been carrying out suggests that the picture is not so simple.  Specifically, it is not necessarily possible to run an Angular Segment Analysis on a raw GIS layer.  Taking the example of the OS ITN dataset – the most extensive representation of the UK road network – the presence of dual carriageways, roundabouts and other artefacts is contrary to what one would expect from an equivalent to the axial map.  And, indeed, betweenness measures on these networks do not inspire either, with strange variations across the datasets, notably across dual carriageways where big discrepancies can be found.

There are two key aspects at play here, I feel.  First, ASA in its current form does not take account of traffic infrastructure and regulations.  Were it to handle routing information then the results may be more realistic, certainly in terms of the flow on dual carriageways and roundabouts.  Second, dual carriageways and roundabouts do not align with the fundamental idea behind the axial map.  Cognitively speaking, we do not think in terms of dual carriageways, rather simply the existence of a roadway at a given location.  In other words, why should the two sides of a dual carriageway be assessed independently when the road was simply engineered into two lanes?

Roadmap?

So, what can be the way forward here?  Well, I know that where ASA is used commercially, the underlying network model is initially simplified to remove dual carriageways and roundabouts.  But this seems awfully unscientific (well, maybe cartography isn’t particularly scientific either…).  My suggestion, and something I am currently pursuing, is usage of simpler, existing GIS datasets.  These datasets are already widely used and better validated than a subjective in-house alteration.  Yet, what about other models and datasets that require more extensive GIS data?  I suggest the development of tools that link together different GIS datasets, allowing an exchange of data yet not disrupting the validity of each approach.  We can even try to link the axial map back to a range of GIS layers, and truly gain an understanding about the strengths of these approaches.

This is something I’ll be working on over the next few months – so watch this space, or get in touch if you’re interested in this.

Rioting, it ain’t so simple…

The last week of trouble on the streets of British towns provides an interesting ‘field study’ of collective behaviour.  While the media and politicians seek to simplify the argument, understanding is only reached by examining the full complexity of the situation.  In seeking to remain as objective as possible, I’ll try to identify some diversity within these groups – starting with the ‘Rioters’:

The Destructors:  These are those intent on destruction.  Simply put, those who break the windows and light the fires.  They are highly influential on those around them, perhaps due to infectious bravado and dynamism.  They are likely to be within or supported by a close group of friends (e.g. gang structure) that encourages and respects this behaviour.  They may be motivated by an underlying resentment for (and perhaps a lack of fear of) the police or their community in general, although this may not be the focus of their actions.

The Followers:  These are those people brought onto the streets by sheer interest in what is happening in their neighbourhood.  On seeing the behaviour of those mentioned above (perhaps viewed as fun, or exciting), twinned with a lack of police intervention, they will join in also, although without the same vigour as The Destructors.  They are likely to be more fearful of police action.

The Opportunists:  These are those who did not get involved with the wanton destruction, rather they were attracted by the potential of looted items.  They are united by a desire for material gain.  This may be twinned with an underlying sentiment that they have not received as much in the way of these items as they perceive to be ‘fair’.  This means that members of this group may be from any part of society, any person who feels that they deserve more. (Possible example: Laura Johnson)

The Observers:  These are those just watching and not getting involved.  Don’t underestimate the influence of hundreds of observers in making a riot look larger or more dangerous than it is.

In essence, it is too simple, too cack-handed to regard the ‘Rioters’ from one viewpoint.  Within the population of people out on the streets during those nights is a great deal of diversity.  This is important as it raises different questions as to how we deal with the underlying problems.  For example, why were these ‘Opportunists’ (as I’ve termed them) drawn out onto the streets?  What can we do within our society, our society of superficiality and the culture of success attached to material wealth, to stop these people from acting this way again?

Furthermore, it is important to fully grasp the numbers of people we are talking about when it comes to addressing the scale of the issue.  This is hard to get a grasp on, and while the news reports can provide some sense of this, they are only drawn to the worst examples of behaviour.  However, I believe that, contrary to much opinion, there were only a small number of these so-called Destructors.  Rather, the behaviour of these people (within gangs) was highly influential on those around them.  Their own behaviour, and the resulting lack of action against them, encouraged the behaviour of those in other groups.

So when we did begin to see a crackdown by police, and arrests of hundreds of people, the rioting almost ceased straight away.  This would support the idea of a far greater number of ‘Followers’, those keen to be involved in the ‘fun’ but not those who will start it – in some respects, those afflicted by the ‘Madness of Crowds’ (see former blog post).  The Destructors, perhaps depleted in numbers and without the potential cover offered by the presence of many ‘Followers’, ‘Opportunists’ and ‘Observers’, simply stay at home.

The riots were a truly terrible event, but in seeking to understand what has happened we need to get a grasp of the full complexity of behaviour within the rioting populace.  We are not talking about ‘feral youth’ or ‘people gone wild’; the situation is more nuanced and requires a careful form of analysis and politics that, I fear, it won’t receive.

At some point this week I will try to apply the same analysis to the actions of the wider population during this period.

Online social networks maintain anger and irrationality of London rioters

There is no doubt about the importance of social media in organising and directing crowd behaviour.  But there has been little discussion of how these networks maintain certain social structures outside of periods of group activity.

As far as I can see, in the case of the London riots, young people are so intertwined with online social networking that they are never disconnected from the crowd.  The ideas that seem ‘normal’ and ‘acceptable’ during the actual riots – namely hatred of the police, the desire to burn down and loot property – are maintained through these online connections.  When otherwise people may have had time to individually take stock and reflect, there is always the online ‘crowd’ continuing to stoke the fire.

So, naturally then, people get together under the excitement that something might happen.  And when inevitably something does kick off, everyone gets involved.  What we then have is chaos and typical rioting behaviour.

The Madness of Crowds – London Edition

‘The Madness of Crowds’ was a book written by Charles Mackay in 1841, describing the formation of crowd behaviours such as hysteria, economic bubbles and mass panic.  Mackay was among the first to begin to describe widespread phenomena that exist beyond the realm of individual rationality, phenomena that only exist through the interaction of crowds.  One particularly prescient quote may be as follows:

“Men, it has been well said, think in herds; it will be seen that they go mad in herds, while they only recover their senses slowly, and one by one.”

It appears to me that, in trying to understand and explain what has happened in London over the last few days, the press and politicians have forgotten this basic principle of crowd behaviour.

We all know that rioting and looting is a criminal activity (thanks for pointing that out Nick Clegg and Boris Johnson), but it is now taking place within an environment of acceptance and normality, an environment that has developed extremely quickly.  Within these social networks, existing across the intertwined ‘real’ and online worlds, there persists an ongoing idea, for whatever reason, that this behaviour should be taking place.  This is clearly dangerous and irrational, but it is an idea that remains.  Instead of calming the situation, I suspect that the threat of heavy policing and criminal prosecution is inflammatory, riling the crowd and encouraging them to go to further lengths.

In trying to understand these situations, people look to establish the drivers of this behaviour – the shooting that prompted the anger, or Twitter being used as a platform for communication.  But this misses the point.  Rioting doesn’t need a cause; it is an irrational herding behaviour, where new norms are established quickly.

The ending of this behaviour must come from the base up.  Individuals – probably many of whom are normally decent and functioning members of society – must realise for themselves that what they are doing is wrong.

Unfortunately, this realisation, with the supporting infrastructure of online social networks maintaining this irrationality, may come later rather than sooner.