Identifying Communities in Traffic Flow

One recent bit of research I have been working on has been looking at the application of community detection algorithms to traffic flow in London.

The idea is that within the traffic system exist a number of sub-systems of highly interconnected roads.  To a certain extent, these sub-systems are engineered into the system.  Transport for London, for example, specifically manage and maintain 23 key routes into and around central London, known as ‘corridors’.  However, to what extent do further systems exist outside of these defined zones?

Community detection algorithms were developed to identify clusters within a network dataset.  These methods are most often applied to examples within the social network sphere, in the identification of cliques, where a cluster demonstrates high internal connectivity and lower connectivity with the rest of the network.  My thinking behind this bit of work was that we might be able to identify similar characteristics in traffic flow, where we can observe high coupling within clusters of nodes.

The map below visualises the modules (distinguished by colour) identified through the application of community detection methods to a topological representation of the road network.  Node connectivity is established using a dataset of 1.5 million private hire cab routes through London.
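
For anyone wanting to play with the idea, here is a minimal sketch of how routes could be turned into a weighted network and then partitioned.  It is not the method behind the map: it swaps in a simple label propagation scheme as a stand-in for a full community detection algorithm, and the junction names, the `build_graph` and `label_propagation` helpers and the toy routes are all made up for illustration.

```python
from collections import defaultdict


def build_graph(routes):
    """Weight each undirected link by the number of times routes traverse it."""
    graph = defaultdict(lambda: defaultdict(int))
    for route in routes:
        for u, v in zip(route, route[1:]):
            graph[u][v] += 1
            graph[v][u] += 1
    return graph


def label_propagation(graph, max_rounds=100):
    """Each node repeatedly adopts the label carrying most weight among its neighbours."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            scores = defaultdict(int)
            for neighbour, weight in graph[node].items():
                scores[labels[neighbour]] += weight
            # Highest weight wins; ties go to the alphabetically smallest label.
            best = min(scores, key=lambda label: (-scores[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    communities = defaultdict(set)
    for node, label in labels.items():
        communities[label].add(node)
    return sorted(sorted(members) for members in communities.values())


# Hypothetical routes: two busy loops joined by a single rarely used link.
routes = [["a", "b", "c", "a"]] * 3 + [["d", "e", "f", "d"]] * 3 + [["c", "d"]]
communities = label_propagation(build_graph(routes))
print(communities)
```

On this toy input the two loops come out as separate modules, because the single link between them carries far less traffic than the links inside each loop.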


The resulting visualisation, apart from being quite pretty (thank Gephi for that), reveals some interesting trends.  To a certain extent, a number of expected patterns in traffic flow are prevalent, with some of the ‘corridors’ into central London, such as the M3, M4 and A2, clearly defined as distinct clusters.  Yet the image also shows how both the M25, the ring road around London, and the North Circular, usually considered as single entities, can be segmented into modules defined by their usage.

We also see further interesting patterns in central London, where certain regions – specifically Knightsbridge, Soho, Shoreditch, the City and Hyde Park – are clearly defined as distinct modules.  These would appear to be areas of high internal movement, and thus a clear product of cab usage patterns.

These results, while presented only in their initial stages, demonstrate how measures of network characteristics can help us to understand dynamic patterns of movement in the city.



Thanks to all for the interest in this work!

Just by way of follow up, the image below shows a zoom in on Central London, demonstrating more clearly some of the regions mentioned above.  I’ve annotated this version for people who may not be familiar with London.



Mapping Taxi Routes in London

One major aspect of my research is spent looking into how people choose their routes around the city.  And to aid me in this, I managed to acquire a massive dataset of taxi GPS data from a private hire firm in London.  I’ve spent the last few months cleaning up the data, removing errors, deriving probable routes from the point data and extracting route properties.

It’s been a big job, but worth it.  I now have the route data of over 700,000 taxi journeys, from exact origin to destination, over the months of December, January and February 2010-11.  I’m now moving on to the actual analysis of this data, and am beginning to answer some of these questions concerning real-world route choice.  In the meantime, I thought I’d share one striking image that I extracted through this work.

The image below represents an aggregate of journeys on each segment of road on the London road network.  The higher levels of flow are illustrated in red, falling to orange, yellow, then white, with the lowest flow values shown in grey.
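
The aggregation itself is simple enough to sketch.  The snippet below is only an illustration, assuming each journey has already been reduced to a list of road segment identifiers; the segment names are invented, and counting a segment once per journey is my assumption rather than a description of the actual processing.

```python
from collections import Counter


def segment_flows(journeys):
    """Count how many journeys pass over each road segment."""
    flows = Counter()
    for segments in journeys:
        # set() so that a journey which doubles back counts once per segment
        flows.update(set(segments))
    return flows


# Hypothetical journeys, each a list of segment identifiers.
journeys = [
    ["euston_rd", "park_lane"],
    ["euston_rd", "embankment"],
    ["euston_rd", "the_highway"],
]
flows = segment_flows(journeys)
print(flows.most_common(1))
```

The resulting counts are what would then be binned into colour bands for the map.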

The most popular routes are along Euston Road, Park Lane and Embankment, which may be somewhat expected, but make for a stark contrast with the flow of most traffic in London.  The connection with Canary Wharf comes out strongly, an indication of the company’s portfolio, though route choice here is interesting, with The Highway proving more popular than Commercial Road.

Real insight will come with the full analysis of the route data, something that should be completed in January.  Until then, though, I’ll just leave you with this pretty something to look at.

Something I have been thinking about recently is the possibility of integration between GIS and space syntax.  The motivations are very clear.  Space syntax represents a compelling quantitative model of human behaviour and movement.  Meanwhile, the understanding of human systems is one of the most important areas of GIScience research (I may be slightly biased).  And with the ever increasing availability of movement data on a range of levels, the development of a model underlying this behaviour is ever more important.  So why can’t these two just get it on?


Well, the old argument has been that axial maps – the fundamental representation of space syntax – are simply not compatible with GIS.  Axial lines represent lines of sight, while GIS data segments are supposedly geographically accurate – at the level of network measures this difference is highly significant.  However, developments in space syntax – notably the development of Angular Segment Analysis by the brilliant Alasdair Turner, who very sadly died last week – mean that GIS integration is very much a possibility.

Turner’s approach was to measure the angular deviation between road segments on a GIS layer, assigning a score of zero for straight-ahead travel.  The greater the movement away from the straight line the higher the score, effectively yielding a new axial line.  Running angular betweenness (aka ‘choice’ in space syntax circles) calculations on the network yields some interesting results that I have discussed previously.  The story is clearly much more complex than this (and more can be read on this here).  But essentially this could be viewed as a new link between the traditional view of space syntax and GIScience.
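
As a rough illustration of the scoring idea, the sketch below computes the turn between two consecutive segments from their end points.  The `turn_cost` helper is hypothetical, the coordinates are assumed to be planar, and the scaling (1 for a right angle, 2 for a complete reversal) is an assumed convention rather than something taken from this post.

```python
import math


def turn_cost(a, b, c):
    """Angular cost of travelling a -> b and then b -> c.

    0 means straight on; the scaling of 1 per 90 degrees is an assumed convention.
    """
    heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
    heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
    deviation = abs(heading_out - heading_in)
    if deviation > math.pi:
        # Always measure the smaller of the two possible turns.
        deviation = 2 * math.pi - deviation
    return deviation / (math.pi / 2)


straight = turn_cost((0, 0), (1, 0), (2, 0))     # carry straight on
right_angle = turn_cost((0, 0), (1, 0), (1, 1))  # turn through 90 degrees
print(straight, right_angle)
```

Summing these costs along a route gives its total angular deviation, which is the weight a least-angle shortest path would minimise.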

ASA to the rescue?

However, some recent work I’ve been carrying out suggests that the picture is not so simple.  Specifically, it is not necessarily possible to run an Angular Segment Analysis on a raw GIS layer.  Taking the example of the OS ITN dataset – the most extensive representation of the UK road network – the presence of dual carriageways, roundabouts and other artefacts is contrary to what one would expect from an equivalent to the axial map.  And, indeed, betweenness measures on these networks do not inspire either, with strange variations across the datasets, notably across dual carriageways where big discrepancies can be found.

There are two key aspects at play here, I feel.  Firstly, ASA in its current form does not take account of traffic infrastructure and regulations.  Were it to handle routing information then the results may be more realistic, certainly in terms of the flow on dual carriageways and roundabouts.  Secondly, dual carriageways and roundabouts do not align with the fundamental idea behind the axial map.  Cognitively speaking, we do not think in terms of dual carriageways, rather simply the existence of a roadway at a given location.  In other words, why should the two sides of a dual carriageway be assessed independently when they were simply engineered as two carriageways of the same road?


So, what can be the way forward here?  Well, I know that where ASA is used commercially, the underlying network model is initially simplified to remove dual carriageways and roundabouts.  But this seems awfully unscientific (well, maybe cartography isn’t particularly scientific either…).  My suggestion, and something I am currently pursuing, is the use of simpler, existing GIS datasets.  These datasets are already widely used, and better validated than a subjective in-house alteration.  Yet, what about other models and datasets that require more extensive GIS data?  I suggest the development of tools that link together different GIS datasets, allowing an exchange of data yet not disrupting the validity of each approach.  We can even try to link the axial map back to a range of GIS layers, and truly gain an understanding of the strengths of these approaches.

This is something I’ll be working on over the next few months – so watch this space, or get in touch if you’re interested in this.

The last week of trouble on the streets of British towns provides an interesting ‘field study’ of collective behaviour.  While the media and politicians seek to simplify the argument, understanding is only reached by examining the full complexity of the situation.  In seeking to remain as objective as possible, I’ll try to identify some diversity within these groups – starting with the ‘Rioters’:

The Destructors:  These are those intent on destruction.  Simply put, those who break the windows and light the fires.  They are highly influential on those around them, perhaps due to infectious bravado and dynamism.  They are likely to be within or supported by a close group of friends (e.g. gang structure) that encourages and respects this behaviour.  They may be motivated by an underlying resentment for (and perhaps a lack of fear of) the police or their community in general, although this may not be the focus of their actions.

The Followers:  These are those people brought onto the streets by sheer interest in what is happening in their neighbourhood.  On seeing the behaviour of those mentioned above (perhaps viewed as fun, or exciting), twinned with a lack of police intervention, they will join in also, although without the same vigour as The Destructors.  They are likely to be more fearful of police action.

The Opportunists:  These are those who did not get involved with the wanton destruction, rather they were attracted by the potential of looted items.  They are united by a desire for material gain.  This may be twinned with an underlying sentiment that they have not received as much in the way of these items as they perceive to be ‘fair’.  This means that members of this group may be from any part of society, any person who feels that they deserve more. (Possible example: Laura Johnson)

The Observers:  They were those just watching and not getting involved.  Don’t underestimate the influence of hundreds of observers to make a riot look larger or more dangerous than it is.

In essence, it is too simple, too cack-handed to regard the ‘Rioters’ from one viewpoint.  Within the population of people out on the streets during those nights is a great deal of diversity.  This is important as it raises different questions as to how we deal with the underlying problems.  For example, why were these ‘Opportunists’ (as I’ve coined them) drawn out onto the streets?  What can we do within our society, our society of superficiality and the culture of success attached to material wealth, to stop these people from acting this way again?

Furthermore, it is important to fully grasp the numbers of people we are talking about when it comes to addressing the scale of the issue.  This is hard to get a grasp on, and while the news reports can provide some sense of this, they are only drawn to the worst examples of behaviour.  However, I believe that, contrary to much opinion, there were only a small number of these so-called Destructors.  Rather, the behaviour of these people (within gangs) was highly influential on those around them.  Their own behaviour, and the resulting lack of action against them, encouraged the behaviour of those in other groups.

So when we did begin to see a crackdown by police, and arrests of hundreds of people, the rioting almost ceased straight away.  This would support the idea of a far greater number of ‘Followers’, those keen to be involved in the ‘fun’ but not those who will start it – in some respects, those afflicted by the ‘Madness of Crowds’ (see former blog post).  The Destructors, perhaps depleted in numbers and without the potential cover offered by the presence of many ‘Followers’, ‘Opportunists’ and ‘Observers’, simply stay at home.

The riots were a truly terrible event, but in seeking to understand what has happened we need to get a grasp of the full complexity of behaviour within the rioting populace.  We are not talking about ‘feral youth’ or ‘people gone wild’, the situation is more nuanced and requires a careful form of analysis and politics that, I fear, it won’t receive.

At some point this week I will try to apply the same analysis to the actions of the wider population during this period.

There is no doubt about the importance of social media in organising and directing crowd behaviour.  But there has been little discussion of how these models maintain certain social structures outside of periods of group activity.

As far as I can see, in the case of the London riots, young people are so intertwined with online social networking that they are never disconnected from the crowd.  The ideas that seem ‘normal’ and ‘acceptable’ during the actual riots – namely hatred of the police, the desire to burn down and loot property – are maintained through these online connections.  When otherwise people may have had time to individually take stock and reflect, there is always the online ‘crowd’ continuing to stoke the fire.

So, naturally then, people get together under the excitement that something might happen.  And when inevitably something does kick off, everyone gets involved.  What we then have is chaos and typical rioting behaviour.

‘The Madness of Crowds’ was a book written by Charles MacKay in 1841, describing the formation of crowd behaviours such as hysteria, economic bubbles and mass panic.  MacKay was among the first to begin to describe widespread phenomena that exist beyond the realm of individual rationality, phenomena that only exist through the interaction of crowds.  One particularly prescient quote may be as follows:

“Men, it has been well said, think in herds; it will be seen that they go mad in herds, while they only recover their senses slowly, and one by one.”

It appears to me that, in trying to understand and explain what has happened in London over the last few days, the press and politicians have forgotten this basic principle of crowd behaviour.

We all know that rioting and looting is a criminal activity (thanks for pointing that out Nick Clegg and Boris Johnson), but it is now taking place within an environment of acceptance and normality, an environment that has developed extremely quickly.  Within these social networks, existing across the intertwined ‘real’ and online worlds, there persists an ongoing idea, for whatever reason, that this behaviour should be taking place.  This is clearly dangerous and irrational, but it is an idea that remains.  Instead of calming the situation, I suspect that the threat of heavy policing and criminal prosecution is inflammatory, riling the crowd and encouraging them to go to further lengths.

In trying to understand these situations, people look to establish the drivers of this behaviour – the shooting that prompted the anger, or Twitter being used as a platform for communication.  But this misses the point.  Rioting doesn’t need a cause; it is an irrational herding behaviour, where new norms are established quickly.

The ending of this behaviour must come from the base up.  Individuals – probably many of whom are normally decent and functioning members of society – must realise for themselves that what they are doing is wrong.

Unfortunately, this realisation, with the supporting infrastructure of online social networks maintaining this irrationality, may come later rather than sooner.

Top 2%

At the very broadest scope, Space Syntax can be said to investigate the relationship between movement and the configuration and connectivity of space.  In the past, while much favour has been found in the approach, critics have been distrustful of the axial line concept and of the representation of road segments as nodes in a network.  The construction of the network too, the process of drawing a network of longest lines of sight, has been seen to be unscientific.  Although I personally feel this to be a weak argument against Space Syntax in general, its acceptance into the wider research community may be hampered by this fact.

By way of a response to this argument, either intentionally or otherwise, there has been a movement towards segment-to-segment angularity (known as Angular Choice) as a predictor of movement.  The method is described by Turner in this paper, but in summary it is a calculation of betweenness on each network segment using the angular deviation between segments as the weight on which to calculate a shortest path.  The higher scoring segments, therefore, are those with a larger number of shortest angular paths passing over them.
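
To make the calculation a little more concrete, here is a minimal sketch rather than Turner’s implementation.  It treats segments as nodes, uses made-up turn costs as weights, and simply counts one least-angle path per pair of segments (a full betweenness calculation would share the count between tied paths).  The `shortest_path_tree` and `angular_choice` names and the toy network are assumptions for illustration.

```python
import heapq


def shortest_path_tree(graph, source):
    """Dijkstra over turn costs; returns each reached segment's predecessor."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, seg = heapq.heappop(queue)
        if d > dist[seg]:
            continue  # stale queue entry
        for nxt, cost in graph[seg].items():
            if d + cost < dist.get(nxt, float("inf")):
                dist[nxt] = d + cost
                prev[nxt] = seg
                heapq.heappush(queue, (d + cost, nxt))
    return prev


def angular_choice(graph):
    """Count, per segment, the least-angle paths that pass through it."""
    choice = {seg: 0 for seg in graph}
    for source in graph:
        prev = shortest_path_tree(graph, source)
        for target in prev:
            # Walk back from the target, crediting every intermediate segment.
            seg = prev[target]
            while seg != source:
                choice[seg] += 1
                seg = prev[seg]
    return choice


# Hypothetical network: a straight street (h1-h2-h3) with a side street s
# joining the middle segment at a right angle (cost 1.0).
graph = {
    "h1": {"h2": 0.0},
    "h2": {"h1": 0.0, "h3": 0.0, "s": 1.0},
    "h3": {"h2": 0.0},
    "s": {"h2": 1.0},
}
choice = angular_choice(graph)
print(choice)
```

In this toy case every path between two other segments has to pass over the middle segment, so it takes all of the score.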

One implication of this approach is that it is a better fit for through-movement, that is, an indicator of the routes we’re likely to use when moving from A to B.  This fits with what has been identified in other literature (particularly spatial cognition) where least angular change is identified as a driver of choice, in preference to pure metric distance.

So with a view to better understanding this relationship between reality and angular choice, I wanted to compare the networks we find in the city and those indicated by this measure.  The first step was to draw out what traffic planners view as the most important roads on the network.  These are the roads identified in the network as ‘Motorways’ and ‘A Roads’ (i.e. the ‘main’ roads), as defined by the Department for Transport.  These were extracted and are shown below:


The top 2% of these measures immediately draw out many of the most used and most well-known roads in London.  The M25 is prominent, as are the North Circular and various corridor roads into the city.  At 5% there is more definition of some of the other key roads, and by 10% we have a network that is quite similar to the map of ‘main’ roads in London.

By way of a statistical breakdown, the top 2% of values of the Choice measure predict 76.3% of all ‘Motorway’ segments and 28.4% of all ‘A Roads’.  By 10%, these values have risen to 87.4% and 75.4% of all segments, respectively.  It is therefore clear that there is a correlation between this network measure and the definitions applied to the network.
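
For those curious about the mechanics of the comparison, a sketch of the kind of calculation involved is below.  The `share_captured` helper, the segment identifiers and the toy values are all invented for illustration, and the numbers bear no relation to the figures quoted above.

```python
def share_captured(choice, road_class, target_class, top_fraction):
    """Share of target_class segments that fall in the top slice of choice values."""
    ranked = sorted(choice, key=choice.get, reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    top = set(ranked[:cutoff])
    members = [seg for seg in choice if road_class[seg] == target_class]
    return sum(seg in top for seg in members) / len(members)


# Hypothetical data: ten segments with choice values 10 down to 1.
choice = {f"s{i}": 10 - i for i in range(10)}
road_class = {seg: "Minor" for seg in choice}
road_class.update({"s0": "Motorway", "s1": "Motorway"})
road_class.update({"s2": "A Road", "s3": "A Road", "s6": "A Road"})

for fraction in (0.2, 0.4):
    print(
        fraction,
        share_captured(choice, road_class, "Motorway", fraction),
        share_captured(choice, road_class, "A Road", fraction),
    )
```

Widening the slice from 20% to 40% picks up two of the three toy ‘A Roads’, the same sort of pattern as in the real breakdown where motorways are captured first.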

I realise that this is a somewhat unrefined piece of work but I’d welcome any comments and am happy to share more on my method and results for those who are interested.

Urban Network Analysis for ArcGIS10

July 23rd, 2011 | Posted by edmanley in GIS

Looks like the MIT City Form Research Group have developed a very useful toolkit for those interested in small-scale urban network analysis. Bit uncertain about how well it might run on larger urban networks, if I get around to testing it I’ll put the results on here.

Check it out here:

From Road Closure to Road Congestion

Much of my work attempts to recreate the macro from the micro.  That is the explanation of large-scale effects through the examination of small-scale behaviours.  I look at how these develop over space and time.

So, more specifically, I look at how road congestion forms in cities and how we, as travellers, all contribute towards it.

As part of my early work on this stuff, I developed a simulation looking at how traveller decisions impact on the flow of traffic in adverse situations.  This consisted of the development of an Agent-based Model (ABM) using the Java-based Repast Simphony framework.  After a fair bit of faffing with Repast (which, I should add, is great although has a considerable learning curve in comparison to some ABM software), I have a model that demonstrates the impact of road closures across a population of driving agents.

The video below shows how the population of individually-cognating agents move from an area of origins (in green) to an area of destinations (in red) through London.  All of the agents move through geographic space, specifically an area around UCL in Euston.  So, this first video shows the normal situation, the next video will show how that changes once we mess things up a bit. (By the way, the video takes a few seconds to get moving, just allowing me a few seconds of in-lecture explanation).

Although the model is relatively simple in traffic simulation terms (with no traffic lights and regulations etc), I think it does show where concentrations of traffic form.  Particularly through the Euston Road/Tottenham Court Road junction.  So, what would happen if we closed this junction?  This…

I think it’s interesting to see the redistribution in traffic around the network.  Knowing that this junction is closed, you get a lot more movement along other roads, suggesting that traffic would be considerably slower in these areas.  Clearly, the exact wheres and whens in this scenario are some way off what reality might show.  Not only do we not have the impact of road regulations, but each individual holds a perfect knowledge of the network, proceeds towards their target along the shortest path and has prior knowledge of the closure ahead.  These are three important aspects I address in other pieces of work that I’ll put up later.  I also realise a bit of flow data would be quite useful here, but considering the pure conjecture of this scenario I’m not sure it’ll add much!
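
The logic of the scenario can be boiled down to a few lines.  This is not the Repast model, just a toy sketch under the same three assumptions: every agent knows the whole network, takes the shortest path and knows about the closure in advance.  The five-junction network and the `shortest_path` and `junction_flows` helpers are made up for illustration.

```python
from collections import Counter, deque


def shortest_path(graph, origin, destination, closed=frozenset()):
    """Breadth-first shortest path that never enters a closed junction."""
    prev = {origin: None}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in prev and nxt not in closed:
                prev[nxt] = node
                queue.append(nxt)
    return None  # destination cut off by the closure


def junction_flows(graph, trips, closed=frozenset()):
    """Route every agent along its shortest path and count visits per junction."""
    flows = Counter()
    for origin, destination in trips:
        path = shortest_path(graph, origin, destination, closed)
        if path:
            flows.update(path)
    return flows


# Hypothetical network: a direct route o-x-d and a longer detour o-y1-y2-d.
graph = {
    "o": ["x", "y1"],
    "x": ["o", "d"],
    "y1": ["o", "y2"],
    "y2": ["y1", "d"],
    "d": ["x", "y2"],
}
trips = [("o", "d")] * 10
normal = junction_flows(graph, trips)
with_closure = junction_flows(graph, trips, closed={"x"})
print(normal["x"], with_closure["x"], with_closure["y1"])
```

With junction x closed, all ten agents shift onto the detour, which is the redistribution effect in miniature.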