Thursday, 11 February 2016

Bubble grid charts: an alternative to stacked bar/column charts with lots of data series?





Stacked column charts with a large number of categories and data series can quickly become unclear. Deciphering this stacked column chart from the Home Office's "Crime Outcomes in England and Wales 2014/15", for example, takes some work - in part because of some unfortunate design choices.
(Click on the charts for a closer view) 

Stacked bar chart with lots of data series.

But even if the design was improved, by flipping the chart through 90° and applying some other formatting tricks, it would still be difficult for readers to see the main points at a glance.

So what to do?






First I tried a small multiples chart.

I think this is better than the stacked chart that was published: readers can quickly identify the predominant outcomes for each crime category (so, for example, 70.0% of recorded thefts had an outcome of 'investigation complete - no suspect identified') without needing to consult a legend. It is also easy to see which outcomes are rare across all crimes: only a very small proportion of recorded crimes result in an outcome of 'Taken into consideration' (where a defendant admits additional crimes in the sentencing phase of a trial and asks for these to be taken into consideration in sentencing). These 'rare' outcomes tend to disappear in a stacked bar chart, because readers consult the legend to find the outcomes that are present, not the ones that are absent.




On the other hand, this still feels a bit clunky: not all of the crime types fit in a single row, so the chart has to be split into two rows of data (with repeated category axis labels). I could get rid of the data labels for a cleaner look - but in practice I've found that readers often get irritated if data labels aren't provided for these kinds of charts.

So I had a bit more of a play around in Excel and came up with the dot chart below (again click on it to get a larger version).


Figure 2.1 Outcomes for offences recorded in 2014/15, by type of offence


Note: Offence outcomes are broken down by percentage within each offence type. For example, just over half (50.8%) of recorded robberies had an outcome of ‘Investigation complete – no suspect identified.’ Data labels are shown for values greater than 4 per cent.
Data source: Crime Outcomes in England and Wales 2014/15: Data tables, Table 2.3; https://www.gov.uk/government/statistics/crime-outcomes-in-england-and-wales-2014-to-2015


I think the dot chart probably does the best job of making the big messages within the chart stand out, especially for people who aren't confident with numbers or will look only briefly at the chart. It's hard to miss that recorded thefts, and recorded criminal damage and arson offences, generally result in no suspect being identified. It's also pretty clear that sexual offences have a higher proportion of 'unassigned outcomes', and that only a very low proportion of recorded crimes across all crime types are 'taken into consideration' at sentencing.

On the other hand, people are famously worse at comparing areas than lengths. In the small multiples chart it's easy to see that recorded crimes against the person are slightly more likely to run into evidential problems (because the victim does not support action) than to result in the perpetrator being charged or summonsed. Without comparing data labels, it would be very difficult to see this in the dot chart.
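Because readers judge the dots by area, the sizing rule matters: if a dot's area is to be proportional to its percentage, its radius has to scale with the square root of the value, not with the value itself. The Python sketch below illustrates that rule, together with the labelling rule from the chart note (labels only for values above 4 per cent). It is only an illustration, not how the Excel chart above was built. The function names, the grid layout and the 0.5 value are assumptions made up for the example; only the 70.0 and 50.8 figures are quoted from the post.

```python
import math

# From the chart note: data labels are shown for values greater than 4 per cent.
LABEL_THRESHOLD = 4.0


def bubble_radius(value, max_value=100.0, max_radius=1.0):
    """Return a radius such that the dot's AREA is proportional to value."""
    if value < 0:
        raise ValueError("percentages cannot be negative")
    # Area grows with radius squared, so take the square root of the share.
    return max_radius * math.sqrt(value / max_value)


def bubble_grid(table):
    """Turn {offence: {outcome: percent}} into a list of dot specifications.

    Offences become columns (x), outcomes become rows (y).
    Hypothetical helper for illustration only.
    """
    offences = list(table)
    outcomes = []
    for row in table.values():
        for outcome in row:
            if outcome not in outcomes:
                outcomes.append(outcome)

    dots = []
    for x, offence in enumerate(offences):
        for y, outcome in enumerate(outcomes):
            value = table[offence].get(outcome)
            if value is None:
                continue  # no figure supplied for this cell, so draw nothing
            dots.append({
                "x": x,
                "y": y,
                "offence": offence,
                "outcome": outcome,
                "radius": bubble_radius(value),
                "label": f"{value:.1f}%" if value > LABEL_THRESHOLD else "",
            })
    return dots


example = {
    "Theft": {
        "No suspect identified": 70.0,    # figure quoted in the post
        "Taken into consideration": 0.5,  # made-up placeholder, NOT from the data tables
    },
    "Robbery": {
        "No suspect identified": 50.8,    # figure quoted in the post
    },
}

for dot in bubble_grid(example):
    print(dot["offence"], dot["outcome"], round(dot["radius"], 3), dot["label"])
```

Scaling the radius directly by the value instead would make a 70% dot look far more than twice the size of a 35% dot, exaggerating the big outcomes even further.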

The stacked chart makes it clear that the breakdowns of crime outcomes are proportions of a whole: this is less obvious in the small multiples chart or the dot chart (though I've used colour in the dot chart to signal that the proportions relate to columns). But I don't think the stacked chart does anything else well in this case.

Which of these three approaches do you prefer? Does the dot chart have legs? Are small multiples the way forward? Are stacks your thing? Is it a case of horses for courses?

Update 14/2/2016

The things I've been calling dot charts are called bubble grid charts. Next up - how to make them in Excel.

