Tuesday 23 February 2016

Making a data explorer for Life Expectancy at Older Ages: Part 1


Public Health England (PHE) recently released 'Life Expectancy: recent trends in older ages', which found that "life expectancy at older ages in England has risen to its highest ever level." They also cautioned that this overall gain masks substantial variation at Local Authority (LA) level. Danny Dorling (the well-known geographer, in this case speaking in his capacity as a member of PHE's mortality surveillance group) stated that:

"Although national average life expectancy continues to rise, in many parts of England improvements have stalled in recent years. There is an urgent need to determine why this is happening. Beneath the headline figures of this report there is evidence of worsening health for many older people in some parts of the country."

My initial thought was that there might be a relationship between changes in life expectancy and LA-level ranks on the Indices of Multiple Deprivation 2015. But there wasn't - at least at LA level.

This seems interesting, so I'm going to have a go at building a data explorer to make it easier to view LA-level life expectancy data that PHE have published and link it up with Indices of Deprivation data, including some of the sub-domains, such as Income Deprivation Affecting Older People Index (IDAOPI).

The first task will be reformatting and reorganizing the life expectancy data tables that have been released along with the publication (available here). These are well-designed reference tables for end-users - but I'm going to be using them for further analysis. To that end, I will strip out a lot of the formatting, import them into SQL and restructure them so that they're easier to work with.
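
As a rough sketch of what that restructuring might look like, the snippet below turns a wide table (one row per local authority, one column per three-year period) into a long table with one row per authority and period. Everything in it is an assumption made for illustration: the area codes, the column names, the `life_expectancy` table and the values are placeholders rather than PHE figures, and SQLite stands in for whichever SQL database ends up being used.

```python
import sqlite3

# Hypothetical wide-format rows, as they might look once the formatting has
# been stripped out. Codes, names and values are placeholders, not PHE data.
wide_rows = [
    {"la_code": "X01", "la_name": "Example LA", "2000-02": 15.0, "2001-03": 15.2},
    {"la_code": "X02", "la_name": "Another LA", "2000-02": 16.1, "2001-03": 16.3},
]
ID_COLUMNS = ("la_code", "la_name")


def to_long(rows):
    """Return one (la_code, la_name, period, years) tuple per area and period."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue  # identifier columns are repeated on every output row
            long_rows.append((row["la_code"], row["la_name"], column, value))
    return long_rows


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE life_expectancy ("
    "la_code TEXT, la_name TEXT, period TEXT, years REAL)"
)
conn.executemany(
    "INSERT INTO life_expectancy VALUES (?, ?, ?, ?)", to_long(wide_rows)
)
conn.commit()

# With the data in long form, pulling out one area's time series is one query.
result = conn.execute(
    "SELECT period, years FROM life_expectancy "
    "WHERE la_code = ? ORDER BY period",
    ("X01",),
).fetchall()
print(result)
```

The long layout is what makes it straightforward to add further columns later (sex, age band, confidence limits) without changing the shape of the table.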

Once that's done, I'll do a bit of mapping and chart making, either with SSRS or the Google Visualisation API. I'll be posting the odd progress report, as well as posts about how to avoid the pot-holes I'll inevitably fall into along the way. There may be the odd interim product as well.

In the meantime, here are some sample charts showing change over time in male life expectancy at 65 years and over. I've built these in Excel to get a sense of the look and feel that I want - but obviously the final versions will be built using something else. Eventually they will need to accommodate multiple data series (so that users can compare results for different age bands, or by gender, for example).

Given that this is local authority level data, I think it's important to show the 95% confidence limits as well as the calculated life expectancy. The first approach I took was simply to chart the upper and lower confidence limits as fainter, lighter lines around the life expectancy value itself. I think this looks fairly clean (and would continue to do so with more data series) - but I'm not sure whether or not the confidence limits are emphasised sufficiently.

Figure 1: Life expectancy for males aged 65 and over in Middlesbrough, 2000/02 to 2011/13

Note: Life expectancy is calculated using three-year rolling averages. Life expectancy data for 2013, for example, is a rolling average of data from 2011-2013. 
Sources: Office for National Statistics http://www.ons.gov.uk/ons/rel/subnational-health4/life-expectancy-at-birth-and-at-age-65-by-local-areas-in-england-and-wales/2011-13/stb-life-expectancy-at-birth-2011-13.html  Crown Copyright 2015

In this second approach, I've charted the range between the upper and lower 95% confidence limits as an opaque block, adding a thin 'glow' effect to indicate that a smattering of possible values would fall outside the limits (here I'm indebted to Michael Correll and Michael Gleicher and their paper "Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error"). A darker bar shows life expectancy. (Please note that the 'glow' effect in my example is currently indicative only: it is not representative of the fall-off outside the 95% limits.)

This chart doesn't smooth the data (it's actually a stacked column chart in disguise). I think this may make it easier to assess which changes over time are striking.
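
For anyone curious about the 'in disguise' part, one common way to get a floating band out of a stacked column chart is to split each period's values into three stacked segments and make the bottom one invisible. The sketch below shows that arithmetic only. It is an assumption about how such a chart can be built rather than a record of the exact Excel set-up, the function name `stacked_segments` is made up, and the numbers are placeholders rather than real life expectancy figures.

```python
def stacked_segments(lower, estimate, upper):
    """Split one period's values into stacked column heights.

    Returns (base, below, above):
      base  - column from zero up to the lower limit (formatted with no fill)
      below - visible band from the lower limit up to the estimate
      above - visible band from the estimate up to the upper limit
    """
    if not lower <= estimate <= upper:
        raise ValueError("expected lower <= estimate <= upper")
    return (lower, estimate - lower, upper - estimate)


# Placeholder (lower limit, estimate, upper limit) values for two periods.
series = [(16.0, 16.5, 17.0), (16.25, 16.75, 17.25)]
segments = [stacked_segments(*period) for period in series]

for (lower, estimate, upper), (base, below, above) in zip(series, segments):
    # The three heights always stack back up to the upper limit.
    print(base, below, above, base + below + above == upper)
```

The marker for the estimate itself sits at the boundary between the `below` and `above` segments, so it could be drawn as a separate thin series or as an overlaid line.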

Figure 2: Life expectancy for males aged 65 and over in Middlesbrough, 2000/02 to 2011/13

Note: Life expectancy is calculated using three-year rolling averages. Life expectancy data for 2013, for example, is a rolling average of data from 2011-2013.

Which do you prefer? Does the range within the confidence limits feel more prominent in the second chart? Or is it much of a muchness?

