Linked micromaps are an OK way of presenting data and are probably the right tool in some circumstances, but they're not as cool as I thought they might be.
30 Apr 2017
Continuing my exploration of methods for spatial visualisation of data in R, today I’m looking at linked micromaps. Micromaps are a way of showing statistical graphics for a small subset of regions at a time, with a small map indicating which regions are being looked at in each of the small multiples. Alberto Cairo has some nice discussion in this blog post.
Poverty and education in the USA
It’s easier to show than explain. Here’s one I’ve adapted from the helpfiles of the R micromaps package by Quinn Payton, Tony Olsen, Marc Weber, Michael McManus and Tom Kincaid of the US Environmental Protection Agency and generously open-sourced by that agency. Under the hood, micromaps uses Hadley Wickham’s ggplot2 to create the graphic assets and (I think) Paul Murrell’s grid to lay them out in aligned fashion.
I used this USA example as my test case for understanding the micromaps API. Some of the changes I’ve introduced here include:
Set my own font.
Used a pale blue background with white gridlines, so the gridlines recede into the background, ggplot2-style.
Drew white borders around the grey-filled states that have already been drawn, so the border lines don’t distract the eye from the five states being referred to in each cluster.
Tidied up the code for readability.
Here’s the code that draws this, and sets up the overall session. Note that the nzcensus (by me) and mbiemaps (by New Zealand’s Ministry of Business, Innovation and Employment, my previous employer) packages are only available on GitHub, not CRAN. Code to install them is available in a previous post. They’re needed for the New Zealand data and maps later on.
Most of the polishing code is in the list of lists passed to the panel.att argument. Each list refers to the attributes of one of the four panels (state names, poverty dot charts, education dot charts, maps). I had to do a bit of digging to find out how to control things like grid colour; while doing this it was useful to run one of the three lines of code below to see what attributes are within your control for the different panel types:
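Separately from those inspection calls, the "list of lists" shape itself can be sketched as plain R data. This is only an illustration under assumptions: the attribute names (`header`, `panel.width`, `align`, `text.size`, `graph.bgcolor`, `graph.grid.color`, `xaxis.title`, `inactive.border.color`) are as I recall them from the package's help example and should be checked against your installed version, and the colours are placeholders rather than the ones used for the graphic above.

```r
# Hypothetical panel.att for a four-panel layout: labels, two dot charts, map.
# Attribute names are assumptions recalled from the micromap help example;
# verify them against your installed version before relying on them.
panel_att <- list(
  list(1, header = "States", panel.width = 0.8,
       align = "left", text.size = 0.9),
  list(2, header = "Percent living below\npoverty level",
       graph.bgcolor = "aliceblue", graph.grid.color = "white",
       xaxis.title = "Percent"),
  list(3, header = "Percent adults with\n4+ years of college",
       graph.bgcolor = "aliceblue", graph.grid.color = "white",
       xaxis.title = "Percent"),
  list(4, header = "Light grey means\nhighlighted above",
       inactive.border.color = "white", panel.width = 0.8)
)

# The first, unnamed element of each inner list is the panel it applies to.
panel_numbers <- vapply(panel_att, function(p) p[[1]], numeric(1))
print(panel_numbers)

# The list would then be handed over as lmplot(..., panel.att = panel_att).
```

Building the list as a named object first, rather than inline in the plotting call, makes it easier to inspect and tweak one panel at a time.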
Note that the lmplot function specifies a file to print the graphic to. I don’t like this, as it breaks some commonly accepted R workflows. For example, for this blog I usually create all the graphics for each post in advance in SVG format, which scales up nicely if people zoom in on it and is, in my view, generally the best format for web graphics. That can’t be done when lmplot restricts you to particular device types. The pattern also doesn’t work well with Yixuan Qiu’s showtext R package that I normally use for fonts (it lets me access Google fonts, including the Poppins font I use for most graphics).
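For contrast, here is a minimal sketch of the more usual pattern, where the plotting function only draws and the caller chooses the graphics device. It uses base graphics and made-up values; `draw_dots` is a hypothetical helper for illustration and nothing here comes from the micromap package.

```r
# A sketch of the "caller picks the device" workflow with made-up data.
# draw_dots() is a hypothetical helper: it draws to whatever device is open.
draw_dots <- function(values, labels) {
  dotchart(values, labels = labels, pch = 19,
           xlab = "Percent (made-up values)")
}

out_file <- tempfile(fileext = ".svg")

# svg() needs R built with cairo support, so fall back to pdf() if it is missing.
if (capabilities("cairo")) {
  svg(out_file, width = 7, height = 4)
} else {
  out_file <- sub("\\.svg$", ".pdf", out_file)
  pdf(out_file, width = 7, height = 4)
}

draw_dots(c(12.5, 9.8, 15.1), c("Region A", "Region B", "Region C"))
dev.off()

file.exists(out_file)
```

Because the drawing code never names a file or a format, the same function can be reused with any device, which is what a function that writes its own file takes away.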
New Zealand census example
To be sure I understood how to use the technique, I wanted to apply it to some New Zealand maps and data. I’m used to presenting data at the Territorial Authority level in New Zealand by means of a choropleth map like this one:
… which was drawn with this code, using the TA2013 data frame of 2013 census data from my nzcensus package:
Doing this with a linked micromap instead of a choropleth map lets us look at more variables at once, but I can’t say I’m happy with the result:
My reservations about this graphic:
It feels like there are just too many territorial authorities for this to be a really friendly graphic.
Also, my map of New Zealand is probably too crinkly and individual districts and cities too small to show up well.
It’s awkward that New Zealand is tall rather than wide, giving it a smaller aspect ratio than the USA. This seems to make the New Zealand map less well suited to the technique than the USA map.
It’s hard for the reader to move their eyes back and forth from the district or city name to the dots and to the map.
I couldn’t work out how (if it is possible) to control the projection of the map, and hence New Zealand looks a little bit stretched and rotated.
Here’s the code that drew this, anyway:
Now, there’s a standard way of showing two variables against each other. We lose the spatial element, but for most purposes I think the good old scatter plot is better for this data:
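As a rough sketch of what such a scatter plot involves in ggplot2: the data frame, its column names (`area`, `var_x`, `var_y`) and all the values below are made up for illustration, and are not columns of the TA2013 data.

```r
library(ggplot2)

# Made-up values for five hypothetical areas; the column names are
# placeholders, not columns of the TA2013 data frame.
areas <- data.frame(
  area  = c("Area A", "Area B", "Area C", "Area D", "Area E"),
  var_x = c(12.1, 18.4, 9.7, 15.0, 21.3),
  var_y = c(30.2, 22.5, 35.8, 27.1, 19.4)
)

# One point per area, with the area name printed just above the point.
p <- ggplot(areas, aes(x = var_x, y = var_y, label = area)) +
  geom_point() +
  geom_text(vjust = -0.8, size = 3) +
  labs(x = "First variable (%)", y = "Second variable (%)")

print(p)
```

With many areas the text labels will start to overlap, so in practice you would probably label only a few notable points.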
Maybe better with a smaller number of areas?
New Zealand has 66 districts and cities (not counting Chatham Islands), but only 16 Regional Councils. Perhaps the method works better with a smaller number of areas to show:
… and I think that probably is OK. But really, all we are showing here is 32 numbers. It’s an expensive graphic for something that could be shown almost as meaningfully in a table. All up, my reaction to linked micromaps is one of caution. Like any visualisation tool, I think they’ll be good in some circumstances, but in others they just won’t communicate easily.
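To make the table alternative concrete, here is a small base R sketch with 16 placeholder regions and two invented variables. The region names, column names and values are all made up; only the shape (16 rows by 2 numeric columns, so 32 numbers) matches the graphic.

```r
# Sixteen placeholder regions with two made-up variables each.
set.seed(123)
regions <- data.frame(
  region = paste("Region", LETTERS[1:16]),
  var_a  = round(runif(16, 5, 25), 1),
  var_b  = round(runif(16, 10, 40), 1)
)

# Sort by the first variable, highest first, so the table reads in rank order.
regions_sorted <- regions[order(regions$var_a, decreasing = TRUE), ]
rownames(regions_sorted) <- NULL

print(regions_sorted)

# Number of numeric cells the table carries: 2 columns x 16 rows.
n_numbers <- sum(vapply(regions_sorted, is.numeric, logical(1))) *
  nrow(regions_sorted)
n_numbers
```

What the table gives up is the map: a reader has to already know where each region is.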
Code for the regional council micromap:
Bonus - area unit cartograms
Finally, an aside. My last two blog posts have been about what you might call population-weighted carto-choro-pletho-grams… I’ve been gradually enhancing this Shiny web app which lets people play around with visualisations of 2013 census data. Latest addition is data at the detailed area unit level. Here’s a screenshot showing which area units have higher proportions of people with a background from the Pacific islands:
My day job is Chief Data Scientist at Nous Group, an international management consultancy with over 400 people working across Australia, the UK and Canada. Contact me if you are interested in working with us on a grand challenge or broad agenda.