Statistical graphics for communicating

Peter Ellis

June 2016

Today’s content

  • Different purposes of graphics
  • What makes for excellence in graphics
  • Improving graphics

Purposes of graphics

Data science workflow

Grolemund and Wickham, http://r4ds.had.co.nz/introduction.html

Different purposes

…exploratory…

…analysis and diagnosis…

…presentation…

Comprehend this:

data(anscombe)
anscombe[ , c(1,5,2,6,3,7,4,8)]
##    x1    y1 x2   y2 x3    y3 x4    y4
## 1  10  8.04 10 9.14 10  7.46  8  6.58
## 2   8  6.95  8 8.14  8  6.77  8  5.76
## 3  13  7.58 13 8.74 13 12.74  8  7.71
## 4   9  8.81  9 8.77  9  7.11  8  8.84
## 5  11  8.33 11 9.26 11  7.81  8  8.47
## 6  14  9.96 14 8.10 14  8.84  8  7.04
## 7   6  7.24  6 6.13  6  6.08  8  5.25
## 8   4  4.26  4 3.10  4  5.39 19 12.50
## 9  12 10.84 12 9.13 12  8.15  8  5.56
## 10  7  4.82  7 7.26  7  6.42  8  7.91
## 11  5  5.68  5 4.74  5  5.73  8  6.89

compared to:
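A minimal sketch of the comparison, assuming base R graphics only. It first shows that the four x-y pairs share almost the same means and correlation, then draws each pair with its least-squares line. The axis limits and colours are illustrative choices, not taken from the original slide.

```r
data(anscombe)

# Means and correlation for each of the four x-y pairs
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y), cor_xy = cor(x, y))
})
colnames(stats) <- paste0("set", 1:4)
print(round(stats, 2))

# One scatter plot per pair, on shared axes, with the fitted line
old_par <- par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y, xlim = c(3, 20), ylim = c(2, 14), pch = 19,
       xlab = paste0("x", i), ylab = paste0("y", i))
  abline(lm(y ~ x), col = "grey40")
}
par(old_par)
```

The summary numbers hide what the four panels make obvious: a linear trend, a curve, an outlier, and a single high-leverage point.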

Put the data in its place

Use during analysis

Present results

Compare to

Dependent variable: MedianIncome
(standard errors in parentheses)

Variable                       Estimate     Std. error
MeanBedrooms                    0.012       (0.011)
PropPrivateDwellings            0.650***    (0.111)
PropSeparateHouse              -0.148***    (0.025)
PropMultiPersonHH              -0.082       (0.105)
PropNotOwnedHH                  0.170***    (0.033)
MedianRentHH                    0.0002***   (0.00003)
PropLandlordPublic             -0.014       (0.018)
PropNoMotorVehicle             -0.274***    (0.067)
PropOld                         0.490***    (0.073)
PropAreChildren                 0.221***    (0.075)
PropSameResidence5YearsAgo     -0.053*      (0.029)
PropOverseas5YearsAgo          -0.501***    (0.087)
PropMaori                      -0.074***    (0.028)
PropPacific                    -0.249***    (0.041)
PropAsian                      -0.296***    (0.031)
PropNoReligion                 -0.137***    (0.035)
PropSmoker                      0.064       (0.068)
PropSeparated                  -0.318***    (0.075)
PropDoctorate                   1.914***    (0.215)
PropPTStudent                   0.334*      (0.184)
PropUnemploymentBenefit        -0.211       (0.172)
PropStudentAllowance           -1.959***    (0.189)
PropFullTimeEmployed            1.674***    (0.056)
PropPartTimeEmployed            0.095       (0.107)
PropUnemployed                  0.058       (0.208)
PropEmployer                    0.876***    (0.073)
PropSelfEmployedNoEmployees    -0.333***    (0.053)
PropTrades                     -0.492***    (0.079)
PropLabourers                  -0.460***    (0.057)
PropAgForFish                   0.029       (0.034)
PropPubAdmin                    0.120**     (0.047)
PropFinServices                 0.996***    (0.158)
PropProfServices                1.235***    (0.088)
PropWorked40_49hours            0.277***    (0.060)
PropPublicTransport            -0.005       (0.057)
PropWalkJogBike                -0.076*      (0.043)
PropNoUnpaidActivities         -0.842***    (0.086)
Constant                        8.878***    (0.139)

Observations                    1,785
Note: *p<0.1; **p<0.05; ***p<0.01
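A rough sketch of the graphical alternative to a coefficient table: point estimates with 95% confidence intervals, sorted by size. The census data behind the table is not included here, so the built-in mtcars data and the chosen predictors are stand-ins only. Predictors are standardised so the estimates share a scale.

```r
# Stand-in data: standardise four mtcars predictors (an assumption, not the census model)
d <- as.data.frame(scale(mtcars[, c("wt", "hp", "qsec", "drat")]))
d$mpg <- mtcars$mpg
fit <- lm(mpg ~ wt + hp + qsec + drat, data = d)

est <- coef(fit)[-1]                     # drop the intercept
ci  <- confint(fit)[-1, , drop = FALSE]  # 95% intervals by default
ord <- order(est)                        # sort so the pattern is easy to read
pos <- seq_along(ord)

old_par <- par(mar = c(4, 6, 1, 1))
plot(est[ord], pos, xlim = range(ci), pch = 19, yaxt = "n", bty = "n",
     xlab = "Estimate with 95% confidence interval", ylab = "")
segments(ci[ord, 1], pos, ci[ord, 2], pos)   # interval whiskers
abline(v = 0, lty = 2, col = "grey50")       # reference line at zero
axis(2, at = pos, labels = names(est)[ord], las = 1, tick = FALSE)
par(old_par)
```

Intervals that cross the dashed line play the role of the unstarred rows in the table, and relative effect sizes can be read off by position.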

Illustrate concepts

Graphic excellence

Principles

  • well-designed presentation of interesting data - substance, statistics, and design
  • complex ideas communicated with clarity, precision, and efficiency
  • greatest number of ideas in the shortest time with the least ink in the smallest space
  • nearly always multivariate
  • telling the truth about the data

Adapted from Tufte

Some specifics

  • Comparative
  • Multivariate
  • High data density
  • Reveal interactions and comparisons
  • Nearly all the ink is data ink

Examples

Change this…

…to this:

This is good

But this is better

This is good

But this is better

More detailed examples

Perception of quantity

From best to worst

  1. Position
  2. Length
  3. Area
  4. Volume
  5. Angle and slope
  6. Colour and density
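A small sketch of why the encoding matters, assuming base R and the built-in WorldPhones data (chosen here only for illustration). The same seven values are drawn once as a pie, read by angle and area, and once as a dot chart, read by position on a common scale.

```r
# Telephones by region in 1951 (thousands), sorted so the order carries information
phones <- sort(WorldPhones["1951", ])

old_par <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))
pie(phones, main = "Angle and area")
dotchart(phones, pch = 19, main = "Position",
         xlab = "Telephones (thousands)")
par(old_par)
```

The small regions are hard to rank in the pie but easy to rank in the dot chart.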

Typical stacked bars…

Orient for easy reading

Sequential colours

Diverging scale
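A hedged sketch of the two palette types on horizontal stacked bars, assuming base R 3.6 or later for hcl.colors(). The built-in VADeaths data is a stand-in. Its ordered age groups suit a sequential palette, and the diverging palette appears only to show the contrast, since it really fits scales with a meaningful midpoint such as agree/disagree.

```r
data(VADeaths)  # death rates per 1000 by age group (rows) and population group (columns)

n <- nrow(VADeaths)
seq_cols <- hcl.colors(n, palette = "Blues 3", rev = TRUE)  # one hue; rev = TRUE flips the default order
div_cols <- hcl.colors(n, palette = "Blue-Red")             # two hues meeting at a neutral middle

old_par <- par(mfrow = c(1, 2), mar = c(4, 7, 2, 1))
# horiz = TRUE and las = 1 keep the category labels horizontal and readable
barplot(VADeaths, horiz = TRUE, las = 1, col = seq_cols, border = NA,
        main = "Sequential", xlab = "Deaths per 1000")
barplot(VADeaths, horiz = TRUE, las = 1, col = div_cols, border = NA,
        main = "Diverging", xlab = "Deaths per 1000")
par(old_par)
```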

Use position

Much better than

Cluttered

Minimal axis guides

Fade axis title

Remove borders

Remove boxes

Guidelines to back

Background to back

Consistent doc theme
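The decluttering steps above could be collected into one reusable theme. This is a sketch only, assuming the ggplot2 package (the slides do not say which plotting system was used). The name clean_theme, the grey shades, and the mtcars example are all illustrative.

```r
library(ggplot2)

clean_theme <- theme_minimal(base_size = 12) +
  theme(
    panel.grid.minor = element_blank(),                  # minimal axis guides
    panel.grid.major = element_line(colour = "grey90"),  # guidelines recede into the background
    axis.title       = element_text(colour = "grey50"),  # fade axis titles
    panel.border     = element_blank(),                  # remove borders
    legend.key       = element_blank(),                  # remove boxes around legend keys
    strip.background = element_blank()                   # remove boxes around facet labels
  )

# Set once at the top of a document so every later plot shares the same look
theme_set(clean_theme)

p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
print(p)
```

Calling theme_set() once, rather than adding theme elements plot by plot, is what keeps a whole document consistent.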