free range statistics

I write about applications of data and analytical techniques like statistical modelling and simulation to real-world situations. I show how to access and use data, and provide examples of analytical products and the code that produced them.

Recent posts

Free text in surveys - important issues in the 2017 New Zealand Election Study

26 September 2020

I try out biterm topic modelling on a free text question in the 2017 New Zealand Election Study about the most important issue in the election.

Trying out Processing

13 September 2020

I have a go at using the Processing programming language, which was developed to help graphic designers but has a broad bunch of potential applications. While Processing proper sits on top of Java, there's a handy implementation in JavaScript.

Mixture distributions and reporting times for Covid-19 deaths in Florida

06 September 2020

I look at some unusual data where the median was higher than the mean, and show how to model it in Stan as a mixture of two negative binomial distributions.

Time series cross validation of effective reproduction number nowcasts

29 August 2020

I confront past nowcasts of the effective reproduction number for Covid-19 in Victoria with the best hindsight estimate, and confirm that the nowcasts lag behind changes in the 7-14 days leading up to the time they are made.

Lines of best fit

23 August 2020

I have a go at synthesising data to re-create a controversial and much-criticised chart that used ordinary least squares to fit a line relating university subjects' costs per student to the number of students in each subject.

Essentially random isn't the same as actually random

09 August 2020

An observational study claiming to be an RCT might have something to say, but it involves far too many discretionary researcher choices for its findings to be believable. Still, I use it as a chance to play with statistical inference after estimating a regression via lasso.

Visualisation options to show growth in occupations in the Australian health industry

02 August 2020

Exploration of change in occupations in the Australian health industry, and economy more broadly, from 1986 to the present.

Estimating Covid-19 reproduction number with delays and right-truncation

18 July 2020

There is a fast-growing body of knowledge and tools to help estimate the effective reproduction number of an epidemic in real time; I have a go at applying the latest EpiNow2 R package to data for Covid-19 cases in Victoria, Australia.

Fixing scientific publishing and peer review

13 June 2020

Science isn't broken, but journals are. A joint solution is emerging for disparate problems of access, quality control and replicability in scientific publishing.

Forecasts for the 2020 New Zealand elections using R and Stan

06 June 2020

My forecasts for the 2020 New Zealand general election are out, and predict a comfortable win for Jacinda Ardern's Labour Party either alone or in coalition.