free range statistics

I write about applications of data and analytical techniques like statistical modelling and simulation to real-world situations. I show how to access and use data, and provide examples of analytical products and the code that produced them.

Recent posts


Understanding the limitations of group-level inequality data

07 October 2018

Cross-sectional country-level data will show a relationship between income inequality and life expectancy even if inequality itself has no direct impact on life expectancy, so long as the marginal impact of individual income on individual lifespan changes with income (as of course it does).
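To make the mechanism concrete, here is a minimal sketch rather than the code from the post. It assumes a made-up concave relationship between individual income and individual life expectancy (the function `life_expectancy` and its coefficients are purely illustrative) and compares two hypothetical countries with the same mean income but different spread. Inequality has no direct effect on anyone here, yet the more unequal country ends up with lower average life expectancy.

```python
import math


def life_expectancy(income):
    # Hypothetical concave relationship: each extra dollar adds less
    # life expectancy than the one before. Coefficients are invented.
    return 40.0 + 8.0 * math.log(income)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Two illustrative countries with identical mean income (50).
equal_incomes = [50, 50, 50, 50]
unequal_incomes = [10, 30, 70, 90]

equal_mean_le = mean(life_expectancy(x) for x in equal_incomes)
unequal_mean_le = mean(life_expectancy(x) for x in unequal_incomes)

print(f"Mean income: {mean(equal_incomes)} vs {mean(unequal_incomes)}")
print(f"Mean life expectancy, equal country:   {equal_mean_le:.2f}")
print(f"Mean life expectancy, unequal country: {unequal_mean_le:.2f}")
```

The gap comes entirely from the curvature of the individual-level relationship, which is why group-level data alone cannot separate it from a direct effect of inequality.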


Sri Lanka visitor arrivals

26 September 2018

Sri Lanka has a rapidly growing tourism industry, two international tourism seasons, and seasonality patterns in arrivals that vary according to country of origin.


Rents in Melbourne

31 August 2018

Rents in Melbourne have on average grown fastest in suburbs that were the cheapest in 2000; at least for two and three bedroom flats and for two bedroom houses. Also, scatterplots are awesome.


Estimating relative risk in a simulated complex survey

24 August 2018

Simulating complex survey data in order to fit slightly mis-specified relative risk models, we find that the coverage of confidence intervals is pretty much as advertised if we use appropriate methods that adjust for the complex survey design, but falls short if the data are treated naively as coming from a simple random sample.


Relative risk ratios and odds ratios

17 August 2018

Explanation and demonstration with simulated data of the difference between relative risk ratios and odds ratios, and how to extract them from a generalized linear model.
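As a quick illustration of the difference (a sketch with invented counts, not the simulation from the post), both quantities can be computed directly from a two-by-two table. The counts below are hypothetical: 30 of 100 exposed people and 10 of 100 unexposed people have the outcome.

```python
# Hypothetical counts, invented for illustration only.
exposed_cases, exposed_total = 30, 100
unexposed_cases, unexposed_total = 10, 100

# Risk is the proportion with the outcome in each group.
risk_exposed = exposed_cases / exposed_total
risk_unexposed = unexposed_cases / unexposed_total

# Odds are cases divided by non-cases in each group.
odds_exposed = exposed_cases / (exposed_total - exposed_cases)
odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)

relative_risk = risk_exposed / risk_unexposed
odds_ratio = odds_exposed / odds_unexposed

print(f"Relative risk: {relative_risk:.3f}")
print(f"Odds ratio:    {odds_ratio:.3f}")
```

With these counts the odds ratio is further from 1 than the relative risk, which is the usual pattern when the outcome is not rare. In a generalized linear model with a binomial response, the exponentiated coefficient under a logit link is an odds ratio, while under a log link it is a relative risk.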


Time series intervention analysis with fuel prices

14 August 2018

I look into whether the regional fuel tax in Auckland has led to changes in fuel prices in other regions of New Zealand.


Leading indicators of economic growth

10 August 2018

A demo of a favourite combination of multiple imputation, bootstrap and elastic net regularization. I look at which indicators, with reliable data available, are good leading indicators of New Zealand's economic growth. The results turn out to be last quarter's economic growth; food prices; visitor arrivals; car registrations; business confidence; and electronic card transactions.


Business confidence and economic growth

01 August 2018

I have a brief look at the relationship between reported business confidence in New Zealand and what actually happens down the track with economic growth. Confidence can help (a bit) explain future growth; but current and past growth isn't helpful in explaining confidence.


Setting up RStudio Server, Shiny Server and PostgreSQL

07 July 2018

A few months back, I set up a server on Amazon Web Services with a data sciencey toolkit on it. Amongst other things, this means I can collect data around the clock when necessary, as well as host my little RRobot Twitter bot, without having a physical machine humming in my living room. There are lots of fiddly things to sort out to make such a setup actually fit for purpose.


Spend on petrol by income

01 July 2018

I explore the relationship between household income and expenditure on gasoline and motor oil in the US Bureau of Labor Statistics' Consumer Expenditure Survey.