free range statistics

I write about applications of data and analytical techniques like statistical modelling and simulation to real-world situations. I show how to access and use data, and provide examples of analytical products and the code that produced them.

Recent posts

Setting up RStudio Server, Shiny Server and PostgreSQL

07 July 2018

A few months back, I set up a server on Amazon Web Services with a data sciencey toolkit on it. Amongst other things, this means I can collect data around the clock when necessary, as well as host my little RRobot twitter bot, without having a physical machine humming in my living room. There are lots of fiddly things to sort out to make such a setup actually fit for purpose.

Spend on petrol by income

01 July 2018

I explore the relationship between household income and expenditure on gasoline and motor oil in the US Bureau of Labor Statistics' Consumer Expenditure Survey.

Demography simulations

26 June 2018

Simulating a population with changing total fertility rate, life expectancy, infant mortality, and other parameters.

Minor updates for ggseas and Tcomp R packages

15 June 2018

Minor updates available on CRAN for the ggseas (seasonal adjustment on the fly) and Tcomp (tourism forecasting competition data) R packages.

Demystifying life expectancy calculations

31 May 2018

Life expectancy is calculated directly from death rates. And mathematically speaking, changes in infant mortality have a much greater impact on life expectancy than do changes in death rates at any other single year of age.
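
As a rough illustration of that claim, here is a minimal Python sketch of a period life table, not the code from the post itself. It assumes single-year age groups, converts each death rate m to a probability of dying with the common approximation q = m / (1 + m / 2), and closes the table by assuming nobody survives the final age. The death rate schedule is made up purely for the example, and the names `life_expectancy` and `gain_from_cut` are hypothetical.

```python
def life_expectancy(death_rates):
    """Life expectancy at birth from single-year age-specific death rates.

    Assumptions (not from the original post): deaths are spread evenly
    through each year of age, and everyone alive at the last age dies in it.
    """
    alive = 1.0           # share of the cohort alive at the start of each age
    person_years = 0.0    # total years lived per person born
    last_age = len(death_rates) - 1
    for age, m in enumerate(death_rates):
        # Convert the rate to a probability of dying within the year.
        q = 1.0 if age == last_age else m / (1 + 0.5 * m)
        deaths = alive * q
        # Survivors live the full year; those who die live half of it.
        person_years += alive - 0.5 * deaths
        alive -= deaths
    return person_years


# Invented death rates for ages 0 to 100, for illustration only.
rates = [0.01] + [0.001] * 49 + [0.02] * 30 + [0.2] * 21
base = life_expectancy(rates)


def gain_from_cut(age, cut=0.0005):
    """Years of life expectancy gained by lowering one age's death rate."""
    changed = list(rates)
    changed[age] -= cut
    return life_expectancy(changed) - base


gain_infant = gain_from_cut(0)    # cut the death rate in the first year of life
gain_age_40 = gain_from_cut(40)   # same absolute cut at age 40
print(base, gain_infant, gain_age_40)
```

With this schedule the same absolute cut in the death rate buys more life expectancy at age 0 than at age 40, because everyone is exposed to the infant rate and each death avoided there adds the most remaining years.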

Fifteen New Zealand government Shiny web apps

13 May 2018

I had a brief look around New Zealand government agency websites and found 15 high-quality web apps built with the Shiny platform.

Survey books, courses and tools

05 May 2018

Books, online courses and tools on surveys that I've recently come across and liked.

Weighted survey data with Power BI compared to dplyr, SQL or survey

11 April 2018

I show a workaround to make it (relatively) easy to work with weighted survey data in Power BI, and ruminate on how this compares to other approaches to working with weighted data.

Deaths per firearm violence event

01 April 2018

A negative binomial model isn't adequate for modelling the number of people killed per firearm incident in the USA; the real data has more events with exactly one death, and also more extreme values, than the model predicts. But estimating the model was an interesting exercise in fitting a single negative binomial model to two truncated subsets of data.

Truncated Poisson distributions in R and Stan

20 March 2018

Two ways of fitting a model to truncated data.
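
To give a flavour of the problem, here is a minimal Python sketch of maximum likelihood for one simple case. It is not either of the post's two methods, which use R and Stan. It assumes zero truncation (only counts of 1 or more are ever observed), which is an illustrative choice rather than the truncation point used in the post, and the function name `fit_zero_truncated_poisson` and the sample counts are hypothetical.

```python
import math


def fit_zero_truncated_poisson(counts, tol=1e-10):
    """Maximum likelihood estimate of lambda when zero counts are never observed.

    For a zero-truncated Poisson the estimate solves
        lambda / (1 - exp(-lambda)) = sample mean,
    which is found here by bisection.
    """
    mean = sum(counts) / len(counts)
    if mean <= 1:
        raise ValueError("sample mean must exceed 1 for a positive estimate")
    # The left-hand side rises from 1 and always exceeds lambda,
    # so the root lies strictly between 0 and the sample mean.
    lo, hi = 1e-12, mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        truncated_mean = mid / -math.expm1(-mid)  # lambda / (1 - exp(-lambda))
        if truncated_mean < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Invented counts with the zeros missing, for illustration only.
observed = [1, 1, 2, 1, 3, 2, 1, 4, 2, 1]
lam = fit_zero_truncated_poisson(observed)
print(lam, sum(observed) / len(observed))
```

The estimate comes out below the raw sample mean, which is the point of accounting for truncation: ignoring the missing zeros would overstate the underlying rate.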