free range statistics

I write about applications of data and analytical techniques like statistical modelling and simulation to real-world situations. I show how to access and use data, and provide examples of analytical products and the code that produced them.

Recent posts

Smoothing charts of Supreme Court Justice nomination results

26 March 2022

Sometimes a Twitter storm of chart-shaming is unfair, mean, and frankly misguided. I reproduce and defend a chart originally produced by FiveThirtyEight to illustrate changes over time in how nominations for US Supreme Court Justices have been voted on in the Senate.

Principal components and penguins

14 June 2021

Beware that the direction of a principal component can vary depending on the sequence of the original data.
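
As a rough illustration of why this can happen, here is a minimal Python sketch. It is not the code from the post, and it is not how any particular PCA library works. The data values and the `first_pc` helper are hypothetical, and starting the power iteration from the first centred row is an assumption chosen to make the order dependence visible. A principal axis `v` and its mirror image `-v` explain exactly the same variance, so which of the two gets reported can hinge on incidental details such as row order.

```python
import math

def first_pc(rows):
    """Leading principal axis of 2-D data, found by power iteration.

    Toy assumption: the iteration starts from the first centred row,
    so the sign of the result depends on which row comes first.
    """
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    centred = [(r[0] - mx, r[1] - my) for r in rows]

    # Entries of the 2x2 sample covariance matrix
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)

    v = centred[0]
    for _ in range(200):
        # Multiply by the covariance matrix, then rescale to unit length
        w = (sxx * v[0] + sxy * v[1], sxy * v[0] + syy * v[1])
        norm = math.hypot(w[0], w[1])
        v = (w[0] / norm, w[1] / norm)
    return v

# Hypothetical data: five roughly collinear points
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]

pc_original = first_pc(rows)
pc_reversed = first_pc(rows[::-1])  # same points, opposite row order

print(pc_original)
print(pc_reversed)
```

The two results describe the same axis but point in opposite directions, so any scores or loadings built on them would have their signs flipped.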

Making a database of security prices and volumes

14 February 2021

I make a SQLite database of daily observations of Australian security prices, volumes and short positions.

Visualising stock prices and volume

05 February 2021

After some experimenting with how to show stock price and volume at the same time, I conclude unsurprisingly that the charts commonly used in finance are pretty much fit for purpose, but alternative presentations have their place too.

Shiny in production for commercial clients

21 December 2020

Shiny can be an effective platform to quickly build data-intensive web applications that otherwise would not be commercially viable. The rationale for using Shiny at the right time is convenience, cost, and statistical and graphics power.

Animated map of World War I UK ship positions

05 December 2020

An animated map of UK Royal Navy ship locations during World War I.

Reproduce analysis of a political attitudes experiment

14 November 2020

I reproduce the analysis of data from a recently published experiment on how exposure to writing about Chinese aid in the Pacific affects Australians' and New Zealanders' attitudes to overseas aid. Along the way I muse about the Table 2 fallacy, and try to avoid it while still using multiple imputation, the bootstrap and covariate adjustment to slightly improve on the original analysis.

Hamlet, data models, interaction graphs and other cool stuff

11 October 2020

I play around with Hamlet's text to set it up for easy data analysis. Hamlet is awesome, and this post is really just an excuse to spend time with it, but it does perhaps start to put together something useful about data models for text.

Facebook survey data for the Covid-19 Symptom Data Challenge

04 October 2020

Two huge surveys of Facebook users seem to provide valuable new information on how the world is responding to Covid-19, but I am very unsure whether they have the potential to enable earlier detection of outbreaks.

Free text in surveys - important issues in the 2017 New Zealand Election Study

26 September 2020

I try out biterm topic modelling on a free text question in the 2017 New Zealand Election Study about the most important issue in the election.