free range statistics

I write about applications of data and analytical techniques like statistical modelling and simulation to real-world situations. I show how to access and use data, and provide examples of analytical products and the code that produced them.

Recent posts

Population age changes in the Pacific

27 July 2024

I polish up some visualisations of demographic trends in the Pacific.

Reproducing and adapting the UN Population Projections

05 May 2024

I have a mostly successful go at cohort component population projections to replicate the UN's totals from their published parts, with a view to making small changes to observations or assumptions that build on the official projections.

The 'V20' group of vulnerable countries and the MVI

17 October 2023

I compare the GDP per capita and scores on the UN Multidimensional Vulnerability Index (MVI) of the 68 economies in the 'V20' group with other countries that aren't part of the V20.

The UN's proposed Multidimensional Vulnerability Index

30 September 2023

I make some visualisations of the country scores of the UN's proposed Multidimensional Vulnerability Index.

Finding a circle in a chart

23 September 2023

I have a go at an 'insanely hard' (actually not that hard) problem from someone's recruitment exercise: finding the radius of a circle.

Model life tables

06 August 2023

I make an animation and a basic Shiny app to explore the United Nations' model life tables used for demographic estimates in countries where direct estimation of mortality rates by age isn't possible.

Log transforms, geometric means and estimating population totals

30 July 2023

A model that is 'improved' (in terms of making standard assumptions more plausible) by using a logarithm transform of the response will not necessarily be improved for estimating population totals.
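A minimal sketch of why this can happen, using made-up log-normal data rather than anything from the post itself: the mean on the log scale, naively back-transformed, is the geometric mean, which for skewed positive data sits below the arithmetic mean. Scaling it up to a total therefore undershoots. The variable names and the simulated population are assumptions for illustration only.

```python
import math
import random

random.seed(42)

# Hypothetical skewed population: 1,000 positive, log-normally distributed values.
population = [random.lognormvariate(0, 1) for _ in range(1000)]
n = len(population)

# The quantity we actually want: the population total.
true_total = sum(population)

# Mean on the log scale, naively back-transformed with exp().
# This is the geometric mean, not the arithmetic mean.
mean_log = sum(math.log(y) for y in population) / n
geometric_mean = math.exp(mean_log)

# Scaling the geometric mean up to a total underestimates the true total.
naive_total = n * geometric_mean

print(f"true total:  {true_total:.1f}")
print(f"naive total: {naive_total:.1f}")
```

The gap grows with the spread of the data on the log scale, so a model that fits well on the log scale still needs some correction when it is back-transformed to estimate a total.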

Weighted versus unweighted percentiles

24 June 2023

When working with complex survey data where the weights are related to a continuous variable of interest, using a weighted rather than unweighted percentile rank will lead to different results towards the middle of the distribution, but the two measures will be highly correlated with each other. Also, R reportedly calculates weighted percentile ranks much faster than Stata.
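As a toy illustration (five invented observations, not the survey data from the post), a weighted percentile rank can be taken as the cumulative share of total weight at or below each value. The function name `percentile_ranks` and the weights are assumptions made up for this sketch.

```python
from itertools import accumulate

# Hypothetical observations, already sorted in ascending order.
values = [10, 20, 30, 40, 50]
# Hypothetical survey weights that fall as the value rises.
weights = [5, 4, 3, 2, 1]


def percentile_ranks(w):
    """Cumulative share of total weight at or below each sorted observation."""
    total = sum(w)
    return [c / total for c in accumulate(w)]


# Unweighted ranks are the special case where every weight equals 1.
unweighted = percentile_ranks([1] * len(values))
weighted = percentile_ranks(weights)

for v, u, w in zip(values, unweighted, weighted):
    print(f"value={v}  unweighted={u:.3f}  weighted={w:.3f}")
```

Both sets of ranks put the observations in the same order and agree at the top of the distribution, while the largest gaps in this toy example fall on the second and third observations.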

Simpler drawing of Pacific choropleth maps

17 June 2023

I demonstrate the function I use to make it simpler to draw choropleth maps based on Pacific Island countries' and territories' exclusive economic zones.

Simulating confounders, colliders and mediators

04 June 2023

I do some simulations to reproduce a great figure by Wysocki et al., and show different datasets in which the causal relationship between x and y is observed in the presence of a third variable that is a confounder, a collider or a mediator.