# AFL teams Elo ratings and footy-tipping

## At a glance:

I use Elo ratings from 12 months or from 120 years of AFL results to predict the results in the next round. Predictions based on just the past 12 months do better than those using the full history.

23 Mar 2019

So now that I live in Melbourne, to blend in with the locals I need to at least vaguely follow the AFL (Australian Football League). For instance, my workplace, like many others, has an AFL footy-tipping competition. I was initially going to choose my tips based on the wisdom of the crowds (i.e. pick the favourite), but decided this was a good occasion to try something a bit more scientific.

As is the case for most organised sports these days, there is rich data available on AFL results and other metrics. After wasting 20 minutes trying to locate and scrape various websites, I remembered "someone must have already done this", and sure enough found James Day's highly effective fitzRoy R package.

## Elo ratings over the long term

An easy way to generate predictions of the likely winner in a head to head game is by comparing Elo ratings based on past performance. I’ve written a bit about Elo ratings in the context of backgammon, and my frs R package has a couple of functions to make it easy to help generate and analyse them (for example, turning two ratings into a probability of winning).
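The core Elo calculations are simple enough to sketch directly in R. The function names below are illustrative only (not the actual frs API), and the K-factor of 32 is just a common default:

```R
# Expected probability that side A beats side B, given their Elo ratings
elo_prob <- function(rating_a, rating_b) {
  1 / (1 + 10 ^ ((rating_b - rating_a) / 400))
}

# Standard Elo update for side A after one match
# (result = 1 for a win, 0.5 for a draw, 0 for a loss)
elo_update <- function(rating_a, rating_b, result, k = 32) {
  rating_a + k * (result - elo_prob(rating_a, rating_b))
}

elo_prob(1600, 1500)  # about 0.64
```

So a team rated 100 points above its opponent is expected to win roughly 64% of the time, and its rating moves up or down depending on how the actual result compares with that expectation.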

Here’s the Elo ratings of present and past AFL (and its predecessor, the Victorian Football League) teams, treating them continuously from 1897 to the present:

Amongst other things, this provides at least an answer to the vexed question of which AFL team is the best overall - apparently Geelong (prepares to duck). Interesting to see that sustained period of Collingwood dominance in the 1930s too. Also surprising (to me, showing how little attention I've paid) is Sydney sitting on the second highest Elo rating today based on the full history of the game. When I last paid attention to the footy in the late 1990s, the Sydney Swans were literally the punchline of an evening comedy show on TV, which adopted them out of the pure humour of supporting such a perpetually losing team. Obviously they've recovered, no doubt in part due to the support of their fans through the tough patch.

I can’t remember the name of the TV show.

Anyway, before we get on to thinking about predictions, here’s the R code to bring in those results and draw the chart of Elo ratings.
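A hedged sketch of the approach (not my actual code for the post) looks like the below. It uses fitzRoy's `get_match_results()` to fetch the historical results; the column names (`Home.Team`, `Away.Team`, `Home.Points`, `Away.Points`, `Season`) reflect the fitzRoy version current at the time of writing and may differ in later releases, and the K-factor of 32 is an assumed default:

```R
library(fitzRoy)

# Full history of VFL/AFL match results
results <- get_match_results()

# Run standard Elo updates over a set of matches, starting every team at 1500
run_elo <- function(matches, k = 32) {
  teams <- union(matches$Home.Team, matches$Away.Team)
  elo <- setNames(rep(1500, length(teams)), teams)
  for (i in seq_len(nrow(matches))) {
    h <- matches$Home.Team[i]
    a <- matches$Away.Team[i]
    # 1 for a home win, 0.5 for a draw, 0 for a home loss
    outcome <- as.numeric(matches$Home.Points[i] > matches$Away.Points[i]) +
               0.5 * (matches$Home.Points[i] == matches$Away.Points[i])
    exp_h <- 1 / (1 + 10 ^ ((elo[a] - elo[h]) / 400))
    elo[h] <- elo[h] + k * (outcome - exp_h)
    elo[a] <- elo[a] + k * ((1 - outcome) - (1 - exp_h))
  }
  elo
}

# Ratings over the full history, or with everyone reset to 1500 for one season
elo_full <- run_elo(results)
elo_2018 <- run_elo(subset(results, Season == 2018))
```

The same function covers both variants discussed in this post: pass in the full history for the long-term ratings, or filter to a window of seasons to reset everyone to 1500 at the start of that window.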

Post continues after code extract

## Ratings change depending on when you start from

What happens if we base our ratings only on recent performance? As an example, the chart below shows team ratings for the 2018 season if all teams were reset to 1500 at the beginning of the year:

The rankings differ quite noticeably from those based on the full history.

2018's top-ranked team, Richmond, was only seventh when the full history was used. They didn't make the grand final - such is the luck inherent in this sort of tournament - but the eventual premiers West Coast (my own team, for what it's worth, as a result of growing up in Perth) were ranked a solid second. The four top teams were those in the semi-finals, so that system works to a degree.

So that's ratings based on one year; what if we choose a dozen years? A very interesting story of Geelong's complete dominance up to about 2014, caught up in the past five years or so by Hawthorn and Sydney. Richmond is much less prominent in this view, reflecting how surprisingly well it went in 2018 (despite missing out on the grand final) compared to its form in the previous ten years:

The ratings based on 2007 and onwards end up in a very similar position to those based on 1897 and onwards; it looks like there is about 10 years of momentum stored up in a rolling Elo rating.

Post continues after code extract

## Recent performance is a better guide for predictions than the full history

Let's turn to the question of using Elo ratings, whether based on 12 months of performance or 120 years, to predict winners in the coming season. This next chart compares the teams' ratings at the end of 2018 (having been reset to 1500 at the beginning of the year) with two candidate explanatory variables - the Elo rating based just on the 2017 season, and the Elo rating based on 120 years of performance to the end of 2017. Both sets of ratings have predictive power, but the ratings based on only 12 months are slightly better.

Finally, here’s a chart of how well we would have gone in a 120 year footy-tipping competition if we simply tipped the team with the higher Elo rating, based on performance to date, to win. Our success rate in the red line (using full history of performance) hovers around 65%, which isn’t stellar but is clearly better than chance.

The short blue line is the success rate when using just performance from 2017 onwards to predict. We see it's slightly better than the predictions that used the full history. Ideally, I would calculate Elo ratings based on the past 12, 24 and 36 months only for every season to find the best level of history to include, but that's more rigour than I'm interested in at the moment. I'm going to use Elo ratings based on the 2018 and 2019 seasons for my footy tips.
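The retrospective tipping test can be sketched along these lines (again a hedged sketch, not my actual code - it assumes a `results` data frame with `Home.Team`, `Away.Team`, `Home.Points` and `Away.Points` columns as returned by fitzRoy at the time of writing). The key point is to record the tip *before* updating the ratings with that match's result:

```R
teams <- union(results$Home.Team, results$Away.Team)
elo <- setNames(rep(1500, length(teams)), teams)
k <- 32
correct <- logical(nrow(results))

for (i in seq_len(nrow(results))) {
  h <- results$Home.Team[i]
  a <- results$Away.Team[i]
  home_won <- results$Home.Points[i] > results$Away.Points[i]

  # Tip the team with the higher current rating, before seeing this result
  # (ties in ratings tip the home team; drawn matches count as missed tips,
  # a simplification for this sketch)
  correct[i] <- (elo[h] >= elo[a]) == home_won

  # Then apply the standard Elo update using the actual result
  exp_h <- 1 / (1 + 10 ^ ((elo[a] - elo[h]) / 400))
  outcome <- as.numeric(home_won) +
             0.5 * (results$Home.Points[i] == results$Away.Points[i])
  elo[h] <- elo[h] + k * (outcome - exp_h)
  elo[a] <- elo[a] + k * ((1 - outcome) - (1 - exp_h))
}

mean(correct)  # overall success rate; the post finds around 0.65
```

Taking a rolling mean of `correct` by season gives the success-rate lines in the chart.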

Post continues after code extract

So here's the rating table I used for my tips for round 1 of 2019 (other than the first game, which, being on a Thursday, snuck up on me before I realised I had to get tips in - a newbie mistake, and annoying because I would have picked the winner correctly).

| team | new_elo |
|:-----|--------:|
| Richmond | 1560.921 |
| West Coast | 1553.291 |
| Melbourne | 1553.150 |
| Collingwood | 1543.454 |
| Geelong | 1533.995 |
| Hawthorn | 1523.387 |
| GWS | 1522.148 |
| Sydney | 1514.495 |
| Essendon | 1514.178 |
| North Melbourne | 1509.164 |
| Brisbane Lions | 1465.610 |
| Footscray | 1461.265 |
| Fremantle | 1457.496 |
| St Kilda | 1433.498 |
| Gold Coast | 1423.489 |
| Carlton | 1407.342 |

This led to tips that were mostly consistent with the crowd favourites and bookies' odds. The main exception was that I tipped Hawthorn to beat the Adelaide Crows, against the strong expectations of everyone else. Possibly they know something that isn't in my data; did Hawthorn lose some key players, or have a bad off-season? We'll only know at the end of the season, when we can see whether my method gets better results than the average punter.

Post continues after code extract

That’s all.

Here are the R packages used in producing this post:

| maintainer | no. of packages | packages |
|:-----------|---------------:|:---------|
| Hadley Wickham | 16 | assertthat, dplyr, ellipsis, forcats, ggplot2, gtable, haven, httr, lazyeval, modelr, plyr, rvest, scales, stringr, tidyr, tidyverse |
| R Core Team | 11 | base, compiler, datasets, graphics, grDevices, grid, methods, splines, stats, tools, utils |
| Winston Chang | 4 | extrafont, extrafontdb, R6, Rttf2pt1 |
| Yihui Xie | 4 | evaluate, knitr, rmarkdown, xfun |
| Kirill Müller | 4 | DBI, hms, pillar, tibble |
| Yixuan Qiu | 3 | showtext, showtextdb, sysfonts |
| Lionel Henry | 3 | purrr, rlang, tidyselect |
| Gábor Csárdi | 3 | cli, crayon, pkgconfig |
| Dirk Eddelbuettel | 2 | digest, Rcpp |
| Jeroen Ooms | 2 | curl, jsonlite |
| Jim Hester | 2 | glue, withr |
| Kamil Slowikowski | 1 | ggrepel |
| Vitalie Spinu | 1 | lubridate |
| Deepayan Sarkar | 1 | lattice |
| Patrick O. Perry | 1 | utf8 |
| Jennifer Bryan | 1 | cellranger |
| Michel Lang | 1 | backports |
| Kevin Ushey | 1 | rstudioapi |
| Martin Maechler | 1 | Matrix |
| Justin Talbot | 1 | labeling |
| Charlotte Wickham | 1 | munsell |
| Alex Hayes | 1 | broom |
| Simon Wood | 1 | mgcv |
| Joe Cheng | 1 | htmltools |
| Simon Urbanek | 1 | audio |
| Peter Ellis | 1 | frs |
| Brodie Gaslam | 1 | fansi |
| R-core | 1 | nlme |
| Stefan Milton Bache | 1 | magrittr |
| Marek Gagolewski | 1 | stringi |
| James Hester | 1 | xml2 |
| Max Kuhn | 1 | generics |
| Simon Urbanek | 1 | Cairo |
| Jeremy Stephens | 1 | yaml |
| James Day | 1 | fitzRoy |
| Achim Zeileis | 1 | colorspace |
| Rasmus Bååth | 1 | beepr |
