Introducing Baldo's Crystal Egg Rugby Predictions

Gazing into the crystal egg

For something a bit different, this blog post marks either the start of an incredible get-rich-quick scheme or the start of my descent into a hopeless spiral of gambling addiction.

Recently I’ve been getting interested in prediction models, and particularly in 538’s World Cup predictions. Seeing as I couldn’t find anyone trying to algorithmically predict rugby games in Europe, I thought I’d give it a crack myself. The result is a system called Baldo’s Crystal Egg (TM) which I’m going to be tinkering with and using throughout the upcoming rugby season to predict the result of every match in the Pro 12 and English Premiership (and maybe the I-can’t-believe-it’s-not-the-Heineken Cup too).

There are some numbers to follow – if you don’t want to think about stats, don’t worry, I’ll be talking rugby soon.

Look away now if this isn’t your kind of thing. There’s a rugby picture in a minute.

I’ll write a post with the full gory details of how the system works in R and Python soon, but for the stats-shy, it’s pretty simple in principle (if not in practice!). I started by collecting data from the ESPN Scrum website, which reports on every match and gives stats at an individual player level for most. Using this data set, I was able to calculate a measure of a team’s performance in any given game, based on both the outcome of the game and the strength of the opposition (and other factors such as home advantage). I was also able to roughly score a player’s individual contribution, based on how well the team did when they were on the pitch and their own actions – tries, assists, tackles, turnovers, etc. The result is an approximate score for each team and player for both offence and defence.

Sadly irrelevant

These scores are weighted for both time on the pitch and how long ago the game was, with a limit of four years – so Munster’s glory days of 2006-08 won’t unfairly inform their current scores.
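For the curious, here is a rough sketch of what that weighting could look like. It is not the actual Crystal Egg code: the exponential decay, the `half_life_days` parameter and the function name are all assumptions made for illustration. Only the ideas of weighting by minutes played, weighting by recency and cutting off at four years come from the description above.

```python
from datetime import date

MAX_AGE_DAYS = 4 * 365  # the four-year limit described above


def weighted_score(performances, today, half_life_days=365.0):
    """Average per-game scores, weighted by minutes played and recency.

    performances: list of (game_date, minutes_played, score) tuples.
    half_life_days is an assumed decay rate, not a value from the model.
    """
    total = 0.0
    weight_sum = 0.0
    for game_date, minutes, score in performances:
        age = (today - game_date).days
        if age < 0 or age > MAX_AGE_DAYS:
            continue  # skip future games and anything past the cutoff
        # More minutes count for more; older games count for less.
        weight = minutes * 0.5 ** (age / half_life_days)
        total += weight * score
        weight_sum += weight
    return total / weight_sum if weight_sum else None
```

With a one-year half-life, a full 80 minutes played a year ago counts half as much as 80 minutes played today, and anything from 2006-08 drops out entirely.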

Using some fancy statistical analysis (multinomial logistic regression if you’re interested), I can build a model that estimates the probability of various different results based on how two teams with similar scores for offence and defence have fared when playing each other in the past.

For example, at the moment Munster have an offensive score of 22.5 and a defensive score of 17 (roughly signifying how many points you would expect them to score and concede against an “average” team at a neutral ground), while the equivalent scores for Leinster are 25.5 and 16.5. The model estimates that if the two teams met at Thomond Park, Munster would have a 50% chance of winning the game. On the other hand, the same game in the RDS would give them just a 24% chance of victory. Unfortunately, this seems about right!
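To give a flavour of the multinomial logistic regression step, here is a minimal sketch of how a fitted model of that kind turns team ratings into probabilities. The coefficients in `COEFS` are made up for illustration, and using a single "rating gap" feature is my simplification for this sketch; the real model is fitted to past results and its numbers will differ, so don't expect this to reproduce the 50% and 24% figures above.

```python
import math

# Made-up coefficients for illustration only: (intercept, weight on rating gap).
# The draw is the reference outcome, so its linear score is fixed at 0.
# The larger intercept for "home win" stands in for home advantage.
COEFS = {
    "home win": (1.6, 0.12),
    "draw": (0.0, 0.0),
    "away win": (1.2, -0.12),
}


def rating_gap(home, away):
    """Home net rating (offence minus defence) minus away net rating."""
    home_off, home_def = home
    away_off, away_def = away
    return (home_off - home_def) - (away_off - away_def)


def outcome_probabilities(home, away):
    """Softmax over one linear score per outcome."""
    gap = rating_gap(home, away)
    scores = {k: b0 + b1 * gap for k, (b0, b1) in COEFS.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


munster = (22.5, 17.0)   # (offence, defence) as quoted above
leinster = (25.5, 16.5)

at_thomond = outcome_probabilities(home=munster, away=leinster)
at_rds = outcome_probabilities(home=leinster, away=munster)
```

The point of the sketch is the shape of the calculation: one linear score per outcome, pushed through a softmax so the three probabilities add up to one. Swapping the venue changes which team gets the home-win intercept, which is why Munster's chances differ between Thomond Park and the RDS.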

Thanks for hanging in. Here’s Simon Zebo

All in all, the model currently predicts the correct outcome in about 75% of past games. It is most effective when the full lineups for each team are available, but it still fares reasonably well based solely on team ratings (over 70%). It can also predict the outcome in terms of bonus point wins or losses, but there is naturally a bit of a loss in accuracy. Of course, it’s easy to predict the past – the real challenge is to predict the future!

Which is why I’m going to place bets every week of this rugby season based on suggestions from the model. The bets won’t always be on the favourites, whether the bookies’ or the model’s. The aim will be to find where the best value lies – if my model gives a team a 30% chance of winning, but they’re at five to one odds, then they’re still a good bet, even if I expect them to lose more often than not. Hopefully over the course of a year this averages out…
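The arithmetic behind "value" is simple enough to sketch. The function names here are my own, and the sketch assumes fractional odds (five to one means £5 profit per £1 staked) and ignores anything like commission or stake limits.

```python
def implied_probability(fractional_odds):
    """Win probability at which a bet at these odds breaks even."""
    return 1.0 / (fractional_odds + 1.0)


def expected_profit(win_probability, fractional_odds, stake=1.0):
    """Average profit per bet: win (odds * stake) or lose the stake."""
    win = win_probability * fractional_odds * stake
    loss = (1.0 - win_probability) * stake
    return win - loss


def is_value_bet(win_probability, fractional_odds):
    """True when the model rates the team higher than the odds imply."""
    return win_probability > implied_probability(fractional_odds)
```

At five to one, the break-even probability is 1/6, or roughly 17%. If the model really is right about the 30%, each £1 staked returns an average profit of 0.30 × 5 − 0.70 × 1 = £0.80, even though the bet loses seven times in ten.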

Every week I’ll place my bets (~£5 each, depending on how much value is on offer) and post my predictions for each game (Pro 12 and Premiership only at the moment, but hopefully I’ll incorporate internationals by the time they roll around too). At the end of the year I’ll either be the next Nate Silver or broke, but let’s see!

First round of predictions due out in two weeks for the first round of games. In the meantime I’ll be tinkering with the model and trying to stop it saying that Riki Flutey is a better centre than Brian O’Driscoll.

14 Comments

frank frenett

21/08/2014 at 06:47

As an ex-player, born in Munster, a gambler and a maths lover, you have me hooked!

Where are you posting your predictions?

Baldo

21/08/2014 at 10:21

Cheers Frank. I’ll be posting them up here, probably of a Friday. I don’t expect to be making much money on it any time soon mind!

Lorcan

21/08/2014 at 11:28

I’m very interested in the gory details, looking forward to seeing that!

Baldo

21/08/2014 at 11:54

Will try and get working on it – in the meantime, it is basically a shoddy rip off of this, so worth checking out:

http://espn.go.com/soccer/worldcup/news/_/id/4447078/GuideToSPI

John Rowland

21/08/2014 at 17:07

Have signed up for future posts

Ronan

22/08/2014 at 11:42

Sounds interesting

Predicting the leagues : Baldo

23/08/2014 at 17:17

[…] was glad to see some interest in the Crystal Egg model when I made my first post during the week. There seemed to particularly be questions around how the model worked, so I have […]

Stephen

24/08/2014 at 12:22

As a postgrad student blissfully nearing the end of a stat-heavy dissertation, it’s nice to see that stats can be put to good use!

- Baldo
  
  24/08/2014 at 14:16
  
  Glad to hear it – I’m sure almost any other use of stats including whatever you were doing is of more value to the world than this!
  
Round 4 Review - Baldo

30/09/2014 at 21:35

[…] and those who need a reminder, an introductory blog post about the Crystal Egg model can be found here, along with full details of how the system works […]

J

02/10/2014 at 13:00

Added to my pulse. Great idea!!

Rod

17/11/2014 at 09:33

Given the ‘interesting’ autumn international series so far, any thoughts about going international before the six nations starts?
