A few weeks ago, Euro 2016 started, and with it a betting contest at Criteo. I knew almost nothing about soccer (I was still confusing the Ronaldo from Brazil-98 with the one from Portugal-now), but some pretty accurate information is freely available on betting websites, so I decided to give it a try anyway.
A big fan of soccer, Jonathan BLANDIN of Criteo’s internal IT department created Bookmakerz.com with a friend as a prediction platform for groups of friends during soccer tournaments. The goal was to create something solid and free, without the complexity of using Excel. #teamCriteo is so far the largest group created on Jonathan’s Bookmakerz. The platform now gives anybody the chance to form their own leagues and bet away!
Here is a summary of the rules :
The most important thing to notice is that internal bets are used, meaning that the odds, and therefore the number of points you get by betting on the winning outcome, depend on other people’s bets.
If all the contestants bet on team A’s win, the share of bets on a draw or on a win of the opponent (team B) will be really low, so those outcomes will be really rewarding.
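A minimal sketch of how such internal odds could be computed. The formula (total number of bets divided by the number of bets on the outcome) and the function name are assumptions made for illustration; only the cap of 10 comes from the contest rules.

```python
def internal_odds(bet_counts, cap=10.0):
    """Points paid per outcome, assuming odds = total bets / bets on that outcome.

    The formula is an assumption for illustration; only the cap of 10
    is taken from the contest rules.
    """
    total = sum(bet_counts.values())
    odds = {}
    for outcome, count in bet_counts.items():
        # An outcome nobody picked gets the maximum payout.
        odds[outcome] = cap if count == 0 else min(cap, total / count)
    return odds


# Hypothetical match: 95 of 100 contestants bet on team A.
print(internal_odds({"A": 95, "draw": 3, "B": 2}))
```

With these made-up counts, team A pays barely more than 1 point, while both the draw and team B hit the cap of 10.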
Therefore, by trying to predict the bias between the “true” odds (the ones from betting websites) and the internal bets, one should be able to maximize one’s average number of points. This was the basic strategy I followed: bet on the outcome on which people at Criteo bet less than they should.
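This strategy can be sketched as an expected-value comparison. The payout model (1 divided by the share of internal bets, capped at 10) and all the numbers below are hypothetical, chosen only to illustrate the idea.

```python
def expected_points(true_probs, internal_bet_shares, cap=10.0):
    """Expected points per outcome = true probability x internal odds.

    Assumes internal odds = 1 / share of internal bets, capped at `cap`.
    """
    result = {}
    for outcome, p in true_probs.items():
        share = internal_bet_shares[outcome]
        odds = cap if share == 0 else min(cap, 1.0 / share)
        result[outcome] = p * odds
    return result


# Hypothetical numbers: probabilities derived from a betting website,
# and the guessed distribution of the internal bets.
true_probs = {"A": 0.60, "draw": 0.25, "B": 0.15}
shares = {"A": 0.80, "draw": 0.12, "B": 0.08}

ev = expected_points(true_probs, shares)
best = max(ev, key=ev.get)
print(ev, best)
```

In this example the draw has the highest expected value: the underdog is even more under-bet, but its payout is limited by the cap.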
For the general bets, I just chose the favorites: France as the winner, Germany as the other finalist and T. Muller as the top scorer (I’m looking at his Wikipedia page as I’m writing this, and I can confirm that I had never heard of him before).
It could have been interesting to choose slightly less likely general bets, but a basic Monte Carlo simulation of the contest under different scenarios (different numbers of competitors, different kinds of bias, …) convinced me otherwise.
The simulations also showed two interesting things.
First, even without any bias in the Criteo bets, it was much more interesting to bet on the less likely outcomes, because of their higher variance. When only the top places matter, it is better to either win or lose badly than to always end in the middle of the pack.
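A simplified Monte Carlo sketch of this effect, not the original simulation. It assumes that a correct bet pays 1 / p (so every bet has the same expected value) and that the other players pick outcomes at random in proportion to the true probabilities. The probabilities, the number of matches and the number of players are illustrative.

```python
import random


def simulate_first_place_rate(my_pick, probs, n_matches=36, n_others=30,
                              n_sims=500, seed=0):
    """Estimate how often a player who always bets `my_pick` finishes alone in first place.

    Assumptions (not the contest rules): a correct bet pays 1 / p, and the
    other players bet at random following the true probabilities (no bias).
    """
    rng = random.Random(seed)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    wins = 0
    for _ in range(n_sims):
        me = 0.0
        others = [0.0] * n_others
        for _ in range(n_matches):
            result = rng.choices(outcomes, weights)[0]
            payout = 1.0 / probs[result]
            if my_pick == result:
                me += payout
            picks = rng.choices(outcomes, weights, k=n_others)
            for i, pick in enumerate(picks):
                if pick == result:
                    others[i] += payout
        if me > max(others):
            wins += 1
    return wins / n_sims


probs = {"favorite": 0.50, "draw": 0.28, "underdog": 0.22}
for pick in probs:
    print(pick, simulate_first_place_rate(pick, probs))
```

Under these assumptions the three picks have exactly the same expected score (1 point per match), so any difference in the first-place rate comes from the variance alone.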
The second thing I got from these simulations was that the points awarded for the exact score played a very minor role on average. This reassured me that there was no particular reason to bet on anything other than the most likely scores (most of the time 1-0, or 1-1 for draws).
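One way to see why 1-0 and 1-1 come out as the most likely scores is to model the goals of each team as independent Poisson variables. This is a sketch under that assumption; the scoring rates (1.3 and 0.8 goals per match) are illustrative values, not fitted ones.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals for a team scoring lam goals on average."""
    return exp(-lam) * lam ** k / factorial(k)


def score_probabilities(lam_a, lam_b, max_goals=6):
    """Probability of each exact score, assuming the two teams score independently."""
    return {(a, b): poisson_pmf(a, lam_a) * poisson_pmf(b, lam_b)
            for a in range(max_goals + 1)
            for b in range(max_goals + 1)}


scores = score_probabilities(1.3, 0.8)  # illustrative scoring rates
most_likely = max(scores, key=scores.get)
draws = {s: p for s, p in scores.items() if s[0] == s[1]}
most_likely_draw = max(draws, key=draws.get)
print(most_likely, most_likely_draw)  # (1, 0) (1, 1)
```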
First part : the group phase
For the start of the contest I assumed that, because internal bets were hidden, most people would just bet on the most likely outcome, so there would be a bias toward the favorites.
So for the first matches I bet on a win of the underdog, except when the match was too unbalanced and the number of points for an underdog win would be capped at 10 (odds are capped at 10 in the rules); in that case I bet on a draw.
At the beginning the bias was actually stronger than I anticipated (for example, for the first match roughly 95% of people bet on a win of France against Romania), so I continued with this strategy for the whole first phase. It wasn’t right often, but when it was, it paid a lot (nobody expected the Spanish Inquisition, or rather the Spanish defeat).
At the very end of this phase I grabbed first place from A.Leloup (a competitor who was leading by playing draws 75% of the time, another strategy with a high variance and a high expected value).
Start of the Final Phase
At the end of the first phase, the bias toward the favorite slightly decreased, and I assumed that a bias would appear against draws, as some people would avoid betting on them, not knowing that extra time was not taken into account. So for the start of the final phase I always bet on a draw, and it quickly worked quite well, as I got several good results in a row.
At this point I had a lead strong enough that, for some matches, it was more interesting to bet on the favorite (betting on a low-variance outcome at the cost of a reduced expected value). In order to check that, I re-implemented the simulation I had, using the current leaderboard and the general bets, and then tried different bets (manually, then programmatically) in order to maximize the probability of ending in first place (and the probabilities of finishing at least in 2nd and 3rd place).
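A minimal sketch of this kind of check, leaving out the general bets. The model is hypothetical: rivals pick outcomes at random following the internal bet shares, a correct bet pays min(10, 1 / share), and all scores and probabilities below are made up.

```python
import random

OUTCOMES = ("home", "draw", "away")


def first_place_probability(my_score, rival_scores, my_bets, matches,
                            n_sims=5000, cap=10.0, seed=0):
    """Monte Carlo estimate of the chance of finishing strictly first.

    `matches` holds one (true_probs, bet_shares) pair per remaining match,
    each a dict keyed by outcome, with strictly positive shares.
    Hypothetical model: rivals bet at random following `bet_shares`,
    and a correct bet pays min(cap, 1 / share).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        me = my_score
        rivals = list(rival_scores)
        for my_pick, (probs, shares) in zip(my_bets, matches):
            result = rng.choices(OUTCOMES, [probs[o] for o in OUTCOMES])[0]
            payout = min(cap, 1.0 / shares[result])
            if my_pick == result:
                me += payout
            picks = rng.choices(OUTCOMES, [shares[o] for o in OUTCOMES],
                                k=len(rivals))
            for i, pick in enumerate(picks):
                if pick == result:
                    rivals[i] += payout
        if me > max(rivals):
            wins += 1
    return wins / n_sims


# Made-up situation: three identical matches left, a 5-point lead.
match = ({"home": 0.55, "draw": 0.27, "away": 0.18},
         {"home": 0.60, "draw": 0.15, "away": 0.25})
matches = [match] * 3
rivals = [40.0, 38.0, 35.0]
for first_bet in OUTCOMES:
    bets = [first_bet, "home", "home"]
    print(first_bet, first_place_probability(45.0, rivals, bets, matches))
```

Comparing the three printed estimates shows which bet on the next match gives the best chance of keeping first place, which is not necessarily the bet with the highest expected number of points.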
At this point I added these probabilities to the metrics our team monitors every morning during the stand-up on our TV screens :
End of the Final Phase
It was working, but it was only testing the different bets on the next upcoming match, so I did the same on the tree of possibilities (1 bet for the upcoming match, 2 bets for the second match depending on which team qualified in the first match, 4 bets for the third match, and so on). This gives 2^N – 1 bets for N remaining matches, and since each bet can take three values, the number of combinations to test was 3^(2^N – 1), and each test requires hundreds of thousands of Monte Carlo simulations in order to be accurate.
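To get a feeling for how fast this search space grows, the two formulas can be evaluated directly (the function name is only illustrative):

```python
def search_space(n_matches, n_outcomes=3):
    """Return (bets in the decision tree, combinations to test) for N remaining matches."""
    n_bets = 2 ** n_matches - 1
    return n_bets, n_outcomes ** n_bets


for n in range(1, 6):
    bets, combos = search_space(n)
    print(n, bets, combos)
```

With only 4 remaining matches there are already 15 bets and more than 14 million combinations, each of which would need its own batch of simulations.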
To find the optimal solution, I couldn’t use an exhaustive search, and a simple hill climbing algorithm (randomly changing one bet and keeping the change if it improves the result) got stuck in local optima, but a slightly improved hill climbing (changing two or three bets at a time) was enough to quickly reach the global optimum. So the problem was not so intractable, and fancy things like simulated annealing were not necessary.
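A sketch of such a hill climbing with multi-bet moves. It is not the original code: `score` stands in for the Monte Carlo estimate of the first-place probability, and `toy_score` is a deliberately deceptive toy objective, built so that changing a single bet is not always enough to improve.

```python
import random

OUTCOMES = ("home", "draw", "away")


def hill_climb(score, n_bets, max_changes=3, n_steps=5000, seed=0):
    """Maximize `score(bets)` by re-drawing 1 to `max_changes` bets per step.

    A change is kept only if it strictly improves the score.
    """
    rng = random.Random(seed)
    best = [rng.choice(OUTCOMES) for _ in range(n_bets)]
    best_score = score(best)
    for _ in range(n_steps):
        candidate = list(best)
        k = rng.randint(1, min(max_changes, n_bets))
        for i in rng.sample(range(n_bets), k):
            candidate[i] = rng.choice(OUTCOMES)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(bets):
    """Toy objective: each consecutive pair of bets scores 1 for (home, home)
    and 2 for (away, away), so improving a (home, home) pair requires
    changing both bets at once."""
    total = 0
    for a, b in zip(bets[0::2], bets[1::2]):
        if a == b == "home":
            total += 1
        elif a == b == "away":
            total += 2
    return total


print(hill_climb(toy_score, n_bets=6, n_steps=20000))
```

With `max_changes=1`, a pair that has reached (home, home) can never be improved, because any single change lowers the score; allowing two or three simultaneous changes removes that trap in this toy case.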
At the end of the contest the bias against the draw decreased, and a bias toward the underdog started to appear, probably because the end of the contest was near and some people switched to a high-risk, high-reward strategy (betting on the underdog).
For the final, I was quite confident, as the simulation gave me an almost 100% probability of winning, and the only scenarios in which I lost required both a draw before extra time and Ronaldo being the top scorer. For that, Ronaldo needed to score 4 goals in the final, which didn’t really happen.
I didn’t have to use a mixed strategy from game theory, but it could have been worth doing with a smaller lead.
In the end, I won because of some basic psychology, some statistics and algorithms, and a lot of luck.
I might have missed the point of this betting contest since I never actually tried to predict the outcome of the matches, but I had lots of fun anyway.
A few random facts :
– Always betting 1-1 was enough for the third place
– Betting 1-0, 0-1 or 1-1 randomly was enough for the 8th place on average
– Soccer scores follow a Poisson distribution very closely
– For those familiar with information theory, the information for each match is 1.58 bits without any prior knowledge, and about 1.45 bits given the odds from betting websites, just slightly better.
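The 1.58 bits in the last fact is the entropy of three equally likely outcomes, log2(3). A small sketch of the computation; the second distribution is a made-up example of probabilities derived from a betting website, not data from the contest.

```python
from math import log2


def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


# No prior knowledge: win, draw and loss are equally likely.
print(round(entropy_bits([1 / 3, 1 / 3, 1 / 3]), 2))  # 1.58

# Hypothetical match with probabilities taken from bookmaker odds.
print(round(entropy_bits([0.50, 0.27, 0.23]), 2))
```

The closer the probabilities are to uniform, the closer the entropy stays to 1.58 bits, which is another way of saying that the odds only remove a small part of the uncertainty.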
Here is the graph of the points of each competitor during the competition (without general bets) :
A Monte Carlo simulation basically consists of running a random process a large number of times in order to get an accurate approximation.
Senior Software Engineer R&D