This post was originally made on the 4Ps Marketing blog.
The Bing Predicts team have been doing an excellent job over the last couple of years, covering everything from reality TV to politics and, of course, sports. But just how excellent have they been? Is their data good enough to take on the giants of the sporting world? Can Bing beat the Bookies?
This is how the morning of Monday July 3rd began in the 4Ps office, although perhaps with fewer rhetorical questions. We’re big fans of empirical testing here at 4Ps, so Scott and I decided to settle the debate once and for all by betting on Bing’s predictions for Wimbledon. To keep things fair, we decided to look only at the men’s singles, and to place all of our bets on the morning of the match. With no other information to go on, we each put in £10 and simply placed a £2 bet on every player Bing was predicting to win, plus a £2 accumulator across all the matches.
This was a mistake. As the results came in, we quickly realised that we should have chosen our matches more carefully. Since this was the first round, most of our bets were on ‘safe’ players, so the odds we were getting were very short. On average, we were expecting a profit of just £0.28 per winning match, meaning we had to win roughly 7 bets in every 8 just to turn a profit. What we needed was a proper strategy, and luckily, we had both the means and the data to create one.
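That 7-in-8 figure is easy to sanity-check. Assuming the £0.28 is the average profit on a winning £2 bet (the exact numbers here are illustrative), a win returns the stake plus the profit, so the break-even win rate is just the stake divided by the total payout:

```python
stake = 2.00                  # our £2 stake per bet
avg_profit = 0.28             # average profit on a winning bet at these short odds
payout = stake + avg_profit   # total returned on a win, stake included

# Win rate p must satisfy p * payout >= stake just to get our money back
break_even = stake / payout
print(round(break_even, 3))   # ~0.877, i.e. roughly 7 wins in every 8 bets
```

Anything below that hit rate and the short odds eat the bankroll, which is exactly what we saw on day one.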
We’d been recording the Bing Predicts win percentage for each match as we placed our bets, so at the end of the day we had a list of these percentages against the match outcomes. First, we tried using the percentage from Bing to calculate an ‘expected return’ from our stake and the odds. These all turned out negative, which, whilst likely accurate, wasn’t very encouraging or helpful.
However, we realised that by treating the numbers from Bing as an arbitrary ‘score’ rather than a percentage, we could employ a statistical technique called logistic regression. This generates a model of probabilities from a set of values and associated binary outcomes. We could then use this model to work backwards from the percentage of games we needed to win, and come up with a threshold score to place our bets.
A little tinkering in Python later, and we had our model:
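We no longer have the original script, but the idea can be sketched in plain numpy: fit a one-variable logistic regression of match outcome against Bing’s score, then invert the fitted curve to find the score at which the modelled win probability reaches a target rate. The data below is a made-up stand-in for our recorded results, and `threshold_score` is a hypothetical helper name, not the function we actually wrote:

```python
import numpy as np

# Hypothetical stand-in data: Bing's win score for the favourite (as a
# percentage) and whether the favourite actually won (1) or lost (0).
scores = np.array([55, 58, 60, 62, 65, 68, 70, 72, 75, 78, 80, 82, 85, 88, 90, 91], dtype=float)
won    = np.array([ 0,  0,  0,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1], dtype=float)

# Fit p(win) = sigmoid(b0 + b1 * score) by plain gradient descent on the
# log-loss; standardising the scores keeps the step size well behaved.
x = (scores - scores.mean()) / scores.std()
b0, b1 = 0.0, 0.0
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    b0 -= 0.1 * (p - won).mean()
    b1 -= 0.1 * ((p - won) * x).mean()

def threshold_score(target_prob):
    """Invert the fitted curve: the raw Bing score at which the modelled
    win probability reaches target_prob."""
    z = np.log(target_prob / (1.0 - target_prob))   # logit of the target
    x_star = (z - b0) / b1                          # solve b0 + b1*x = z
    return x_star * scores.std() + scores.mean()    # undo standardisation

print(threshold_score(0.875))
```

Feeding in the break-even win rate gives a minimum Bing score to bet at, which is how we arrived at our thresholds for the rest of the tournament.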
Using this, the odds for the next day, and a little bit of hand waving, we came up with a much more sensible strategy. We would place bets on any player Bing thought was at least 70% likely to win, and an accumulator across any above 80%. We also dropped our stakes to £1 in order to reduce the effect of any further bad days.
This strategy proved much more successful: over the next few days we gradually clawed our way back into profit. As things progressed, we fed the new results back into the model, but found that we didn’t need to adapt our strategy; the modelled numbers stayed about the same.
We did miss one day, when both of us were out of the office in the morning with no opportunity to place any bets. Luckily for us, this was the day when Murray and Djokovic were both knocked out. Had we been around that morning, we would have bet on them both and probably made an overall loss.
Thankfully, Lady Luck smiled upon us and we ended the tournament with a staggering £20.47 in our account, a 2.35% increase in just 2 weeks. This return sums up how we both felt about the whole thing – yes, we did technically come out ahead, but only with a good number of hours invested and more than a little blind luck.
We were happy to call this one a draw. Bing did come out on top, but if you’re looking for practical betting advice, we’d suggest you look elsewhere.
Authors’ note: This whole exercise was thought up at the last minute, and intended as a bit of a joke between co-workers. The statistical methods and models we used are shaky at best due to our limited sample and poor planning, and fall down completely when it comes to predicting the win chances of the lower-ranked player. There are, however, some clear improvements that could be made; we’d love to hear from anyone who’s had a similar idea.