In math we trust

We want to be as transparent as possible, so we are happy to explain how our algorithm is constructed.

Image 1

What many people do not realise

When making predictions, many people overlook a number of essential factors that shape the outcomes of matches.

One very interesting factor is the advantage of playing at home. Many people think its importance is limited; however, the full 2018-2019 soccer season clearly shows the advantage:

Image 2

The total number of goals scored per match is another statistic that is often overlooked. Especially when betting on competitions outside the country they know best, people tend not to take into account the differences between countries (2018-2019):

Image 3

The percentage of draws also deserves attention (2018-2019):

Image 4

Why soccer predictions are not easy

Soccer is a tricky sport to predict because only a few goals are scored in each match. Basketball is easier because far more points are scored. The fewer goals scored per match, the greater the randomness, i.e. the uncertainty, of the outcome. As a result, the final score fairly often disagrees with many people’s impression of how well each team played. The low-scoring nature of the sport can also lead to prolonged runs of luck, where a team gets good results despite playing poorly (or vice versa).
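The effect of low scoring can be illustrated with a small calculation. The sketch below is not SoccerBright's algorithm; it simply assumes that both sides score independently according to a Poisson distribution, and it uses made-up expected scores (1.5 against 1.0, and the same ratio scaled up tenfold to 15 against 10) to compare how often the stronger side fails to win. The function names are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k scores when the expected number is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def favourite_fails_to_win(mean_favourite, mean_underdog, max_score=80):
    """Chance that the stronger side draws or loses.

    Assumes both sides score independently (an illustrative simplification).
    """
    fav = [poisson_pmf(k, mean_favourite) for k in range(max_score + 1)]
    dog = [poisson_pmf(k, mean_underdog) for k in range(max_score + 1)]
    return sum(
        fav[i] * dog[j]
        for i in range(max_score + 1)
        for j in range(max_score + 1)
        if i <= j  # favourite scores no more than the underdog
    )

# Same strength ratio (3:2), very different scoring levels (invented numbers)
low_scoring = favourite_fails_to_win(1.5, 1.0)
high_scoring = favourite_fails_to_win(15.0, 10.0)
print(f"low-scoring sport:  favourite fails to win {low_scoring:.0%} of the time")
print(f"high-scoring sport: favourite fails to win {high_scoring:.0%} of the time")
```

Under these assumptions the favourite slips up far more often in the low-scoring case, even though the relative strength of the two sides is identical.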

The Poisson distribution

The number of goals a team scores in a match follows the Poisson distribution (see Wikipedia) to a good approximation. For example, if team A hosts team B with an expected score of 3-1, the chance of a draw is still as high as 13% and the chance of a visitor win is 9%. Because the Poisson distribution describes soccer matches well and is rather broad, the uncertainty of match outcomes can be quantified, and it is significantly higher than many people think.
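The 3-1 example can be checked with a few lines of Python. This is a minimal sketch that assumes the two teams' goal counts are independent Poisson variables with means 3 and 1; the independence assumption and the function names are ours, not a description of the production model.

```python
import math

def poisson_pmf(goals, expected_goals):
    """Probability of scoring exactly `goals` given the expected number."""
    return math.exp(-expected_goals) * expected_goals**goals / math.factorial(goals)

def outcome_probabilities(expected_home, expected_away, max_goals=20):
    """Return (home win, draw, away win) probabilities.

    Sums over every scoreline up to max_goals per team; the probability
    mass beyond that is negligible for typical soccer scores.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, expected_home) * poisson_pmf(a, expected_away)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win

home_win, draw, away_win = outcome_probabilities(3.0, 1.0)
print(f"home win {home_win:.1%}, draw {draw:.1%}, away win {away_win:.1%}")
```

The draw and away-win figures come out at roughly 13% and 9%, in line with the numbers quoted above.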

More detail

Each team is described by several aggregate attributes, such as strength of attack, strength of midfield, and strength of defence. Home team advantage and visitor disadvantage are of course also taken into account in the predictions. Because the attributes are matched against actual match results, all statistical information automatically becomes part of the predictions.

SoccerBright compares the results of the entire season with those of the last 10 and the last 5 matches to identify whether a team has improved or deteriorated during the season. Sometimes the last few matches reflect luck more than actual improvement, so the three periods are matched against actual scores to find the correct weighting for each of them.
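One way to picture this weighting step is a small search over candidate weights. The sketch below is only an illustration under our own assumptions: the data rows are invented, the error measure (squared error on goals scored) and the coarse grid search are stand-ins, and none of the names reflect the actual SoccerBright code.

```python
from itertools import product

# Hypothetical toy data, invented purely for illustration. Each row is
# (season average, last-10 average, last-5 average, goals in the next match).
history = [
    (1.8, 2.0, 2.4, 2),
    (1.2, 1.0, 0.6, 1),
    (2.1, 1.9, 1.4, 2),
    (0.9, 1.3, 1.8, 1),
    (1.5, 1.6, 1.2, 2),
]

def squared_error(weights, rows):
    """Total squared gap between the weighted average and the actual goals."""
    total = 0.0
    for season, last10, last5, actual in rows:
        blended = weights[0] * season + weights[1] * last10 + weights[2] * last5
        total += (blended - actual) ** 2
    return total

def best_weights(rows, steps=10):
    """Try every weight triple on a coarse grid that sums to 1; keep the best."""
    candidates = []
    for i, j in product(range(steps + 1), repeat=2):
        if i + j <= steps:
            candidates.append((i / steps, j / steps, (steps - i - j) / steps))
    return min(candidates, key=lambda w: squared_error(w, rows))

weights = best_weights(history)
print("season / last 10 / last 5 weights:", weights)
```

If recent form were mostly luck, a search like this would push weight back towards the full-season figures, because the short windows would predict the actual scores less well.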

A computer program written in Python does all the hard and tedious analysis. Every week it analyses the details of the teams and the historical matches, and the formulas and algorithms that work best are used for the predictions. Simple algorithms normally perform best at the beginning of the season, as there is not yet much data to compare with. Later in the season, more complex algorithms become stronger and are then used for the predictions.
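Choosing whichever algorithm works best can be sketched as a simple back-test. The example below is a rough illustration, not the actual selection procedure: the two candidate models, the error measure (mean absolute error on goals) and the goal tallies are all hypothetical.

```python
from statistics import mean

def season_mean(goals_so_far):
    """Simple candidate: average of every match played so far."""
    return mean(goals_so_far)

def recent_form(goals_so_far, window=5):
    """Slightly richer candidate: average of the most recent matches only."""
    return mean(goals_so_far[-window:])

def backtest_error(model, goals, warm_up=3):
    """Walk forward through the season, predicting each match from earlier ones."""
    errors = [abs(model(goals[:i]) - goals[i]) for i in range(warm_up, len(goals))]
    return mean(errors)

def pick_best(models, goals):
    """Return the model with the lowest back-test error on the matches so far."""
    return min(models, key=lambda m: backtest_error(m, goals))

# Invented goal tallies for one hypothetical team
goals = [1, 0, 2, 1, 1, 3, 2, 3, 2, 4]
best = pick_best([season_mean, recent_form], goals)
print("model chosen this week:", best.__name__)
```

Re-running the selection every week as new results arrive lets the choice shift over the season, since richer models only start to win the back-test once there are enough matches to support them.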