The entropy of racing
Formula One racing is often criticized for its lack of excitement compared to previous eras of motorsport. Mercedes has dominated the hybrid era – notching 102 wins, or 73% of all races, since the 2014 season. That criticism is starting to crack. The excitement of Formula One is on the rise, and the data supports it.
Changes in race lead are the most exciting moments of any race. Many of the objectively boring races consist of a single driver maintaining the race lead on every lap – often Lewis Hamilton, represented by the many long strands of teal in the plot below.
Each horizontal sequence is a single race. The color represents the constructor leading each lap. Red Bull and Sebastian Vettel kick off the decade with a swath of blue. Once the hybrid engines are introduced in the 2014 season, Mercedes’s teal ascends to dominance. It is only briefly interrupted by sporadic Ferrari red in 2017-2019.
Shannon entropy, often used to measure uncertainty in a probability distribution or to quantify information gain, can be applied to these sequences of color to help us understand race monotony. Entropy, in this case, is a fancy method to quantify the number of race leaders within a race while accounting for the length of the lead stints and the length of the entire race. It is a great proxy for excitement, as greater entropy corresponds to more dynamic races.
For example, the sequence [HAM, BOT, VER, VER] will have a larger value than [HAM, HAM, VER, VER] because it has more distinct leaders.
Stint length relative to race length matters too: [HAM, VER, HAM] will be larger than [HAM, HAM, VER, HAM, HAM] despite both having only two changes.
Entropy also increases as leaders become more competitively balanced: [HAM, HAM, VER, VER] has a larger value than [HAM, VER, VER, VER] despite both having only one change. This reflects increasing excitement as the field of drivers becomes more balanced in the long run.
Entropy is an imperfect tool, though. The entropy of [HAM, VER, HAM, VER] is the same as that of [HAM, HAM, VER, VER] even though the former is clearly more exciting. An ideal metric would account for both positions and the number of changes.
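The comparisons above can be checked with a few lines of Python. This is a minimal sketch rather than the code behind the plots: it treats each lap as one observation, uses each leader's share of laps led as the probability, and assumes log base 2 (the base only rescales the values). The function name shannon_entropy is chosen here for illustration.

```python
import math
from collections import Counter


def shannon_entropy(leaders):
    """Shannon entropy (in bits) of the lap-leader labels in one race."""
    counts = Counter(leaders)
    total = len(leaders)
    # Sum p * log2(1 / p) over each leader's share of laps led.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


examples = [
    ["HAM", "BOT", "VER", "VER"],
    ["HAM", "HAM", "VER", "VER"],
    ["HAM", "VER", "HAM"],
    ["HAM", "HAM", "VER", "HAM", "HAM"],
    ["HAM", "VER", "VER", "VER"],
    ["HAM", "VER", "HAM", "VER"],
]
for seq in examples:
    print(seq, round(shannon_entropy(seq), 3))
```

Because only the share of laps matters, the alternating sequence and the grouped sequence both come out at exactly 1 bit, which is the blind spot described above.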
Whether a single summary number can capture all the information in a sequence is a debated subject in sequence analysis. For the application here, weighting the standard Shannon entropy metric by the number of changes in race leader is appropriate. This new entropy metric increases with the number of race leader changes but still retains the core properties of entropy.
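The text does not spell out the exact form of the weighting, so the sketch below assumes the simplest one: multiply the Shannon entropy by the count of lead changes. The names lead_changes and weighted_entropy, and the multiplicative form itself, are illustrative assumptions and may differ from the linked repository.

```python
import math
from collections import Counter


def shannon_entropy(leaders):
    """Shannon entropy (in bits) of the lap-leader labels in one race."""
    counts = Counter(leaders)
    total = len(leaders)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def lead_changes(leaders):
    """Count the laps on which the leader differs from the previous lap."""
    return sum(1 for prev, cur in zip(leaders, leaders[1:]) if prev != cur)


def weighted_entropy(leaders):
    # Assumed weighting: entropy multiplied by the number of lead changes.
    return shannon_entropy(leaders) * lead_changes(leaders)


alternating = ["HAM", "VER", "HAM", "VER"]
grouped = ["HAM", "HAM", "VER", "VER"]
print(weighted_entropy(alternating), weighted_entropy(grouped))
```

Under this assumption the alternating sequence scores three times higher than the grouped one, while a race led by one driver throughout still scores zero.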
The plot below shows the trailing three-season moving average of entropy over the last two decades. It shows a slow and then sudden fall in excitement with Mercedes’s ascension, though the decline starts sooner, during Red Bull’s reign – so perhaps we should blame Toto Wolff and Christian Horner equally. The trend inflects near 2019, and the recent rise corresponds with a return of excitement. Hopefully this recent trend will continue upwards, as excitement is still systemically low compared to the previous decade.
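A trailing average of this kind can be sketched as follows. The sketch assumes one score per season (for example, the mean race entropy) and averages each season with the two before it; how the first seasons are handled, and the input values themselves, are made-up placeholders rather than real results.

```python
def trailing_average(values, window=3):
    """Mean of each value and up to `window - 1` values before it."""
    averages = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start : i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Placeholder season-level scores in chronological order, not real data.
season_scores = [3.0, 6.0, 3.0, 9.0]
print(trailing_average(season_scores))
```

The first two entries average over fewer than three seasons here; dropping them instead would be an equally reasonable choice.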
The top race by excitement was the 2013 season opener, the Australian Grand Prix, with 10 changes in race leader. Openers often result in a chaotic shuffling of cars while drivers learn their new machines and the teams work out their latent rankings. The 2013 race also suffered from a wet qualifying session that mixed up the grid more than usual. The second most exciting race was Kubica’s revenge at the Canadian circuit after his tremendous 2007 crash there. None of the top five had wet weather conditions during the race.
At the other extreme, the worst races by excitement are those where the first-lap leader maintained the lead throughout the entire race. There have been 80 of these races since 1996.
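Such races are easy to flag because the lap-leader sequence contains a single distinct label, which also means its entropy is zero. The snippet below is a small illustration; the function name and the two example races are hypothetical.

```python
def led_start_to_finish(leaders):
    """True when one leader holds every lap of a non-empty race."""
    return len(leaders) > 0 and len(set(leaders)) == 1


# Hypothetical lap-leader sequences keyed by a made-up race name.
races = {
    "race_a": ["HAM"] * 5,
    "race_b": ["HAM", "HAM", "VER", "HAM", "HAM"],
}
wire_to_wire = [name for name, laps in races.items() if led_start_to_finish(laps)]
print(wire_to_wire)
```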
Formula One supporters (bolstered by television announcers and Formula One itself) enjoy pivoting the debate to the midfield. I agree that changes in race lead do not tell the whole story. Mercedes dominated the 2020 season but, as a viewer, I found it an exciting season. That was thanks to haphazard scheduling, novel circuits, double-headers, three red-flagged races, a driver reshuffling at the end of the season, and, of course, the terrible Grosjean incident.
The midfield did appear strong in 2020, but the entropy measure does not firmly support it. Peak excitement was early in the decade, where the plot below is brightest. The perceived 2020 midfield entertainment might stem from a shift in narratives – anecdotally, the Sky Sports announcers covered the McLaren-Renault-Racing Point battles more closely than in past years.
Entropy is a great measure of race excitement but, since this was all mostly for fun, there are a few drawbacks it does not directly address:
There are other metrics, such as turbulence, which measure, for lack of a more specific word, some form of entropy of a sequence. These could be a better proxy for race excitement and a sensible next step for extending this work.
Note: all of the above only includes races from 1996 onwards. No data on individual laps was available prior.
Find the code here: github.com/joemarlo/formula-one