Now that the Winter Olympics have ended, people can turn off the television and focus on what really matters: predicting which countries will win big at the next Winter Olympics. Is this a bit premature? Perhaps. But from World Cup results, to presidential election outcomes, to Olympic medal counts, many people seem to enjoy predicting the winners of events that happen in four-year cycles.
Predicting the winner in any individual event can be challenging. But overall, there are certain predictions that seem pretty reliable. The United States will earn more medals than Jamaica, for example: of this you can be nearly 100% certain.
But will the United States win more medals than Canada? What about Norway or Russia? To answer these questions we need to be able to predict how many medals we can expect each country to receive.
There are quite a few ways that we can try to predict Olympic success, as measured by a final medal count. One way might be economic: after all, wealthier countries have more money to invest in preparing athletes for the Olympics. Another way might be geographic: maybe countries farther north tend to do better at the Winter Olympics. A third way might be historical: countries that have performed well in past Winter Olympics should continue to perform well.
There are other factors as well, but even looking at these three we can find some interesting relationships. For example, here's a plot of medal count vs. GDP (in trillions of dollars) for the 2010 Olympics. Each point represents one of the 26 countries that have received a medal over the past few games:
The line on the plot is the line of best fit for the scatterplot data. It has a slope of 1.89, meaning that for every additional trillion dollars of GDP, a country can expect another 1.89 medals at the Winter Olympics. However, the correlation isn't particularly strong (the correlation coefficient is around 0.60), meaning that this model won't always be a great predictor of Olympic success. This is clear from the plot, since there are many points that are far from the line.
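If you'd like to see where a slope and a correlation coefficient come from, here is a minimal Python sketch of the standard least-squares calculation. The function name `fit_line` is our own, and the GDP and medal numbers are made-up toy values, not the real 2010 data, so don't expect it to reproduce 1.89 or 0.60.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # extra medals per extra unit of x
    intercept = mean_y - slope * mean_x  # the line passes through the mean point
    r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical toy data (NOT the actual 2010 figures)
gdp_trillions = [0.5, 1.0, 2.0, 3.0, 5.0, 14.0]
medals = [9, 3, 15, 11, 5, 30]

slope, intercept, r = fit_line(gdp_trillions, medals)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

A value of `r` near 1 means the points hug the line; a value near 0 means the line tells you very little.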
Latitude turns out to be an even worse predictor, with barely any correlation at all:
Past Olympic success, however, turns out to be a great predictor of future performance. Here's a plot of 2010 medal count vs. 2006 medal count. The correlation coefficient here is 0.91, and the slope is around 0.95, meaning that each medal a country won in 2006 corresponds to roughly 0.95 medals in 2010. In other words, under this model a country can expect to finish close to its previous haul four years later.
But while this model is the strongest, that doesn't mean it's always going to deliver great results. Based on its 2006 performance, for example, this model predicted that the US would win around 27 medals, not the record-breaking 37 medals it ended up with in Vancouver. And in general, you can see that each model is a pretty good predictor for some countries, and a weaker predictor for others. (In other words, in each scatterplot, some points are close to the line, and some are far away.)
One problem is that we're only looking at one variable (GDP, latitude, or past success) in isolation. But what if we combined several variables into a single model? In fact, this is just what brothers Dan and Tim Graettinger did. They created a model that takes into account four different factors when trying to predict Olympic success. Here's a plot of how their predictions panned out: the x-axis represents how many medals they predicted a country would win, while the y-axis represents how many medals the country actually won. Points on the dashed line are where their predictions were exactly right; countries above the line over-performed, and countries below it under-performed.
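To get a feel for what "combining several variables" can look like, here is a rough sketch of an ordinary least-squares fit with two predictors, written in plain Python. This is not the Graettingers' method, and it does not use their four factors, which aren't listed here. The helpers `solve`, `fit_multiple` and `predict` are our own illustrative names, and the data is invented.

```python
def solve(a, b):
    """Solve the square system a*x = b by Gaussian elimination with partial pivoting.
    Assumes the system has a unique solution (no zero pivots)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

def predict(coeffs, row):
    """Predicted medal count for one country's predictor values."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))

# Hypothetical toy data: (GDP in trillions, medals at the previous games)
predictors = [(0.5, 8), (1.0, 4), (2.0, 14), (3.0, 10), (5.0, 6), (14.0, 25)]
medals = [9, 3, 15, 11, 5, 30]

coeffs = fit_multiple(predictors, medals)
print("coefficients:", [round(c, 2) for c in coeffs])
print("prediction for (2.0, 12):", round(predict(coeffs, (2.0, 12)), 1))
```

Each coefficient plays the same role as the slope in a one-variable model: it estimates how many extra medals go with one extra unit of that factor, holding the others fixed.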
Intuitively, it seems like this model should be the best predictor of all, since it incorporates the most information. But what do you think? Do you think this model really is the best predictor, or do you think the errors from one of the previous models were smaller? And, in general, is more information always better when trying to predict some unknown event in the future?
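One way to make "smaller errors" concrete is to measure each model's typical miss. The sketch below uses root-mean-square error (RMSE), which is just one reasonable choice of yardstick. The function name `rmse` and every number in the lists are hypothetical, so the output says nothing about which of the real models is better.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root-mean-square error: the typical size of a model's miss, in medals."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Hypothetical toy numbers for five countries (NOT real results)
actual = [37, 30, 26, 23, 15]
one_variable_model = [27, 28, 20, 24, 19]
multi_variable_model = [30, 33, 24, 20, 14]

print("one-variable RMSE:  ", round(rmse(one_variable_model, actual), 2))
print("multi-variable RMSE:", round(rmse(multi_variable_model, actual), 2))
```

Whichever model has the lower RMSE missed by less on average. Because the misses are squared, one large miss, like predicting 27 medals for a country that wins 37, counts for more than several small ones.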
Teachers: interested in having this conversation with your students? Then check out our new lesson, Hitting the Slopes.