Awright, you talked me into it.
Originally Posted by Mrs. P
(Caveat: this is in no way comprehensive or systematic, just some early impressions)
-Firstly, Serious Business gives an excellent explanation of what the output of Nate Silver's "prediction model" represents. It is a probability distribution of potential outcomes (more simply, what are the odds that any defined possibility will become reality?).
-Let's compare the subject of Nate Silver's most celebrated work, presidential elections, with things like figure skating championships (or the Super Bowl). What are some of the salient resemblances, and differences, and what are the ramifications for the analytical results?
The primary point of resemblance is pretty obvious: both elections and sporting events are contested activities, competitive pursuits. The primary purpose, PC niceties aside ("they are all winners..." etc. etc.), is to identify the champeen. One might therefore assume that the analytical expectations would also be largely the same. But I would argue that such an assumption would be wrong.
Why is it that statistical analysts such as Nate Silver can often state that there is a better than 95% probability (and sometimes 99% and beyond), in other words, that it is a mortal lock, that so-and-so will win the election, and can say this before election day? If someone made a similar prediction for ISU Worlds or the Olympics (or the Super Bowl), and claimed it was an objective, probabilistic analysis, my advice would be: first, check to make sure your wallet is intact, and then run for the door as fast as you can. What accounts for this difference?
It's because the results of an election are to a large extent pre-ordained by the time we get to election day. There are two reasons for this. First, it can be demonstrated that voters generally make up their minds before going to the ballot box, and the closer we get to the election date, the closer the percentage of decided voters gets to 100%. Second, because of the trend toward increased early voting, many actual votes have already been cast before the official Election Day. Thus the votes, for all practical intents and purposes, have been locked in, and we know the score through robust polling and analysis.
In an election, this is why the projected winner's probability for success gradually increases in the days before the election; it basically reflects the fact (calculated from historical data) that the number of people who might change their minds has become too small to meaningfully alter the result, barring some bizarre game changer or Act of God.
This would be similar to doing a prediction for the outcome of, say, a best-of-seven-games contest such as the World Series, with the proviso that the prediction is iterated after the result of each game played. Like Nate Silver's election predictions, a team that won, for example, the first three games would see their probability rise to something very high, because 75% of their requirement for winning has been locked in.
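To put rough numbers on the World Series comparison, here is a minimal sketch in Python. It assumes (my assumption, not anything taken from Nate Silver's model) that every game is an independent coin flip with a fixed win probability, 50/50 by default. The function name and parameters are made up for illustration.

```python
from functools import lru_cache


def series_win_probability(wins_a, wins_b, p_game=0.5, wins_needed=4):
    """Chance that team A takes the series, given the current score.

    Assumption for illustration only: each game is independent and
    team A wins any single game with probability p_game.
    """

    @lru_cache(maxsize=None)
    def prob(a, b):
        if a == wins_needed:   # A has already clinched
            return 1.0
        if b == wins_needed:   # B has already clinched
            return 0.0
        # Either A wins the next game or B does.
        return p_game * prob(a + 1, b) + (1 - p_game) * prob(a, b + 1)

    return prob(wins_a, wins_b)


if __name__ == "__main__":
    for score in [(0, 0), (1, 0), (2, 0), (3, 0)]:
        print(score, series_win_probability(*score))
```

With evenly matched teams, the forecast before any game is played is a coin flip (0.5), but with a 3-0 lead it is 0.9375, since the trailing team would have to win four straight. The "very high" number comes from the locked-in results, not from any special insight into the teams.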
In figure skating terms, it might be analogous to making predictions after the SP has already been skated, because roughly a third of the winning requirement has been locked in, and therefore the realistic final outcomes have started to narrow considerably.
In all three cases (the election, the World Series, the figure skating championship), if the forecasts were made prior to any results being locked in, they would all be much less predictively powerful, and possibly of a similar order of magnitude to each other (i.e. not that great).
One might argue that in skating, the "reputation" aspect of scoring is in some ways analogous to election votes being "locked in". But assuming for the sake of argument that reputation points exist, IMO such points are much more a "soft circle" phenomenon as compared to eve-of-election voter preferences, and they constitute a fairly small percentage of the total required to win. PCS may represent about a third of the points total, but not all of it can be assigned purely by reputation; the skater has to actually execute the program to a certain standard for a "reputation" bonus to come into play. I would think that such a rep bonus constitutes no more than 5-10% of the total score on the high side, again, assuming that it actually exists.
-While we may think that there are plenty of data points for making statistical predictions for podium results in figure skating, there actually aren't, in my view. Consider: in baseball, each team plays 162 games in the regular season, a wealth of data that can be mined. And this is a series of data that goes back, in recognizable form, for a century. Even so, it is difficult to make predictions about who will win the World Series based on regular season results (again, if one does iterative predictions after each World Series game, then it is a different story).
How many events are there in a skating season? A typical international schedule would be a couple of GP events, perhaps the GPF, possibly a Euro or 4CC, and then Worlds. Five events.
-A figure skating championship, like the Super Bowl, is a one-off rather than an extended series. It is not conceptually unreasonable to think that chance and fortune can play a much greater role than for events in which a contestant has multiple bites at the apple.
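The "multiple bites at the apple" point can be sketched with a little arithmetic. The Python below assumes, purely for illustration, that the stronger contestant wins any single meeting with a fixed probability and that meetings are independent; the 0.55 figure and the function name are hypothetical, not measured from any sport.

```python
from math import comb


def best_of_n_win_probability(p_game, n_games=7):
    """Chance that the stronger side wins a best-of-n series.

    Assumption for illustration only: every game is independent and
    the stronger side wins each one with probability p_game.
    """
    wins_needed = n_games // 2 + 1
    q = 1 - p_game
    # Sum over the number of losses (k) suffered before the clinching win.
    return sum(
        comb(wins_needed - 1 + k, k) * p_game**wins_needed * q**k
        for k in range(wins_needed)
    )


if __name__ == "__main__":
    p = 0.55  # hypothetical single-game edge for the stronger side
    print("one-off:     ", best_of_n_win_probability(p, n_games=1))
    print("best of seven:", best_of_n_win_probability(p, n_games=7))
```

Under these assumptions, a side with a 55% edge in a one-off wins a best-of-seven roughly 61% of the time. The longer the series, the more the underlying edge shows through and the less room luck has; a one-off gives luck its maximum say.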
-A skating competition is highly compressed relative to other sports. A few minutes on ice divided into two periods. A "clean" skate is the goal; a rough patch of ice surface, a momentary lapse of concentration or fatigue, even one such instance can, as a matter of course, put the top of the podium out of reach, in a way that does not often happen in a contest where each game lasts for several hours and you need to win four of them.
-This is why I admire figure skaters. The amount of mental fortitude required to accept and overcome such impossible demands in the pursuit of a championship engenders a certain amount of awe.