Neither do I think the issue is one of validation, because there is no standard against which to judge the outcome. Just because you, me, and gkelly like Kozuka's performances the best, that does not mean that there is something wrong with a judging system that produces a different result.
To me the question is, what scoring system best serves the interests of skating. Arguments that begin with, "give the paying customer what he wants" are simply dismissed. The ISU invented CoP scoring in order to make figure skating more like a real sport. In this, they have succeeded. In turn, they sacrifice that which makes figure skating different from other sports. (That is the chronic complaint by this fan.)
Last edited by Mathman; 02-26-2012 at 03:44 PM.
The reason that flip-flops seem weird is that you are keeping score continuously as the event transpires. If instead you withhold tabulating partial results until the event is complete, then run all the data through the formulas at once, flip-flops disappear.
(Instead we are left with violations of the principle of independence of irrelevant alternatives. No biggie, IMHO -- although Mr. Cinquanta, coming from a timed sport, didn't like it.)
Last edited by Mathman; 02-26-2012 at 03:46 PM.
It becomes a validity issue when there is a "standard", whether it be "best serving the interests of skating", expert consensus, or political concerns.
But nobody knows what is in the best interests of skating, there is no consensus among experts, and political concerns -- well, the heck with that.
By default I think all we can really subject to analysis is whether particular events contested under a particular set of rules came out right according to the rules of engagement. (Plenty to argue about right there!)
I made no commitment as to what method should be used to combine the short and long program standings. Nor did I watch, much less "judge," the short programs.
If I thought Takahashi's superiority over Kozuka in the short program was greater than Kozuka's superiority over Takahashi in the long, then I would have wanted Takahashi to win the whole thing, but since I haven't watched their SPs yet I have no opinion.
And if I thought Hanyu's superiority over Takahashi in the long program was greater than Kozuka's superiority in the short, then I would have wanted Hanyu to place higher overall, regardless of how many other skaters placed in between in either program.
I don't like the head-to-head approach used in some pro-ams and the Grand Prix Final ca. 2000 where the skater in third after the first phase was skating for bronze no matter how much better s/he was than the top two from the first phase.
I also don't like the setup in which you can move ahead of anyone one or two places ahead of you just by beating them in the final, and can be overtaken by anyone one or two places behind you just by losing to them in the final, regardless of how much better or worse you were in each phase. Yet to overtake or be overtaken by someone three places ahead or behind, the other skaters have to finish in just the right order, even if none of them was close to the person you were trying to overtake.
When the rankings were purely comparative, that was the best system available for combining results from the two phases. But it led to paradoxes and was often counterintuitive.
But if it's possible to quantify, even roughly, the differences in quality between skaters in the first phase, why carry over placements at all? If skater A places 1st in the first phase by a landslide, and skaters B and C are 2nd and 3rd but virtually tied (on points or ordinals, as the case may be) with D, E, and F in 4th, 5th, and 6th, then why should C have a chance to win the whole enchilada and not F?
Again I was comparing the 6.0 model with the CoP model. All skaters have to go all out in the short program. The top skaters would have to go all out so that they would enter the free skate with a high enough placement to give them a realistic chance to win.
Well, if they're trying to qualify for the final, of course they'll aim to do enough to qualify, which for a mid-ranked skater might mean going all out. A lower ranked, lower skilled skater may have little hope of qualifying, so this first phase would be the whole competition for them and they'll want to place as high as possible, even if that's next-to-last instead of last.
The lower-rank skaters would go all out because -- well, what are you there for, if not to go all out?
But for a skater who is pretty sure to qualify for the final, pretty sure not to medal there, it might make more sense to save themselves in the semifinals and go all out in the final aiming for top 10 or better in the phase that means more.
Ah. Well, that has been a place for the ISU to experiment in the past, as have the ISU "opens" (which are really invitationals even less open than the GP).
Passing on to plan B...
I was thinking more of the Grand Prix events than the World Championship.
Do whatever you want with those. It would be nice if they're meaningful in some way, but ultimately they're footnotes in the record books that will focus on the championships.
And national championships will probably follow the format of ISU championships to the extent feasible.
The Title Sponsor Made-for-TV Extravaganza can make up its own rules and maybe earn higher TV ratings and offer bigger prize money. I'd hope the ISU would allow top skaters to do these in addition to or instead of the championships and to go back and forth based on their individual needs at the time.
How is it chosen which skaters qualify for or are invited to the Extravaganzas?
I'm still more interested in how to equitably measure the best all-around skater in the world (the purpose of championships and events by which one qualifies to compete there) and what subsets of skills might be worth their own competition phase or own separate championships.
Here is my thought: My concept of Sport is more of a criterion-referenced measure than of a social choice where one decides relative preference among candidates. In sport, I'm used to the concept of measuring distance, speed, number of balls.....Those are all criterion-based.
Last edited by skatinginbc; 02-26-2012 at 04:22 PM.
If enough knowledgeable experts offer good reasons why Q was better, I may change my mind or at least agree it could have gone either way based on personal preferences, which we all have and can be applied perfectly honestly -- and I may learn more details to pay attention to next time.
What drives me crazy is if I think P was better, some or most or all of the judges think P was better, but people insist that Q was better and anyone (i.e., judges) who thought P was better must be wrong, probably blind or corrupt. Hey, I had good reasons for preferring P. I'm willing to listen to your reasons for preferring Q, I might change my mind, but don't tell me I couldn't have reached my opinion legitimately. And maybe listen to my reasons for believing the opposite. That way we can both learn something.
And no, "P fell and Q didn't" is not a good enough reason on its own to insist Q was better. It would definitely be a part of the equation, but not the whole answer by itself.
Sometimes we all agree on who was better in the short program and who was better in the long, but if P and Q were each better than the other in one of the programs we may disagree on how to put those two phases of the event together to determine the final results. That seems to be what this thread is about -- not who was better in each program, but how to combine the results.
Obviously if there are only two competitors and the second phase counts twice as much as the first, then the results of the second phase determine the final results. With more competitors and a ranking-based system, it's never that simple.
Speaking of "validation," by the way, when the CoP was in its developmental stages in 2003 the ISU conducted a lot of retro-scoring exercises to make sure that the CoP wasn't completely out in left field, when contrasted with results from ordinal judging.
One of the events they re-scored was the 2002 Olympics. Under the retro-scoring, Tim Goebel won, beating Alexei Yagudin (not to mention Evgeni Plushenko) by doing three quads.
The ISU immediately lowered the base value of quads to prevent such an obvious anomaly.
At Cambridge University they have an annual intramural rowing contest that has been going on for hundreds of years between the various colleges and clubs. Each race starts out with the boats in the water in the same order as they finished last year's race, a few boatlengths apart. Then you row like crazy.
If you catch up to and bump the boat in front of you before the boat in back of you bumps you, then you move up one place for next year's race. If you get bumped from behind, your place and that of the boat behind are flip-flopped for next year.
If what you're measuring is primarily based on personal preference or qualitative perception of the same skills, then pure ranking makes sense.
If what you're measuring is based on objectively quantifiable skills, then pure absolute scoring of each of those skills makes sense.
Figure skating has always been qualitative, and originally it put the most emphasis on everyone doing the exact same skills to compare who did them best, which is why it originally developed a qualitative comparative scoring system.
As the sport developed, the variety of skills that different skaters can choose to include in their performances has multiplied. So has the degree to which some important technical skills (primarily jumps, and number of "features" on other elements in the current system) can be quantified. The quality of execution of those skills is also still considered important. The way that programs are put together can to some degree be quantified, and qualitative perception of "artistry" (however we define it) is also considered important. Not just in terms of personal preference, but also in terms of how artistic execution of the technical skills demonstrates superior command of the techniques.
If the scoring remains based purely on rankings, the quantifiable and obvious aspects of the performances are not accounted for in a transparent way. Hence a push toward quantifiability by people within the sport who are more interested in recognizing those athletic skills, and by non-figure skaters in the ISU (speedskaters) and in the IOC who are more comfortable with objective measurements.
Technology, such as slow-motion replay and computers that can handle multiple complex rules and numbers, allows more accuracy in scoring especially at the most important competitions. For smaller competitions, the technological and human resources required may not always be financially feasible.
But figure skating doesn't want to give up its qualitative aspects either. How well something was done continues to matter just as much as exactly what was done. And transcending technical content to produce an aesthetically pleasing performance, even to the point of connecting with audiences on an emotional and artistic level, is still valued. And still subject to personal perception and personal preference.
So how can both the quantitative and the qualitative aspects of evaluation each be given appropriate weight?
I think breaking down the scores into element base marks, grades of execution, and program components is a good approach. How they're each translated into numbers is more debatable.
And I expect that some years, perhaps decades, down the line more technology will allow more objective measurements of the "what" that will include some aspects of the "how well" (measuring not only jump rotation but also height and speed -- or rotational speed in spins).
Can there be better ways to measure, score, and report scores for qualitative aspects to reward them appropriately? On the one hand, if a skater has such mastery of the use of her blades and body that she can become one with the music in her skating and bring a whole panel of judges to tears, we want a way to reward that more highly than just executing the same technical skills with the same success and generally staying on time with the music. On the other hand, we don't want judges who are moved by a skater's personal story off ice (or by bribes or threats from the skater's supporters) to overreward based on personal preference at the expense of analytical evaluation of the technical skills.
So there's lots of room to figure out better solutions for balancing the objective and subjective aspects of the scoring.
And if the fairest ways become too expensive, how can the system be simplified for use below the level of the most important competitions that attract sponsors?
I guess there is a winner in that after each race either the boat in front is still in front or else the second place boat caught up and is now the leader. I think the goal is to stay in front for as many years in a row as possible.
I am not sure about now, but still in the 1970s this was a very popular event and almost every male student in the college was expected to take part. So each house might have its top boat, its number two boat, down to eight boats or so. If your boat number eight could catch up to the rival fraternity's boat number seven, then they owed you a round of ale.
By the way, physicist Stephen Hawking coxed for the Oxford Crew in the early 1960s, before he became ill.
(*Except as noted below)
There can be ordinal flipflops as well, when judges disagree on who did better in the same program. It can happen in a short program or in an event with only one phase.
Simple example with 7 judges:
Skater    J1 J2 J3 J4 J5 J6 J7  Majority
Camille    1  1  1  1  2  2  2  4/1
Babette    2  2  2  2  1  1  1  7/2
Head to head, judges disagree on who was better, but with this particular panel Camille is ahead of Babette. Under the majority system, Camille is currently in first place with a majority of 4/1; Babette's majority is 2nd place, by all judges (7/2).
Now Annemarie skates her program and the judges continue to disagree on whether she was better than Camille and Babette or worse than both of them:
Skater     J1 J2 J3 J4 J5 J6 J7  Majority
Annemarie   3  1  1  1  2  2  2  6/2
Babette     2  3  3  3  1  1  1  4/2  TOM = 5
Camille     1  2  2  2  3  3  3  4/2  TOM = 7
Camille has lost her majority for first place. Actually no one has a majority for first place, so first place will be decided by who has the most ordinals for second place or better. That would be Annemarie, so she takes over the lead.
Now what happens to Camille and Babette? They both have 4 ordinals of 2nd or better, which makes a majority out of 7. But Babette has more 1s than Camille, so her total of ordinals in the majority is lower (TOM = 5 vs. 7), and Babette moves ahead of Camille in the standings.
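For anyone who wants to check the arithmetic, here is a rough Python sketch of the flipflop above. It is a simplified reading of the majority system, not the official rulebook: skaters are ordered by the lowest place for which they hold a majority, then by the size of that majority, then by TOM, then by total ordinals. The function names (`ordinals`, `majority_key`, `standings`) are made up for this post, and each judge's preference order is reconstructed from the ordinals in the example.

```python
def ordinals(rankings, skaters):
    """Ordinals each judge would give if only `skaters` had skated so far."""
    marks = {s: [] for s in skaters}
    for order in rankings:
        present = [s for s in order if s in skaters]
        for place, s in enumerate(present, start=1):
            marks[s].append(place)
    return marks


def majority_key(marks):
    """Sort key under a simplified majority rule (an assumption, see above)."""
    need = len(marks) // 2 + 1  # e.g. 4 of 7 judges
    for place in range(1, max(marks) + 1):
        within = [m for m in marks if m <= place]
        if len(within) >= need:
            # lowest majority place, bigger majority, lower TOM, lower total
            return (place, -len(within), sum(within), sum(marks))


def standings(rankings, skaters):
    marks = ordinals(rankings, skaters)
    return sorted(skaters, key=lambda s: majority_key(marks[s]))


# One preference order per judge, best first (reconstructed from the example).
rankings = (
    [["Camille", "Babette", "Annemarie"]]
    + [["Annemarie", "Camille", "Babette"]] * 3
    + [["Babette", "Annemarie", "Camille"]] * 3
)

print(standings(rankings, ["Camille", "Babette"]))
print(standings(rankings, ["Camille", "Babette", "Annemarie"]))
```

With only two skaters in, Camille leads Babette. Once Annemarie is included, the same seven judges produce Annemarie, Babette, Camille, even though no judge changed her mind about Camille versus Babette.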
In this case you have to look at Camille's lead after 2 skaters as provisional, not absolute. There wasn't a clear consensus that Camille was better than Babette, and as soon as other skaters got into the mix the waters got more muddied. The more skaters in the competition at about the same overall level (including better skaters having a bad day and weaker skaters having a great day), the more possibility for mixed ordinals and flipflops.
(I won't get into OBO calculations because I never fully mastered them. I believe Camille would stay ahead of Babette in this example under OBO, but flipflops could still have been possible especially if a 4th skater is brought into the equation)
If this happens in the short program, it's confusing while the short program is in progress, but then at the end of the day the standings are fixed and it doesn't matter that Camille was ever higher in the standings than Babette early on.
*Put one more skater, Denise, in front of everyone else. Now, if Babette beats Denise in the free skate, Babette can win. If Camille beats Babette in the free skate, she cannot win unless someone else also beats Denise but not Camille.
But a majority of judges preferred Camille's performance to Babette's. So why would Babette deserve to win and not Camille?
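To make the Denise scenario concrete, here is a small Python sketch of factored placements. The assumptions are mine: the short program placement is multiplied by 0.5 and the free skate placement by 1.0, the lower total wins, and ties go to the better free skate. The short program order (Denise, Annemarie, Babette, Camille) follows the example, while the free skate orders are invented purely for illustration. `final_standings` is a hypothetical helper name.

```python
def final_standings(short, free):
    """short, free: skater names in finishing order for each phase."""
    def total(skater):
        # Assumed factors: 0.5 for the short program, 1.0 for the free skate.
        return 0.5 * (short.index(skater) + 1) + 1.0 * (free.index(skater) + 1)

    # Lower total is better; ties are broken by the free skate placement.
    return sorted(short, key=lambda s: (total(s), free.index(s)))


short = ["Denise", "Annemarie", "Babette", "Camille"]

# Babette (3rd in the short) wins the free skate with Denise 2nd.
babette_wins_free = final_standings(
    short, ["Babette", "Denise", "Annemarie", "Camille"])

# Camille (4th in the short) wins the free skate with Denise 2nd.
camille_wins_free = final_standings(
    short, ["Camille", "Denise", "Babette", "Annemarie"])

# Camille wins the free skate and Annemarie also beats Denise.
camille_with_help = final_standings(
    short, ["Camille", "Annemarie", "Denise", "Babette"])

print(babette_wins_free[0], camille_wins_free[0], camille_with_help[0])
```

Under these assumptions Babette takes the title just by winning the free skate, while Camille wins the free skate and still finishes behind Denise unless a third skater pushes Denise down to 3rd in that phase.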
If the ordinal flipflops happen during the long program along with factored placement flipflops, then it gets even more confusing.