Ladies Free Skate and Results + GPF Finalists | Page 3 | Golden Skate


moyesii

Rinkside
Joined
Nov 28, 2003
Mathman,
I understand what you're saying. Under the CoP, random sampling of the panel of judges statistically has no greater effect on the outcome than the original draw of the panel of judges. This is because the variability in the judges' marks, as pointed out by Sandra Loosemore, will in effect cause any panel of judges to produce a different result under the CoP.
If you draw a sample from a large population, then of course the results may be different if you choose sample X than if you choose sample Y. This is true for any statistic that can be extracted from a sample, whether it is the sum of a lot of component scores (CoP) or whether it is an average ordinal placement (OBO, for instance).
However, the ordinal system will often produce the same results even in a close competition with any given panel of judges (barring bloc judging). There will be some error within each sample (ex. 7-2, 6-3), but the sampling distribution WILL approach the mean (9 judges unanimous). Under the CoP, any sample of judges will NOT belong to the same distribution as another. This is because of the subjectivity of the component marks (which are scored on an absolute scale, contributing to error), and because of the huge variability between judges in the component marks, as discussed by Sandra Loosemore. Therefore the results of any competition under the CoP cannot be said to be meaningfully representative of anything but the given set of judges on the panel, and only those whose scores are counted.
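For readers following the statistics, the panel-to-panel variability being described can be sketched in a few lines of Python. All numbers here are invented (the skater's notional level, a per-judge noise of 6 points, and a 9-judge panel), not actual CoP data:

```python
import random
import statistics

random.seed(5)

# Invented numbers: a skater's notional level and per-judge scoring noise.
TRUE_SCORE, NOISE, PANEL = 196.0, 6.0, 9

# Score the same performance with 5000 independent hypothetical panels.
panel_means = [
    statistics.mean(random.gauss(TRUE_SCORE, NOISE) for _ in range(PANEL))
    for _ in range(5000)
]

spread = statistics.stdev(panel_means)
print(f"panel-to-panel spread of the mean score: {spread:.2f} points")
```

Under these assumptions the spread comes out near NOISE/sqrt(PANEL) = 2 points, the same order of magnitude as the +/- 2.00 figure discussed later in the thread.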
 
Joined
Jun 21, 2003
Therefore the results of any competition under the CoP cannot be said to be meaningfully representative of anything but the given set of judges on the panel, and only those whose scores are counted.
I agree 100%. That is what I have been trying to say, but here you have said it much more succinctly. That is what I meant earlier when I said that I didn't buy the whole "sampling theory" paradigm in the first place.

Unfortunately, clever old Speedy, with his second random draw, has so confused the issue that everybody is arguing about the things that we have been arguing about on this thread, and letting the real point go by. I think that he will drop it (the random draw thing, not the CoP) before the next Olympics. It accomplishes nothing except to confuse and possibly to anger the audience. Cinquanta will probably make a big deal of dropping it, as if making a concession to the reform forces.

The only thing that I would add to your quote is (here is where we probably disagree): "...and the same is true for ordinal placements."

I do not, however, think that this is a bad thing. That's just how it is. A skater has to win favor with the judges that are scoring the event. Period.

Mathman
 

moyesii

Rinkside
Joined
Nov 28, 2003
The only thing that I would add to your quote is (here is where we probably disagree): "...and the same is true for ordinal placements."
And I would add: "but to a much, much lesser extent." We agree that no system is 100% reliable, but the ordinal system IS more robust than the CoP.

It has been a while since I did stats, but I think it has to do with standard error. The error and variability in the judges' scores are so large that we cannot be confident that the total scores and placements as determined by CoP are within a reasonable margin of an actual population value. The standard error for each score would be greater than the point difference between closely ranked skaters, and therefore we cannot conclude that the results obtained by CoP are meaningful. Results from Trophee Lalique would look something like:

2 Kevin VAN DER PERREN BEL 197.33 +/- 2.00
3 Michael WEISS USA 195.98 +/- 2.00
4 Brian JOUBERT FRA 195.58 +/- 2.00
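To put numbers on the claim that the standard error can exceed the margin of victory, here is a back-of-the-envelope calculation. The per-judge spread of 6 points is an assumption chosen to reproduce the +/- 2.00 above; the totals are from the list:

```python
import math

sigma = 6.0                  # assumed spread of a single judge's total, in points
n = 9                        # judges whose marks are counted

se = sigma / math.sqrt(n)    # standard error of the panel mean
margin = 195.98 - 195.58     # Weiss vs. Joubert

print(f"standard error: {se:.2f} points")
print(f"margin:         {margin:.2f} points")
print(f"margin resolvable at ~95%: {margin > 2 * se}")
```

With these assumed numbers the 0.40-point margin is well inside two standard errors, which is exactly the complaint being made.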

Under the ordinal system, we can be certain that different sets of judges will result in different composition of ordinals, such as 7-2 or 6-3, but they WILL result in the same final placements since it doesn't matter if a skater wins by 7-2 or 6-3. The ordinal system leaves less up to random chance and error than does CoP.
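The robustness claim itself can be tested in simulation. In the sketch below (all parameters invented), nine judges score two close skaters, and one judge lowballs skater A by ten points; the majority-of-ordinals verdict shrugs the outlier off far more often than the point total does:

```python
import random

random.seed(7)

PANEL, TRIALS, NOISE = 9, 2000, 1.0
TRUE_A, TRUE_B = 100.0, 99.0    # A is genuinely a little better
BIAS = -10.0                    # one judge lowballs skater A

ordinal_flips = points_flips = 0
for _ in range(TRIALS):
    a = [random.gauss(TRUE_A, NOISE) for _ in range(PANEL)]
    b = [random.gauss(TRUE_B, NOISE) for _ in range(PANEL)]
    a[0] += BIAS                          # the biased judge's mark for A
    votes_for_a = sum(sa > sb for sa, sb in zip(a, b))
    ordinal_flips += votes_for_a < 5      # majority of judges prefer B
    points_flips += sum(a) < sum(b)       # B wins on total points

print(f"ordinal result flipped:     {ordinal_flips} / {TRIALS}")
print(f"point-total result flipped: {points_flips} / {TRIALS}")
```

Under these assumptions the point total flips in a majority of trials while the ordinal majority flips only in a small fraction. Note the caveat: against ordinary Gaussian noise with no outliers the two methods perform much more alike, so this illustrates robustness to a rogue mark, not superiority in general.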
I do not, however, think that this is a bad thing. That's just how it is.
Of course it would be a bad thing. You've just been disillusioned by bloc judging into thinking that the ordinal system isn't capable of producing results with certainty. I would not doubt, however, MK's win in 1996 or Lipinski's win in 1998. On the other hand, the CoP, even barring bloc judging and cheating, will still produce dubious results because of the error and variability in human judgment for a system that requires humans to function like machines in order for the results to be reliable and valid. All these confounding variables will make the CoP a dangerous system, esp. in the face of controversial results. The confounding variables only contribute to the lack of accountability for the ISU and the judges. As I said, it puts the entire results of the competition in the hands of fate.
 
Joined
Jun 21, 2003
Wait a minute, let's think this through! We just agreed that the voting judges are representing only themselves and not some kind of mythical larger population. Therefore the standard error is zero. For example, the standard error of the mean, for a sample of size n taken from a population of size N with per-judge standard deviation sigma, is: S.E. = (sigma/sqrt(n)) * sqrt((N-n)/(N-1)). Since the judges represent only themselves, the sample is the population, n = N, and the standard error is zero.

This is just common sense. The "statistical error" (sampling error) means the difference between the characteristic of the population that we are studying and the corresponding characteristic of the sample. The "standard error" is the standard deviation of the sampling errors taken over all possible samples. If the sample represents only itself, as we just agreed, then there is only 1 possible sample (itself) and no possibility of any statistical error whatsoever in the CoP.
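Mathman's finite-population point can be checked directly. A small sketch (sigma is an arbitrary assumed per-judge spread):

```python
import math

def se_mean(sigma, n, N):
    """Standard error of the sample mean with the finite-population
    correction sqrt((N - n) / (N - 1)) applied."""
    return (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))

sigma = 6.0  # assumed per-judge spread, for illustration only
print(se_mean(sigma, 9, 14))   # 9 counted judges drawn from a 14-judge pool
print(se_mean(sigma, 9, 9))    # the sample IS the population: SE is 0
```

When n = N the correction factor vanishes, which is the whole of the "sample is the population" argument in one line.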

(There still might be "human error," I suppose, like, "Oops, I reached for the +2 button but I accidentally hit the +3 button by mistake.")

To me, this is an attractive feature -- no statistical error is possible in the results. The scores I give are the scores I give. Once the panel is seated, nothing is in the hands of fate. Statisticians, go home.

The other thing that I hope we have come to agreement on is that the random draw thing is a red herring that has nothing to do with anything, and we are just being silly if we mention it any more. (But help me pass the word to other figure skating forums, LOL.)

Mathman
 

moyesii

Rinkside
Joined
Nov 28, 2003
Speedy? Is that you? Please don't play with my words, Mathman.
We just agreed that the voting judges are representing only themselves and not some kind of mythical larger population. Therefore the standard error is zero... Since the judges represent only themselves, the sample is the popluation, n = N, and the standard error is zero.
I can't believe that you, a "mathman," would try to use stats to distort what's going on. The reality is that the CoP aims to produce a sample result (scores) that are representative of the larger population. Through stats, we came to the conclusion that the sample does NOT represent the population... We determined that the standard error in the CoP results was too large to say with confidence that the results were part of the sampling distribution or if they are a separate distribution (population) altogether. In fact, we could take two different random samples of the judging panel and they could produce total scores for a skater that would fall within a standard dev. of each other. The problem is that much of the time, the difference in scores for a skater in different data sets would be greater than the margin of victory of one skater over another in the results. This is why I say that the placements, as determined by the CoP, are not valid and not reliable.
The "statistical error" (sampling error) means the difference between the characteristic of the population that we are studying and the corresponding characteristic of the sample. The "standard error" is the standard deviation of the sampling errors taken over all possible samples. If the sample represents only itself, as we just agreed, then there is only 1 possible sample (itself) and no possibility of any statistical error whatsoever in the CoP.
We cannot agree that the sample represents "only itself," and take that as the starting point in a discussion. It is a conclusion that must come from somewhere. We come to that conclusion based on the data -- results from actual competitions -- and statistical analyses. In statistical calculations, we often don't have the value of certain population parameters. Therefore, we use the sample statistics to estimate the population value. (Some one-sampled t-test or something.) What we would conclude from the t-test is that the sample of scores produced by the CoP has too great a standard error to say with confidence that the scores are representative of the population. Note that the stats involved are still legit even though we concluded that the sample was not.
(There still might be "human error," I suppose, like, "Oops, I reached for the +2 button but I accidentally hit the +3 button by mistake.")
Not quite. Since each judge contributes to the final outcome (By outcome I mean score, not placements), the subjectivity involved in choosing a mark on an absolute scale introduces enormous amounts of error into the results (see Rossano). Every single judge will introduce error into the results. The CoP demands an exactness of judging that is humanly impossible. That's why the ordinal system was invented.
To me, this is an attractive feature -- no statistical error is possible in the results. The scores I give are the scores I give. Once the panel is seated, nothing is in the hands of fate. Statisticians, go home.
I guess we might as well have just one judge judging the entire event under CoP, since statistically it's all the same. :/ And I know you'd say Yes to that.
The other thing that I hope we have come to agreement on is that the random draw thing is a red herring that has nothing to do with anything, and we are just being silly if we mention it any more. (But help me pass the word to other figure skating forums, LOL.)
This is what I said about the random draw: Yes, it IS statistically significant, because it WILL make the results different from those of the larger panel of judges. However, it is insignificant only in the sense that no results produced by the CoP are fair, reliable, or valid. Therefore any CoP results from one set of judges are just as good (bad) as another.
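Whether the second random draw can actually change an outcome is easy to explore by brute force: enumerate every 9-judge subset of a hypothetical 14-judge panel and count how many subsets reverse the full-panel result. The marks below are invented, with the two skaters about 0.1 points apart per judge:

```python
import itertools
import random

random.seed(3)

JUDGES = 14
# Invented marks for two closely matched skaters.
skater_a = [random.gauss(14.1, 0.8) for _ in range(JUDGES)]
skater_b = [random.gauss(14.0, 0.8) for _ in range(JUDGES)]

full_a_wins = sum(skater_a) > sum(skater_b)

draws = list(itertools.combinations(range(JUDGES), 9))
flips = sum(
    (sum(skater_a[i] for i in c) > sum(skater_b[i] for i in c)) != full_a_wins
    for c in draws
)

print(f"{flips} of {len(draws)} possible 9-judge draws reverse the full-panel result")
```

The closer the skaters, the more of the 2002 possible draws disagree with the full panel; with a wide margin, essentially none do.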
 

moyesii

Rinkside
Joined
Nov 28, 2003
One thing I want to make clearer by way of illustration. If we assume that the standard error is 2.00 (probably larger in actuality):

2 Kevin VAN DER PERREN BEL 197.33 +/- 2.00
3 Michael WEISS USA 195.98 +/- 2.00
4 Brian JOUBERT FRA 195.58 +/- 2.00

Then under CoP, any given set of judges will probably have scored Kevin anywhere from 195.33 to 199.33 about 95% of the time.
Any set of judges will have scored Michael anywhere between 193.98 and 197.98.
Any set of judges will have scored Brian anywhere between 193.58 and 197.58.

Because of the large overlap in the distributions of the scores, we can conclude that statistically none of the placements or scores from 4th to 2nd are significantly different from each other. In other words, the judges put these skaters somewhere in the top 5, but the actual specific placements of these skaters were all due to random chance and error. This is not a good way to judge a competition. The problem with the CoP is that while small margins of victory will decide a competition, those small differences in total points are meaningless because they're entirely due to random chance and error. This cannot be said about the ordinal system, which does not place skaters on a continuous (non-discrete) point scale.
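The overlap argument can be written out explicitly, using the totals above and the assumed +/- 2.00 band:

```python
results = {
    "VAN DER PERREN": 197.33,
    "WEISS": 195.98,
    "JOUBERT": 195.58,
}
HALF_WIDTH = 2.00   # the assumed error band from the post

intervals = {name: (s - HALF_WIDTH, s + HALF_WIDTH) for name, s in results.items()}
for name, (lo, hi) in intervals.items():
    print(f"{name:15s} {lo:6.2f} .. {hi:6.2f}")

names = list(results)
overlaps = [
    intervals[a][0] <= intervals[b][1] and intervals[b][0] <= intervals[a][1]
    for i, a in enumerate(names)
    for b in names[i + 1:]
]
print("every pair of bands overlaps:", all(overlaps))
```

All three pairwise bands overlap, which is the precise sense in which the 2nd-through-4th placements cannot be distinguished from scoring noise under these assumptions.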
 
Joined
Jun 21, 2003
Then under CoP, any given set of judges will probably have scored Kevin anywhere from 195.33 to 199.33 about 95% of the time.
I think you mean, "anywhere from 193.41 to 201.25"
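That figure is the conventional two-sided 95% interval, i.e. +/- 1.96 standard errors rather than +/- 1:

```python
score, se, z = 197.33, 2.00, 1.96   # z is the two-sided 95% normal quantile
lo, hi = score - z * se, score + z * se
print(f"95% interval: {lo:.2f} to {hi:.2f}")   # 193.41 to 201.25
```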

Moyesii, I am not trying to annoy you or to misquote your claims. I really thought -- from your last two posts I guess I was wrong -- that we had agreed that the "sampling from a larger population" model of statistical inference was inappropriate to the analysis of figure skating judging.

When the details of the CoP first became known last year, and again when data on the new system became available, there was a flurry of comment of the type represented on this thread. Many statisticians were, and still are, apoplectic. Those discussions were interesting for a while, but, as I remarked in my first post on this thread, I think everything has now been said.

Yes, ordinal-based systems have some advantages over point-total systems with respect to "robustness" (a very interesting statistical concept that well rewards study, BTW). To me, however, yes, even though I am "Mathman," LOL, these points are of minuscule importance compared to the real problems in the sport. Frankly, I think we are only playing into Speedy's hands if we allow techy arguments of this sort to deflect our attention from the ethical concerns that plague the ISU and make the sport a laughingstock in the eyes of the general public.

I agree with GKelly's post above, that whatever system we go with there will always be controversial outcomes. This is true in any sport that has judges, referees or umpires. The public accepts that. What they do not accept is the perception that, as soon as the nationalities of the judges are announced, the contest is over -- why bother to skate? That skating contests are fixed months in advance over a bottle of vodka in a smoke-filled back room. That jockeying for political power within the ISU takes precedence over anything that happens on the ice. That the ISU power structure has circled the wagons to prevent any further public scrutiny of its operations.

So I am going to sign off now. You have presented many valid points. I still like the CoP OK, but for reasons having nothing to do with statistics. I will save my righteous indignation for other causes.

Mathman
 

moyesii

Rinkside
Joined
Nov 28, 2003
I think you mean, "anywhere from 193.41 to 201.25"
If you mean 2 SDs instead of one, fine. That actually makes the case against the CoP stronger, since the error is larger.
Moyesii, I am not trying to annoy you or to misquote your claims. I really thought -- from your last two posts I guess I was wrong -- that we had agreed that the "sampling from a larger population" model of statistical inference was inappropriate to the analysis of figure skating judging.
No, all this talk of the statistical failures of the CoP is based directly on statistical analyses. You know that.
When the details of the CoP first became known last year, and again when data on the new system became available, there was a flurry of comment of the type represented on this thread. Many statisticians were, and still are, apoplectic. Those discussions were interesting for a while, but, as I remarked in my first post on this thread, I think everything has now been said.
Well I guess we are back to where we started. I said at the beginning of the thread that if you were weary of the CoP debate, you needn't continue the discussion. Apparently, you are tired of reading the same arguments against the CoP. However, that doesn't make them any less valid just because you have lost interest. And it doesn't make a discussion of the CoP any less important, since many people are confused and still have questions. Sure, the ISU couldn't care less what we, the skating fans think, but we have a right and a responsibility to be informed and knowledgeable of the issues that affect the sport we love.
Yes, ordinal-based systems have some advantages over point-total systems with respect to "robustness" (a very interesting statistical concept that well rewards study, BTW). To me, however, yes, even though I am "Mathman," LOL, these points are of minuscule importance compared to the real problems in the sport. Frankly, I think we are only playing into Speedy's hands if we allow techy arguments of this sort to deflect our attention from the ethical concerns that plague the ISU and make the sport a laughingstock in the eyes of the general public.
As if we're all supposed to be on the same page at the same time? This was a discussion about the CoP. It's not to say that the other issues have gotten past us. The CoP IS an ethical issue to me, because it makes competition results invalid and unreliable, and thus affects the skaters. To me, a corruptible, statistically and morally unsound system like the CoP says a lot about the ISU, reflecting the administration's own character (or lack thereof). The way that the CoP has been implemented and the non-integrity of the scoring system makes it clear that the ISU is suspect. As for any of our discussions, they are only for the benefit of ourselves. If you really want to make some noise, you should join Skatefair or the WSF.
I agree with GKelly's post above, that whatever system we go with there will always be controversial outcomes. This is true in any sport that has judges, referees or umpires. The public accepts that.
It's not right to say that the public accepts a system that they don't completely understand. The advantage of the CoP is solely on the side of the ISU. Namely, the anonymity and complex chain of the CoP is structured such that the ISU will be able to get away with controversial results without any accountability. Very few people are able to explain the CoP from the bottom-up. Even the commentators admit they are still trying to understand the system, and they are the ones that are supposed to be knowledgeable and direct our attention to what's going on! Psychologically, the public will accept what they don't understand happening in the scoring chain as something that is mathematically over their heads. That's why we need more analyses like those from Loosemore and Rossano, who are able to explain the stats in very nonmathematical terms. The public wants there to be fairness in the sport, which is TOTALLY possible. Eliminate bloc judging and the ordinal system is the answer. Keep the CoP and there WILL be controversial results, even barring bloc judging. Under the ordinal system, the most controversial any outcome would ever likely be is 1998 Lipinski-Kwan. The public accepted the results of this very Sale/Pelletier-Berez/Sik type situation, because they were confident that the judges were being fair (no bloc judging). It was a close competition that generated a lot of debate, but it wasn't a controversy. Strangely, many of the Russian skaters have been skating sub-par in the GP series this year, and so the CoP hasn't really had a chance to be put in the hot seat.
 
Joined
Jun 21, 2003
Moyesii, I appreciate your passion and your eloquence. I know I said I was going to stop, but I don't think I explained my position very well. I would like to try again, so we can be friends.

(1) I do think that over-long discussions of statistical arcana, such as the theory of robustness in the context of sampling theory, use up a lot of energy. We are the good guys. But IMO dissipating our intellectual and emotional resources in such a fashion -- however much fun it may be -- is guarding the back door of the chicken coop while the fox walks boldly in the front.

(2) (This is a little harder for me to explain, because I am not really sure whether I believe it or not myself. Anyway...)

I love mathematical theories. Indeed, it is my profession to create them. But the more I pat myself on the back as I marvel at my own cleverness, the more I find myself humbled by the real world. Whatever the claims of a mathematical or statistical model, the real world always proves to be more subtle still -- immeasurably so.

A good example comes from physics. String theory (m-brane theory) was all the rage three or four years ago. And rightly so. This is the coolest geometric synthesis you have ever seen in your life. Its passionate proponents (including serious scientists who ought to know better) have flooded physics journals as well as the pop science shelves with paeans to its promises.

But does it have anything to do with the real world? I do not believe that we will ever be able to answer this question. (As it stands now, it would require a particle accelerator the size of a galaxy to test the predictions of the theory.)

I do not believe that there is a good fit between the (necessarily grossly simplified) assumptions of statistical sampling theory and the actual reality of how figure skating contests are judged. Calculating standard errors and the like, while a lot of fun (I am not being sarcastic here -- it IS a lot of fun), has validity only within the model.

I was not just toying with words when I referred to Hume. It's OK -- and, as I say, a lot of fun -- to engage in discussions within the framework of a particular model, while at the same time holding reservations about the validity of that model in the real world.

I also believe that this is not just idle sophistry. For instance, I think that there might be merit in the idea of imposing geographical constraints on judging panels -- no more than three from any natural geo-political bloc, for instance. I wish that people were discussing this instead of opining about whether or not it is meaningful to carry CoP scores out to the second decimal place. But this is a real world complication that would compromise the randomness of the draw, and thus throw a monkey wrench into calculations based on sampling theory.

I have nothing against ordinal-based judging systems. As for the CoP, I have enjoyed equally the pleasure of having all those interesting numbers to look at, and the pleasure of reading critical analyses of these numbers by the respected statistical experts that you have quoted.

I guess that's all I have to say. If you regard this exchange of ideas as a debate, well, you have a position to defend and I don't, so I guess you win. I do understand and appreciate the theoretical arguments that show why ordinal-based systems might be more trustworthy than total point systems. To me, these arguments are interesting but not sufficiently compelling as to make me go on the warpath.

Mathman
 

moyesii

Rinkside
Joined
Nov 28, 2003
I had intended to remark on your earlier comment that many of the conclusions of statistical analysis are counterintuitive. I don't think so. I think for the most part what statistics does is to provide ways to quantify things that are qualitatively obvious anyway.
As you said, stats explains things that are obvious. Before I even read the statistical analyses of the CoP, I knew that having a competition decided by a point system in a subjective sport would be dangerous. Before I read the statistical analyses, I watched the grand prix events and saw really close point totals that failed to differentiate the skaters and rank them meaningfully. It all SEEMED like chance and didn't make sense. That's when I looked to the statistical analyses.
A good example comes from physics. String theory (m-brane theory) was all the rage three or four years ago. And rightly so. This is the coolest geometric synthesis you have ever seen in your life. Its passionate proponents (including serious scientists who ought to know better) have flooded physics journals as well as the pop science shelves with paeans to its promises.

But does it have anything to do with the real world? I do not believe that we will ever be able to answer this question. (As it stands now, it would require a particle accelerator the size of a galaxy to test the predictions of the theory.)
I don't think it's appropriate to compare an obscure and fledgling scientific theory with a tried and true model such as the fundamentals of statistics, which even apply to the behavioral sciences (my field). Statistics is definitely relevant to a system like the CoP, which is all about probability and numbers, and whose creators claimed that it is mathematically sound and precise. Statistics must be used to test those claims. Real-life phenomena have proven statistical models to be remarkably accurate and powerful in their application to real-world situations. Rolling a pair of dice 1000 times is a simple example.
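The dice example is easy to run, and it shows the sense in which a simple statistical model earns its keep: observed frequencies over many trials land right on the exact probabilities.

```python
import random
from collections import Counter

random.seed(0)
N = 100_000

# Roll two fair dice N times and tally the sums.
rolls = Counter(random.randint(1, 6) + random.randint(1, 6) for _ in range(N))

# Exact distribution for the sum of two fair dice.
exact = {s: (6 - abs(s - 7)) / 36 for s in range(2, 13)}

for s in range(2, 13):
    print(f"sum {s:2d}: observed {rolls[s] / N:.4f}  exact {exact[s]:.4f}")
```

After 100,000 rolls, every observed frequency sits within a fraction of a percent of the theoretical value.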
For instance, I think that there might be merit in the idea of imposing geographical constraints on judging panels -- no more than three from any natural geo-political bloc, for instance. I wish that people were discussing this instead of opining about whether or not it is meaningful to carry CoP scores out to the second decimal place.
Cinquanta has proposed the CoP as a cure-all for all things wrong in judging. His answer to bloc judging IS the CoP. His answer to cheating is the CoP. However, while we don't buy that, no one is in the position to remove Cinquanta from power. We are forced to take what he gives. Therefore, the best that we can do right now is expose the CoP for the scam that it is. Since the CoP originated in the hands of the ISU, tackling the CoP is one way of exposing the corruption within the administration. Believe it or not, there are people who "buy" the CoP, hook, line, and sinker (see the article in the International Herald Tribune), and those people do believe that the ISU has made positive changes in the sport. When we decry the CoP, we speak to those people on the level that they are ready to discuss the ethical issues in this sport. If we try to talk to them about cheating judges and bloc judging, they refer us to the CoP as evidence of positive change. This is why all our attention has been focused on the CoP and demystifying it. The ISU has temporarily shielded itself from the public, and if the CoP passes at the Congress next year, I believe that the ISU will have a permanent shield against public scrutiny.
On the other hand, I also believe that it is inevitable that the CoP will produce a major judging fiasco, but I don't know when. Probably at the Olympics if it is passed at the ISU Congress.
 

hockeyfan228

Record Breaker
Joined
Jul 26, 2003
Despite all of the statistical objections to CoP, I fail to see how placements do anything but obscure the judges' reasoning -- reasoning that is transparent to statisticians under CoP. CoP provides a lot of data with which to determine whether judges are adhering to the code, and to feed the objections of its critics. Under 6.0 all of the biases that are displayed in the statistics are incorporated into two scores. All of the statistically insignificant differences between skaters are inherent in the placements -- a judge chooses Skater A over Skater B, regardless of the significance of the differences between the two skaters.

Ordinals give very little data with which to work, since nearly every decision can be explained as relative: "I gave Slutskaya a 5.9 tech score because she is so fast." "I gave Slutskaya a 5.6 tech score because her speed is created through bad technique." "Oksana Baiul moved me, so I gave her great scores, while Nancy Kerrigan skated like an iceberg." "Nancy Kerrigan landed everything, so I gave her great scores, while Oksana Baiul had a technically flawed program under pressure." Etc. Each year, the ISU calls up a series of judges to explain what they perceive as national bias. When was the last time that the judges were suspended as a result of the investigations? Or even given the less "plum" events? Back in the '70s when the Russian delegation was kicked out en masse? The ISU hasn't had much data that proves anything.

There are four things that CoP has over placements, in addition to transparency:

a) Every defined technical element must be given credit when it is performed.

i. Double-footed and under-rotated jumps can't be "forgotten" in the heat of the judging moment
ii. A missed element at the end -- like Pang/Tong's weak Pairs Spin at the end of their CC LP -- doesn't eliminate the excellence at the beginning of a program -- like their superior 3 Twist.

b) The relative difficulty of jumps and especially spins, footwork, and spirals is codified, not up to each judge at each competition for each skater.

c) The relative "goodness" of the skaters in one phase is not minimized by equal placements. If there is a statistically insignificant difference in one phase of the competition, it does not lock that skater into a place. If a skater is significantly better in that phase, the amount of the difference is not lessened.

d) With discrete standards for each mark the ISU doesn't have to prove bias (overt or unconscious) or corruption. They just have to show that the judges aren't following the code.

Ted Barton, a Canadian who helped to develop the system, was clear: this system is meant to show the IOC that the ISU has cleaned up its scoring system by the 2006 Olympics. It's a little early to decide that what we have now is the end, and that it's same old/same old. It's too early to see if the ISU will address the uselessness of the CoP standards for PE so far this year, or what else they will do with the data. The IOC can make itself aware of the statistical objections and decide to reject both the ISU's solution and figure skating itself.

Who knows -- Cinquanta may get what he wants out of this system: judges who are hired and trained and fired by the ISU, and who aren't part of any Federation.
 

Ptichka

Forum translator
Record Breaker
Joined
Jul 28, 2003
Hockeyfan, thank you for getting us back to what's important about this subject.

I think a further plus of CoP is that it can be "tweaked". For example, someone mentioned that judges are supposed to deduct from performance scores for falls and such. This can be easily programmed -- if a -3 deduction was entered for an element, the computer would automatically deduct from the performance score.
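As a sketch of that tweak (the -3-means-fall rule and the size of the deduction are hypothetical, not actual ISU rules):

```python
def performance_adjustment(goe_marks, deduction_per_fall=1.0):
    """Automatically deduct from the performance component whenever an
    element receives a -3 GOE, treated here as a fall.
    Hypothetical rule for illustration, not the real CoP."""
    falls = sum(1 for g in goe_marks if g == -3)
    return -deduction_per_fall * falls

# A program with two -3 elements automatically loses 2.0 performance points.
print(performance_adjustment([2, -3, 1, 0, -3, 1]))   # -2.0
```

The point is simply that once the element data exists in machine-readable form, a rule like this applies identically to every skater, with no room for selective leniency.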
 
Joined
Jul 11, 2003
In defence of the CoP, I agree with Hockeyfan and also with Ptichka about needing to tweak the CoP.

How to mark a fall has always been a judgment call for the judges. Sometimes the mark was severely lowered, other times hardly at all, if at all. A fall in dance is serious, although it is not always marked down. In fact, a fall in any sport is serious: diving, gymnastics, equestrian, etc. Yet in singles skating it is only serious if the skater is not expected to reach the podium based on past performances, or it becomes the excuse a judge needs to mark down a rival or to overlook a mistake by a personal favorite.

So, if someone can tweak the computer to mark down falls automatically and equally for all skaters that would take care of the favoritism.

Joe
 

moyesii

Rinkside
Joined
Nov 28, 2003
Oh I thought this thread was finished...
Despite all of the statistical objections to CoP, I fail to see how placements do anything but obscure the judges' reasoning -- reasoning that is transparent to statisticians under CoP. CoP provides a lot of data with which to determine whether judges are adhering to the code, and to feed the objections of its critics.
I understand that many skating fans like having all these numbers to mull over. It gives the impression of a rigidly systematic approach to judging, so that you do see a sort of chain of events in the scoring process. But in fact, despite the apparent transparency of the system, each of the links in the scoring chain is very weak, such that the system has a good chance of breaking down at any point in the chain. In other words, you do get more information from the CoP. This makes the fans happy. On the other hand, the system itself is crude and error-prone, so that the skaters can be majorly screwed by the single hand of one judge or by random chance. In effect, what most fans seem to be saying is: the system sucks for the skaters, but at least we get to see what the judges are doing to them. Yay for the fans!

My 2nd response is, if the CoP "provides data to determine whether judges are adhering to the code," then what disciplinary actions against cheating judges are available to us in a secretive, pooled scoring system? We have already seen numerous times in the GP series that judges have not been taking the proper deductions.
Under 6.0, all of the biases that are displayed in the statistics are incorporated into two scores.
Yes, and under CoP, bias will be a factor in ALL of the different element and component scores, and those biases are just as difficult to decipher or rationalize. Unfortunately, it turns out that each of the 5 program components are just as obscurely marked as the single presentation mark. That's why we have concluded that ranks are less error-prone than an absolute point scale, and a single ordinal introduces less error into the results than do 5 separate marks for program components. What does it mean when one judge gives a skater 5.50 points for Skating Skills and another judge gives the exact same skater 7.25 points? Then, what are we supposed to think when the final results are decided by less than a tenth of a point?

Just because the CoP provides more information doesn't say anything about the integrity of the enormous amounts of information being given. Judges in BOTH systems are going to be biased. The important thing is that, in addition to the error from bias, the CoP also introduces random error into all the scores and calculations. The random error is far more serious, because bias cannot be completely eliminated in a judged sport, but random error should be minimized. A large amount of error means that the system is not producing reliable results.
All of the statistically insignificant differences between skaters are inherent in the placements -- Judge chooses Skater A over Skater B, regardless of the significance of the differences between the two skaters.
I'm not sure I understand. Do you mean that ordinals simply place the better skater ahead, regardless of the "quantifiable" difference between skater A and skater B? First of all, it's not always possible to quantify, rather than qualify, the differences among skaters. It would only be possible if skating were a jumping contest. But every skater has strengths and weaknesses that cannot be added and subtracted into totals that can be meaningfully compared. Really, skaters can only be compared relatively.
Ordinals give very little data with which to work...
Well I didn't know this was your job :laugh: j/k
...since nearly every decision can be explained as relative... "Oksana Baiul moved me, so I gave her great scores, while Nancy Kerrigan skated like an iceberg." "Nancy Kerrigan landed everything, so I gave her great scores, while Oksana Baiul had a technically flawed program under pressure." Etc.
First of all, I really hope you don't think that that's how judges judge. Obviously, bloc judging totally confounds the results process.

Like I said, the CoP is even more confounding than ordinals. Let's take a look at what might have happened with CoP:
Judge A gives Baiul:
8.00 SS 7.50 T 8.50 P/E 8.50 C 8.50 I
Same judge will give Kerrigan:
8.50 SS 7.00 T 8.25 P/E 8.25 C 8.25 I

Judge B gives Baiul:
7.50 SS 7.50 T 7.00 P/E 7.00 C 7.50 I
And same judge gives Kerrigan:
8.00 SS 7.50 T 8.00 P/E 7.50 C 7.50 I
I gave these numbers based on how I recalled their skates. And basically, that is what the judges do in the CoP: at the end of the skate, they assign 5 numbers to the program components. The absolute scale is subjective and inaccurate enough that any WIDE range of scores by the judges will seem justifiable, legitimate, and incontestable. I made the scores above so that one judge preferred Baiul and one preferred Kerrigan. Basically, we see that one judge likes Baiul because of her balletic style and choreography. The other judge preferred Kerrigan because of her strong, clean lines and perfect execution. Now, how the competition turns out is up to anyone and no one under the CoP system, because of the variability in the judges' marks and the random count of the scores. Would you feel better now, knowing that one judge scored Kerrigan higher in Perf/Execution, even though another judge scored Baiul higher in the same mark? Or are we back where we started with the ordinals, except now we have 5 marks instead of one, a multitude of confounding variables, and no accountability?
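Just to make the arithmetic concrete, here is a quick sketch that totals the hypothetical component marks above. These are my invented numbers, not real scores, and element scores are ignored:

```python
# Hypothetical program-component marks (SS, T, P/E, C, I) from the
# example above -- invented for illustration, not real scores.
judge_a = {"Baiul": [8.00, 7.50, 8.50, 8.50, 8.50],
           "Kerrigan": [8.50, 7.00, 8.25, 8.25, 8.25]}
judge_b = {"Baiul": [7.50, 7.50, 7.00, 7.00, 7.50],
           "Kerrigan": [8.00, 7.50, 8.00, 7.50, 7.50]}

for name, marks in (("Judge A", judge_a), ("Judge B", judge_b)):
    totals = {skater: sum(m) for skater, m in marks.items()}
    winner = max(totals, key=totals.get)
    print(f"{name}: {totals} -> prefers {winner}")
```

Judge A's totals come out 41.00 to 40.25 for Baiul; Judge B's come out 38.50 to 36.50 for Kerrigan. Which preference prevails then depends on whose marks happen to be counted.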
Each year, the ISU calls up a series of judges to explain what they perceive as National bias. When was the last time that the judges were suspended as a result of the investigations? Or even given the less "plum" events? Back in the '70's when the Russian delegation was kicked out en masse? The ISU hasn't had much data that proves anything.
I don't know what your point is. The ISU or the judging system?
There are four things that CoP has over placements, in addition to transparency:
Let's be specific here. You mean a more transparent judging chain and a more transparent scoresheet for technical elements, but equally opaque marks for program components, and overall secrecy in the judges' marks.
a) Every defined technical element must be given credit when it is performed.
Yes, that is most definitely a positive. Now here is one for OBO:
Each judge marks the elements, including non-jump or spin elements, as THEY see them and not as some technical specialist calls them. In other words, in a 9-judge panel, there are nine judges making independent decisions, instead of a single person making calls for the entire group.
i. Double-footed and under-rotated jumps can't be "forgotten" in the heat of the judging moment
Do you mean Baiul at the Olympics? That was more likely bloc judging. If there's a group of collaborating judges working the CoP system, they can easily manipulate the system in more subversive ways.
Also, under CoP we have seen many times already in the GP that jumping mistakes are often "forgotten" by exaggerating the component marks.
ii. A missed element at the end -- like Pang/Tong's weak Pairs Spin at the end of their CC LP -- doesn't eliminate the excellence at the beginning of a program -- like their superior 3 Twist.
When did it ever do such a thing in the ordinal system? In both systems, the judges must take into account both the strengths and weaknesses of a skater's performance. The CoP seems more literal and systematic in its calculations because of the TES mark, but people forget that the TPC is highly subjective and has been the deciding factor in numerous events in the GP. In addition, the problem with systematically adding the total elements and program components is that the marks are riddled with random and human error. Some consider the leeway judges have to award marks in the ordinal system a bad thing, but I consider it a good thing, because it allows judges to check for error that would otherwise be present in the calculated results.
b) The relative difficulty of jumps and especially spins, footwork, and spirals is codified, not up to each judge at each competition for each skater.
Well, some people consider this a bad thing. Some people think that the values assigned to elements in the CoP don't make sense. I like the idea of a democratic judging panel (more reflective of the population) rather than a tyranny of what is considered valuable in skating. Also, a sport evolving as rapidly as skating is, both technically and artistically, NEEDS an adaptive system like ordinals, not a system rigidly coded with values for certain elements whose emphasis may or may not change with the times (even year to year, or event to event!).
c) The relative "goodness" of the skaters in one phase is not minimized by equal placements. If there is a statistically insignificant difference in one phase of the competition, it does not lock that skater into a place. If a skater is significantly better in that phase, the amount of the difference is not lessened.
We have also seen that skaters can build an insurmountable lead after the SP... But that is not the main problem. As I discussed earlier in this thread, the ultimate goal of a judging system in skating is to assign placements, not scores. And there can never be such a thing as a "statistically insignificant difference" between any two skaters. This is a fallacy or myth generated by the CoP. The strength of the ordinal system is that it is able to AMPLIFY the marginal differences between any two closely matched competitors into meaningful differences in placements. The CoP is unable to do this, because any skaters who finish within about 5 points of each other's totals will not have a clear and definite justification for their placements (due to error and variability), and statistically there is no reason why those placements wouldn't change with another random draw. The only thing the scores of these closely ranked skaters tell you is that the system was unable to differentiate the skaters. The placements of these undifferentiated skaters are entirely due to chance. Not so with ordinals. The judges must make a conscious, deliberate, and thoughtful decision to put one skater ahead of another, so that there is no misunderstanding that the skater placed #4 was better than the skater placed #5.
d) With discrete standards for each mark the ISU doesn't have to prove bias (overt or unconscious) or corruption. They just have to show that the judges aren't following the code.
And how do you prove that the judges "aren't following the code" when it is undetectable? I will quote myself here:
"The absolute scale is subjective and inaccurate enough {due to norms in human acceptance of variability} that any WIDE range of scores by the judges will seem justifiable, legitimate, and incontestable."
It's a little early to decide that what we have now is the end, and that it's same old/same old. It's too early to see whether the ISU will address the uselessness of the CoP standards for P/E so far this year, or what else they will do with the data. The IOC can make itself aware of the statistical objections, and decide to reject the ISU's solution and figure skating itself.
Yeah, who knows, maybe the ISU is just buying itself some time, and even if the CoP is passed at the ISU congress, maybe after years and years of CoP failures and continuous modifications, like maybe after a couple of years of Olympic scandals, the ISU will introduce a "new" system of ordinals to fix the problems of the "old" CoP system. So maybe the ISU is buying itself some time with this experiment... It's not so bad, since they're only wasting tons of money and affecting the results of competitions in the meantime.
 

moyesii

Rinkside
Joined
Nov 28, 2003
On the other hand, the system itself is crude and error-prone, so that the skaters can be majorly screwed by the single hand of one judge or random chance.
Before Mathman jumps on this statement :D let me just add that:
Random counting IS significant when dealing with numerical scores (and associated error) awarded on an absolute scale. It is much less of a factor when dealing with judgments put on a relative scale.
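A small simulation illustrates the claim. The per-judge score differences below are invented, and the only structural assumption is that 9 of 14 marks are randomly chosen to count, roughly as in the interim system's draw. Most judges prefer skater A by a whisker on points, while two give B a huge edge:

```python
import random

# Invented per-judge score differences (skater A minus skater B):
# 12 judges prefer A by a whisker, 2 prefer B by a wide margin.
diffs = [0.1] * 12 + [-3.0, -3.0]

random.seed(0)
trials = 10_000
total_wins_a = ordinal_wins_a = 0
for _ in range(trials):
    counted = random.sample(diffs, 9)          # 9 of 14 marks count
    if sum(counted) > 0:                       # point-total verdict
        total_wins_a += 1
    if sum(d > 0 for d in counted) > 4:        # majority-of-ordinals verdict
        ordinal_wins_a += 1

print(f"A wins on point totals in {total_wins_a / trials:.0%} of draws")
print(f"A wins on ordinals in {ordinal_wins_a / trials:.0%} of draws")
```

With these numbers, the ordinal majority picks A in every draw (at least 7 of any 9 counted judges prefer A), while the point total flips depending on whether the two big negative marks are among those counted. That is the sense in which random counting matters more for absolute scores than for relative judgments.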
 
Joined
Jun 21, 2003
Before Mathman jumps on this statement...
:laugh: :laugh: :laugh: I'm not going to jump on anything. Didn't I say about 10 posts back that I wasn't going to post on this thread any more, LOL?

I agree with Hockeyfan's basic premise that under the CoP it will be quite "transparent" to the statistically inclined when bloc judging and cheating are going on. What, after all, are statisticians good for, if not stuff like that?

We have a right to laugh at CoPers who claim that, in the grand cosmic scheme of things, Michael Weiss' performance at Trophee Lalique was worth $195.98, not a penny more, not a penny less. But I think we can also at least smile a little at ourselves when we say that, aha, but the standard error of Michael's score is +/- 2.391482 (+/- 0.261967 -- standard errors have their standard errors, too) and for Brian it was 3.196023, providing an overlap of the Student's t distributions of the sampling means, resulting in the conclusion that there is a 39.64583 per cent chance that Brian's performance (as judged by God) really was better than Michael's.

I had some other things to say about why the sampling theory model should be taken with a grain of salt when applied to the analysis of CoP scores -- there are more things in heaven and earth, Horatio, than are found in our statistics texts -- but this is more fun. Thank you Joe, Ptichka and Hockeyfan for getting this thread back onto a track that has more general interest for Golden Skate readers.

Mathman
 

moyesii

Rinkside
Joined
Nov 28, 2003
I agree with Hockeyfan's basic premise that under the CoP it will be quite "transparent" to the statistically inclined when bloc judging and cheating are going on. What, after all, are statisticians good for, if not stuff like that?
It would not be possible to detect cheating (or it would be very difficult to prove), because of the variability of the scores (due to the natural variability in human judgments of measurement on an absolute scale), which would make it impossible to discriminate deliberate cheating from human error. The sensitivity of the total to minuscule inflections in the tech or program component scores allows any cheating to be hidden in the variability.
We have a right to laugh at CoPers who claim that, in the grand cosmic scheme of things, Michael Weiss' performance at Trophee Lalique was worth $195.98, not a penny more, not a penny less. But I think we can also at least smile a little at ourselves when we say that, aha, but the standard error of Michael's score is +/- 2.391482 (+/- 0.261967 -- standard errors have their standard errors, too) and for Brian it was 3.196023, providing an overlap of the Student's t distributions of the sampling means, resulting in the conclusion that there is a 39.64583 per cent chance that Brian's performance (as judged by God) really was better than Michael's.
I don't think Brian was laughing. Wow, if you really did the calculations, a 40% error is huge. The CoP will always make the results questionable unless there is at least a 5 point difference between every skater in every placement. But that's impossible, unfortunately.
I had some other things to say about why the sampling theory model should be taken with a grain of salt when applied to the analysis of CoP scores...
By all means say it. Or maybe you should have said it about 20 posts ago :laugh:
Thank you Joe, Ptichka and Hockeyfan for getting this thread back onto a track that has more general interest for Golden Skate readers.
The ordinal and CoP systems are built on statistical foundations, e.g. ordinals, double trimmed means, random sampling. Therefore, in order to have a discussion of the merits of either system, stats are essential. Anyone who won't bother to learn the fundamentals is merely perpetuating the myths.
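For instance, one of those fundamentals, the trimmed mean, is easy to state in code. This is my own minimal sketch of the idea (dropping the single highest and lowest marks before averaging), not the ISU's exact procedure:

```python
def trimmed_mean(marks, trim=1):
    """Average the marks after dropping the `trim` highest and lowest."""
    s = sorted(marks)
    kept = s[trim:len(s) - trim]
    return sum(kept) / len(kept)

# Nine hypothetical grades of execution for one element.
goes = [1, 0, 0, 1, -1, 0, 2, 0, 0]
print(round(trimmed_mean(goes), 3))  # the outliers 2 and -1 are dropped
```

The point of the trim is robustness: a single wildly high or low mark, whether honest error or cheating, cannot move the pooled score by itself.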
 
Joined
Jun 21, 2003
It would not be possible to detect cheating (or it would be very difficult to prove), because of the variability of the scores (due to the natural variability in human judgments of measurement on an absolute scale), which would make it impossible to discriminate deliberate cheating from human error. The sensitivity of the total to minuscule inflections in the tech or program component scores allows any cheating to be hidden in the variability.
I think that you are underestimating the cleverness of statisticians and the tenacity of sports geeks the world over. As the CoP rolls along I think that we will be inundated with scholarly analysis showing us exactly how to separate out cheating from noise -- indeed, from many of the same authors who are leading the way now.
I don't think Brian was laughing. Wow, if you really did the calculations, a 40% error is huge. The CoP will always make the results questionable unless there is at least a 5 point difference between every skater in every placement. But that's impossible, unfortunately.
I did not make any calculations, I just made up some numbers. We do not need to make any calculations at all to tell us that when one competitor noses out another by a few hundredths of a point, then the contest could have gone either way. The 40% just means that if we did the same contest, with the same performances, before the same judges, 10 times in a row, we would expect Brian to win 4 of them and Michael to win 6. If it really was that close, well, that's life. (I was just watching on TV a recap of last year's college football championship, where Ohio State beat Miami in triple overtime, thanks to a late flag on a controversial pass interference call in the end zone on the second overtime.)
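That "replay the contest 10 times" idea is easy to mimic. Everything below is invented for illustration (the means, the noise level); it just assumes each replay adds independent Gaussian judging noise to each skater's total:

```python
import random

# Invented figures: two "true" totals 0.05 points apart, with a couple
# of points of run-to-run judging noise on each skater's score.
random.seed(1)
mu_michael, mu_brian, sigma = 195.98, 195.93, 2.0

replays = 100_000
brian_wins = sum(
    random.gauss(mu_brian, sigma) > random.gauss(mu_michael, sigma)
    for _ in range(replays)
)
print(f"Brian beats Michael in about {brian_wins / replays:.0%} of replays")
```

With a gap this small relative to the noise, the replay split is close to 50/50; shrinking the noise or widening the gap moves it toward certainty.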

I was also just now watching the actual performances from Trophee Lalique. Joubert kicked Weiss' butt!!!! These jokers call themselves judges???

By all means say it. Or maybe you should have said it about 20 posts ago :laugh:

The ordinal and CoP systems are built on statistical foundations, ex. ordinals, double trimmed means, random sampling. Therefore, in order to have a discussion of the merits of either system, stats are essential. If anyone won't bother to learn the fundamentals, then they are merely perpetuating its myths.
I may post some further remarks later. The reason that I did not do so is because I don't want the two of us to dominate this thread so much that no one else can get a word in edgewise. I also think that all of us should be careful not to use dismissive language towards people who express other opinions. I know that I often come off inadvertently as trying to be a know-it-all, which kind of defeats the purpose of having a forum in the first place. (So I make a conscious, but as you see, not entirely successful, effort not to throw in big-sounding quotes from famous literary sources, LOL.)

Mathman
 

moyesii

Rinkside
Joined
Nov 28, 2003
As the CoP rolls along I think that we will be inundated with scholarly analysis showing us exactly how to separate out cheating from noise
I wonder who it really conveniences or benefits if detailed "scholarly analyses" of the scoresheets are required in order to detect the corrupted marks. Doesn't sound very transparent to me. I like accountability: one judge, one ordinal.
If it really was that close, well, that's life. (I was just watching on TV a recap of last year's college football championship, where Ohio State beat Miami in triple overtime, thanks to a late flag on a controversial pass interference call in the end zone on the second overtime.)
Yes, ABC Sports' year in review show. The relevance of that example here is that the single call of the referee affected the outcome of the entire championship. It would be analogous to the power of a single call by the technical specialist in the CoP. For example, on Trophee Lalique today, Susie Wynn(sp?) commented that one skater (I forget who) received a Level 2 for an element at Skate Canada(?) but received a Level 1 at Trophee Lalique. It shows the power of the tech specialist to affect the outcome of an event, and the arbitrariness of his/her decision-making.
To my knowledge, that kind of situation where one judge's call has sway over everyone else's, does not exist in the ordinal system.
I was also just now watching the actual performances from Trophee Lalique. Joubert kicked Weiss' butt!!!! These jokers call themselves judges???
I don't understand what you're saying here. The judges did fine. They put Joubert ahead of Weiss, but the CoP system -- working with scores instead of ordinals -- calculated Weiss to be ahead of Joubert. The judges' marks were so close that a crude system like the CoP was unable to convert the judges' scores into meaningful placements.
I may post some further remarks later.
Ok, but if you do, maybe you should start a new CoP thread, cause I'm tired of checking the NHK thread in GP spoilers :laugh:
 