# Thread: Close scores in COP

1. 0

I started thinking about this when other posters were analyzing Miki's and Allissa's scores in the GPF (if Miki had skated clean in the SP, they may have been almost tied and then.....).

One of the things that bugs me about the COP is how few scores are actually used, versus how many are given. So my understanding is there are 9 judges, but only 5 scores are really used - just a little over half. First, 2 judges' scores are just randomly eliminated; then the high and low scores of the remaining are eliminated, and the rest are averaged.

The part that seems oddest to me is the eliminating of the two, because that is random, and in a close competition, which judges are eliminated could make the difference between 1st and 2nd, or between being on the podium or off. That seems to be introducing an element of luck into a system that is trying to be more standardized. We have 9 experts there - why not use all of their scores? And are the same judges eliminated for all skaters in the event, or is it re-randomized for each skater? Or even each element?

I think the USFSA is closer to this - with 9 judges and all 9 count (do they eliminate the high and low scores?). To me that makes Jeremy's loss to Ross more concrete than if a different random selection of judges would have put him in 3rd.

I would assume the ISU looks at all 9 score sheets - anyone know if any competitions have been won or lost based on the random selection of judges?

Thanks

2. 0
I think they changed it this year and no longer throw out two scores at random. You can check this out by looking at the protocols for GOEs. If you see scores like -1.57, 0.43, etc., these are decimal equivalents of sevenths. That is, they are averaged over seven judges.
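
To see where those sevenths come from, here is a minimal sketch (the GOE values are invented for illustration):

```python
# Hypothetical GOEs from 7 judges. Averaging over seven judges gives
# marks that are multiples of 1/7, hence protocol decimals like -1.57.
goes = [-2, -2, -2, -2, -1, -1, -1]   # sum -11 -> -11/7
avg = sum(goes) / len(goes)
print(round(avg, 2))                  # -1.57

goes = [1, 1, 1, 0, 0, 0, 0]          # sum 3 -> 3/7
print(round(sum(goes) / len(goes), 2))  # 0.43
```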

Trimming the mean by throwing out the highest and lowest is not so bad. Under normal conditions it won't matter, and it guards against a judge making a keystroke error (entering a component score of 0.50 when he meant 9.50, the 0 and the 9 being close together on the keyboard), and also mitigates the effect of a biased score by a lone judge who is trying to help his skater unfairly.
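
A sketch of how a trimmed mean absorbs that kind of error (the scores below are invented for illustration):

```python
# Trimmed mean: drop the single highest and lowest marks, average the rest.
def trimmed_mean(scores):
    """Average after discarding one highest and one lowest score."""
    s = sorted(scores)
    return sum(s[1:-1]) / (len(s) - 2)

honest = [9.50, 9.25, 9.00, 9.25, 9.50, 9.00, 9.25]
typo   = [0.50, 9.25, 9.00, 9.25, 9.50, 9.00, 9.25]  # 0.50 entered for 9.50

# Raw mean shifts drastically (9.25 -> ~7.96); trimmed mean barely moves.
print(trimmed_mean(honest))  # 9.25
print(trimmed_mean(typo))    # 9.15
```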

3. 0
Originally Posted by ivy
- anyone know if any competitions have been won or lost based on the random selection of judges?
Yes, that was not terribly uncommon, especially for placements down the line.

Sasha lost a Grand Prix event once where she had the absolute most extreme bad luck possible in terms of which scores were eliminated.

There is an argument why this is not really unfair, though, and it goes like this. Let's say the judges' pool has a thousand candidates. By a random draw this is cut down to nine. Then by another random draw the nine are cut down to seven scoring judges.

Statistically, this is the same as choosing seven out of the thousand to begin with. So the procedure is not really any more "random" than would be any procedure for selecting seven scoring judges.
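
The equivalence can be checked with exact fractions (panel sizes taken from the post; the computation is purely illustrative):

```python
from fractions import Fraction

# Draw 9 of 1000, then 7 of those 9. The chance that any given judge
# ends up as a scoring judge equals drawing 7 of 1000 directly.
pool, panel, scoring = 1000, 9, 7

two_stage = Fraction(panel, pool) * Fraction(scoring, panel)  # (9/1000)*(7/9)
one_stage = Fraction(scoring, pool)                           # 7/1000

print(two_stage == one_stage)  # True
```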

Still, it was a stupid idea. The purported reason for the random draw was so that if crooked conspiring federation officials told their judges to cheat, this would allow the judges not to cheat but then to lie to the federation heads afterward and say that they did cheat as instructed. (It's OK to judge honestly, just don't get caught judging honestly.)

So that meant at every competition you had two or three straw men sitting there pompously in the judges' box pretending to be judges when in fact they were just window dressing. (Of course no one -- including the judges -- knew which of the nine were actually judges and which were the two saps playing along with the ISU charade.)

4. 0
Originally Posted by Mathman
So that meant at every competition you had two or three straw men sitting there pompously in the judges' box pretending to be judges when in fact they were just window dressing. (Of course no one -- including the judges -- knew which of the nine were actually judges and which were the two saps playing along with the ISU charade.)
Sort of like a firing squad where only a few rifles have real bullets, so nobody knows exactly who did the execution. Diffusion of responsibility and easing of conscience, etc.

5. 0
ivy, it's likely that Davis/White lost the 2009 World bronze by that stroke of luck. 0.04 separated them from Virtue/Moir.

6. 0
In the earlier days of the CoP, before they started juggling judges from SPs and LPs, we noticed that there were 5 judges determining the placements, not unlike the 6.0 system. With the change of judges, the similarity to the 6.0 system has ceased.

What CoP can do is give a little explanation of how they viewed the elements in their nitty-gritty way. Whether one agrees with the results is up to the reader. I could say that Dornbush was the best of the three placements, but that may not show up in the CoP scores. Generally speaking, the closeness of results under the 6.0 system and the CoP is neck and neck.

Since I do not check the protocols very often, I do not know the real winner for sure; I just accept the CoP results as they are. For example, the consensus of this Forum was that Nagasu should have placed second. This may have been true under the 6.0 system, but CoP said NO, Flatt was second.

7. 0
As Mathman explained, judge selection is somewhat random to begin with. And as for skaters winning because of a certain pool of judges scoring them instead of another pool, this happened way more often under the 6.0 system. Under 6.0, judges basically voted straight up or down on whether a skater should be ranked above another. There were plenty of Olympics and world championships where first and second were decided by a 5/4 split! For instance, that's how Oksana Baiul won her Olympic gold medal over Nancy Kerrigan. If the judge selection had worked out some other way, Nancy would've won and become an instant legend and a goddess of the sport (Olympic gold + the whack would make her irresistible to people).

Under the COP system, even though they've done away with chucking two random judges, judges actually have far less power in arbitrarily influencing outcomes. For one thing, they have to assign way more scores and often with specific, non comparative criteria (meaning they can't give skater A 1 point more on something just because skater A is better than skater B). If their scores are out of line, they may have to explain it to the powers that be, in detail. For another, and this is the biggest thing of all, at least half (or more in upper levels of men's skating) of the score is set by the tech panel. There's far fewer of them and they set the all important level calls, edge calls, UR calls that make a gulf of difference in the scores. It makes a big difference who gets selected as the tech specialist! Some are known for being particularly strict, some are known for giving edge calls to a skater's flips (when others don't), etc. Now the selection of the tech specialist isn't random, per se, but most of the time the tech specialist isn't selected to influence the ranking in a particular way, so in that regard, it is random. And sometimes skaters do have to hold their breath to see who gets on that panel.

8. 0
As Mathman explained, judge selection is somewhat random to begin with. And as for skaters winning because of a certain pool of judges scoring them instead of another pool, this happened way more often under the 6.0 system. Under 6.0, judges basically voted straight up or down on whether a skater should be ranked above another. There were plenty of Olympics and world championships where first and second were decided by a 5/4 split! For instance, that's how Oksana Baiul won her Olympic gold medal over Nancy Kerrigan. If the judge selection had worked out some other way, Nancy would've won and become an instant legend and a goddess of the sport (Olympic gold + the whack would make her irresistible to people).
MM's view seems to touch on the collusion of judges. I just don't think it is as obvious as it was during the Soviet era. I applaud the CoP for removing that intrigue that once existed - not unlike the UN voting during that same era.

Just about all the judges I read about have said that scoring now is much easier than it was. The Tech Panel has taken over the more difficult job of judging (much of which I disapprove of) and judges are left with a kind of 6.0 system of GoEs that are never unanimous.

I agree with your comments on the selection of a tech specialist. The TS is a powerful influence on the scoring of competitions. He controls the error portion of the score, and I believe his decisions are treated as infallible. It would interest me if the Tech Panel were chosen from countries not among the Big Six. There are officials in other countries - Great Britain, Scandinavia, Germany, Czechoslovakia, Austria, Korea, Spain, Australia, South Africa, etc. Why not send them to a universal 'how-to' specialist school so we can be more sure of a non-conspiratorial Panel?

As for Kerrigan/Baiul, it was the 6.0 system that favored the little waif. Had it been CoP, she would have lost on two-foot landings alone.

9. 0
Originally Posted by Mathman
Sasha lost a Grand Prix event once where she had the absolute most extreme bad luck possible in terms of which scores were eliminated.
That was 2002 Cup of Russia, which used the "Interim System" of random selection and anonymity with 6.0 judging?
I think there were also some similar situations in the dance event at 2003 Worlds. Maybe some other events outside the medal positions in the year and a half the interim system was in use.

ivy, just to clarify, when you complain about random selection and anonymity (and I do think there are valid complaints to be made), be aware that they're not inherent parts of COP/IJS per se. Those can be used with the IJS and they can be used with 6.0, or they can not be used with either.

IJS can be used with no random selection (now true for ISU events as well as US and other domestic events). It can also be used with no anonymity -- in fact I think it's only ISU championships and the senior Grand Prix that insist on anonymizing the judges.

Originally Posted by Mathman
So that meant at every competition you had two or three straw men sitting there pompously in the judges' box pretending to be judges when in fact they were just window dressing. (Of course no one -- including the judges -- knew which of the nine were actually judges and which were the two saps playing along with the ISU charade.)
And in the old days under 6.0, there was a substitute judge who was sitting there playing along, but through 2002 nothing was anonymous and everyone knew which judge was the one whose scores didn't count.

In the interim system, there was no longer one "substitute judge," but instead several "straw men" whose 6.0 scores didn't count and who were just sitting there as window dressing, and no one knew who was who.

Right after 2002, the plan was to use more judges especially for championship events and then randomly throw some out, so there would still be nine judges whose scores counted. But for cost-saving reasons the panels were made smaller, especially at fall competitions, so then it made less sense to keep extra judges whose scores didn't count.

The only value of extra judges on the panel whose scores don't count at all is to confuse would-be cheaters. Otherwise, if you're going to pay for the judges to be there, you might as well use them all. The more judges' scores you use, the more robust the results. So the number of judges is a tradeoff between money and accuracy.

Originally Posted by Mathman
Trimming the mean by throwing out the highest and lowest is not so bad. Under normal conditions it won't matter,
Also, note that, now that random selection is gone, all the judges do contribute scores to the final results even with high and low scores thrown out. The trimming is done GOE by GOE and component by component, so even a judge who gets some scores thrown out for each skater will also have some of them left in. Unless the judge is much more generous or stingy with scores than the rest of the panel, for all the skaters, they won't get thrown out every time.

For GOEs, there will be times when all judges are unanimous in how they score some elements (especially failed elements that get straight -3s). And often there will be a string of GOEs like -1 -1 0 0 0 1 1. So even if you gave one of the -1s or +1s, how do you know whether yours was the one that was thrown out?
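
A quick sketch with that exact string of GOEs shows why the question is unanswerable: the trimmed average is identical no matter which -1 or +1 you imagine was the one dropped.

```python
# GOEs from 7 judges, as in the example above.
goes = [-1, -1, 0, 0, 0, 1, 1]

s = sorted(goes)
trimmed = s[1:-1]          # drop one lowest (-1) and one highest (+1)
avg = sum(trimmed) / len(trimmed)
print(avg)                 # 0.0 -- same whichever -1 or +1 was "dropped"
```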

Originally Posted by Mathman
and it guards against a judge making a keystroke error (entering a component score of 0.50 when he meant 9.50, the 0 and the 9 being close together on the keyboard), and also mitigates the effect of a biased score by a lone judge who is trying to help his skater unfairly.
Actually, all the systems I've seen use a touchscreen or mouse click on the screen for the judges to enter their scores; there isn't a keyboard or a standard keyboard setup. Scroll down for a photo at the bottom:
http://www.2008.nbcolympics.com/news...id=257955.html

So a judge might indeed accidentally input 0.50 instead of 9.50, but the reason would have nothing to do with the setup of keyboards.

I agree with everything Serious Business said.

10. 0
Originally Posted by Joesitz
in the earlier days of the CoP, before they started juggling judges from SPs and LPs, we noticed that there were 5 judges determining the placements, not unlike the 6.0 system. With the change of judges, the similarity to the 6.0 system has ceased.

What CoP can do is give a little explanation of how they viewed the elements in their nitty gritty way. Whether one agrees with the results, it is up to the reader. I could say that Dornbush was the best of the three placements, but that may not show up in the CoP scores. Generally speaking the closesness of the 6.0 system and the CoP are neck and neck.

Since I do not check the protocols very often, I do not know for sure the real winner, I just accept what the results are for the CoP. For example, the consensus of this Forum was that Nagasu should have placed second. This may have been true under 6.0 system, but CoP said NO, Flatt was second.
Huh. Random selection of judges used to bother me a lot. Up until a year ago the protocols would list the judges in a regular order, which made it possible for obsessed math geeks to analyze individual judges' scores. I know I did some work on it, and I saw some more articles and posts on the subject on the net, which might be why the ISU decided last year to start listing the judges in a random order (a different order for each skater in the protocol), making any comparison impossible.

Anyway, when it was still possible I ran some scenarios based on the results of a few close calls, such as the results of the World Team Trophy 2009 (a smaller field is easier to analyze). I calculated the possible outcome of any random judging panel (36 possible panels for the free programs) and found that for the men's event only 6 of the 36 possible judging panels resulted in the actual result for the top 4, and for the ladies' event it was 17.
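
A sketch of that kind of analysis, with invented scores (annamac's actual data isn't shown here): enumerate all C(9,7) = 36 possible panels of 7 scoring judges out of 9 and see which skater wins under each.

```python
from itertools import combinations

# Hypothetical marks from 9 judges for two skaters, A and B.
scores = {
    1: (8.00, 7.75), 2: (7.75, 8.00), 3: (8.00, 7.75),
    4: (7.50, 8.00), 5: (8.25, 7.50), 6: (7.75, 8.00),
    7: (8.00, 8.00), 8: (7.50, 7.75), 9: (8.25, 7.75),
}

panels = list(combinations(scores, 7))   # every possible 7-judge panel
print(len(panels))                       # 36

winners = set()
for panel in panels:
    a = sum(scores[j][0] for j in panel)
    b = sum(scores[j][1] for j in panel)
    winners.add("A" if a > b else "B" if b > a else "tie")
print(winners)   # in a close field, more than one outcome is possible
```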

The difference between a random draw of the judges (determining which judges are on the panel) and the random selection of marks among the judges actually sitting on the panel is that with a random draw we do not know how it could have turned out -- we cannot imagine what another judge, who was not on the panel, would have scored. With random selection we KNOW what the discarded judges scored, and we DO know how it would have turned out had all the judges' marks been taken into consideration.

As for the judges' marks being scrutinized - do you really think any judge is required to explain why he marked something a +2 instead of +1, why he gave Alex an 8.5 for PE and Jonas a 7.75? As long as they don't put a +3 instead of a -3, a 4.50 instead of 7.50, I don't think anyone cares.

Anna

11. 0
^^^^^
Super post!!! I learned something about Judging setups.

I really liked your last paragraph on marks being scrutinized. I agree, if the judges are careful, no one cares about the small differential scores, but they can add up to produce a favorable result for a skater.

12. 0
Originally Posted by ivy
One of the things that bugs me about the COP is how few scores are actually used, versus how many are given. So my understanding is there are 9 judges, but only 5 scores are really used - just a little over half. First 2 judges scores are just randomly eliminated, then the high and low scores of the remaining are eliminated and then the rest are averaged.

The part that seems oddest to me is the eliminating of the two, because that is random, and in a close competition, which judges are eliminated could make the difference between 1st and 2nd or between on the podium or off. That seems to be introducing an element of luck into a system that is trying to be more standardized. We have 9 experts there - why not use all of their scores? And are the same judges eliminated in for all skaters in the event - or is re-randomized for each skater? Or even each element?
Yes, it is fair. In a subjective sport it is really the fairest way. If one judge is giving Rachel Flatt all 10s and +3s and it is totally out of line with what was presented, that judge should be dropped as being too high; the same would be true if a judge gave Yuna Kim all -2s with 5.5s. Yes, they are all experts, but there is something to be said for being way too high or low. My biggest issue is they are not given enough time to accurately assess what they saw; they score too fast, and in a sport where 1 point makes the difference between a medal and no medal, more care should be taken. It would seem that the more people you have evaluating something, the more accurate the assessment, but that's not true: you have to take out the ones who swing the pendulum too far one way or the other and skew the results so much the wrong person/s win.

Originally Posted by annamac
As for the judges' marks being scrutinized - do you really think any judge is required to explain why he marked something a +2 instead of +1, why he gave Alex an 8.5 for PE and Jonas a 7.75? As long as they don't put a +3 instead of a -3, a 4.50 instead of 7.50, I don't think anyone cares.
Well, maybe not if it's only a +1 vs. a -1, but if a judge scores someone like S/S below a 5.00 when the program was perfect, or the same judge gives them a -3 on a jump that was fine or a +3 on a jump they fell on, you bet someone in the ISU will take issue with the scores and not just "let it go" (and you don't want to make Ingo mad).

I love how some think judges are evil, manipulative slime who want nothing more than to "fix" the outcome or screw the other countries; it's simply not true. I think there have been too many vocal minorities in the judging pool who don't speak for the other couple hundred.

13. 0
Originally Posted by ivy
The part that seems oddest to me is the eliminating of the two, because that is random, and in a close competition, which judges are eliminated could make the difference between 1st and 2nd or between on the podium or off. That seems to be introducing an element of luck into a system that is trying to be more standardized. We have 9 experts there - why not use all of their scores? And are the same judges eliminated in for all skaters in the event - or is re-randomized for each skater? Or even each element?
Originally Posted by mousepotato
Yes, it is fair. In a subjective sport it is really the fairest way. If one judge is giving Rachel Flatt all 10s and +3s and it is totally out of line with what was presented, that judge should be dropped as being too high; the same would be true if a judge gave Yuna Kim all -2s with 5.5s.
You're talking about two different things. The random selection ivy brings up has nothing to do with judges scoring skaters too high or too low.

1) Random selection. Extra judges are assigned to the panel, and the computer randomly picks some of these judges to have none of their scores count for the whole competition.

The point of this, combined with anonymity, was to confuse federations, etc., who were trying to pressure judges to judge in certain ways and to check up on whether they did. But it didn't actually help that purpose very well, and paying for extra judges to sit on the panels and not contribute to the results was too expensive.

So this is not done any more. No need to worry about random selection any more. We can still worry about the anonymity and the mixing up of the order of judges' score columns between each skater, because that is still done in international senior events. (It's not done in all IJS events.)

When this was done, it was random. So if there was a judge who was extremely biased or actively trying to skew the results, the random selection might eliminate all of their scores from the calculations, or it might keep them on the smaller subset of the panel whose scores actually counted. There was a chance that they would have zero effect on the results. But there was also a larger chance that an honest judge would be randomly dropped and the biased judge would have a larger effect on the results than if all the judges sitting and scoring were actually used.

2) Trimmed mean. For each element GOE and for each component, the highest and lowest mark is dropped. This is not random. Which judges' scores get dropped changes for each element and for each component depending on which judges happened to be highest or lowest for that particular element or component. Often there are several judges who gave the same highest mark or the same lowest mark, so one of them gets dropped from the averaging but it's meaningless to say which of the two enthusiastic judges who gave +3 for the same spin (for example) was the one whose score was dropped.

The trimming will take out outlying scores that result from judges seeing things differently from everyone else on the panel (probably a mistake by the outlying judge, but possibly s/he saw a detail that everyone else missed) or from judges making data entry errors. (This is a good thing, except in the rare cases when only a lone judge saw the element correctly.)

It will also often take out scores from judges who consistently score higher or lower than the rest of the panel for all the skaters, or who use a wider range of numbers to reflect the differences they perceive between the skaters. (This is probably a bad thing, assuming the judges are honest and competent but just happen to use numbers a little differently than the majority.)

If a cheating judge is way out of line consistently, giving a pretty-good skater stellar marks or a stellar skater mediocre marks, that would get all or most of their scores trimmed for those skaters. (Good thing, but it still means another judge whose scores for that skater/element/component were similarly high or low will not get dropped, as would have been the case if the cheating judge had judged honestly.) They might do a good job with the rest of the field and not get trimmed any more than the other judges for the other skaters.

A judge who is actively trying to manipulate the results in favor of a specific skater can probably adjust their marks to give little boosts here and there to their favorites and little dings here and there to the favorite's rivals without getting most of their scores dropped. They can't guarantee a win for their favorite, but they can nudge the averages in the direction they want, which in close contests can definitely make a difference.

If a judge is just incompetent and their scores are too high or too low or all over the place in inconsistent ways because they just don't have enough knowledge of what they're seeing and what the scoring rules are compared to the rest of the panel, then their scores will often be dropped. (Good thing.)

If many of a judge's scores are consistently out of line over several competitions, for whatever reason, there's a process for identifying those judges. Again, this might happen because the judge is cheating or strongly biased without realizing it, because the judge is incompetent, or because a competent judge just happens to use a higher or lower or wider scale than the rest of the panel. The latter will be able to give good explanations when flagged. If the judge gets several assessments and can't defend their decisions, they can be demoted. So it's in the best interest of judges who aren't as competent as their peers to improve their competence.

And it's in the best interest of judges who are trying to manipulate results to do it as subtly as possible so they don't get caught.

The extreme examples mousepotato gives are not likely for that reason.

14. 0
Thanks again for an excellent discussion that really helped me. I didn't pay quite so much attention to skating for the last couple seasons, and it's surprising how many details changed. Glad to know that the random dropping of scores is gone. Of course there is still some 'luck' in which judges and tech callers are assigned for any given competition, but that seems understandable and inevitable as long as humans are involved in judging skating - something I hope we never lose (a computer handing out PCS and GOEs seems like it would probably ruin everything I love about skating).

As gkelly points out, it was the random dropping of scores that seemed odd to me, not the elimination of the high and low scores. For close competitions, like Jeremy and Ross at Nat'ls, that kind of arbitrary dropping could have determined who made the Worlds team, though to Anna's point, it probably wouldn't have made a difference. Still, for such an important call, any sense of arbitrariness seemed off to me. Now I have one less thing to worry about.

And to mousepotato's closing concern - I never was trying to impugn the integrity of the judges, just wondering about the structure of the system itself.

15. 0
Originally Posted by ivy
And to mousepotato's closing concern - I never was trying to impugn the integrity of the judges, just wondering about the structure of the system itself.
Sorry, that wasn't directed at you...just a statement in general. I just didn't open a new topic.

NEW TOPIC.

With all this talk about anonymity, it's really only the public who doesn't know who the judges are; the ISU is well aware of who gave what score.
