So will the 2018 Olympics be the Revenge of the Russians or the Return of (South) Korea?
I call it "Rise of Russian Empire" though.
How interesting. To me it is clear that skater A is the rightful winner.
Yes, I was assuming that the judges' scores lined up in the two rows. In other words, that the judges who were most enthusiastic about skater B were the same ones who didn't like skater A at all.
It is quite easy to raise the same issue with numbers that do match actual results.
Skater A: 8.00 8.00 8.00 8.00 8.00 8.25 8.25 8.50 8.00
Skater B: 7.75 7.75 7.75 7.75 7.75 7.75 7.75 8.00 7.50
Same question. Skater A was preferred by a majority of judges. Skater B got the most total points. In the best of all possible judging systems, who deserves to win?
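The general phenomenon being argued about here is easy to see with a quick calculation. The marks below are invented for illustration (they are not the scores quoted above, and they assume the judge columns line up): a bare majority mildly prefers A, a minority strongly prefers B, and the minority carries the point total.

```python
# Hypothetical, illustrative marks for a nine-judge panel (columns = judges).
# Five judges mildly prefer A; four judges strongly prefer B.
a_marks = [8.00, 8.00, 8.00, 8.00, 8.00, 7.00, 7.00, 7.00, 7.00]
b_marks = [7.75, 7.75, 7.75, 7.75, 7.75, 8.75, 8.75, 8.75, 8.75]

# How many judges scored A above B?
majority_for_a = sum(a > b for a, b in zip(a_marks, b_marks))
print(majority_for_a, "of", len(a_marks), "judges scored A higher")  # 5 of 9

# But the point totals go the other way.
print("total A:", sum(a_marks))  # 68.0
print("total B:", sum(b_marks))  # 73.75
```

So under a point-total system, an enthusiastic minority can outweigh a complacent majority, which is exactly the question posed above.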
Mathman, that (one mark from each judge for each skater) is not how any system used by the ISU ever worked, so there's not much point in arguing what a bad system it is.
However, I wouldn't assume that a minority of judges who disagree slightly or significantly with the majority are necessarily cheating, independently or in conspiracy. Other possibilities are honest difference of opinion between experts, general incompetence on the part of some less expert judges, and honest mistakes (e.g., data input error) regardless of competence.
But in most cases, the technical content the skater actually completes, together with the variation in how all the judges on the panel use numbers, will tend to swamp the variance introduced by just two judges colluding.
Do we want to look at how the numbers work with plausible examples, for honest differences of opinions first and then consider cheating as a special case afterward?
Mathman, again, the numbers you're giving do not represent judges' choices of who was better overall and deserved to win.
It is not true that Skater A was the choice to win of 6 out of 9 judges. They represent 6 judges' evaluation of Skater A's performance on one program component.
If you prefer, we can stipulate that they are actually averages of all five components from each judge for each skater.
Let's also imagine that all judges gave approximately equal average GOEs to both skaters, and that as far as they can tell without knowing what levels were called and without having memorized the scale of values, the technical content was approximately equal, so the judges on both sides believe that the program component scores will probably determine the outcome.
The judges should know that their estimates of TES are likely to be off by a point or two, maybe more, precisely because they are not human calculators. If the 6 judges honestly believe that skater A was enough better than B on PCS that A should deserve to win even if B had higher levels on all spins and steps, and better GOEs on the elements where the GOEs have higher values, then they should reflect that by giving A significantly higher PCS.
If they don't think skater A was that much better on program components, then by giving A slightly higher component marks, they are not "choosing A" as the winner. They are simply reflecting that they thought A was slightly better on the program components.
If you like, we can say that they are choosing skater A as the winner of the program components, by a fairly slim margin.
If we're still discussing only honest judges, then it just so happens in this case that the minority of judges who prefer skater B on program components also happen to use wider numerical ranges. That may be because they feel more strongly about B's superiority. Or they could just be bolder in their use of numbers.
Neither camp is wrong about who is better in my honest judge scenario -- they just have different opinions.
Is it OK for the minority opinion to prevail in who "wins" the PCS? IMO, that depends on how good the reasons are for giving wider gaps in PCS.
Which we can't tell just by looking at the numbers.
All five program components tend to correlate with the first one that the judges write down, usually Skating Skills. If you take a judge's SS score and multiply it by five, in practice that's a pretty good estimate of the total program component score given by that judge.
If they were giving ordinals, yes. But if they were IJS judges they should do this: for each skater and for each component, a conscientious judge is expected to give the score that the skater deserves for that component under the IJS rules, regardless of whom the judge thinks deserves to win. This is the crucial difference between ordinal and point-total judging.
But you are right. No matter what scores are given we do not know flat out for sure who a particular judge thought ought to win.
To me, the point of this exercise is to postulate that 6 judges in fact did think that skater A performed better, and then to investigate how this might play out in the marks.
We can eliminate this assumption if you like, but then I don't understand what the question is.
Just chiming in to say I find this whole discussion fascinating. I'm intrigued by the possibility of certain judges using a wider range of marks. This could produce results skewed by the minority, as Mathman notes, even without dishonest collusion. The question is: is it "right" for Skater B to win because the judges who preferred him/her really preferred him/her?
Generally, I'm with Mathman: the judges that favoured Skater A didn't favour Skater A "less" or something; more likely, they were simply more conservative with their marks.
Perhaps there could be guidelines on how many skaters you need to put in the 5s range, 7s range, 9s range, etc. for each competition,
but even that runs into problems: what if someone suddenly performs very well, but you've "run out" of the 9s you can give in P/E?
But, given two skaters who are close enough in ability to engender honest disagreement, it would be very rare that some judges would have skater A somewhat higher on all five components, and the other judges would have skater B higher on all components, and not one of them would have at least one component a little higher for the other skater. Which is why it makes more sense to look at total PCS rather than just one component.
And, indeed, if they are approaching IJS scoring the way they're supposed to, they wouldn't be thinking in terms of who deserves to win at all. They'd just be scoring each component against their mental standard for that component.
People complain when judges bunch their PCS too tightly for each skater, if they appear to be trying to stay within the corridor more than to reflect real differences in the skating.
Judges are encouraged to "spread their marks."
So in theory, spreading marks as appropriate to reflect differences between components in the same skater and to reflect real differences between skaters is a better use of numbers. Dare we say that judges who use a wider spread of marks appropriately are better judges, or at least better at using the numbers the way they're meant to be used?
Spreading marks also gives a judge more control over the results than judges who choose narrower ranges have. Is it OK if the judges who have the strongest effect on the result are the judges with the strongest opinions?
it would be very rare that some judges would have skater A somewhat higher on all five components, and the other judges would have skater B higher on all components
Thanks to randomization of judges' scores, we do not know whether this is rare or common. My intuition is that it is not rare at all.
But anyway, now I am sorry that in illustrating the question I presented sample scores for only one component. This sent the discussion off on a tangent.
I guess that is what the whole controversy comes down to. What is the purpose of a sports competition? Is it to see which competitor outperformed the other, or is it to decide which competitor did a better job of conforming to an objective standard?
Staying in the corridor has nothing to do with bunching the PCS tightly for each skater. It has to do with being not too far off from the other judges for each component. If the other judges spread out their marks, you had better do so, too, or you risk being outside the corridor on some of them.
For each of the five (5) Program Components, the Judge's corridor will be based on 1.50 Deviation Points (15.0% of the maximum 10.0 points per Component) between the score of a Judge and the calculated Judges' average score for the same Component, i.e. in total 7.50 Deviation Points for the 5 Program Components. Plus and minus Deviation Points are subtracted.
Is this true? Do you mean that the ISU officially encourages judges to do this?
The scoring scale has to accommodate all skaters from beginners to world champions. There cannot be too much of a spread between the best skater in the world and the second best.
I don't think so. If the contest is close, the scores should be close together. If one skater is much better than the other then the scores should be farther apart.
Here are all the protocols for US Nationals. Scores are not anonymous or randomized -- judge #1 on the officials list is always judge #1 for all skaters, etc.
Last I heard that was also true for the JGP, if you think international events are a better example.
For any two skaters (not necessarily near the top or even adjacent in the standings) in any event can we find examples in which
1) a majority of judges thought that skater A was better than (or equal to) B on all 5 components
and
2) all the remaining judges thought that skater B was better than (or equal to) A on all 5 components.
I.e., not even one judge had one component reversed from their overall opinions of the two skaters' relative PCS quality. I'll allow ties on some of the components.
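The search criterion above is mechanical enough to sketch in code. This is a hypothetical helper (the function name and the input format, per-judge tuples of five component marks, are my own assumptions, not anything from a real protocol parser): it reports a "clean split" only if every judge is consistently on one skater's side across all five components, with a majority for A.

```python
def clean_split(a_scores, b_scores):
    """Check whether a panel splits cleanly on two skaters' PCS marks.

    a_scores, b_scores: parallel lists, one 5-tuple of component marks
    per judge. Returns True only if every judge has A >= B on all five
    components or B >= A on all five components (ties allowed), with a
    majority of judges on A's side.
    """
    pro_a = pro_b = 0
    for a, b in zip(a_scores, b_scores):
        if all(x >= y for x, y in zip(a, b)):
            pro_a += 1
        elif all(y >= x for x, y in zip(a, b)):
            pro_b += 1
        else:
            # This judge had at least one component reversed relative to
            # their overall preference, so the split is not clean.
            return False
    return pro_a > pro_b


# Hypothetical 9-judge panel: five judges have A at or above B on every
# component, four have B at or above A on every component.
panel_a = [(8.0,) * 5] * 5 + [(7.0,) * 5] * 4
panel_b = [(7.5,) * 5] * 9
print(clean_split(panel_a, panel_b))  # True
```

Running something like this over real protocols (where judge numbers are not randomized, as with the US Nationals protocols mentioned below) is how one would count how rare these clean splits actually are.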
ETA: In 24 head-to-head matchups among senior medalists in short and free programs for all disciplines, I found one example:
In the ladies' SP, 8 judges marked Gold higher than or equal to Edmunds in all components. Judge #1 marked Edmunds higher in all.
However, the task of the judges under IJS is not to rank the skaters, vote for which skater they thought performed best, or choose who they think should finish higher. Unlike under 6.0, they're just supposed to score each skater independently….
With IJS it's possible to score skaters who have no one to compete against. This won't happen in international competition, but it does happen at some club competitions or even at some national championships of smaller federations: one skater (usually male) or team enters an event, no one else enters, or one or two others enter and then withdraw.
See page 6 of ISU communication 163
But ultimately, components are just numbers. They are not--in fact, cannot be--objective standards. What happens in the end is ranking skaters, because that determines the medals/placements everyone cares about. I don't think most judges are capable of keeping an objective scale in their heads. They'll have to, at points, go, "Oh, I gave Edmunds 7.50, that means I need to give Gold 8.25 because she's better." If they don't do that, they'll run into fatigue from looking at so many competitors, and likely end up giving scores they don't truly believe in (I think this might be a factor in why people who don't make the final group are low-balled. They're superior to the group they're in, but judges aren't comfortable giving out sudden 9s when the best they've given so far is a 7.50. They don't "need" 9s to place the skater ahead. But by the end of the night, judges are comfortable giving out 9s, thus potentially "screwing over" the earlier skater).
I will work on this, too. I am most interested in examples where a majority of judges favored one skater pretty much down the line, but only by a small amount, while other judges liked the other skater consistently and by quite a bit. Maybe this almost never happens in the absence of collusion and bias.
I do not accuse the IJS judges of not doing their assigned task.
The question is, should the system be changed in view of the fact that it allows an enthusiastic minority to override a complacent majority?
The IJS is good in this setting. However, I believe that under 6.0 judging also it was possible for a lone skater to skate against a "gold standard, silver standard, or bronze standard" in the case of a boy who is the only skater entered. (Or he could skate against the girls.) The judges would decide whether he met the standard or not. (Sort of IJS in 6.0 clothing.)
I had in mind this kind of example. Judges 1, 2, 3, and 4 score program components SS and TR.
SS: 8.00 8.00 8.00 8.00
TR: 2.50 2.50 2.50 8.00
Judges 1, 2, and 3 have spread out their marks, but it is judge 4 who is “outside the corridor.”
Judges 1, 2, and 3 each have -1.375 deviation points over the two components (the TR panel average is 3.875), well within the corridor. Judge 4 has +4.125 deviation points over the two components, more than the 3.00 allowed, so this judge is in trouble. The question is not whether the scores in each column are spread out or close together; it is whether the score in each row is close to the average for that row or not.
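A quick sketch of that arithmetic, assuming (as the quoted rule suggests) that each judge's signed deviations from the simple panel average are summed across components and compared to 1.50 allowed deviation points per component:

```python
# Two-component example from the discussion: four judges, SS and TR.
marks = {
    "SS": [8.00, 8.00, 8.00, 8.00],
    "TR": [2.50, 2.50, 2.50, 8.00],
}
ALLOWED_PER_COMPONENT = 1.50  # from the quoted ISU corridor rule

n_judges = 4
totals = [0.0] * n_judges
for comp, scores in marks.items():
    avg = sum(scores) / len(scores)  # TR average works out to 3.875
    for j, s in enumerate(scores):
        # Signed deviations accumulate, so plus and minus offset each other.
        totals[j] += s - avg

allowed = ALLOWED_PER_COMPONENT * len(marks)  # 3.00 over two components
for j, dev in enumerate(totals, start=1):
    status = "outside the corridor" if abs(dev) > allowed else "OK"
    print(f"judge {j}: {dev:+.3f} deviation points -> {status}")
# Judges 1-3 come out at -1.375 (OK); judge 4 at +4.125 (outside).
```

Note this assumes deviations are measured against the plain panel average for each component; the exact ISU aggregation may differ in detail, but the shape of the calculation is the same.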
By the way, the fact that one has to go to such unrealistic extremes to create an example shows that it is almost impossible for any judge, however incompetent or biased, to get caught by the ISU judges’ oversight procedure.