Scoring bias at the national level

Jumping_Bean · Feb 5, 2024

snowed said:
I'm newish on the forum but in my opinion, a fair amount of complaining comes from biased fans or fans that don't know or ignore the rules on purpose....
There was a thread about improving judging and it went very quickly towards AI judging.
I'm not saying that skating judging is perfect but, personally, I don't really see how to practically improve it (except to add a second camera for tech reviews for edges and underrotation).
The next problem I see is that judging is difficult to understand; the ISU should find ways to explain the rules to the fans.
Us (or judges) talking about differences in judges' scores without analyzing a video of the program and following the rules doesn't give much information.

Yes, AI judging is not really the answer, as any AI is only as good as the sample set it has been trained on.

Adding additional review cameras is a necessary improvement, but improvements also need to be made in the review process of judges' performances, and more concrete consequences have to be introduced. In most cases (including the case that started this whole thread), the judges who can be proven to have acted against the rules get away with barely more than a slap on the wrist. Even judges who have been suspended for some time can return without any issues afterward and don't seem to be under closer supervision than any other judges. That is very troubling to me.

The ISU has judging seminars and online materials that go into detail more closely and provide examples to explain the rules and guidelines better, but to what degree this content is helpful, I am not sure, as I have no extensive personal experience. It's very possible improvements have to be made here too, and certainly, all of these materials should be publicly available.

The funniest (and simultaneously saddest) experience I've heard shared by people who took part in an ISU judging seminar was that Sasha Trusova's FS at Worlds 2021 was used as an example of a program deserving 4s to 5s in transitions when she actually received marks in the 7s and 8s from all judges - and that with two falls. None of the judges were publicly reprimanded, nor was the issue even acknowledged; all of this happened behind closed doors, and the same judges happily kept and keep on judging away.

Magill · Feb 5, 2024

snowed said:
I'm newish on the forum but in my opinion, a fair amount of complaining comes from biased fans or fans that don't know or ignore the rules on purpose....
There was a thread about improving judging and it went very quickly towards AI judging.
I'm not saying that skating judging is perfect but, personally, I don't really see how to practically improve it (except to add a second camera for tech reviews for edges and underrotation).
The next problem I see is that judging is difficult to understand; the ISU should find ways to explain the rules to the fans.
Us (or judges) talking about differences in judges' scores without analyzing a video of the program and following the rules doesn't give much information.

And yet a 10 for Ilia Malinin in PCS at, I think, Skate America was hard to swallow even for his fans. It was not complained about. It was openly and outright laughed at and joked about. Even his hardcore fans were sort of surprised by this score. If you don't believe me, go and check the competition threads. Even in this forum, people were asking jokingly what the judge was smoking. And, yes, it is hard to believe it was given in good faith and just within the scope of "subjective variance". Unless you think the judge was completely incompetent.
As far as I remember, the judge was Japanese so you cannot even speak about some national bias. So what was it? Hard to speculate but surely not something that would actually make the audience trust the scoring system and the judges' integrity and good faith.
Did it get any official response? Not that I know of.

Baron Vladimir · Feb 5, 2024

Jumping_Bean said:
The funniest (and saddest) experience I've heard shared by people who took part in an ISU judging seminar was that Sasha Trusova's FS at Worlds 2021 was used as an example of a program deserving 4s to 5s in transitions when she actually received marks in the 7s and 8s from all judges - and that with two falls. None of the judges were publicly reprimanded, nor was the issue even acknowledged; all of this happened behind closed doors, and the same judges happily kept and keep on judging away.

It was explained by the officials to be likely placed in the 'Green category', so up to 7 was also accessible :biggrin:

E: From 5 to 7 is 'green'... However, the problem is that you are not judging just one category, but 5 of them, so for example Trusova's speed and ice coverage pushed up her transition score instead of pushing up her CO and SS scores. So, the problem was that many of the components interfere with each other in the judges' minds at the time of judging... That's why we got 3 components, but I'm only afraid that with fewer components, the difference between them will be even smaller, and the feedback for the skaters (through the numbers the judges are giving) will be less useful...

Mathman · Feb 5, 2024

Jumping_Bean said:
The funniest (and simultaneously saddest) experience I've heard shared by people who took part in an ISU judging seminar was that Sasha Trusova's FS at Worlds 2021 was used as an example of a program deserving 4s to 5s in transitions when she actually received marks in the 7s and 8s from all judges...

That reminds me of 2010 Europeans, where Brian Joubert got higher marks in Transitions than Evgeny Plushenko. Plushenko complained, 'Hey, no fair, Joubert and I did exactly the same transitions -- none.'

The upshot? Judge Joe Inman picked up on this and wrote a blistering email to everybody in the ISU saying, 'See, I told you so. You are not scoring Transitions according to the rules.' Guess who was raked over the coals? Inman. People blamed his whistle-blowing for shaming the judges into placing Lysacek ahead of Plushenko at the Olympics that year.

Skating judges, again, scrutinized

VANCOUVER, B.C. — Comments by Olympic figure skating champion Evgeni Plushenko of Russia and an Olympic-level American judge have again raised questions about how the sport is judged and has created another skating controversy on the…

www.ocregister.com

Anyway, the simple truth is that figure skating judges do not give bad component scores to world championship caliber skaters like Plushenko or Trusova no matter what they do or don't do. Maybe that's why they got rid of Transitions -- it was too easy to see that skaters were getting high scores for nothing, unlike composition and musical interpretation where it is not so obvious.

Edit: It is hard to fault the ISU judges on this issue, though. On components, there is a range of values that are tacitly reserved for different levels of competition: 4s & 5s for novices, 6s & 7s for juniors, 7s & 8s for seniors, 9s for the top medal contenders. Does this seem a lot like "reputation judging"? It does. But without some sort of guidance it would be impossibly unwieldy to have a single continuous scale that encompasses all levels of skating, beginners through Olympians.

gkelly · Feb 5, 2024

Mathman said:
Edit: It is hard to fault the ISU judges on this issue, though. On components, there is a range of values that are tacitly reserved for different levels of competition: 4s & 5s for novices, 6s & 7s for juniors, 7s & 8s for seniors, 9s for the top medal contenders. Does this seem a lot like "reputation judging"? It does. But without some sort of guidance it would be impossibly unwieldy to have a single continuous scale that encompasses all levels of skating, beginners through Olympians.

Those numerical scores are not officially reserved for specific competition levels. (Yes, I know you said "tacitly.")

The numbers are defined in terms of adjectives such as Poor, Fair, Average, Above Average, Good, Outstanding.

Or, more broadly, the colors red, orange, green, gold, platinum.

If you look at a large JGP competition, for example, all competitors will be competing at the junior level (although a handful maybe shouldn't be). You might see PCS inching into the 8s for a really stellar junior performance, or as low as 3s for weaker juniors. Or lower than that for skaters who are not really at junior skill level.

At the senior level, e.g., senior B competitions or Euros or 4Continents, you might see occasional 4s for the weaker seniors up to 9s for top elite competitors.

The tricky part is if you look at a skater who has strong skating skills (worthy of 7s or 8s -- and maybe earning 9s from judges who are especially impressed by successful difficult jumps, speed/power more than complexity, and possibly the skater's nationality) who does not use that power to execute varied or complex skating between the elements.

So if the transitional moves are minimal and simple, but performed at great speed, how should that be scored relative to skaters with average, fair, or poor skating skills performing similarly simple skating?

I would guess this has been a topic of discussion at judging seminars and roundtables, especially when there was a dedicated component for Transitions. Even if everyone agrees that the Transitions score should be lower than the Skating Skills score, judges might disagree about how much lower. 0.25 or 0.5? 1.0? 3.0 or 4.0?

Part of the differences among judges may have to do with how different individuals are neurologically wired to process numbers. Another part, especially when we're talking about new International and not already-ISU-level judges, might have to do with the range of skating ability they have the most experience judging.

Is it possible to set firm benchmarks that everyone can agree on about an appropriate Transitions score for an empty program by a strong skater vs. an averagely-constructed program by an average skater vs. an empty program by an average skater, etc.?

I think what happens is that the ideal is put out in the official documents in relatively general terms, and then the specifics narrow in through each judge's experience and interacting (outside of ongoing competition) with other judges to share impressions, and the most experienced judges sharing their experience with the newer ones. And also probably with the experienced judges talking through what they've been seeing and discussing how to get everyone closer to the same page in understanding what skill level and content correlates with what numerical score.

And changing the wording of the official guidelines, or even changing the rules (e.g., 3 components instead of 5) in hopes of getting better agreement on how to apply the guidelines.

I'm not sure that removing the Transitions score, aside from reflecting those moves under Composition, Skating Skills, or element GOEs where relevant, was necessarily the best way to improve judging of that aspect. Or that combining Performance/Execution and Interpretation was necessarily better than evaluating those aspects separately. But that's what we've got now, so that's what judges have to figure out how to evaluate and assign numbers to and come to some agreement on what counts as average, good, or outstanding for these broader component categories.

It's always going to be a work in progress, both for individual judges and for the international judging corps as a whole (or domestic judges in federations around the world). Even if they discuss and discuss and refine guidelines and come to agreements that everyone agrees that X performance by Skater Y deserved 5.0 or 10.0 for a specific component, that doesn't nail down exactly what score to give next year's performance by a different skater that was equally excellent (or equally average) in different ways.

The skating itself is always a moving target, so we can't really expect the judging to reach a point where it is fixed and perfect.

Magill · Feb 5, 2024

Mathman said:
That reminds me of 2010 Europeans, where Brian Joubert got higher marks in Transitions than Evgeny Plushenko. Plushenko complained, 'Hey, no fair, Joubert and I did exactly the same transitions -- none.'

The upshot? Judge Joe Inman picked up on this and wrote a blistering email to everybody in the ISU saying, 'See, I told you so. You are not scoring Transitions according to the rules.' Guess who was raked over the coals? Inman. People blamed his whistle-blowing for shaming the judges into placing Lysacek ahead of Plushenko at the Olympics that year.

Skating judges, again, scrutinized

VANCOUVER, B.C. — Comments by Olympic figure skating champion Evgeni Plushenko of Russia and an Olympic-level American judge have again raised questions about how the sport is judged and has created another skating controversy on the…

www.ocregister.com

Anyway, the simple truth is that figure skating judges do not give bad component scores to world championship caliber skaters like Plushenko or Trusova no matter what they do or don't do. Maybe that's why they got rid of Transitions -- it was too easy to see that skaters were getting high scores for nothing, unlike composition and musical interpretation where it is not so obvious.

Edit: It is hard to fault the ISU judges on this issue, though. On components, there is a range of values that are tacitly reserved for different levels of competition: 4s & 5s for novices, 6s & 7s for juniors, 7s & 8s for seniors, 9s for the top medal contenders. Does this seem a lot like "reputation judging"? It does. But without some sort of guidance it would be impossibly unwieldy to have a single continuous scale that encompasses all levels of skating, beginners through Olympians.

Not speaking about Plushenko, but without these artificially inflated PCSs some skaters would not have been the "world championship caliber skaters", so, you know... it is justifying the cause by its effect ...and it is against the rules anyway...

Mathman · Feb 5, 2024

gkelly said:
Part of the differences among judges may have to do with how different individuals are neurologically wired to process numbers.

That is an intriguing thought. Can you explain it a little bit more?

(And thank you for the insightful post.)

Mathman · Feb 5, 2024

Magill said:
Not speaking about Plushenko, but without these artificially inflated PCSs some skaters would not have been the "world championship caliber skaters", so, you know... it is justifying the cause by its effect ...and it is against the rules anyway...

For most skaters I would say that "inflated PCSs" and "perceived world caliber status" go in tandem, each influencing the other. There are a few skaters -- maybe Nathan Chen and Alexandra Trusova -- who start out quadding their way to prominence first, PCSs following. It's hard to say, though. Trusova never made it to the top echelons in PCS, perhaps because she suffered by comparison with some of the other Russian ladies in her class, like Shcherbakova.

Chen, on the other hand, always had good blade-to-ice skills, in addition to his jumps, which made it easier for judges to give him high component scores all the way around.

And of course sometimes skaters get higher scores this year than last because they worked on their weaknesses and got better.

icewhite · Feb 6, 2024

Jumping_Bean said:
Of course, it's unavoidable, but that doesn't mean we shouldn't be striving to minimise the degree of bias, or else we can just give up on the scoring system as a whole and just drop the pretense of this being a sport.

There's a normal degree of variance, and an abnormal one. If we don't allow ourselves to discuss where those borders lie and if we shy away from becoming conscious of our own biases and those of others, there can be no progress in the direction of a fairer sport, as fair as it can be. I guess that's fine, but then I also don't want to see any more complaining about unfair judging or about why figure skating is so unpopular.

Love and agree with all of your posts on this topic.

Magill · Feb 6, 2024

I must admit the mental flexibility required to grasp when one is expected to just accept the scores cause "well, these are the rules" and when they are expected to do the same under the "well, judges are human" excuse is getting well beyond me :scratch2:

Mathman · Feb 6, 2024

Magill said:
I must admit the mental flexibility required to grasp when one is expected to just accept the scores cause "well, these are the rules" and when they are expected to do the same under the "well, judges are human" excuse is getting well beyond me

For me it’s not mental flexibility so much as a feeling that the problems that plague figure skating judging are overblown. I just cannot work up the requisite righteous indignation.

I do not, in fact, have the mental flexibility to suggest a convincing remedy. Everything I think of, in terms of possible changes to the scoring system or to the judges’ evaluation protocols, sounds good for two minutes until I consider its drawbacks.

gkelly · Feb 6, 2024

I think it does require mental flexibility to evaluate (judge) figure skating, and therefore to evaluate the judging.

snowed · Feb 6, 2024

Jumping_Bean said:
In most cases (including the case that started this whole thread), the judges who can be proven to have acted against the rules get away with barely more than a slap on the wrist.

The funniest (and simultaneously saddest) experience I've heard shared by people who took part in an ISU judging seminar was that Sasha Trusova's FS at Worlds 2021 was used as an example of a program deserving 4s to 5s in transitions when she actually received marks in the 7s and 8s from all judges - and that with two falls. None of the judges were publicly reprimanded, nor was the issue even acknowledged; all of this happened behind closed doors, and the same judges happily kept and keep on judging away.

This is exactly where I'm conflicted: the judge that started this thread was proven, in my eyes, to have judged outside the corridor, not against the rules. No rules (scores for specific elements or components) were discussed in the ruling. Explain to him, and to us the fans, why his marks are not OK.

So if you were a judge, would you give 4 to Trusova, knowing that her "corridor" in the season is 7?

I so agree that judging can be improved and as fans we should challenge it. But fans keep attacking the judges, and I think it is the "judging system" that should be improved so it can allow the judges to judge better.

It should give more tools (extra camera for tech panel), maybe give more time or less to do for different judges, and the rules should be more clear (especially the components) and more transparent (yes, publish the education materials, have judges in the press conference after the competition explaining the scores).

icewhite · Feb 6, 2024

snowed said:
This is exactly where I'm conflicted: the judge that started this thread was proven, in my eyes, to have judged outside the corridor, not against the rules. No rules (scores for specific elements or components) were discussed in the ruling. Explain to him, and to us the fans, why his marks are not OK.

So if you were a judge, would you give 4 to Trusova, knowing that her "corridor" in the season is 7?

I so agree that judging can be improved and as fans we should challenge it. But fans keep attacking the judges, and I think it is the "judging system" that should be improved so it can allow the judges to judge better.

It should give more tools (extra camera for tech panel), maybe give more time or less to do for different judges, and the rules should be more clear (especially the components) and more transparent (yes, publish the education materials, have judges in the press conference after the competition explaining the scores).

Well, I absolutely agree that just judging inside the corridor alone says nothing about how correct it is, but there is a difference if judge icewhite from Germany gives Trusova 5s in PCS and Mikutina 9s in PCS while everyone else gives Trusova 9s and Mikutina 7s, or if judge icewhite from Germany gives Hase/Volodin a 2 on their twist and a 0 on their throw while everyone else gives them -1 and -3, gives them higher PCS than everyone else, and is not as generous with the other teams, no?

Mathman · Feb 6, 2024

snowed said:
This is exactly where I'm conflicted: the judge that started this thread was proven, in my eyes, to have judged outside the corridor, not against the rules. No rules (scores for specific elements or components) were discussed in the ruling.

I got exactly the opposite impression. The judge's marks were actually not outside the corridor and were not flagged as such by the ISU judges' oversight apparatus. The one rule that was broken was a component score of 9.50 given to a performance that had a fall. This brought the complaint not to the corridor monitors but to the ISU Technical Committee and from there to the Ethics Committee, which investigates dishonesty and dishonorable activity.

gkelly · Feb 6, 2024

If most judges on the panel gave skater A an average of 7.0 for all 5 former components,

Judge X who gave that skater scores of 7.5 6.75 7.0 6.75 7.0 would be right inside the corridor.

But Judge Y who gave 8.0 5.0 7.5 7.0 7.5 would also be right inside as well. Same total, different distribution.

And there was a leeway of up to 2.0 total IIRC, so even 7.5 5.0 7.0 6.5 7.0 would not get flagged as anomalous.

All the more reason to encourage judges to separate the different components if they thought there was a significant difference between, in this case, the skating skill and the transitions and program construction. But most judges were not brave enough to go there.
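
To make the arithmetic above concrete, here is a minimal Python sketch. It assumes, as recalled in this post and not taken from any official ISU document, that the check compares the sum of a judge's five component marks with the sum of the panel averages and allows a total deviation of up to 2.0. The names `PANEL_AVERAGE`, `LEEWAY`, `total_deviation`, and `inside_corridor` are made up for illustration.

```python
# Hypothetical sketch of a total-based corridor check.
# Assumptions (from the post, not from ISU rules): panel average of 7.0
# per component and a total leeway of 2.0 across all five components.
PANEL_AVERAGE = 7.0
LEEWAY = 2.0


def total_deviation(marks, panel_average=PANEL_AVERAGE):
    """Signed difference between the judge's summed marks and the summed panel averages."""
    return sum(marks) - panel_average * len(marks)


def inside_corridor(marks, panel_average=PANEL_AVERAGE, leeway=LEEWAY):
    """True if the judge's total differs from the panel total by no more than the leeway."""
    return abs(total_deviation(marks, panel_average)) <= leeway


# The three sets of marks mentioned in the post.
judges = {
    "Judge X": [7.5, 6.75, 7.0, 6.75, 7.0],
    "Judge Y": [8.0, 5.0, 7.5, 7.0, 7.5],
    "Lower-total judge": [7.5, 5.0, 7.0, 6.5, 7.0],
}

for name, marks in judges.items():
    print(name, total_deviation(marks), inside_corridor(marks))
```

Under these assumptions all three sets of marks pass, including the two with a 5.0 in the second slot. A check based only on totals cannot see how a judge distributes marks across the components.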

Jumping_Bean · Feb 6, 2024

snowed said:
This is exactly where I'm conflicted: the judge that started this thread was proven, in my eyes, to have judged outside the corridor, not against the rules.

Except that that is specifically what did not happen in this case. At multiple points during the explanation in the document, you can find the very specific wording of "While his marks were in the corridor, his marks were overall higher" and it even outright states "A Judge's marks need not be outside the corridor before they can be considered as bias." Doug Williams was found guilty not of judging outside of the corridor, but of a violation of the duties of judges and the ISU Code of Ethics.

For those of you who are interested in how and why his scores got flagged despite that, here's a document explaining the work of the Officials Assessment Commission. Very fascinating read, in my opinion.

Baron Vladimir · Feb 6, 2024

gkelly said:
If most judges on the panel gave skater A an average of 7.0 for all 5 former components,

Judge X who gave that skater scores of 7.5 6.75 7.0 6.75 7.0 would be right inside the corridor.

But Judge Y who gave 8.0 5.0 7.5 7.0 7.5 would also be right inside as well. Same total, different distribution.

And there was a leeway of up to 2.0 total IIRC, so even 7.5 5.0 7.0 6.5 7.0 would not get flagged as anomalous.

All the more reason to encourage judges to separate the different components if they thought there was a significant difference between, in this case, the skating skill and the transitions and program construction. But most judges were not brave enough to go there.

Yes

For example, in Trusova's Worlds 2021 performance, if you judged her component scores as 9.00/7.00/8.25/8.50/8.25, all of your scores would be an integral part of the final judges' score, even though the majority of judges gave her 8.5/7.75 in the first two components... (it is hard to go below 6.5 for components on this particular occasion when the lowest score for TR at the whole competition was 6.25)...

Mathman · Feb 6, 2024

gkelly said:
All the more reason to encourage judges to separate the different components if they thought there was a significant difference between, in this case, the skating skill and the transitions and program construction. But most judges were not brave enough to go there.

I think that the whole business of "program components" is fighting with itself. Are we talking about the program taken as a whole (like the old second mark in 6.0) or are we talking about "components" of the program (as in, a triple Lutz is one component on the tech side)?

I think an argument could be made for having only two components: the technical aspects of the program as a whole (Skating Skills including the old Transitions) and the artistic/performance aspects of the program as a whole.

Baron Vladimir · Feb 6, 2024

Mathman said:
I think that the whole business of "program components" is fighting with itself. Are we talking about the program taken as a whole (like the old second mark in 6.0) or are we talking about "components" of the program (as in, a triple Lutz is one component on the tech side)?

I think an argument could be made for having only two components: the technical aspects of the program as a whole (Skating Skills including the old Transitions) and the artistic/performance aspects of the program as a whole.

I think the intention of changing how the components are judged is good, but the problem is how that idea is translated into common practice... For example, I would propose for components to be judged as, first, related to the ice (with your blades); second, related to the public (audience/judges) and the public space (ice rink); and third, related to the music and overall storytelling. But the question is how to make that idea work best in everyday practice.
