Scoring bias at the national level

Baron Vladimir · Dec 28, 2023

Magill said:
I beg to differ on your opinion that the corridor does not help anyone to win or lose the competition. Of course, it does as most judges would rather avoid the necessity to justify their scores being too far off so they will simply comply with the corridor, whether they agree with it or not. Now, you say the corridor is based on the previous scoring practice. But these very scores mostly complied with the previously set corridor....
It further complicates things... and, IMHO, it also forms a basic methodological, or cognitive, error - you justify a fact by a factor depending on what you justify by this factor...

But judges don't know the other judges' scores or what the corridor is; it is all done at the moment of the competition... The only bias they may have is related to the skater's persona, nationality, choice of music etc. etc. :shrug:

Baron Vladimir · Dec 28, 2023

Mathman said:
I am not sure how you are using the word "corridor" here. In the ISU documents that define this term and illustrate the definition with examples, the corridor for each skater is defined by the marks that the judges actually give to that skater for that particular performance. "Assessments" are then made when scores fall outside this range.

It may be easy for a referee to make up a "corridor" (in the ordinary sense of this word, not what the ISU judging rules specify) out of thin air, but ... are you sure that referees actually do this instead of following the ISU's own rules?

The referee's range of marks is what I meant to say, not the statistical corridor you are referring to. And that is the first thing judges explain at the judges' meeting, not the corridor you are mentioning. The referee is not doing a statistical analysis of the scores, just noticing whether some score is way different from what (s)he observes as a corridor, and discussing that in front of the panel of judges. Not every higher or lower score is bias; it can be a mistake made in a rush, or that particular judge saw some special creation or a little something extra in one particular element which is not exactly written in the rules but really deserves a higher mark... The later analysis is made based on the judges' meeting and the referee's report...

Mathman · Dec 28, 2023

Baron Vladimir said:
But judges don't know the other judges' scores or what the corridor is; it is all done at the moment of the competition...

I think that this is the focal point of criticism. The judges don't know for sure what the other judges will do, but (1) they can make a pretty good guess based on how those skaters were scored at previous competitions, and (2) whatever the judge knows or guesses, that judge might be fearful of sticking his neck out even if he or she is particularly impressed by this performance and feels that it is better than performances that the skater has given before.

Either way, this might produce a dampening effect that leads to "reputation judging" or favoring the skaters that other judges have favored in the past.

(Since this is a new page -- at least on my computer display -- I will repeat the relevant ISU document. I just find it hard to believe that the referee all on his own authority is expected to jump up and tell the judges, "I just made up a corridor for this skater and I expect you to be in it or tell me the reason why!"

https://www.isu.org/figure-skating/rules/fsk-communications/31545-isu-communication-2583/file )

Baron Vladimir · Dec 28, 2023

Mathman said:
I think that this is the focal point of criticism. The judges don't know for sure what the other judges will do, but (1) they can make a pretty good guess based on how those skaters were scored at previous competitions, and (2) whatever the judge knows or guesses, that judge might be fearful of sticking his neck out even if he or she is particularly impressed by this performance and feels that it is better than performances that the skater has given before.

Either way, this might produce a dampening effect that leads to "reputation judging" or favoring the skaters that other judges had favored in the past.

(Since this is a new page -- at least on my computer display -- I will repeat the relevant ISU document. I just find it hard to believe that the referee all on his own authority is expected to jump up and tell the judges, "I just made up a corridor for this skater and I expect you to be in it or else, buddy!"

https://www.isu.org/figure-skating/rules/fsk-communications/31545-isu-communication-2583/file )

The referee expects the judge to explain his questionable score; that was my point. As I tried to explain above, being outside the other judges' and the referee's scores doesn't necessarily mean something is wrong; it can also mean progress. But if the judge can't explain what he meant to say with his score in terms of figure skating, that is the problem. And that is the biggest problem in this exact situation: the explanation he gave for his scores.

Andrea82 · Dec 28, 2023

Mathman said:
I think that this is the focal point of criticism. The judges don't know for sure what the other judges will do, but (1) they can make a pretty good guess based on how those skaters were scored at previous competitions, and (2) whatever the judge knows or guesses, that judge might be fearful of sticking his neck out even if he or she is particularly impressed by this performance and feels that it is better than performances that the skater has given before.

Either way, this might produce a dampening effect that leads to "reputation judging" or favoring the skaters that other judges have favored in the past.

(Since this is a new page -- at least on my computer display -- I will repeat the relevant ISU document. I just find it hard to believe that the referee all on his own authority is expected to jump up and tell the judges, "I just made up a corridor for this skater and I expect you to be in it or tell me the reason why!"

https://www.isu.org/figure-skating/rules/fsk-communications/31545-isu-communication-2583/file )

Referees mark the whole competition with precise marks, both GOEs and PCS.
At the end of the competition, there is the round table discussion (RTD) moderated by the referee. The RTD follows this format:

https://www.isu.org/figure-skating/rules/fsk-forms-reports-seminars/forms-reports/18449-structure-of-judges-round-table-discussion/file

Elements with divergence in marks between judges should be chosen by the referee for discussion.

At the end of the competition, referees have to fill in this form: Attachment to the Referee's report for International Competitions

Then for ISU events (from Junior GPs upwards), the Officials Assessment Commission (OAC) reviews the scores. They receive Excel sheets and other documents with all scores outside of the corridor highlighted, and they have to review them, assessing whether each one was an error or whether there can be a valid explanation for being outside the corridor.

As soon as possible after the conclusion of the respective ISU Event the assigned OAC members will receive the following evaluation materials:

● Electronic documents of the Grades of Execution (GOEs) of every element and the Program Component scores of all Judges.
● Electronic documents highlighting the cases of evaluation based on the criteria outlined under paragraph F) below;
● Excel sheets indicating cases of evaluation;
● Electronic documents of statistical grids highlighting cases of possible (national) bias which are based on a mathematical calculation of the percentage difference between each Judge’s total score for one Competitor (Single Skater, Pair, Ice Dance Couple, Synchronized Skating Team) and his total scores for the two Competitors who, in the official result of the respective segment, are placed immediately above and for the two Competitors placed immediately below that Competitor.
● Video recording of the competition;
● Other supplementary materials, as decided by the respective Technical Committee.
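
The "statistical grid" in that list can be pictured with a small sketch. The excerpt does not spell out the exact formula, so the percentage difference used here, (target - neighbour) / neighbour, is an assumption, and the function name and the scores are hypothetical, made up purely for illustration.

```python
def neighbour_pct_diffs(judge_totals, target, window=2):
    """Percentage difference between one judge's total for `target` and the
    same judge's totals for the competitors placed up to `window` places
    above and below in the official result.

    `judge_totals` is a list of (competitor, judge_total) pairs, ordered by
    official placement. The formula is an assumed one, not the ISU's.
    """
    names = [name for name, _ in judge_totals]
    i = names.index(target)
    target_score = judge_totals[i][1]
    lo = max(0, i - window)
    hi = min(len(judge_totals), i + window + 1)
    diffs = {}
    for j in range(lo, hi):
        if j == i:
            continue  # skip the target competitor itself
        name, score = judge_totals[j]
        diffs[name] = (target_score - score) / score * 100.0
    return diffs


# Hypothetical totals from a single judge, in official placement order.
hypothetical = [("A", 150.0), ("B", 140.0), ("C", 120.0), ("D", 100.0), ("E", 96.0)]

for name, pct in neighbour_pct_diffs(hypothetical, "C").items():
    print(f"C vs {name}: {pct:+.1f}%")
```

Comparing these figures with the same percentages computed from the panel's official scores would show whether one judge separated a skater from their neighbours more widely than the panel did.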

They can also evaluate, and flag as errors, scores that are not highlighted in the electronic documents but which they consider unjustifiable, because they may establish that the whole panel was wrong.

Mathman · Dec 28, 2023

What makes this case unusual -- or maybe it is not unusual? -- is that the errant judge's marks were not actually outside the corridor according to the procedures outlined in the Communication. So evidently the Assessment Committee takes into account other information culled from the event reports. In this case they seemed to spot an unacceptable pattern that raised their eyebrows even without referring to the corridor.

Did someone outside the official Assessment Committee file a complaint? Did the Assessment Committee use some other measure of deviation from expectation other than the criteria that are so carefully spelled out in the Communication -- like, for instance, do they make computations along the lines that were brought to light by the data displayed on the SkatingScores.com website that were analysed on this thread by Miller and by Snowed?

Baron Vladimir · Dec 28, 2023

You can't measure bias that easily. What SkatingScores.com is doing is a statistical representation of the thing they call national bias. That doesn't mean their definition of national bias is the only correct one, and, more importantly, it doesn't prove that a judge in their national bias chart is biased in a broader sense, or in a sense that deserves one or two years of suspension. By the same logic they could call it, for example, a national preferences chart. Any statistical data is of no use if it is not based on the observation of an individual person's behavior and the feedback he gives in that exact situation.

Mathman · Dec 28, 2023

After all this I went back and looked again at the actual decision, as reported in the OP. It is very interesting to any reader who is willing to slog through the procedural legalese.

https://www.isu.org/inside-isu/isu-communications/communications/32839-case-2023-03-isu-v-d-williams-final-decision/file

I was especially interested in the question of what statistical methods were actually brought to bear. According to this document, the ISU's main tool was to compare Mr. Williams' marks for Levito, Glenn and Tennell with the marks that he gave to the two competitors who placed directly above and the two who placed directly below each skater, to see if Mr. Williams overscored the USA skaters (compared to the rest of the panel) while also underscoring Levito's closest competitors (see table 4, paragraph 45). For instance:

(Although Williams' marks were within the corridor,) For Programme Components, the Alleged Offender, awarded to the skater (Levito) the 3rd highest (PCS) score (74.10 points) while the official result placed them in the 4th rank.

Overall (Total Segment Score), comparing Levito to Hendrickx, it was:

Levito: Williams = 143.33, Panel = 134.62
Hendrickx: Williams = 139.75, Panel = 138.48.

(That is, Williams marked Hendrickx higher than average, but) "Mr. Williams gave scores (to Levito) that were even higher above average than for Ms Hendrickx. Hence, Mr. Williams was "less highly" above average with Ms Hendrickx than with the skaters with comparable performances. Ms Hendrickx was marked less well in comparison than the skaters with comparable performances, even though he marked her higher than the panel average."
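
The arithmetic behind that comparison is simple enough to check. A minimal Python sketch, using only the four Total Segment Scores quoted above; the variable and function names are illustrative, and the ISU's own calculation may differ in detail.

```python
# Total Segment Scores quoted above: (Williams' total, panel result).
scores = {
    "Levito": (143.33, 134.62),
    "Hendrickx": (139.75, 138.48),
}


def margin_over_panel(judge_total, panel_total):
    """How far the judge's total sits above (+) or below (-) the panel's."""
    return round(judge_total - panel_total, 2)


margins = {name: margin_over_panel(j, p) for name, (j, p) in scores.items()}
relative_gap = round(margins["Levito"] - margins["Hendrickx"], 2)

for name, m in margins.items():
    print(f"{name}: {m:+.2f}")
print(f"Levito's margin exceeds Hendrickx's by {relative_gap:.2f}")
```

Both margins come out positive (+8.71 for Levito, +1.27 for Hendrickx), which is the point of the quoted passage: Hendrickx was marked above the panel, but Levito much more so.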

Um...OK.

Williams' response to the two entities that brought the complaint, the ISU Singles and Pairs Technical Committee and Mr. Benoit Lavoie, ISU Vice President for Figure Skating, in part, was:

(I) was at the event, not like the Complainants (ISU SPTC and Vice President) who were not there judging in real time.

Miller · Dec 29, 2023

Mathman said:
What makes this case unusual -- or maybe it is not unusual? -- is that the errant judge's marks were not actually outside the corridor according to the procedures outlined in the Communication. So evidently the Assessment Committee takes into account other information culled from the event reports. In this case they seemed to spot an unacceptable pattern that raised their eyebrows even without referring to the corridor.

Did someone outside the official Assessment Committee file a complaint? Did the Assessment Committee use some other measure of deviation from expectation other than the criteria that are so carefully spelled out in the Communication -- like, for instance, do they make computations along the lines that were brought to light by the data displayed on the SkatingScores.com website that were analysed on this thread by Miller and by Snowed?

One thing to bear in mind is that this was the business end, the LP, of a World Champs. Medals (Levito in 4th by 0.43 points after the SP) and 2/3 spots in next year's Champs were on the line - Levito in 4th, Tennell 8th, Glenn 10th after the SP, and you need the top 2 placements to add up to less than or equal to 13 to maintain the 3 spots.

Levito's LP score via SkatingScores - 143.33 vs 134.62 actual (+8.71)
Tennell - 123.28 vs 117.69 (+5.59)
Glenn - 131.63 vs 122.81 (+8.82)

Was someone at the ISU/competition keeping a special eye out for something like this? I believe they were.
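
The "less than or equal 13" condition above is easy to check against the short-program standings. A rough sketch, assuming the rule is applied to the sum of a federation's two best placements; spots are decided by the final result, so running it on SP placements only shows how thin the margin was. The function name is made up for illustration.

```python
def keeps_three_spots(placements, limit=13):
    """True if the two best placements add up to `limit` or less."""
    best_two = sorted(placements)[:2]
    return sum(best_two) <= limit


# Placements after the short program, as listed above.
sp_placements = {"Levito": 4, "Tennell": 8, "Glenn": 10}

best_sum = sum(sorted(sp_placements.values())[:2])
print(best_sum, keeps_three_spots(sp_placements.values()))
```

With 4th and 8th the sum is 12, so a single lost place by either of the top two would still hold the three spots, and two lost places would not.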

Mathman · Dec 29, 2023

Miller said:
Was someone at the ISU/competition keeping a special eye out for something like this? I believe they were.

In reading through the actual complaint supplied here by Andrea82, I was struck by this. The main datum that the complainants jumped on was the 9.50 that Mr. Williams gave Levito in composition. Paragraph 34:

“While his marks were in the corridor, they were overall higher for her than for her close competitors”. Regarding components, “with a fall by this US skater, a mark of 9.50 given by Mr Williams was impossible and [under the rules should not have been] near a 10.00. Conclusion: National Bias”.

This was the one thing that the ISU could really hang its hat on, the rest being maybe this, maybe that. I think that without this one clear error, Williams might have been able to defend himself.

Mathman · Dec 29, 2023

Baron Vladimir said:
You can't measure bias that easily...

I think that the reason we have debates about statistics is this. Statistics itself is a branch of mathematics. A mathematician might characterize himself or herself as a statistician – or as a topologist, or a combinatorialist, etc. Statisticians do what all mathematicians do, they construct mathematical models. That is why statisticians are always so careful about stating their hypotheses and conclusions in overly cautious and pedantic language.

Users of statistics, that’s a whole ‘nother ball game. To people who use statistics, especially researchers in the social sciences, economics, politics, education, sports and the like, statistics is a tool that can be used to win an argument or to make a point. Thus users of statistics begin with an idea of what it is that they want to prove, then they go about the business of lining up their “proof.” This is a different kind of activity altogether.

As for the case at hand, I think that it is significant that this was brought before the ISU Ethics Committee, not the Officials Assessment Committee. Ethics -- now we are truly in for a ride.

CrazyKittenLady · Dec 29, 2023

Mathman · Jan 1, 2024

OT, but a similar case just happened yesterday in a different sport, American football. With 23 seconds left the Detroit Lions scored on a two-point conversion to put them ahead by 1 point. But the referee made a wrong call which negated the play and gave the game to the other team. Today it was announced that the entire officiating crew was demoted by the NFL and relieved of their duties for the coming playoffs.

icewhite · Jan 1, 2024

Mathman said:
I think that the reason we have debates about statistics is this. Statistics itself is a branch of mathematics. A mathematician might characterize himself or herself as a statistician – or as a topologist, or a combinatorialist, etc. Statisticians do what all mathematicians do, they construct mathematical models. That is why statisticians are always so careful about stating their hypotheses and conclusions in overly cautious and pedantic language.

Users of statistics, that’s a whole ‘nother ball game. To people who use statistics, especially researchers in the social sciences, economics, politics, education, sports and the like, statistics is a tool that can be used to win an argument or to make a point. Thus users of statistics begin with an idea of what it is that they want to prove, then they go about the business of lining up their “proof.” This is a different kind of activity altogether.

As for the case at hand, I think that it is significant that this was brought before the ISU Ethics Committee, not the Officials Assessment Committee. Ethics -- now we are truly in for a ride.

I don't think there is such a difference. If I look at a set of numbers I try to find patterns. I then try to find out if my hypothesis is correct. I'm not a mathematician, and I often don't start with a hypothesis but with an interest in finding out.

Mathman · Jan 1, 2024

icewhite said:
I don't think there is such a difference. If I look at a set of numbers I try to find patterns. I then try to find out if my hypothesis is correct. I'm not a mathematician, and I often don't start with a hypothesis but with an interest in finding out.

I think that the main difference is in the language used in drawing conclusions. One researcher might conclude,
"The difference of the sample means is statistically significant by a blah blah blah test at the .01 level of significance."

Someone else, reporting on the same data, says (to quote the ISU in this particular case), "Conclusion: National Bias."

The first person's conclusion is a well-defined and true statement of mathematical fact. The second person's conclusion throws in a non-mathematical judgment.

I have, on occasion, been invited to give expert testimony in a legal context. The judge thanks me for my testimony, then rules that everything I said was so much baloney. Next case. :laugh:

el henry · Jan 1, 2024

Mathman said:
OT, but a similar case just happened yesterday in a different sport, American football. With 23 seconds left the Detroit Lions scored on a two-point conversion to put them ahead by 1 point. But the referee made a wrong call which negated the play and gave the game to the other team. Today it was announced that the entire officiating crew was demoted by the NFL and relieved of their duties for the coming playoffs.

Luckily for the Lions, the Eagles played like crap in the second half yesterday and lost their game to the lowly Cardinals.

With the Eagles' loss, the Lions still have the standings advantage they would have secured with a win over the Cowboys in this game.

Jumping off: events like this are why I keep nattering on about football. Any skating fans' angst about scoring pales, and I mean pales, next to the hue and cry around the Lions' penalty. The NFL is probably worth, as a league, 100 times (and I am not exaggerating) the worth of all skaters and all skating federations put together ever. Even a blatant error has not reduced, and will not reduce, the number of people watching the NFL or rooting for their teams, or, most importantly, the money to be made from football. Not. Affected. At. All.

Consequently, I will never believe that scoring mistakes or scoring "bias" affects the number of people watching skating. Even with decisions like that of the ISU and Mr. Williams :shrug:

icewhite · Jan 1, 2024

el henry said:
Luckily for the Lions, the Eagles played like crap in the second half yesterday and lost their game to the lowly Cardinals. With the Eagles' loss, the Lions still have the standings advantage they would have secured with a win over the Cowboys in this game.

Jumping off: events like this are why I keep nattering on about football. Any skating fans' angst about scoring pales, and I mean pales, next to the hue and cry around the Lions' penalty. The NFL is probably worth, as a league, 100 times (and I am not exaggerating) the worth of all skaters and all skating federations put together ever. Even a blatant error has not reduced, and will not reduce, the number of people watching the NFL or rooting for their teams, or, most importantly, the money to be made from football. Not. Affected. At. All.

Consequently, I will never believe that scoring mistakes or scoring "bias" affects the number of people watching skating. Even with decisions like that of the ISU and Mr. Williams

It's the reputation, though. Just talked about it with my sister and mother again this Christmas: they don't want to watch figure skating because of the scoring. They don't get it, and since they know it's often not done according to the rules they don't care to learn.
Similar to me not being interested to learn the levels and everything in ice dance, since I know it's basically not worth it if the judges do as they please anyway.
They or me would never watch the NFL either, though.

el henry · Jan 1, 2024

icewhite said:
It's the reputation, though. Just talked about it with my sister and mother again this Christmas: they don't want to watch figure skating because of the scoring. They don't get it, and since they know it's often not done according to the rules they don't care to learn.
Similar to me not being interested to learn the levels and everything in ice dance, since I know it's basically not worth it if the judges do as they please anyway.
They or me would never watch the NFL either, though.

I guess that is "association" bias, which is natural (I am including myself): we are influenced by what we know in our everyday life.

I don't know one person who currently does not watch skating who doesn't watch it because of scoring bias. Rules, yes (why is that jump worth more than that jump? Why does someone with a fall win over a clean skate?) But that is not bias, but complexity. They don't know, or care, if skating is "biased", it's that the rules themselves make no sense to them.

Mostly they do not want to watch men skate to music in costumes and they don't care if they do 85 revolutions in the air, as long as there is music and costumes, nope. I hate this attitude, but there it is.

The sports we love are the sports we love, and that is also how we grew up. I don't know one soul who watches, or cares to watch, men's soccer (will watch women during the Olympics), but football? It's a religion.

Mathman · Jan 1, 2024

As for football, one thing the instant replay cameras have brought to the sport is a new appreciation for the officiating. It used to be that on every call half the stadium yelled "booooo" and the other half yelled "shut up, crybabies." Now that the play is shown on the huge jumbotron in slo motion from several angles, even the rowdies can see that the referees get it right 85% of the time, with 12% unclear and only 3% wrong.

Not that it stops them from shouting Boo and crybaby anyway.

elhenry said:
Mostly (American sports fans) do not want to watch men skate to music in costumes.

Well, I have to sign off now. The University of Michigan boys (aka the Victor Valiants) will soon be performing in their maize and blue costumes to their fight song in the beautiful flowered setting of the Rose Bowl.

Do Michigan fans love their beefy lads? The Big House (UM stadium) has sold out its 109,000 seat stadium for 313 straight home games going back to 1975. $$$

gkelly · Jan 1, 2024

icewhite said:
It's the reputation, though. Just talked about it with my sister and mother again this Christmas: they don't want to watch figure skating because of the scoring. They don't get it, and since they know it's often not done according to the rules they don't care to learn.
Similar to me not being interested to learn the levels and everything in ice dance, since I know it's basically not worth it if the judges do as they please anyway.
They or me would never watch the NFL either, though.

I think it would be a good thing if TV commentators would talk in more detail about the usual scoring of the elements and components, pointing out the (correct) ways that the judges and tech panels have applied the rules. So most of the time viewers would be seeing the correct application of the rules and would maybe learn to understand some of those rules.

Then when there are real anomalies, commentators could point them out in the context that this was an unusual case where they disagree with/don't understand what the judges or tech panel did in this case.

If the only time there is any analysis of the scoring is when the commentators are pointing out discrepancies, viewers will learn to believe that the scoring is always random and/or biased.

Sure, it's more fun for numerically minded (or conspiracy minded) fans who already have some basic understanding of the rules to seek out the discrepancies. But of the many hundreds of scores awarded during any given competition (9 judges times 7-12 elements plus 3 components per program, times 6 to 30+ skaters), a few will stand out as not fitting in, but the vast majority will make perfect sense to anyone who understands what they're looking at.
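
For a sense of scale, the count in parentheses can be worked out directly. A small sketch; the function name is just for illustration, and the ranges are the ones given above (7 to 12 elements, 6 to 30 skaters, with "30+" capped at 30).

```python
def marks_awarded(judges, elements, components, skaters):
    """Individual marks in one segment: every judge marks every element
    and every component for every skater."""
    return judges * (elements + components) * skaters


per_program_low = marks_awarded(9, 7, 3, 1)    # shortest program, one skater
per_program_high = marks_awarded(9, 12, 3, 1)  # longest program, one skater
segment_low = marks_awarded(9, 7, 3, 6)        # small field
segment_high = marks_awarded(9, 12, 3, 30)     # large field

print(per_program_low, per_program_high)  # 90 135
print(segment_low, segment_high)          # 540 4050
```

So even a small event produces several hundred individual marks, and a large segment a few thousand, against which a handful of outliers has to be read.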

So it would be better if newer viewers first getting used to the scoring could learn to appreciate all the times that the scores do make sense before indulging in the fun of spotting the exceptions.
