Scoring bias at the national level | Page 13 | Golden Skate

Scoring bias at the national level

All teams made mistakes. So I am assuming that you are referring to PCS.
They are referring to the total segment score. The Canadian judge gave S-D/D 140.35 points (11.03 more than their actual FS score ended up being), P/M 127.95 (+5.79) and L/E 121.11 (+5.14). Compared to the untrimmed mean, their score is over 8 points higher on average for the Canadian teams vs. almost 2 points higher than the untrimmed mean for all other pairs.
The Canadian judge was generous overall, yes, but also especially generous to the Canadian pairs in the Free.

Personally, as the Canadian judge's GOEs and PCS were usually discarded anyway, I don't have too much issue with it, it probably didn't affect the scores that much, but I can see why other people would be suspicious (as you only need two people who have the same biases in a judging panel for them to actually count).
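
For anyone who wants to check the arithmetic, here is a minimal sketch using only the three figures quoted in this post. The variable names are just for illustration. Note that it measures the judge's surplus against the final (trimmed) segment score, not against the untrimmed mean, so it is not expected to reproduce the "over 8 points" figure.

```python
# Judge's own total segment score per team, as quoted in the post.
judge_totals = {"S-D/D": 140.35, "P/M": 127.95, "L/E": 121.11}

# How far above the team's actual FS score each of those totals was.
surplus = {"S-D/D": 11.03, "P/M": 5.79, "L/E": 5.14}

# Actual FS score implied by the two numbers above.
actual = {team: round(judge_totals[team] - surplus[team], 2)
          for team in judge_totals}

# Average surplus over the actual score for the three Canadian pairs.
mean_surplus = round(sum(surplus.values()) / len(surplus), 2)

print(actual)        # implied actual FS scores
print(mean_surplus)  # average points above the actual score
```

On these numbers the average surplus over the actual score comes out a little over 7 points per Canadian pair.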
 
Did the Canadian judge undermark other teams? I was underwhelmed by this event. I actually thought Kelly Ann and Loucas were not marked high enough across both programs. I feel the other two Canadian pairs at least have programs and perform them, while some pairs have dull, insipid music to which they skate through going from one element to the next (except for the Chinese team). I guess I would have gotten my scores tossed too.
 
...

I am a lone voice crying in the wilderness on this issue, but I think that figure skating has become markedly less corrupt, not more, in recent years. The hue and cry over "national bias" is overblown, in my humble opinion.

Not a lone voice, make that two lone voices. ;)

National bias is what we saw in the 60s and 70s and into the 80s, with the Cold War fought at every skating competition. You knew the East German judge was going to underscore Toller, and sure enough he/she did.

Getting worked up over Judge X from Country X scoring Skater Country X .0892546 higher in the SP and .0621934 in the LP? As proof of "improper bias" or even "corruption". Puh-leeze.:palmf:

It all takes a seat to that 70s and 80s scoring. And yet when was figure skating most popular in "the West"? Hmmmm:unsure:
 
I looked at the protocols.
Canadian judge scored everyone very high. Only judge giving a 5 to the gorgeous throw of Peng and Wang. So that proves that the Canadian judge knows what they are doing.. that throw deserved all the points ;)

Joking aside... I even looked down to lower ranked teams and again the Canadian judge is constantly giving higher scores.

I'd call that more an enthusiastic judge rather than a biased judge LOL Maybe that judge went to Ted Barton's school :)

The only number I feel they truly missed was on the twist with Deanna and Max. At the same time, I didn't see how crashy it was myself until the replays. Perhaps because it's their best element and it's always stunning ??? But when Chris mentioned it, I looked and saw how badly it was executed compared to the SP for instance... So on that element, where the score of course was tossed, I felt maybe that judge didn't see something and made an honest mistake.

That's my take on it... of course, I am biased too :)
 
Not a lone voice, make that two lone voices. ;)

National bias is what we saw in the 60s and 70s and into the 80s, with the Cold War fought at every skating competition. You knew the East German judge was going to underscore Toller, and sure enough he/she did.

Getting worked up over Judge X from Country X scoring Skater Country X .0892546 higher in the SP and .0621934 in the LP? As proof of "improper bias" or even "corruption". Puh-leeze.:palmf:

It all takes a seat to that 70s and 80s scoring. And yet when was figure skating most popular in "the West"? Hmmmm:unsure:
If I don't disagree entirely with both of you, I cannot fully let this go ;)

There are sound differences in judging patterns for many judges. Maybe they are not purely nationalistic but cultural.
Just like, as a fan, I enjoy a particular type of skating, the skating I have been exposed to all my life :) in my country LOL :)
At the same time, there are a couple nations with similar skating and I root for their skaters very much as well ;)

When watching men... a fellow GSer and I, we have never met other than on this forum's competition threads, realized that we liked all the same skaters. (not just nationally but internationally). That fellow GSer is also Canadian and knows the sport very well. More than I do for sure, but I consider that I am a hardcore fan and I do watch a lot of skating (instead of sleeping).

So, this is the thing : cultural/nationalistic bias does exist. Call it taste if you wish. But it's there. Does it equal to corruption ? I wouldn't say so. In the pairs case, I looked quickly at the numbers from 4CC and the Canadian judge did score Canadian teams higher than the other judges did... but that judge also scored the other teams higher than the other judges did. So, I have no real problem with their score being higher...

What I dislike is when a judge scores their own skaters higher AND scores the other skaters much lower. This is the way the "corruption" is happening based on national bias nowadays.

Their high scores get tossed.... elevating the mean scores for their own skaters ... and because they lowball the other skaters, their lower scores get tossed for them... lowering the mean scores for the competitors.

This could be enough to change rankings ... we saw total scores with very small margins all year... so playing the numbers is still possible.
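
To put a rough size on that effect, here is a small sketch with made-up marks (every number below is hypothetical, not taken from any protocol). It assumes a nine-judge panel where the highest and lowest mark are dropped and the remaining seven are averaged, and it compares a neutral ninth judge with one who highballs their own skater and lowballs the rival.

```python
def trimmed_mean(scores):
    """Drop one highest and one lowest mark, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores")
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)

# Hypothetical marks from the other eight judges, same for both skaters.
others = [8, 7, 7, 7, 7, 7, 7, 6]

# Neutral ninth judge: gives both skaters a 7.
neutral_own = trimmed_mean(others + [7])
neutral_rival = trimmed_mean(others + [7])

# Biased ninth judge: 10 for their own skater, 4 for the rival.
biased_own = trimmed_mean(others + [10])
biased_rival = trimmed_mean(others + [4])

# How much the gap between the two skaters moves on this one mark.
swing = (biased_own - biased_rival) - (neutral_own - neutral_rival)
print(round(biased_own, 2), round(biased_rival, 2), round(swing, 2))
```

Both extreme marks are discarded, but each one pushes a different mark out of the trimmed set: the 10 saves the 8 from being dropped for the judge's own skater, and the 4 saves the 6 from being dropped for the rival. In this toy panel the gap moves by two sevenths of a point on a single mark.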
 
whole post

I think this makes sense.

It is clear the type of skater I admire (at least I think it is ;) ). Would I give those types of skaters all the points as a judge? Yes I would

Could I have developed that admiration for that type of skater as a result of cultural upbringing? I'm not aware of it, but of course I could have. Then again, my first one true skating love was Canadian, not American, so I don't know what that says. :biggrin:

Would giving those skaters all the points be a result of nationalistic bias or corruption? No, it would depend on my own unerring and unfailing judgment of what is right and true.

So I am willing to believe that is how judges operate as well.
 
Their high scores get tossed.... elevating the mean scores for their own skaters ...
I think that these effects are quite minimal.

My skater gets 10, 8, 7, 7, 7, 7, 7, 7, 6. Trimmed mean = 7.14.

Now I (judge #1) decide to clean up my act and give a 7 instead of the inflated 10. New trimmed mean = 7.00

Well, I guess those sevenths of a point can add up, but still I have to say that I am not alarmed.

As for this particular Four Continents judge, my biggest takeaway is that he liked S-D & D's performance quite a bit. Me, too..
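
The trimmed-mean arithmetic in this post can be checked with a few lines of Python. This is only a sketch: it assumes one highest and one lowest mark are dropped from the nine and the other seven are averaged, and the function name is made up for the example.

```python
def trimmed_mean(scores):
    """Drop one highest and one lowest mark, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores")
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)

# Panel from the post, with judge #1 giving the inflated 10.
inflated = [10, 8, 7, 7, 7, 7, 7, 7, 6]

# Same panel after judge #1 gives a 7 instead.
cleaned_up = [7, 8, 7, 7, 7, 7, 7, 7, 6]

print(round(trimmed_mean(inflated), 2))    # 7.14
print(round(trimmed_mean(cleaned_up), 2))  # 7.0
```

The difference is exactly one seventh of a point, because the inflated 10 itself is discarded and only changes which other mark (the 8) stays in the average.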
 
I don't want to talk this up to the level of a scandal, but looking at the scores I found it quite clear that the Canadian judge scored for Canada.
I will go a bit deeper into the numbers later.

We can say it doesn't matter in the end, that's because this system, as much beef as I'm having with the corridor judging because it enhances other psychological biases, is very suited to erase such nationalistic (or corrupt) judging from one particular judge. That's what this system was built to diminish and almost erase, and that's the job it does pretty well.

That, however, is a different question from whether there was biased judging from one judge to change the outcome of the competition, and I think that in this case there was and I would argue that it was not done in completely good faith (aka: I just like them better because I grew up to like a particular style better).
 
That, however, is a different question from whether there was biased judging from one judge [in order deliberately] to change the outcome of the competition, and I think that in this case there was and I would argue that it was not done in completely good faith
I was with you until that part. "Bias" is a well-defined statistical term, even a useful one, but it does not carry any moral or value judgement. Numbers alone cannot speak to the question of someone's motivation or good faith.
 
I looked at the protocols.
Canadian judge scored everyone very high. Only judge giving a 5 to the gorgeous throw of Peng and Wang. So that proves that the Canadian judge knows what they are doing.. that throw deserved all the points ;)

Joking aside... I even looked down to lower ranked teams and again the Canadian judge is constantly giving higher scores.

I'd call that more an enthusiastic judge rather than a biased judge LOL Maybe that judge went to Ted Barton's school :)

The only number I feel they truly missed was on the twist with Deanna and Max. At the same time, I didn't see how crashy it was myself until the replays. Perhaps because it's their best element and it's always stunning ??? But when Chris mentioned it, I looked and saw how badly it was executed compared to the SP for instance... So on that element, where the score of course was tossed, I felt maybe that judge didn't see something and made an honest mistake.

That's my take on it... of course, I am biased too :)
He scored the Aussies almost 2 points lower and the two lower-ranked Chinese pairs 2 and 4 points lower than the mean TSS, so no, he wasn't consistently generous. And while he did mostly score more enthusiastically than other judges, he scored only four teams with a surplus of over 5 points from the mean TSS - The three Canadians and Peng/Wang.

Is it the worst case of National (or otherwise) bias? No, definitely not, and it likely didn't affect the scores significantly, but it's also not normal for a judge to score a team over 8 points higher than the next highest-scoring judge. No more than it's normal for the lowest score given by a judge to be 8 points lower than the second-lowest, which thankfully didn't happen here, but has happened this season already. A bit of variation is normal and to be expected with a scoring system like this, but for the same program to be scored for example anywhere between 125.33 and 140.35 (i.e. a difference of almost exactly 15 points) feels like a little bit too much variance.

By the way, that goes for many examples, not just here. (For example, multiple of the women's FS scores have not insignificant scattering of over 13 points difference between lowest and highest) It's also not meant to be a moral judgment of who has a more accurate assessment of the programs but the description of a general phenomenon and problem: All of these judges have been trained on the same rules, but come to sometimes very vastly different conclusions. Should this really be the case?
 
By the way, that goes for many examples, not just here. (For example, multiple of the women's FS scores have not insignificant scattering of over 13 points difference between lowest and highest) It's also not meant to be a moral judgment of who has a more accurate assessment of the programs but the description of a general phenomenon and problem: All of these judges have been trained on the same rules, but come to sometimes very vastly different conclusions. Should this really be the case?
Yes, and it's not a bug in the system. It's by design. That's why there's so much room for interpretation in the ISU Guidelines, and the use of the word guidelines instead of rules is an important distinction particularly in the free skate where there are no mandatory deductions for errors in the GOE.
 
I think that these effects are quite minimal.

My skater gets 10, 8, 7, 7, 7, 7, 7, 7, 6. Trimmed mean = 7.14.

Now I (judge #1) decide to clean up my act and give a 7 instead of the inflated 10. New trimmed mean = 7.00

Well, I guess those sevenths of a point can add up, but still I have to say that I am not alarmed.

As for this particular Four Continents judge, my biggest takeaway is that he liked S-D & D's performance quite a bit. Me, too..
We have talked about this. It meant that Sylvie Fréchette lost gold in Barcelona... so it can have a very big effect.

And honestly, maybe I am curious about this case because usually, Canadian judges have been very stingy towards their own athletes... :) and apparently, that same judge didn't favour the Canadians in the SP... so there's that :)
 
He scored the Aussies almost 2 points lower and the two lower-ranked Chinese pairs 2 and 4 points lower than the mean TSS, so no, he wasn't consistently generous. And while he did mostly score more enthusiastically than other judges, he scored only four teams with a surplus of over 5 points from the mean TSS - The three Canadians and Peng/Wang.

Is it the worst case of National (or otherwise) bias? No, definitely not, and it likely didn't affect the scores significantly, but it's also not normal for a judge to score a team over 8 points higher than the next highest-scoring judge. No more than it's normal for the lowest score given by a judge to be 8 points lower than the second-lowest, which thankfully didn't happen here, but has happened this season already. A bit of variation is normal and to be expected with a scoring system like this, but for the same program to be scored for example anywhere between 125.33 and 140.35 (i.e. a difference of almost exactly 15 points) feels like a little bit too much variance.

By the way, that goes for many examples, not just here. (For example, multiple of the women's FS scores have not insignificant scattering of over 13 points difference between lowest and highest) It's also not meant to be a moral judgment of who has a more accurate assessment of the programs but the description of a general phenomenon and problem: All of these judges have been trained on the same rules, but come to sometimes very vastly different conclusions. Should this really be the case?
2 points lower... that's not really worth a mention in my opinion. Seriously. I believe that's quite acceptable when the total score is over 100 points. 2% come on :)
 
I was with you until that part. "Bias" is a well-defined statistical term, even a useful one, but it does not carry any moral or value judgement. Numbers alone cannot speak to the question of someone's motivation or good faith.

Well, I'm not a native speaker and I am struggling a bit with the term bias in the first place because it's not what I would use - and I'm also definitely not an expert when it comes to statistics, so sorry if I used the wrong term there. What I mean is there is subconscious under- or overscoring because you rate a quality more or less than you should - and there is deliberate over- and underscoring.
 
well in any case, if the judge committed a fault, the ISU will issue a warning. Let's see if that happens before crying wolf
 
Okay, so not so much for the sake of this particular case as for the general topic:


The Canadian judge (Fortin) gave Stellato/Deschamps the highest GOE of all judges on almost every element: a +2 on the twist (the only judge to do so, while most of the others gave it a -2), a 0 on the combo jumps (where the most common mark was a -2), and a +2 on the LoTh (where most gave 0). They had them first in TES - the only judge to do so.
For the direct rivals of this team they gave the highest GOE on only one element (the DSp) to Kam/O'Shea, and had them overall in 3rd in TES- while most had them in 2nd.
They gave Miura/Kihara the highest GOE among all judges on two elements - but had them in 8th in TES in the end, while most had them in 6th, none as bad as 8th, two 3rd and 4th.
They gave Golubeva/Giotopoulos the highest GOE on 3 elements - but had them in second in TES overall, while several had them first.

PCS: They had the Canadians 1st in PCS - okay, most did so. But they had K/OS in 5th, the only judge to do so, while most had them in third, and they had M/K in second, while half of the judges had them in first. They ranked G/GM in 8th, while some other judges ranked them 4th, 5th or 6th.

According to them the Japanese would have been 5th in the end in the FS, not 3rd, and G/GM would have been 6th, not 4th.

So, while they were overall on the generous side with their scores, they scored the biggest rivals overall rather worse than the other judges although I would argue that those three teams are quite different from each other and M/K have quite some reputation.

Pereira/Michaud on the other hand were again scored very well by them - almost all of their GOE were erased because they were the best ones given, and they had them in 3rd in PCS, while most had them in 4th.


Especially the GOE are catching my eye.
 
well in any case, if the judge committed a fault, the ISU will issue a warning. Let's see if that happens before crying wolf

Like I said, I don't think this case is so terrible; we see lots of it. I would rather take this as a case study in how overscoring your own skaters in comparison works - you do not just throw all the points at your own skaters and give terrible scores to the others, that would be too obvious I think. The way I see it this judge overall does it a bit like the American judge we started with.
 
What I mean is there is subconscious under- or overscoring because you rate a quality more or less than you should - and there is deliberate over- and underscoring.
What I would say about it is this: All this is very likely to be true in many instances. But I don't believe that a list of numbers, whatever its peculiarities, can allow us to draw conclusions about subconscious processes, nor to address the question of what "should" happen, nor to investigate deliberateness of purpose. These things are, in my opinion, outside the perimeter of concern of statistics.

I have to say that I encountered similar difficulties over in the Kamila Valieva thread. Lots of facts, figures, quoting of rules, and parallel cases are brought up, but to deduce from these what SHOULD be done -- to me there is a big disconnect there. "Should" is just too hard a word to deal with using mathematical tools alone, for me.
 
2 points lower... that's not really worth a mention in my opinion. Seriously. I believe that's quite acceptable when the total score is over 100 points. 2% come on :)
2 points are worth mentioning when you make a generalised comment about someone scoring a certain way for all, as you did. 2 points lower can also quickly add up to a lot more when someone scores everyone else 2, 3, 4, 5, ... points higher.
When I only add up the scores this judge gave the pairs in the SP and FS, only two teams come out with a lower score than their actual score - G/GM and W/Z. I think that's fine to mention when his scoring is portrayed as generous across the whole competition. Also, in a sport where sometimes a few tenths can decide about medals, it's not quite that insignificant. See your own comment:
We have talked about this. It meant that Sylvie Fréchette lost gold in Barcelona... so it can have a very big effect.

Yes, and it's not a bug in the system. It's by design. That's why there's so much room for interpretation in the ISU Guidelines, and the use of the word guidelines instead of rules, is an important distinction particularly in the free skate where there are no mandatory deductions for errors in the GOE.
If judges can just kind of follow or not follow the guidelines, interpret their own things, and so on, what even is the point of having guidelines in the first place? In fact, what even is the point of having a judging system, just distribute the medals by vibes (looking at you Ice Dance 🫥), and stop hiding behind "Well, look we have a scoring system, this is all so legit!"

well in any case, if the judge committed a fault, the ISU will issue a warning. Let's see if that happens before crying wolf
This seems unusually naive for how I've experienced you so far in this forum. Warnings rarely are given publicly, and many judges with very clear biases are still acting as judges to this day (even with the excuse of their wacky scoring being the result of incompetence). Also, nobody said this was the worst case of bias, but in many respects, one could also say that about Douglas Williams. In the very same document, it even goes into scoring a direct competitor less highly above the average than the skaters from your own country being a hint towards national bias, which would be the sort of issue we are looking at here.
 
2 points are worth mentioning when you make a generalised comment about someone scoring a certain way for all, as you did. 2 points lower can also quickly add up to a lot more when someone scores everyone else 2, 3, 4, 5, ... points higher.
I did say I looked at scores quickly, with protocols.. so yes, my comment is general and not in depth.. to me 2 points in scores that are over 100 points are normal.

This seems unusually naive for how I've experienced you so far in this forum.
Hey ! No need to assign me with intentions or anything ;) you can call me sleep deprived... that, I will accept but naïve ? :)
My point here is that I really don't think that this judge tried to push things a certain way. As you have pointed out yourself this judge didn't do so in the SP... We all know how important it is to skate in the final group, and while pretty much all teams had mistakes, they didn't favour the Canadian teams in the SP, where in my opinion, it may have had a much bigger effect.

Here is the real question : what do you think of the overall ranking ? Do you really believe that Deanna and Max shouldn't have won ?
Warnings rarely are given publicly, and many judges with very clear biases are still scoring to this day (even with the excuse of their wacky scoring being the result of incompetence). Also, nobody said this was the worst case of bias, but in many respects, one could also say that about Douglas Williams. In the very same document, it even goes into scoring a direct competitor less highly above the average than the skaters from your own country being a hint towards national bias, which would be the sort of issue we are looking at here.
Here is why I took this case from the other angle than you and @icewhite :

Every competition, we find judges who do this and even worse. I think that if we are going to mention a judge and make a case for nationalistic bias, it has to be severe... otherwise we can bring up dozens of judges.... pretty much every event there is something we could discuss. I remember not too long ago, some users mentioning a French judge overscoring Aymoz.
We could find biases for and against tonight in ice dance. The British judge for instance is under my watch ;)

So no. I am not naïve. I just believe that if we are going to pick examples to discuss, this one would be one of many and perhaps one of the milder ones. So I am wondering what's the point really of bringing this one up when as a fan I could give you a pretty good explanation about why some of the GOEs are bad for the Aussies' lifts for instance. And as we know, the lifts have very high base values... and guess what, Deanna and Max had stellar lifts last night. The twist is iffy... I am not sure why the judge missed it. Probably an honest mistake. The jumps, if you look at the entrance, they are close and in synch. The landings were not good but they had positive bullets. 0 is generous but there is a deduction in there already. I could go on and on and on and pick apart various elements and discuss quality and bullets. However, I slept 4 hours so I am not interested in doing so :)

I will say this : I believe that this is what the ISU actually creates with this kind of scoring system and with "national volunteer judges"
I opened a thread to discuss options on how to minimize nationalistic bias... so that's perhaps why I am being devil's advocate here... to me, this is not an exceptional case even requiring a warning. We have seen worse.

Feel free to discuss what you would like the ISU to do, in the thread I opened about that :) I'd like to read your comments there.
 