Should the IJS use median scores instead of the trimmed mean? | Page 3 | Golden Skate

drivingmissdaisy

Record Breaker
Joined
Feb 17, 2010
Another question I've been wondering about is if we should just award a certain score bonus for the last two groups? Something like a 3pt bonus reflected on the final score for making the final group and 1.5 for the second group (FS ONLY). We know it already happens in every event. Why not just get it out there and identify it and why it is happening? I recognize there is added hype and pressure being head to head with the best in the competition and as such I don't even know if I'd have a problem with this. It's all these little unspoken rules and ways that the numbers are derived that make the score a :bang: for me. Let's just identify the elephant in the room already and then let the Math take over from there.

This might just be a judging tendency that originated under 6.0, in which judges had to leave room in the marks for later skaters. I'd guess the vast majority, if not all, Olympic judges also judged under 6.0, so this may just be a mindset that is hard for them to shake. It does lead to some baffling scores though, such as when Gracie, Julia and Mao are all scored within two points of each other in PCS in the Sochi LP. With IJS it seems less excusable because the scoring tries to appear more scientific, when it really isn't.
 
Joined
Jun 21, 2003
This was mentioned before but I wanted to highlight it. Under the ordinal system, no judge could put Skater A in first place but Skater A could win. Strange but true.

I don't think this is particularly strange. In many settings there are situations where I love my candidate but hate yours, and vice versa. The only one we can agree on is the compromise candidate that everyone put second. :)

genekelly said:
That's the question the mathematicians are debating in this thread.

Caelum made a very cool suggestion earlier on this thread. Instead of the trimmed mean, there is another method available for removing the distortion of outliers while still acknowledging the intent of all judges. This is called "Winsorizing" (after statistician Charles P. Winsor) and it goes like this. Instead of discarding highest and lowest, the highest and lowest are replaced by the boundaries of the middle 7 (or 5, etc.).

Example: 7.50 8.25 8.25 8.50 8.75 9.00 9.00 9.25 10.00 Average 8.72, median 8.75

Current IJS method (throw out highest and lowest) 8.25 8.25 8.50 8.75 9.00 9.00 9.25 Average 8.71

Winsorized method. Replace the lowest score, 7.50, with 8.25 and replace the highest, 10.00, with 9.25. Average = 8.72

More severe Winsorization. Replace both of the two lowest scores with the third lowest. Replace both of the two highest scores with the third highest. Compared with the milder version, the low end is unchanged (the second and third lowest are both 8.25), but 9.25 now becomes 9.00 as well, so the result drops slightly. Average = 8.67

In this example all methods produce nearly the same result because there is nothing unusual about the distribution of scores. One score was too high (10.00), another was too low (7.50), but everything balanced out in the end.
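For anyone who wants to check the arithmetic, here is a minimal Python sketch of the four calculations on the nine marks above. The function names and the parameter `k` (how many marks are trimmed or replaced at each end) are my own labels for illustration, not anything taken from the IJS rules.

```python
from statistics import mean, median

def trimmed_mean(scores, k=1):
    """Mean after discarding the k lowest and k highest scores."""
    s = sorted(scores)
    return mean(s[k:len(s) - k])

def winsorized_mean(scores, k=1):
    """Mean after replacing the k lowest and k highest scores
    with the nearest remaining score on each side."""
    s = sorted(scores)
    low, high = s[k], s[len(s) - k - 1]
    clipped = [min(max(x, low), high) for x in s]
    return mean(clipped)

scores = [7.50, 8.25, 8.25, 8.50, 8.75, 9.00, 9.00, 9.25, 10.00]

print(f"plain mean        {mean(scores):.2f}")                # 8.72
print(f"median            {median(scores):.2f}")              # 8.75
print(f"trimmed, k=1      {trimmed_mean(scores, 1):.2f}")     # 8.71
print(f"Winsorized, k=1   {winsorized_mean(scores, 1):.2f}")  # 8.72
print(f"Winsorized, k=2   {winsorized_mean(scores, 2):.2f}")  # 8.67
```

Setting `k=4` in `trimmed_mean` leaves only the middle mark of nine, which is just the median again.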

By the way, the reason statisticians do not like trimmed and Winsorized means very much is that the sampling distribution is not well understood. (The formulas from our statistics classes that have sigma/sqrt(n) in them are not guaranteed to work in the case of an asymmetric distribution of scores.)
 

drivingmissdaisy

Record Breaker
Joined
Feb 17, 2010
Another important aspect not being considered is that a judging system has to be somewhat easy to understand. Particularly in cases when the outcome wouldn't change, it makes no sense to overly complicate the system by Winsorizing it (although it is fascinating to read). However, I think it is also important for the system to reflect that, to some extent, a fair judging panel can be established. Throwing out eight scores out of nine implies that as many as eight judges cannot be trusted enough to have their scores count.
 

Alba

Record Breaker
Joined
Feb 26, 2014
This situation, where a determined minority cabal can dominate the majority, could not happen if we used the median (middle score) instead of the mean. The median is simply the maximally trimmed mean -- we throw out the highest four and the lowest four instead of the highest one or two and the lowest one or two. In this example the median scores are

Skater A: 9.00
Skater B: 8.75

What do you think? Would this be a better system?

I don't think this can be a better system. It doesn't make sense to leave out the scores of 8 judges. Why not go with the majority, as it used to be, then?
 

drivingmissdaisy

Record Breaker
Joined
Feb 17, 2010
That would indeed be better in my opinion. Alas, that ship has sailed, never to return. ;)

Combining ordinals with the current system would resolve a few of the problems, including the effects of overscoring/underscoring particular skaters and assuring that a majority of the panel agrees with the placements. Furthermore, since the 5th judge's placement determines where the skater places (unless there is a tie), this effectively is what MM is trying to accomplish without throwing out 8 scores.
 
Joined
Jun 21, 2003
The more I think about it, though, the more I don’t think the median is really “throwing out judges’ scores.” It is more like, “determining the middle value of a range of scores.” But on the other hand, the median does not discourage cheaters from cheating, as I hoped at first that it might. Here is a simplified example. The first six judges go like this:

Lulu: 8.75 8.75 9.00 9.00 9.25 9.25; Median = 9.00
Shel: 8.50 8.50 8.75 8.75 9.00 9.00; Median = 8.75

A close contest, but every judge thinks that Lulu is a little bit better.

Now three conspiring pro-Shel judges enter the fray. They give

Lulu: 8.50 8.75 8.75
Shel: 9.00 9.00 9.25

The three new numbers knock Lulu’s median down to 8.75 and bump Shel’s up to 9.00. Shel wins.
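Here is a small Python sketch for checking this example. It uses only the marks listed above, with "Shel" as the second skater's label as in the score lines. The trimmed and Winsorized means are included purely for comparison, and their helper functions are my own illustrative definitions (one mark dropped or replaced at each end).

```python
from statistics import mean, median

def trimmed_mean(scores):
    """Drop the single highest and lowest mark, then average."""
    s = sorted(scores)
    return mean(s[1:-1])

def winsorized_mean(scores):
    """Replace the highest and lowest mark with their nearest neighbours, then average."""
    s = sorted(scores)
    s[0], s[-1] = s[1], s[-2]
    return mean(s)

first_six = {
    "Lulu": [8.75, 8.75, 9.00, 9.00, 9.25, 9.25],
    "Shel": [8.50, 8.50, 8.75, 8.75, 9.00, 9.00],
}
late_three = {
    "Lulu": [8.50, 8.75, 8.75],
    "Shel": [9.00, 9.00, 9.25],
}

for name in first_six:
    panel = first_six[name] + late_three[name]  # all nine marks
    print(name,
          f"median(6)={median(first_six[name]):.2f}",
          f"median(9)={median(panel):.2f}",
          f"trimmed(9)={trimmed_mean(panel):.2f}",
          f"winsorized(9)={winsorized_mean(panel):.2f}")
# Lulu median(6)=9.00 median(9)=8.75 trimmed(9)=8.89 winsorized(9)=8.92
# Shel median(6)=8.75 median(9)=9.00 trimmed(9)=8.86 winsorized(9)=8.83
```

So on these particular marks the median flips from Lulu to Shel once the three extra judges are added, while the trimmed and Winsorized means would still leave Lulu narrowly ahead.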
 

Alba

Record Breaker
Joined
Feb 26, 2014
@Math: I don't think cheating can be eliminated tbh. If people want to cheat they will find a way, but at least I think it's fair that the majority should decide, with no anonymity. So if there is a way to combine what's good in the new system with the old one, that would be great IMO.
 

mskater93

Record Breaker
Joined
Oct 22, 2005
I'm not familiar with the ordinal judging system, but if a skater were 1st according to 5 judges but say 9th according to the other 4 judges, wouldn't a skater that was 2nd from every judge win?

Not in an ordinal majority system. The first skater would win with a majority of 5/1
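A rough Python sketch of that majority rule, for illustration only. It assumes a nine-judge panel, reads "5/1" as "five judges at first place or better", and ignores all of the real tiebreak rules. The function name `majority` is made up for this example.

```python
def majority(ordinals):
    """Return (place, count): the best place that a majority of judges
    gave the skater or better, and how many judges did so.
    Simplified 6.0-style majority rule; tiebreakers are not modelled."""
    need = len(ordinals) // 2 + 1          # 5 of 9 judges
    place = sorted(ordinals)[need - 1]     # ordinal of the 5th-best placement
    count = sum(1 for o in ordinals if o <= place)
    return place, count

skater_a = [1, 1, 1, 1, 1, 9, 9, 9, 9]   # 1st from five judges, 9th from four
skater_b = [2] * 9                        # 2nd from every judge

for name, ords in (("A", skater_a), ("B", skater_b)):
    place, count = majority(ords)
    print(f"Skater {name}: majority {count}/{place}, "
          f"average ordinal {sum(ords) / len(ords):.2f}")
# Skater A: majority 5/1, average ordinal 4.56
# Skater B: majority 9/2, average ordinal 2.00
```

Skater A takes first on the majority rule even though a straight average of the ordinals would favor Skater B.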
 

gkelly

Record Breaker
Joined
Jul 26, 2003
The more I think about it, though, the more I don’t think the median is really “throwing out judges’ scores.” It is more like, “determining the middle value of a range of scores.” But on the other hand, the median does not discourage cheaters from cheating, as I hoped at first that it might. Here is a simplified example. The first six judges go like this:

Lulu: 8.75 8.75 9.00 9.00 9.25 9.25; Median = 9.00
Shel: 8.50 8.50 8.75 8.75 9.00 9.00; Median = 8.75

A close contest, but every judge thinks that Lulu is a little bit better.

Now three conspiring pro-Shel judges enter the fray. They give

Lulu: 8.50 8.75 8.75
Shel: 9.00 9.00 9.25

The three new numbers knock Lulu’s median down to 8.75 and bump Shel’s up to 9.00. Shel wins.

Of course, we still don't know that they're cheating, much less conspiring. They could just each be honestly reflecting a minority opinion.

Seven of these judges (all of the pro-Lulu judges, one of the pro-Shel) thought the contest was quite close, giving only 0.25 difference between two skaters.

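Reading the marks column by column, the other two pro-Shel judges each give a 0.5 difference. Those larger differences are enough to tip the balance in Shel's favor when the scores are trimmed to the median. Trimmed by high and low, or "Winsorized," the same nine marks would still leave Lulu narrowly ahead (about 8.89 to 8.86 trimmed, and about 8.92 to 8.83 Winsorized).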

However, we don't know whether those judges honestly thought Shel was significantly better in the areas of skating that they considered most important, which might differ from what the pro-Lulu judges considered most important. We don't know whether they gave larger gaps because they thought Shel was just that much better, or they naturally tend to leave larger gaps in their scoring between all the skaters, or because they were trying to manipulate the outcome toward the skater they wanted to win (whom they might not actually have thought was better at all, if they were cheating). Or, if this was ordinal judging, because they thought Shel was better than Lulu and they also wanted to leave room in between them for a later skater who might be better than Lulu and worse than Shel in their opinion.

If there are two scores per skater with tiebreakers (second mark in the freeskate), that means that judges who think it's a close contest can give two skaters the exact same total. But if the absolute scores count, not ordinals, it would be stupid to use tiebreakers that way, because it's not really giving the skater that judge prefers a higher score at all and makes the results vulnerable to a single judge who prefers the other skater and doesn't use the tiebreaker to give the same total. This was a major flaw in the way pro competitions were often scored, with pro "judges" often thinking in terms of ordinal-style 6.0 tiebreakers.

But Mathman's example shows that even using the smallest increment between scores leaves the possibility that other judges who hold the opposite opinion and use wider increments can override a majority.

Ordinals open up the possibility of flipflops and similar paradoxes.

This example shows that total scores can result in paradoxes where a minority who uses wider gaps between scores (in general or for the specific skaters of interest; as a general principle of scoring or to try to force a particular result) can override a more conservatively scoring majority.

Is the answer to encourage all judges to be bold in their scoring gaps? Or to encourage all to keep the scores close together so that large gaps in minority opinions will stand out for scrutiny?
 

YesWay

四年もかけて
Record Breaker
Joined
Sep 28, 2013
Is the answer to encourage all judges to be bold in their scoring gaps? Or to encourage all to keep the scores close together so that large gaps in minority opinions will stand out for scrutiny?
If there is an answer, I think it is this:

Accept that judging is subjective. There can never be a perfect system for scoring. Even if the officials are clean as a whistle and never cheat, there will still ALWAYS be results that are "controversial" that some people aren't going to like.

Embrace all of this as part of the fun and drama of figure skating!

It's cool and all when your favourite skaters do well, but if they don't... well pfft big deal. Enjoy the skating. Have a moan about results if necessary, but then move on and let it go - for the sake of your health.

Life's too short...
 

Sandpiper

Record Breaker
Joined
Apr 16, 2014
Agreed, YesWay. At the end of the day, that's all we can do. Enjoy the skating. Because no matter what the scoring system, no matter who the judges are, we're all going to have results we find unfair. Even if we can't ever agree with a certain result (and there are results I'll decry until my dying day), we have to accept that such things happen in this sport.
 

Alba

Record Breaker
Joined
Feb 26, 2014
If there is an answer, I think it is this:

Accept that judging is subjective. There can never be a perfect system for scoring. Even if the officials are clean as a whistle and never cheat, there will still ALWAYS be results that are "controversial" that some people aren't going to like.

Embrace all of this as part of the fun and drama of figure skating!

It's cool and all when your favourite skaters do well, but if they don't... well pfft big deal. Enjoy the skating. Have a moan about results if necessary, but then move on and let it go - for the sake of your health.

Life's too short...

This would be the best and ideal solution. ;):clap:
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
I agree. The focus shouldn't be on results and who's better than who so much as who is able to deliver wonderful, memorable performances. The joy of skating is about performance not credentials and decimals.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
And a whole skating program is complex enough that different individuals will favor different aspects. Even with individual elements there might be differences of opinion as to whose is "better" if they're good (or bad) in different ways.

So most of the time, in a field with numerous skaters including several strong ones at the top, there is not one right answer as to the "correct" results.

The technical panel calls may be black-and-white, yes-or-no decisions that can be clearly right or wrong. Except sometimes when the skater almost/kinda/sorta executes the element or feature as required and it falls in a gray area.

Some kinds of errors on elements require mandatory GOE reductions and may be required to have negative GOE, or specifically -3 GOE, so a judge who gives 0 or better to an element with such an error would be flat-out mistaken on that element.

But otherwise, for all the qualitative decisions that fall under the GOEs and PCS, there aren't right or wrong answers. There are different experts' evaluations within the written guidelines. Some evaluations are better than others, but there may be several that are equally correct depending on the priorities of that particular judging panel at that time.

So the goal of the scoring system can't be to find the single correct answer. But rather to find the best way to combine the disparate opinions of a panel of experts who come to varying decisions, hopefully all tending toward the "better" end of the spectrum.
 
Joined
Jun 21, 2003
Not in an ordinal majority system. The first skater would win with a majority of 5/1

Also in OBO. A skater in a field of nine with five first place ordinals and four nines would automatically have eight "wins" and we would not have to look at the JiF (judges in favor) column to determine the winner.

I agree. The focus shouldn't be on results and who's better than who so much as who is able to deliver wonderful, memorable performances. The joy of skating is about performance not credentials and decimals.

Still, we owe it to the skaters -- who are, after all, dedicated athletes competing for championships -- to do the best we can to determine who's better than who.
 
Joined
Jun 21, 2003
But Mathman's example shows that even using the smallest increment between scores leaves the possibility that other judges who hold the opposite opinion and use wider increments can override a majority.

To me, that is the crux of the matter. In the example given, and supposing that every judge serves with honesty and competence, who should win? The skater who is preferred (however slightly) by the majority of judges or the skater whose admirers and detractors feel so strongly as to give more wide-ranging marks?

Ordinals open up the possibility of flipflops and similar paradoxes.

I have never understood what is so "paradoxical" about flip-flops. The only reason we even attach the negative-sounding name "flip-flop" to this phenomenon is that we have counted our chickens before they hatch. Just tell the skaters, "hold your celebration until everyone skates."
 

Mafke

Medalist
Joined
Mar 22, 2004
If there is an answer, I think it is this:
Accept that judging is subjective. There can never be a perfect system for scoring. Even if the officials are clean as a whistle and never cheat, there will still ALWAYS be results that are "controversial" that some people aren't going to like.

If people have basic faith in the judging system (that is the judges themselves) then yes, I always say that close could-have-gone-either-way decisions are good for FS.

The problem is that very, very, very, very few people have faith in the judging system (and even less in the judges).

Until that's fixed FS is screwed in NAmerica.
 

Ophelia

Record Breaker
Joined
Dec 6, 2013
If there is an answer, I think it is this:

Accept that judging is subjective. There can never be a perfect system for scoring. Even if the officials are clean as a whistle and never cheat, there will still ALWAYS be results that are "controversial" that some people aren't going to like.

Embrace all of this as part of the fun and drama of figure skating!

It's cool and all when your favourite skaters do well, but if they don't... well pfft big deal. Enjoy the skating. Have a moan about results if necessary, but then move on and let it go - for the sake of your health.

Life's too short...

This only addresses the viewer. What about the SKATERS? I don't see anyone mentioning the most important group of people in all of this. Should skaters just dedicate all those hours and years to practice, and accept that they'll probably get rewarded with unfair scores? We may never achieve a perfect system, but we should aim for it.
 

drivingmissdaisy

Record Breaker
Joined
Feb 17, 2010
I have never understood what is so "paradoxical" about flip-flops. The only reason we even attach the negative-sounding name "flip-flop" to this phenomenon is that we have counted our chickens before they hatch. Just tell the skaters, "hold your celebration until everyone skates."

The paradox is that you have two skaters who are ranked against one another, and their relative rank can change based on what other people do. In SLC, after Michelle Kwan skated she was ahead of Sarah Hughes. Kwan was better than Hughes over the course of the SP and the LP, until Irina skated. Now Hughes was better than Kwan over the course of the SP and the LP, and a third skater was responsible for that flop.
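To see the mechanics of that flip, here is a simplified Python sketch using 6.0-era factored placements (0.5 x short program place + 1.0 x free skate place, lowest total wins, ties going to the better free skate). The placements are the ones commonly reported for that event (SP: Kwan 1st, Slutskaya 2nd, Hughes 4th; FS: Hughes 1st, Slutskaya 2nd, Kwan 3rd), so treat them as assumptions. The other skaters are left out, and the function name is made up for this example.

```python
def standings(sp_place, fs_place):
    """Order skaters by factored placement: 0.5 * SP place + 1.0 * FS place.
    Lower is better; a tie goes to the better free skate placement."""
    totals = {s: 0.5 * sp_place[s] + 1.0 * fs_place[s] for s in fs_place}
    order = sorted(totals, key=lambda s: (totals[s], fs_place[s]))
    return order, totals

sp = {"Kwan": 1, "Slutskaya": 2, "Hughes": 4}

# Free skate placements among these skaters before Slutskaya skated...
fs_before = {"Hughes": 1, "Kwan": 2}
# ...and after she slotted in between them.
fs_after = {"Hughes": 1, "Slutskaya": 2, "Kwan": 3}

print(standings(sp, fs_before))
# (['Kwan', 'Hughes'], {'Hughes': 3.0, 'Kwan': 2.5})
print(standings(sp, fs_after))
# (['Hughes', 'Slutskaya', 'Kwan'], {'Hughes': 3.0, 'Slutskaya': 3.0, 'Kwan': 3.5})
```

Hughes's total never changes. Kwan's free skate place slips from 2nd to 3rd when a third skater is placed between them, and that alone reverses their order.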
 