I'm still not sure what exactly the IJS is supposed to measure; what is the name of the construct? Without knowing that for certain, I'm afraid I'm clueless about the best fit.
We can argue forever whether the judges' scores reflect noisy samples of some underlying "truth", or whether there is no independent "truth" except what emerges as a consensus from judges' scores.
I believe there is no independent "truth" but only a consensus at each moment, one that evolves as time goes on. Judges in the past, say 30 years ago, had different standards from current judges. After another 30 years, they will give different scores for the same performance even if the current judging system is still in use. It is fundamentally impossible to set an absolute standard, because FS is still evolving while judges are human; they are inherently limited by what they have learned, what they have experienced, and so on.
I understand the scores as follows.
Suppose we have three skaters, and let a judge rank their performances according to his own standard. Each skater will get a ranking from 1 to 3. Now the judge might feel that, even though he gave the numbers 1 to 3, the number-one skater is extremely good compared with the other two, and that it would be unfair to express the difference by the ranking alone. This may simply be because the number of skaters is just three. If there were, say, seven more skaters who are reasonably good, those seven might actually be ranked 2nd through 8th. The judge would be happier to give the numbers 1, 9, and 10 to the original three skaters, even when there are only three of them.
Go one step further and imagine all the skaters in the world, and let the judge do his job. Well, all the skaters in the world at a specific time might not be enough, because sometimes there are only, say, 3 top-level skaters and 20 second-tier skaters, and the gap between the two groups might be huge. In this case, let the judge fill the gap by imagining hypothetical skaters. What I am trying to do here is let the judge imagine a huge number of "evenly distributed" (according to his own standard) hypothetical performances and identify the ranking of each performance. You may object that the number of hypothetical performances goes to infinity, but this is not a problem, because we can ask the judge to normalize the ranking to a number between, say, 0 and 10.
One more step. Now consider all the judges in the world at the present time, and let each judge do the same thing. In general, each judge has his own standard, and these standards will all differ. So take the average. Then you get a unique number for each performance (or for each element under consideration). It may be viewed as the best number representing the performance: not an "independent truth", but a consensus at present.
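The procedure above can be sketched as a small simulation. Everything here is my own illustrative assumption, not part of any real judging system: each judge's "standard" is modeled as a random weighting of two performance features, the judge scores a real performance by its percentile rank among many imagined hypothetical performances (scaled to 0-10), and the consensus is the average over judges.

```python
import random

random.seed(0)  # reproducible illustration

N_JUDGES = 9
N_HYPOTHETICAL = 10_000  # the judge's imagined "evenly distributed" field

# Three real performances, each described by two hypothetical features
# (say, technical quality and artistic quality, both on a 0-10 scale).
performances = {"A": (9.0, 8.5), "B": (6.0, 7.0), "C": (5.5, 6.5)}

def make_judge():
    """A judge's personal standard: his own weighting of the two features."""
    w = random.uniform(0.3, 0.7)
    return lambda perf: w * perf[0] + (1 - w) * perf[1]

def normalized_ranking(judge, perf):
    """Rank `perf` among hypothetical performances, normalized to 0-10."""
    hypothetical = [(random.uniform(0, 10), random.uniform(0, 10))
                    for _ in range(N_HYPOTHETICAL)]
    score = judge(perf)
    below = sum(1 for h in hypothetical if judge(h) < score)
    return 10 * below / N_HYPOTHETICAL

judges = [make_judge() for _ in range(N_JUDGES)]

# The consensus number: average of the normalized rankings over all judges.
consensus = {
    name: sum(normalized_ranking(j, perf) for j in judges) / N_JUDGES
    for name, perf in performances.items()
}
for name, score in sorted(consensus.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

Note that even though each judge weights the features differently, averaging their normalized rankings produces a single stable number per performance, which is exactly the "consensus at present" in the argument above.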
Of course you cannot do this in reality. The only thing you can do is select some finite number of judges, and there is no guarantee that their standards agree. Then the best you can do is write a guideline in advance that most judges at present would consider reasonable, and ask the judges to consult it when giving the "normalized ranking" for each performance. This is the score we get. In this way, even if you cannot really "measure" something (in the sense of measuring the length of a rod), whether it is TES or PCS, you can assign a quantitative number to each performance (or to each element), and the differences between scores have a meaning as differences between normalized rankings according to the consensus of the moment.