Men's PCS at Worlds. | Page 12 | Golden Skate

Men's PCS at Worlds.

skatinginbc

Medalist
Joined
Aug 26, 2010
The longer this discussion goes on the more strongly I am pulled toward having two equally weighted program components, skating skills and performance. The two together mean "skating well...to music." (We could call them the first mark and the second mark. :) )
Isn't that close to my proposal: Skating Skills × Presentation (although the multiplying part needs further debate; even I myself am not totally convinced by the multiplication :biggrin:)?
 
Joined
Jun 21, 2003
A few points of response:

The averages of the whole panel are going to flatten out the differences between highest and lowest components -- some judges will give wider ranges.

I do not see any a priori reason to think that this might or might not happen. You mean that one judge might say, this guy has great skating skills but he is not interpreting the music very well, while another might think he is interpreting the music well but not showing good skating skills? ETA: Yes. This happened with judge #2 and judge #6 for Hanyu.

Since I didn't know, I looked it up. Here are the scores for the nine judges for the first three men at Worlds.

Chan

Judge SS INT diff
#1 9.25 9.25 0.00
#2 9.25 9.25 0.00
#3 9.50 9.00 0.50
#4 9.25 9.25 0.00
#5 8.50 8.25 0.25
#6 8.50 9.50 1.00*
#7 9.25 9.25 0.00
#8 9.25 9.25 0.25
#9 9.25 9.25 0.00

For five out of nine judges, the difference was 0.

Hanyu

#1 7.75 7.75 0.00 (This is a different judge #1 from Chan's judge #1 :ohwell: )
#2 8.50 9.50 1.00*
#3 8.50 8.50 0.00
#4 9.00 9.25 0.25
#5 8.00 8.25 0.25
#6 9.00 8.25 0.75*
#7 7.50 7.50 0.00
#8 8.75 8.75 0.00
#9 8.25 8.25 0.00

For five out of nine judges, the difference was 0.

Takahashi

#1 8.50 8.50 0.00
#2 8.75 8.75 0.00
#3 8.25 9.00 0.75*
#4 9.00 8.75 0.25
#5 8.50 8.75 0.25
#6 8.50 8.75 0.25
#7 8.75 8.75 0.00
#8 8.75 8.75 0.00
#9 9.00 9.00 0.00

For five out of nine judges the difference was 0.

I am not sure what to conclude from this.


I think that actually happens pretty often.

But we rarely see that reflected in the scores.

The program components are factored so that, in theory, on average across a field of skaters some of whom are stronger in technique and some in performance, the TES and PCS will be approximately equal.

For senior men, the factors are nice round 1.0 for short programs and 2.0 for long.

If the number of components were broken down differently, the factors would have to change.

The factors would be the same. 1.0 for the short, 2.0 for the long. These factors would be applied to 5xSS instead of to SS+TR+PE+CH+INT.

E.g., suppose it were decided that you're right, everyone is just pretending, there is never any meaningful difference between the way the judges award scores for any of the five components and there's no hope of training them better or dividing the officials' responsibilities differently to make the differences meaningful, so let's just combine all five components into one score similar to the second mark under the 6.0 system. In that case, to keep the TES/PCS balance the same as it is now, the factor for the combined single second mark would need to be 5.0 in men's short programs and 10.0 for long . . . assuming that the maximum value for this score remains 10.0.

Yes, that is what I had in mind.

So let's say a judge wants to distinguish among three skaters who are approximately the same level, but within that level the judge sees a clear overall hierarchy in presentation ability that day. She decides to give one skater a score of 5.0 for this combined second mark and another skater a score of 5.5, and slips a third in between at 5.25. As close as they can get? Not really, when you multiply the differences by 10. The difference between the first two skaters ends up as 5.0, a gap wide enough to drive an average triple jump through. Yet there's only room for one skater between them and no means to differentiate on a finer level than those three.

A single judge already faces that challenge, unless she wants to get cute with the five components.

For instance, I suppose now she could say, overall I want Skater A to get 5.00 and skater B to get 5.10. So I will give skater A program component scores of 5.0, 5.0, 5.0, 5.0, 5.0, and I will give skater B scores of 5.0, 5.0, 5.0, 5.0 and 5.5. ETA: OK, you addressed this point below. Maybe a judge would have a legitimate reason to do that.

But this is cheating. Under the current system that 5.5 is supposed to have something to do specifically with interpretation, not just a nudge to make the ordinal placements come out right. (That's 6.0 :) ).

Well, that's easy to solve. Let the judges use increments of 0.1 again instead of 0.25.

I would definitely be against that, and I don't think it is necessary. Just let all judges judge what they see fairly. The averaging over the nine judges (or seven) would mitigate the unwanted gap between scores all by itself.
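
For anyone who wants to check the arithmetic in this exchange, here is a minimal sketch. It assumes the men's long program factor of 2.0 quoted above and a nine-judge panel on which seven marks count; the helper names are made up for the illustration and are not ISU terminology.

```python
# Illustrative sketch only: the helper names are invented for this example.
FREE_SKATE_FACTOR = 2.0  # senior men, long program (figure quoted in the thread)

def pcs_five_components(marks, factor=FREE_SKATE_FACTOR):
    """Current scheme: factor times the sum of SS, TR, PE, CH, INT."""
    return factor * sum(marks)

def pcs_single_mark(mark, factor=FREE_SKATE_FACTOR):
    """Proposed scheme: the factor applied to 5 x one combined mark."""
    return factor * 5 * mark

# One judge nudging a single component by 0.5 moves her own total by 1.0 point.
skater_a = pcs_five_components([5.0, 5.0, 5.0, 5.0, 5.0])
skater_b = pcs_five_components([5.0, 5.0, 5.0, 5.0, 5.5])

# With a single combined mark, the same 0.5 is worth 5.0 points.
single_gap = pcs_single_mark(5.5) - pcs_single_mark(5.0)

# Smallest step one judge can take (0.25), and what remains of it after
# averaging over the seven counted marks (assuming her mark is not discarded).
one_judge_step = pcs_single_mark(0.25)
panel_step = one_judge_step / 7

print(skater_a, skater_b, single_gap, one_judge_step, round(panel_step, 2))
```

So the 2.5-point step only exists at the level of one judge's card; once seven marks are averaged, the panel result can move in steps of roughly a third of a point.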

We don't really need to speculate as to what might happen. Here is how the Worlds men's LP would have turned out by the "Skating Skills only" method compared to the "five different components" method. (Posted by skatinginbc in post 197 above).

SSx10...PCS

91.10...90.14 Chan
83.90...83.00 Hanyu
87.10...85.78 Takahashi
81.80...81.66 Amodio
82.10...81.94 Joubert
75.70...74.92 Ten
78.90...77.02 Brezina
82.10...81.56 Abbott
5.00...75.50 Contesti
70.40...67.80 KVDP
75.70...73.30 Kozuka
70.40...67.22 Song
71.80...70.80 Reynolds
76.80...75.66 Fernandez
68.90...66.14 Voronov
74.60...73.84 Rippon
76.10...74.34 Verner
76.10...74.08 Gachinski
63.20...59.78 Liebers
62.50...61.50 Caluza
62.90...60.36 Pfeifer
59.60...59.20 Lucine
65.40...64.50 Ge
59.30...58.02 Raya
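
A quick way to read the two columns side by side is to look at the gap between them. The sketch below is only an illustration built on the numbers as typed in the list above; the Contesti row is left out because its first figure did not survive intact, and the variable names are arbitrary.

```python
# Rows copied from the list above as (skater, SS x 10, actual PCS).
# Contesti is omitted because the SS x 10 figure in that row is garbled.
rows = [
    ("Chan", 91.10, 90.14), ("Hanyu", 83.90, 83.00), ("Takahashi", 87.10, 85.78),
    ("Amodio", 81.80, 81.66), ("Joubert", 82.10, 81.94), ("Ten", 75.70, 74.92),
    ("Brezina", 78.90, 77.02), ("Abbott", 82.10, 81.56), ("KVDP", 70.40, 67.80),
    ("Kozuka", 75.70, 73.30), ("Song", 70.40, 67.22), ("Reynolds", 71.80, 70.80),
    ("Fernandez", 76.80, 75.66), ("Voronov", 68.90, 66.14), ("Rippon", 74.60, 73.84),
    ("Verner", 76.10, 74.34), ("Gachinski", 76.10, 74.08), ("Liebers", 63.20, 59.78),
    ("Caluza", 62.50, 61.50), ("Pfeifer", 62.90, 60.36), ("Lucine", 59.60, 59.20),
    ("Ge", 65.40, 64.50), ("Raya", 59.30, 58.02),
]

# Gap between the "Skating Skills only" total and the five-component total.
gaps = {name: round(ss10 - pcs, 2) for name, ss10, pcs in rows}

largest = max(gaps, key=gaps.get)
smallest = min(gaps, key=gaps.get)
mean_gap = sum(gaps.values()) / len(gaps)

print(f"largest gap:  {largest} ({gaps[largest]})")
print(f"smallest gap: {smallest} ({gaps[smallest]})")
print(f"mean gap over {len(gaps)} skaters: {mean_gap:.2f}")
```

In every legible row SS x 10 sits above the actual PCS, which is another way of saying that Skating Skills tended to be these skaters' highest-marked component.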

gkelly said:
On the other hand, I think that dividing the scores into five areas gives the judges not only a way to make fine distinctions among skaters who are more or less at the same basic skill level (a purpose that tiebreakers also served under 6.0), but also it's a way to communicate to skaters: This is your general skill level (e.g., low 5s). Within that general level, I thought you were strongest on Performance/Execution (nice posture, beautiful extension, good connection with the audience, totally committed to the movement) and weakest on Skating Skills (your edges weren't very deep or steady, and you were pretty slow out there).

That is an excellent point. By looking at the protocols I am not sure that the judges are actually doing that. I think that would be more valuable at lower levels than at the World Championships. Good point, though.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
Too soon to draw a conclusion from just one competition...Have you studied other ISU competitions yet?

Correlation between SS and PCS:

Skate America men's free: 0.9551
Skate Canada men's free: 0.9944
Cup of China men's free: 0.9885
NHK men's free: 0.9935
TEB men's free: 0.9764
Cup of Russia men's free: 0.9906
GPF men's free: 0.9985
European men's free: 0.9942
4CC men's free: 0.9962

How many 0.99s do you see? Do we even need more evidence? PCS is basically one component, namely, Skating Skills.
 
Joined
Jun 21, 2003
1. It's not true that a majority of judges thought Joubert's presentation was slightly better. Joubert received 11 higher scores than Hanyu, who received 14 higher scores than Joubert.

This can't be determined from the data. The ISU makes sure of this. The judges almost surely did not match up from lowest to highest as in your table.

2. Yes those who liked Hanyu liked him a lot but those who didn't like him so much gave him quite a bit lower scores than Joubert as well.

But in the aggregate, if the mean is larger than the median (the data is skewed to the right), this means that there is a tendency, however small in this example, for a few extra large scores to have a greater effect on the mean than the few extra small ones.

In an effort to even out scores by discarding the lowest and the highest, that was how the chips fell that day. Hanyu won over Joubert because more judges gave him higher scores than Joubert and more of Joubert's points were thrown out.

That's kind of a red herring. It is true but does not address the question of which skater was favored by the majority of the judges.

Anyway, the point is that it can, and does, happen in the CoP, that one skater is favored by a majority of judges but the other gets more CoP points, whatever the averaging procedure.

I guess that's OK. That's how an add-up-the-points system works. It's different from 6.0, though. In 6.0, if you win the majority of first place ordinals, you win, period.
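
To make the "majority prefers A, but B gets more points" situation concrete, here is a small sketch with a made-up nine-judge panel. The scores are hypothetical and chosen only to show the effect; the trimming follows the drop-the-highest-and-lowest procedure mentioned above.

```python
from statistics import median

def trimmed_mean(scores):
    """Drop one highest and one lowest mark, then average the rest."""
    ordered = sorted(scores)
    return sum(ordered[1:-1]) / (len(ordered) - 2)

# Hypothetical panel: position i in both lists is the same judge.
skater_a = [8.00] * 5 + [7.00] * 4   # five judges like A a little better
skater_b = [7.75] * 5 + [8.50] * 4   # four judges like B a lot better

judges_for_a = sum(a > b for a, b in zip(skater_a, skater_b))
judges_for_b = sum(b > a for a, b in zip(skater_a, skater_b))

print("judges preferring A / B:", judges_for_a, "/", judges_for_b)
print("trimmed mean A / B:", round(trimmed_mean(skater_a), 2),
      "/", round(trimmed_mean(skater_b), 2))
print("median A / B:", median(skater_a), "/", median(skater_b))
```

In this invented example five of nine judges prefer A, yet B has the higher trimmed mean, while the median sides with the majority.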
 

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
Skate Canada 2011:

LP:

SS....IN....Diff
9.04 8.71 0.33
8.14 8.29 -0.15
8.46 8.46 0
7.39 7.46 -0.07
6.82 6.54 0.28
7.14 7.04 0.10
6.43 6.00 0.43
6.21 5.50 0.71 (Kevin Van Der Perren)
5.96 5.39 0.57
5.64 5.61 0.03

SP:

SS....IN....Diff
7.68 8.04 -0.36
8.43 8.61 -0.18
8.93 8.54 0.39
7.50 7.43 0.07
6.93 6.68 0.25
6.50 6.39 0.11
6.29 6.64 -0.35
6.57 6.25 0.27 (Kevin Van Der Perren)
6.43 6.18 0.25
5.82 5.82 0

Cup of China 2011:

LP:

SS....IN....Diff
6.79 6.93 -0.14
7.86 7.54 0.32
8.00 8.32 -0.32 (Jeremy Abbott)
7.61 7.57 0.04
6.61 6.71 -0.10
7.29 7.36 -0.07
6.43 6.25 0.18
5.86 5.29 0.57 (JiaLiang Wu)

SP:

SS....IN....Diff
7.43 7.54 -0.11
7.43 7.36 0.07
7.71 8.04 -0.33 (Jeremy Abbott)
7.89 7.61 0.28
6.39 6.32 0.07
5.89 5.43 0.46 (JiaLiang Wu)
6.68 6.71 -0.03
6.71 6.64 0.07

NHK 2011:

LP:

SS....IN....Diff
8.96 9.25 -0.29
7.93 7.96 -0.03
6.96 7.43 -0.47 (Samuel CONTESTI)
6.86 6.79 0.07
7.25 7.32 -0.07
6.93 6.82 0.11
6.29 6.25 0.04
6.61 6.68 -0.07
6.71 6.61 0.10

SP:

SS....IN....Diff
8.79 9.04 -0.25
7.89 7.75 0.14
6.89 6.93 -0.04
6.64 6.57 0.07
6.96 7.00 -0.04
6.54 6.57 -0.03
6.93 7.14 -0.21 (Samuel CONTESTI)
6.14 6.04 0.10
7.25 7.04 0.21

So I think we got the pattern.
 

Violet Bliss

Record Breaker
Joined
Nov 19, 2010
This can't be determined from the data. The ISU makes sure of this. The judges almost surely did not match up from lowest to highest as in your table.

This is exactly what the data shows, however they are lined up. I arranged the scores in order only to make it easier to see the spread and what scores were thrown out. If the fact that Hanyu received more higher scores than Joubert does not determine that more judges favored him, you can't support the opposite as proclaimed from the same data either.

But in the aggregate, if the mean is larger than the median (the data is skewed to the right), this means that there is a tendency, however small in this example, for a few extra large scores to have a greater effect on the mean than the few extra small ones.

But the data also show that there is a larger number of higher scores for Hanyu, so his aggregate does not come entirely from a few extra large scores. He has more of the larger scores, especially after the high and low were discarded, which is why the tendency is small.

That's kind of a red herring. It is true but does not address the question of which skater was favored by the majority of the judges.

Anyway, the point is that it can, and does, happen in the CoP, that one skater is favored by a majority of judges but the other gets more CoP points, whatever the averaging procedure.

But you can't prove that this is the case as announced.

I guess that's OK. That's how an add-up-the-points system works. It's different from 6.0, though. In 6.0, if you win the majority of first place ordinals, you win, period.

In this case Hanyu did win more higher ordinals than Joubert, seemingly even in just the PCS, which is why I debate your statement which can't be proven:

Thus the CoP. A majority of judges think that skater A was better, but skater B gets higher scores.

BTW, I was the one who compiled the SSx10 vs PCS data per your suggestion.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
It's not true that a majority of judges thought Joubert's presentation was slightly better. Joubert received 11 higher scores than Hanyu, who received 14 higher scores than Joubert.
I was talking about Presentation (PE + CH + IN). You mixed the SS and TR data in and came up with a different result, which has nothing to do with what I was talking about, and which, like the design of CoP, is capable of confusing casual viewers. My conclusion had a scope (i.e., PE + CH + IN) and context (i.e., median as an estimate of majority opinions). You took it beyond its scope and out of context.
eta I realized the original numbers and conclusions were drawn on 3 components of PE, CH, and IN. But the facts remained, Joubert had 6 higher scores while Hanyu had 7 higher scores bestowed by the judges and Joubert had 1.25 X 2 more points thrown out within these categories.
Interestingly, by using your "logic" of rearranging the data and combining "votes" across categories, I combined the data in all three categories and found Joubert actually had a higher mean (8.37) than Hanyu's (8.36) and, of course, a higher median (8.50) than Hanyu's (8.25). Both measures of central tendency supported my hypothesis that Joubert won the majority vote :biggrin:. Seriously, we cannot simply rearrange the data into pairs that did not come from the same judge and then compare who beat whom. Since the ISU hides the judges and scrambles the order of scores, the best way we can estimate the majority ranking/vote is through the use of the median.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
So I think we got the pattern.
Indeed, from your data I see the pattern: Besides a small number of incidents where the difference between SS and IN scores is greater than the minimal increment, the scores in these two categories were almost identical (i.e., no greater than the minimal gradation). This is the pattern (Correlation coefficient between SS and IN):
Skate Canada
Men's SP: 0.9647
Men's LP: 0.9795

Cup of China
Men's SP: 0.9693
Men's LP: 0.9641

NHK:
Men's SP: 0.9859
Men's LP: 0.9813

The observed correlation between SS and IN is too high (ranging from 0.9641 to 0.9859). It casts strong doubt on the judges' capacity to treat SS and IN as distinct categories, and raises the question of whether those few cases where skaters showed a discrepancy between SS and IN should have had an even greater difference in their scores if the judges did not have the tendency to score them in the same range.
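
For readers who want to check one of these figures, the sketch below computes a plain Pearson correlation on the SS and IN columns posted earlier for the Skate Canada men's LP. The function is a hand-written textbook formula, not any official ISU calculation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Textbook Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Skate Canada 2011, men's LP, panel averages as posted earlier in the thread.
ss = [9.04, 8.14, 8.46, 7.39, 6.82, 7.14, 6.43, 6.21, 5.96, 5.64]
interp = [8.71, 8.29, 8.46, 7.46, 6.54, 7.04, 6.00, 5.50, 5.39, 5.61]

r = pearson_r(ss, interp)
print(f"SS vs IN correlation: {r:.4f}")
```

On those ten rows the result comes out at roughly 0.98, in line with the Skate Canada LP figure listed above.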
 

skatinginbc

Medalist
Joined
Aug 26, 2010
The averages of the whole panel are going to flatten out the differences between highest and lowest components -- some judges will give wider ranges. In this event, as unfortunately in many events, the widest range is rarely over 0.75 for a given skater. I would expect something like 0.75 or 1.0 to be the average difference (i.e., larger than 0.36). But I would expect differences on the order of 1.5 or 2.0 or more to be exceptions
I largely agree. I expect the discrepancy between judges' scores to be smaller in a more objective category (e.g., no greater than 1.0 in Skating Skills) and slightly greater in a more subjective category (e.g., no greater than 1.5 in Interpretation). Anything greater than 1.5 is likely a rogue score. And my instinct tells me that a normal standard deviation among judges' scores in IN is probably somewhere between 0.35 and 0.65, such as Chan's first showing last season (i.e., Skate Canada) = 0.42, Chan's first almost-clean skate (i.e., 4CC) = 0.46, Yuzuru's first showing (i.e., Cup of China) = 0.40, Yuzuru's first almost-clean skate (Worlds) = 0.61, an unfamiliar face at Worlds (e.g., Harry Hau Yin LEE) = 0.59, and another unfamiliar face at Worlds (e.g., Taras RAJEC) = 0.45.

The standard deviation for Dai's LP at Worlds was 0.16, so small, almost like the judges already made their decision before he even skated. His first showing at Skate Canada had a standard deviation of 0.58.
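
As a rough check on figures like these, the sketch below takes the INT columns from the Worlds tables posted earlier in the thread and computes the spread across the nine judges. It assumes the population standard deviation is the one intended (the post does not say which formula was used) and reads the two mistyped entries as 8.25 and 8.75.

```python
from statistics import pstdev, stdev

# INT marks by judge, copied from the Worlds tables earlier in the thread.
hanyu_int = [7.75, 9.50, 8.50, 9.25, 8.25, 8.25, 7.50, 8.75, 8.25]
takahashi_int = [8.50, 8.75, 9.00, 8.75, 8.75, 8.75, 8.75, 8.75, 9.00]

for name, marks in (("Hanyu", hanyu_int), ("Takahashi", takahashi_int)):
    # pstdev divides by n, stdev by n - 1; both are shown for comparison.
    print(f"{name}: population SD = {pstdev(marks):.2f}, "
          f"sample SD = {stdev(marks):.2f}")
```

With the population formula Hanyu's INT marks give about 0.61, matching the Worlds figure above. Takahashi's INT column as typed gives about 0.14, close to but not exactly the 0.16 quoted, so that number may come from a different component or calculation.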
 
Joined
Jun 21, 2003
This is exactly what the data shows, however they are lined up.

SS:
Hanyu. 7.50--7.50--7.75--8.00--8.25--8.50--8.50--8.75--9.00 - 8.39
Joubert 8.00--8.00--8.00--8.00--8.25--8.25--8.50--8.75--8.75 - 8.21

Joubert wins 3 judges to 2, with 4 ties.

Match up the judges differently:

Hanyu. 7.50--7.50--7.75--8.00--8.25--8.50--8.50--8.75--9.00 - 8.39
Joubert 8.75--8.50--8.25--8.25--8.00--8.00--8.00--8.00--8.75 - 8.21

Hanyu wins, 5 judges to 4, no ties.

Hanyu. 8.50--7.50--7.75--8.00--8.25--7.50--8.50--8.75--9.00 - 8.39
Joubert 8.75--8.50--8.25--8.25--8.00--8.00--8.00--8.00--8.75 - 8.21

Joubert wins, 5 judges to 4, no ties.
 

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
I expect the discrepancy between judges' scores to be smaller in a more objective category (e.g., no greater than 1.0 in Skating Skills) and slightly greater in a more subjective category (e.g., no greater than 1.5 in Interpretation).

But there might be a little concern here. Greater discrepancy on the more subjective categories would mean that sometimes in some cases the placement of the skaters could be decided by these subjective categories. Will you say "Be it. I'll accept it"?;)
 

skatinginbc

Medalist
Joined
Aug 26, 2010
But there might be a little concern here. Greater discrepancy on the more subjective categories would mean that sometimes in some cases the placement of the skaters could be decided by these subjective categories. Will you say "Be it. I'll accept it"?;)
As I stated in Post #229, I expect the normal standard deviation for a subjective category to be somewhere between 0.35 and 0.65. Of course, I will happily accept such subjectivity because that is the way it should be. I'm happy that Yuzuru's PE scores at Worlds showed a great variance among judges (standard deviation = 0.51). It made so much sense to me: Some judges liked his musicality very much while the others disliked his postures. The scores looked REAL. And don't forget we use the mean (or maybe median one day, like the International Tchaikovsky Competition http://www.tchaikovsky-competition.c.../voting_system) to find the middle point of the judges' opinions. If you ask me whether I would take the subjective opinions of nine people who observe an elephant (i.e., the "true" or "right" score of a skater) from various perspectives, or I would take the objective opinions of nine people who measure the elephant's ear (i.e., skating skills) and the ear only with a ruler, I would say the former approach is a better way of finding out about the whole elephant.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
I'm happy that Yuzuru's IN scores at Worlds showed a great variance among judges. It made so much sense to me: Some judges liked his musicality very much while the others disliked his postures. The scores looked REAL.

Well, the posture shouldn't be reflected in the interpretation category. That's one reason why I don't like the idea of combining those three (or all five) components.
 

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
As I stated in Post #229, I expect the normal standard deviation for a subjective category to be somewhere between 0.35 and 0.65. Of course, I will happily accept such subjectivity because that is the way it should be. I'm happy that Yuzuru's IN scores at Worlds showed a great variance among judges. It made so much sense to me: Some judges liked his musicality very much while the others disliked his postures. The scores looked REAL. And don't forget we use the mean (or maybe median one day, like the International Tchaikovsky Competition http://www.tchaikovsky-competition.c.../voting_system) to find the middle point of the judges' opinions. If you ask me whether I would take the subjective opinions of nine people who observe an elephant (i.e., the "true" or "right" score of a skater) from various perspectives, or I would take the objective opinions of nine people who measure the elephant's ear (i.e., skating skills) and the ear only with a ruler, I would say the former approach is a better way of finding out about the whole elephant.

But if you did as you said you will, there wouldn't have been 16 pages in this thread which were largely generated by your outrage on Patrick Chan.;)
 

skatinginbc

Medalist
Joined
Aug 26, 2010
Well, the posture shouldn't be reflected in the interpretation category. That's one reason why I don't like the idea of combining those three (or all five) components.
Sorry, I meant PE (standard deviation = 0.51) and I will edit my Post #232 accordingly. Hanyu's emotional involvement in the music was excellent (very convincing, honest), but his carriage, eh, still left much to be desired.

But if you did as you said you will, there won't be 16 pages in this thread which were largely generated by your outrage on Patrick Chan.;)
Come on. My outrage concerns the lack of variance in his scores. It doesn't look real. And it doesn't make sense.
 

doctor2014

On the Ice
Joined
Nov 21, 2010
Data entry mistakes?
SS:
Hanyu. 7.50--7.50--7.75--8.00--8.25--8.50--8.50--8.75--9.00 - 8.39
Joubert 8.00--8.00--8.00--8.00--8.25--8.25--8.50--8.75--8.75 - 8.21

Joubert wins 3 judges to 2, with 4 ties.
In ascending order, the SS scores should’ve been:
Hanyu. 7.50--7.75--8.00--8.25--8.50--8.50--8.75--9.00--9.00 - 8.39
Joubert 8.00--8.00--8.00--8.00--8.25--8.25--8.50--8.50--8.75 - 8.21

So Hanyu wins 6 judges to 2, with 1 tie.
Match up the judges differently:

Hanyu. 7.50--7.50--7.75--8.00--8.25--8.50--8.50--8.75--9.00 - 8.39
Joubert 8.75--8.50--8.25--8.25--8.00--8.00--8.00--8.00--8.75 - 8.21

Hanyu wins, 5 judges to 4, no ties.
With the correct data substituted, I matched up the judges in the same way as you did:

Hanyu. 7.50--9.00--7.75--8.00--8.25--8.50--8.50--8.75--9.00 - 8.39
Joubert 8.75--8.50--8.25--8.25--8.00--8.00--8.00--8.00--8.50 - 8.21

Hanyu wins, 6 judges to 3, no ties.
Hanyu. 8.50--7.50--7.75--8.00--8.25--7.50--8.50--8.75--9.00 - 8.39
Joubert 8.75--8.50--8.25--8.25--8.00--8.00--8.00--8.00--8.75 - 8.21

Joubert wins, 5 judges to 4, no ties.
Hanyu. 8.50--7.50--7.75--8.00--8.25--9.00--8.50--8.75--9.00 - 8.39
Joubert 8.75--8.50--8.25--8.25--8.00--8.00--8.00--8.00--8.50 - 8.21

Hanyu wins, 5 judges to 4, no ties.
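
The corrected SS lists can be checked with a short sketch. It assumes the panel average is the mean after dropping one highest and one lowest mark, and it counts "wins" for whatever pairing of the anonymous marks is supplied. The third pairing is constructed purely for this illustration and does not come from the protocol.

```python
def trimmed_mean(scores):
    """Drop one highest and one lowest mark, then average the rest."""
    ordered = sorted(scores)
    return sum(ordered[1:-1]) / (len(ordered) - 2)

def head_to_head(first, second):
    """Return (wins for first, wins for second, ties) for a given pairing."""
    wins_first = sum(a > b for a, b in zip(first, second))
    wins_second = sum(a < b for a, b in zip(first, second))
    return wins_first, wins_second, len(first) - wins_first - wins_second

# Corrected SS marks in ascending order, as listed in this post.
hanyu = [7.50, 7.75, 8.00, 8.25, 8.50, 8.50, 8.75, 9.00, 9.00]
joubert = [8.00, 8.00, 8.00, 8.00, 8.25, 8.25, 8.50, 8.50, 8.75]

print(round(trimmed_mean(hanyu), 2), round(trimmed_mean(joubert), 2))

# Pairing 1: both lists in ascending order.
ascending = head_to_head(hanyu, joubert)

# Pairing 2: the last rearrangement shown in this post.
hanyu_p2 = [8.50, 7.50, 7.75, 8.00, 8.25, 9.00, 8.50, 8.75, 9.00]
joubert_p2 = [8.75, 8.50, 8.25, 8.25, 8.00, 8.00, 8.00, 8.00, 8.50]
rearranged = head_to_head(hanyu_p2, joubert_p2)

# Pairing 3 (made up here): the same Joubert marks, reordered to favour him.
joubert_p3 = [8.00, 8.00, 8.25, 8.50, 8.75, 8.00, 8.00, 8.25, 8.50]
constructed = head_to_head(hanyu, joubert_p3)

print(ascending, rearranged, constructed)
```

The trimmed means reproduce the 8.39 and 8.21 shown in the lists, and the win counts range from 6-2-1 for Hanyu to 5-4 for Joubert depending only on how the same marks are paired.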
 

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
Come on. My outrage concerns the lack of variance in his scores. It doesn't look real. And it doesn't make sense.

This has gone back to the original argument. Who decides what looked "real" and what looked "unreal"? Who decides what made sense and what didn't make sense? YOU?! To use your subjective views against the judges' subjective views, who is "right"? You have no reason to dismiss the possibility that sometimes the judges have all been charmed by a skater's performance in the same way and given very close scores. Tell me that there is no such possibility.;)
 

let`s talk

Match Penalty
Joined
Sep 10, 2009
Who decides what looked "real" and what looked "unreal"? Who decides what made sense and what didn't make sense? YOU?! To use your subjective views against the judges' subjective views, who is "right"?
The audience. A.k.a. the paying customer. And he expressed his opinion in Nice.
 

let`s talk

Match Penalty
Joined
Sep 10, 2009
After we've gone through this thread for 16 pages, it seems that opinion in Nice wasn't correct after all.
That is exactly why the popularity of FS is declining rapidly: a paying customer is told his opinion is not right after all. So he goes to watch something else.
 