Globe and Mail: Figure skating judging system still has flaws | Page 2 | Golden Skate

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
...he will be shuffled to the middle or back of the pack. Best case scenario would be a 7th place at Nationals I would guess.

That'll be good for him because he'd have the "proof" that what he said about politics in skating and how unfair USFSA was to him were true.:laugh:

Seriously, I think USFSA will find out soon enough that Weir's "comeback" will bring them more trouble than benefits.;)
 

sweetskates1

Medalist
Joined
Feb 5, 2012
Is there a thread about Johnny's book on this forum? Please let me know. I have not had the chance to read it and am curious about the content.

That'll be good for him because he'd have the "proof" that what he said about politics in skating and how unfair USFSA was to him were true.:laugh:

Seriously, I think USFSA will find out soon enough that Weir's "comeback" will bring them more trouble than benefits.;)
 

Srin Odessa

On the Ice
Joined
Jan 23, 2012
I don't know if this is historically accurate as it applies to figure skating, but judging from 0 to 6 is perfect from the point of view of cognitive psychology. Many studies have confirmed that humans can correctly place items into six or seven comparative categories, but no more.

That is why the program component scores have come in for such criticism. There are forty-one different grades that you can get, from 0.00 to 10.00, graduated in quarters of a point. It is beyond human capacity to distinguish between a performance that deserves 5.50 and one that deserves 5.75.

Not only that, but statistically speaking the sampling errors swamp the thing that is being measured.

During the figures era, skaters traced six figures scored from 0.0 to 1.0. They did three tracings with each foot for a total of six. The scoring system's range carried over to the program portion to keep it consistent.
 
Joined
Jun 21, 2003
But isn't 0 to 6 actually 0.0 to 6.0, which is sixty grades, compared to 40s for PCS? And given that judges have to judge dozens of performances at a time, how can they correctly place them relative to each other if we can only do six/seven comparative categories? Or am I misunderstanding what you're saying?

In principle, the 5.7s etc. are just place holders for ordinals. That is, the judge decides that the third skater, skater C, is better than A but worse than B, and gives out whatever scores are required to make it come out that way.

Also in principle, a person can rank any number of things from first to last if he has the opportunity to compare them one by one. So except for forgetting what you had seen before, it ought to be theoretically possible for a judge to rank 100 skaters from best to worst and to do it with a fair amount of consistency.

There have actually been some studies -- by George Rossano for instance -- that analyze the percentage of time that ordinal judging "gets it right" compared to add-up-the-points scoring. IMHO the gap between theory and practice is too great for those studies to be very convincing one way or the other.

Anyway, I was just speculating as to why, maybe, someone thought that 6 was a good number.

The serious point, though, is that CoP enthusiasts cannot claim that the current system is better because it is "more mathematical."
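
For anyone who wants to check the grade counts discussed above, here is a minimal Python sketch. It assumes 6.0 marks were given in tenths and program component scores in quarter points, with both endpoints counted. The judge's marks and the helper names (`count_grades`, `ordinals`) are made up for illustration and are not taken from any ISU rule.

```python
from fractions import Fraction

def count_grades(low, high, step):
    """Number of distinct marks from low to high inclusive, in increments of step."""
    low, high, step = Fraction(low), Fraction(high), Fraction(step)
    return int((high - low) / step) + 1

def ordinals(marks):
    """Turn one judge's marks into placements (1 = best); ties share the better place."""
    ranked = sorted(marks.values(), reverse=True)
    return {skater: ranked.index(mark) + 1 for skater, mark in marks.items()}

# Assumption: PCS runs from 0.00 to 10.00 in quarter points.
pcs_grades = count_grades("0", "10", "1/4")
# Assumption: 6.0 marks run from 0.0 to 6.0 in tenths.
six_grades = count_grades("0", "6", "1/10")

# Hypothetical marks from a single judge; under ordinals only the order matters.
judge_marks = {"A": 5.6, "B": 5.8, "C": 5.7}
placements = ordinals(judge_marks)

print(pcs_grades, six_grades, placements)
```

Counted this way, the 6.0 scale has 61 possible marks against 41 for PCS, and in the placeholder view described above only the placements (B first, C second, A third) would carry any meaning, not the marks themselves.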
 
Joined
Jun 21, 2003
During the figures era, skaters traced six figures scored from 0.0 to 1.0. They did three tracings with each foot for a total of six. The scoring system's range carried over to the program portion to keep it consistent.

Thank you! I love this historical stuff. :)
 
Joined
Aug 16, 2009
During the figures era, skaters traced six figures scored from 0.0 to 1.0. They did three tracings with each foot for a total of six. The scoring system's range carried over to the program portion to keep it consistent.


Aha! That explains a lot. Thanks so much.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
If there had never been a 6.0 system, would anybody today design a scoring system this way at all? Would it even occur to them?

Good question. I suppose it depends what they're trying to achieve.

Suppose that figure skating had reached an advanced state of technique and "production values," for lack of a better term, comparable to the last 20 years in real life. But there had never been any competition format and it was just something to do for oneself and fellow participants, and for audiences. Or else imagine that it's today and the ISU wants to throw out both 6.0 and IJS in all their past incarnations and start from scratch.

Let's say a consulting firm or task force is assigned to develop a competition format and scoring system. What are they told? Focusing on singles skating...

*FS is a sport centered around athletic tricks such as jumps and spins. The skating community has detailed knowledge about what constitutes more difficulty in these elements and what constitutes higher quality. There are some areas of continual debate on these points, but for the most part scoring these elements should be pretty objective. The difficulty and the quality of the skating in between the elements and the way the skater relates to the music and the audience and the surrounding space should also be taken into account, but since these areas are more subjective they shouldn't count as much as the objective part.

OR

*Skating is a highly technical sport based on the different ways the human body can direct blade edges to glide across the ice on curves. All skating skills should be rewarded in terms of the wide range of ways to vary and combine the basic edges and to transition from one edge to another (including multiple rotations in the air). Spectacular and athletic variations are worth rewarding. So are skill shown in executing and combining these moves with visual beauty and in time with music or even better expressing nuances of the music. But the control of edges on ice is always fundamental.

OR

*There's a long history of skating technique and . Different countries have different ways of encouraging their skaters to develop high skill levels in basic technique, advanced spectacular elements like jumps and spins, and integration of these skills into coherent programs that can be considered as athletic works of art. At the international level we don't care about how skaters develop the skills along the way -- we want to produce a competitive product that will appeal to the general public and draw them in to care about the results. Historically a large percentage of skating fans have also been fans of performing arts and love skating for the same reason they love dance and other artforms. Fans from both arts and other sports also tend to place a strongly negative value on blatant errors such as falls, which are unfortunately fairly common in skating due to the nature of the medium. So any competition system should be designed to heavily reward artistry and heavily penalize disruptive errors such as falls.

Each of those three ways of summarizing the sport would probably produce different approaches to designing a scoring system.

Do we tell the task force to take one of the following general approaches, or do we leave it up to them to decide which?

*The evaluators should look at the whole performance -- the big picture -- and rank the performances holistically in terms of overall value. They'll use their expert knowledge of skating difficulty and technical quality to inform their rankings, but because so many of them are hard to quantify and hard to separate from each other, we'll focus on scoring at the level of whole programs.

OR

*The evaluators look at all the individual skills -- basic skating, individual elements, visual appearance of the body while performing the moves, relation to the spectators and the space and the music -- and evaluate each of them separately. Skaters and the public should be able to see scores for each of these parts and global aspects of the performance. The results will be based on adding up the scores for all the different parts and aspects.

Today's fans demand justification for every individual mark on every element and component, poring over and debating about parts and all of the protocols. They will never stand for judges' simple verdict on one skater's presentation over another's. That time has passed.

Obviously some fans today (and some skaters, and some judges) do not prefer to see marks for every element and component and would prefer holistic marks for the whole program. But these are mostly fans who learned the sport through the 6.0 system and don't like the change. I suppose one could argue that these are "yesterday's fans" -- but if the skating world wants to keep these fans as fans for the present and future, it can't just dismiss them as part of the past just because the scoring system they got familiar with is now in the past.

I'd be really curious to find some people who know a lot about evaluation systems and nothing about the history of figure skating scoring and see what they would come up with given different descriptions of the task.
 

gmyers

Record Breaker
Joined
Mar 6, 2010
Weir was burned very, very badly by the Inman and Nichol dictatorship over what should get high scores in PCS. Weir went out there and probably had the best overall performance of all the men. No quads, no Inman-Nichol choreography. You could see how quads could almost triumph over the Nichol-Inman dictatorship. But even on the technical side he did his second 3A in the wrong place. The system was dictating and demanding where to place his second triple axel in order to maximize points, no matter where the jumps made sense to the skater and their music or feeling.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
I'd be really curious to find some people who know a lot about evaluation systems and nothing about the history of figure skating scoring and see what they would come up with given different descriptions of the task.
If you ask anyone who holds a degree in Research, Evaluation and Measurement, I guarantee you will discover what they talk about is mostly two words: "Reliability" and "Validity". A consultant that designs a scoring system without considering reliability and validity is not a scientist but an artist, probably a con artist. Indeed, the design itself is more of an art. Through exploring the objectives (e.g., cheating prevention, feedback to skaters, etc.) and constraints (e.g., maximum number of judges, cost, time, equipment, etc.), the consultant gets an idea of whether a holistic measurement or a cumulative evaluation can better meet the needs. Then he will rely heavily on literature reviews, interviews and focus groups to define the contents (e.g., elements, appropriate weight for each skill, etc.) and use his creativity to come up with a dazzling system. Not much science about it, more like interior design.

Now, here come the scientific parts:
1. Content Validity (The extent to which a measurement covers the content that it is supposed to measure): What are the quantitative methods the ISU used when analyzing opinions of the selected experts and representatives in determining the "content" (i.e., what constitutes good skating)? Apparently Salé and Pelletier, who said the new system killed the art, were not selected, and neither were other like-minded former champions. Is the scoring system comprehensive and representative of the content? Well, field moves are disappearing, which says something about whether CoP proportionally reflects all aspects of skating.
2. Construct Validity (The extent to which a measurement actually measures what it is intended to measure): Misha Ge beat Chan and Dai in the choreographic step sequence, which tells us what traits were actually being measured: the emoting face and flailing arms, not so much the steps. "Landing a quad gets PCS boost from the judges" violates "discriminant validity" (Tests designed to measure unrelated skills should not correlate highly). Peter the commentator exclaimed, "That (Dai's SP) to me should win..Wow, er, the presentation score, er...er, far below." It raises a question of "convergent validity" (Tests designed to measure the same construct should correlate highly amongst themselves). Presentation scores (PE and IN in particular) under CoP should correlate highly with the ones from the perspective of 6.0.

There is a lot more to say about validity, and I haven't talked about reliability yet, but that's enough for today, or it would be a long thesis that bores people to death.
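
To make the convergent and discriminant checks above a bit more concrete, here is a rough Python sketch using a plain Pearson correlation. Every number in it is invented purely for illustration (five imaginary skaters), so it says nothing about how real judging behaves. The variable names are hypothetical too.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented data for five imaginary skaters (not real results).
cop_presentation = [7.0, 7.5, 8.0, 8.5, 9.0]   # hypothetical PE/IN-style marks under CoP
six_presentation = [5.2, 5.4, 5.5, 5.8, 5.9]   # hypothetical 6.0 presentation marks
quads_landed = [1, 0, 2, 0, 1]                 # hypothetical count of quads landed

# Convergent validity: two measures of the same construct should correlate highly.
convergent = pearson(cop_presentation, six_presentation)

# Discriminant validity: measures of unrelated skills should not correlate highly.
discriminant = pearson(cop_presentation, quads_landed)

print(round(convergent, 2), round(discriminant, 2))
```

With real protocols in place of the invented lists, a low `convergent` value or a high `discriminant` value would be the kind of warning sign described in point 2.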
 

gkelly

Record Breaker
Joined
Jul 26, 2003
Thanks for the response, skatinginbc. Please continue if you have time.

1. Content Validity (The extent to which a measurement covers the content that it is supposed to measure): What are the quantitative methods the ISU used when analyzing opinions of the selected experts and representatives in determining the "content" (i.e., what constitutes good skating)? Apparently Salé and Pelletier, who said the new system killed the art, were not selected, and neither were other like-minded former champions.

That probably comes down to the question of defining good skating. Is it primarily good execution of elements? Good use of the blade across the ice between and within elements? Or good art performed by use of skating skills?

Obviously experts who define "good skating" differently will come up with different answers about how to measure it or whether the current system is succeeding.

Even within those broad approaches, there will be differences of opinion over whether to privilege, e.g., speed vs. complexity as a measure of skating skill.

How can an evaluation system either reconcile the different experts' different opinions or else define the subject matter so clearly that everyone is on exactly the same page using exactly the same definition?

Is the scoring system comprehensive and representative of the content? Well, field moves are disappearing, which says something about whether CoP proportionally reflects all aspects of skating.

I'm not sure what you mean by "field moves are disappearing" -- how you're defining field moves, and at what points in skating history you think there are more and when you think they started to disappear.

To a large degree the presence of certain field moves has been dependent on what the program definitions required rather than on how they were scored.

E.g., before about 1980, men's programs were 5 minutes long instead of 4 1/2, and skaters weren't doing as many as 7-8 triples let alone any quads, so there was a lot more time for field moves.
Ca. 2000-2004 the well-balanced program rules required senior men to include a field moves sequence in their long programs. So they almost all included them during that period, combined into a sequence. Before that, skaters who could do those moves well tended to include them and skaters who couldn't tended not to.
Under IJS ca. 2004 to present the well-balanced program rules require two step sequences of senior men in their long programs. So that leaves less time for field moves, which would be included as transitions by those who do them well and ignored by those who don't (or who need more time to set up scoring elements so don't have time for transitions).

Change the well-balanced program rules, or short program required elements rules, or change the time limits, and you'll also likely see a change in program content, without changing the scoring system.

Change the scoring system without changing the rules, and you'll see less change in content.
 

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
In Beverley Smith's article:

The presentation mark has five elements to it and judges mark each aspect on a scale of one to 10.

This is inaccurate and misleading. It is not a presentation mark. It is a program component mark. "Presentation Mark" sounds more artistry-filled, while "Program Component Mark" or "Program Component Score" has 40% technical content in it.

The system also appears to have improved the ice dancing event, an ethereal discipline that attracted the most nefarious deal making in the past.

If a system has improved the ice dancing event, I'd think this system is basically a good system for reflecting the artistic aspect. The problem is not the system. The problem is the skaters in singles and pairs, who are mostly highly challenged. While men's skating has very much met this challenge with various complex, beautiful programs, the ladies and the pairs have not.

My solution is to separate the ladies' and maybe the pairs' scoring from the men's and ice dance's. Give different, less challenging criteria to the ladies and the pairs in order to fit them. Then we'll see more beautiful programs from them in a few years once they've adapted and feel comfortable with the new scoring system.

Furthermore, blaming the decline in general interest solely on CoP is unfair and wrong. I think the people who hold the strings on spreading knowledge of the new scoring system (the ISU, the commentators, the journalists, the coaches, the skaters) bear much more responsibility for the decline in interest than the system itself. I, for one, don't know how to hold my interest in a sport about which I have little knowledge, while constantly hearing the commentators tell me that it is a fluke, predetermined sport.

The reason there is much less of a problem in Asia is, as I understand it, that Asia has very much accepted the CoP wholeheartedly and put its energy towards meeting the challenge of the system instead of constantly complaining about the system.
 

hurrah

Medalist
Joined
Aug 8, 2009
In Beverley Smith's article:
Furthermore, blaming the decline in general interest solely on CoP is unfair and wrong. I think the people who hold the strings on spreading knowledge of the new scoring system (the ISU, the commentators, the journalists, the coaches, the skaters) bear much more responsibility for the decline in interest than the system itself. I, for one, don't know how to hold my interest in a sport about which I have little knowledge, while constantly hearing the commentators tell me that it is a fluke, predetermined sport.

The reason there is much less of a problem in Asia is, as I understand it, that Asia has very much accepted the CoP wholeheartedly and put its energy towards meeting the challenge of the system instead of constantly complaining about the system.

Well, no. Asian media and fans---at least in Japan---very well know how pre-determined the whole thing is. Do you know that Fuji television, the station that showed 4CC, did a news piece of Mao's performance later in the evening and they showed Mao's triple-axel at least NINE times consecutively, and I mean literally nine times from different camera angles and at different speeds, so the viewer had ample opportunity to scrutinize what an 'under-rotation' looks like? Of course no one on the show inc. Shizuka who was commentating said that it wasn't an under-rotation. They only clearly showed her triple-axel many, many times.

(Incidentally, I don't think Mao only lost out this time due to the whims of CoP judging. Her double-axel-triple toe combo was underrotated and wasn't called and I think she flutzed---not completely sure---and she wasn't called. Ashley's triple-triple combo in SP was also underrotated and wasn't called and she also probably flutzed---assuming her technique hasn't suddenly changed as no footage of that jump seems to be in circulation---and she wasn't called.)

It IS predetermined---only to a certain extent tho, which makes the whole situation even more irritating and annoying.

There are many figure skating fans in Japan now, primarily because the whole of Japan fell in love with Mao when she was 15, and she has never done anything to betray that love; she has only behaved in ways that increase and strengthen it. And of course the other skaters are great to watch as well, and no one seems to give snooty interviews or make grandiose claims. They appear quite humble and supportive of each other, so it's heartwarming to engage with these skaters as media figures. So I'd say CoP is seen as the grand evil against which our most humble and sweet skaters persevere. If it weren't for these great figure skating personalities, there would be no interest in figure skating in Japan, particularly as it seems so stupidly predetermined.

Nowadays, it's common for people to have access to high-definition televisions and recorders. Once it's recorded, the footage can be watched frame by frame by anyone, and then there's YouTube and whatnot. You don't have to be a figure skating specialist to scrutinize everything in detail.
 

mmcdermott

On the Ice
Joined
Dec 3, 2011
The reason there is much less of a problem in Asia is, as I understand it, that Asia has very much accepted the CoP wholeheartedly and put its energy towards meeting the challenge of the system instead of constantly complaining about the system.

This reminds me of an interesting presentation I saw at work a few years ago. The presenter was talking about how different cultures value different things. Western cultures place more value on soft skills (in the workplace), whereas other cultures place more value on hard skills (e.g. when hiring for a job, North Americans place more emphasis on communication skills, ability to work well with others, etc.). I wonder if that idea could apply here: North Americans are more interested in a holistic evaluation of a program that values emotional impact, whereas Asian cultures are more interested in hard numbers.

Anyway, I think blaming the scoring system for loss of popularity is kind of missing the point. Asian skaters have been dominating singles and pairs skating for the past 8 or 9 years or so. In any major competition there will probably be multiple medal contenders from Asia in those disciplines. In the same way, North Americans dominated singles skating in the 1990s, when popularity was at its peak here. People want to see athletes from their country winning more than anything else, IMO.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
That probably comes down to the question of defining good skating. Is it primarily good execution of elements? Good use of the blade across the ice between and within elements? Or good art performed by use of skating skills? Obviously experts who define "good skating" differently will come up with different answers about how to measure it or whether the current system is succeeding. Even within those broad approaches, there will be differences of opinion over whether to privilege, e.g., speed vs. complexity as a measure of skating skill. How can an evaluation system either reconcile the different experts' different opinions or else define the subject matter so clearly that everyone is on exactly the same page using exactly the same definition?
How do we know what the public wants from a presidential debate, since there seems to be no consensus? There is a scientific approach: quantifying the opinions. First, give out open-ended surveys to a sample of experts. Second, design scaled questionnaires based on the open-ended feedback. Third, administer the questionnaires with proper sampling methods. Fourth, quantify the results and analyze... That's how one decides the content and the weight for each skill to be measured.
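
As a very rough illustration of the last step only (turning scaled questionnaire answers into relative weights), here is a small Python sketch. The skill names, the 1-to-5 ratings, the three respondents and the function name `skill_weights` are all hypothetical, and simple proportional weighting is just one assumption among many possible analyses.

```python
def skill_weights(responses):
    """Normalize summed importance ratings so the weights add up to 1."""
    totals = {}
    for response in responses:
        for skill, rating in response.items():
            totals[skill] = totals.get(skill, 0) + rating
    grand_total = sum(totals.values())
    return {skill: total / grand_total for skill, total in totals.items()}

# Hypothetical answers from three sampled experts, each rating importance from 1 to 5.
responses = [
    {"jumps": 5, "spins": 3, "skating skills": 4, "interpretation": 4},
    {"jumps": 4, "spins": 2, "skating skills": 5, "interpretation": 5},
    {"jumps": 3, "spins": 3, "skating skills": 5, "interpretation": 5},
]

weights = skill_weights(responses)
for skill, weight in weights.items():
    print(f"{skill}: {weight:.2f}")
```

A real study would of course need proper sampling and far more respondents before any such weights meant anything.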
I'm not sure what you mean by "field moves are disappearing"
Fewer and fewer skaters (especially men) are willing to include a prolonged edge-holding move in their competitive programs, as you said, partly because it takes away too much time and partly because they have the option of not doing it.
Change the well-balanced program rules, or short program required elements rules, or change the time limits, and you'll also likely see a change in program content, without changing the scoring system.
Any tweak (e.g., changing the well-balanced program rules, etc.) will result in a different measurement because its content or relative weight for each trait has changed. After multiple tweaks, its reliability and validity must be re-examined, and I am not sure if it has been done by the ISU. Mathman once told me that ISU conducted validity tests during the developing stage of the NJS. And coincidentally, the early stage of the NJS was the period when I personally found many memorable programs (e.g., Jeffrey Buttle's) that exhibited a good balance of athleticism and artistry. The NJS today is different from the NJS then. It has yet to be validated.

Another aspect of a measurement is its impact evaluation. Just as students learn from tests, skaters adjust their training to meet the demands of the scoring system. Analyzing changes in learning behavior can give some insight into what is actually being measured. "Contorted spin positions/ edge changes/ footwork that is all very similar/ makes watching skating boring especially at the lower levels. I know a coach that said one of her students got a mitten stuck on his skates while doing a sit spin position and voila a new position was born however ridiculous." (http://www.theglobeandmail.com/spor...stem-still-has-flaws/article2335067/comments/)
 

gkelly

Record Breaker
Joined
Jul 26, 2003
Well, no. Asian media and fans---at least in Japan---very well know how pre-determined the whole thing is. Do you know that Fuji television, the station that showed 4CC, did a news piece of Mao's performance later in the evening and they showed Mao's triple-axel at least NINE times consecutively, and I mean literally nine times from different camera angles and at different speeds, so the viewer had ample opportunity to scrutinize what an 'under-rotation' looks like? Of course no one on the show inc. Shizuka who was commentating said that it wasn't an under-rotation. They only clearly showed her triple-axel many, many times.

(Incidentally, I don't think Mao only lost out this time due to the whims of CoP judging. Her double-axel-triple toe combo was underrotated and wasn't called and I think she flutzed---not completely sure---and she wasn't called. Ashley's triple-triple combo in SP was also underrotated and wasn't called and she also probably flutzed---assuming her technique hasn't suddenly changed as no footage of that jump seems to be in circulation---and she wasn't called.)

It IS predetermined---only to a certain extent tho, which makes the whole situation even more irritating and annoying.

If in fact underrotations that were clearly visible to you and other viewers weren't called, that suggests limitations either in the video equipment available to the technical panel or in the competence of the members of the technical panel. It doesn't suggest predetermination.

You really think that the technical panel gets together before the event and says "We must make sure to call more underrotations for skater M than for skater A, no matter how many jumps either of them actually underrotates, because skater A is supposed to win this event and our job is to make that happen"?

How do we know what the public wants from a presidential debate, since there seems to be no consensus? There is a scientific approach: quantifying the opinions. First, give out open-ended surveys to a sample of experts. Second, design scaled questionnaires based on the open-ended feedback. Third, administer the questionnaires with proper sampling methods. Fourth, quantify the results and analyze... That's how one decides the content and the weight for each skill to be measured.

Well, I don't think analysis of presidential debates is any more objective than analysis of figure skating programs. And I also believe that the public can be taught to adopt different opinions about what matters in a presidential debate and who "won" any given debate depending on which media analysts they pay attention to or what the media tell them beforehand will be important. Different questions (about what constitutes a winning debate) will generate different answers.

mmcdermott suggests that there may be cultural differences across figure skating countries/regions regarding the kinds of questions they encourage judges to ask about what makes good skating. I think there are also cultural differences between skating experts as a group vs. fans. And on a finer scale between coaches vs. judges vs. TV commentators.

Fewer and fewer skaters (especially men) are willing to include a prolonged edge-holding move in their competitive programs, as you said, partly because it takes away too much time and partly because they have the option of not doing it.

Let me make the point even more blatantly.
How many spirals did you see in ladies short programs prior to 1988 (period A)?
How many did you see 1989 through 2003 (period B)?
2004 through 2010 (period C)?
2011 through present (period D)?

There was a much bigger change between 1988 and 1989, and between 2010 and 2011, because spiral sequence was a required element between 1989 and 2010 and was not required in short programs before or after that period.

There were definite changes in the kinds of spiral sequences that we saw, including kinds of positions, variety of edges, and how long each position was held with the switch in judging systems. There were also changes during the IJS period based on rule changes about what features earned a higher level that year or not. But if you choose a random sampling of ladies' short programs from periods A, B, C, and D, you'll see more similarity in spiral content between B and C than you will between either of those two periods and A or D (and vice versa). That's because of the required element rules, not the judging system.
 

Jaana

Record Breaker
Joined
Jul 27, 2003
Country
Finland
Weir was burned very, very badly by the Inman and Nichol dictatorship over what should get high scores in PCS. Weir went out there and probably had the best overall performance of all the men. No quads, no Inman-Nichol choreography. You could see how quads could almost triumph over the Nichol-Inman dictatorship. But even on the technical side he did his second 3A in the wrong place. The system was dictating and demanding where to place his second triple axel in order to maximize points, no matter where the jumps made sense to the skater and their music or feeling.

Although Weir jumped well at the Olympics, his choreography was really empty. It was originally created by David Wilson (in my opinion the best choreographer), but watered down. I have always thought that happened because Weir is not powerful enough to skate to good CoP choreography. He was at his best with 6.0 choreography, where he could glide slowly on the ice and do beautiful hand gestures and poses.
 

seniorita

Record Breaker
Joined
Jun 3, 2008
^ Huh? His 2006 Olympic programs were among the few memorable ones, and his Child of Nazareth and Love is War of 2007 and 2008 were also fantastic programs. I believe the Notre Dame program didn't get the exposure it deserved. All under CoP but not all the same.
In 6.0 programs you didn't glide slowly anyway.
 