Globe and Mail: Figure skating judging system still has flaws | Page 6 | Golden Skate

Joined
Jun 21, 2003
skatinginbc, do you believe that it's inherent in any "add the points system" to negate/diminish quality as related to difficulty?

I don't think that is an impossible question to tackle. If we decide that quality of execution should be better rewarded

[ -- I can't seem to stop using 6.0 language. In the CoP I should have said "more rewarded" -- :) ]

then we could take greater reductions for falls and other major errors. We could easily juggle the books so that a badly performed quad is worth less than a well-performed triple loop, for instance.

The main criticism offered by the author of the Wikipedia article seems to be that the gymnast with the highest base value in terms of difficulty will automatically win regardless of the quality of his performance.

By the way, as an aside, I never thought of gymnastics as having an "artistic" or "performance" aspect. Whoever did the hardest tricks won. The "execution" part meant, "I can hold an iron cross more steady than the other guy can."
 

Bluebonnet

Record Breaker
Joined
Aug 18, 2010
This is what they say about artistic gymnastics Code of Points in Wikipedia:
"In 2006, the Code of Points and the entire gymnastics scoring system were completely overhauled. The change stemmed from the judging controversy at 2004 Olympics in Athens, which brought the reliability and objectivity of the scoring system into question, and arguments that execution had been sacrificed for difficulty in artistic gymnastics."
"Since its inception in major events in 2006, the Code has faced strong opposition from many prominent coaches, athletes and judges. Proponents of the new system believe it is a necessary step for the advancement of gymnastics, promoting difficult skills and more objective judging. Opponents feel that people outside the gymnastics community will not understand the scoring and will lose interest in gymnastics, and that without the emphasis on artistry, the essence of the sport will change. Many opponents of the new scoring system feel that this new scoring system, in essence, chooses the winners before the competition ever begins. Competitors no longer compete on the same level. Each contestant begins with a unique start value; therefore, contestants assigned a lower start value or difficulty rating are knocked out of the winner's circle before the competition begins. They may compete, but they cannot win. A competitor with a higher difficulty rating will begin the competition with a much higher score... There has also been concern that the new Code strongly favors extreme difficulty over form, execution and consistency. At the 2006 World Championships, for instance, Vanessa Ferrari of Italy was able to controversially win the women's all-around title in spite of a fall on the balance beam, in part by picking up extra points from performing high-difficulty skills on the floor exercise. The 2006 Report of the FIG's Athletes' Commission, drafted after a review and discussion of the year's events, noted several areas of concern, including numerous inconsistencies in judging and evaluation of skills and routines. However, the leadership of the FIG remains committed to the new Code...."


Sounds familiar? That clearly tells us that the Code of Points is intrinsically flawed in design, no matter if it is applied to Figure Skating or Artistic Gymnastics.
(1) It does not necessarily improve reliability
(2) It does not necessarily improve objectivity
(3) It kills artistry and changes the essence of the sport
(4) People complain about winning with a fall. "They may compete, but they cannot win". If you don't believe it, just read through the thread "Can Takahashi Close The Gap On Patrick Chan", and you shall figure out the consensus.
(5) It is too complicated for the casual fans to understand.
(6) The leadership (the ISU and FIG) remains committed to the new Code despite all criticisms. There must be something good behind the closed doors.

skatinginbc, do you believe that it's inherent in any "add the points system" to negate/diminish quality as related to difficulty? I obviously am pro-COP, but if it's inherently flawed and impossible to improve, I'd be happy to see 6.0 (or whatever) brought back. Granted, I'd stop watching the sport, but that's not a big deal to me.

Wikipedia can broaden one's view, but it is absolutely not a reliable source for anything. Anybody can write on it and change it. School teachers normally don't even accept Wikipedia as a reference source for students' research projects. ;)

This quote from Wikipedia tells me that it was written by somebody/somebodies who is/are against the Code of Points. That's all. :p And part of the conclusion skatinginbc drew from Wikipedia is totally wrong.
 

ivy

On the Ice
Joined
Feb 6, 2005
The look at gymnastics is informative to me. It's a sport I've followed casually for a long time. I watch it during the Olympics or if I happen upon it during a weekend afternoon (which seems to be less and less). It does seem that programs once had an element of beauty, but now they're just hard. Each competition I find a little more boring than the last I watched.

Counting the rotation of a jump is acceptable because it is pretty much one dimension and involves minimum creativity. Counting twists and turns in a footwork section, in my opinion, is a different story.

That's a really good point and it identifies what bugs me about CoP. I'm not sure what the best way is to judge these elements that really can be both beautiful and athletic. Flatten out levels a bit and make GoE more about aspects that we associate with PCS or the old Artistry mark?

I understand the desire to straighten out questions of the technical side so there is more consistency and confidence across skaters and competitions, but I'm not attracted to the idea of adding more human resources into measuring technical aspects. More cameras, more computers - great. Most of the evaluators, I think, should be concentrating on the quality of the elements and the more subjective qualities of the program.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
CoP must have killed artistry in gymnastics, otherwise this article would not have existed: "China and Canada want to put the “artistry” back into gymnastics" (http://www.gymcan.org/site/news.php?id=330).

Just to be clear, my doubt about CoP's reliability is not so much in GOEs (similar to the E-Score in gymnastics) as in the Technical Panel decisions (similar to the D-Score in gymnastics). The E-Score has been proven quite reliable (www.fsp.uni-lj.si/mma_bin.php?id=2010020920531429). Conveniently, the D-score (Difficulty Score), the mother of all evils, is designed such that "due to the D score being a combination of two judges’ evaluations reliability and validity cannot be calculated." That's a polite way to say that it is not a reliable design to begin with.
Judging difficulty levels based on holistic impression has been proven reliable in dance performance (http://www.citraining.com/Performance-Competence-Evaluation-Measure.html).
 
Joined
Jun 21, 2003
^
“If gymnastics displayed its full artistry we wouldn’t need ballet.”

As this article points out, the problem is more urgent in gymnastics than in figure skating because in addition to allowing the audience to wander away, the push toward harder and harder tricks also poses risks to the athletes.
 
Joined
Jun 21, 2003
Just to be clear, my doubt about CoP's reliability is not so much in GOEs (similar to the E-Score in gymnastics) as in the Technical Panel decisions (similar to the D-Score in gymnastics). The E-Score has been proven quite reliable (www.fsp.uni-lj.si/mma_bin.php?id=2010020920531429). Conveniently, the D-score (Difficulty Score), the mother of all evils, is designed such that "due to the D score being a combination of two judges’ evaluations reliability and validity cannot be calculated." That's a polite way to say that it is not a reliable design to begin with.

Judging difficulty levels based on holistic impression has been proven reliable in dance performance (http://www.citraining.com/Performance-Competence-Evaluation-Measure.html).

The ISU will never give up the technical panels and assign more authority to the judges. The reason is that the technical specialist, assistant technical specialist and the technical controller are appointed directly by the ISU, whereas the judges are nominated by their national federations.

The ISU's position on the Salt Lake City judging scandal was that the problem lay not with the saintly ISU but with the scoundrels that run the national federations. The federation heads conspire with each other, display national bias, and pressure their judges to vote the party line.

Take authority away from the judges and give it to officials working directly for Cinquanta -- problem solved.

This is also the reason for anonymous judging -- if those scalawags, the national federation heads, try to pressure their own hand-picked judges, said hand-picked judges can sneak around behind their bosses' backs, vote the wrong way, then lie to their bosses about it. Again, problem solved.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
The ISU will never give up the technical panels and assign more authority to the judges. The reason is that the technical specialist, assistant technical specialist and the technical controller are appointed directly by the ISU, whereas the judges are nominated by their national federations.
That's beyond my field of study. We have to ask somebody in Political Science to find a solution. :biggrin:
 

gkelly

Record Breaker
Joined
Jul 26, 2003
The term "Code of Points" is confusing because the 10-scale old system in gymnastics, based on my quick internet search, is also called "Code of Points".

The figure skating judging system is not officially called "Code of Points." Fans of both gymnastics and figure skating started calling it that, and many other fans and some skating insiders adopted the usage. So unofficially, that may be what most people call it, if you count fans as well as participants.

Originally it was usually referred to as the new judging system, but that name was by nature destined to become obsolete pretty quickly.

I don't know that it really has an official name. In the US and elsewhere, officially it is called the International Judging System (IJS). The ISU calls it the ISU Judging System, so IJS would also work.

I have no problem with "adding the points system". I have problems with "objectively" counting features for an integrated skill. For instance, to evaluate speaking skill, one can "objectively" count the number of grammatical or semantic errors in one's speech. But this is the problem: Is it valid to arbitrarily place one who makes 7 errors in a different level from one who makes 8 errors? It ignores the possibility that Speaker B, despite having more errors, might have compensated for that with better communication skills. Counting the rotation of a jump is acceptable because it is pretty much one dimension and involves minimum creativity. Counting twists and turns in a footwork section, in my opinion, is a different story.

That's a really good point and it identifies what bugs me about CoP. I'm not sure what the best way is to judge these elements that really can be both beautiful and athletic. Flatten out levels a bit and make GoE more about aspects that we associate with PCS or the old Artistry mark?

So what is the best way to evaluate the difficulty of a step sequence?

Keep in mind that, like all other elements, step sequences will be evaluated on both difficulty and quality.
Under 6.0, judges could use their own judgment and knowledge of skating technique to evaluate both the difficulty and quality, as well as how well the choreography and execution went with the music and enhanced the program theme, and come up with a holistic assessment of the step sequence as a whole.
That left a lot of room for differences of opinion, since judges might differ in whether they gave more weight to technical content, edge quality, quickness, musical interpretation, etc. (and still do, in assigning GOEs). Also, judges who had been high-level skaters themselves had a better feel for the difficulty of the steps and turns and the way they were combined compared to judges who had only reached lower levels of accomplishment on the ice themselves, and much more than judges who had never figure skated themselves at all.

Surely there were some guidelines given in judge training about how to evaluate the difficulty of a step sequence, but nothing that I saw officially published. The only rule of thumb I can think of offhand was that turning in both directions was worth more than turning only in the skater's preferred direction.

When school figures were part of the competition structure, it was impossible to place well or even to qualify to skate at the higher levels without demonstrating the ability to do all the different turns on all the different edges with good edge control. So those aspects of basic skating technique were a big part of competition results.

Later, in the 1990s and early 2000s, skaters started to reach senior level without ever having done any school figures; some of them didn't even know what all the different turns were, much less have the ability to execute them all. So the technical content of step sequences in that period tended to be lower than in the 1980s.

Those skills are much more fundamental to what makes figure skating figure skating (as opposed to speed skating or acrobatics on ice or ballet on ice, although good control of edges is what allows figure skaters to skate with speed and to execute acrobatic and balletic or other dance-like moves with security). In free skating, they are demonstrated in step sequences and also in the in-between skating.

Under the IJS, difficulty and quality of edge-based skills are rewarded under the Skating Skills and Transitions components. The step sequences (and to a lesser extent spiral sequences) are really the only places that pure skating skills contribute to the Total Element Score as well.

Personally, I don't love the specific ways the features for step sequences are defined. And I'm not 100% wedded to the idea that step sequences must have features that the skater either gets credit for or not to step up the level. Maybe the "choreo" sequence approach is a good compromise between there being no technical points for step sequences and very detailed requirements for levels in which skaters attempt to do the same kinds of steps in similar combinations to get the highest base mark, regardless of execution.

But I do think it's important for there to be a way to acknowledge that a step sequence with deep edges and several difficult types of turns in different directions is more difficult, and more difficult to execute well, than a sequence with no turns harder than mohawks and three turns in the good direction, or turns with unidentifiable shallow or flat edges not held on the exit edges. So if there are to be no levels for a step sequence, and individual judges are each going to reach their own assessments of the merit of each step sequence, then the judges need to be allowed and encouraged to reflect the difficulty in their GOEs. And there need to be some guidelines on what kinds of difficulty to reward.

Otherwise we'd have skaters doing easy sequences that they can execute well getting higher marks than skaters who challenge themselves with more difficulty at the expense of a little bit of speed or quickness. If skaters who never do anything harder than threes and inside mohawks score better than skaters who include difficult turns, then the difficult turns will disappear from the skating repertoire again.

So what's the best way to encourage skaters to include the best balance of difficulty and quality and artistry for their own skills and to reward them appropriately?
 

ivy

On the Ice
Joined
Feb 6, 2005
I'm the first to say I don't know what the answers are. I'm just an average armchair fan with a pair of figure skates in my closet. And for some crazy reason I enjoy thinking about the whys and wherefores of how skating is judged. Too much time on my hands, I guess.

To me one of the best aspects of CoP is the emphasis on skating skills. Step sequences with deep, clear edges and a variety of turns and positions should be rewarded. But I would prefer a short Level 3 footwork sequence that made choreographic sense and was really performed over a long, labored Level 4 sequence, and I don't have a problem with a Level 3 +3 GoE (maybe a few fewer turns and positions, but all executed with deep and clear edges, with arm and body positions controlled and fully executed) getting more points than a Level 4 +1 GoE. Truthfully, though, I'm not a sophisticated enough viewer to tell the difference between a Level 3 and a Level 4 sequence, though I think I could tell the difference between a sequence that gets a +1 GoE and one that gets a +3. From my observation, skaters/coaches/choreographers are more likely to go for the highest possible level they can achieve, regardless of what GoE they might receive. I would guess that seems more predictable and controllable, whereas GoE seems more like a roll of the dice. Though from skatinginbc's posts it seems as though the opposite may be true.

Spins are a little easier for me to judge as a TV viewer. Personally I would give a typical layback spin from Angela Nikodinov (maybe only a level 2 because of lack of variety of positions?) far, far more points than the typical level 4 spin, muscled up to a Biellmann position while getting slower and slower, that I see in so many programs today. To me, one of Angela's laybacks - centered, with beautiful turnout and sensitive use of arm positions that reflect the mood and character of the music - is worth about the same as a triple loop, while some of these labored level 4 spins are worth more like a single axel.
 

skatinginbc

Medalist
Joined
Aug 26, 2010
So what is the best way to evaluate the difficulty of a step sequence?
Big question = Long answer. Let me clarify some definitions first:

(1) cumulative scoring: A method of scoring whereby points accumulated on individual elements (items or subtests) are tallied.
(2) categorical scoring: A method of scoring in which a performance is placed into a category.
The NJS is actually closer to categorical scoring. It assigns a difficulty level (a category) to each executed element and rates PCS on a 10-point scale (also categories). Not all evaluated items are independent of each other (e.g., falling on the last jump may result in missing part of the final spin due to time loss), and therefore they are not truly "subtests" or independent items. Nor are the difficulty differences between levels equal, the way the difference between temperatures of 4 degrees and 3 degrees is the same as that between 3 degrees and 2 degrees. The NJS gives the appearance of an interval scale while in fact it is composed mostly of ordinal data.

(A) analytic rubric: a guideline for identifying and assessing components (or features) of a performance.
(B) holistic rubric: a guideline for assessing a performance across multiple criteria as a whole.
Holistic rubrics are used when it is difficult to evaluate performance on one criterion independently from other criteria. Footwork sections for instance, where edges and turns and upper body movements all occur at the same time and intertwine with each other, are good candidates for holistic scoring.

So what is the best way to evaluate the difficulty of a step sequence? My answer:
(1) Treat it as a multi-faceted, inter-correlated skill and rate it with a categorical, holistic approach. So first, we should identify those facets through gathering expert opinions and multivariate analysis. Say we come up with the consensus: A footwork sequence involves (A) various edges, (B) types of turns in different directions, (C) speed/quickness, and (D) upper body movements. Then we design a holistic rubric based on those facets (Here are some examples of rubrics: http://www.csub.edu/TLC/options/resources/handouts/Rubric_Packet_Jan06.pdf, http://jfmueller.faculty.noctrl.edu/toolbox/rubrics.htm, http://books.google.ca/books?id=vMY...ce=gbs_ge_summary_r&cad=0#v=onepage&q&f=false). And of course, a library of sample performance footage representative of each level is established for rater training (An example of rater training: http://usny.nysed.gov/rttt/teachers-leaders/practicerubrics/Docs/Pearson_Implementation.pdf). Raters should be calibrated periodically and before the competition (Example: http://www.ride.ri.gov/highschoolreform/dslat/pdf/por_100201.pdf).
(2) Ideally, I would like to have a panel of 7 to decide the level of an element, and the average of the middle five would be the level score. But I think it may not be feasible for the ISU, so this is my compromise: Two specialists rate an executed element independently with a holistic approach. If there is an inconsistency between the two, the controller reviews the footage, rates it based on an analytic rubric (rating each criterion separately) and decides the final level (Say, if two out of the first three criteria fall into level 4 but the other two into level 3, it is deemed Level 4).
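The two options in (2) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up, levels are treated as plain integers, and the controller's analytic review is passed in as a callable because the exact tie-break rule is not pinned down here.

```python
from statistics import mean


def panel_level(ratings):
    """Level score from a 7-person panel: average of the middle five.

    The single highest and single lowest ratings are dropped.
    """
    if len(ratings) != 7:
        raise ValueError("expected exactly 7 ratings")
    middle_five = sorted(ratings)[1:-1]
    return mean(middle_five)


def two_specialist_level(first, second, controller_review):
    """Compromise scheme: two independent holistic ratings.

    If the specialists agree, their level stands. If not, the
    controller's analytic review (a hypothetical callable that
    returns a level) decides.
    """
    if first == second:
        return first
    return controller_review()


# Hypothetical ratings for one step sequence.
print(panel_level([3, 4, 4, 3, 4, 2, 4]))
print(two_specialist_level(4, 3, controller_review=lambda: 4))
```

Dropping the top and bottom rating keeps one outlying judge from moving the level much, which is the point of averaging only the middle five.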

Under 6.0, ... since judges might differ in whether they gave more weight to technical content, edge quality, quickness, musical interpretation, etc. (and still do, in assigning GOEs). Also, judges who had been high-level skaters themselves had a better feel for the difficulty of the steps and turns and the way they were combined compared to judges who had only reached lower levels of accomplishment on the ice themselves, and much more than judges who had never figure skated themselves at all.
By separating Difficulty from Execution, the raters can concentrate on fewer things, and thus the inconsistency in scoring criteria among them can be somewhat reduced. It can be further reduced through training. Are the judges/specialists tested annually and expected to obtain a passing score of, say, at least 80% inter-rater and intra-rater consistency? Calibration before a competition is also very important.
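One simple way to put a number on that kind of consistency is plain percent agreement: the share of sample performances on which two sets of ratings match. The Python sketch below assumes that measure and an 80% pass mark. The ratings are invented, and a real test might well use a chance-corrected statistic instead.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of items on which two sets of ratings match exactly."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


PASS_MARK = 80.0  # assumed threshold, taken from the "say, 80%" above

# Hypothetical levels assigned to ten sample step sequences.
candidate = [4, 3, 3, 2, 4, 3, 4, 2, 3, 4]
reference = [4, 3, 2, 2, 4, 3, 4, 3, 3, 4]

score = percent_agreement(candidate, reference)
print(score, "pass" if score >= PASS_MARK else "fail")
```

Comparing a candidate against a reference panel gives inter-rater consistency. Feeding in the same rater's levels from two separate viewings of the same footage gives intra-rater consistency.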

So what's the best way to encourage skaters to include the best balance of difficulty and quality and artistry for their own skills and to reward them appropriately?
Currently the base value for a 4T is 10.30. If one receives the maximum GOEs, or straight +3s, the value for that element would be 13.30. It means that the E-Score contributes only 3.00/13.30 = 23% of the total score for that jump, not 50%.
Dai under-rotated and fell on his 4T at the 4CC and received the worst possible GOEs he could get, and yet he still pocketed 4.20 with that. It means that the E-Score took away only 3/7.2 = 42% of that failed jump's value, not even 50%.
My recommended ratio ==> Difficulty : Execution : Artistry = 1:1:1, or something like D-Score = 35%, E-Score = 35%, and A-Score = 30%.
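The arithmetic behind those percentages can be checked with a short Python sketch. With the figures quoted, 3.00 / 13.30 works out to roughly 23% and 3.00 / 7.20 to roughly 42%. The weighted total at the end is only one possible reading of the 35/35/30 proposal: it assumes the three scores have already been put on a common 0-to-1 scale, and the sample values are made up.

```python
def goe_share_of_total(base_value, goe):
    """Share of an element's final value that comes from a positive GOE."""
    return goe / (base_value + goe)


def goe_cut_of_base(base_value, goe_deduction):
    """Fraction of the base value removed by a negative GOE."""
    return goe_deduction / base_value


def weighted_total(difficulty, execution, artistry,
                   weights=(0.35, 0.35, 0.30)):
    """Combine three scores assumed to be normalised to a 0-1 scale."""
    w_d, w_e, w_a = weights
    return w_d * difficulty + w_e * execution + w_a * artistry


print(f"{goe_share_of_total(10.30, 3.00):.1%}")  # clean 4T with straight +3s
print(f"{goe_cut_of_base(7.20, 3.00):.1%}")      # under-rotated 4T with a fall
print(f"{weighted_total(0.9, 0.6, 0.8):.3f}")    # hypothetical normalised scores
```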
 