Ice scope data shows zero correlation between height & GOE

cohen-esque

Final Flight
Joined
Jan 27, 2014
Do you have a link to this data? I would love to see it. If indeed there is a stronger correlation for 2As, that may lend some credence to the theory that in judges' minds, all 3As and quads check the height-and-distance box merely for being 3As and quads. I would disagree with that (clearly, Yuzuru's 3A is outstanding, and I would also give a mention to Boyang's 4Lz; I think both should be rewarded for that. In my opinion, even Samarin, whose skating is often validly criticized, deserves extra credit for the size of his quads), but it would be an interesting addition to our knowledge of how judges behave. I agree that in general more data would be helpful.

Well, the graph for the 2A distance/GOE is still around; the data is... somewhere, hopefully, but I can't find it at the moment. The data for all the clean jumps vs PCS is here.

The sample sizes are pretty variable; I've broken the data down by jump type since my first post. All the types of solo jumps I could do anything with showed a correlation between PCS and GOE, but the overall relationship is clearly being dragged down by the 3Lo, which for whatever reason showed a much weaker correlation than the other jumps.
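For anyone curious how a per-jump-type breakdown like this is computed, here's a minimal sketch. The records below are invented placeholders, not the actual spreadsheet data; the real dataset is in the link above.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented records: (jump type, skater's PCS, GOE received on that jump).
records = [
    ("3A", 90.1, 2.8), ("3A", 85.3, 2.1), ("3A", 78.0, 1.2), ("3A", 70.5, 0.9),
    ("3Lo", 88.0, 1.4), ("3Lo", 72.0, 1.5), ("3Lo", 81.0, 1.2), ("3Lo", 69.0, 1.6),
]

# Group by jump type, then correlate PCS against GOE within each group.
by_type = defaultdict(list)
for jump, pcs, goe in records:
    by_type[jump].append((pcs, goe))

for jump, pairs in sorted(by_type.items()):
    pcs_vals, goe_vals = zip(*pairs)
    print(f"{jump}: r = {pearson_r(pcs_vals, goe_vals):+.2f}")
```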
 

Casual

On the Ice
Joined
Jan 26, 2018
You know what is severely lacking in ISU? Post-event analysis of judging.

I remember an eye-opening video with a side-by-side comparison of Yuna vs. Adelina Sochi skate. Shocking, to say the least. Now, why doesn't ISU perform similar audits of the judging? (Unless they are shamelessly corrupt. :laugh:)

It would be illuminating to compare the actual results vs. what they should have been - and then train the judges accordingly.
 

Edwin

СделаноВХрустальном!
Record Breaker
Joined
Jan 5, 2019
You know what is severely lacking in ISU? Post-event analysis of judging.

I remember an eye-opening video with a side-by-side comparison of Yuna vs. Adelina Sochi skate. Shocking, to say the least. Now, why doesn't ISU perform similar audits of the judging? (Unless they are shamelessly corrupt. :laugh:)

It would be illuminating to compare the actual results vs. what they should have been - and then train the judges accordingly.

Are you certain there are no evaluations at all? No new judging videos, study courses, or examination materials produced from all the extra footage the ISU has access to?
 

Elucidus

Match Penalty
Joined
Nov 19, 2017
IceScope sham exposure:
https://www.youtube.com/watch?v=C5LhKLXddts

There is literally one random guy who just marks manually, by eye - on a strongly tilted TV image filmed from a ceiling camera - two points (presumably the beginning and end of the jump) and draws a straight line between them. The application calculates the rest after that. Apparently this app was used for baseball before. So... yeah - so much for "hi-tech" and "computer calculation" - it's just a human entering the data manually, with a huge margin for error (they don't even determine the exact area of the boot or blade from which the height calculation should begin - with a top-down angle it's just impossible) :drama::laugh:
 

Nilf

Rinkside
Joined
May 15, 2018
IceScope sham exposure:
https://www.youtube.com/watch?v=C5LhKLXddts

There is literally one random guy who just marks manually, by eye - on a strongly tilted TV image filmed from a ceiling camera - two points (presumably the beginning and end of the jump) and draws a straight line between them. The application calculates the rest after that. Apparently this app was used for baseball before. So... yeah - so much for "hi-tech" and "computer calculation" - it's just a human entering the data manually, with a huge margin for error (they don't even determine the exact area of the boot or blade from which the height calculation should begin - with a top-down angle it's just impossible) :drama::laugh:
tbh you don't know exactly how it works. Maybe he just marks an area of interest to find and start tracking the path of the blade. There are a lot of cameras and I think they use computer vision. With high-quality, high-frame-rate footage from calibrated cameras, they can compute the homography to the 2D ice surface and estimate height, length, and velocity. It's not rocket science.
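The homography idea can be sketched in a few lines: given four reference points whose rink positions are known (a standard rink is 60 m x 30 m), you can solve for the 3x3 matrix that maps camera pixels to rink coordinates, then measure jump length as the distance between the mapped takeoff and landing points. This is a toy numpy version with invented pixel coordinates, not how Ice Scope actually works - the real system presumably fuses multiple calibrated cameras:

```python
import numpy as np

def solve_homography(img_pts, world_pts):
    """Solve for H (with h33 fixed to 1) mapping image pixels to world
    coordinates, from exactly four point correspondences (DLT)."""
    A, b = [], []
    for (x, y), (u, v) in zip(img_pts, world_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def to_rink(H, px):
    """Map one pixel coordinate to rink coordinates (metres)."""
    X = H @ np.array([px[0], px[1], 1.0])
    return X[:2] / X[2]

# Four reference pixels (invented) matched to the corners of a 60 x 30 m rink.
img_corners = [(100, 80), (1180, 95), (1100, 640), (150, 620)]
rink_corners = [(0, 0), (60, 0), (60, 30), (0, 30)]
H = solve_homography(img_corners, rink_corners)

# Jump length = rink-plane distance between takeoff and landing pixels.
takeoff, landing = to_rink(H, (600, 300)), to_rink(H, (655, 310))
jump_length = np.linalg.norm(landing - takeoff)
```

Height is the harder part, as Elucidus notes: a single overhead view only constrains the blade's position on the ice plane, so estimating vertical displacement needs a second view or strong assumptions.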
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
tbh you don't know exactly how it works. Maybe he just marks an area of interest to find and start tracking the path of the blade. There are a lot of cameras and I think they use computer vision. With high-quality, high-frame-rate footage from calibrated cameras, they can compute the homography to the 2D ice surface and estimate height, length, and velocity. It's not rocket science.

There still is inconsistency though - note how sometimes the curve of the jump matches where the skater's feet are, while in other cases there is a gap between the feet and the curve, and other times it overlaps the skater's feet. Ice Scope is a fun, frivolous viewer tool to enjoy, but its accuracy has yet to be proven. Also, people using Ice Scope results as the comparative be-all and end-all of how well a skater jumps is misleading, because skaters execute jumps bigger/better or smaller/worse from competition to competition. For example, Hanyu's 4L had an exit speed of only 9 m/s, but I'm guessing he's landed it with more speed/flow before. There needs to be a greater sample size, and the system needs to be used in various countries to see if there's a trend between the nationality of the Ice Scope operator and the nationality/popularity of the skater being assessed. It's a good step, but I'm still very skeptical about its accuracy.
 

NaVi

Medalist
Joined
Oct 30, 2014
While those other metrics are nice (height, distance, speed), time in the air should be an easier metric to get and is probably a better one to use.
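Air time and peak height actually carry the same information: ignoring air resistance, a jump that spends total time t in the air peaks at h = g(t/2)^2 / 2, i.e. t = 2*sqrt(2h/g). A quick check against the range of heights Ice Scope reports for men's triple axels:

```python
from math import sqrt

G = 9.81  # gravitational acceleration, m/s^2

def air_time(height_m):
    """Total flight time for a jump peaking at height_m (no air resistance)."""
    return 2 * sqrt(2 * height_m / G)

def peak_height(air_time_s):
    """Inverse relation: peak height reached for a given total air time."""
    return G * (air_time_s / 2) ** 2 / 2

# Heights in the range reported for men's 3As in this thread:
for h in (0.51, 0.59, 0.70):
    print(f"{h:.2f} m peak -> {air_time(h):.2f} s in the air")
```

So the measurement choice is really about which quantity the cameras can read more reliably, not about which metric is richer.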
 

macy

Record Breaker
Joined
Nov 12, 2011
kind of proves IJS didn't really get away from or "fix" the issues that 6.0 had...

:slink:
 

GF2445

Record Breaker
Joined
Feb 7, 2012
Ice scope is more for the audience and commentators. Something to continue the conversation.
Height and distance is only one of six bullet points for GOE (note that it is one of the mandatory bullets for receiving +4/+5 GOE).
Remember these are guidelines that inform the judges' decision.

Maybe it will become more correlated once AI is advanced enough that judges aren't needed to mark the technical elements score.
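A simplified model of the bullet arithmetic GF2445 describes - this is a sketch of the published guideline, not actual ISU software, and the cap logic is deliberately stripped down:

```python
def positive_goe(bullets_met):
    """bullets_met: six booleans in ISU order, where the first three are
    the 'mandatory' bullets (for jumps: very good height/length, good
    takeoff and landing, effortless throughout).  Simplified model:
    the positive GOE is roughly the number of bullets achieved, capped
    at +5, and +4/+5 require all three mandatory bullets."""
    goe = min(sum(bullets_met), 5)
    if goe >= 4 and not all(bullets_met[:3]):
        goe = 3  # missing a mandatory bullet caps the mark at +3
    return goe

# A jump hitting five bullets but lacking very good height/length caps at +3:
print(positive_goe([False, True, True, True, True, True]))
```

Which is exactly why height/distance matters disproportionately at the top of the scale: it gates +4/+5 regardless of how many other bullets are met.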
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
kind of proves IJS didn't really get away from or "fix" the issues that 6.0 had...

:slink:

Well, just as in 6.0 a skater with great amplitude on their jumps might not have been credited for it, in this system it's still subjective whether a skater gets the GOE bullet for amplitude. Unfortunately you can only lead the judges to water - you can't make 'em drink! :p
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
Ice scope is more for the audience and commentators. Something to continue the conversation.
Height and distance is only one of six bullet points for GOE (note that it is one of the mandatory bullets for receiving +4/+5 GOE).
Remember these are guidelines that inform the judges' decision.

Maybe it will become more correlated once AI is advanced enough that judges aren't needed to mark the technical elements score.

Maybe if judges knew they could get replaced by AI they would endeavour to be more accurate/fair. Of course, they're only human.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
It's not so much that the rules require jump size to be the primary determinant of jump scoring and judges are either unable or unwilling to follow those rules.

Rather, the rules consider jump size to be one of many positive qualities to be rewarded. Perhaps there is an inverse relationship between jump size and some of the other positive qualities. Or a direct correlation between jump size and some of the negative qualities. For example, I'd expect bigger jumps, on average, to be more telegraphed.
 

Mathman

Record Breaker
Joined
Jun 21, 2003
What is intriguing to me in these discussions is the idea of judging by some sort of "artificial intelligence," versus just "using technology to obtain accurate measurements." The latter could easily be achieved if anyone wanted to put up the money.

What would be cool, though, would be a self-teaching program that could evaluate the GOE bullet point "jump matches the music." Then we could move on to PCS (Performance: "physical, emotional, and intellectual involvement").

A first step might be to tackle Transitions. It should be easy enough to teach a robot to recognize a Mohawk or a counter and evaluate the bullet points "variety and difficulty."

This would not be impossible (though again, someone would have to care enough to actually do it.) The problem would be, how to evaluate the success of the program's self-modifying output loops so that it could bolster its successes and devalue its failed attempts. In most projects of this type the goal would be to develop a robot judge whose scoring would be indistinguishable from the scoring of human judges. (What type of flexing of muscle groups do humans regard as esthetically pleasing, etc.) This alone would be interesting and I think would shed light on the "art versus sport" conundrum.

But then again, we already have judges who judge like humans do. ;)
 
Last edited:

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
Height, distance, and landing speed aren't everything. And looking at the data further, I saw a significant example of it:

Look at Slavik Hayrapetyan's 3A: https://www.youtube.com/watch?v=pXDXocE6QWw#t=4m15s

Obviously the lean/forward landing dropped the GOE to 0.11... although if he were a top-tier/more popular skater I'm sure it would have been higher - and you did mention that PCS/planned BV correlates with higher GOE. Generally, though, skaters with higher PCS also happen to have cleaner elements.

Slavik still had a 3A that was 3rd highest, 4th farthest and 2nd fastest on the landing speed. This example right here shows that a jump can have some of the highest "quantifiable" stats, and still not earn as high GOE as other attempts with lower quantifiable stats, due to other factors of the jump being considered (like the form breaks Slavik had).

i.e. it's not like the judges weren't considering the height/distance/speed - but they were also considering other jump aspects/issues when calculating their final GOE. In my opinion, the judges are more desensitized to lower-tier skaters with excellent height/distance, and focus more on the fact that he had form breaks and penalize him for it. Whereas a top-tier skater can have not as much height/distance and the same minor error but get higher GOE because the judge is predisposed to seeing that skater scored higher.
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
In essence, if you want to predict how much GOE someone will get for a successful jump, the jump height, jump distance, and landing speed will tell you virtually nothing, whereas the skater's PCS and planned jump BV are much more reliable indicators.

(Oh, and here are the average stats for all men.)

  • Average height: 0.59 m (standard deviation: 0.041 m)
  • Average distance: 2.87 m (standard deviation: 0.39 m)
  • Average landing speed: 14.71 m/s (standard deviation: 3.03 m/s)
  • Average GOE: 1.75 (standard deviation: 0.92)

You can find all the data for skaters who successfully completed (positive execution) their 3A in the spreadsheet. Notable figures (highest in each category bolded, second highest italicized):

Skater                  Height (m)   Distance (m)   Landing speed (m/s)   GOE
Yuzuru Hanyu            0.70         3.62           15.3                  3.43
Nathan Chen             0.58         2.66           17.1                  2.74
Shoma Uno               0.51         3.44           18.3                  3.09
Mikhail Kolyada         0.65         2.50           11.8                  2.97
Vincent Zhou            0.58         2.69           16.7                  1.60
Jason Brown             0.60         2.35           14.6                  2.51
Boyang Jin              0.57         2.55           16.0                  2.51
Morisi Kvitelashvili    0.60         3.51           17.0                  1.83
Slavik Hayrapetyan      0.64         3.13           17.9                  0.11

Here's the thing. The GOE bullet says "very good height and very good length (of all jumps in a combo or sequence)". BUT there is no actual number as to what constitutes "very good height" or "very good length".

Also important to note that it doesn't say SUPERIOR height or anything that implies the GOE bullet is awarded for better height/length relative to other skaters.

Now, what if, hypothetically, the minimum benchmarks to achieve this particular GOE bullet were 0.50 m for height and 2.35 m for distance?

Then all of these guys would be getting the height/distance GOE bullet on their triple axel, and minimal correlation between amplitude and GOE makes sense, since everyone's getting the minimum height/distance! :laugh:
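Taking the table above at face value, both claims in this post can be checked directly. Note the 0.50 m / 2.35 m benchmarks are CanadianSkaterGuy's hypothetical, not an ISU number:

```python
from math import sqrt

# (skater, height m, distance m, landing speed m/s, GOE) from the table above.
jumps = [
    ("Yuzuru Hanyu", 0.70, 3.62, 15.3, 3.43),
    ("Nathan Chen", 0.58, 2.66, 17.1, 2.74),
    ("Shoma Uno", 0.51, 3.44, 18.3, 3.09),
    ("Mikhail Kolyada", 0.65, 2.50, 11.8, 2.97),
    ("Vincent Zhou", 0.58, 2.69, 16.7, 1.60),
    ("Jason Brown", 0.60, 2.35, 14.6, 2.51),
    ("Boyang Jin", 0.57, 2.55, 16.0, 2.51),
    ("Morisi Kvitelashvili", 0.60, 3.51, 17.0, 1.83),
    ("Slavik Hayrapetyan", 0.64, 3.13, 17.9, 0.11),
]

# Hypothetical minimum benchmarks for the height/length bullet.
MIN_HEIGHT, MIN_DIST = 0.50, 2.35
clears_bullet = [name for name, h, d, _, _ in jumps
                 if h >= MIN_HEIGHT and d >= MIN_DIST]

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))

heights = [h for _, h, _, _, _ in jumps]
goes = [g for *_, g in jumps]
r = pearson_r(heights, goes)
print(f"All 9 clear the hypothetical benchmarks: {len(clears_bullet) == 9}")
print(f"Height vs GOE correlation across these 3As: r = {r:+.2f}")
```

For this small sample, every skater clears both thresholds and the height/GOE correlation is close to zero, consistent with the thread title.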
 

Mathman

Record Breaker
Joined
Jun 21, 2003
Now, what if, hypothetically, the minimum benchmarks to achieve this particular GOE bullet were 0.50 m for height and 2.35 m for distance?

Then all of these guys would be getting the height/distance GOE bullet on their triple axel, and minimal correlation between amplitude and GOE makes sense, since everyone's getting the minimum height/distance! :laugh:

To me, this line of reasoning highlights the whole problem with positive bullet points and GOEs. You have 6 binary bullet points "yes" or "no," with no real explanation of what "yes" means. Then you add up the number of yeses. (Not only that, but you have to decide whether to write yeses or yesses.)

I doubt that the judges are actually doing this. I bet they are looking at the jump, saying to themselves "wow" or "meh" and putting down a number from 1 to 5.

IMHO, this makes sense. The base value is "what did you do?" and the GOE is "how well (qualitatively) did you do it?"
 
Last edited:

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
To me, this line of reasoning highlights the whole problem with positive bullet points and GOEs. You have 6 binary bullet points "yes" or "no," with no real explanation of what "yes" means. Then you add up the number of yeses. (Not only that, but you have to decide whether to write yeses or yesses.)

I doubt that the judges are actually doing this. I bet they are looking at the jump, saying to themselves "wow" or "meh" and putting down a number from 1 to 5.

IMHO, this makes sense. The base value is "what did you do?" and the GOE is "how well (qualitatively) did you do it?"

To me, this is how judges do it. It's very difficult for a judge to consider every single rule/deduction/bullet/etc. as they watch a jump. It's not like they're watching every single jump multiple times to cover every detail. So they make a call - some might focus on a particular aspect of a jump which is why we see disparities in GOE.

The nature of the competition and politics and the popularity of the skater they're assessing all come into play. I'm fairly certain the judges who judge a skater at Nationals or a fluff competition like WTT would judge them more strictly at Worlds (which obviously explains GOE inflation we see at Nationals).

One thing I've always marvelled at is diving judges. Like, AFAIK, they don't get replays of a dive, so they have to make a split-second decision covering all aspects of the dive without slow-mo replay to help them out.

Interestingly (and unsurprisingly), their scoring is often influenced by the divers preceding them even if they're supposed to treat every dive as an absolute.
https://royalsocietypublishing.org/doi/full/10.1098/rsos.160812
 