# Ice scope data shows zero correlation between height & GOE

#### Mathman

Record Breaker
One thing I always marvelled at were diving judges.

By the way, in the changes to the GOE structure last year, the ISU decided to go with the "diving model" to replace the former additive model.

It used to be that you would get the base value and then a GOE score from -3 to +3 to add on. In the current model, the GOE provides a multiplier. +1 GOE means multiply the base value by 1.10, +2 GOE means multiply the base value by 1.2, etc.

This is like diving, where you get a difficulty score and then a performance score. If you get a performance score of 8, say, then you multiply the difficulty score (base value) by 80% (times ten for convenience).
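To make the difference between the two models concrete, here is a minimal sketch. It assumes only what is described above: each GOE step is worth 10% of the base value in the current model, and GOE points were simply added on in the former one. The function names and the base value of 8.0 are placeholders for illustration, not official ISU figures.

```python
def multiplicative_score(base_value: float, goe: int, step: float = 0.10) -> float:
    """Current model as described above: the GOE scales the base value."""
    return base_value * (1 + step * goe)


def additive_score(base_value: float, goe_points: float) -> float:
    """Former model as described above: GOE points are added to the base value."""
    return base_value + goe_points


base = 8.0  # hypothetical base value, for illustration only
for goe in (1, 2, 3):
    # Each extra GOE step adds another 10% of the base value.
    print(goe, round(multiplicative_score(base, goe), 2))
```

One consequence of the multiplier is that the same GOE is worth more points on a harder element, since the bonus grows with the base value.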

#### Shanshani

On the Ice
It's not so much that the rules require jump size to be the primary determinant of jump scoring and judges are either unable or unwilling to follow those rules.

Rather, the rules consider jump size to be one of many positive qualities to be rewarded. Perhaps there is an inverse relationship between jump size and some of the other positive qualities. Or a direct correlation between jump size and some of the negative qualities. For example, I'd expect bigger jumps, on average, to be more telegraphed.

Maybe, but if so it still means the scoring system is effectively incentivizing small jumps, if the benefit provided to bigger jumps is not nearly enough to outweigh the risks. And while I suspect having higher jumps does negatively impact some aspects of jumps (e.g. it may make them harder to land), let me note that some of those negative effects are already removed from the data set because the data set only includes positive GOE jumps. So, even *after* potentially excluding much of the negative effect of height on GOE, there is *still* no relationship between height and GOE, or if we exclude Yuzuru from the data set, a *negative* relationship. This means that having bigger jumps is potentially much more punishing than is represented here--and, to be honest, I think that's just a really terrible incentive structure. Big jumps are part of what makes figure skating exciting and interesting to watch for a significant proportion of the audience, not to mention a contributor to the athletic side of the sport, so it's something that should be encouraged, not discouraged.

However, let me note that there is one statistic I calculated that might have some direct bearing on the question of whether height negatively impacts other qualities of jumps. There was a mildly negative (r=-0.19) relationship between height and speed out, which can sort of be used as a proxy for outflow (though I'm not sure it's perfect since I don't think faster speed out = better beyond a certain point). So it appears there may be a negative relationship (although at that r value, I doubt it's statistically significant) between height and outflow. But the effect is really weak, so it wouldn't explain much of the negative relationship between height and GOE for non-Yuzuru Hanyu triple axels.

Record Breaker
Question: how do the results skew if someone like Hayrapetyan (relatively high height but low GOE) and a skater like Uno (relatively lower height but high GOE) are removed?

Because these two are examples of how height isn't the be-all and end-all for high/low GOE. Uno for example had excellent distance and creative entry and great landing speed and it looked pretty effortless so high GOE was merited... but for Slavik, the landing was low and out of control (possibly due to the amplitude) and he had a bit of a lean and not really any difficult entry transitions so the GOE was lower (also didn't help he's not a popular/top tier skater). These are 2 examples that could have "neutralized" a greater positive correlation between jump height and GOE (and indicates what we already all know - that there are other factors that contribute to higher/lower GOE).

#### Shanshani

On the Ice
Question: how do the results skew if someone like Hayrapetyan (relatively high height but low GOE) and a skater like Uno (relatively lower height but high GOE) are removed?

Because these two are examples of how height doesn't translate to GOE. Uno for example had excellent distance and creative entry and great landing speed and it looked pretty effortless so high GOE was merited... but for Slavik, the landing was low and he had a bit of a lean and not really any difficult entry transitions so the GOE was lower (also didn't help he's not a popular/top tier skater). These are 2 examples that could have "neutralized" a greater positive correlation between jump height and GOE.

I was going to take a look at that, actually. I think there's a much better case for removing Yuzu than others bc Yuzu is a fairly singular outlier on height, whereas you get into a bit of a mess if you try to remove other data points, but it did occur to me that Shoma might be contributing to the negative relationship somewhat, though there was another skater with the same height who would also have to be removed so I think that counterbalances that effect somewhat. But I don't think the second point you're making, about why height does not necessarily equal GOE, really counters my core point that much. Of course there are factors other than height, and we should not expect a particularly high r-value for height versus GOE because of those other factors. It would obviously be ridiculous to expect GOE to be perfectly correlated with height like clockwork. The problem is not that the r-value is positive but low, it's that it's essentially non-existent or even negative (depending on how you handle the Yuzuru data point--he is singularly responsible for the fact that the correlation is 0.03 rather than -0.23, because outliers have really strong effects on lines of best fit). That means that height, at best, is not having any statistical impact whatsoever even after averaging over the jumps and their respective good and bad qualities, or at worst is negatively affecting GOE, despite the fact that the judging system is supposed to reward height. Of course, this is complicated by interactions between height and other jump qualities (though let me note the evidence for these interactions is also limited), but again, the core point is this:

Regardless of whatever mechanism causes this, if you are a competitive skater, this data indicates you should not expect to be rewarded for jumping high, because it suggests there is either a negative relationship between jump height and GOE or no relationship at all (though this is perhaps less true for ladies, as I remember someone pointing out earlier). Therefore, given that jumping high likely carries inherent risks, it makes little sense from a competitive perspective to jump a centimeter higher than is necessary to land the jump.

Of course, I would like more data, and I would also like to see whether other systems give the same measurements as icescope. But as a preliminary matter, I think the data is worth talking about and suggests that we should endeavor to score the height/distance bullet more objectively.
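To illustrate how a single outlier can move a correlation coefficient, here is a small sketch. The numbers are synthetic and invented purely for this example; they are not the Ice scope measurements, and the variable names are placeholders. The point is only that one extreme point can flip the sign of r in a small sample.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic values invented for this example; NOT the Ice scope data.
heights = [50, 52, 54, 56, 58, 60]
goes = [2.6, 2.0, 2.4, 1.8, 2.2, 1.6]

# One extreme point: much higher than the rest and also highly scored.
outlier_height, outlier_goe = 70, 3.5

r_without = pearson_r(heights, goes)
r_with = pearson_r(heights + [outlier_height], goes + [outlier_goe])

print(f"r without the outlier: {r_without:.2f}")
print(f"r with the outlier:    {r_with:.2f}")
```

In this toy sample the main cluster trends downward, yet adding the one extreme point is enough to make the overall r positive.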

Record Breaker
Shanshani, regarding the way you've done the trend plots, I've noticed an issue in how you've plotted the x-axes with respect to the GOE y-axis. The y-axis is standardized for all the graphs (height of axis is the same with each partition representing the same percentage of the total in each graph) but the x-axis is not done the same way.

E.g. the x-axis in A) Jump height vs GOE plot and B) the x-axis in PCS vs GOE. The x-axis in A is 28.22 (Demirboga) to 46.71 (Hanyu) -- this range of 24.29 represents 52% of the total range from 0 to 46.71 (and each of the 4 partitions on the x-axis represent 13% of the total range). And the x-axis in B is 0.51 (Uno) to 0.70 (Hanyu) -- this range of 0.19 represents 27% of the total range from 0 to 0.7 (and each of the 4 partitions represents 6.8% of the total range). So your x-axes aren't calibrated for comparison with each other with one range almost double the representation of the total range as compared to the other. In other words the slope of the trend line in graph A is almost doubled (yes, even if calibrated the slope would be higher in A than B ... but it wouldn't *look* as much steeper when comparing the graphs at first glance).

If you plotted the same graphs with the x-axes actually starting from 0 (eg 0 to 46.71 PCS and from 0 to 0.70 m, and same for landing speed and distance), it would give a more accurate method of comparison than starting the x axis from a starting point that isn't zero.

Could you please provide the link to the entire spreadsheet of data? I can't seem to find it in your initial post? I would be curious to see how these graphs and their trend lines would look if the ranges of the x axis were from 0% to 100% of the ranges they represent (eg 0 to 46.71 PCS, 0 to 0.7 height).

#### Shanshani

On the Ice
I will take a look at how the graphs are formatted at some point in the future (this is unfortunately already taking up too much of my time); I just went with the default formatting that Google provided. I thought I included the link, but I suppose I overlooked that. Here you go.

- - - Updated - - -

I suspect one thing that is giving people difficulty is lack of familiarity with correlation coefficients. So let me briefly provide some more context. Correlation coefficients (r) are values ranging from -1 to 1 which measure how closely two variables are associated with each other and which direction the relationship takes (e.g. do they go up together or do they go in opposite directions? if r is positive, the former is true, and if r is negative, the latter). Values close to the ends of the range (close to -1 and 1) indicate that a very strong relationship exists, whereas values close to 0 indicate that there is little relationship between the variables. If the value is 1, that means that the two variables increase in lockstep. If it's closer to 0.5, that means that a trend exists, but we will still see considerable variation between individual data points, so just because one variable goes up from one data point to the next doesn't mean the other necessarily moves in one particular direction. If it's closer to 0.2, then there is a relationship but it is quite weak and may be difficult to see.
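For anyone who wants to see those cases as numbers, here is a minimal sketch that computes r from its definition. The three small data sets are made up for illustration (they have nothing to do with the skating data) and are chosen so that r comes out at the two ends and the middle of the range.

```python
import math


def pearson_r(xs, ys):
    """Pearson r: co-variation of x and y divided by the product of their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
lockstep = [2, 4, 6, 8, 10]   # rises exactly with xs
opposite = [10, 8, 6, 4, 2]   # falls exactly as xs rises
unrelated = [3, 1, 4, 1, 3]   # no linear trend against xs

r_lockstep = pearson_r(xs, lockstep)
r_opposite = pearson_r(xs, opposite)
r_unrelated = pearson_r(xs, unrelated)

print(round(r_lockstep, 3), round(r_opposite, 3), round(r_unrelated, 3))
```

Real data will almost never land exactly on these values; the interesting question is where between them a given pair of variables falls.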

As people like CanadianSkaterGuy have pointed out, there are other factors affecting GOE other than jump size. This is a strong reason to think that the correlation between jump size and GOE will not be particularly strong--indeed, I would have been very surprised and somewhat concerned if the correlation was higher than 0.5. Nonetheless, it is not by itself a reason to think that there should be no relationship at all. For instance, (using an example from outside of figure skating), there is a detectable correlation between a person's height and a person's income (provided they're from the same country and the same gender), around r=0.25-0.35 (though the article I believe this is from is paywalled for me so someone will have to confirm). Obviously, quite a lot of factors affect someone's income besides their height, and there are probably at least a few factors that exert significantly more influence! But nonetheless, the relationship exists and is detectable using statistical methods. That is what I attempted to do in the opening post--to detect whether there is a relationship between 3A height/distance/speed out and GOE at least comparable to the relationship between height and income or other weakly correlated variables. I found that there was not, and if anything the relationship went in the opposite direction. If the relationship had merely been a weak positive one, I would have been satisfied--after all, there *are* other factors affecting GOE, so the relationship should be weak. But at present, the evidence suggests someone's physical height has more positive effect on their income than World SP 3A heights had a positive effect on GOE.

One question is--why is that? Is the Worlds Men's SP an anomaly? Is it bad judging? Does jumping higher inherently negatively affect other GOE characteristics more than it helps? All of the above? The data by themselves do not necessarily answer these questions (though more data would help for the first question, and there's plenty of evidence to support that judges are not particularly objective about scoring, so one might justifiably have suspicions about the second). But regardless of the answers to these questions, the data do suggest that jumping high is not incentivized, or worse, actually disincentivized by the scoring system, at least if the icescope data is accurate.

Record Breaker
Thanks for the data! Will take a look.

Also regarding the "trend" between GOE vs PCS I'm not quite sure how you can make a point about the GOE of the 3A and PCS having a direct relationship. Sure, top-tier skaters tend to execute elements better so a skater with better PCS is more likely to have a better 3A (as a general statement - obviously skaters like Chan, Lambiel, Brown are exceptions). But the issue is that this doesn't correlate with the rest of the program -- Demirboga had a good 3A but had several other errors in his program (and isn't the best skater with the best program either), so obviously the good 3A doesn't mean his PCS will somehow be great.

#### Mathman

Record Breaker
Also regarding the "trend" between GOE vs PCS I'm not quite sure how you can make a point about the GOE of the 3A and PCS having a direct relationship.

To me, the relationship could not be more obvious. If you do a great big beautiful triple Axel, that enhances your choreography, makes it seem like it goes with the music better, makes it look like you have better skating skills, gets the audience into the performance, and the whole shebang.

Shanshani said:
the data do suggest

Thank you for that "do" instead of "does."

Record Breaker
Sorry I should have clarified... I agree with you that a good triple axel can affect PCS, but that graph would look a little different, with the GOE of the triple axel on the x-axis as the independent variable, and the PCS on the y-axis as the dependent variable. The way the graph (https://i.redd.it/8csugc7rthn21.png) is set up, it places PCS on the x-axis (as the independent variable) and GOE on the y-axis (as the dependent variable). Which is saying GOE of the triple axel is determined by the PCS (I agree with you that the other way around makes more sense). After all, the judge doesn't choose the PCS and then decide to give the 3A better GOE (although as we know, top skaters, who get higher PCS, tend to be given more generous GOE or given leeway on deductions, but that's beside the point)... but a judge does see a good 3A (with +GOE likely) and that can translate into better PCS.

Another way of analyzing it is by asking the question: "So Mathman, Skater A just did their program and got 40 PCS. Their triple axel had positive GOE. Based on the scatter plot (https://i.redd.it/8csugc7rthn21.png), could you tell me what their GOE was?"

The answer is no, because there is a low positive correlation in the scatter plot. You could guess that it's 1.9 since that's where the line of best fit intersects, but of the 5 skaters who got approximately 40 PCS (and a +GOE 3A) their 3A GOE ranged from around +0.34 (Aymoz) to +2.51 (Rizzo/Jin) GOE. A strong positive correlation, however, would have many more plot points close to the line of best fit (i.e. that trend line).

Also, the degree to which an element execution affects PCS differs in scale from the actual potential GOE score on that element --- for example, a top skater lands a 3A perfectly and gets +4.00 GOE, and their PCS for that program happens to be 40. The next time they perform, they have a stepout or touchdown and get only +1 GOE (or 25% of what they originally scored), but their 40 PCS isn't going to drop to 25% of that, i.e. 10 PCS.

The same thing goes for the other scatter plots... there is low positive correlation between GOE and other variables like height, landing speed, and distance, simply because there are so many other variables that come into play. If Hanyu had fallen on his triple axel, he'd still have the highest height and distance... but the fall would obviously drop his GOE to a negative amount - even if the height/distance stayed the same. The height/distance can add to the GOE (based on one bullet) but the effect of all other executional variables can completely negate that (as we saw with Slavik's triple axel which had great amplitude but landing issues which dropped his 3A GOE below other skaters who didn't have as much height or distance as he had).

Record Breaker
I think because height is essentially just half of a GOE bullet, it stands to reason that it doesn't have a huge impact on GOE. But there are still scoring benefits to a skater jumping with height - namely a greater likelihood to land the jump fully rotated and without error, and it makes the performance more impressive overall if jumps are actual highlights in the music because of how grandiose they are. I don't know if a point you were trying to make was that judges are neglecting to give skaters higher GOE, but as you've acknowledged and many of us have pointed out, GOE has a variety of factors.

Data can also be a bit mercurial because all it takes is for a skater to have a stepout or touchdown or skid off their landing blade and suddenly their GOE is grossly affected while their amplitude remains unchanged. I just think there are so many other variables that can adversely or positively affect GOE that it's actually not a huge surprise that amplitude isn't hugely correlated with the GOE given to a jump. As we see in your scatterplot and using your data, the average is 0.59 and within the standard deviation (0.55 to 0.63), we have GOE ranging from 0.34 (Aymoz) to 2.74 (Chen). As I said with Hayrapetyan, he had the 3rd highest 3A, but because of the bad landing/almost touching down, he killed his potential GOE on that element. And of course, a judge could have credited it as very good height more so than they would have for other skaters who ended up scoring higher GOE than Slavik, but the reductions affected the final GOE.

#### Mathman

Record Breaker
I think there is a common-sense reason why jump height might be somewhat negatively correlated with speed out of the jump (as a surrogate for flowing exit edge). If you jump straight up you will get a lot of height, but little horizontal ice speed on the landing.

By the way, there is another useful interpretation of what the correlation coefficient r means. It is the slope of the least squares regression line measured in standard units (i.e., in standard deviations instead of raw scores along the x- and y-axes). It is actually kind of cool that this slope cannot compute out to be greater than 1 for any set of data. This is the phenomenon of "regression toward the mean."

You can also think of the square r² as the percentage of GOE that is associated with height/distance, and the rest, 1-r², as the part of the GOE that is associated with random statistical noise ("sampling error") -- or with other factors not involved in the analysis. In this study the random noise part turned out to be pretty much 100%.
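The "slope in standard units" interpretation can be checked numerically. The sketch below uses made-up data (not the skating measurements) and hypothetical helper names: it converts both variables to z-scores, fits a least squares line, and compares the slope with r computed directly.

```python
import math


def standardize(values):
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def least_squares_slope(xs, ys):
    """Slope of the least squares regression line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 1.9, 3.2, 2.8, 4.1, 3.9]

slope_raw = least_squares_slope(xs, ys)
slope_std = least_squares_slope(standardize(xs), standardize(ys))
r = pearson_r(xs, ys)

print(f"raw slope: {slope_raw:.3f}, standardized slope: {slope_std:.3f}, r: {r:.3f}")
```

The raw slope depends on the units of each axis, while the standardized slope matches r and so always stays between -1 and 1.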

##### Skating is Art, if you let it be
Record Breaker
If only there was a way for skaters to have height, distance, and speed on a jump at the same time.

Oh wait, there is, and we used to see it all the time. Technique has just gone massively downhill these days, with people purposefully doing lower quality jumps in order to gain consistency, and yet still receiving sky-high GOE’s for their mediocrity.

#### Shanshani

On the Ice
I think there is a common-sense reason why jump height might be somewhat negatively correlated with speed out of the jump (as a surrogate for flowing exit edge). If you jump straight up you will get a lot of height, but little horizontal ice speed on the landing.

By the way, there is another useful interpretation of what the correlation coefficient r means. It is the slope of the least squares regression line measured in standard units (i.e., in standard deviations instead of raw scores along the x- and y-axes). It is actually kind of cool that this slope cannot compute out to be greater than 1 for any set of data. This is the phenomenon of "regression toward the mean."

You can also think of the square r² as the percentage of GOE that is associated with height/distance, and the rest, 1-r², as the part of the GOE that is associated with random statistical noise ("sampling error") -- or with other factors not involved in the analysis. In this study the random noise part turned out to be pretty much 100%.

The negative correlation between height and landing speed is pretty weak though, it's not even close to being statistically significant (p=0.38).

I don't think that's quite a correct description of r², it's more the amount of variation in GOE that's explainable by differences in height/distance, rather than the % of GOE itself (in other words, it's how much of the difference between the different skaters' GOE is related to differences in jump height). In this case it's kind of splitting hairs though since it's 0 either way. On the other hand, r-squared for GOE versus PCS is 0.43 and r-squared for GOE versus total planned BV is 0.337. So quite a lot of differences in how skaters scored on GOE can be "explained" by how they are scored on PCS and what their planned BV is.
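Here is a rough sketch of what "variation explained" means in practice. It uses synthetic numbers (not the skating data) and hypothetical names: fit a least squares line, measure how much of the spread in y is left over as residuals, and compare one minus that leftover share with r squared.

```python
import math

# Synthetic data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 1.9, 3.2, 2.8, 4.1, 3.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Least squares line: y is approximated by intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Total variation in y, and the part the line fails to account for.
ss_total = syy
ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

explained_share = 1 - ss_residual / ss_total
r = sxy / math.sqrt(sxx * syy)

print(f"share of variation explained: {explained_share:.3f}")
print(f"r squared:                    {r ** 2:.3f}")
```

For a straight-line fit the two numbers agree, which is why r-squared is read as the share of the differences between data points that the other variable accounts for. Note that it is a squared quantity: an r-squared of 0.43 corresponds to an r of roughly 0.66 in absolute value.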

#### tral

##### Match Penalty
Oh wait, there is, and we used to see it all the time. Technique has just gone massively downhill these days, with people purposefully doing lower quality jumps in order to gain consistency, and yet still receiving sky-high GOE’s for their mediocrity.

This, a thousand times this, at the reigning men's world champion's "textbook" jumping, and reigning men's WBM "quads". Of course it's easier to train and jump those many quads when you're jumping small to save up energy. What's so impressive or prodigious about that?!

Record Breaker
If only there was a way for skaters to have height, distance, and speed on a jump at the same time.

Oh wait, there is, and we used to see it all the time. Technique has just gone massively downhill these days, with people purposefully doing lower quality jumps in order to gain consistency, and yet still receiving sky-high GOE’s for their mediocrity.

Well, we could also remove transitions from the picture and just have skaters stroke into their jumps to maximize height, length and speed.

Record Breaker
This, a thousand times this, at the reigning men's world champion's "textbook" jumping, and reigning men's WBM "quads". Of course it's easier to train and jump those many quads when you're jumping small to save up energy. What's so impressive or prodigious about that?!

There is some merit in expending energy to improve all aspects of a program and not just make it about saving up for amplitude on jumps. The "figure skating not figure jumping!" folks would tell you that.

One of the strangest things is applying the concept of "jumping small" for an element like a 3A or quad. Like, sure some can jump these bigger than others, but you can't execute these jumps reliably in the first place if you jump "small".

##### Skating is Art, if you let it be
Record Breaker
Better to skate with strong edges, posture, and speed - and doing jumps with great quality - than to do shallow and pointless transitions, with little awareness of how your body line is being broken up, and lower quality jumps on top of it.

It’s possible to do high quality jumps with transitions or less setup anyway. Look at the first competitive 3Axel ever from Vern Taylor in 1978. He went into it right after doing a 3Lutz and got amazing lift on both of those jumps. Similarly in the 80’s, look at how many transitions the guys were doing, yet still had huge and airy jumps, without pre-rotating.

Record Breaker
I don't think it's a fair point of comparison. The level of transitions in those programs was lower than today's, telegraphing was fine, footwork sequences were easier... and spins had a pitifully low number of rotations with usually just basic positions and few difficult variations, especially in men's. The demands and requirements for skaters today are much greater.

#### tral

##### Match Penalty
Yes, "lower" level of transitions and "easier" footwork, like Browning's. BTW, compared to whom? Vincent Zhou?

Yes, "basic" spin positions, like the classic camel, substituted for foolishly defined "difficult" positions.