Today Show Investigation on Scoring | Page 2 | Golden Skate

gkelly

Record Breaker
Joined
Jul 26, 2003
I wonder if IJS needs a "police officer" judge - or algorithm - which would check the scores for situations where rules require a -2 reduction in GOE and a judge awards 2 or more. Which should be impossible with a max possible GOE of +3 (+3-2=1). If your GOE is more than 1, it lights up in red and you have to change it.

An algorithm that flags situations where a tech panel call requires a GOE reduction and therefore the GOE cannot be higher than X would work. This would ensure that judges do not finalize their marks without being aware that there was a call that was supposed to result in a GOE reduction.

There could also be a flag if a judge enters a component score with 0 as the leading digit, since that can also happen accidentally, albeit rarely.
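A minimal sketch of that component-score flag, assuming scores are entered as plain numbers and that anything below 1.00 (i.e., a leading digit of 0) should prompt the judge to confirm before finalizing. The function name and the sample marks are hypothetical, not anything from the actual ISU software.

```python
def needs_confirmation(component_score: float) -> bool:
    """Return True when the entered score has 0 as its leading digit.

    Assumption: such a score is allowed but unusual, so the judge is
    asked to confirm it rather than being blocked from entering it.
    """
    return 0.0 <= component_score < 1.0


# Hypothetical marks from one judge for one program.
entered = {"Skating Skills": 8.75, "Transitions": 0.25, "Performance": 9.0}

# Collect the components the judge would be asked to double-check.
flagged = [name for name, score in entered.items() if needs_confirmation(score)]
print(flagged)
```

A confirmation prompt rather than a hard block seems safer here, since a very low score could in rare cases be intended.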

Just saying judges favor their own skaters is kind of misleading unless you show individual cases and spell out exactly what about that performance the judge scored incorrectly. Just looking at stats is very inaccurate. Take the JGP, for example: in about 99% of the events a Russian or Japanese judge should favor their athlete, because those athletes are better than the rest.

Obviously if a judge represents the same country as the best skaters, their skaters are going to get the highest scores across the board.
I guess the question is whether judges score their own skaters consistently higher than the rest of the judging panel. But then you also have to control for whether that judge is marking everyone higher than the other judges on the panel.
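One rough way to frame that comparison, as a sketch only: for each mark, take the judge's score minus the average of the other judges on the panel, then compare the average of those differences for compatriot skaters against the average for everyone else. Subtracting the second from the first removes a judge's general tendency to mark high or low. The function name and the data layout are my own assumptions, and with only a handful of marks this is a descriptive number, not a statistical test.

```python
from statistics import mean


def compatriot_gap(marks):
    """marks: list of (judge_score, mean_of_other_judges, is_compatriot).

    Returns the judge's average deviation from the rest of the panel for
    compatriot skaters minus the same average for all other skaters,
    or None if either group is empty.
    """
    own = [score - others for score, others, is_comp in marks if is_comp]
    rest = [score - others for score, others, is_comp in marks if not is_comp]
    if not own or not rest:
        return None
    return mean(own) - mean(rest)


# Hypothetical judge who is 0.25 above the panel for everyone,
# but 0.50 above the panel for skaters from their own country.
marks = [
    (8.50, 8.00, True),
    (9.00, 8.50, True),
    (7.50, 7.25, False),
    (8.00, 7.75, False),
]
print(compatriot_gap(marks))
```

A judge who marks every skater higher by the same amount would come out at zero here, which is the control described above.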

There are three general reasons why a judge would be likely to score their own country's skaters higher:

*Cultural preferences.
If, for example, there is a strong tradition in that country that the most important thing about flying camel spins is a beautiful camel position, then skaters from that country are more likely to achieve beautiful positions and judges from that country are more likely to reward them. A different country might consider the takeoff and air position of the flying entry more important than the position in the spin; skaters from that country would be more likely to have a strong fly and judges from that country more likely to reward it.

There is nothing wrong with those differences of emphasis. The skating community may agree that both areas are important and should be reflected in GOEs, but it's impossible to write into the rules just how important one such quality is compared to another.

That's exactly why there are multiple judges on the panel.

*Feelings of patriotism and familiarity.
Judges are more likely to have warm feelings for skaters representing their own country and specifically for skaters they have judged often over the years since well before junior or senior level. The judges may make a strong conscious effort to avoid their scores being influenced by those feelings, but there may still be an influence at an unconscious level.

And sometimes the judges make such a strong effort to avoid bias that they end up overcompensating and underscoring their compatriots.

*Intentional score manipulation, i.e., cheating.
Judges may consciously go into a competition with the intention of helping their country's skaters to place as high as possible. Thus they may systematically overmark their own skaters, and undermark their skaters' most likely close rivals. If they're smart about math and strategy they'll do it in subtle ways that are unlikely to get flagged just by eyeballing the protocols.

If they're really nefarious they'll make deals with other judges to help out each other's skaters and together slightly lowball the rivals.

Simple solution....your scores are automatically tossed out when a skater from your federation performs in addition to the two outliers.

Interesting. I remember having an e-mail exchange 20+ years ago with a skating fan who was also a statistician, who recommended a similar solution: every time a skater performed who had a compatriot judge on the panel, that judge's scores would be dropped and the substitute judge's or referee's scores would be substituted in their place.

I argued that that would be nonsense with the scoring system at the time, because the whole point of the scores under ordinal judging was to rank the skaters but the actual numbers used could vary from one judge to another. Just because the Fredonian judge gave the Fredonian skater 5.8/5.8 and the substitute judge gave her 5.7/5.7 doesn't mean the substitute judge ranked her lower -- the substitute could have been using a lower score range in general and actually thought this skater should rank higher than the compatriot judge. Or vice versa.

However, with IJS, where judges are not ranking skaters but just contributing evaluations of each element and program component to an average of the panel, and an odd number of judges is not required, it could make sense to replace or omit individual judges' scores for individual skaters.
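A sketch of the "omit" version of that idea, under my own assumptions: each mark is a (judge_country, score) pair, the compatriot judge's mark is dropped first, and then the highest and lowest remaining marks are dropped before averaging, as in the proposal quoted above. The country codes and scores are made up for illustration.

```python
def panel_average(scores, skater_country, trim=True):
    """Average the panel's marks for one element or component.

    scores: list of (judge_country, score) pairs.
    Marks from judges who share the skater's country are omitted.
    If trim is True and more than two marks remain, the single highest
    and single lowest remaining marks are also dropped.
    """
    kept = sorted(s for country, s in scores if country != skater_country)
    if not kept:
        raise ValueError("no non-compatriot marks left to average")
    if trim and len(kept) > 2:
        kept = kept[1:-1]
    return sum(kept) / len(kept)


# Hypothetical GOEs for one element from a seven-judge panel.
scores = [("FRE", 3), ("AAA", 2), ("BBB", 2), ("CCC", 1),
          ("DDD", 2), ("EEE", 3), ("FFF", 2)]

print(panel_average(scores, "FRE"))  # skater with a compatriot judge
print(panel_average(scores, "ZZZ"))  # skater with no judge on the panel
```

Note that the two skaters end up averaged over different numbers of marks, which is exactly the wrinkle an averaging system can tolerate and an ordinal system could not.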

If overmarking by compatriot judges is a consistent problem, especially of the second and third types mentioned above, then this solution would be worth considering.

IMO...you just can't throw too many of the judges' marks out. I'd even support center ice graphics on the Jumbotron like the old 'Press Your Luck' Whammies that shame the judge and sweep away the outlier scores publicly for all to see and laugh at!! Let's keep score of the judges' whammy stats! I'm not even joking!!

I would not support that. Any given individual score might be out of line because of a data input error (some of which could be caught by algorithms and flags to the judges in the computer) or because the judge had a different view of an element than the rest of the panel, or other reasons that are not nefarious and not appropriate to laugh at.

Sometimes their scores might even be more correct than the majority's.
 

pearly

Record Breaker
Joined
Sep 1, 2017
An algorithm that flags situations where a tech panel call requires a GOE reduction and therefore the GOE cannot be higher than X would work. This would ensure that judges do not finalize their marks without being aware that there was a call that was supposed to result in a GOE reduction.

There could also be a flag if a judge enters a component score with 0 as the leading digit, since that can also happen accidentally, albeit rarely.

That happened to an ice dance team earlier this season, didn't it? Not sure if it was in a GP or CS event, but a Canadian or US top team got 0.25 from a judge. All their other PC scores were between 8 and 9.50 or so. Was there ever any comment on that, did it get corrected?

Interesting. I remember having an e-mail exchange 20+ years ago with a skating fan who was also a statistician, who recommended a similar solution: every time a skater performed who had a compatriot judge on the panel, that judge's scores would be dropped and the substitute judge's or referee's scores would be substituted in their place.

Hmm, a substitute judge. Interesting. My initial thought was that this would leave some skaters with 7 marks and others with 6, possibly helping/hindering in cases of ties. But I'm still not sure if there are any strong pros or cons for this, or if it would make score manipulation easier or more difficult. It's a shame we went back to 9 judges, all of whose scores are counted.
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
I’m 100% against this because I want to remove all of the TP’s power I can. In fact, I wish any call they make, like URs and edge calls, were voted on by the panel. So if the TP calls a UR, then right after the performance the judges get prompted to uphold or decline the call via a quick “public” vote on their computer screen, with up to two quick video reviews... just like Snapchat, it then disappears. You can’t involve too many checks and balances with this stuff IMO.

It may take longer but they just need to improve the in arena entertainment while more accurate scores are posted.

Agreed that the tech panel shouldn't have the power to control competitions. But perhaps they could flag an element for certain error on replay like a 2-foot or a stepout or landing on the wrong edge or a hand touching down. And then we get to see what judges are still giving generous GOE in spite of an element being flagged for an error (kinda like a ! call encourages the judges to consider reducing a GOE because of a flutz/lip). And of course, hold judges accountable for a Tech-panel-flagged hand down or two-foot, if they still give +3.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
Agreed that the tech panel shouldn't have the power to control competitions. But perhaps they could flag an element for certain error on replay like a 2-foot or a stepout or landing on the wrong edge or a hand touching down. And then we get to see what judges are still giving generous GOE in spite of an element being flagged for an error (kinda like a ! call encourages the judges to consider reducing a GOE because of a flutz/lip). And of course, hold judges accountable for a Tech-panel-flagged hand down or two-foot, if they still give +3.

I think the point of showing the judges the tech panel's calls for wrong edges and underrotations is that those errors can be hard to see.

It might be worth having tech panels check for touchdown of the free foot on landing. That often happens in conjunction with an underrotation or can be confused for one because of a stutter in the flow upon landing, so they would likely be reviewing those jumps anyway. It would just mean adding another code alongside or in place of the < or << as appropriate.

A step out is pretty obvious. And although it may not always be clear whether a skater's hand touches the ice on a jump landing, if there's any question, that means the landing position was leaning forward, so judges could at least see that much for themselves.

The ISU specifically decided several years ago that judges should be allowed to give 0 or even positive GOE for jumps with moderate errors, assuming there were enough other positive qualities to the element.

If the recommended reduction is -1 or a range of -1 to -2 or -1 to -3, then a GOE of +2 would be legal.

If the reduction is -2 or -2 to -3, then GOE of +1 would be legal.

So the only examples that would need to be flagged would be a judge awarding +3, or +2 for an error that requires at least -2 reduction.
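The arithmetic above can be sketched like this, assuming the +3 to -3 GOE scale under discussion and taking the least severe reduction in the allowed range as the one that sets the ceiling. The function names are hypothetical.

```python
MAX_GOE = 3  # top of the GOE scale assumed in this discussion


def max_legal_goe(least_reduction: int) -> int:
    """Highest GOE a judge could legally award after a required reduction.

    least_reduction is the smallest reduction in the allowed range,
    given as a positive number: 1 for "-1" or "-1 to -3",
    2 for "-2" or "-2 to -3".
    """
    return MAX_GOE - least_reduction


def should_flag(awarded_goe: int, least_reduction: int) -> bool:
    """True if the awarded GOE is higher than the rules could allow."""
    return awarded_goe > max_legal_goe(least_reduction)


# Reduction of -1 (or -1 to -3): +2 is legal, +3 is not.
print(should_flag(2, 1), should_flag(3, 1))
# Reduction of -2 (or -2 to -3): +1 is legal, +2 is not.
print(should_flag(1, 2), should_flag(2, 2))
```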

Have you ever seen a +3 for a jump with one of those errors? I think that's a solution in search of a problem.
 

noskates

Record Breaker
Joined
Jun 11, 2012
Well, I think the federations should stop picking judges. Professionalize judging as suggested in the article. It would be nearly impossible, but it IS possible, to not have a judge on the panel who has a skater or skaters in the competition. But work at it.
 

Eclair

Medalist
Joined
Dec 10, 2012
Maybe someone should tell those NBC journalists of the Belarusian judge and his behavior earlier this season ... lol.

On a more serious note, I don't think it rids the sport of biased judging if we implement more votes for or against URs etc. The problem still is that judges are not accountable and only answer to their federations.
Getting rid of biased judging is actually pretty easy. For the tech calls:
- The tech panel calls the jumps and levels just as it does now.
- They repeat each jump in slo-mo in the arena after the skate; most of the time they do that already anyway.
- They have to publish the calls and levels etc. they made before the final scores.
- Coaches of competing skaters, or the skater's own coach, can file a formal protest or objection if they think that the tech panel has overlooked a UR or wrong edge, or has wrongly given the skater a call. This all happens before the scores are announced.
- The tech panel must review the element that the coach thinks was overlooked or wrongly called. They can then decide whether they want to change their call or not.
- If they don't change the call and the coach still thinks it's blatantly unfair, then the coach can file a formal protest after the competition, and the element will be reviewed by a broader panel.
- This system would benefit fair judging, as the skaters' coaches can appeal wrong and unfair calls in time, while it can at the same time prevent the tech panel from overlooking URs, two-footed landings and edges (or levels).

PCS and GOE judging panel:
- are required to answer to journalists and coaches after the skate in an open interview
- are required to answer to each other after the competition in a public and open discussion
- are strictly reminded to judge only what the skater did that day at the competition, not name, nationality, reputation or practice.
 

schizoanalyst

Medalist
Joined
Oct 26, 2016
The more interesting Zitzewitz paper is the one that shows the presence of vote-trading blocs among countries and how ski jumping avoids them through a different judge selection mechanism:

http://onlinelibrary.wiley.com/doi/10.1111/j.1530-9134.2006.00092.x/abstract.

If you don't have access to journals, the working paper version is mostly the same: https://www.gsb.stanford.edu/facult...ter-sports-judging-its-lessons-organizational.

This paper is about a decade old, but the incentives he locates that cause bloc voting haven't changed so I wouldn't expect much of a change.

Edit: Also, lol at the comparison to Zitzewitz as just some random guy. Even if you can't judge the quality of the statistical work (it's quite a bit more rigorous than what passes in most social science), he's a tenured professor of economics at Dartmouth lol.
 

schizoanalyst

Medalist
Joined
Oct 26, 2016
I think the only way to change anything is to stop lumping people into stats and by nationality and expose them as individuals. It needs to be a personal responsibility of the judge and they need to take ownership of their marks.

You can't do this because judges act strategically. If I know my fellow judge is particularly biased in favor of "X", other judges will tend to lower their scores for "X" to counter this phenomenon. If a Russian and a Ukrainian judge see a panel of Canada, America, and Germany, they will likely try to correct the North American bias of this bloc by going lower to offset it. This is an empirical phenomenon observed in most human settings: we act strategically to affect group decisions. This is why it's exceedingly difficult to identify individual bias, because you are observing strategic behavior on a panel, not a bias adjustment of their "true" score. Mathematically (for reasons I won't explain), it's very difficult to measure and quantify this strategic behavior in a rigorous way. And you'd need *way* more data than we currently have on an individual judge.

Edit: I'll emphasize, in terms of individual judging. We have maybe 9 major events per season (you cannot compare scoring across *every* event, since when the stakes shift, strategic behavior changes). How many of them is one judge in? Maybe 3? Since the end of anonymous judging, we might have an event sample size of something like 6 to draw from for one judge at major events?
 

drivingmissdaisy

Record Breaker
Joined
Feb 17, 2010
Of the problems mentioned, I think only the ban of judges guilty of wrongdoing could realistically be implemented. Throwing out the score of a judge who is from the same country as a skater brings up the problem of reconciling that with the score of a skater who doesn't have a judge from the same country on the panel. Federation insiders will always have the most opportunity to serve as international judges, and it's unrealistic to assume that anyone picked by a federation isn't going to mark their home country skaters higher. If they don't, they won't be judging for long. Having paid career judges really doesn't seem practical, either. As gkelly said, what would this career path even look like? Moreover, can the ISU even afford to keep a stable of judges available?
 

schizoanalyst

Medalist
Joined
Oct 26, 2016
Of the problems mentioned, I think only the ban of judges guilty of wrongdoing could realistically be implemented. Throwing out the score of a judge who is from the same country as a skater brings up the problem of reconciling that with the score of a skater who doesn't have a judge from the same country on the panel. Federation insiders will always have the most opportunity to serve as international judges, and it's unrealistic to assume that anyone picked by a federation isn't going to mark their home country skaters higher. If they don't, they won't be judging for long. Having paid career judges really doesn't seem practical, either. As gkelly said, what would this career path even look like? Moreover, can the ISU even afford to keep a stable of judges available?

You can do what ski jumping does, whose panels have been empirically found not to exhibit either nationalistic bias or bloc voting. Judges are not selected by national federations, but instead by the centralized body. The centralized body is sufficiently independent from the national federations, and the selection committees, being independent, don't display national favoritism and intentionally select judges who do not display bias. Because judges want to be selected for the big assignments, they behave well at smaller events so they can have the chance to score at the Olympics. Zitzewitz's research found that Olympic figure skating judges are *more biased* than the typical judge on average, but Olympic ski jumping judges are less biased.

This is the fix we need. They don't need to be professionals paid by and working for the ISU; they just need to be nominated and selected independently of the federations by a committee looking for unbiased candidates.
 

tjb

Match Penalty
Joined
Aug 22, 2017
Edit: Also, lol at the comparison to Zitzewitz as just some random guy. Even if you can't judge the quality of the statistical work (it's quite a bit more rigorous than what passes in most social science), he's a tenured professor of economics at Dartmouth lol.

and donald trump is the president of the united states of america. i bet his work about judging in figure skating would be even more awesome.
can they hire some mathematician at least?
or some "prerotation guy" from a figure skating forum. there are a lot of people on many figure skating forums who already did all the needed research.
and the quality of those forum works is sometimes much higher than what's presented in the nbc article
 

Ares

Record Breaker
Joined
Feb 22, 2016
Country
Poland
Maybe someone should tell those NBC journalists of the Belarusian judge and his behavior earlier this season ... lol.

On a more serious note, I don't think it rids the sport of biased judging if we implement more votes for or against URs etc. The problem still is that judges are not accountable and only answer to their federations.
Getting rid of biased judging is actually pretty easy. For the tech calls:
- The tech panel calls the jumps and levels just as it does now.
- They repeat each jump in slo-mo in the arena after the skate; most of the time they do that already anyway.
- They have to publish the calls and levels etc. they made before the final scores.
- Coaches of competing skaters, or the skater's own coach, can file a formal protest or objection if they think that the tech panel has overlooked a UR or wrong edge, or has wrongly given the skater a call. This all happens before the scores are announced.
- The tech panel must review the element that the coach thinks was overlooked or wrongly called. They can then decide whether they want to change their call or not.
- If they don't change the call and the coach still thinks it's blatantly unfair, then the coach can file a formal protest after the competition, and the element will be reviewed by a broader panel.
- This system would benefit fair judging, as the skaters' coaches can appeal wrong and unfair calls in time, while it can at the same time prevent the tech panel from overlooking URs, two-footed landings and edges (or levels).

PCS and GOE judging panel:
- are required to answer to journalists and coaches after the skate in an open interview
- are required to answer to each other after the competition in a public and open discussion
- are strictly reminded to judge only what the skater did that day at the competition, not name, nationality, reputation or practice.

Good proposal, but I object to the one regarding overlooked UR calls. It would poison the whole competition and taint relationships between skaters, coaches etc. I think the current restrained approach to calling out your opponents' mistakes is the only rational option. It's very rarely used, though ... the only example that springs to my mind is last year in ice dance at the European Championships, when the French filed a protest (not sure whether that's the right term here ...) asking for a -1 point deduction for Cappellini / Lanotte after the SD, as the tech panel had overlooked their mini-lift that exceeded the allowed number of revolutions or something of that sort. Later on C/L's SD score was changed.
 

schizoanalyst

Medalist
Joined
Oct 26, 2016
and donald trump is the president of the united states of america. i bet his work about judging in figure skating would be even more awesome.
can they hire some mathematician at least?
or some "prerotation guy" from a figure skating forum. there are a lot of people on many figure skating forums who already did all the needed research.
and the quality of those forum works is sometimes much higher than what they presented in the nbc article

Who the president is has nothing to do with anything, but OK. Zitzewitz wasn't hired by NBC, he did this research years ago without pay and it was published in peer-reviewed journals. Every economist is trained in econometrics. In fact, most of the major statistical science innovations of the last 50 years have come from economists! Mathematicians aren't even trained in statistical analysis and would be awful choices.

Maligning someone's credibility is cheap anyway. Why not debate the validity of his fixed effects choices? Otherwise this is just attacking someone for the sake of it when you are unable to attack their work for lack of knowledge.
 

tjb

Match Penalty
Joined
Aug 22, 2017
Who the president is has nothing to do with anything, but OK. Zitzewitz wasn't hired by NBC, he did this research years ago without pay and it was published in peer-reviewed journals. Every economist is trained in econometrics. In fact, most of the major statistical science innovations of the last 50 years have come from economists! Mathematicians aren't even trained in statistical analysis and would be awful choices.

Maligning someone's credibility is cheap anyway. Why not debate the validity of his fixed effects choices?

i can't seriously debate an article that starts with some hilarious sotnikova rant.
and goes on like this

"Zitzewitz started studying figure skating after a scandal at the 2002 Salt Lake City Olympics rocked the sport. The Russian mafia was accused of fixing a gold medal victory for Russian pairs skaters, beating out a Canadian team despite the Canadians' superior performance."

this guy must be really credible and non biased "professor" of figure skating sciences:laugh:
 

anyanka

Record Breaker
Joined
Jul 8, 2011
This is the sort of "Greatest Hits" clip I show to my non-skating (i.e. most of) friends and family who ask me, "what's the big deal with the scoring controversy?" They need a refresher of some sort. Mind you it's of course biased towards the West, but that's how a lot of the people I know consume media these days. :shrug: So I just send that to them and say "here's a brief summary, watch for five minutes, then get back to me later". Otherwise I'd be talking at them for hours about skating and they tune me out ... LOL
 

schizoanalyst

Medalist
Joined
Oct 26, 2016
i can't seriously debate an article that starts with some hilarious sotnikova rant.
and goes on like this

"Zitzewitz started studying figure skating after a scandal at the 2002 Salt Lake City Olympics rocked the sport. The Russian mafia was accused of fixing a gold medal victory for Russian pairs skaters, beating out a Canadian team despite the Canadians' superior performance."

this guy must be really credible and non biased "professor" of figure skating sciences:laugh:

Well, now I have confirmation you didn't read any of his work. He didn't write the NBC article, try reading the byline.

Edit: I'm gonna drop this, if you can't tell the differences between a journal article containing statistical analysis and a puff piece from NBC this is pointless.
 

drivingmissdaisy

Record Breaker
Joined
Feb 17, 2010
The centralized body is sufficiently independent from the national federations, and the selection committees, being independent, don't display national favoritism and intentionally select judges who do not display bias. Because judges want to be selected for the big assignments, they behave well at smaller events so they can have the chance to score at the Olympics.

I don't think that, in practice, this would help much. If you define "independent" as evidence that the judge gives marks within a close range of the mean, you could still have voting blocs drive up scores for certain skaters, forcing an "independent" judge to mark those skaters higher just because the judge wants to be closer to the mean score. Furthermore, it discourages judges from awarding outlier scores that the judge genuinely believes are accurate. For example, we sometimes see a judge award much lower TR scores for big name skaters and, usually, these lower scores are justified. It's all very complicated and I think the ISU has taken reasonable steps towards fairness, including identifying judge scores and staffing the tech panels in big events with representatives from countries who aren't expected to compete for the top spots in that event.
 