Originally Posted by Mathman
I will start a new thread about this oddity in the ISU judging system, independent of nationalistic patriotism and unrepentant fandom. (Prediction: No one will want to read it or to contribute. )
Well, at least you stated the fact that all this controversy was about nationalistic patriotism and unrepentant fandom. Can't agree more.

Originally Posted by Mathman
Just as a purely mathematical puzzle, I would be interested to know the following. I do not know what conclusion I could draw from the answer -- probably none.

Here are the total PCS for each of the nine judges for Sotnikova and Kim. (These are slightly different from what I posted earlier. I think these are right. I have a visual handicap that makes adding up lists of numbers difficult.)

SOT 48.25, 48.00, 48.00, 47.75, 46.50, 45.50, 45.00, 44.25, 44.25

KIM 48.75, 48.00, 47.75, 47.00, 46.75, 46.00, 46.00, 45.25, 42.00

Fix Sotnikova's order and, for each of the 9! = 362,880 possible orderings of Kim's, record the number x of judges that favored Sotnikova (half a point for a tie). What is the frequency distribution of x?

At the moment I cannot think of any reason why anyone would want to know this. But I think it might point my mind in the right direction to worry about it some more. I have a feeling that questions of this sort have already been extensively studied, and the result would be the same if we used the numbers 1, 2, 3, …, 9 instead. But the interesting thing would be, how does the distribution change as a function of the difference in the means? (In this example the means for the two skaters are about the same.)

The distribution of x:

x — Probability
2 — 0.5 %
2.5 — 1.3 %
3 — 1.0 %
3.5 — 9.7 %
4 — 32.4 %
4.5 — 12.9 %
5 — 24.4 %
5.5 — 3.7 %
6 — 4.4 %
6.5 — 0.2 %
7 — 0.1 %

Expectation: 4.28 (< 4.5)
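
For anyone who wants to cross-check that expectation without enumerating every ordering, here is a minimal sketch. It assumes each ordering of Kim's marks is equally likely, so each Sotnikova mark meets each Kim mark with probability 1/9 and the expectation is just the average pairwise result (win = 1, tie = 0.5). The function name and the `shift` parameter are my own, not anything from the posts above; `shift` is a hypothetical constant added to every Sotnikova mark, as one rough way to poke at the "difference in the means" question.

```python
# Mathman's per-judge PCS totals, as posted above.
SOT = [48.25, 48.00, 48.00, 47.75, 46.50, 45.50, 45.00, 44.25, 44.25]
KIM = [48.75, 48.00, 47.75, 47.00, 46.75, 46.00, 46.00, 45.25, 42.00]


def expected_favoring(fixed, shuffled, shift=0.0):
    """Expected number of judges favoring `fixed` under a random ordering
    of `shuffled`, computed pair by pair (win = 1, tie = 0.5).

    `shift` is a hypothetical constant added to every `fixed` mark.
    """
    n = len(fixed)
    score = 0.0
    for f in fixed:
        for s in shuffled:
            if f + shift > s:
                score += 1.0
            elif f + shift == s:
                score += 0.5
    # Each (f, s) pair lines up in exactly 1/n of all orderings.
    return score / n


print(round(expected_favoring(SOT, KIM), 2))  # 4.28

# How the expectation moves as Sotnikova's marks are shifted up or down.
for shift in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(shift, round(expected_favoring(SOT, KIM, shift), 2))
```

This only gives the mean, not the whole distribution, but it is a quick sanity check on any table produced by brute force.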

Originally Posted by Vanshilar
Eh okay but the numbers that I'll be using are:

SOT 48.25, 48.00, 47.75, 47.75, 46.50, 45.50, 45.00, 44.25, 44.25

KIM 48.75, 48.00, 47.75, 47.00, 46.75, 46.00, 45.75, 45.25, 42.00
x — Probability
1.5 — 0.04 %
2 — 1.0 %
2.5 — 1.9 %
3 — 12.9 %
3.5 — 10.5 %
4 — 33.2 %
4.5 — 12.1 %
5 — 21.7 %
5.5 — 3.1 %
6 — 3.4 %
6.5 — 0.1 %
7 — 0.1 %

Expectation: 4.17.

Thank you!

So in 70% of the trials there is no worse than a 5 to 4 split of the judges. That is no surprise, just from eyeballing the numbers -- pretty even, with a tiny edge to Kim in ordinals as in total. I think we are stymied now without further information from the ISU (which will not be forthcoming). If in fact it turned out that, say, there was a 6 to 3 split (probability only 2.8% in this calculation) then we could move forward.

Originally Posted by cooper
of course qwertyskates is going to react from mathman's post..took it personally and now making an accusation w/ the poster.. how predictable.. and every poster who defended sot's ogm..

that 31 goes+ for sotnikova if it's indeed true is insane.. and it looks like someone who's in payroll..

ISU should reveal the scores!! there's nothing to hide.. besides russia and the US advocates to get rid of anonymous judging.. why not try to make an example of this?? after all it was a legit win.. right??

Please take a look in the mirror. I didn't even bother to address any of your posts, because most of you are actually rather rabid and without content save for a vindictive delight in attacking others. You, my dear, are as PERSONAL as you can be in your above post in your relentless attack on me.

Please you Yuna fans, have some class,

I point out that Mathman's scenario 2 of assuming cheating on the part of judges is based on a twisted pretzel logic that would be almost implausible. Pointing out the fallacy of this logic is NOT attacking him, ok?

Someone who ISN'T YUNA winning the OGM is not equal to cheating, ok?

As it turns out, statistically, based on PCS which are more evenly distributed, the probability of 3 vs 6 judges acting to upset the entire contest is unsurprisingly low!

Another fact I want to remind everyone of is that the SP judging in fact favored Adelina MORE, as Yuna had a higher BV yet her margin of victory was very small. The panel of SP judges consisted mostly of non-ex-Soviet Europeans. I have yet to see a blow-by-blow "proof" of cheating based on the SP judging and judges.

Originally Posted by Vanshilar
I can't fathom the cognitive dissonance of someone who can put these two sentences side-by-side.

It's easy enough to write code to iterate through different possibilities, but I'm not sure what the study is exactly supposed to be. Is it trying to figure out that assuming each permutation were equally likely, what were the rankings for each skater from each judge? How many skaters would be compared (and which)?

Like I said, almost ALL of the world's mainstream press and media reported on the results of Sochi, statistically, how many % questioned the results?

Yuna fans claimed a MAJORITY of public opinion agrees with them, based on repetitious echo chamber pinging of the same quotes over and over, opinions actively sought and amplified. Truth lies in a real statistical compilation of ascertaining Sochi results dissent as a % of total reporting.

And I don't mean Korea or Russia = World, so whose cognitive dissonance here, really?

Originally Posted by qwertyskates
I point out that Mathman's scenario 2 of assuming cheating on the part of judges is based on a twisted pretzel logic that would be almost implausible.
I still think that you misunderstood my earlier post. My claim is that we cannot draw any conclusions based on an examination of the protocols. In support of this claim I presented the two extreme cases, each statistically very unlikely (as verified by Rhodium's tables) under the assumption of fair let-the-chips-fall-as-they-may judging.

It is neither logical nor illogical to speculate about whether the judges are saints or sinners. This question is beyond the scope of both logic and statistics, without further information.

That's my wishy-washy story and I'm sticking to it!

Originally Posted by Mathman
In support of this claim I presented the two extreme cases, each statistically very unlikely (as verified by Rhodium's tables) under the assumption of fair let-the-chips-fall-as-they-may judging.
I think that's the problem with the analysis, the assumption that the judging was random, fair, and unbiased. If there was willful intent on the part of conspirators, the most unlikely of scenarios would become much more likely, if not probable.

Originally Posted by Ven
I think that's the problem with the analysis, the assumption that the judging was random, fair, and unbiased. If there was willful intent on the part of conspirators, the most unlikely of scenarios would become much more likely, if not probable.
That is what we cannot tell by looking at the numbers alone. The point of the exercise was to illustrate that fact.

Edited to add: By the way, I think that is why the Korean Federation took the route of basing their complaint on the slender reed of hugs and kinship. If you challenge the marks of the judges and the decisions of the technical panel, all they have to say is, "Well, yes, I really did think that Sotnikova's wonderful program deserved a 9.5 in choreography. I don't really care what other judges thought about it at Cup of Russia." Or, "Did I miss an edge call? Oh, darn!" This is just chit-chat, not evidence of anything.

Originally Posted by qwertyskates
It's one thing to argue coherently for your stance, it's another to throw out wildly exaggerated claims.

The media outlets that questioned the results and reported on the suspicions of fraudulent judging included (off the top of my head): CNN, NYTimes (yes, there was an article other than the jumps comparison), the Atlantic, Yahoo, Vanity Fair, New Yorker, USA Today, ESPN, Washington Post, Wall Street Journal, the Chicago Tribune, Slate, LATimes. That covers just about all the major news outlets in the United States, and together they reach nearly all of the U.S. population. Add to that, several of the most popular skating blogs (The Skating Lesson being one) voiced dissent with the results, and the majority of the commentators (who aren't Korean, judging by their grasp of English grammar) agreed with the dissent. Outside of the US: International Business Times, France's L'Équipe, China's Xinhua, Japan's Japan Times, the German newspapers, Britain's BBC, which are all national outlets.

Most of these newspapers didn't tiptoe around the issue. ESPN's headline was "Russian Homecooking", and JapanTimes was "Judges steals Kim's gold and hands it to Sotnikova!". So I'm not sure where you're getting your grandiose claim of "ALMOST ALL the world's mainstream press concurred with the results". What is your statistical notion of "majority of mainstream opinion"? 2?%?

Originally Posted by qwertyskates
I think >> THIS << may be pertinent

Originally Posted by YesWay
I think >> THIS << may be pertinent

So the data I used was (let me know if I messed up any of the numbers):

48.00 47.75 45.50 44.25 47.75 46.50 48.25 44.25 45.00 Adelina (avg 46.36)
45.75 42.00 48.75 45.25 47.00 47.75 46.75 46.00 48.00 Yuna (avg 46.36)
44.00 45.00 45.75 45.50 47.75 47.50 47.00 48.50 42.75 Carolina (avg 45.97)
42.75 46.50 44.00 41.25 45.50 42.25 40.75 45.75 44.75 Yulia (avg 43.72)
42.75 44.25 43.00 44.25 47.00 41.00 41.25 44.25 44.75 Mao (avg 43.61)

I found it interesting that Adelina and Yuna had the exact same total raw scores by the judges, down to the 0.25 point; the reason why Yuna ended up with a slightly higher PCS was that the extremes were eliminated. Carolina's total PCS was slightly lower.
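
A quick way to see that effect with the numbers above, as a rough sketch only. As I understand it, the real trimming drops the high and low mark within each program component, not from each judge's combined total, and the per-component marks are not in this table. So the `trimmed_mean` helper below (my own name) is a simplification applied to the per-judge totals, and it will not reproduce the official PCS.

```python
# Per-judge PCS totals from the table above.
ADELINA = [48.00, 47.75, 45.50, 44.25, 47.75, 46.50, 48.25, 44.25, 45.00]
YUNA = [45.75, 42.00, 48.75, 45.25, 47.00, 47.75, 46.75, 46.00, 48.00]


def trimmed_mean(marks):
    """Mean after dropping one highest and one lowest mark.

    Simplifying assumption: trims the judges' totals, whereas the actual
    system trims within each component.
    """
    kept = sorted(marks)[1:-1]
    return sum(kept) / len(kept)


# Raw totals are identical for the two skaters.
print(sum(ADELINA), sum(YUNA))  # 417.25 417.25

# Dropping the extremes removes Yuna's 42.00 outlier, which lifts her average.
print(round(trimmed_mean(ADELINA), 2), round(trimmed_mean(YUNA), 2))  # 46.39 46.64
```

The direction matches what was observed (Yuna ends up slightly ahead once extremes are dropped), even though the size of the gap here is not the official one.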

So onto the permutations, showing the number of judges favoring the second skater (tie was considered 0.5, so equally favored would mean 4.5), the percentage, and the number of permutations out of 362880:

0 = Yuna, 9 = Adelina:
0.0 00.00% 0
0.5 00.00% 0
1.0 00.00% 0
1.5 00.04% 144
2.0 00.95% 3456
2.5 01.94% 7056
3.0 12.94% 46944
3.5 10.48% 38016
4.0 33.17% 120384
4.5 12.06% 43776
5.0 21.67% 78624
5.5 03.13% 11376
6.0 03.41% 12384
6.5 00.12% 432
7.0 00.08% 288
7.5 00.00% 0
8.0 00.00% 0
8.5 00.00% 0
9.0 00.00% 0
mean: 4.1667
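
For reference, a table like the one above can be produced by brute force along these lines. This is a minimal sketch, not necessarily the code actually used: it holds Adelina's marks in place, runs through all 9! orderings of Yuna's marks (treating repeated values as distinct judges, so the total is 362880), and tallies x = number of judges favoring Adelina, with a tie counted as 0.5. The function and variable names are my own.

```python
from collections import Counter
from itertools import permutations

# Per-judge PCS totals from the data table above.
ADELINA = [48.00, 47.75, 45.50, 44.25, 47.75, 46.50, 48.25, 44.25, 45.00]
YUNA = [45.75, 42.00, 48.75, 45.25, 47.00, 47.75, 46.75, 46.00, 48.00]


def favor_counts(fixed, shuffled):
    """Tally orderings of `shuffled` by x = judges favoring `fixed`."""
    counts = Counter()
    for perm in permutations(shuffled):
        x = sum(
            1.0 if f > s else 0.5 if f == s else 0.0
            for f, s in zip(fixed, perm)
        )
        counts[x] += 1
    return counts


counts = favor_counts(ADELINA, YUNA)
total = sum(counts.values())
mean_x = sum(x * n for x, n in counts.items()) / total

# Print in the same layout as the tables: x, percentage, permutation count.
for x in sorted(counts):
    print(f"{x:3.1f} {100 * counts[x] / total:05.2f}% {counts[x]}")
print(f"mean: {mean_x:.4f}")
```

Swapping in another pair of skaters only means changing the two lists. It takes a second or two in plain Python, which is fine for nine judges.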

0 = Carolina, 9 = Adelina
0.0 00.00% 0
0.5 00.00% 0
1.0 00.00% 0
1.5 00.00% 0
2.0 00.00% 0
2.5 00.05% 192
3.0 01.07% 3888
3.5 02.29% 8304
4.0 13.44% 48768
4.5 12.22% 44352
5.0 32.04% 116256
5.5 13.97% 50688
6.0 18.81% 68256
6.5 03.49% 12672
7.0 02.46% 8928
7.5 00.12% 432
8.0 00.04% 144
8.5 00.00% 0
9.0 00.00% 0
mean: 5.1111

0 = Carolina, 9 = Yuna
0.0 00.00% 0
0.5 00.00% 0
1.0 00.00% 0
1.5 00.00% 0
2.0 00.00% 0
2.5 00.01% 48
3.0 00.72% 2616
3.5 01.16% 4200
4.0 11.27% 40896
4.5 08.31% 30144
5.0 33.07% 120000
5.5 12.09% 43872
6.0 24.60% 89280
6.5 04.01% 14544
7.0 04.42% 16056
7.5 00.22% 792
8.0 00.12% 432
8.5 00.00% 0
9.0 00.00% 0
mean: 5.2778

0 = Mao, 9 = Yulia
0.0 00.00% 0
0.5 00.00% 0
1.0 00.00% 0
1.5 00.00% 0
2.0 00.00% 0
2.5 00.27% 972
3.0 05.03% 18270
3.5 05.23% 18978
4.0 27.94% 101376
4.5 12.84% 46608
5.0 32.03% 116220
5.5 06.76% 24540
6.0 08.78% 31872
6.5 00.68% 2484
7.0 00.42% 1542
7.5 00.00% 18
8.0 00.00% 0
8.5 00.00% 0
9.0 00.00% 0
mean: 4.6111

Basically it was around 5-4 for Yuna over Adelina, 5-4 for Yuna over Carolina, and 5-4 for Adelina over Carolina in terms of the PCS. Between Mao and Yulia it was pretty evenly matched. I'm not sure how well this could actually detect any bias by judges though -- to me the clearest thing would be if we knew which judge gave which scores, so that we can look at correlation and such. Of course, that's impossible under anonymous judging, where the correlation can be hidden inside of the overall variation between judges.

Thank you for doing all this work, Vanshilar.

No surprises.

Still, I am pleased to see these results. If it were later revealed that, say, 7 judges gave Yuna higher marks than Adelina, we could say, "Whoa, Jack -- that would happen less than one-tenth of one percent of the time if the numbers, high and low, are not artificially matched up."

I found it interesting that Adelina and Yuna had the exact same total raw scores by the judges, down to the 0.25 point; the reason why Yuna ended up with a slightly higher PCS was that the extremes were eliminated.
I noticed that, too. If the effect were more pronounced I suppose that might count for something. Too slight, though.

