
Proposed CoP Changes for Singles

Blades of Passion

Skating is Art, if you let it be
Record Breaker
Joined
Sep 14, 2008
Country
France
In the end I don't care so much about exactly how the Program Components are categorized, as long as the impact of the skater's performance and program is weighted fairly.

The one true question, when you get down to it, is "how much did I like watching this person skate/how much do I care?" Edge quality, speed, energy, body line, expression, cohesive and interesting choreography, nuance, emotional depth...these things just form the parts of that question.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
Mathman said:
For the two PCSs, again the judges could use the same 10-point scale with quarter-point gradations, and the scaling factors (2.0, 1.6, 1.0, and .8) could be adjusted as appropriate without the judges having to learn a new system. Actually, I would prefer fewer gradations in the PCSs, rather than more. I just don't think it is possible for a judge or anyone else to distinguish objectively between a program that "deserved" 5.50 rather than 5.25 for Interpretation, say.

Half-point gradations would be better, I think, for the overall "first program mark" and "second program mark."

I really hate this idea, because it gives the judges almost no control in distinguishing between skaters who are close in overall performance. Judges would end up having to tie skaters in PCS even when they have a clear opinion on which one was better, so often only the TES would end up determining the results.

Under the ordinal system, a judge could give three skaters marks such as 5.7/5.6, 5.5/5.8, and 5.6/5.7 to rank them third, first, and second, respectively, in a long program compared with each other. These skaters all have a total of 11.3 from that judge, but the tiebreaker makes it clear which one in his opinion should rank highest of the three.
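
(To make the tiebreak concrete, here is a minimal Python sketch -- the marks are the ones above; the skater labels and the little sorting helper are just for illustration, not anything from the rulebook:)

```python
# All three totals are 11.3, so the second (presentation) mark decides this
# judge's free-skate ordering under the old 6.0 system.
skaters = {"A": (5.7, 5.6), "B": (5.5, 5.8), "C": (5.6, 5.7)}

def rank_key(marks):
    technical, presentation = marks
    return (-(technical + presentation), -presentation)  # total first, presentation breaks ties

ranking = sorted(skaters, key=lambda name: rank_key(skaters[name]))
print(ranking)  # ['B', 'C', 'A'] -- first, second, third despite identical totals
```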

Under today's scale, all those skaters might end up with PCS in the 7s and a few verging into the 8s.

So with your proposal, maybe these three skaters could get scores of 8.0/7.0, 7.0/8.0, and 7.5/7.5 for skating skills/transitions and performance/execution/choreography/interpretation, respectively.

If the factors for the two components are the same, they're absolutely tied on PCS. The TES will determine the results.

If one of the factors is larger, then that will have more effect. But it might not be the area where the judge thought there were the most significant differences between the skaters.
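
(A quick sketch of that arithmetic, with invented factors: with equal factors the three skaters tie on PCS, and with unequal factors the component carrying the bigger factor decides, whether or not that is where the judge saw the real difference:)

```python
# Hypothetical two-component PCS using the three sets of marks from above.
scores = {"skater1": (8.0, 7.0), "skater2": (7.0, 8.0), "skater3": (7.5, 7.5)}

def factored_pcs(marks, factors):
    return sum(mark * factor for mark, factor in zip(marks, factors))

for factors in [(1.0, 1.0), (1.2, 0.8)]:   # equal factors, then an invented unequal pair
    print(factors, {name: factored_pcs(marks, factors) for name, marks in scores.items()})
# (1.0, 1.0): all three tie at 15.0
# (1.2, 0.8): 15.2, 14.8, 15.0 -- the first component now decides the PCS order
```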

You're taking decision-making power away from the judges.
 

amateur

Final Flight
Joined
Nov 5, 2006
^^ I agree. And really, is a gradation of 0.25 really that difficult to manage?

For a high-level senior international, we could say that a score below, say, 2.5 is meaningless or should at least be rare (or am I being naive here? - I almost never get to watch the bottom-of-the-pack competitors...). And above 9.0 should also be rather rare and exceptional, one should think. So, from 2.5 to 9.0 (or let's say from 0.5 to 7.0 for a lower-level competition), if the gradations are by 0.5, there are just 14 grades that can be given - and since the large majority of the skaters will realistically fall between, say, 4.0 and 8.5, just 10 different scores are possible. May as well assign a whole number between 1 and 10 and weight it differently. Surely the judges are capable of more nuanced opinions than that? And the idea of comparison - preferences being expressed among skaters close in ability - is, at least historically, an important aspect of this sport.

Now, I know the actual mark the skater receives on a component is an average of 5 judges' (as currently counted) opinions, so that if the panel is split between giving a skater 8.0 and 8.5 on Skating Skills, the actual mark received could be 8.1, 8.2, 8.3, or 8.4, according to how many judges lean each way. Maybe the ideological argument here is whether it is more important to have each judge's preference influence the results down to the (admittedly somewhat meaningless) 100th of a point, or whether it suffices to have the judges' consensus on what general "tier" or level the skating falls into, since the consensuses (consensii?) or averages will tend to fall in between and show more variation than the grades available. Is it the business of the judges anymore to be ranking and showing preference (as opposed to just rewarding quality in a broader sense across defined categories)? That could be a valid debate, but I'm not sure that was where we were wanting to go. Though it is rather fundamental to CoP, or the design of any system, isn't it?
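
(For illustration, the arithmetic of that split panel -- five counted judges, everyone giving either 8.0 or 8.5:)

```python
# How a 5-judge split between 8.0 and 8.5 lands on the in-between published marks.
def panel_average(marks):
    return sum(marks) / len(marks)

for judges_at_8_5 in range(6):
    marks = [8.5] * judges_at_8_5 + [8.0] * (5 - judges_at_8_5)
    print(judges_at_8_5, panel_average(marks))
# 0 -> 8.0, 1 -> 8.1, 2 -> 8.2, 3 -> 8.3, 4 -> 8.4, 5 -> 8.5
```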


If we are to claim that a 7.0 level of skating skill should be readily distinguishable to a judge from that of a 6.5, what do we do about the grey area: a performance that the judge might feel "reached" the 7.0 level but not consistently throughout, yet was still better than the last skater who was a 6.5 through and through? It is always useful, I would think, to have grades available that fall in between the readily distinguishable marks. What's the difference between a B+ and an A-?


If we are to stick to 5 program components, and if these were to be judged "properly" (as we all know they currently are not), then IMO there could be more of an argument for half-point gradations, in terms of simplifying what would appear (judging by the way it is used...) to be a slightly overwhelming judging task. With just 2 or 3 components, though (not to say it then becomes "easy" to judge, mind you), it just seems like an oversimplification, and it takes power away from judges to indicate preference, as gkelly suggests.

Bah! At the end of the day, it is all so arbitrary and imperfect anyway.
We could just as easily have gradations of 0.1 on a 6.0 scale, and factor them appropriately... (to be honest, for a few reasons, I would prefer just that... maybe I'll get into that one another time)
 
Mathman

Joined
Jun 21, 2003
Under the ordinal system, a judge could give three skaters marks such as 5.7/5.6, 5.5/5.8, and 5.6/5.7 to rank them third, first, and second, respectively, in a long program compared with each other. These skaters all have a total of 11.3 from that judge, but the tiebreaker makes it clear which one in his opinion should rank highest of the three.

To me, this is an argument in favor of ordinal judging and against point-total judging altogether.

Yes, under ordinal judging, the only thing that counts is whether the judge thinks skater A was better than skater B. Once that judgement has been made, the judge can manipulate the 5.8s and 5.7s however he/she wants to, to make it come out right.

You can always use finer gradations when you are comparing one against another than when comparing each against remembered and estimated objective criteria. Think of a horse race. One horse runs around the track and it is your job to guess -- without a stopwatch -- how fast he ran.

Compare that to a two-horse race. If one edges out the other by a neck, it is perfectly clear which horse won, even though the race was close.

So with your proposal, maybe these three skaters could get scores of 8.0/7.0, 7.0/8.0, and 7.5/7.5 for skating skills/transitions and performance/execution/choreography/interpretation, respectively.

If the factors for the two components are the same, they're absolutely tied on PCS. The TES will determine the results.

Well, that's the CoP. If two skaters get the same number of points on PCSs, then the TESs will decide the outcome -- and vice versa.

amateur said:
So, from 2.5 to 9.0, (or let's say from 0.5 to 7.0 for a lower-level competition) if the gradations are by 0.5, there are just 14 grades that can be given - and since the large majority of the skaters will realistically fall between say 4.0 and 8.5, just 10 different scores possible. May as well assign a whole number between 1 and 10, and weigh it differently - surely the judges are capable of more nuanced opinions than that?

No, I don't think so. Not if you are comparing the skaters, not one against the other, but against a list of bullets published in an ISU document.

I think 1 through 6 would be possible, but not 1 through 10.

Again, I have to ask this question. I read all the ISU bullets about what constitutes good Interpretation. Then I see a skating performance. Referring to the ISU rules, was that a 6.25 performance or a 6.50 performance? Can you tell me why it was worth only 6.25 points and not 6.50?
 

gkelly

Record Breaker
Joined
Jul 26, 2003
For a high-level senior international, we could say that a score of below say 2.5 is meaningless or should at least be rare (or am I being naive here? - I almost never get to watch the bottom-of-the pack competitors...).

2.5 means they're not really senior-level skaters. Or even junior.

Occasionally you'll get a skater at those levels who has a really bad day and deserves a mark that low or lower for one or more components.

If that was the best they could do, they wouldn't be competing at that level.

Unless their federation was desperate to send someone to international events.

At ISU championships you'll occasionally see some skaters with marks that low, at least from the stricter judges, in the bottom of the short program standings.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
Well, that's the CoP. If two skaters get the same number of points on PCSs, then the TESs will decide the outcome -- and vice versa.

But the judge doesn't actually think that the skaters were identical in PCS qualities.

He can distinguish that Isabelle had great skating skills, a few impressive transitions, poor carriage and body line but good projection and physical/emotional/intellectual involvement, pretty good concept and layout to the choreography and phrasing of movement to the music.

Lucy's skating skills were just pretty good and her transitions were less impressive, but she had better carriage/line, great emotional involvement, very detailed choreography in terms of storytelling, phrasing, and point of view but less interesting patterning/use of the ice surface, and stunningly nuanced interpretation of the music.

He could reflect those differences with scores like 8.0, 7.5, 7.25, 7.5, 7.25 (total 37.5 before factoring) for Isabelle and 7.25, 6.75, 8.0, 7.75, 8.5 (total 38.25) for Lucy. That way he can give Lucy a slightly higher total PCS, enough to make up for a small difference in technical content.
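
(Spelled out as a little sketch, with 2.0 used as the segment factor simply because it's one of the factors mentioned earlier in the thread:)

```python
# Five component marks per skater, summed and then factored.
components = {
    "Isabelle": [8.0, 7.5, 7.25, 7.5, 7.25],   # SS, TR, PE, CH, IN
    "Lucy":     [7.25, 6.75, 8.0, 7.75, 8.5],
}
factor = 2.0   # illustrative segment factor
for name, marks in components.items():
    raw = sum(marks)
    print(name, raw, raw * factor)
# Isabelle: 37.5 raw -> 75.0 factored; Lucy: 38.25 raw -> 76.5 factored
```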

With only two components and 0.5 increments, he can give 8.0/7.0 and 7.0/8.0 and tie them on PCS. Or he can give 8.0/7.0 and 7.5/8.0 to give Lucy more PCS points than Isabelle. But that isn't really reflective of what he thought of the quality of Lucy's skating and transitions, if he already gave the previous skater, Virginia, 7.0 for skating/transitions and thought that Lucy was no better than Virginia in that area.


Again, I have to ask this question. I read all the ISU bullets about what constitutes good Interpretation. Then I see a skating performance. Referring to the ISU rules, was that a 6.25 performance or a 6.50 performance? Can you tell me why it was worth only 6.25 points and not 6.50?

It's not just meeting bullet points -- most of these areas are qualitative. They happen in an analogue reality and the judge perceives them at an analogue level, but they have to be translated into a digital score.

From watching thousands of performances, judges can develop their own sense of what a 6.0 standard is and what a 7.0 standard is. So if they see a performance that falls somewhere in between, they have to make a decision about what score to give.

All judges aren't going to have exactly the same sense of what a 6.0 vs. a 7.0 performance is, but they'll usually be in the same general range.

At the average Grand Prix event, for example, you might have half the ladies fall somewhere between judge John's sense of 6.0 and 7.0 quality for most aspects of their performances.

So how is Judge John going to score six skaters who all fall somewhere between 6.0 and 7.0? Does he throw up his hands, give them all 6.5 for both of your components, and let the TES sort them all out?

Or is it more useful for him to say skater Minnie was smack in between 6.0 and 7.0 in skating skills (i.e., 6.5), closer to 6.0 in interpretation (i.e., 6.25), and closer to 7.0 (6.75) in performance/execution?

Skater Jeri was so close to 7.0 for interpretation that he'll just go ahead and give her the full 7.0 in that area; the performance/execution was not quite as close, certainly no better than Minnie's, so he'll give 6.75 there.

And so forth.
 

amateur

Final Flight
Joined
Nov 5, 2006
Again, I have to ask this question. I read all the ISU bullets about what constitutes good Interpretation. Then I see a skating performance. Referring to the ISU rules, was that a 6.25 performance or a 6.50 performance? Can you tell me why it was worth only 6.25 points and not 6.50?


It depends on whether we are judging skates purely against the bullet points, or also (even if only to a subtler degree than under an ordinal system) against each other. I see that you are saying that a points system should do away with the concept of comparing skaters, whereas gkelly talks about preserving the notion of comparing skaters to each other. Both are valid points of view, and something that should be made clear regarding the aims of the system (or future incarnations), if it is not already. (CoP in its fundamental concept, I would imagine, is probably meant to side with Mathman, but the "soul" and tradition of figure skating would have one side with gkelly, the way I see it.)

But also to consider: a skating performance is long and not necessarily of a consistent quality all the way through. What if it wavers between 6.5 and 7.0 quality? Also to be made clear is whether you're talking just about PCS as it is currently conceived/scored with its 5 categories (where, while still debatable, I might tend to agree more with your point), or whether you equally think this diminishing of gradations should apply to the "reimagined" PCS we've alluded to, with its fewer and broader categories. Slightly redundant now, but I'll repeat this paragraph from my last long post, which I might have edited since first posting:

If we are to claim that a 7.0 level of skating skill should be readily distinguishable to a judge from that of a 6.5, what do we do about the grey area: a performance that the judge might feel "reached" the 7.0 level but not consistently throughout, yet was still better than the last skater who was a 6.5 through and through? It is always useful, I would think, to have grades available that fall in between the readily distinguishable marks. What's the difference between a B+ and an A-?

Those would be my main thoughts on the matter.
 

amateur

Final Flight
Joined
Nov 5, 2006
No, I don't think so. Not if you are comparing the skaters, not one against the other, but against a list of bullets published in an ISU document.

I think 1 through 6 would be possible, but not 1 through 10.


Not sure I understand what you mean here, in response to what I wrote. Could you clarify? (Because I'm not sure whether I need to clarify what I meant)
 

gkelly

Record Breaker
Joined
Jul 26, 2003
I see that you are saying that a points system should do away with the concept of comparing skaters, whereas gkelly talks about preserving the notion of comparing skaters to each other.

I wanted to add that the judge wouldn't really be comparing the skaters to each other in this system. He's comparing them to his mental standard for 7.0 and how close they come to it. But if he just gave 6.75 to another skater for that component, he can use that as a mental check to make sure he's being consistent across this competition.
 
Mathman

Joined
Jun 21, 2003
Not sure I understand what you mean here, in response to what I wrote. Could you clarify? (Because I'm not sure whether I need to clarify what I meant)

What I meant was, I do not think that a judge can consistently and objectively distinguish among 10 different graded steps. But he probably could distinguish among 6 graded steps.

But I might be wrong about that.
 

amateur

Final Flight
Joined
Nov 5, 2006
What I meant was, I do not think that a judge can consistently and objectively distinguish among 10 different graded steps. But he probably could distinguish among 6 graded steps.

But I might be wrong about that.


So you're saying, in a general fashion, that for a particular "program component", a judge in your opinion, making his/her quick and sweeping decision as he/she must (and presuming we're talking about judging a group of skaters of the same general level of competence), can decisively judge something to be at one of about 6 levels of quality, and not necessarily make any finer distinctions? Or, otherwise stated, that even a range of marks from 4.0 to 8.0 with half-point gradations, for example, which would thus give 9 choices of marks to assign (4.0, 4.5, 5.0, etc.), represents, if we're talking in ideal terms, too much choice for a judge to be effectively and "objectively" assigning meaningful values to the various performances?
I'm still trying to clarify your meaning before I say more; I'm not entirely sure whether our comments are going in the same direction.

ETA: If this is indeed what you mean, then to me it touches upon the idea that while a points system may profess or aim for total objectivity and consistency across separate events, where the type of grading you are alluding to is the goal, in reality (analog reality, as gkelly pointed out) we are always going to be comparing skaters in an event to their competitors (even if just using a competitor's performance as a reference point providing a fresh image in the judge's mind of what is meant by something like 7.0-level Skating Skills). There are an infinite number of degrees of "slightly better than" or "not quite as good as". This is where smaller gradations become useful, manageable, meaningful - but there is certainly no need to exaggerate the complexity of the decision; there are many practical constraints, and much simplification and generalization must be done. Having "smaller but manageable" gradations, like 0.25 in this case, is a sort of compromise between the "efficient categorization" type of thinking you are talking about and the problem of giving a precise, meaningful "comparative score" reflective of opinion/judgement when differences are slight yet a judge would feel compelled to note them. Key word there being compromise. It's all a compromise really, isn't it? ;-) ....At the end of the day, a number is always arbitrary in any case. I think this whole undertaking is actually futile! :lol:
 

gkelly

Record Breaker
Joined
Jul 26, 2003
What I meant was, I do not think that a judge can consistently and objectively distinguish among 10 different graded steps. But he probably could distinguish among 6 graded steps.

In any given competition, a judge is not likely to have to distinguish among 10 different levels of skating.

What's more likely is that most of the skaters will be grouped in a range of 2-3 full points of the 10-point scale, and there might be a few outliers.

At a big ISU championship-type event, the range may be a bit wider. But the majority are still going to be in, say, the range of approximately 4.5-7.5 for seniors and maybe a point lower for juniors.

Anyone who deserves scores in the 8s and 9s is an exceptional skater in world-class terms and by definition is always an outlier.

Anyone who deserves scores in the 2s who enters a senior competition will look out of place.


What's probably more of a question is whether judges can adequately separate the different components and judge them each on their own scales rather than relating everything to the skating skills.
Several of the judges at Skate Canada sure gave it a valiant effort!
 
Mathman

Joined
Jun 21, 2003
ETA: If this is indeed what you mean, then to me it touches upon the idea that while a point system may profess or aim to express total objectivity and consistency across separate events, where the type of grading you are alluding to is the goal, in reality (analog reality as gkelly pointed out) we are always going to be comparing skaters in an event to their competitors...

That is what I am wondering. Do judges just tend to say, well, this skater was the best, so I'll give her an 8.25, and that skater was second best so I'll give her an 8.0? If so, then this is just 6.0 ordinal judging all over again. Why pretend?

Edited to add: By the way, the question came up earlier about how the mathematics would work out if there were only two program components instead of 5. It would work out absolutely perfectly with no averaging, weighting, or scaling factors at all.

For example, let's say that an outstanding man's LP might get 78 points on jumps and spins, including GOEs. He gets 8.25 from each judge on both the new Skating Skills component and the new Performance/Interpretation component.

Just add up the points. From the five judges whose marks counted in the average, he gets a total of 41.25 points for each component. (Just add up the 8.25s -- you don't have to average or anything.) His total segment score is

78 for elements (including GOEs)
41.25 Skating Skills/Transitions
41.25 Performance/Interpretation

Total 160.50 points -- just about what an outstanding men's LP gets now.

And now here's the beauty of the whole thing. :) A ladies LP will get, say, 60 points in jumps and spins, but still 41.25 and 41.25 in components. This gives a total of 142.50 -- a huge ladies LP score, but more heavily weighted toward the components, relative to the men. Viva la difference! :)
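
(In code, the whole calculation is just this -- a sketch using the 78- and 60-point element totals from the example above, with every counted judge giving 8.25/8.25:)

```python
# "Just add up the points": five counted judges, two components, no factoring.
def segment_score(tes, judge_marks):
    # judge_marks: one (skating skills/transitions, performance/interpretation) pair per counted judge
    pcs = sum(ss + pi for ss, pi in judge_marks)
    return tes + pcs

judges = [(8.25, 8.25)] * 5
print(segment_score(78, judges))   # men's LP example: 160.5
print(segment_score(60, judges))   # ladies' LP example: 142.5
```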
 

gkelly

Record Breaker
Joined
Jul 26, 2003
That is what I am wondering. Do judges just tend to say, well, this skater was the best, so I'll give her an 8.25, and that skater was second best so i'll give her an 8.0? If so, then this is just 6.0 ordinal judging all over again. Why pretend?

I don't think so, because there are too many numbers to keep track of to try to rank all the skaters in this system.

Even just with the five components, if the judge is comparing each performance to a mental standard, they're not going to be able to keep track of what marks they gave to all the previous skaters. Especially in a randomly seeded short program with a largish field. ;) They don't have an easily accessible list of all the marks for all the previous skaters, and unlike the ordinal system there's no need to go back and comb through notes to compare the current skater to all the previous ones.

Of course if there are some standout top skaters in the field it would be easier to remember giving a previous skater an 8.25, because that's not a mark that's given every day. Specific skaters and some specific marks might be memorable.

The immediately previous skater's marks might remain in memory or be easy to find in notes. It wouldn't be much of a strain to compare the second skater in the lineup to the first.

If a dishonest judge were going out of her way to try to undermark Bill and help Sam's chances, then she might make a point of remembering the marks she gave to Bill. Or if Bill gave a particularly memorable performance for one reason or another, many of the specific marks might also be memorable.

But keeping track of all five component marks for 12 or 24 or however many skaters is not really feasible.

That doesn't even take into account the TES. The judge is not going to be doing the calculations "Norma had one more triple than Roxanne, and one of Roxanne's was two-footed and probably downgraded, but Roxanne was faster and her interpretation was so much better I think she should win. [Honest evaluation based on that night's skating] How much higher do I have to mark her components to make up for the difference in jump scores?" There's no way to figure that out on the fly.

At best the judge might remember giving Norma scores in the high 7s/low 8s and mark Roxanne in a slightly higher range overall, with a different mix of which component was highest or lowest. Or remember "I gave Norma 8.0 for skating skills -- Roxanne's were pretty comparable, although good in different ways. I'll give her 8.0 as well. But Norma's interpretation was pretty wooden. I went down to 6.0 for her on that mark. Roxanne's interpretation was magical. I'm going to reward her with a 9.0." Etc.

So the judge knows he gave Roxanne significantly higher PCS overall. He knows that Norma probably had significantly higher TES. He doesn't know whether adding his GOEs and PCS for the two skaters to the base marks assigned by the tech panel would give his first place "ordinal" to Norma or Roxanne, although he may well be hoping that Roxanne prevails.
 

gkelly

Record Breaker
Joined
Jul 26, 2003
Edited to add: By the way, the question came up earlier about how the mathematics would work out if there were only two program components instead of 5. It would work out absolutely perfectly with no averaging, weighting, or scaling factors at all.

For example, let's say that an outstanding man's LP might get 78 points on jumps and spins, including GOEs. He gets an average of 8.25 from each judge on both the new Skating Skills component and the new Performance/Interpretation component.

Just add up the points. From the five judges whose marks counted in the average, he gets a total of 41.25 points for each component. (Just add up the 8.25s -- you don't have to average or anything.) His total segment score is

78 for elements (including GOEs)
41.25 Skating Skills/Transitions
41.25 Performance/Interpretation

Total 160.50 points -- just about what an outstanding men's LP gets now.

That works for the long program. For the short program, should the PCS still be 82.50? That's about twice what the TES will be. Do you want to introduce a factor of 0.5 for each component in the short program (or take the average of the two components, which amounts to the same thing)?

And it only works if there are five judges' scores that count, i.e., seven judges on the panel. That is not always the case.

E.g., on the JGP, there are nine judges on the panel and no random selection, so all the judges figure in the calculations. Same at US Nationals and other events around the world where random selection aimed at foiling nationalist gameplaying is unnecessary.
After dropping high and low for each GOE and component, there are seven scores that count for each. Do you drop the two highest and two lowest scores? Require smaller panels?

At US regionals, sometimes there are seven judges on a panel but often only six. At club competitions, sometimes five. Again, of course, none are randomly eliminated from counting through the whole event. Do you insist on exactly seven at all events even if that increases the cost of running the event? Or is five judges also allowable, in which case there will be no trimming of the high and low scores?

a huge ladies LP score, but more heavily weighted toward the components, relative to the men. Viva la difference!

This I would be OK with. :)
 
Mathman

Joined
Jun 21, 2003
Yes, for the short program there would be a .5 factoring for both men and women.

About the size of the panels, if I had my druthers there would be no random draw. For a nine-judge panel the two highest and two lowest would be thrown out, for a seven-judge panel the highest one and the lowest one would be thrown out, and for a five-judge panel there would be no trimming.

If I didn't get my druthers on this, then we would have to take the average and then multiply by five.
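
(Spelled out as a little sketch, under those assumptions -- trim two-and-two from nine judges, one-and-one from seven, nothing from five, with the "average, then multiply by five" rule as the fallback; the panel of marks is invented:)

```python
# Panel-size handling: trimming rules for 9/7/5 judges, plus the normalization fallback.
def counted_total(marks):
    marks = sorted(marks)
    if len(marks) == 9:
        marks = marks[2:-2]          # drop the two highest and two lowest
    elif len(marks) == 7:
        marks = marks[1:-1]          # drop the highest and the lowest
    return sum(marks)                # a five-judge panel is used as-is

def normalized_total(marks):
    return 5 * sum(marks) / len(marks)   # "take the average and then multiply by five"

panel_of_nine = [8.0, 8.25, 8.25, 8.5, 8.0, 7.75, 8.5, 9.0, 7.5]
print(counted_total(panel_of_nine), normalized_total(panel_of_nine))
```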
 

Blades of Passion

Skating is Art, if you let it be
Record Breaker
Joined
Sep 14, 2008
Country
France
That is what I am wondering. Do judges just tend to say, well, this skater was the best, so I'll give her an 8.25, and that skater was second best so i'll give her an 8.0?

That's an important point, Mathman.

Judges need to understand how the PCS marks work in terms of scoring.

If they think Skater A was better than Skater B on the technical mark, but Skater B was good enough artistically to deserve higher placement, then they need to understand exactly how much higher Skater B needs to be marked to place as such.

If they are just using the PCS marks as ordinals and only grading Skater B .25 or .5 higher in the PCS (when that skater might actually deserve to be much higher in comparison), then it's not going to work out correctly.

Every judge should see how many total points a skater has in the technical mark after assigning all of their GOE grades. They should also see how many points the PCS totals out to be after they have assigned grades. These scores should be recorded (by themselves, on a piece of paper) so that they can compare the marks they gave to each skater if need be.

Tying into this, judges' scores should NOT be anonymous. We should be able to see exactly what each judge is doing. There should be no random selection of scores and no dropping of the highest/lowest score either. Each judge has equal weight and scoring should not be left up to chance. Their marks not being anonymous would (hopefully) prevent them from blatantly overmarking skaters.

The overall score AND overall ranking (which is determined by the score) every judge gives to each skater should also be displayed when the scores come up. This wouldn't mean anything in terms of the actual scoring, but I DO think it would help to make CoP a bit more audience friendly. People may not know what the numbers mean, but they can at least see how the judges ranked the skaters.
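
(As a rough sketch of what that per-judge display would involve -- each judge's own TES + PCS total for every skater and the ranking those totals imply; the skaters and numbers here are invented:)

```python
# One judge's personal totals for each skater, and the ranking they imply, for display.
per_judge_scores = {                     # skater -> this judge's (TES, PCS)
    "Skater A": (72.0, 68.5),
    "Skater B": (68.0, 74.0),
    "Skater C": (70.5, 70.5),
}
totals = {name: tes + pcs for name, (tes, pcs) in per_judge_scores.items()}
for place, name in enumerate(sorted(totals, key=totals.get, reverse=True), start=1):
    print(place, name, totals[name])
# 1 Skater B 142.0 / 2 Skater C 141.0 / 3 Skater A 140.5
```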
 
Mathman

Joined
Jun 21, 2003
Every judge should see how many total points a skater has in the technical mark after assigning all of their GOE grades. They should also see how many points the PCS totals out to be after they have assigned grades. These scores should be recorded (by themselves, on a piece of paper) so that they can compare the marks they gave to each skater if need be.

To me, that strikes at the very heart of the concept of the CoP. In principle, the score that a skater gets in Interpretation should be determined solely by how well that skater interpreted the music, according to the ISU guidelines for this mark. It should not, in principle, have anything to do with the element scores that this skater has racked up, or with her other component scores.

It certainly should not have anything to do with the marks that this particular judge gave to other skaters, or with the judge's opinion about which of two contestants skated the best and deserves the higher placement. (That's ordinal judging, not CoP.)

That -- the independence of marks across components and across the field of skaters -- is the whole point of the CoP. If that kind of scoring is impossible (as I believe to be the case), then what we are saying is that the whole concept of "add up the points" is a fraud.

Blades of Passion said:
There should be no random selection of scores and no dropping of the highest/lowest score either. Each judge has equal weight and scoring should not be left up to chance.

Actually, neither the random selection nor dropping the highest and lowest introduces an additional element of chance into the outcome.

About the random selection, look at it this way. There are 1000 qualified judges in the pool. There is a random draw 3 months before the event to select 9 of those 1000 to sit at rinkside. Then later there is another random draw among those 9 to see which 7 will actually judge the competition. Each of the 1000 judges in the pool has a .007 probability of being selected as one of the seven scoring judges.

Contrast that with this model. Three months before the event, there is a random selection of seven judges from the pool of 1000. There is no random draw at the event. These seven judges judge the contest. Each of the 1000 judges in the pool has a .007 chance of being chosen. Exactly the same.

So, in my opinion (I might be wrong, but I don't think so :laugh: ), the argument against the random selection is not based on introducing an element of chance into the proceedings. The same element of chance is present no matter how you select the judges from the pool of 1000. Rather, the problem is the decrease in sample size. Statistically, it would be better to select 9 from the pool of 1000 than 7. It doesn't matter how or when the selection is made; the only consideration is that 9 is better than 7 (by about 13 percent in terms of statistical reliability).
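
(The arithmetic of the two selection procedures, for anyone who wants to check it -- a throwaway Python sketch:)

```python
# Two-stage draw (9 from 1000, then 7 of those 9) versus a direct draw of 7 from 1000:
# either way, a given judge has a 7/1000 = .007 chance of being a scoring judge.
from fractions import Fraction

p_two_stage = Fraction(9, 1000) * Fraction(7, 9)   # to rinkside, then to the scoring seven
p_one_stage = Fraction(7, 1000)                    # chosen directly
print(p_two_stage, p_one_stage, p_two_stage == p_one_stage)   # 7/1000 7/1000 True
```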

About trimming the mean by throwing out the highest and lowest, this also does not introduce any factor of blind luck into the proceedings. It just produces a statistic (the trimmed mean, rather than the full arithmetic mean) which has a little more statistical stability -- it is not as much affected by outliers, like when a judge makes a keying error, and it protects a little against cheating by individual judges and coalitions. Statistically, the trimmed mean and the untrimmed mean behave pretty much the same, except for unusually asymmetric distributions of data.
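
(A small illustration of that stability -- invented marks, with one judge fat-fingering 0.75 instead of 7.5:)

```python
# The trimmed mean barely moves for an ordinary panel, but it absorbs an obvious keying error.
def mean(marks):
    return sum(marks) / len(marks)

def trimmed_mean(marks):
    marks = sorted(marks)
    return sum(marks[1:-1]) / (len(marks) - 2)   # drop one highest, one lowest

clean      = [7.25, 7.5, 7.5, 7.75, 7.5, 7.25, 7.75]
with_error = [7.25, 0.75, 7.5, 7.75, 7.5, 7.25, 7.75]   # a 7.5 keyed in as 0.75
print(mean(clean), trimmed_mean(clean))             # 7.5 and 7.5
print(mean(with_error), trimmed_mean(with_error))   # ~6.54 versus 7.45
```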

The real statistical problem -- blind luck -- that is built into the CoP is that there are too many different scores for various things, each carrying its own statistical error term. These errors do not, on the average, cancel each other in the sum. Quite the contrary, they augment each other. So if each judge enters 1000 individual numbers over the course of a competition, all those small "sampling errors" can add up in such a way that the statistical noise swamps the thing that we are trying to measure.
 

janetfan

Match Penalty
Joined
May 15, 2009
To me, that strikes at the very heart of the concept of the CoP. In principle, the score that a skater gets in Interpretation should be determined solely by how well that skater interpreted the music, according to the ISU guidelines for this mark. It should not, in principle, have anything to do with the element scores that this skater has racked up, or with her other component scores.

it certainly should not have anything to do with the marks that this particular judge gave to other skaters, or with the judge's opinion about which of two contestants skated the best and deserves the higher placement. (That's ordinal judging, not CoP.)

That -- the independence of marks across components and across the field of skaters -- is the whole point of the CoP. If that kind of scoring is impossible (as I believe to be the case), then what we are saying is that the whole concept of "add up the points" is a fraud.



Actually, neither the random selection nor dropping the highest and lowest introduces an additional element of chance into the outcome.

The real statistical problem -- blind luck -- that is built into the CoP is that there are too many different scores for various things, each carrying its own statistical error term. These errors do not, on the average, cancel each other in the sum. Quite the contrary, they augment each other. So if each judge enters 1000 individual numbers over the course of a competition, all those small "sampling errors" can add up in such a way that the statistical noise swamps the thing that we are trying to measure.

The whole post was very interesting and also shows that CoP is not a very effective or accurate scoring system.

I have read lots of comments on this thread but if it was up to me, CoP would be thrown out. And no, I am not suggesting a return to 6.0.

But a different system is needed.
It must be more fan-friendly - if skating actually wants a chance at increasing or even winning back fans.

There are too many variables/choices in CoP. The more one has to consider, the greater the chances for errors or inconsistencies.

It was very easy to see how much of that happened this season, and defending or improving this system is a waste of time imo.

Starting over with a more streamlined system and losing the tech callers - who should not be needed if the judges are competent - would be the best bet.
 