
Thread: Ladies Free Skate and Results + GPF Finalists

  1. #31
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    Well, first I should say that I am not a statistician. I don't think that either Sandra Loosemore or George Rossano is, either. Dr. George Rossano works in aerospace engineering and Dr. Sandra Loosemore, I believe, in the theory of computation. My field is geometry with applications to cosmology. Still, I have taught statistics courses at the undergraduate and graduate levels off and on for quite a number of years.

    I am not exactly enamored of Mr. Cinquanta and the ISU, but I do appreciate a good con man when I see one. It's not so much that I am bored with debating the statistical merits of various scoring systems as that I think these debates mostly serve to deflect attention from the real issues, namely, doing everything we can to catch cheaters and banning them for life when we do. Why are people who have been caught red-handed with their fingers in the till still judges in good standing? I think that issues such as regional representation on judging panels and weakening the grip of national federations over the judging process are more important than splitting hairs about how points are tallied up.

    One of the points that Dr. Rossano is particularly concerned with is the effect of the random draw. I agree. But this is a public relations issue, not a statistical one. The public says, hey, wait a minute, doesn't this introduce an unwanted element of random chance into the mix? Not really. If you have a sitting panel of, let us say, 9 judges, it does not matter statistically whether or not an additional 5 dummy judges, whose votes are predetermined not to count, are taking up space at the judges' table. It's rather silly, of course, and it does give the public something to howl about. But statistically speaking, choosing 9 judges out of a pool of 14 by computer 15 minutes before the competition starts has the same effect as if the voting judges had been chosen months in advance by drawing names out of a hat, which is how they did it under the old system. No matter when the judges' draw takes place, or how, skaters face the same probability of obtaining a draw that is favorable to them, or not, for whatever reason.
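
    Here is a little Monte Carlo sketch of the point (Python, with a made-up pool of 50 judges -- the numbers are invented, only the comparison matters). Whether you draw the 9 voting judges directly, or first seat 14 and then draw 9 of those, each judge has the same 9/50 chance of counting:

        import random
        from collections import Counter

        POOL = range(50)      # hypothetical pool of 50 qualified judges
        TRIALS = 200_000

        direct, staged = Counter(), Counter()
        for _ in range(TRIALS):
            direct.update(random.sample(POOL, 9))      # one draw of 9, months in advance
            fourteen = random.sample(POOL, 14)         # seat 14 at the table...
            staged.update(random.sample(fourteen, 9))  # ...then draw the 9 that count

        # Either way, each judge's marks count with probability 9/50 = 0.18.
        for judge in range(5):
            print(judge, direct[judge] / TRIALS, staged[judge] / TRIALS)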

    Rossano and others go on to say that the best thing would be to increase the judges' panel to, say, 25 and count every vote. Of course it would. It would be better yet to increase it to 1000 judges, if we could find that many qualified. But it would be no better or worse first to choose 2000 judges to crowd around the table and then eliminate half of them by a random draw just before the contest started. If we have in mind the paradigm of sampling theory, the only thing that counts is the sample size, along with a guarantee that each judge has an equally likely a priori chance of being selected.

    But I have a little bit of a problem with the sampling theory model anyway. This depends on a tacit assumption that there is somehow a "true score" for every performance, that this "true score" is in principle quantifiable (as being the mean score of the hypothetical population of all possible qualified judges past, present or future, for instance), and that we can then treat the judges' panel as a sample of this population (like a political poll trying to predict the outcome of a national election).

    Statisticians like this model because for many statistics it has been thoroughly understood for 100 years -- just pop the numbers into the formula and you're done. (Like everyone who likes both numbers and skating I jumped right up after the Nebelhorn competition and ran the numbers through every test I could think of. This was amusing for a while.)

    But I am not convinced that this mythical "true score" actually exists. As we all admit, skating is subjective. In sober reality, the only thing that counts is, do these 9 judges like your performance or not. Thus the voting panel is the entire population, and everything that we learned in our statistics classes goes out the window.

    Another point at which I disagree with the critics of the CoP is about the whole secrecy issue, and whether we can tell (as easily as before) when individuals and blocs are cheating. Yes, it's true that the public does not know that judge number 4 is Joe Blow from Outer Slobovia. But if the ISU continues to give us all the details of the scoring of each element, it will be easy enough to figure out which judges (by number) were eliminated in the random draw, and then to work up statistical analyses like, "judges 2, 7 and 9 are obviously conspiring to hold down skater A." Much more to the point, IMO, is what the ISU plans to do with this information, if anything.

    Anyway, I guess the bottom line is, I don't really have a horse in this race. The mean, trimmed mean, median, ordinals -- none of this can stop people from cheating if they are determined to do so. Like Pollyanna I hold out a foolish hope that somewhere down the line the ISU folks will come to realize that they are killing the sport because the public, which after all pays the bills, thinks they're a bunch of crooks. Maybe then, when it hits them in the pocketbook, they will actually do something about it rather than just tinkering with the scoring system.

    If I had my druthers, I'd rather see the USFSA sign up with the WSF.

    Mathman

    PS. Despite this rant, I love this sport! Go AP and Jenny at U.S. Nationals!

  2. #32
    Tripping on the Podium
    Join Date
    Nov 2003
    Posts
    64
    I think these debates mostly serve to deflect attention from the real issues, namely, doing everything we can to catch cheaters and banning them for life when we do.
    I agree. And I appreciate your post. But I think you said some interesting things about the CoP that I want to address, because I find them irreconcilable:
    But I have a little bit of a problem with the sampling theory model anyway. This depends on a tacit assumption that there is somehow a "true score" for every performance, that this "true score" is in principle quantifiable (as being the mean score of the hypothetical population of all possible qualified judges past, present or future, for instance), and that we can then treat the judges' panel as a sample of this population (like a political poll trying to predict the outcome of a national election).
    ...
    But I am not convinced that this mythical "true score" actually exists. As we all admit, skating is subjective. In sober reality, the only thing that counts is, do these 9 judges like your performance or not. Thus the voting panel is the entire population, and everything that we learned in our statistics classes goes out the window.
    If I understand you correctly, we both agree that the CoP is inadequate. We both agree that a "true score" in skating cannot possibly exist. The only thing we can hope to achieve with some certainty is to rank the skaters. In effect, this is the first objective of both the ordinal and CoP systems. A secondary goal of the CoP is to give a meaningful score for each skater. Ironically, the secondary goal is the one being hyped by the ISU, but it is the first goal where the CoP is severely deficient. Deficient because results based on these scores -- of dubious value and meaning -- will themselves be dubious.

    I took your quote above to mean that it is impossible to sample a population for a true score that doesn't exist. I'm unclear, however, if we both agree that we can indeed "treat the judges' panel as a sample of this population" under the ordinal system. In the ordinal system, if a different sample of 9 judges out of a population of 1000 was taken multiple times, the system could produce samples with consistent results, i.e. ranking of skaters (but NOT assigning individual scores as in CoP).
    One of the points that Dr. Rossano is particularly concerned with is the effect of the random draw. I agree. But this is a public relations issue, not a statistical one. The public says, hey, wait a minute, doesn't this introduce an unwanted element of random chance into the mix? Not really. If you have a sitting panel of, let us say, 9 judges, it does not matter statistically whether or not an additional 5 dummy judges, whose votes are predetermined not to count, are taking up space at the judges' table. It's rather silly, of course, and it does give the public something to howl about. But statistically speaking, choosing 9 judges out of a pool of 14 by computer 15 minutes before the competition starts has the same effect as if the voting judges had been chosen months in advance by drawing names out of a hat, which is how they did it under the old system. No matter when the judges' draw takes place, or how, skaters face the same probability of obtaining a draw that is favorable to them, or not, for whatever reason.
    It is only true that a random draw is insignificant when we are talking about the ordinal system. First, we have to set aside the issue of bloc judging, which is a separate issue that confounds both systems, and in which case we do often say, "Oh but with a different set of judges, so-and-so skater would have won..." For a discussion of the ordinal system (as for the CoP), we have to assume that all judges in the panel are marking each skater the same, regardless of nationality and using the same standards and criteria. With these standards in place, we have to assume that the results of a judging panel with a 5-4 or 6-3 split are indicative of bloc judging, which is why the majority of judges' placements determines the final results under the ordinal system, and which is why the actual scores themselves don't matter. (Under the CoP, the scores matter, to the detriment of the final placements.)

    For the ordinal system only, the placements of the 9 judges ARE representative of the larger population and how the population would have judged the same event. Your argument is that a sample of a sample will be representative of the population, such that a sample of 5 marbles out of a sample of 9 marbles out of 100 would probably be equal to a sample of 5 marbles out of a population of 100. Only in the ordinal system would this be a correct assumption.

    Under the CoP system, however, every judge introduces error and variability through each little mark that they generate, therefore each judge can and will individually affect the final outcome regardless of the majority opinion of the panel. Under the ordinal system, 9 judges' placements are compared, but in the CoP system, each judge contributes MULTIPLE scores -- with error -- into a sum total. Because actual scores are being used to compute the results and not ranks, every time you take a random sample of the judges' marks in the CoP, the results of the competition will be different! As we have seen, the GP competitions have been coming down to margins of hundredths of a point. Therefore, the random count of judges' marks has a significant impact in the CoP system.
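
    To put some (entirely invented) numbers on this: suppose each of 9 judges contributes 12 marks for a skater, and each mark carries a modest random error of a quarter point. A rough Python sketch shows the panel-to-panel spread of the resulting total is many times larger than the hundredth-of-a-point margins we have been seeing:

        import random
        import statistics

        JUDGES, MARKS = 9, 12
        TRUE_MARK, MARK_SD = 5.0, 0.25   # assumed "intended" mark and per-mark error

        def panel_total():
            # average the 9 judges' marks for each element, then sum over elements
            return sum(
                statistics.mean(random.gauss(TRUE_MARK, MARK_SD) for _ in range(JUDGES))
                for _ in range(MARKS)
            )

        totals = [panel_total() for _ in range(10_000)]
        print(statistics.stdev(totals))   # about 0.29 points -- far more than 0.01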

    The ordinal system uses a majority system of ranks to determine the results. The ordinals eliminate error that would be inherent in raw scores as used in the CoP. The unfortunate conclusion is that it is not possible in any fair way to use an absolute point system in our subjective sport. It'd be nice if it were possible, but it's not, and the ISU is misleading people. The scoring can't be done systematically on such a micro level without enormous amounts of significant error introduced into the results. Secondly, the ISU claims that the point system gives a more meaningful measure of a skater's performance. Unfortunately, this cannot be true, because in 20 out of 20 draws of the hat, the competition results will be different. This is not true with the ordinal system, except in the case of bloc judging, which must be dealt with irrespective of the judging system being used.

    Under CoP, the random count of the scores by itself is a fault considering that the sample size is small to begin with. But in addition, even if we assume that the judges aren't cheating, there is the issue of human error in judgment. The ordinal system effectively eliminates this error by refining the judges' marks into relative placements. The CoP does not eliminate error, despite the trimmed mean, since the mean is not a robust measure of central tendency in small samples, and since EVERY judge's mark will have error and will introduce its own error into the sum. Therefore no amount of randomization or dropping scores will change the fact that the CoP's results are unreliable. In the ordinal system, the majority count should effectively eliminate human error and produce reliable results, consistent results time and again.
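
    A small sketch of why the trimmed mean offers so little protection with only 9 marks (made-up numbers): dropping the single high and low still lets a second generous mark survive into the average.

        def trimmed_mean(marks):
            s = sorted(marks)
            inner = s[1:-1]              # drop one high and one low mark
            return sum(inner) / len(inner)

        honest = [5.0, 5.0, 5.25, 5.0, 4.75, 5.0, 5.25, 5.0, 5.0]
        skewed = [5.0, 5.0, 5.25, 5.0, 4.75, 5.0, 6.0, 6.0, 5.0]  # two generous judges

        print(trimmed_mean(honest))   # about 5.04
        print(trimmed_mean(skewed))   # about 5.18 -- the trim removed only one 6.0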

    One of the reasons that the CoP is inherently faulty is that it is not complete in its objective. The first part, TES, consists of the technical components, which are judged element by element as the skater's program progresses. It is more objective than the 5 presentation component scores, which must be judged at the end of the skater's performance, just like the way it was in the ordinal system, except now there are 5 subjective marks instead of 1. These component scores are all as subjective and liable to human error as the single presentation mark in the 6.0 system. The difference is that these 5 subjective marks are added together to form part of a total that is touted to be an objective measure of performance. This is just not possible given the subjective nature of judging the presentation components. The total is not equal to the sum of its parts.

  3. #33
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    Give me a day or two to ponder that post, OK?

    Hey Rgirl, are you reading this? (Rgirl is my Golden Skate nemesis on this issue.)

    MM

  4. #34
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    "Beware of Mathematicians and all those who make false prophesies. The danger already exists that the Mathematicians have made a covenant with the Devil to darken the spirit and to confine mankind in the bonds of Hell." -- St. Augustine

    Well, I'll do my best to try to respond to some of the points that you raise. I think my overall position is that the CoP has problems and so do the various systems based on ordinal placement, but none of it matters very much in view of the real problems that plague the sport of figure skating and the ISU.
    But I think you said some interesting things about the CoP that I want to address, because I find them irreconcilable.
    I think this is the price we pay for living in the real world. No matter how clever we think we are, all of our mathematical and statistical models are very quickly exposed as being just that -- models (in the sense of a model airplane, say) of something that can't really be captured in such a simple fashion. The great Scottish philosopher David Hume staked his reputation on his classic work, A Treatise of Human Nature, in three volumes. The thrust of volume 1, proved in a couple hundred pages of close argument, was that there is no such thing as causality. Volume 3, on ethics, begins: no theory of ethics is possible without causality, so forget everything I said in volume 1.

    So it is with statistics. We pretend that something exists (a true mean or a true ordinal ranking), even though we know it doesn't, because otherwise we couldn't carry on these discussions.
    I took your quote above to mean that it is impossible to sample a population for a true score that doesn't exist. I'm unclear, however, if we both agree that we can indeed "treat the judges' panel as a sample of this population" under the ordinal system.
    No, I don't think so. I think that the same arguments against the reliability of CoP-type scores weigh also against ordinal placements.
    In the ordinal system, if a different sample of 9 judges out of a population of 1000 was taken multiple times, the system could produce samples with consistent results, i.e. ranking of skaters (but NOT assigning individual scores as in CoP).
    I think that such a sampling distribution would have variations like any other sampling distribution. True, two different panels could give the same rankings, while it would be virtually impossible to achieve a perfect match of CoP scores. But I do not see any reason to think that there would be less variation in one system compared to the other in the final outcomes of the contest if we looked at all 2.7 * 10^21 possible ways of choosing a 9 judge panel out of our 1000 candidates.
    It is only true that a random draw is insignificant when we are talking about the ordinal system.
    This is really the only point that I am fairly confident about. Any statistic whatever, be it mean, median, ordinal ranking or whatever, will have the same sampling distribution no matter how convoluted the process is of selecting the sample, so long as each prospective judge has an equal chance of being included in the final group of nine.

    Imagine the most extreme case of random draw. At 0.010 seconds before the program begins, the panel of 1000 is randomly cut down to 999. A thousandth of a second later, it is cut down to 998.... Finally only nine are left. The scores of these nine count and all the other 991 sit at the table, having been made fools of by Speedy, pretending to mark. No matter what statistic you extract from the final sample, the probability distribution of that statistic will be exactly the same as if you had, say, rolled dice for it three months earlier. In either case, every possible 9-judge combination out of the original 1000 will have an equally likely chance of being the final voting panel (namely, 1 chance in 2.7*10^21, LOL).

    It is not that the effect of having the random draw in two stages is "insignificant," rather that it is impossible for it to have any effect at all.
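
    For the curious, the counts are easy to check (Python 3.8 or later for math.comb):

        import math

        print(math.comb(1000, 9))   # about 2.7 * 10**21 possible 9-judge panels
        print(math.comb(14, 9))     # 2002 ways to pick the 9 who count out of 14
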
    For a discussion of the ordinal system (as for the CoP), we have to assume that all judges in the panel are marking each skater the same, regardless of nationality and using the same standards and criteria. With these standards in place, we have to assume that the results of a judging panel with a 5-4 or 6-3 split are indicative of bloc judging,
    I think that each judge should be judging fairly from one skater to another, but the judges need not agree among themselves. Therefore a 5-4 or 6-3 split may indicate nothing more than a difference of opinion.
    For the ordinal system only, the placements of the 9 judges ARE representative of the larger population and how the population would have judged the same event.
    I do not see how this can possibly be asserted. For any statistic whatever which admits of variation within a population, there is always an associated sampling error, by which we understand the difference between the population statistic and the sample statistic. The expected value of the sampling error (the "standard error") is typically easy to quantify for most commonly arising statistics (sigma/sqrt(n) for the mean, sigma*sqrt(pi/(2n)) for the median, etc., etc.) Non-parametric statistics such as ordinal rankings are certainly not immune to this phenomenon.

    Did you mean to say MIGHT BE instead of ARE?
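
    Anyone who wants to see those standard errors in action can simulate them (sigma = 1, normal population; the median formula is asymptotic, so it is only approximate at n = 9):

        import math
        import random
        import statistics

        n, TRIALS = 9, 50_000
        means, medians = [], []
        for _ in range(TRIALS):
            sample = [random.gauss(0, 1) for _ in range(n)]
            means.append(statistics.mean(sample))
            medians.append(statistics.median(sample))

        print(statistics.stdev(means), 1 / math.sqrt(n))                # both about 0.33
        print(statistics.stdev(medians), math.sqrt(math.pi / (2 * n)))  # both about 0.4
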
    Your argument is that a sample of a sample will be representative of the population, such that a sample of 5 marbles out of a sample of 9 marbles out of 100 would probably be equal to a sample of 5 marbles out of a population of 100. Only in the ordinal system would this be a correct assumption.
    On the contrary, this is absolutely and incontestably true for any statistic whatever.

    Choose 9 out of 100, then 5 out of 9. What is the probability that any one particular 5-judge sample ends up as the sitting panel?

    Answer: Our particular 5-judge panel has 1 chance out of 597520 of being included in the 9-judge panel, then 1 chance out of 126 of surviving the final cut (the hypergeometric distribution). So the probability of that particular set of 5 marbles ending up as the one that counts is 1 chance in 597520*126 = 75287520.

    Suppose instead that we chose five marbles out of 100 to begin with. Our given panel now has one chance out of 100!/5!95! = 75287520 of being chosen. Exactly the same.
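
    Here is the whole computation checked exactly (Python 3.8+, fractions so nothing gets rounded):

        from fractions import Fraction
        from math import comb

        # P(a fixed 5-judge set lands inside a random 9-of-100 panel) = C(95,4)/C(100,9)
        in_nine = Fraction(comb(95, 4), comb(100, 9))
        print(in_nine)                    # 1/597520
        # ...then it survives the 5-of-9 cut with probability 1/C(9,5) = 1/126
        print(in_nine / comb(9, 5))       # 1/75287520
        print(Fraction(1, comb(100, 5)))  # 1/75287520 -- same as the direct draw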

    Now that this particular panel has been selected, any statistic whatsoever that we can extract from it is, well, the statistic we can extract from it. (?)
    Therefore each judge can and will individually affect the final outcome regardless of the majority opinion of the panel.
    Yes. That's what the judges are there for. To affect the final outcome.
    Because actual scores are being used to compute the results and not ranks, every time you take a random sample of the judges' marks in the CoP, the results of the competition will be different!
    I hope you mean that "every time you take a random sample (i.e., seat this particular 5-judge panel instead of that one), the total scores under the CoP scoring will be slightly different." If by "results" you mean "who won, who came in second, etc." then of course your underlined and exclamation-pointed sentence cannot possibly be true.
    It'd be nice if it were possible, but it's not, and the ISU is misleading people.
    In my previous post I jokingly complimented Cinquanta by calling him a good con man. In fact, however, I don't think he's conning anybody at all. Everybody knows the CoP is crap -- but it's so much fun to roll around in even so!

    Everyone that I have tried to explain the CoP to has responded with, "but how will that stop judges from cheating?"

    So I think the most we can say is that the ISU is attempting to mislead people, but I don't think that anyone is actually being misled.

    Maybe this is a good time to insert again this disclaimer: I am no big fan of the CoP. As I say, I think it's fun. I guess I just don't think that the ordinal system is any better at speaking to the problems that figure skating and the ISU have brought on themselves by not running a clean house.
    ...because in 20 out of 20 draws of the hat, the competition results will be different.
    Again, you mean that the total scores will be different, not necessarily the final placements, right?
    ...the majority count should effectively eliminate human error and produce reliable results, consistent results time and again.
    I guess I'm just more pessimistic by nature. Whenever I see a claim that someone has "eliminated human error" I think first of Murphy's law. I really think that you are way too impressed at the sublime glories of the ordinal system.

    I wish instead people would just say, "well, it's a little better than the CoP."
    ...just like the way it was in the ordinal system, except now there are 5 subjective marks instead of 1. These component scores are all as subjective and liable to human error as the single presentation mark in the 6.0 system. The difference is that these 5 subjective marks are added together to form part of a total that is touted to be an objective measure of performance.
    I take it that your objection here is to the "touting," not to the method of determining the scores. As you say, the method of determining the "presentation" scores is about the same in both systems.

    This is one of the points about the CoP that statisticians have jumped all over with great glee, because it's easy to run statistical tests about it. So far under the CoP the judges have not made any pretense at figuring out the five program components and the multiple categories within each. If a judge likes a skater, he or she just gives that skater uniformly high marks across the board. Just like the old system, where a skater got just one score which said, OK, that was pretty, or maybe not so pretty.

    BTW, you don't have to be a statistician to see this. A cursory glance at the numbers shows it as plain as day. In fact, I had intended to remark on your earlier comment that many of the conclusions of statistical analysis are counterintuitive. I don't think so. I think for the most part what statistics does is to provide ways to quantify things that are qualitatively obvious anyway.

    So, let's see, do I have a conclusion? No, not really. The reason that I like the CoP is that I have learned a lot about skating by studying it. It is interesting to me to see what relative value the experts put on certain elements, to ask with Sasha Cohen, "So if my spiral isn't a level three then what the heck do I have to do to make it better," to think about whether a flutz deserves a -1.0 or a -2.0 GOE, etc.

    As for the ordinal system, I learned all I wanted to know about it at Salt Lake City. Several months before the contest, the panel of judges was announced for the ladies' event. Everyone took one look at the panel and said, oh, no, Michelle is screwed. As it happened,

    5 judges thought Sarah was better than Irina, and 4 thought Irina was better than Sarah.

    4 judges thought Michelle was better than Irina and 5 judges thought Irina was better than Michelle.

    I don't know. I'm just not seeing: "In the ordinal system, the majority count should effectively eliminate human error and produce reliable results, consistent results time and again."

    Mathman

    Edited to add:

    PS. One more thing about the random draw. Although this does not affect the results statistically, nevertheless IMO it is a terrible public relations blunder on the part of the ISU. It does not accomplish its goal of contributing to the anonymity of the judges (itself a bad thing anyway), and all it accomplishes is to make the public think either that the ISU is trying to pull a fast one or that it has lost its mind.
    Last edited by Mathman; 12-04-2003 at 10:45 AM.

  5. #35
    Forum translator Ptichka's Avatar
    Join Date
    Jul 2003
    Location
    Boston, MA
    Posts
    4,430
    This is one of the points about the CoP that statisticians have jumped all over with great glee, because it's easy to run statistical tests about it. So far under the CoP the judges have not made any pretense at figuring out the five program components and the multiple categories within each. If a judge likes a skater, he or she just gives that skater uniformly high marks across the board. Just like the old system, where a skater got just one score which said, OK, that was pretty, or maybe not so pretty.
    That is only partially correct. I think the judges have made an effort to judge Skating Skills (the first program component) somewhat fairly. Look, for instance, at the Men's SP at Cup of China. There, even though Gao's overall TCS is 9th, his SS is 4th. I know this is an extreme case since the judges probably could not justify saying that a skater with the STRONGEST TES has very poor SS, but I have noticed this in other competitions.

  6. #36
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    Hi, Ptichka. Well, I think that this is one area where statistics really can contribute to understanding the numbers. Here are links to two recent articles that quantify the tendency to give blanket marks across the board in the program components. One is by Dr. Sandra Loosemore and the other by Dr. Dirk Schaeffer.

    http://www.goldenskate.com/articles/2003/101703.shtml

    http://www.frogsonice.com/skateweb/articles/cop-components.shtml

    I also posted an analysis of variance approach to this question a while back, but I can't find it in the archives right now. Anyway, as expected, for each judge the variation across the different categories was very small compared to the total variation among all the numbers.
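
    The flavor of that calculation, on made-up component marks for a single skater: the spread across the five components within one judge's marks is a small fraction of the spread across the panel as a whole.

        import statistics

        # hypothetical marks on the 5 program components from 3 judges
        panel = [
            [7.25, 7.25, 7.00, 7.25, 7.25],   # judge 1: nearly flat across components
            [6.50, 6.50, 6.50, 6.75, 6.50],   # judge 2
            [7.75, 7.75, 7.75, 7.75, 7.50],   # judge 3
        ]

        within = statistics.mean(statistics.pstdev(row) for row in panel)
        overall = statistics.pstdev(mark for row in panel for mark in row)
        print(within, overall)   # within-judge spread (0.1) vs overall spread (about 0.48)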

    To me, though, this is not really a fault in the system. It just means that some skaters are better than others.

    Mathman
    Last edited by Mathman; 12-04-2003 at 03:14 PM.

  7. #37
    Tripping on the Podium
    Join Date
    Nov 2003
    Posts
    64
    MM,
    I'm surprised at myself that I was actually able to follow your post, since you threw in so many quotes and stats. :D Let's concentrate on one point, which I think is the main point, and which you addressed this way:
    I hope you mean that "every time you take a random sample (i.e., seat this particular 5-judge panel instead of that one), the total scores under the CoP scoring will be slightly different." If by "results" you mean "who won, who came in second, etc." then of course your underlined and exclamation-pointed sentence cannot possibly be true.
    In a close competition (see Trophee Lalique men's event), random sampling DOES affect the results. Every judge will mark a skater's component scores differently (see http://www.frogsonice.com/skateweb/articles/cop-components.shtml ) with huge amounts of variability between any two judges' scores for each skater. Therefore, the final outcome is completely dependent on the random draw since the final placements are determined by the results of the total scores, which will ALWAYS be different given a different set of judges. I think we both agree on this. But you don't agree that this will affect the final placements.

    In the ordinal system, it is true that two different samples of 9 judges might produce results like 5-4 and 6-3 in favor of the same skater, but this does NOT affect the placements. The majority of 1st place ordinals is what counts, and the others are just error. Also keep in mind that statistically, this is expected since no sample will be an exact representation of the population. The ordinal system is much more robust than the CoP system, because taking random samples from the population of judges has a very likely chance of producing consistent results (outcomes) time and time again, even in a close competition (ex. 1996 Worlds, Lu Chen and Michelle Kwan, no bloc judging). The results from the ordinal system are valid and reliable, unlike the CoP results. I think Sandra Loosemore's point in the article above is that each judge should effectively NOT be considered a sample from the same population, because of the enormous amounts of variability between any two judges' scores for the same components. Therefore a sample of the judges' marks under that particular system can in no way produce reliable or valid results, since the scores themselves are not reliable or valid. If you analyze the CoP from the bottom-up, it completely falls apart.

  8. #38
    Custom Title
    Join Date
    Jul 2003
    Posts
    3,896
    Originally posted by moyesii
    final outcome is completely dependent on the random draw since the final placements are determined by the results of the total scores, which will ALWAYS be different given a different set of judges. I think we both agree on this. But you don't agree that this will affect the final placements.
    Sometimes a different sample will change the results, sometimes not. The closer the contest, the more likely the results are to change, regardless of which system you use.

    I'll leave it to Mathman to put it in mathematical terms and to figure out whether *results* are more or less likely to change under one system compared to the other.
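
    For anyone who wants to try it, here is roughly what that computation could look like in Python. Every number in it is invented, and the answer swings either way depending on the assumed margin and the judge-to-judge noise:

        import random

        random.seed(1)
        # each of 1000 simulated judges "feels" skater A is better by this many points
        margins = [random.gauss(0.3, 2.0) for _ in range(1000)]

        TRIALS = 20_000
        score_flips = ordinal_flips = 0
        for _ in range(TRIALS):
            panel = random.sample(margins, 9)
            if sum(panel) < 0:                  # CoP-style: the point totals decide
                score_flips += 1
            if sum(m < 0 for m in panel) >= 5:  # ordinal-style: majority of the panel
                ordinal_flips += 1

        print(score_flips / TRIALS, ordinal_flips / TRIALS)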


    In the ordinal system, it is true that two different samples of 9 judges might produce results like 5-4 and 6-3 in favor of the same skater, but this does NOT affect the placements.
    Ah, but if you take a 5-4 decision in favor of skater A arrived at by 9 out of a larger pool of judges and randomly substitute one or more of the judges who counted in that decision with others who did not initially count (e.g., choose a different "substitute judge" out of the initial draw from the 10-judge panels used through 2002), the ordinal breakdown might still end up being 5-4 in favor of skater A, or it might become 6-3 in favor of skater A, OR it might swing to 5-4 in favor of skater B.

    Substitute judge prefers A:
    5/9 chance of keeping ordinal breakdown the same
    4/9 chance of changing to 6-3

    Substitute judge prefers B:
    4/9 chance of keeping the breakdown exactly the same
    5/9 chance of switching the 5-4 split to 4-5 and giving the win to B

    What does that make, a 5/18 chance of reversing the results?
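
    Checked by enumeration (assuming the substitute is equally likely to prefer A or B):

        from fractions import Fraction

        p_reverse = Fraction(0)
        for sub_prefers_A in (True, False):                       # each with probability 1/2
            for removed_prefers_A in (True, False):
                p_removed = Fraction(5, 9) if removed_prefers_A else Fraction(4, 9)
                a_votes = 5 - removed_prefers_A + sub_prefers_A   # bools count as 0/1
                if a_votes < 5:                                   # the 5-4 flips to 4-5 for B
                    p_reverse += Fraction(1, 2) * p_removed

        print(p_reverse)   # 5/18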

    With a larger pool of judges and more options for the random selection (e.g., a different set of 9 out of the 14 used under the interim system, or hypothetically choosing any 9 out of a set of 1000), it's much more likely choosing a different set of judges would change the ordinal breakdown and potentially the results. There is no guarantee in a close contest that any one particular result is the "correct" one, so you can only use as a result what that particular panel happens to yield according to that particular scoring system. Because judging is a matter of (educated) opinion, it's not an exact science.

    The majority of 1st place ordinals is what counts, and the others are just error.
    Think of a 5-4 decision where you preferred the skater who came out on the losing end. Do you consider the opinions of the judges who agreed with you to be "just error"?

    Also keep in mind that statistically, this is expected since no sample will be an exact representation of the population. The ordinal system is much more robust than the CoP system, because taking random samples from the population of judges has a very likely chance of producing consistent results (outcomes) time and time again, even in a close competition
    If you look at *close* competitions, especially when the ordinals are mixed among more than two skaters, you can find many examples where different calculations applied to the exact same sets of ordinals (e.g., majority vs. OBO) would produce different results. You can also find plenty of examples where substituting the rankings of the referee or the substitute judge for any of the official judges at random would change the results.

    If you look at unanimous or near-unanimous decisions, any valid system should produce the same results.

  9. #39
    Tripping on the Podium
    Join Date
    Nov 2003
    Posts
    64
    Unfortunately, your entire argument is based on a wrong assumption, gkelly, which is that a "close" competition is defined as a 5-4 split of the panel. A 5-4 split is not the definition of a close competition. An event where MK and Lipinski skated like they did at Nagano, or like 1996 Worlds between MK and Lu Chen, is a close comp. Note that in both of those events, there was a convincing majority of 1st place ordinals for one skater, AND neither skater was from E. Europe. A 5-4 split is NOT indicative of a close comp. It is indicative of bloc judging. Consider that a sample of the judging population for any given comp will approach the mean. The mean will be the ideal where ALL 9 judges place the same skater 1st. All other splits of the panel, ex. 5-4, 6-3, are due to variability. However, as long as the results reflect the majority opinion of the panel, the sample of judges can be said to represent the population.

    If you look at unanimous or near-unanimous decisions, any valid system should produce the same results.
    Under CoP, there is no such thing as a unanimous decision of the panel, because of the large contribution of error and chance to the component marks. The sum of these marks will produce results that no one could have predicted under a valid system.

  10. #40
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    Hi, Moyesii. We seem to be talking about different things in this discussion of the effect of the random draw. (BTW, did you notice that we both gave the link to the same paper by Sandra Loosemore in our last two posts, LOL. So I guess we're on the same page after all.)

    If you draw a sample from a large population, then of course the results may be different if you choose sample X than if you choose sample Y. This is true for any statistic that can be extracted from a sample, whether it is the sum of a lot of component scores (CoP) or whether it is an average ordinal placement (OBO, for instance).

    What I am saying is that it does not matter WHEN this random selection is made. If the random selection is made by drawing straws three months before the event (the old method) then this introduces the possibility of "sampling error." If the draw is done in two stages -- first a draw of 14, then later a draw of 9 from the 14 (the new method) -- the amount of sampling error is exactly the same.

    Think of it this way. Would we like the system any better if they conducted the second stage of the random draw a day ahead of time instead of 15 minutes? A week before? Three months before? At the same time as we did the initial draw of 14?

    Would we like the system any better or any worse if the 5 judges who were not selected then went home and did not participate in a phony charade of play-judging? Would the system be any better or worse -- would the results of the competition be any different -- if the five dummy judges were struck by lightning two minutes before the skating began?

    I will try to give an example, whittled down in size to make the issue more transparent.

    There are 50 judges in the pool. Call them judge #1, judge #2, etc.

    THE CoP WAY: Three months before the event, there is a random draw of five judges. Judges 3, 13, 25, 41 and 43 were chosen. It could have been different, but that's what happened.

    Here are the scores given to the top men's competitors. I will use Plushenko (your personal favorite, LOL) and Joubert.

    -----#3 #13 #25 #41 #43
    Jou 200 200 200 150 150
    Plu 150 150 150 200 200

    Now we have a random draw of three. As luck would have it, we choose judges # 25, 41 and 43. Plushenko wins! Joubert is robbed by the system!

    THE OLD WAY: Three months before the event, there is a random draw of three judges. As luck would have it, we chose judges # 25, 41 and 43. Here are their marks:

    -----#25 #41 #43
    Jou 200 150 150
    Plu 150 200 200

    Plushenko wins, fair and square.

    This little example is supposed to illustrate why the only thing that matters is this: who are the final three judges? In either case Plushenko wins with 550 total points to Joubert's 500 (or, by two to one in first place ordinals). This is what I mean when I say that statistically the random draw thing is a red herring.

    But still, it is terrible and the ISU should do away with it pronto. Not for statistical reasons but because it is a public relations disaster. To look again at the example, although the results are the same for both the CoP way and the old way, the real outcome of the contest is, with the CoP you've got a million Joubert fans angry at the system and 10 million casual fans saying, what kind of a farce is this?

    Mathman

  11. #41
    Tripping on the Podium
    Join Date
    Nov 2003
    Posts
    64
    Mathman,
    I understand what you're saying. Under the CoP, random sampling of the panel of judges statistically has no greater effect on the outcome than the original draw of the panel of judges. This is because the variability in the judges' marks, as pointed out by Sandra Loosemore, will in effect cause any panel of judges to produce a different result under the CoP.
    If you draw a sample from a large population, then of course the results may be different if you choose sample X than if you choose sample Y. This is true for any statistic that can be extracted from a sample, whether it is the sum of a lot of component scores (CoP) or whether it is an average ordinal placement (OBO, for instance).
    However, the ordinal system will often produce the same results even in a close competition with any given panel of judges (barring bloc judging). There will be some error within each sample (ex. 7-2, 6-3), but the sampling distribution WILL approach the mean (9 judges unanimous). Under the CoP, any sample of judges will NOT belong to the same distribution as another. This is because of the subjectivity of the component marks (which are scored on an absolute scale, contributing to error), and because of the huge variability between judges in the component marks, as discussed by Sandra Loosemore. Therefore the results of any competition under the CoP cannot be said to be meaningfully representative of anything but the given set of judges on the panel, and only those whose scores are counted.

  12. #42
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    Therefore the results of any competition under the CoP cannot be said to be meaningfully representative of anything but the given set of judges on the panel, and only those whose scores are counted.
    I agree 100%. That is what I have been trying to say, but here you have said it much more succinctly. That is what I meant earlier when I said that I didn't buy the whole "sampling theory" paradigm in the first place.

    Unfortunately, clever old Speedy, with his second random draw, has so confused the issue that everybody is arguing about the things that we have been arguing about on this thread, and letting the real point go by. I think that he will drop it (the random draw thing, not the CoP) before the next Olympics. It accomplishes nothing except to confuse and possibly to anger the audience. Cinquanta will probably make a big deal of dropping it, as if making a concession to the reform forces.

    The only thing that I would add to your quote is (here is where we probably disagree): "...and the same is true for ordinal placements."

    I do not, however, think that this is a bad thing. That's just how it is. A skater has to win favor with the judges that are scoring the event. Period.

    Mathman

  13. #43
    Tripping on the Podium
    Join Date
    Nov 2003
    Posts
    64
    The only thing that I would add to your quote is (here is where we probably disagree): "...and the same is true for ordinal placements."
    And I would add: "but to a much, much lesser extent." We agree that no system is 100% reliable, but the ordinal system IS more robust than the CoP.

    It has been a while since I did stats, but I think it has to do with standard error. The error and variability in the judges' scores are large enough that we cannot be confident that the total scores and placements as determined by CoP are within a reasonable margin of an actual population value. The standard error for each score would be greater than the point difference between closely ranked skaters, and therefore we cannot conclude that the results obtained by CoP are meaningful. Results from Trophee Lalique would look something like:

    2 Kevin VAN DER PERREN BEL 197.33 +/- 2.00
    3 Michael WEISS USA 195.98 +/- 2.00
    4 Brian JOUBERT FRA 195.58 +/- 2.00
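
    With that assumed error band of +/- 2.00, a couple of lines of Python show the gaps between these placements sit well inside the noise:

        results = {"VAN DER PERREN": 197.33, "WEISS": 195.98, "JOUBERT": 195.58}
        ERR = 2.00   # assumed standard error per total, not an official figure
        names = list(results)
        for a, b in zip(names, names[1:]):
            gap = results[a] - results[b]
            print(f"{a} over {b}: gap {gap:.2f}, error band {2 * ERR:.2f}")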

    Under the ordinal system, we can be certain that different sets of judges will result in different compositions of ordinals, such as 7-2 or 6-3, but they WILL result in the same final placements since it doesn't matter if a skater wins by 7-2 or 6-3. The ordinal system leaves less up to random chance and error than does CoP.
    I do not, however, think that this is a bad thing. That's just how it is.
    Of course it would be a bad thing. You've just been disillusioned by bloc judging into thinking that the ordinal system isn't capable of producing results with certainty. I would not doubt, however, MK's win in 1996 or Lipinski's win in 1998. On the other hand, the CoP, even barring bloc judging and cheating, will still produce dubious results because of the error and variability in human judgment for a system that requires humans to function like machines in order for the results to be reliable and valid. All these confounding variables will make the CoP a dangerous system, esp. in the face of controversial results. The confounding variables only contribute to the lack of accountability for the ISU and the judges. As I said, it puts the entire results of the competition in the hands of fate.

  14. #44
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,348
    Wait a minute, let's think this through! We just agreed that the voting judges are representing only themselves and not some kind of mythical larger population. Therefore the standard error is zero. For example, the standard error for the mean, for a sample of size n taken from a population of size N, is: S.E. = (sigma/sqrt(n)) * sqrt((N-n)/(N-1)). Since the judges represent only themselves, the sample is the population, n = N, and the standard error is zero.

    This is just common sense. The "statistical error" (sampling error) means the difference between the characteristic of the population that we are studying and the corresponding characteristic of the sample. The "standard error" is the standard deviation of the sampling errors taken over all possible samples. If the sample represents only itself, as we just agreed, then there is only 1 possible sample (itself) and no possibility of any statistical error whatsoever in the CoP.
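
    In numbers (sigma = 1, N = 9), the finite population correction drives the standard error of the mean to zero exactly when the sample exhausts the population:

        import math

        N, sigma = 9, 1.0
        for n in (3, 6, 9):
            se = (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))
            print(n, round(se, 4))   # 0.5, 0.25, and 0.0 at n = N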

    (There still might be "human error," I suppose, like, "Oops, I reached for the +2 button but I accidentally hit the +3 button by mistake.")

    To me, this is an attractive feature -- no statistical error is possible in the results. The scores I give are the scores I give. Once the panel is seated, nothing is in the hands of fate. Statisticians, go home.

    The other thing that I hope we have come to agreement on is that the random draw thing is a red herring that has nothing to do with anything, and we are just being silly if we mention it any more. (But help me pass the word to other figure skating forums, LOL.)

    Mathman
    Last edited by Mathman; 12-04-2003 at 09:40 PM.

  15. #45
    Tripping on the Podium
    Join Date
    Nov 2003
    Posts
    64
    Speedy? Is that you? Please don't play with my words, Mathman.
    We just agreed that the voting judges are representing only themselves and not some kind of mythical larger population. Therefore the standard error is zero... Since the judges represent only themselves, the sample is the population, n = N, and the standard error is zero.
    I can't believe that you, a "mathman," would try to use stats to distort what's going on. The reality is that the CoP aims to produce sample results (scores) that are representative of the larger population. Through stats, we came to the conclusion that the sample does NOT represent the population... We determined that the standard error in the CoP results was too large to say with confidence that the results were part of the sampling distribution or whether they form a separate distribution (population) altogether. In fact, we could take two different random samples of the judging panel and they could produce total scores for a skater that would fall within a standard dev. of each other. The problem is that much of the time, the difference in scores for a skater in different data sets would be greater than the margin of victory of one skater over another in the results. This is why I say that the placements, as determined by the CoP, are not valid and not reliable.
    The "statistical error" (sampling error) means the difference between the characteristic of the population that we are studying and the corresponding characteristic of the sample. The "standard error" is the standard deviation of the sampling errors taken over all possible samples. If the sample represents only itself, as we just agreed, then there is only 1 possible sample (itself) and no possibility of any statistical error whatsoever in the CoP.
    We cannot agree that the sample represents "only itself," and take that as the starting point in a discussion. It is a conclusion that must come from somewhere. We come to that conclusion based on the data -- results from actual competitions -- and statistical analyses. In statistical calculations, we often don't have the value of certain population parameters. Therefore, we use the sample statistics to estimate the population value. (A one-sample t-test or something.) What we would conclude from the t-test is that the sample of scores produced by the CoP has too great a standard error to say with confidence that the scores are representative of the population. Note that the stats involved are still legit even though we concluded that the sample was not.
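
    Something like this, I mean (hypothetical totals; scipy's one-sample t-test). The point is not this particular test, but that the conclusion has to come out of the data:

        from scipy import stats

        # hypothetical totals for the same skater from repeated simulated panels
        totals = [195.6, 197.1, 194.8, 196.4, 195.9, 197.8, 194.2, 196.7, 195.3]
        t, p = stats.ttest_1samp(totals, popmean=196.0)
        print(t, p)
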
    (There still might be "human error," I suppose, like, "Oops, I reached for the +2 button but I accidentally hit the +3 button by mistake.")
    Not quite. Since each judge contributes to the final outcome (By outcome I mean score, not placements), the subjectivity involved in choosing a mark on an absolute scale introduces enormous amounts of error into the results (see Rossano). Every single judge will introduce error into the results. The CoP demands an exactness of judging that is humanly impossible. That's why the ordinal system was invented.
    To me, this is an attractive feature -- no statistical error is possible in the results. The scores I give are the scores I give. Once the panel is seated, nothing is in the hands of fate. Statisticians, go home.
    I guess we might as well have just one judge judging the entire event under CoP, since statistically it's all the same. :/ And I know you'd say Yes to that.
    The other thing that I hope we have come to agreement on is that the random draw thing is a red herring that has nothing to do with anything, and we are just being silly if we mention it any more. (But help me pass the word to other figure skating forums, LOL.)
    This is what I said about the random draw: Yes, it IS statistically significant, because it WILL make the results different from that of the larger panel of judges. However, it is insignificant only in the sense that no results produced by the CoP are fair, reliable, or valid. Therefore any CoP results from one set of judges are just as good (bad) as another.
