Page 3 of 5
Results 31 to 45 of 67

Thread: Should the IJS use median scores instead of the trimmed mean?

  1. #31
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,649
    Quote Originally Posted by concorde View Post
    No matter what system you choose, someone is going to be unhappy with the outcome.
    This is true, but I think we can say something even stronger than that. No judging system, however excellent and however detailed, can anticipate everything that can possibly happen. No matter what judging system we choose, there will always be individual competitions where we all say together, oops, the scoring system failed us on this occasion.

    Quote Originally Posted by drivingmissdaisy View Post
    The other alternative is for each judge to decide for themselves how much each jump should be worth, which isn't preferable IMO.
    To me, some triple toes are worth more than others, in terms of their impact as choreographic exclamation points, etc. I guess this can be addressed with GOEs and PCSs, though.

  2. #32
    Custom Title
    Join Date
    Apr 2014
    Posts
    2,296
    Quote Originally Posted by drivingmissdaisy View Post
    The other alternative is for each judge to decide for themselves how much each jump should be worth, which isn't preferable IMO. As a judge under 6.0 I'd have a hard time deciding whether to mark Yuna's LP or Caro's LP higher in Sochi; with IJS I don't have to decide whom I liked better, I just mark what I see based on set criteria. Of course, in any judged system a judge can manipulate marks, but I think it is an improvement when a judge is more focused on the elements than they were under 6.0.
    But as a judge, isn't your job to decide who was better? In this case, if I were a judge under 6.0, I would say: I will place Yuna ahead, because the quality of her 3-3 combination outweighs Carolina's extra triple. Also, I enjoyed Yuna's program more.

    You might disagree. Hence why there are multiple judges, and majority rules.

    You do have a point about the judges needing to keep track of all 20+ contestants though. But they will have computers to see how they scored everyone that already went. Also, as I've stated before, I do think it's good to have protocols for the elements everyone did.

  3. #33
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,649
    Quote Originally Posted by Sandpiper View Post
    But as a judge, isn't your job to decide who was better? In this case, if I were a judge under 6.0, I would say: I will place Yuna ahead, because the quality of her 3-3 combination outweighs Carolina's extra triple. Also, I enjoyed Yuna's program more
    That's the big riddle, right there. There might be many reasons for enjoying one program more than another, having little to do with who deserves to win an athletic competition.

  4. #34
    Size 7 Knife Boots Sam-Skwantch's Avatar
    Join Date
    Dec 2013
    Location
    At the Rink
    Posts
    3,653
    A major issue for me is simply how we get our PCS scores. How is the value derived? Is a 9.0 really supposed to be equal to a 9.0 in all other events, or is it event-specific? By that I mean: is there a specific set of requirements to achieve this score, or is it mostly awarded like 6.0, just representing the skater's artistic marks in the event at hand and, more importantly, based on the other skaters in the event rather than on a predetermined set of skills? This should be clearer. Is it spelled out in the ISU guidelines? I have not seen it.

    Another question I've been wondering about is if we should just award a certain score bonus for the last two groups? Something like a 3pt bonus reflected on the final score for making the final group and 1.5 for the second group (FS ONLY). We know it already happens in every event. Why not just get it out there and identify it and why it is happening? I recognize there is added hype and pressure being head to head with the best in the competition and as such I don't even know if I'd have a problem with this. It's all these little unspoken rules and ways that the numbers are derived that make the score a for me. Let's just identify the elephant in the room already and then let the Math take over from there.

    Somehow to me this all feels like we are setting the price of a piece of art. We all know the cost of the frame and it is always going to be an objective price. So maybe the frame is our TES scores. Now comes the hard part. Who wants to be the jerk to set the price of the painting inside and imagine having to explain it out and imagine expecting a consistency from dealer to dealer?

  5. #35
    Custom Title
    Join Date
    Jul 2013
    Location
    USA
    Posts
    119
    Most high-level judges are former competitive skaters, and their personal experience also plays into how they score. As one international-level judge told me, if someone does a move well that you found particularly difficult to execute, you tend to give that skater a high mark. The judge beside you may not view the same move as particularly difficult, so that judge will not give out marks as high.

    As an ice skating parent, you quickly see how "fickle" the scoring system is. You just hope that on your child's big day, the stars align so the system works in your child's favor.

  6. #36
    Custom Title
    Join Date
    Jul 2003
    Posts
    3,929
    Quote Originally Posted by Sandpiper View Post
    But as a judge, isn't your job to decide who was better?
    Under ordinal judging, it was the job of each judge to rank the skaters as best, second best, etc.
    And then the accounting algorithms (majority or OBO) combined the individual judges' rankings into a consensus for the whole panel, which often did not exactly match any individual judge's rankings.

    That job changed with the change in scoring systems.

    Now it's the job of the technical committees and system designers to publish in advance how much each technical element is worth, relative to each other and relative to the range of scores available for the program components.

    It's the job of the technical panel to determine which technical skills the skater actually performed well enough to get credit for.

    It's the job of the judges to determine how well the skater performed each technical element, and how well they fulfilled the program component criteria.

    Unlike ordinal judging, it's not supposed to be a comparative system.

    Then all those decisions are combined into an overall score for each skater, and the different total scores will result in ranked results for all the skaters. The rankings by base value alone and the rankings by GOE and PCS alone might both be very different not only from each other but also from the final overall rankings.

    No one official, or even one panel, ranks the skaters; that is not their job now.

    Perhaps some judges, as well as some fans, still think in terms of ranking skaters, but that's 6.0 thinking and will diminish as older judges retire and younger ones come through the system without having devoted years to thinking in terms of comparing skaters.

  7. #37
    Custom Title
    Join Date
    Jul 2013
    Location
    USA
    Posts
    119
    Quote Originally Posted by gkelly View Post
    Under ordinal judging, it was the job of each judge to rank the skaters as best, second best, etc.
    And then the accounting algorithms (majority or OBO) combined the individual judges' rankings into a consensus for the whole panel, which often did not exactly match any individual judge's rankings.

    This was mentioned before but I wanted to highlight it. Under the ordinal system, no judge could put a Skater A in first place but Skater A could win. Strange but true. That is why I think IJS is a better system, since each component of a program is evaluated and scored, and then a final score is given to the entire program. The final score is the sum of the individual component scores.

  8. #38
    Custom Title
    Join Date
    Jan 2013
    Posts
    5,282
    The problem with ordinals is that a judge can easily hold down a skater with the artistic marks. Under IJS, the judges have less ability to manipulate the final score. Certainly if you take out the highest and lowest judges, you mitigate any outliers. And if a judge is worried about their scores being the skew, maybe they should mark accordingly.

    In an ideal world, every judge would give the same marks, without conferring with each other, because they would give what the skater deserves. In this perfect world, you don't get outliers, so why not modify the system to at least MINIMIZE the number of highs and lows?

    I think it would certainly reduce the amount of corruption, because you can't really pay off judges to give higher or lower marks to fit your agenda because their marks might be thrown out anyways.

    It's so stupid that GOE is a random selection too. It should be the median 7 marks out of 9. Ideally, it would be the median 5 marks out of 9, so as to really minimize the number of outliers, in case 2 judges are in cahoots together. Any basic knowledge of statistics would show that this leads to more consistent judging/results across the board.

  9. #39
    Custom Title
    Join Date
    Apr 2014
    Posts
    2,296
    As Mathman has explained, 6.0 requires a majority of votes. That's how they deal with the outliers.

    Judges aren't going to all give out the same marks. Ever. No two people are going to react the same way to the same performance. That's just the way figure skating is. There is no universal truth.

    I agree the "computer throws out random marks" thing is incredibly stupid though.

    Under the ordinal system, no judge could put a Skater A in first place but Skater A could win.
    How is the current system any different? No judge could give a skater the highest score for a certain component, but that skater could still get the highest marks for that component if they have the highest average score.
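    To make that concrete, here is a small Python sketch with made-up marks. The skater names X, Y, Z and every number are hypothetical, not taken from any real protocol. Each of the nine judges gives the top mark to X or to Y, yet Z ends up with the highest panel score once one high and one low mark are dropped and the rest are averaged.

```python
from statistics import mean

# Hypothetical component marks from a 9-judge panel (invented for illustration).
# Judges 1-5 favor X, judges 6-9 favor Y; every judge puts Z a close second.
marks = {
    "X": [9.5] * 5 + [7.0] * 4,
    "Y": [7.0] * 5 + [9.5] * 4,
    "Z": [9.0] * 9,
}

def trimmed_mean(scores):
    """Drop the single highest and single lowest mark, then average the rest."""
    s = sorted(scores)
    return mean(s[1:-1])

# Count how many judges gave each skater that judge's top mark.
n_judges = 9
first_place_votes = {name: 0 for name in marks}
for j in range(n_judges):
    best = max(marks, key=lambda name: marks[name][j])
    first_place_votes[best] += 1

panel_scores = {name: trimmed_mean(s) for name, s in marks.items()}
winner = max(panel_scores, key=panel_scores.get)

print(first_place_votes)  # {'X': 5, 'Y': 4, 'Z': 0}
print(winner)             # Z
```

    So the "nobody's first choice wins" effect is not unique to ordinals; it shows up whenever a panel is split and a compromise skater sits second on every card.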

  10. #40
    Custom Title
    Join Date
    Jul 2003
    Posts
    3,929
    Quote Originally Posted by CanadianSkaterGuy View Post
    It's so stupid that GOE is a random selection too.
    It isn't. IIRC the random selection was eliminated ca. 2006 or 2008.

    Since then, all judges' scores count throughout the competition. Only the highest and lowest score are thrown out for each GOE and for each component, but usually even a judge who is marking especially high or low will have at least some of their scores count for each skater.

    It should be the median 7 marks out of 9.
    It is.

    Ideally, it would be the median 5 marks out of 9, so as to really minimize the number of outliers, in case 2 judges are in cahoots together. Any basic knowledge of statistics would show that this leads to more consistent judging/results across the board.
    That's the question the mathematicians are debating in this thread.

  11. #41
    Custom Title
    Join Date
    Feb 2010
    Posts
    3,735
    Quote Originally Posted by Sam-Skwantch View Post
    Another question I've been wondering about is if we should just award a certain score bonus for the last two groups? Something like a 3pt bonus reflected on the final score for making the final group and 1.5 for the second group (FS ONLY). We know it already happens in every event. Why not just get it out there and identify it and why it is happening? I recognize there is added hype and pressure being head to head with the best in the competition and as such I don't even know if I'd have a problem with this. It's all these little unspoken rules and ways that the numbers are derived that make the score a for me. Let's just identify the elephant in the room already and then let the Math take over from there.
    This might just be a judging tendency that originated under 6.0, in which judges had to leave room in the marks for later skaters. I'd guess the vast majority, if not all, Olympic judges also judged under 6.0, so this may just be a mindset that is hard for them to shake. It does lead to some baffling scores though, such as when Gracie, Julia and Mao are all scored within two points of each other in PCS in the Sochi LP. With IJS it seems less excusable because the scoring tries to appear more scientific, when it really isn't.

  12. #42
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,649
    Quote Originally Posted by concorde View Post
    This was mentioned before but I wanted to highlight it. Under the ordinal system, no judge could put a Skater A in first place but Skater A could win. Strange but true.
    I don't think this is particularly strange. In many settings there are situations where I love my candidate but hate yours, and vice versa. The only one we can agree on is the compromise candidate that everyone put second.

    Quote Originally Posted by genekelly
    That's the question the mathematicians are debating in this thread.
    Caelum made a very cool suggestion earlier on this thread. Instead of the trimmed mean, there is another method available for removing the distortion of outliers while still acknowledging the intent of all judges. This is called "Winsorizing" (after statistician Charles P. Winsor) and it goes like this. Instead of discarding the highest and lowest scores, they are replaced by the boundaries of the middle 7 (or 5, etc.).

    Example: 7.50 8.25 8.25 8.50 8.75 9.00 9.00 9.25 10.00 Average 8.72, median 8.75

    Current IJS method (throw out highest and lowest): 8.25 8.25 8.50 8.75 9.00 9.00 9.25 Average 8.71

    Winsorized method. Replace the lowest score, 7.50, with 8.25 and replace the highest, 10.00, with 9.25. Average = 8.72

    More severe Winsorization. Replace both of the two lowest scores with the third lowest (8.25). Replace both of the two highest scores with the third highest (9.00). In this example the result barely changes. Average = 8.67

    In this example all methods produce nearly the same result because there is nothing unusual about the distribution of scores. One score was too high (10.00), another was too low (7.50), but everything balanced out in the end.
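    For anyone who wants to check the arithmetic, here is a minimal Python sketch of the four calculations on the nine scores from the example. The function names `trimmed_mean` and `winsorized_mean` and the parameter `k` (how many scores are handled at each end) are just labels chosen here for illustration, not anything from the ISU rules.

```python
from statistics import mean, median

def trimmed_mean(scores, k=1):
    """Drop the k lowest and k highest scores, then average what is left."""
    s = sorted(scores)
    return mean(s[k:len(s) - k])

def winsorized_mean(scores, k=1):
    """Pull the k lowest scores up to the (k+1)-th lowest and the k highest
    down to the (k+1)-th highest, then average all of them."""
    s = sorted(scores)
    lo, hi = s[k], s[len(s) - 1 - k]
    return mean(min(max(x, lo), hi) for x in s)

panel = [7.50, 8.25, 8.25, 8.50, 8.75, 9.00, 9.00, 9.25, 10.00]

print(round(mean(panel), 2))                  # 8.72  plain average
print(median(panel))                          # 8.75
print(round(trimmed_mean(panel, k=1), 2))     # 8.71  drop highest and lowest
print(round(winsorized_mean(panel, k=1), 2))  # 8.72
print(round(winsorized_mean(panel, k=2), 2))  # 8.67  more severe version
```

    Both functions assume k is smaller than half the panel size. Setting k=4 in `trimmed_mean` on a nine-judge panel leaves only the middle score, which is the median.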

    By the way, the reason statisticians do not like trimmed and Winsorized means very much is that the sampling distribution is not well understood. (The formulas from our statistics classes that have sigma/sqrt(n) in them are not guaranteed to work in the case of an asymmetric distribution of scores.)

  13. #43
    Custom Title
    Join Date
    Feb 2010
    Posts
    3,735
    Another important aspect not being considered is that a judging system has to be somewhat easy to understand. Particularly in cases when the outcome wouldn't change, it makes no sense to overly complicate the system by Winsorizing it (although it is fascinating to read). However, I think it is also important for the system to reflect that, to some extent, a fair judging panel can be established. Throwing out eight scores out of nine implies that as many as eight judges cannot be trusted enough to have their scores count.

  14. #44
    Yuzulia & Ruslena Team Alba's Avatar
    Join Date
    Feb 2014
    Location
    Milan
    Posts
    4,077
    Quote Originally Posted by Mathman View Post
    This situation, where a determined minority cabal can dominate the majority, could not happen if we used the median (middle score) instead of the mean. The median is simply the maximally trimmed mean -- we throw out the highest four and the lowest four instead of the highest one or two and the lowest one or two. In this example the median scores are

    Skater A: 9.00
    Skater B: 8.75

    What do you think? Would this be a better system?
    I don't think this can be a better system. It doesn't make sense to leave out the scores of 8 judges, why not go with the majority as it used to be then?
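    For reference, the trade-off can be seen in numbers with a rough Python sketch. The panel below is invented: five judges marking independently and four hypothetical colluding judges, with values picked only so that the medians come out at the 9.00 and 8.75 quoted above. It is not the example from the earlier post.

```python
from statistics import mean, median

def trimmed_mean(scores, k=1):
    """Drop the k lowest and k highest scores, then average what is left."""
    s = sorted(scores)
    return mean(s[k:len(s) - k])

# Hypothetical marks: the first five judges mark independently,
# the last four form a bloc that holds A down and pushes B up.
skater_a = [9.00] * 5 + [7.00] * 4
skater_b = [8.75] * 5 + [10.00] * 4

for name, scores in (("A", skater_a), ("B", skater_b)):
    print(name, round(trimmed_mean(scores), 2), median(scores))
# A 8.14 9.0
# B 9.29 8.75
```

    With one high and one low mark dropped, three of the four bloc marks still count, so B comes out ahead. The median follows the five-judge majority and puts A ahead, at the cost of ignoring how far apart the other eight marks are.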

  15. #45
    Custom Title Mathman's Avatar
    Join Date
    Jun 2003
    Location
    Detroit, Michigan
    Posts
    28,649
    Quote Originally Posted by Alba View Post
    ...why not go with the majority as it used to be then?
    That would indeed be better in my opinion. Alas, that ship has sailed, never to return.
