I don't agree. All nine marks are being used to determine the median in your example. If eight of the nine marks were randomly dropped, the one remaining mark would often be significantly different from the median of the nine. If all nine marks are different, the one chosen mark would differ from the true median 8 times out of 9. And in all cases, the median (for PCs) could differ from the mean of the distribution by up to 0.25 points, while the standard deviation of the mean for marks is typically 0.13 points.
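The 8-of-9 figure can be checked by simple enumeration. This is only a rough sketch: the panel values below are made up for illustration (they are not real marks), and `fraction_differing_from_median` is a hypothetical helper name.

```python
from fractions import Fraction
from statistics import median

def fraction_differing_from_median(marks):
    """Fraction of single-mark picks that differ from the median of the full panel."""
    med = median(marks)
    differing = sum(1 for m in marks if m != med)
    return Fraction(differing, len(marks))

# Hypothetical panel of nine distinct marks (illustrative values only).
panel = [6.25, 6.5, 6.75, 7.0, 7.25, 7.5, 7.75, 8.0, 8.25]

# Each mark is equally likely to be the one that survives the random drop,
# and only the middle mark equals the median of the nine.
print(fraction_differing_from_median(panel))  # 8/9
```

If several judges give the same mark as the median, the fraction drops, which is why the 8-of-9 figure applies only when all nine marks are different.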
The benefit of using a median also depends on whether you are trying to filter out random noise, systematic errors, or outliers, and on the number of samples in the distribution.
To expand on an earlier comment: designing the system around an obsession with the occasional deal-making judge is counterproductive, because the impact of day-to-day national bias, incompetence, and random differences of opinion gets ignored. Those problems are more common than deal making and generally more important.
The system has to be designed to deal with all potential sources of error, not just the one that is most popular to discuss.
But in general, more judges are always better than fewer. I can't think of any error sources where having more judges is a liability. The worst case is that at some point adding more judges does not improve the reliability of the results. But we are so far from having enough judges to reliably decide scores to 0.01 points that having too many judges will never be an issue.
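To put a rough number on "so far from having enough judges", here is a back-of-the-envelope sketch. It rests on assumptions that are mine and not established above: that the 0.13 point standard deviation of the mean comes from a nine-judge panel, and that judges' errors are independent and random, so the standard deviation of the mean shrinks as one over the square root of the number of judges. `judges_needed` is a hypothetical helper name.

```python
import math

def judges_needed(sem_now, n_now, sem_target):
    """Panel size needed to reach sem_target, assuming independent random errors.

    sem_now is the standard deviation of the mean observed with n_now judges.
    """
    # Implied per-judge standard deviation: sem = sigma / sqrt(n).
    sigma = sem_now * math.sqrt(n_now)
    # Solve sigma / sqrt(n) = sem_target for n and round up.
    return math.ceil((sigma / sem_target) ** 2)

# Assumption: 0.13 points is the standard deviation of the mean for 9 judges.
print(judges_needed(0.13, 9, 0.01))
```

Under those assumptions the answer is on the order of 1,500 judges, and systematic errors such as national bias would not average out this way at all, so the real situation is no better than this estimate.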



, but after I wrote that sentence I got to wondering if it was really true or not. I had to look it up
(I am far from being an expert on this subject.)

