Let's look at another case, and it could be that I'm misunderstanding you. Say a guy in group 4 is given 9.75. Then someone in group 5 gives a slightly lesser but still great performance and is given 9.5. Then someone in group 6 comes along and falls in between those two. What must they be given?
9.75? 9.5?
So is the next argument "well, the 9.5 guy should have been given 9.25 as a ranking"?
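To make the squeeze concrete, here's a rough sketch. It assumes marks can only be given in 0.25 steps (my assumption, based on the 9.75 / 9.5 / 9.25 numbers above), and `scores_between` is just a made-up helper name, not anything official.

```python
def scores_between(low, high, step=0.25):
    """Return the legal marks strictly between low and high.

    Assumes marks come in fixed increments (0.25 here, an assumption).
    """
    # Work in whole numbers of steps to avoid float comparison issues.
    low_idx = round(low / step)
    high_idx = round(high / step)
    return [i * step for i in range(low_idx + 1, high_idx)]


# Group 5 got 9.5 and group 4 got 9.75: nothing is left in between.
print(scores_between(9.5, 9.75))

# Had the group 5 skater been given 9.25 instead, 9.5 would still be free.
print(scores_between(9.25, 9.75))
```

The first call comes back empty, which is the whole problem: once two adjacent marks are taken, the only options for the in-between skater are a tie with one of them.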
I also try to use an all-time scale, but only as a reviewer, so it seems silly to me to say that's also what goes through the judges' heads in real time. They may well not come to the same conclusions as you about how scores should be dealt out, even if they agree results should be on an all-time scale, because they might simply want to leave room for the skaters ranked above the ones in group 4. (I get that this flies in the face of how they keep inflating scores as they move to higher groups in reality, but we're talking about flaws with this system in a single competition.) And then a 7 for the group 4 guy who gave the best performance will be criticised by everyone, and everyone else will be left confused as to why they got 6.5s. (Again, I could be drawing a poor conclusion here.)
If scores are ranks anyway, then why not just use a 6.0 system, patched so that we keep CoP's flexibility? We need to tell skaters there's room for improvement, so I get why "all-time" scores are to be used, but I doubt what we have on hand meets any meaningful scientific criterion. I don't even think "all-time" scores are used in any other competition, sport or otherwise, that uses a rubric (citations definitely needed). I'm not one to believe the judges have any understanding of artistry or technique, and of course fixing that would solve some of the problems, but if the goal is "accurate" judging, we'll probably always fall short with this system. Maybe the aim is just to get to the greatest efficiency possible. That still seems doubtful to me given the time constraints, but I guess it's more achievable with the things you've suggested.
I'm also curious if I'm actually correct in my understanding that this system would work in a bulk review, but not in a real time judging environment. Anyone have an opinion there?