I understand the scores as follows.
Suppose we have three skaters and let a judge decide the ranking of their performances according to his own standard. Each skater will get a ranking from 1 to 3. Now the judge might feel that, even though he gave the numbers 1 to 3, the top skater is extremely good compared with the other two, and that it would be unfair to express the difference just by the ranking. This is perhaps because the number of skaters is just three. If there were, say, seven more skaters who are reasonably good, then these seven might actually be ranked from 2nd to 8th. The judge would be happier if he could give the numbers 1, 9, and 10 to the original three skaters, respectively, even when there are only three skaters.
Go one step further and imagine all the skaters in the world, and let the judge do his job. Well, all the skaters in the world at a specific time might not be enough, because sometimes there are only, say, 3 top-level skaters and 20 second-tier skaters, and the gap between the two groups might be huge. In this case, let the judge fill the gap by imagining hypothetical skaters. What I am trying to do here is let the judge imagine a huge number of "evenly distributed" (according to his own standard) hypothetical performances and identify the ranking of each performance. You may say that the number of hypothetical performances goes to infinity, but this is not a problem, because we can ask the judge to normalize the ranking to a number between, say, 0 and 10.
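To make this concrete, here is a minimal sketch of the idea in Python. It assumes (purely for illustration; none of these numbers or names come from any real scoring system) that a judge's standard can be modeled as a latent quality value for each performance, and that the hypothetical performances are spread evenly over the judge's quality scale, so the "normalized ranking" of a performance is just its position along that scale, rescaled to 0–10.

```python
def normalized_ranking(quality, q_min=0.0, q_max=100.0):
    """Map a latent quality value (on the judge's own scale, here
    assumed to run evenly from q_min to q_max) to a 0-10 normalized
    ranking among infinitely many evenly distributed hypothetical
    performances."""
    fraction = (quality - q_min) / (q_max - q_min)  # position in [0, 1]
    return 10.0 * fraction

# Three real skaters whose qualities (on this one judge's invented
# scale) are far apart, as in the example above:
qualities = {"skater_A": 95.0, "skater_B": 20.0, "skater_C": 12.0}
scores = {name: normalized_ranking(q) for name, q in qualities.items()}
print(scores)  # skater_A ends up near 10, the other two near the bottom
```

The point of the rescaling is that the infinite number of hypothetical performances drops out: only the fraction of them below a given performance matters.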
One more step. Now consider all the judges existing in the world at the present time, and let each of them do the same thing. In general, each judge has his own standard, and these standards will all differ, so take the average. Then you get a unique number for each performance (or for each element under consideration). It may be viewed as the best number representing the performance. It cannot be considered an "independent truth", but it is a consensus at present.
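Continuing the sketch, the averaging step is simply the mean of the per-judge normalized rankings (the figures below are invented for illustration):

```python
def consensus_score(rankings):
    """Average the 0-10 normalized rankings that each judge,
    applying his own standard, gave to the same performance."""
    return sum(rankings) / len(rankings)

# One performance, as ranked by four judges with differing standards:
per_judge = [9.25, 8.75, 9.50, 9.00]
print(consensus_score(per_judge))  # -> 9.125
```

The individual rankings disagree because the standards disagree; the mean is the "consensus at present" in the sense described above.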
Of course you cannot do this in reality. The only thing you can do is select some finite number of judges, and there is no guarantee that their standards are the same. Then the best you can do is write a guideline in advance that most judges at present would consider reasonable, and ask the judges to consult it when giving the "normalized ranking" for each performance. This will be the score we get. In this way, even if you cannot really "measure" something (in the sense of measuring the length of a rod), whether it is TES or PCS, you can assign a quantitative number to each performance (or to each element), and the differences between the scores have a meaning as differences between the normalized rankings according to the consensus at the moment.