Jay, you're not posting anything that hasn't been known for years. Ratings fluctuate on the same course on the same day under what appear to be the same conditions, whether with the same pool of props or a different pool like Des experienced. Those fluctuations are not flaws; they are expected with any statistical system. I posted this above (#3134) and it's worth posting again:
"The 95% error range is directly affected by how many props shoot scores on a layout, not how good the ratings are of the props. For a course around 50 SSA, the 95% range for 5 props is +/- 2.7 in SSA which is 5.3% potential error. For 60 props, the 95% range is +/- 0.8 or 1.5% error. Those are the ranges for playing the same course under the same conditions which is never exactly the same even going from morning to afternoon with no wind either round."
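The two figures quoted above (+/- 2.7 for 5 props, +/- 0.8 for 60 props) are consistent with a range that shrinks with the square root of the number of props. Here is a rough sketch of that scaling. The 1/sqrt(n) rule and the function name `half_width_95` are assumptions for illustration, not the actual ratings calculation; only the 5-prop reference point comes from the post.

```python
import math

# Reference point taken from the quoted post: 5 props -> +/- 2.7 in SSA.
REF_PROPS = 5
REF_HALF_WIDTH = 2.7


def half_width_95(num_props, ref_props=REF_PROPS, ref_half_width=REF_HALF_WIDTH):
    """Approximate 95% half-width in SSA.

    ASSUMPTION: the range shrinks like 1/sqrt(number of props), as the
    standard error of a mean would. This is not the official formula.
    """
    if num_props <= 0:
        raise ValueError("num_props must be positive")
    return ref_half_width * math.sqrt(ref_props / num_props)


if __name__ == "__main__":
    for n in (5, 10, 20, 60):
        print(f"{n:>3} props: +/- {half_width_95(n):.1f} SSA")
```

Under that assumption, 60 props comes out at roughly +/- 0.8, matching the quoted figure, and it shows why adding props helps a lot at first and less and less after that.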
Based on these parameters, we would expect the ratings for about 4 of the 82 rounds (5%) played by the women in the 41 Global events to fall outside these ranges, perhaps 2 above and 2 below. That has nothing to do with the average ratings of the props. It just happens naturally with this or any other statistical system. The good news is that the other round at each event likely fell within the range, so the average of the two rounds produced a better representation of the true SSA for the course that day, which can't be known by any other means.
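The "4 of 82" figure is just 5% of 82 rounds. A small sketch of that arithmetic follows, along with the chance of seeing any particular number of out-of-range rounds. It assumes each round independently has a 5% chance of landing outside its 95% range, which is a simplification; the function names are hypothetical.

```python
import math


def expected_outside(num_rounds, coverage=0.95):
    """Expected number of rounds falling outside the coverage range."""
    return num_rounds * (1.0 - coverage)


def prob_exactly(k, num_rounds, coverage=0.95):
    """Binomial probability that exactly k rounds fall outside the range.

    ASSUMPTION: rounds are independent, each with the same miss chance.
    """
    p = 1.0 - coverage
    return math.comb(num_rounds, k) * p**k * (1.0 - p) ** (num_rounds - k)


if __name__ == "__main__":
    rounds = 82  # 41 Global events x 2 rounds each
    print(f"expected outside: {expected_outside(rounds):.1f}")
    print(f"chance of zero outside: {prob_exactly(0, rounds):.3f}")
    for k in range(0, 9):
        print(f"exactly {k}: {prob_exactly(k, rounds):.3f}")
```

Under those assumptions, seeing no out-of-range rounds at all across 82 rounds would be the unusual result, not seeing a handful of them.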
If I had my preference, Global events would be at least four rounds so the statistical fluctuations at all of the sites have a chance to converge toward the "true" numbers, whatever those may be. We went with three rounds for the first Global to gauge players' acceptance of the format as an option. The women chose two rounds not because it was good enough statistically but because they felt it would be more likely to increase participation, which was a more important goal for them.
The reason Worlds is at least 7 or 8 rounds is to reduce the impact of the luck that is part of our sport. Ratings have been shown to be pretty accurate: 93% of the final ranking at Worlds can be predicted by the players' initial rank by rating before the event starts.
Think of a basketball game. Now imagine that instead of determining the winner after 4 quarters, they decided to play two 2-quarter games a night due to the shortened NBA season. If we agree that only a 4-quarter game truly determines a winner in that sport, then determining standings based on a bunch of 2-quarter games is a lesser approximation statistically, in the same way that 2 rounds of DG have lesser statistical quality than 4 rounds for determining winners. Why don't they have single-game playoff matchups in B-ball or hockey and instead play up to 7 games? They've determined that a single game, even at a neutral venue, is not statistically good enough to determine who is better.
Here's an example directly relevant to disc golf where using ratings would have been better than using actual scores for rankings. At the Charlotte Worlds, the Open A pool played two wide-open courses on the first day and the B pool played the two mostly wooded courses. The next day they switched courses and the winds were wailing. This was in 1997, before ratings, but it's estimated the open courses played 4-6 shots tougher on the second day.
The B pool players got the shaft in this format. At the time, the A and B pools were evenly seeded with top players in both pools, unlike the current procedure of stacking at least the top 18 players in the A pool. Had ratings been used for the competition, obviously a radical concept, the B pool players would have gotten ratings equivalent to the A pool players on those courses and still been in the competition rather than being torched under the conventional scores system. While individual round ratings are less accurate than many ratings averaged together, they still would have been superior to straight scores in that scenario.
What it boils down to is which formats are good enough that players are willing to enter and compete. Each format has a different level of statistical "quality." We know players are willing to play everything from single-round leagues, even with a weaker version of handicapping, all the way up to Worlds, currently at 7.5 rounds.
Should I be surprised to see the highest rated local player enter a one-round league night, where he has the lowest probability of winning? The more rounds played in a competition, the more likely he is to win from a statistical standpoint. I shouldn't be surprised, because the player has chosen to compete under that format knowing ahead of time it's not the best format statistically for him.
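To put rough numbers on why more rounds favor the better player, here is a minimal two-player sketch. It assumes each player's round scores are independent and normally distributed with the same spread, and that the better player averages a fixed number of strokes fewer per round. The 1-stroke edge and 3-stroke spread are made-up illustration values, not measured data, and a real league night has a whole field to beat rather than one opponent.

```python
import math


def win_probability(num_rounds, edge_per_round, round_sd):
    """Chance the better player has the lower total after num_rounds.

    ASSUMPTIONS: two players, independent normal round scores with equal
    standard deviation round_sd; the better player averages edge_per_round
    fewer strokes per round.
    """
    # Difference in totals ~ Normal(num_rounds * edge, round_sd * sqrt(2 * num_rounds))
    mean_diff = num_rounds * edge_per_round
    sd_diff = round_sd * math.sqrt(2 * num_rounds)
    z = mean_diff / sd_diff
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    # Hypothetical values: 1-stroke edge per round, 3-stroke round-to-round spread.
    for rounds in (1, 2, 4, 8):
        p = win_probability(rounds, edge_per_round=1.0, round_sd=3.0)
        print(f"{rounds} round(s): better player wins about {p:.0%} of the time")
```

The edge grows with the number of rounds while the noise only grows with its square root, so the better player's chances keep climbing as rounds are added.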
I'm guessing fewer than half of the 646 women participating in the Global knew how it all worked but accepted that it was good enough. If you look at the final rankings, there were very few surprises compared with their initial ratings. The only surprises came from unrated non-PDGA players in some of the lower divisions, who didn't know where their skills currently stood relative to the division they entered.
Rater of the tossed arc.