- #1
DragonPetter
- 830
- 1
Are there any chess players, statisticians, or smart people who are familiar with the ELO ranking system? This concerns a game with well over 100,000 players, so it's not an obscure concern.
Basically, I play an online video game that tries to apply an ELO ranking system to rank individual player skill; however, it's a multiplayer, team-oriented game. I think the application of ELO in this case is severely flawed, and it feels obvious to me logically; however, a way to prove it is not immediately obvious to me, and I would like to get input before I venture into calculations, simulations, and models.
I'll try to describe the method and why I think it's invalid, and then the supporters' reasons for why they think it's valid.
The games consist of 2 teams of 5 players each. The gameplay is heavily dependent on team performance: 1 player can influence the outcome at times, but in general it takes a team effort to outperform another team. Imagine it being basketball if you want. So, if you are on a team with bad players, you are severely disadvantaged. At the end of the game, all players on the winning team receive a boost to their ELO rating, while players on the losing team receive a deduction from their rating, independent of each player's individual performance.
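For reference, here's a minimal sketch of how a team-based update like that might work, assuming the standard Elo expected-score formula is applied to the two teams' average ratings and the same adjustment is handed to every player on a team. The K-factor and the use of average ratings are my assumptions for illustration, not details confirmed by the game:

```python
# Sketch of a team-based Elo update (assumed details: K-factor of 32,
# expected score computed from the two teams' average ratings, and the
# same rating delta applied to every member of a team).

def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_team_elo(team_a: list[float], team_b: list[float],
                    a_won: bool, k: float = 32.0):
    """Return new ratings for both teams after one match."""
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    delta = k * ((1.0 if a_won else 0.0) - expected_score(avg_a, avg_b))
    # Every player on a team gets the same adjustment, regardless of
    # individual performance -- this is the behavior described above.
    return [r + delta for r in team_a], [r - delta for r in team_b]
```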
Now, one of the key flaws I see is that the system applies the team's performance to an individual's ranking. If the teams were always constant, meaning the players being rated were always on the same team together, this would actually work, because then the ELO rating would be applied to the team rather than the individual, and the team would have a record of valid statistical data points. Say some lucky bad player plays with 4 professional players on his team every game; the rating he ends up with does not represent his personal skill, but rather his team's skill.
In the game I'm playing, for each game, or trial if you can call it that, the teams are completely randomized with players who could be very experienced or who have never played the game before. The system picks 2 teams of 5 random players each such that the teams have the same average rating (in an attempt to make the matchup fair). So each game a player plays, his team has changed completely, but the end-of-game result (win or lose) is applied to the individual player, and so that random team's performance is used as a data point for someone's individual, non-random, statistical performance. Does this start to sound flawed? The dataset being compiled is basically a series of random events, since the players are drawn from a random pool constrained only by an average rating value.
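To make that setup concrete, here is a rough sketch of the kind of matchmaking described above: draw 10 players from a pool, then split them so the two teams' average ratings are as close as possible. The pool, the brute-force split, and the team size are illustrative assumptions, not the game's actual matchmaker:

```python
import random
from itertools import combinations

# Rough sketch of "pick two 5-player teams with equal average rating":
# draw 10 candidates at random, then choose the split that minimizes the
# gap between the two teams' average ratings.

def make_match(pool: list[float], team_size: int = 5):
    candidates = random.sample(pool, 2 * team_size)
    best = None
    for idx in combinations(range(2 * team_size), team_size):
        team_a = [candidates[i] for i in idx]
        team_b = [candidates[i] for i in range(2 * team_size) if i not in idx]
        gap = abs(sum(team_a) / team_size - sum(team_b) / team_size)
        if best is None or gap < best[0]:
            best = (gap, team_a, team_b)
    return best[1], best[2]
```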
The argument in support of this method is that (a) the player pool used to make teams is random, and (b) the one constant across all trials is that the individual player was an influence in every game (in other words, the common element to all the random games is that the player being rated was a participant in all of them). With these premises, the individual's influence should begin to average out above the randomness of his team makeup in each event. So with a sufficiently large number of games played, the accumulated performance of the random teams he participated in should start to reflect his own performance.
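Since I'd like to run simulations anyway, a Monte Carlo sketch like the one below could test that claim directly: give each simulated player a hidden "true skill", decide matches by comparing team skill totals plus noise, run the team-ELO update many times, and check how strongly the final ratings correlate with true skill. Every numeric parameter (skill spread, match noise, K-factor, game count) is an assumption chosen only for illustration, and the matchmaking here is purely random rather than balanced by average rating, which is a simplification:

```python
import random
from math import sqrt

# Monte Carlo sketch: does a player's team-based ELO come to track his
# hidden individual skill over many games?  All parameters are
# illustrative assumptions.

N_PLAYERS = 1000
N_GAMES = 20000
K = 32.0
SKILL_SD = 200.0   # spread of hidden "true skill"
NOISE_SD = 400.0   # per-match randomness on top of team skill

def expected_score(r_a, r_b):
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

random.seed(0)
skill = [random.gauss(0.0, SKILL_SD) for _ in range(N_PLAYERS)]
rating = [1500.0] * N_PLAYERS

for _ in range(N_GAMES):
    ids = random.sample(range(N_PLAYERS), 10)   # naive random matchmaking
    team_a, team_b = ids[:5], ids[5:]
    # Match outcome driven by total team skill plus per-match noise.
    strength_a = sum(skill[i] for i in team_a) + random.gauss(0.0, NOISE_SD)
    strength_b = sum(skill[i] for i in team_b) + random.gauss(0.0, NOISE_SD)
    a_won = strength_a > strength_b
    avg_a = sum(rating[i] for i in team_a) / 5
    avg_b = sum(rating[i] for i in team_b) / 5
    delta = K * ((1.0 if a_won else 0.0) - expected_score(avg_a, avg_b))
    for i in team_a:
        rating[i] += delta
    for i in team_b:
        rating[i] -= delta

print("correlation(rating, true skill) =", round(pearson(rating, skill), 3))
```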
Now, my argument is this: you cannot base an individual's ranking on how the randomized teams he plays on perform. The +/- received from a win or loss should only be applied to the team's rating, not to an individual's rating. I would also argue that, given the distribution of player skill in the randomized teams and the fact that the player is only 1/10th of the participants in a match, the "averaging" effect that the supporters think should surface is completely drowned out by "noise". I think of it as a signal-to-noise-ratio analogy: the 9 other players are the noise floor, and the individual player sits below this noise threshold, so his influence cannot be measured.
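One way to quantify that analogy for a single game, under the simplifying assumption that all 10 players' skills are independent draws with the same variance, is to ask how much of the variance in the team-skill difference one player accounts for: at most 1/10, and even less once per-match randomness is added on top. The few lines below make the arithmetic explicit; the specific variances are illustrative assumptions:

```python
# Share of single-game variance attributable to one player, assuming each
# player's skill is an independent draw with the same variance and the
# outcome is driven by the difference in total team skill plus per-match
# randomness.  All numbers are illustrative assumptions.

SKILL_VAR = 200.0 ** 2        # variance of one player's skill
MATCH_NOISE_VAR = 400.0 ** 2  # per-match randomness (form, luck, etc.)

total_var = 10 * SKILL_VAR + MATCH_NOISE_VAR  # 10 independent players + noise
one_player_share = SKILL_VAR / total_var

print(f"one player's share of single-game variance: {one_player_share:.1%}")
# With these numbers: 40000 / (10 * 40000 + 160000) ≈ 7.1%
```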
Sorry if this doesn't belong here or doesn't make much sense, but if you have any experience or thoughts on this I'd appreciate comments.