ELO chess ranking system applied incorrectly in video games

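In summary: Under a coin-flip variation of the traditional ELO system, where one random point changes hands after each game, a progressively smaller number of players end up with increasingly skewed scores; after two rounds, for example, a quarter of the players would be 2 points above where they started. The thread debates whether the same averaging-out applies when individual ratings are updated from the results of randomly assembled teams.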
  • #1
DragonPetter
Are there any chess players, statisticians, or smart people who are familiar with the ELO ranking system? This concerns a game with over 100,000 players, so it's not an obscure concern.

Basically, I play an online video game that tries to apply an ELO ranking system to rank individual player skill; however, it's a multiplayer, team-oriented game. I think the application of ELO in this case is severely flawed, and it feels logically obvious to me; however, a way to prove it is not immediately obvious, and I would like to get input before I venture into calculations, simulations, and models.

I'll try to describe the method and why I think it's invalid, and then the supporters' reasons for why it's valid.

The games consist of 2 teams, each with 5 players. The gameplay is heavily dependent on team performance, where 1 player can influence the outcome at times, but in general, it takes a team effort to outperform another team. Imagine it being basketball if you want. So, if you are on a team with bad players, you are severely disadvantaged. At the end of the game, all players of the winning team receive a boost to their ELO rating, while players of the losing team receive a deduction from their rating, independent of each player's individual performance.

Now, one of the key flaws I see is that the system applies the team performance to an individual's ranking. If the teams were always constant, as in the players being rated were always on the same team together, this would actually work, because then the ELO rating is applied to the team rather than the individual, and the team has a record of statistical datapoints that are valid. Say some lucky bad player plays with 4 professional players on his team every game; then the rating he has should not represent his personal skill, but rather his team's skill.

In the game I'm playing, each game, or trial if you can call it that, the teams are completely randomized with players that could be very experienced or players that have never played the game before. The system finds 5 players on a team, and it picks 2 teams of random players that have the same average rating (in an attempt to make the matchup fair). So each game a player plays, his team has changed completely, but the end of game results (win or lose) are applied to the individual player and so that random team's performance is applied as a datapoint for someone's individual, non-random, statistical performance. Does this start to sound flawed? The dataset being compiled is basically random events, since the players are picked out of a random pool with an average ranking value.
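To make the scheme concrete, here is a minimal sketch of the update as described so far: every player on a team gets the same rating change, computed only from the two team averages. The game's real formula is not given in this thread, so the standard chess Elo expectation (400-point scale), the step size `K = 32`, and the function names are all assumptions for illustration.

```python
from statistics import mean

K = 32  # assumed step size; the game's actual value is not given in the thread


def expected_score(rating_a, rating_b):
    """Standard chess Elo expectation that side A beats side B (assumed here)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def team_update(winners, losers, k=K):
    """Give every player the same rating change, based on team averages only."""
    expected_win = expected_score(mean(winners), mean(losers))
    delta = k * (1.0 - expected_win)  # identical for all five teammates
    return [r + delta for r in winners], [r - delta for r in losers]


# Two teams with the same average rating (1500) but very different make-ups.
team_a = [1900, 1500, 1400, 1400, 1300]
team_b = [1500, 1500, 1500, 1500, 1500]
new_a, new_b = team_update(team_a, team_b)
print(new_a)
print(new_b)
```

Because the averages are equal, the expected score is 0.5 and every winner gains the same amount that every loser drops, regardless of what any individual did in the match.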

The argument in support of this method is that A) the player pool used to make teams is random and B) the one constant across all trials is the individual player (in other words, the common element to all the random games is that the player being rated was a participant in all of them). With these premises, the individual's influence should begin to average out above the randomness of his team makeup in each event. So with a sufficient number of games played, the accumulated results of the random team matches that he participated in should start to reflect his own performance.

Now, my argument is this: you cannot base an individual's ranking on how the randomized teams he plays on perform. The +/- received from a win or loss should only be applied to the team's performance, rather than an individual's performance. I would also argue that, given the distribution of player skill in the randomized teams and the fact that the player is only 1/10th of the participants in a match, the "averaging" effect that they think should surface is completely drowned out by "noise". I think of it as a signal-to-noise ratio analogy: the 9 other players are the noise floor, and the individual player is below this noise threshold, so his influence cannot be measured.
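One way to test the signal-to-noise question before doing any algebra is a small Monte Carlo run. The sketch below is a toy model, not the game's system: each player gets a hidden skill drawn from a normal distribution, teams are drawn purely at random each round (no rating-balanced matchmaking), the win probability is a logistic function of the difference in summed team skill, and ratings follow a team-average Elo update. Every parameter (`n_players`, `rounds`, `k`, `skill_scale`) is an assumption chosen for illustration.

```python
import math
import random


def expected_score(rating_a, rating_b):
    """Standard chess Elo expectation that side A beats side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def simulate(n_players=200, rounds=300, k=16.0, skill_scale=2.0, seed=1):
    """Play random 5v5 matches and return corr(hidden skill, final rating)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0.0, 1.0) for _ in range(n_players)]  # hidden true skill
    rating = [1500.0] * n_players
    ids = list(range(n_players))
    for _ in range(rounds):
        rng.shuffle(ids)  # brand-new random teams every round
        for start in range(0, n_players - 9, 10):
            team_a = ids[start:start + 5]
            team_b = ids[start + 5:start + 10]
            # Assumed model: the outcome depends on summed hidden skill.
            gap = sum(skill[i] for i in team_a) - sum(skill[i] for i in team_b)
            p_a_wins = 1.0 / (1.0 + math.exp(-gap / skill_scale))
            a_won = 1.0 if rng.random() < p_a_wins else 0.0
            avg_a = sum(rating[i] for i in team_a) / 5
            avg_b = sum(rating[i] for i in team_b) / 5
            delta = k * (a_won - expected_score(avg_a, avg_b))
            for i in team_a:
                rating[i] += delta  # same change for every teammate
            for i in team_b:
                rating[i] -= delta
    return pearson(skill, rating)


correlation = simulate()
print(f"correlation between hidden skill and final rating: {correlation:.2f}")
```

How closely the ratings track hidden skill in this model depends heavily on the assumed skill spread, the number of games per player, and `k`, so varying those is the interesting part; a model where one weak teammate can sink the whole team would need a different win-probability function.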

Sorry if this doesn't belong here or doesn't make much sense, but if you have any experience or thoughts on this I'd appreciate comments.
 
  • #2
The result will be a progressively smaller number of people whose scores are increasingly skewed.

For instance, suppose the following variation on the traditional ELO system where you are scored by your own games only. However, after each game a coin is flipped and the loser of the coin toss gives one ELO point to the winner.

At the end of the first round, half of the players are 1 point ahead, half are 1 point behind. At the end of the second round, half will have correct ELO scores, a quarter will be 2 points ahead, a quarter 2 points behind. Etc.
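The coin-flip example can be worked out exactly, since the net points a player picks up from the flips follow a binomial distribution. A minimal sketch (the function name `coin_flip_offsets` is made up for illustration):

```python
from fractions import Fraction
from math import comb


def coin_flip_offsets(rounds):
    """Exact distribution of net points gained from `rounds` fair coin flips."""
    return {
        2 * heads - rounds: Fraction(comb(rounds, heads), 2 ** rounds)
        for heads in range(rounds + 1)
    }


for rounds in (1, 2, 10):
    print(f"after {rounds} round(s):")
    for offset, share in coin_flip_offsets(rounds).items():
        print(f"  {offset:+d} points: {share} of players")
```

For 2 rounds this gives one half of the players at 0 and one quarter each at +2 and -2, matching the figures above. After 10 rounds only 1 player in 1024 sits at the extreme of +10, which is the "progressively smaller number of people whose scores are increasingly skewed".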
 
  • #3
I think one thing to remember is that the elo in a team game would reflect not how good a player you are but rather how well you work with a team. It would be possible to add some factors that would allow it to value how good a player you are. It's possible that this would lead to people just trying to work the system to get scores suggesting that they are good.

With just about any elo system, the score becomes more accurate the more games you play. Have you taken this into account?
 
  • #4
If I understand the OP, you don't play as a team, you play as a group of 5 individuals. It is only the rating points that are distributed on the basis of the team's result. When you win a game in the traditional ELO system, you get some of your opponent's points. In this system you only get your opponent's points if your team wins. Say you defeat your opponent and the other 4 members of your team lose their games. In the traditional system you would win points. In this system you lose points.
 
  • #5
It seems to me that Jimmy's conclusions are reasonable. For the majority, given a sufficiently large player pool, scores will average out to reasonable levels. A small minority may have been exceptionally lucky or unlucky, consistently paired with players above or below their skill level, and will have a skewed rating.
 
  • #6
Jimmy Snyder said:
If I understand the OP, you don't play as a team, you play as a group of 5 individuals. It is only the rating points that are distributed on the basis of the team's result. When you win a game in the traditional ELO system, you get some of your opponent's points. In this system you only get your opponent's points if your team wins. Say you defeat your opponent and the other 4 members of your team lose their games. In the traditional system you would win points. In this system you lose points.

Hi all, thanks for the replies.

You do play as a team. It's a strategic game where 5 players play together to beat the other team. My problem is that you are then rated as an individual based on your team's performance, and your team changes every match and is randomly chosen.
 
  • #7
I also believe the elo ranking system is flawed in such games (football/soccer too).
For a few months I played a video game in which teams remained fixed from one game to the next. We would create "clans" and were teamed up only with people from our clan against other clans. The number of players wasn't fixed; a game could start 2 vs 5 if there were 5 people from one clan and 2 people from the other.
I was stronger than the average guy on my team; when the season of clans ended, my elo rating went up by almost 200 points and I entered the top 50 or so. However, my skills remained almost the same, so my elo didn't mean anything reliable.

In one of my first games (elo 1500), I played a 3 vs 3 against the strongest guy of the video game (elo around 2200). I managed to beat him almost exclusively in 1 vs 1 but my team mates failed to win the game. Result of the game: I lost elo points, the strongest guy of the video game gained elo... though I personally did better than him in that particular game. Such a ranking system cannot be right. In my opinion elo isn't suited for team games.
P.S.: In that game, if you die entirely but your team manages to win the game, you "win" for the ranking system.
 
  • #8
One thing to remember: rating systems are meant to be a predictor, not a sure thing (and from the sounds of it, the OP is talking about League of Legends?). Even with team games, over a long enough period of time, a k-weighted ELO system should be relatively accurate as long as ratings are used in their appropriate pool: multiplayer ratings used for randomly assigned teams, and so on. Also, something to note for League of Legends: they use a personal rating (hidden from you) to make the matchmaking even more granular.

Magic: The Gathering (and several other 1v1 tabletop games) have all recently dropped the ELO rating system for their competitions. Over a long enough period of time, the ratings became a bit unwieldy (because they weren't using them for matchmaking, just large-scale comparisons). People would sit on their ratings for large events, forgoing small events (i.e., a sufficiently high-rated player would need to go undefeated to not lose significant points at a smaller event). There is also the thought that ELO ratings don't accurately represent games with a chance element.

Finally, especially in a team game something that is being discounted is your skill as a team player. Sure, 1v1 you may beat someone - but 2v2 that other individual may be far better at utilizing his partner. I think that you're discounting "plays well with others" as a measured skill in this case. To your soccer analogy: your striker may have the best kick in the game, but if he's a ball-hog things may not bode well. But again - over a long period of time, in a randomly-selected team environment, the 'better' individual will have a higher rating.
 
  • #9
DragonPetter said:
Hi all, thanks for the replies.

You do play as a team. It's a strategic game where 5 players play together to beat the other team. My problem is that you are then rated as an individual based on your team's performance, and your team changes every match and is randomly chosen.

How do you play as a team? Is there a single game and the 5 of you vote on which move to play? Or do you take turns making moves? Can you clarify for me what it means to play a chess game as a team?
 
  • #10
Ah, League of Legends. My little brother just got me into that game.

I guess the main counterargument to the OP is that the high ELO players are legitimately the best players, and the low ELO players are generally crap. I disagree that your performance in the game will get drowned out by the others in the long run. We've all had games where one guy on either team basically solos the entire game, and we've all had games where some horrible player on one of the teams just feeds the opposing carries.

But, my main point is that players with high ELOs exist, and that is the biggest counterexample to your argument.
 
  • #11
Jimmy Snyder said:
How do you play as a team? Is there a single game and the 5 of you vote on which move to play? Or do you take turns making moves? Can you clarify for me what it means to play a chess game as a team?

It's a real-time game, not turn based. Instead of trying to describe it, let me post a video here. This is a video showing some high ELO players with commentary.

https://www.youtube.com/watch?v=rdgXYwanyDk
 
  • #12
Aha! I thought we were talking about chess.
 
  • #13
Is this where I can jump in and flame LoL over HON/Dota?

Ha.

I kid.

This is interesting, I'd like to see what people think about the system that exists in all the above games.
 
  • #14
fluidistic said:
In my opinion elo isn't suited for team games.

I agree. It was designed for chess, afaik. For chess, and other 1 on 1 games, it's a good predictor. For team competitions that keep pretty much the same personnel from game to game it should be a pretty good predictor of team performance. For rating individuals in team competition, where the team membership changes randomly from contest to contest and individual performance stats aren't taken into account ... definitely no.
 
  • #15
ThomasT said:
I agree. It was designed for chess, afaik. For chess, and other 1 on 1 games, it's a good predictor. For team competitions that keep pretty much the same personnel from game to game it should be a pretty good predictor of team performance. For rating individuals in team competition, where the team membership changes randomly from contest to contest and individual performance stats aren't taken into account ... definitely no.

Disagree, because your individual performance will impact whether or not your team wins. So, better players will win more often regardless of the rest of their team.

Again, the biggest counterargument is that the strongest players have the highest ELOs, even in solo queue. This fact cannot be explained if ELO isn't related to individual performance.
 
  • #16
Jack21222 said:
Disagree, because your individual performance will impact whether or not your team wins. So, better players will win more often regardless of the rest of their team.

In the game where I was teamed up with my clan mates rather than randomly, my elo was in the 1400-1500's. When the "season" was over, the team would be randomly created. My elo suddenly went up to high 1700's, my skills however remained the same. I'm not the only one to whom this happened, many people criticized the elo ranking system for that particular game due to this and totally unbalanced games where one could guess the outcome of the game from start even regardless of what the elo had to say. The same would apply even when the teams would be randomly balanced. In that particular game economy (think of a starcraft-like one's) is shared. If you have a noob in your team and he's wasting all the economy on useless stuff, even the best player can't do much to win the game.

Jack21222 said:
Again, the biggest counterargument is that the strongest players have the highest ELOs, even in solo queue. This fact cannot be explained if ELO isn't related to individual performance.

This did not happen in my game (game name is Zero-K and the seasons of clan teams is called planet wars).
 
  • #17
fluidistic said:
In the game where I was teamed up with my clan mates rather than randomly, my elo was in the 1400-1500's. When the "season" was over, the team would be randomly created. My elo suddenly went up to high 1700's, my skills however remained the same. I'm not the only one to whom this happened, many people criticized the elo ranking system for that particular game due to this and totally unbalanced games where one could guess the outcome of the game from start even regardless of what the elo had to say. The same would apply even when the teams would be randomly balanced. In that particular game economy (think of a starcraft-like one's) is shared. If you have a noob in your team and he's wasting all the economy on useless stuff, even the best player can't do much to win the game.

This did not happen in my game (game name is Zero-K and the seasons of clan teams is called planet wars).

You're talking about a different game than the OP is, so I have no comment on that.

In League of Legends, there are team queues and solo queues. The teams generally have lower ELOs than individuals, so I don't know how meaningful it is to compare the ELO of a team vs the ELO of an individual, as you seem to be doing. However, the point stands that you're talking about a completely different game.
 
  • #18
Jack21222 said:
You're talking about a different game than the OP is, so I have no comment on that.

In League of Legends, there are team queues and solo queues. The teams generally have lower ELOs than individuals, so I don't know how meaningful it is to compare the ELO of a team vs the ELO of an individual, as you seem to be doing. However, the point stands that you're talking about a completely different game.

My fault, in post #15 I thought you were answering to any team game rather than League of Legends in particular.
 
  • #19
Glad people started using game names, I didn't want to appear as a nerd too badly. The game I'm referring to is HoN.

I'm also glad to see people agreeing with me. But does anyone know how to do a mathematical analysis to prove it's invalid?
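A rough starting point for such an analysis, under assumptions that are not taken from any game's documentation: treat each game as an independent trial, suppose matchmaking is balanced so that an average player wins half the time, and suppose the player being rated wins with probability p = 1/2 + delta, where delta (a hypothetical symbol) is the edge his own skill adds. The observed win rate after n games then has a standard error that shrinks with the square root of n, which gives an estimate of how many games are needed before the edge stands z standard errors clear of the noise:

```latex
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} \approx \frac{1}{2\sqrt{n}},
\qquad
\delta \ge \frac{z}{2\sqrt{n}}
\;\Longleftrightarrow\;
n \ge \frac{z^{2}}{4\delta^{2}} .
```

With z = 2, an edge of delta = 0.05 (a 55% win rate) needs roughly 400 games, and delta = 0.02 needs roughly 2500. This ignores the way rating-based matchmaking feeds back on delta, so it does not settle the question by itself; it only frames the "noise floor" argument as a question about how large delta is and how many games are played.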
 
  • #20
Jack21222 said:
Ah, League of Legends. My little brother just got me into that game.

I guess the main counterargument to the OP is that the high ELO players are legitimately the best players, and the low ELO players are generally crap. I disagree that your performance in the game will get drowned out by the others in the long run. We've all had games where one guy on either team basically solos the entire game, and we've all had games where some horrible player on one of the teams just feeds the opposing carries.

But, my main point is that players with high ELOs exist, and that is the biggest counterexample to your argument.

Thanks for the counterexample, and this brings up some subtle points I forgot to make originally.

First, most of the high rank players also tend to play with other high rank players, and play in organized teams rather than randomized teams. If not organized teams, they usually at least have a buddy they play with consistently. From my first post, I mention that the ranking system becomes more accurate as the makeup of the team remains unchanged rather than randomized. I highly doubt the "strongest" players can do much when they are thrown back down to below the average rank and have their hands tied by 4 beginners.

Secondly, as the "randomness" tilts you in one direction or another, you start to notice a landslide effect. If I go on a bad streak, my rank takes a dive. If I get on a winning streak, I tend to stay up at that position until I get bad luck (horrible teammates) again.

It's because if you start to win a couple, the system starts to pair you with other people who have won recently, and then these winners help to pull you further away from the average. It has very little to do with your own actual skill ranking.
 
  • #21
DragonPetter said:
Thanks for the counterexample, and this brings up some subtle points I forgot to make originally.

First, most of the high rank players also tend to play with other high rank players, and play in organized teams rather than randomized teams. If not organized teams, they usually at least have a buddy they play with consistently. From my first post, I mention that the ranking system becomes more accurate as the makeup of the team remains unchanged rather than randomized. I highly doubt the "strongest" players can do much when they are thrown back down to below the average rank and have their hands tied by 4 beginners.

Secondly, as the "randomness" tilts you in one direction or another, you start to notice a landslide effect. If I go on a bad streak, my rank takes a dive. If I get on a winning streak, I tend to stay up at that position until I get bad luck (horrible teammates) again.

It's because if you start to win a couple, the system starts to pair you with other people who have won recently, and then these winners help to pull you further away from the average. It has very little to do with your own actual skill ranking.

First: In League of Legends, organized teams are ranked separately from individuals in random teams, so this issue does not happen. It may be different in your game.

Second, the "landslide effect" happens in chess too. The higher-ranked the players you beat, the more points you get. The more points you get, the higher-ranked the players you play.

Third, one higher-level player can carry an entire game even with 4 weak teammates. Sometimes, higher-level players will create new "summoner" profiles and play in the lower-level games. When this happens, they usually dominate the game. It's abundantly clear who has level 30 alts and who is a legitimate level 10. In the low-level queues, if you get a high-level player on your team, you're virtually guaranteed a win.

I don't know if you play League of Legends, but watch some of the "top plays of the week" videos on YouTube. As a fairly new player, I can look at those plays and tell that those players are very good. It is no accident that they have high ELOs.
 
  • #22
Jack21222 said:
First: In League of Legends, organized teams are ranked separately from individuals in random teams, so this issue does not happen. It may be different in your game.

Second, the "landslide effect" happens in chess too. The higher-ranked the players you beat, the more points you get. The more points you get, the higher-ranked the players you play.

Third, one higher-level player can carry an entire game even with 4 weak teammates. Sometimes, higher-level players will create new "summoner" profiles and play in the lower-level games. When this happens, they usually dominate the game. It's abundantly clear who has level 30 alts and who is a legitimate level 10. In the low-level queues, if you get a high-level player on your team, you're virtually guaranteed a win.

I don't know if you play League of Legends, but watch some of the "top plays of the week" videos on YouTube. As a fairly new player, I can look at those plays and tell that those players are very good. It is no accident that they have high ELOs.

I must just say the common phrase "correlation does not imply causation". Just because good players also have high ELOs does not mean being a good player gives high ELO.
 
  • #23
DragonPetter said:
I must just say the common phrase "correlation does not imply causation". Just because good players also have high ELOs does not mean being a good player gives high ELO.

It does when there's a specific mechanism that mathematically gives good players a high ELO.
 
  • #24
Jack21222 said:
It does when there's a specific mechanism that mathematically gives good players a high ELO.

That is a possibility . . or you could be doing the mathematical equivalent of running in circles.

A good first step would be to have the formula they use in front of us. Then we could evaluate how closely tied the two are mathematically.
 
  • #25
DragonPetter said:
That is a possibility . . or you could be doing the mathematical equivalent of running in circles.

A good first step would be to have the formula they use in front of us. Then we could evaluate how closely tied the two are mathematically.

They don't give the exact formula, but it's a modified version of the chess one.

http://na.leagueoflegends.com/learn/gameplay/matchmaking gives some details.

If you want the math of the chess formula, which they modified, you can find that here:

http://en.wikipedia.org/wiki/Elo_rating_system#Mathematical_details
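For quick reference, the standard chess version has two parts: an expected score for player A against player B, and an update that moves A's rating toward the actual result. Here R_A and R_B are the current ratings, S_A is the actual score (1 for a win, 0.5 for a draw, 0 for a loss), and K is a step size; the value K = 32 used in the example is only a common illustrative choice, not a figure from League of Legends.

```latex
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}},
\qquad
R_A' = R_A + K\,(S_A - E_A)
```

For two equally rated players E_A = 0.5, so a win with K = 32 is worth 16 points. A 200-point favorite has E_A of about 0.76, so with the same K he gains only about 8 points for winning and loses about 24 for losing.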
 
  • #26
fluidistic said:
In the game where I was teamed up with my clan mates rather than randomly, my elo was in the 1400-1500's. When the "season" was over, the team would be randomly created. My elo suddenly went up to high 1700's, my skills however remained the same. I'm not the only one to whom this happened, many people criticized the elo ranking system for that particular game due to this and totally unbalanced games where one could guess the outcome of the game from start even regardless of what the elo had to say. The same would apply even when the teams would be randomly balanced. In that particular game economy (think of a starcraft-like one's) is shared. If you have a noob in your team and he's wasting all the economy on useless stuff, even the best player can't do much to win the game.

I disagree with this. I would say your elo deserved to be in the 1400s-1500s because your team was presumably not as good as the teams you were randomly matched with.

Starcraft does this correctly IMO because you have a ranking for every team you play with, as well as a "Random Team" ranking. Sure, if you play by yourself you'll get crappy players but your ranking is accurate on average.

Halo Reach broke off from this type of ranking system from the previous games in the series, and it was part of the reason that I stopped playing. They had some sort of voodoo to figure out how well you as an individual did in a team game. Winning was no longer the objective in team games, because it was no guarantee that your rank would increase. To rank up, you basically needed a lot of kills, which took a lot of the strategy out of the game since everyone went into run and gun mode most of the time.
 
  • #27
JaWiB said:
I disagree with this. I would say your elo deserved to be in the 1400s-1500s because your team was presumably not as good as the teams you were randomly matched with.

Point taken. I honestly do not know what would have been my approximate elo. But there's a reason: for that particular game, the elo system was so bad that intuition would work much better. I used to bet on the outcome from start, I once predicted the right outcome 11 games in a row, while the !predict command based on elo failed totally. That command's output was something like "team 1 has 65% chance to win vs team 2". While intuition could be "team 1 has 0% chance to win vs team 2".
I knew many players because I'd spectate games (you can put all your attention into a particular player and therefore learn from him). I knew for instance a guy rated 1500 that could kill about 4 people before dying but his clan/team was so bad that he'd still lose elo points compared to an "average player" in an opposite team despite being a really strong player. I could not beat that particular guy in 1 vs 1 even though my elo was in the 1700's, even if I had played 10 games in a row. I know this for sure. In 1 vs 1 we usually choose small maps and in that game this means that the commander (special unit which is customizable) has a very important role. That guy's commander was a beast and he knew very well how to manage it.

Another example of elo being poorly applied in that game is that one would win or lose more elo points when there were fewer players per team. I've seen a 1300 elo player teamed up with the strongest player (elo 2200) vs an average player of 1500: 2 vs 1. The noob (1300 elo) would give his commander to the pro player, who would win any time he wanted. The noob, who almost didn't play, got a boost in elo, and the pro player did too, while the average guy, who could be a good player all in all, got a huge drop in his elo points. After a few games like this you end up with a good player rated 1300, a bad player rated 1400 and a pro player rated 2300. That's just terrible.
 
  • #28
[nitpick]
It's actually Elo, not ELO. It's a person's last name. The guy who invented the system was named Arpad Elo.
[/nitpick]
 
  • #29
Fredrik said:
[nitpick]
It's actually Elo, not ELO. It's a person's last name. The guy who invented the system was named Arpad Elo.
[/nitpick]

Oh sorry. I'm always thinking of the band when I read it.
 

FAQ: ELO chess ranking system applied incorrectly in video games

What is the ELO chess ranking system?

The ELO chess ranking system is a method for calculating the relative skill levels of players in a competitive game, originally developed for chess but now used in many other sports and games. It assigns a numerical rating to each player based on their performance against other players, with higher ratings indicating a higher skill level.

How is the ELO chess ranking system applied to video games?

In video games, the ELO system is often used to match players of similar skill levels in online multiplayer games. As players win or lose matches, their ELO rating will increase or decrease accordingly, allowing for more balanced and competitive gameplay.

What are some common mistakes in applying the ELO chess ranking system to video games?

One common mistake is using the ELO system without adjusting for factors specific to video games, such as team composition or individual player performance. This can lead to inaccurate rankings and unfair matches. Another mistake is not regularly updating ELO ratings, which can result in outdated rankings and imbalanced gameplay.

How can the ELO chess ranking system be applied correctly in video games?

To apply the ELO system correctly in video games, it is important to take into account the unique factors and mechanics of the specific game. This may involve adjusting the formula used to calculate ratings or implementing additional systems, such as performance-based adjustments or decay of inactive players' ratings. Regular updates and adjustments to the system are also crucial for maintaining accurate rankings.

Are there any alternative ranking systems that may be more suitable for video games?

Yes, there are other ranking systems that have been developed specifically for video games, such as the TrueSkill system used by Microsoft for their online gaming services. These systems take into account more complex factors and may provide more accurate rankings for video game players.
