Overview
Each NFT submitted into Ranked Battle is ranked on a global leaderboard.
The Ranked Battle mode currently uses an Elo scoring system.
Elo
Elo, named after Arpad Elo, is the basis for ranking systems in many games, ranging from competitive sports to chess. In a typical Elo system, players start with the mean score of 1500. If players perform well (win against their opponent), their Elo score rises. If they perform poorly (lose against their opponent), their Elo score falls. The system balances at the mean score: from an Elo score perspective, each match is zero-sum (the winner’s score rises as much as the loser’s score falls).
Over time, each player’s score should accurately represent their skill relative to the other players on the leaderboard. In fact, the system is designed so that the difference between two scores represents the expected win rate. Specifically, a player ranked 200 points higher than another player is expected to win against the lower-ranked player ~76% of the time, while a player ranked 100 points higher than another player is expected to win against the lower-ranked player ~64% of the time.
This also means that upset wins affect the scores more. For instance, in a match between two players with vastly different scores, if the player with the lower score wins, their score will go up significantly. However, if the player with the higher score wins, their score may not change.
Elo Methodology
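Let $R_A$ and $R_B$ be the ratings of Player $A$ and Player $B$, respectively.
The expected winning rate of Player $A$ is calculated as follows:
$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}$$
The expected winning rate of Player $B$ is calculated as follows:
$$E_B = \frac{1}{1 + 10^{(R_A - R_B)/400}}$$
$E_A$ and $E_B$ can be interpreted as the probabilities that Player $A$ and Player $B$, respectively, win the matchup against each other.
For example, if Player $B$ is rated 100 points higher than Player $A$ ($R_B = R_A + 100$), then $E_A \approx 0.36$ and $E_B \approx 0.64$. This means that Player $A$ has an expected chance of around 36% of winning against Player $B$, while Player $B$ has an expected chance of around 64% of winning against Player $A$.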
Since each match is played between two players, the ideal matchups are those where the skill difference between the two players is minimized, in other words, a matchup where both expected winning rates are equal, $E_A = E_B = 0.5$, which is possible only if the two ratings are equal, $R_A = R_B$.
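The expected winning rate can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard Elo logistic formula with a scale of 400, which reproduces the ~64% and ~76% figures quoted earlier; the function name `expected_score` and the sample ratings are hypothetical and not taken from the game's code.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected winning rate of the first player against the second.

    Assumes the standard Elo formula with a scale of 400.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Illustrative ratings only: Player B is rated 100 points above Player A.
e_a = expected_score(1500, 1600)
e_b = expected_score(1600, 1500)
print(round(e_a, 2), round(e_b, 2))  # 0.36 0.64

# Equal ratings give an even matchup.
print(expected_score(1600, 1600))  # 0.5
```

The two expected winning rates always sum to 1, which is what later makes each rating update zero-sum.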
Let Player $A$ be the instigating player (from whose perspective we consider the results of the match). The match result $S_A$ is recorded as $1$, $0.5$, or $0$, depending on whether Player $A$ wins, ties, or loses.
The adjusted Elo rating after the match for Player $A$ is calculated as follows:
$$R'_A = R_A + K\,(S_A - E_A)$$
The adjusted Elo rating after the match for Player $B$ is calculated as follows:
$$R'_B = R_B + K\,\big((1 - S_A) - E_B\big)$$
$K$ is known as the K-factor or the development coefficient, and it’s the maximum absolute amount a player’s score can change in one Elo update. In other words, the higher the K-factor, the more volatile the player’s Elo rating will be. See the Elo Adjustments section for the methodology for determining $K$.
In the calculations for both players, the K-factor of the instigating player, in this case Player $A$, will be used.
Since we use the same K-factor to update both parties and since the expected winning rates sum to one ($E_A + E_B = 1$), the points gained by the winning party are exactly the points lost by the losing party. The player with the higher Elo score also has more to lose: the higher-rated player gains less than the lower-rated player for a win but loses more than the lower-rated player for a loss. This is because the higher-rated player is expected to be more likely to win (in the example, $E_B$ was higher than $E_A$), and thus if they win it’s less of an upset than if the lower-rated player won. Furthermore, even if the two players tie, the lower-rated player still gains Elo score from the higher-rated player.
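The update rule and its zero-sum property can be checked with a short sketch. It assumes the standard Elo update (new rating = old rating + K × (actual result − expected result)) applied to both players with the instigating player's K-factor; the function names, the sample ratings, and the K-factor of 20 are illustrative assumptions rather than values from the game.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected winning rate of the first player (standard Elo, scale 400)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, result_a: float, k: float):
    """Return the adjusted ratings of A (the instigator) and B.

    result_a is 1, 0.5, or 0 for a win, tie, or loss from A's perspective.
    k is the instigating player's K-factor and is applied to both players.
    """
    e_a = expected_score(rating_a, rating_b)
    e_b = 1.0 - e_a  # the two expected winning rates sum to 1
    new_a = rating_a + k * (result_a - e_a)
    new_b = rating_b + k * ((1.0 - result_a) - e_b)
    return new_a, new_b


# Lower-rated A (1500) upsets higher-rated B (1600) with K = 20.
win_a, win_b = update_ratings(1500, 1600, 1, 20)
print(round(win_a, 1), round(win_b, 1))  # 1512.8 1587.2

# A tie still moves points from the higher-rated to the lower-rated player.
tie_a, tie_b = update_ratings(1500, 1600, 0.5, 20)
print(round(tie_a, 1), round(tie_b, 1))  # 1502.8 1597.2
```

In both cases the amount gained by one player equals the amount lost by the other, and no single update moves a rating by more than K.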
Elo Adjustments
The methodology for determining a player’s K-factor is taken from FIDE Rating Regulations. The system is dynamic and the criteria are as follows:
- $K = 40$ → This K-factor is used for players who have played fewer than 30 games; we want more variability to speed up Elo score discovery. The higher initial K-factor adds extra variability and ensures that new players can be quickly moved into a range commensurate with their skill.
- $K = 20$ → This K-factor is used for players who have played more than 30 games and have a score of less than $2400$. More experienced players are assumed to already be in a range commensurate with their skill; less variability in rating changes assists with more accurate matchmaking.
- $K = 10$ → This K-factor is used for players who have played more than 30 games and have a score of over $2400$; the K-factor remains at $10$ even if the player’s Elo drops below $2400$. Highly-experienced and highly-skilled players do not need as much variability in rating changes to have an accurate score.
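The criteria above can be expressed as a small selection function. This sketch assumes the values from the FIDE Rating Regulations (K-factors of 40, 20, and 10, with a rating threshold of 2400). The function name, the `has_reached_2400` flag used to remember that a player once crossed the threshold, and the treatment of the exact boundaries (exactly 30 games, exactly 2400) are assumptions made for illustration.

```python
def k_factor(games_played: int, rating: float, has_reached_2400: bool) -> int:
    """Pick a K-factor following FIDE-style criteria (assumed values)."""
    if games_played < 30:
        # New players: high variability for fast Elo score discovery.
        return 40
    if has_reached_2400 or rating >= 2400:
        # Stays at 10 even if the rating later drops below the threshold.
        return 10
    # Experienced players below the threshold.
    return 20


print(k_factor(5, 1500, False))    # 40
print(k_factor(100, 1800, False))  # 20
print(k_factor(100, 2450, False))  # 10
print(k_factor(100, 2300, True))   # 10
```

A caller would need to persist the `has_reached_2400` flag per player, since the current rating alone cannot tell whether the threshold was ever crossed.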
Match Ups
When a player initiates a fight with their NFT (we will refer to this as the Challenger NFT), the matchmaking system randomly selects a group of NFTs that are within an acceptable deviation of the Challenger NFT’s Elo, up to a target opponent pool size. A random opponent is then selected from the opponent pool and matched against the Challenger.
Example
➡️ Challenger NFT Elo: 1,600
➡️ Acceptable Elo Deviation: 100
➡️ Opponent Pool Elo Range: 1,500 ≤ x ≤ 1,700
➡️ Opponent Pool Size: 30¹
Opponent Pool Makeup
15 opponents will be selected with an Elo between 1,500 and 1,600, and 15 opponents will be selected with an Elo between 1,600 and 1,700.
Notes:
- The number of players in the opponent pool will be finalized closer to the launch of the game.
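The matchmaking flow described in this section could look roughly like the sketch below. It is not the game's implementation: the function names, the `(nft_id, elo)` tuple layout, the choice to place an NFT with exactly the Challenger's Elo in the upper half, and the behavior when a band holds fewer candidates than requested are all assumptions.

```python
import random


def build_opponent_pool(challenger_elo, candidates, deviation=100,
                        pool_size=30, rng=None):
    """Draw half the pool from below and half from above the Challenger's Elo.

    candidates is a list of (nft_id, elo) tuples that excludes the Challenger.
    """
    rng = rng or random.Random()
    lower = [c for c in candidates
             if challenger_elo - deviation <= c[1] < challenger_elo]
    upper = [c for c in candidates
             if challenger_elo <= c[1] <= challenger_elo + deviation]
    half = pool_size // 2
    # Take fewer if a band does not hold enough candidates (assumed behavior).
    pool = rng.sample(lower, min(half, len(lower)))
    pool += rng.sample(upper, min(half, len(upper)))
    return pool


def pick_opponent(challenger_elo, candidates, rng=None, **pool_options):
    """Select one random opponent from the pool, or None if the pool is empty."""
    rng = rng or random.Random()
    pool = build_opponent_pool(challenger_elo, candidates, rng=rng, **pool_options)
    return rng.choice(pool) if pool else None


# Hypothetical leaderboard: one NFT at every Elo from 1,400 to 1,800.
leaderboard = [(nft_id, 1400 + nft_id) for nft_id in range(401)]
opponent = pick_opponent(1600, leaderboard, rng=random.Random(0))
print(opponent)  # an (nft_id, elo) tuple with 1,500 <= elo <= 1,700
```

Splitting the pool evenly around the Challenger's Elo keeps the expected opponent rating close to the Challenger's own, which matches the goal of minimizing the skill difference in each match.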