Towards the improvement of matchmaking for MMOGs

Get Complete Project Material File(s) Now! »

Matchmaking approaches for MMOGs

Solving the matchmaking problem will not benefit the gaming industry only. In [28] students are automatically matchmade into groups that fit their skills and academic requisites in order to form groups that are more suitable for each student. And [29] uses a matchmaking system to improve the e-marketplace.
Managing to match players both accurately and fast is an open issue and game developers often communicate their approaches publicly [13]. This is to reassure the community about both the efficiency and the fairness of the system. Matchmaking systems inherit from the trueskill algorithm by Microsoft research [30] as a base to evaluate player performance in the game. Besides Elo performance (described in the following section), other criteria matter just as much in a cooperative game, such as the spoken language. In [19, 31–33] latency is the parameter studied and shown as requiring a balance between matched entities, even though games rarely take latency into account for matchmaking.
In [20], the matchmaking of Ghost reckon online is deeply analyzed but this game does not rely on the same conditions as MMOGs as it targets FPS (First Person Shooter) games. More advanced solutions use agents to solve the matchmaking problem [15, 16, 28] but all of them lack a performance study in a real environment to verify its efficiency. Most of the solutions use P2P network and does not offer possibilities for a centralized server architecture [18, 29]. As the current MMOGs rely on client/server architectures, those existing solutions cannot be applied without an intensive redesign of MMOGs platforms.

Elo ranking

In order to introduce the gaming matchmaking concepts I need to briefly introduce the Elo ranking and its definitions.
The Elo rating system [34] created by Arpad Elo is an improved method for calculating the relative skill levels of chess players. Today many other games, including multiplayer competitions, have adapted it for their own use. Thereafter, I use ranking, Elo, or MMR (short for MatchMaking Rating) to talk about player ranking values.
It is assumed that a player’s performance varies from game to game according to an approximately normal distribution; a person’s Elo rating is the mean of that distribution. A person with a high Elo is someone who performs better on average than a player with a low Elo. This score is determined by win/loss statistics with respect to other players using a variant of the algorithm called the algorithm of 400. For players A and B with respective Elo ratings of Ra and Rb, the probability Ea of a victorious outcome for player A is computed as follows:
For each 400 points of distance between players, the player with the highest score is ten times more likely to win as the other player. This is the standard computation for Chess and it may differ in League of Legends. The actual outcome of a game is compared to the expected outcome and each team/player rating is adjusted accordingly. As a result, the score of a victorious team changes less if it was expected to win than if it was expected to lose. Successive games eventually bring each player/team to a point where they are expected to win 50% of the time against opponents of equal score.
K value is associated with each player to inhibit Elo changes after games between evenly matched opponents and to prevent inflation of the highest Elo ratings. The reason for the latter is that excessive Elo differences produce a feeling of unachievable goal for new players. In chess, the initial K value is big (25 for the 30 first games) resulting in large changes in Elo. Thus a player can rapidly find her/his correct place in the ranking system. As the number of wins and losses becomes more even, K decreases gradually (K = 15 to 7).

League of Legends case study

This section describes my study case: the matchmaking service of League of Legends (LoL). LoL is a popular game with a community regrouping over 12 million players worldwide per day in 2012 ; this ensures an abundance of user traces from a genuinely successful online game.
It is a competitive game where ranking is a central element; as such, ranking is a precious metric for studying player behaviours and extrapolating quality of service. Finally it is a session-based game with relatively short sessions, averaging around 34 minutes according to the game developers. Its matchmaking service is used often and plays an important role in the overall game experience.
League of Legends is a multiplayer online battle arena (MOBA) video game developed and published by Riot Games. LoL players are matched in two opposing sides comprising five teammates each. Both teams fight in an arena in order to destroy the enemy team’s main building called nexus.
The LoL server definition of a game is the brawling session that pits players against each other inside the arena. I will therefore use this term to refer to the players’ games history.
Three factors are crucial to the gaming experience: waiting times, matching accuracy, and server response times. With an average waiting time of 90 seconds in between ses-sions, players can spend a lot of time waiting for a session to begin (see Subsection 3.2.1). This is especially true for very skilled players: their scarcity makes it harder to match 10 such players together. Matching players with disparate skill levels reduces waiting times significantly, however it can reflect poorly on the experience. Unskilled players will feel helpless against much stronger opponents, while the latter will spend time on an unchallenging session with little or no reward to look forward to. Orthogonal to these issues, server response time is crucial in this game which requires extremely sharp re-flexes. Lags caused by the servers often increase the ping by up to 300%, which severely impedes the gameplay.
LoL is a competitive game where ranking takes a significant part, both in terms of matchmaking and in terms of player status. Rank defines the skill level of a player, and rank improvement is the main goal of most LoL players. Once players are ranked, they get placed in competition categories called Leagues, and subcategorized into Divisions. This ranking, as it can evolve quickly over time, needs to be computed precisely and that is why most games use the Elo rating system. Thereafter, I use ranking, Elo, or MMR (short for MatchMaking Rating) to talk about player ranking values.
Figure 2.1 describes the equivalence between Leagues and the rank of a player. The formulas used in League of Legends for ranking calculations have not been disclosed publicly. However, most ranking implementations share the same bases inherited from the original Elo rating system. Riot Games developers do divulge interesting information about the matchmaking in LoL through its website and dedicated forums [35] , though. Based on this information and on my own data analysis, I inferred the general rules that guide LoL player matching.
League of Legends uses a system with dynamic K values. All players start with an Elo of 1200 for their 10 first games. From there they are assigned a score, and changes occur smoothly according to the wins and losses.
League of Legends players can join several types of games, associated with different queues on the servers. A group of persons joining the same queue in order to play together in the same team is called a premade. A premade can comprise from one to five players. Normal games do not count towards official ranking, whereas ranked games do and are only open to seasoned players (above level 30). My study focuses mainly on ranked games since they draw together players whose statistics are more likely to be representative of the game’s core community. Also, fair matchmaking is more critical for ranked games since they induce real stakes for the players.
According to Riot Games developers, matchmaking in LoL is based on: player ranking, the experience of each player (number of games played), and the size of the premade.
Player ranking weighs most since it has the highest impact on the outcome of the game. LoL tries to match teams as fairly as possible: it computes the average of the player ranking values for each team, and then uses these averages to match teams similarly to one-versus-one fights. This solution speeds up the matching of multiple players, yet it may result in an unbalanced game if two players with very different rankings join in a premade since the least skilled player is likely to face a highly ranked opponent. In a competitive context, this kind of quick fix is very unsatisfactory.
Adding the personal number of played games as a matching criterion alleviates this issue, but it also increases the average waiting time drastically [36](roughly around 50%). This shows the direct impact on the average waiting time of adding a single preference to the matching system.
Premades also increase the waiting time, and can even induce unbalanced battles. If the matching system cannot find an opposing team with a premade of the same size and level in the allotted time (roughly 30 seconds for a normal game), it gathers single players with higher individual ranks to build an opponent team. The higher rank is supposed to compensate for lesser coordination in the opposing team, but usually leads to a victory of the higher ranked players.

Cheating in MMOGs

In any MMOG the community of players builds a real economy in the virtual world, either in terms of ranking or in terms of virtual currency. Yet the community can have a negative influence on the game experience as players cheat in order to get unfair advantages over other players. One does not want to join a game that is full of cheaters.
There are many ways to cheat in MMOGs, which range from small, undetected advan-tages acquired by the player to deep modifications of the world state that completely unbalance the game economy.

Gold farming – use of normal behavior for illegitimate reasons

Some cheaters pay third party services to play on their accounts while they are away from their computers. During the hours the player is not connected service employees come to take his place and play for a determined amount of time.
These cheaters gather an unfair advantage quickly: any particular resource they re-quested when purchasing the service. This behavior known as gold farming is illegal as the player have to spend money for the service, which renders the player financially advantaged in a game where all players are supposed to have the same odds. It is called gold farming as it is the most common currency of MMORPGs.
Gold farming is very difficult to detect, given the quasi-standard behavior of employees who gather the currency for the player. If the company that provides this gold-farming service is always in the same area, a study of the rate of currency generated by the game allows identification of gold farmers.

Trust exploit

Another category of cheating attempts, and the easiest one to implement is the trust exploit. It relies on exploiting the confidence a player can have into a service while the service is actually malicious and gathering the player sensible information. These are often easy to implement because they rely on exploiting software weaknesses.
An example of such cheating is the ”map hack” that reveals an entire map when only a part of the map is supposed to be visible to the player. These hacks are possible when weak servers [7] actually send the entire map information to the player and hide it graphically in the game. The cheater only has to block this ”in-game hiding” by manually overriding a setting in the graphical driver parameters to view the entire map.

Bug/hacks exploit

”Bug exploit”, ”Hack exploit”, those terms express the most common type of bugs that users will exploit in order to get advantages. Those bugs target two type of modifications: short-term and long-term.
Short-term modifications aim at modifying game content directly in the RAM. An exam-ple of such changes is FunnyPullBack [37] in LeagueOfLegends. This software developed by a third party injects its code into the game client, and returns a snapshot of the game state to a script engine. The scripting engine offers to execute commands on the game after analysis of the game state. With this tool, users can create their own scripts to automate actions in the game very accurately.
Long-term modifications address data saved on the storage medium of the client. These attacks are harder to achieve due to the protections of the program deployed by the server. But they offer durable benefits for the cheater: changes to backups of his avatar, its appearance, its objects.
When insufficiently checked by the server this category of cheats often leads to games full of cheaters. As an example, Diablo2 [38], had two multiplayer modes: Closed, and open The closed mode forces to create characters on blizzard servers (and were therefore controlled by the server), the open lets users play characters stored on their hard disk and connect to one another in a peer to peer model. Once the correspondence between hexadecimal backups and the character’s characteristics in game had been made, a team created the Hero Editor. The open quickly became the kingdom of cheaters fighting each others.
The legitimate playerbase resorted to closed servers as it was their only option. In closed servers, avatars were deleted if inactive and players faced an added latency compared to local area network (LAN) gameplay, which reduced their game experience quality.

Cheating by destruction / deterioration of the network

In a game with ranking, protection against attacks that target the network is crucial. A cheater that manages to prevent opponents from joining the game can get easy victories by forfeit, climbing in the rankings.
A variant of this cheat is the creation of fake accounts to raise the level of a real player account. By influencing the system so that the server matches the legitimate account with the fake ones, a cheater mimics a sybil attack.
Of Legends 6: a player was using the community to organize games where he could easily win [39].
Another network related cheat is to delay data emissions. In FPS or fighting games where response time is important, when a player takes an abnormal amount of time to answer a technique is used to avoid artifacts of teleporting players to appear: the player avatar will follow an estimated trajectory based on the direction vector.
A well known cheat is to use this technique to make enemy players see an action in a fake direction, or froze the avatar. The cheater sends two messages: one announcing a start in the wrong direction, and the other to take advantage of the surprise generated by the abrupt change. A simpler and mechanical way to do so is to use a lagswitch [40] that will block any receiving/incoming data. If the player is hosting the game, he will still be able to play locally while everybody will see a frozen avatar. On a second press of the button, the console will then resume messages emissions and update the other players.

READ  Computational foundations of bilevel programming 

Cheat detection

This section present the possible countermeasures to cheat in MMOGs. Solutions can either be hardware based, or by creating control authorities to assess players actions.

Solutions based on hardware

Solutions based on hardware have been adopted by manufacturers of video game consoles from the beginning. Hardware components provide the means to check whether contents loaded in the console are legal or not. For example, upon insertion of a DVD into a Nintendo Wii console, the system input/output driver moves the DVD reader head in the security zone of the DVD before executing any game code section. This allows signature control and possible rejection of the inserted disk. The Nintendo 3DS goes further: it stops working indefinitely if an illegal cartridge is inserted [41].
The problem with this solution is that the protection is embedded in the hardware and becomes useless once it has been circumvented. ”Chip” soldering on console mother-boards re-routes the security check and always claims that the content is legitimate.
Sometimes the system console software itself includes a fault that can go undetected by hardware verification. Such was the case of the Wii, which had a breach in one of its first games. A software design flaw allowed players to get the keys used to sign content: a player managed to recover backup data. By modifying a backup and signing with the key, cheaters gained access to an area called ”debug”. Once the fault was found, hackers could exploit the debug area and inject a buffer overflow [42]. Sony PS3 supported the linux operating system virtualized on the first PS3 generation, which allowed access to a runlevel where a malicious user could execute unsigned code and cause a long chain of security breaches that finally ended in running unsigned content on the console [43].
A thorough analysis of all cheat detection systems describes the long list of hardware mechanisms that exist as been made by Feng, Wu-chang and Kaiser [44]:
o Study of RAM,
o Listening to system calls,
o Listening to hardware,
o Using coprocessors to check calculations etc.
Hardware based solutions alone are too vulnerable in the long term. MMOGs specific solutions using hardware are cited in the literature such as: keyboard key-press detec-tion [45], for example, to discover bots. Bots are software used to perform actions on behalf of the user. They enter orders directly into the game client without using any input device.
StarCraft2 includes a protection-oriented listening hardware [46] that prevents automatic actions by allowing only one in-game action per keystroke.
World of Warcraft also included tests to detect bots automatically. By analyzing the keyboard / mouse activity they automatically disconnect players after a long period of inactivity.(around 30 minutes).
Hardware detectors are rarely used alone because they cannot be updated and therefore do not offer long-term security. Once a cheater exploits a flaw in the physical protection the game company cannot change all previous versions of the hardware, rendering it obsolete.
Downloadable content is strongly disliked by most players, but it is very convenient alongside hardware solutions. If the executable binary game is decrypted, the game company can apply and distribute an update that changes the encryption system. There is no backward compatibility to manage: game servers can directly refute the old key and propose a software update to any version with the old encryption system.

Solutions based on a control authority

Many solutions rely on the usage of a trusted authority that will assess clients requests.
The following subsection describes the different types of authority.
In the client/server model, the server is an entity controlled and monitored by the creator of the game. In order to detect simple attacks (like modifying the contents of a packet), the most trivial solution is to host the state of the game on a centralized trusted server. Thus the client does nothing more than communicating and respond to information it receives from the server. This is the case with most current MMOGs.
A protected application checks what is sent by the client. Such solutions have a cost on the server load and the amount of data that is transferred. The server may skip some cheat detection cycles to reduce the load. SpotCheck reduces the computation overhead by 50% [47].
Distributing some data to the client and delegating control of communications reduces the computational load on the server. In [48] an authority checks every transaction (the transaction is then marked valid or invalid), but the client generates the hash of the transaction. The authority only verifies that hashes are correct and correspond to a nominal behavior. One can imagine a solution coupling this approach with SpotCheck to limit the computational load of the authorities.
The problem with these control techniques is that they are cheat-proof. If a cheater passes the tests, it does not mean he did not cheat. It just means he was able to adapt to the control either using undetectable cheats or by respecting game rules like gold farming.
In a peer to peer environment it is difficult to monitor network similarly to centralized servers. Some P2P solutions use centralized authorities to help ensure security in their network [48, 49]. The solution of Josh Goodman [48] is both a centralized solution and a peer to peer one as it relies on a centralized authority that peers of the network can assist. Peers will regularly become authorities and are able to handle requests for the server. Watchmen [49] targets smaller number of players (up to 48) by focusing on FPS type of games but covers a wide range of cheat detection.
Jardine, Jared and Zappala [50] established a hybrid architecture using both computa-tional power offered by the P2P model and security offered by the client/server model. Hybrid solutions offer a good compromise between size and maximum network secu-rity, but the centralization of security always induces high bandwidth consumption on servers.
Peer to peer environments open more potential breaches and types of attacks as they give a part of the control to the nodes of the network. Sybil attacks aim at yielding more information than any single node should have, or at attacking a network node with multiple identities. There is also the possibility of nodes that collude in the peer network. The first defense is quite direct and explicit: the goal is to verify that peer ”A” is not malicious. The state of A is sent to peers, who then calculate the same actions as ”A”. After calculation, they can check that A returns the correct result. These checks can be made either:
• by testing with a softcoded algorithm in the application: these tests are vulnerable without regular updates.
• by tests generated randomly at runtime of the application: testing is then more costly to maintain but are more reliable over time [51].
This section lists some of the approaches based on the existence of rules in the universe that allow the cheat detection mechanism to detect incorrect behaviors.
Mathematical properties of a game can prevent cheating. Pittman [52] proposes a solu-tion for card games where two peers can assess what actions are possible or not. A state machine describing the state of the game determines which actions are possible at any given time. Such solutions offer opportunities outside the server. State machines and mathematical laws establishing the behavior of players in MMOGs are far too complex and have yet to be written.
Bots can be detected relatively easily because it has been shown that bots do not have the same behavior as humans and their movement traces are fundamentally different. Agent based solutions allow effective bots detection [53, 54]. These small entities provide a good local analysis of a simple problem at a low cost, and work together when the problem needs to be solved in a distributed manner.
To protect against attacks exploiting data flow, mobile guards [55] encrypts memory.
Guards are the only ones able to read the contents of the game distributed at the launch.
Delap and Al [56] run the client through a checking application to assert that it is indeed the original code which was launched. This can prevent attacks targeting long term storage modification.
Finally to protect against typical attacks that use duplication of identity, a solution is to make the second authentication more difficult than the first. With crypto- puzzle [57], clients are required to perform complex and very costly computational tasks. Acquiring many identities cost a lot of time and resources, thus reducing the appeal of Sybil attacks.

Reputation systems and their mechanics

This thesis plans to detect cheat in MMOGs using the concept of reputation to describe the behavior of a player in the game. A player with bad reputation would be ultimately seen as a cheater.
A reputation system is a distributed algorithm for obtaining global feedback on every node of the network. For example in CORPS [58], reputation feedbacks are used to establish a trusted network based on nodes from the P2P system. This trusted network is then used to route packets efficiently, and enhance the global transfer rate of the network. The number of hops and the number of re-emissions decrease when using a network built with trusted nodes.
Reputation systems are not only used for IT algorithms and software. For instance, an auction website can use a reputation system in order to judge its users. Gathering feedback on eBay helps building a reputation scale between buyers/sellers.
A reputation system is not fully reliable, as malicious nodes can also send feedback. [59] gives multiple definitions of trust, depending on the type of relation between peers of the network :
• dependency trust: a peer expects another peer to behave in a certain manner based on the trust ratio. A trust ratio is an expression of a confidence percentage regarding another node of the network. The higher the trust ratio, the more likely the other node is correct.
• decision trust: peers rely on the actions of others peers to take their decision, even if some of the actions were malicious.
Reputation is highly related with trust, but their meaning differs. Reputation is what is said or thought about another entity and its actions. Because of this slight difference, a node can say of another one:
• I trust you because of your reputation
• I trust you even though you have a low reputation
Those two sentences show the difference between a reputation value and trust. Trust is a subjective view created from past experiences between two entities, while reputation is more like a global view gathering subjective views in a collective way.

Table of contents :

1 Introduction 
1.1 Current MMOGs limitations
1.2 Contributions
1.3 List of publications
1.4 Organization
2 Decentralized services for MMOGs 
2.1 Context
2.2 Matchmaking approaches for MMOGs
2.2.1 Elo ranking
2.2.2 League of Legends case study
2.3 Cheating in MMOGs
2.3.1 Gold farming – use of normal behavior for illegitimate reasons
2.3.2 Trust exploit
2.3.3 Bug/hacks exploit
2.3.4 Cheating by destruction / deterioration of the network
2.4 Cheat detection
2.4.1 Solutions based on hardware
2.4.2 Solutions based on a control authority
2.5 Reputation systems and their mechanics
2.5.1 Effectiveness of a reputation model
2.5.2 Reputation system designs
2.5.3 Countermeasures to reputation systems attacks
2.6 The current state of MMOGs services
3 Towards the improvement of matchmaking for MMOGs
3.1 Gathering data about League of Legends
3.1.1 The nature of the retrieved data
3.1.2 Services used to retrieve data
3.2 Analysis of a matchmaking system
3.2.1 Influence of ranking on waiting times
3.2.2 Impact of the matching distance on the game experience
3.2.3 Impact of latency on the game experience
3.2.4 Crosscheck with data from another game
3.3 Tools for a better player matching
3.3.1 Measuring the quality of a matching
3.3.2 Various algorithms for matchmaking
3.4 Measuring up to an optimal match
3.5 Performance evaluation
3.5.1 Matching capacity
3.5.2 Average waiting time
3.5.3 Matching precision
3.5.4 Adjusting the size of the cutlists
3.5.5 P2P scalability and performance
3.6 Conclusion
4 Scalable cheat prevention for MMOGs
4.1 Scalability issues for cheat prevention
4.2 Design of a decentralized refereeing system
4.2.1 System model
4.2.2 Failure model
4.2.3 Architecture model
4.3 Distributed refereeing protocol
4.3.1 Node supervision
4.3.2 Referee selection
4.3.3 Cheat detection
4.3.4 Multiplying referees to improve cheat detection
4.4 Reputation management
4.4.1 Assessment of the reputation
4.4.2 Parameters associated with my reputation system
4.4.3 Jump start using tests
4.4.4 Reducing the overhead induced by the cheat detection
4.5 Performance evaluation
4.5.1 Simulation setup and parameters
4.5.2 Latency
4.5.3 Bandwidth consumption
4.5.4 CPU load
4.5.5 Cheat detection ratio
4.6 Performance in distributed environments
4.6.1 Evaluation in a dedicated environment
4.6.2 Deployment on Grid’5000 and on PlanetLab
4.7 Conclusion
5 Using reputation systems as generic failure detectors 
5.1 Detecting failures with a reputation system
5.1.1 Assessment of the reputation
5.1.2 Parameters associated with my reputation system
5.1.3 The reputation based failure detector
5.1.4 Scalability
5.2 Comparison with other failure detectors
5.2.1 Bertier’s failure detector
5.2.2 Swim’s failure detector
5.2.3 Communication complexity
5.3 Performance evaluation
5.3.1 Experimental settings
5.3.2 Measuring the accuracy of multiple detectors
5.3.3 False positives in the absence of failures
5.3.4 Permanent crash failures
5.3.5 Crash/recovery failures
5.3.6 Overnet experiment: measuring up to a realistic trace
5.3.7 Query accuracy probability
5.3.8 Bandwidth usage
5.3.9 PlanetLab experiment: introducing realistic jitter
5.4 Related work
5.5 Conclusion
6 Conclusion 
6.1 Contributions
6.2 Future work


Related Posts