Gameplay Networking
Jacob Steinfort
Software Engineer
University of Wisconsin - Platteville
Abstract
There is no doubt about how popular video games are, and the most popular ones can be played with your friends over the internet. But how is it possible to share a real-time environment with someone across the country? Fifteen years ago, playing an action game online was a frustrating experience. Now, games like Call of Duty and Halo are hugely popular due to innovations in this field that reduce lag as much as possible.
Today’s developers use very complicated algorithms to “hide” the latency from their users. Those algorithms will be discussed in detail. Additionally, some developers have thought of interesting solutions for their games, which they were kind enough to make public.
Part of this presentation will go over the evolution of gameplay networking. The other part will walk through specific examples of gameplay networking implementations that developers have recently used in their games.
Importance of Multiplayer Games
What is the importance of a multiplayer game? If gamers had to choose between a multiplayer game and a single-player game, most would choose multiplayer. The reason behind this is simple: playing with your friends is a lot more fun than playing with an artificial intelligence. The longer you play with an AI, the more predictable its behavior becomes. However, playing multiplayer games gives users a different experience every time. Also, most modern multiplayer games allow you to play not only with your friends, but also with people you don’t even know.
Figure 1: USA 2011 Top Video Game Unit Sales. The three crossed out games were bundled with their respective game consoles, so their numbers are inflated.
As you can see from Figure 1 above, it is clearly important for developers to have a multiplayer component in their game, but it is not easy to accomplish this task. Before we dive into Gameplay Networking, let’s go over some Computer Networking terminology. Latency is the delay a packet encounters when traveling from one place to another. It is measured in units of time. The most common latency metric, and one that is very important in Gameplay Networking, is the Round Trip Time (RTT): the time it takes for a packet to get from source to destination and then back to source. Another Networking term is bandwidth, which is the amount of data that can be transferred per unit of time. Bandwidth doesn’t have as large an impact on Gameplay Networking as latency does, but it is still crucial to consider when developing a multiplayer game. Latency and bandwidth are both problems because each player is going to have a different latency and bandwidth. These are usually governed by the player’s Internet Service Provider (ISP).
Another computer networking term is a Socket, which is a bidirectional communication endpoint for sending and receiving data with another socket. There are two main types of sockets: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). The two differ substantially (Table 1).
Table 1: Comparing TCP and UDP
TCP / UDP
Connection based (handshake) / No concept of connection, have to code this yourself
Guaranteed reliable and ordered / No guarantee of reliability or ordering of packets. They may arrive out of order, be duplicated, or not arrive at all!
Automatically breaks up your data into packets for you / You have to manually break up your data into packets and send them
Makes sure it doesn’t send data too fast for the internet connection to handle (flow control) / You have to make sure you don’t send data too fast for your internet connection to handle
Easy to use, you just read and write data like it’s a file / If a packet is lost, you need to devise some way to detect this, and resend that data if necessary
Slow: extra steps taken to ensure successful transmission slow down the transfer rate and increase latency. / Fast: no extra steps
TCP provides the most reliable connection. This protocol is great for data that is not time-sensitive like files or web pages. However, games have a real-time requirement for data delivery. This means that if the data is not up to date, it’s useless. TCP can cause old data to be delivered if an acknowledgment packet isn’t received by the sender. This is a waste of bandwidth since the receiver can’t make any use out of the old data.
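The idea that stale data is useless can be sketched in a few lines. The following is only an illustration, not code from any particular game: it assumes the sender stamps each UDP packet with an increasing sequence number (the `Packet` and `LatestStateReceiver` names are hypothetical), and the receiver simply ignores anything that is not newer than what it already has.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    sequence: int   # hypothetical counter assigned by the sender
    payload: str    # stand-in for a serialized game state


class LatestStateReceiver:
    """Keeps only the newest state; late or duplicated packets are dropped."""

    def __init__(self) -> None:
        self.newest_sequence = -1
        self.state: Optional[str] = None

    def receive(self, packet: Packet) -> bool:
        """Apply the packet if it is newer than the current state.

        Returns True when the packet was applied, False when it was ignored.
        """
        if packet.sequence <= self.newest_sequence:
            return False  # out of date or duplicate: useless to a real-time game
        self.newest_sequence = packet.sequence
        self.state = packet.payload
        return True


# Packets arriving out of order and duplicated, as UDP allows.
receiver = LatestStateReceiver()
arrivals = [Packet(1, "s1"), Packet(3, "s3"), Packet(2, "s2"),
            Packet(3, "s3"), Packet(4, "s4")]
applied = [receiver.receive(p) for p in arrivals]
print(applied)         # [True, True, False, False, True]
print(receiver.state)  # s4
```

Note that nothing is ever resent here. Under these assumptions a lost packet is simply superseded by the next one, which is the behavior TCP cannot offer.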
Gameplay Networking is a technology to help multiple players sustain the belief that they are playing a game together. There are multiple difficulties in implementing this: latency, bandwidth, and dropped packets will always be an issue. Another problem will be cheaters, players that try to break into the networking code and give themselves an unfair advantage.
There are a lot of types of games out there. However, the hardest type of game to implement multiplayer in is an action game. These games emphasize physical challenges, including hand-eye coordination and reaction time. The most popular example of this would be a First-Person Shooter (FPS). In these games, there can be upwards of 16 players who appear to interact with each other in real time, even though latency means the interaction is never truly real-time. So, how do developers do this?
Peer-To-Peer Lockstep
The first technique of Gameplay Networking was Peer-To-Peer Lockstep. In this model, each computer exchanges information with every other computer. The process is to abstract the game into a series of turns and a set of command messages. For example, a turn might be 50ms long, and the set of commands could include “move unit”, “attack unit”, and “construct building”. Here is what happens during a turn on one machine in this example:
- Stop recording the player’s commands and send them to the other players
  - Player interaction is halted
- Wait for and receive the other players’ commands
  - Commands = {move unit X from position P1 to position P2}
- Evolve the game state
  - X starts to move from P1 to P2
- Start recording commands
  - Player interacts with game for 50ms
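The "evolve the game state" step of the turn above can be sketched as a pure function. This is a minimal illustration under several assumptions: the command format, the player names, and the rule that a "move" places the unit directly at its destination are all hypothetical simplifications, and no real networking is involved.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, str, int]   # (kind, unit, destination): hypothetical format
GameState = Dict[str, int]       # unit name -> position


def simulate_turn(state: GameState,
                  commands_by_player: Dict[str, List[Command]]) -> GameState:
    """Evolve the game state by one turn from every player's commands.

    Players are processed in sorted order so that every machine applies the
    same commands in the same order, whatever order they arrived in.
    """
    new_state = dict(state)
    for player in sorted(commands_by_player):
        for kind, unit, destination in commands_by_player[player]:
            if kind == "move":
                new_state[unit] = destination  # simplification: move is instant
    return new_state


start = {"X": 1, "Y": 7}
turn_commands = {"alice": [("move", "X", 2)], "bob": [("move", "Y", 5)]}

# Two machines receive the same commands, but in a different arrival order.
machine_a = simulate_turn(start, turn_commands)
machine_b = simulate_turn(start, dict(reversed(list(turn_commands.items()))))
print(machine_a == machine_b)  # True
```

Only the small `turn_commands` dictionary would need to cross the network in this sketch; the full state never does.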
This technique was created for Real-Time Strategy (RTS) games (e.g. Age of Empires, Starcraft). It was created because the game state of an RTS game is too large to transmit all at once. So, this model settles on transmitting changes (commands) only. The nice thing about this model is that it is deterministic: it will always produce the same output (game state) when given the same input (commands). The theory is that since every player’s game state starts the same way and receives the same inputs, every player’s game state must be identical at any given time. This would provide full synchronization. However, in practice, this isn’t always the case.
Problems with Peer-To-Peer Lockstep
First, the game could become out of sync. While in theory it is impossible, in practice sometimes a command could be incorrectly transmitted. This would make the game states slightly out of sync across players, but Peer-To-Peer Lockstep has no way of detecting this. Then, if more commands are applied to the game states that are no longer in perfect sync, the game will become even more out of sync.
The second problem is that this technique doesn’t support joining a game in progress. This is because everybody has to start from the same state in order for the model to function correctly. A minor fix would be to transfer the entire game state to the new player when he/she joins, but this would require pausing of the game since it would take a while to transfer this data.
The third and most significant problem is that everybody’s perceived latency is equal to that of the player with the highest latency. This is unavoidable since, to ensure synchronization, every player has to wait until all other players’ commands are received before simulating a turn. Often, developers use tricks to hide the latency like an audio/visual confirmation of an action, but any truly game-affecting action will only occur after a turn cycle has completed.
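A small calculation makes the third problem concrete. The sketch below assumes every player sends their commands at the instant the turn ends and that each player has a fixed one-way latency to this machine; the function name, player names, and latency values are made up for illustration.

```python
from typing import Dict


def turn_ready_time_ms(turn_end_ms: int,
                       one_way_latency_ms: Dict[str, int]) -> int:
    """Time at which the last player's commands for this turn have arrived.

    The turn cannot be simulated any earlier, so the slowest connection
    sets the pace for everyone.
    """
    return turn_end_ms + max(one_way_latency_ms.values())


latencies = {"alice": 20, "bob": 35, "carol": 180}   # hypothetical values
ready = turn_ready_time_ms(50, latencies)
print(ready)  # 230
```

Even though two of the three connections are fast in this example, the turn cannot be simulated until 230ms, because it has to wait for the slowest one.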
Despite its problems, this model worked fine in Real-Time Strategy games, and an improved version is still in use today by most RTS games. So, does it work for an action game? It only works when latency can be kept to a bare minimum. Basically, Peer-To-Peer Lockstep for action games worked on Local Area Networks (LANs), but it definitely could not be used over the internet. The problem is called Input Latency: the time between when the user changes their input and when the user sees the game change from that input. For example:
- Player presses the move forward button on their keyboard
- When the turn is over, that command gets uploaded to every other player
- Other players’ commands get downloaded
- The next frame is rendered
There is a huge delay between when the player performs an action and when the player gets to see that action performed. This worked fine in RTS games since the player is not in direct control of any units, but in an action game (where the user is in direct control) the lag makes for a terrible experience.
Client/Server
In Computer Networking, the client/server model represents a system where the data that all of the clients need is stored in one location (See Figure 2). The clients only know about the server; they do not need to know about each other.
Figure 2: Computer Networking Client-Server Model
In Gameplay Networking, Client/Server was introduced for action games to help minimize the problems that came with Peer-To-Peer Lockstep. In this model, each player’s computer is turned into a “dumb” terminal: its input is sent to the server to be evaluated, and the server sends updated game states back to the player. One benefit is that each client runs minimal game code. The client doesn’t need to know about collisions or other physics since all of that is handled by the server.
In Gameplay Networking, there are two types of the client/server model: Non-dedicated Server and Dedicated Server. The first is where the server is also a player. In this scenario, all players have a similar computer setup, but one of them is chosen to be the server (usually based on reducing the average latency among all players). This person is known as the Host of the game. All other players connect to the game which exists on the Host’s machine. The Dedicated Server model is more like the Computer Networking client/server model where the server simply shares data with the clients and is not part of the game (there is no host).
The main benefit of the Client/Server model is that there are no more turns. This reduces latency since clients do not have to wait to receive data from other clients. Instead, the server is always pushing game states to the clients, and the clients can always send input data to the server.
Figure 3: Client/Server data transfer. The Client sends user input to the server (e.g. Move Forward). The Server sends updated game states to the client.
Another benefit of Client/Server is that it doesn’t have any consistency issues. In Peer-To-Peer Lockstep, there was a possibility of players’ game states becoming out of sync. This issue is eliminated in Client/Server since the game state only exists on the server, and the clients are only getting snapshots of the game state.
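The division of labor described in this section can be sketched as follows. This is a toy model under stated assumptions, not a real server: the class name, the single "move_forward" command, and the one-dimensional positions are all hypothetical, and there is no actual network transport.

```python
from typing import Dict


class AuthoritativeServer:
    """Holds the only copy of the game state and evaluates all input."""

    def __init__(self) -> None:
        self.tick = 0
        self.positions: Dict[str, int] = {}

    def connect(self, player: str) -> None:
        self.positions[player] = 0

    def handle_input(self, player: str, command: str) -> None:
        # Clients send raw input; the server decides what it does.
        if command == "move_forward":
            self.positions[player] += 1

    def step(self) -> Dict[str, object]:
        """Advance one tick and return a snapshot to push to every client."""
        self.tick += 1
        return {"tick": self.tick, "positions": dict(self.positions)}


server = AuthoritativeServer()
server.connect("alice")
server.connect("bob")
server.handle_input("alice", "move_forward")
snapshot = server.step()
print(snapshot)  # {'tick': 1, 'positions': {'alice': 1, 'bob': 0}}
```

The snapshot is a copy, so in this sketch a client that tampers with what it received cannot change the state held by the server.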
Entity Interpolation
A big problem with pure Client/Server is that the frame rate on the client machines is limited by how fast they receive updated game states from the server (See Figure 4). The client’s machine has to wait until a game state is received before rendering each frame. If the server could send 60+ game states a second, this wouldn’t be an issue, but this is almost never possible because of bandwidth constraints.
Figure 4: This shows the frame rate limitation of the Pure Client/Server model. The client can only render a frame when it has received a game state from the server.
Figure 5: Explaining Interpolation. Say we know player X’s position at time = 2 seconds and time = 8 seconds. If we wanted to calculate player X’s position anywhere between these two points, we would draw a line between the points. Then, the position of player X would be equal to where the desired time intersects with the interpolation line.
The solution to this is called Entity Interpolation. Interpolation is creating additional data points between known data points (See Figure 5 above). Using this technique, we can create an unlimited number of frames, provided that the frame we want to create falls between two known game states. The problem with this is that the Client would never be between two game states if it was trying to render the game in real-time (See Figure 6).
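The straight-line construction from Figure 5 corresponds to ordinary linear interpolation. The sketch below uses the times from that figure (2 and 8 seconds); the position values and the function name are assumptions chosen for illustration, and positions are one-dimensional to keep the example short.

```python
def interpolate(t0: float, p0: float, t1: float, p1: float, t: float) -> float:
    """Linearly interpolate a position at time t between two known samples.

    (t0, p0) and (t1, p1) are the known data points; t must lie between them.
    """
    if not t0 < t1:
        raise ValueError("t0 must be earlier than t1")
    if not t0 <= t <= t1:
        raise ValueError("t must fall between the two known samples")
    alpha = (t - t0) / (t1 - t0)   # 0.0 at t0, 1.0 at t1
    return p0 + alpha * (p1 - p0)


# Player X is at position 10.0 at time 2 s and at 40.0 at time 8 s (made-up values).
position = interpolate(2.0, 10.0, 8.0, 40.0, 5.0)
print(position)  # 25.0
```

The second check in the function mirrors the limitation noted above: a time outside the two known samples cannot be interpolated.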
Figure 6: Timeline showing the game states received by the client and the possible frames it can render. The problem is the last two frames since they don’t fall between two known game states.
Figure 7: Possible implementation of Entity Interpolation. The Client’s current rendering time falls between snapshots 10.20 and 10.25. The actual client time is after snapshot 10.30.
The solution to this is to shift the render time back so that the client always has two game states to render between (See Figure 7 above). In this example, the Client’s rendering time is delayed by two snapshots (game states). Shifting the render time back one interval would simply make the interpolation possible, but shifting it back two intervals also provides protection against dropped/corrupt packets. If any single snapshot is not received correctly, the client can simply disregard it and interpolate between two other snapshots. For example, if 10.25 was dropped, the client could still interpolate between snapshots 10.20 and 10.30. However, if 10.25 and 10.30 were both dropped, the model would break. This would most likely cause the client’s machine to freeze until another snapshot was received to interpolate with. The model could be changed to handle up to two packets dropped in a row by shifting the rendering time back another interval, but this would be at the cost of more input latency for the player.
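The delayed render time and its tolerance for a single dropped snapshot can be sketched with a small buffer. This is one possible arrangement, not the implementation of any specific game: the class name is hypothetical, times are integer milliseconds (so snapshot 10.20 becomes 10200), positions are made-up one-dimensional values, and the 100ms delay corresponds to two 50ms snapshot intervals.

```python
from bisect import bisect_right
from typing import Dict, Optional


class SnapshotBuffer:
    """Stores received snapshots and renders at a time shifted into the past."""

    def __init__(self, render_delay_ms: int) -> None:
        self.render_delay_ms = render_delay_ms
        self.snapshots: Dict[int, float] = {}   # server time (ms) -> position

    def add(self, server_time_ms: int, position: float) -> None:
        self.snapshots[server_time_ms] = position

    def position_at(self, client_time_ms: int) -> Optional[float]:
        """Interpolated position for the delayed render time.

        Returns None when no received snapshot lies after the render time,
        which is the case where the client would have to freeze.
        """
        render_time = client_time_ms - self.render_delay_ms
        times = sorted(self.snapshots)
        i = bisect_right(times, render_time)
        if i == 0 or i == len(times):
            return None
        t0, t1 = times[i - 1], times[i]
        alpha = (render_time - t0) / (t1 - t0)
        p0, p1 = self.snapshots[t0], self.snapshots[t1]
        return p0 + alpha * (p1 - p0)


def build(received: Dict[int, float]) -> SnapshotBuffer:
    buffer = SnapshotBuffer(render_delay_ms=100)
    for time_ms, position in received.items():
        buffer.add(time_ms, position)
    return buffer


# Client time is 10320 ms, so the render time is 10220 ms.
all_received = build({10200: 0.0, 10250: 5.0, 10300: 10.0})
one_dropped = build({10200: 0.0, 10300: 10.0})   # 10250 lost
two_dropped = build({10200: 0.0})                # 10250 and 10300 lost

print(all_received.position_at(10320))
print(one_dropped.position_at(10320))
print(two_dropped.position_at(10320))  # None
```

With these values the first two calls agree because the made-up motion is perfectly linear. With real movement, interpolating across a dropped snapshot would only approximate the missing position.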
This model can also be used to explain why getting a packet late isn’t an option (why we chose UDP over TCP). Receiving a packet late would most likely place it to the left of the current rendering time. However, at that point, the client would already have two game states to interpolate between. For example, if 10.15 were received by the client at the current time, it would be too late; the client could already interpolate between 10.20 and 10.25, so it doesn’t need 10.15.
In summary, Client/Server with Entity Interpolation provides some key benefits. One, it gives the client unlimited frame rate. So if the player has a good enough graphics card, that player will have a very smooth visual experience. The second benefit is that the client still runs minimal game code; the only extra work is interpolation, which is a lot easier to compute than the physics the server has to perform.
The big problem with Client/Server with Entity Interpolation is the input latency. Now, in addition to the Round-Trip Time between the client and server, there is an additional delay caused by the interpolation window, which was a tenth of a second in the previous example (Figure 7). This combined delay is undesirable for gamers who want perfect control of their game. Also, if the server is non-dedicated, the Host player will have a huge advantage since they don’t have to deal with Round-Trip Time or Interpolation.
Client-Side Prediction
The solution to this input latency problem is called Client-Side Prediction. This gives more responsibility to the client machine. It allows the client to predict where to put the user’s character immediately after a user interacts with the game. For example:
- User presses the move forward button on the keyboard
- The client moves the user’s character forward
- The client still sends the command to the server to be evaluated
- The client gets the updated game state from the server 1 RTT later and compares its state to the server’s state. If the client’s prediction was wrong (server state is different), the client adjusts its state to match the server’s.
Figure 8: Client-Side Prediction rainy day scenario where the client’s prediction is incorrect. The first command was predicted wrong, so the client undoes all commands and redoes the commands after it applies the server’s correction. In this case, it results in the user’s character shifting down.
This “adjustment” isn’t as easy as it sounds, however. The problem is that the corrected state coming from the server is too old for the client to use directly. So, the client has to undo its commands until it reaches the command that it predicted incorrectly. Then, instead of predicting the state for that command, it uses the server’s actual state. The client then repredicts the user’s later commands starting from that new state.
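The undo-and-redo adjustment can be sketched with a list of unacknowledged commands. This is a minimal illustration under several assumptions: the class and method names are hypothetical, each command carries a sequence number that the server echoes back, and a command is reduced to a one-dimensional movement delta.

```python
from typing import List, Tuple


class PredictingClient:
    """Applies input immediately and reconciles when the server replies."""

    def __init__(self) -> None:
        self.position = 0
        self.next_sequence = 0
        self.pending: List[Tuple[int, int]] = []   # (sequence, delta) not yet acknowledged

    def apply_input(self, delta: int) -> int:
        """Predict the result locally; return the sequence number to send."""
        sequence = self.next_sequence
        self.next_sequence += 1
        self.pending.append((sequence, delta))
        self.position += delta
        return sequence

    def on_server_state(self, last_sequence: int, server_position: int) -> None:
        """Rewind to the server's state, then replay the later commands."""
        self.pending = [(s, d) for s, d in self.pending if s > last_sequence]
        self.position = server_position          # server is authoritative
        for _, delta in self.pending:
            self.position += delta               # repredict remaining commands


client = PredictingClient()
for _ in range(3):
    client.apply_input(+1)
print(client.position)  # 3 (all three moves predicted locally)

# One RTT later the server reports that command 0 was rejected (position stayed 0).
client.on_server_state(last_sequence=0, server_position=0)
print(client.position)  # 2 (commands 1 and 2 replayed on top of the correction)
```

When the prediction was correct, the same replay produces the position the client already had, so under these assumptions the client does not need a separate comparison step.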