Enhanced synchronization by adopting agreement algorithms

Andres Gutierrez
Brainz Engineering
Jan 19, 2017 · 4 min read

World War Doh is a fast-paced multiplayer RTS for mobile devices that allows two players in different geographical locations to battle each other over mobile networks. While building the networking system for World War Doh we faced many technical challenges, desynchronization being one of them.

Desynchronization in a multiplayer game is mostly caused by network latency, the term used for delays that happen in data communication over networks. Mobile networks in certain parts of the world suffer higher latency than others, but in general, high latency can affect any data network at any time.

If not addressed, these delays manifest themselves in ways that users notice. Overcoming them requires abandoning the assumption of a low-latency network.

Background

Suppose a player sends a command reporting the position of a unit to the peer at the other side of the link, and the message arrives 500 ms later. While that is a small amount of time, during it the two players experience slightly different states of the game. The game can rapidly take actions to improve the user experience, but it has to know how much delay has elapsed since the moment the message was sent.
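One common way a game can act on a known delay is dead reckoning: extrapolating the remote unit's reported position along its last known velocity. The following is a minimal sketch of that idea; the `UnitState` class and its fields are illustrative, not taken from World War Doh's codebase.

```python
# Sketch: compensating for an estimated one-way delay by extrapolating a
# unit's reported position along its last known velocity (dead reckoning).
# UnitState and its field names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class UnitState:
    x: float   # reported position (game units)
    y: float
    vx: float  # reported velocity (game units per second)
    vy: float

def extrapolate(state: UnitState, delay_seconds: float) -> UnitState:
    """Advance a remote unit's state by the estimated network delay."""
    return UnitState(
        x=state.x + state.vx * delay_seconds,
        y=state.y + state.vy * delay_seconds,
        vx=state.vx,
        vy=state.vy,
    )

# A message says the unit was at (10, 0) moving at 4 units/s along x,
# and we estimate it was sent 0.5 s ago:
corrected = extrapolate(UnitState(10.0, 0.0, 4.0, 0.0), 0.5)
```

The quality of this correction depends entirely on how well the delay is estimated, which is exactly the problem the rest of the post addresses.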

Latency variation sample from a WWD battle

In the sample graph above we can see that latency varies practically every second. The time taken to send a message across the internet fluctuates, and because it can differ in each direction, relying on a single number to compensate for latency is almost impossible. Latency depends on a number of factors, such as the physical distance between the end systems: a longer distance means more transmission length and routing, and therefore higher latency.

Since latency is not stable, it is hard to find an optimal or near-optimal strategy for client prediction and correction. How do we properly measure network latency? It doesn't follow a normal (Gaussian) or Poisson distribution, so looking at averages, medians, and even standard deviations isn't enough.
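To see why averages mislead here, consider a heavy-tailed set of measurements: a few spikes drag the mean far away from what most players actually experience, while percentiles stay informative. The sample values below are made up for illustration.

```python
# Sketch: summarizing a skewed latency distribution with percentiles
# instead of mean/stddev. Sample values are illustrative.

def percentile(samples, p):
    """Nearest-rank percentile (0 < p <= 100) of a list of samples."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

latencies_ms = [48, 52, 50, 49, 51, 47, 300, 53, 50, 820]  # heavy tail

mean = sum(latencies_ms) / len(latencies_ms)  # 152 ms, dragged up by spikes
p50 = percentile(latencies_ms, 50)            # 50 ms, the typical experience
p95 = percentile(latencies_ms, 95)            # tail behaviour
```

Here the mean (152 ms) describes no player's actual experience, while the median (50 ms) does; this is why the post turns to interval-based techniques instead of simple summary statistics.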

Looking for confidence

One of the best solutions we found to mitigate this problem was to apply a modified version of Marzullo's algorithm to obtain an "optimistic" value for latency based on very recent historical measurements.

We organize measurements into confidence intervals in the form of symmetric round trips, [-latency, +latency] (assuming the delay is equal in both directions). Marzullo's algorithm returns the smallest interval consistent with the largest number of samples, [beststart, bestend]; we then take a more centered value within that interval as the correction for latency.
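A minimal sketch of the textbook Marzullo's algorithm follows. It sweeps over interval edges and tracks where the most intervals overlap; World War Doh's modified version, which picks a more centered value from the result, is not public, so only the standard algorithm is shown.

```python
def marzullo(intervals):
    """Return (beststart, bestend, count): the smallest interval
    consistent with the largest number of the given (lo, hi) intervals."""
    # Each interval contributes a start edge (-1) and an end edge (+1).
    edges = []
    for lo, hi in intervals:
        edges.append((lo, -1))
        edges.append((hi, +1))
    # Sorting (offset, kind) tuples places start edges before end edges
    # at equal offsets, so touching intervals count as overlapping.
    edges.sort()
    best = cnt = 0
    beststart = bestend = None
    for i, (offset, kind) in enumerate(edges):
        cnt -= kind  # entering an interval increases the overlap count
        if cnt > best:
            best = cnt
            beststart = offset
            bestend = edges[i + 1][0]  # the overlap ends at the next edge
    return beststart, bestend, best

# Three latency estimates (ms): the region consistent with most of them
# is [11, 12], supported by all three intervals.
result = marzullo([(8, 12), (11, 13), (10, 12)])
```

Taking the midpoint of the returned interval, `(beststart + bestend) / 2`, is one simple way to get a single "centered" correction value.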

A timestamp-based formula is used to calculate the round-trip delay of a transmission relayed through an intermediate server. There are various diagrammatic explanations of how this algorithm works under the hood, including the one on Wikipedia.
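The post does not reproduce its exact formula, but the standard NTP-style calculation (RFC 5905) computes round-trip delay and clock offset from four timestamps and is the usual starting point for this kind of measurement; the sketch below follows that convention, not necessarily World War Doh's internal variant.

```python
# Standard NTP-style round-trip delay and clock-offset calculation
# (RFC 5905 on-wire protocol). Timestamps are in milliseconds here.

def rtt_and_offset(t0, t1, t2, t3):
    """t0: client send, t1: server receive, t2: server send,
    t3: client receive. Returns (round-trip delay, clock offset)."""
    delay = (t3 - t0) - (t2 - t1)
    # The offset estimate assumes the path is symmetric in both directions.
    offset = ((t1 - t0) + (t2 - t3)) / 2
    return delay, offset

# Client sends at t=100, server receives at 160, replies at 165,
# client receives at 220 (its own clock):
delay, offset = rtt_and_offset(100, 160, 165, 220)
```

Note that only the round-trip delay is measured directly; the one-way latency is then approximated as half of it, which is the same symmetry assumption the confidence intervals above rely on.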

The values of these calculations are corrected and adapted as battles progress. Also, by looking at RTTM (round-trip time measurement) via TCP timestamps, we can extract approximate delta times between packet sends at very low computational cost. Using all this data we can reduce or disguise delays to improve the user experience.

We can see now the following comparison between real latency and normalized latency used for client prediction:
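The post does not describe how the normalized curve is produced; one common way to turn jittery raw measurements into a stable value for prediction is an exponential moving average, sketched below as a generic smoothing technique rather than World War Doh's actual normalization.

```python
# Sketch: exponentially weighted moving average over latency samples.
# alpha controls responsiveness: higher alpha tracks spikes more closely,
# lower alpha produces a smoother, more stable estimate.

def smooth(samples, alpha=0.2):
    """Return the running EWMA of a non-empty list of latency samples."""
    est = samples[0]
    out = [est]
    for s in samples[1:]:
        est = alpha * s + (1 - alpha) * est
        out.append(est)
    return out

# A 100 ms spike in otherwise steady 50 ms traffic only nudges
# the smoothed estimate upward instead of jerking the prediction:
trace = smooth([50, 50, 100, 50, 50])
```

Whatever the exact method, the goal is the same as in the graph above: a normalized value that changes slowly enough for client prediction to stay stable.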

Real latency vs normalized latency

Conclusion

We use many calculations to reduce the impact of network latency on the gaming experience. Here lies an important difference between developing single-player games and multiplayer games over mobile networks. This methodology has helped us synchronize and reconcile many aspects of World War Doh, such as the remaining battle time and unit positions, with enough confidence.
