ForcePositionUpdate issues

Hello,

We are seeing issues where a blocked client connection can cause an autonomous proxy’s movement to be updated twice on the server for the same time interval.

When a connection is blocked for MAXCLIENTUPDATEINTERVAL, ForcePositionUpdate is called, the client is moved, and the ServerTimeStamp is updated. Subsequently, if the client’s connection is unblocked, the ServerMove() calls are then processed as the ClientTimeStamp is still valid. For a character moving forward, this can result in the player being moved forward at twice their maximum speed.
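To make the double update concrete, here is a minimal sketch of the sequence described above. It is a simplified 1D model, not engine code: the struct layout, the 600 units/s MaxSpeed, and the function bodies are assumptions made purely for illustration.

```cpp
// Hypothetical, simplified server-side state (not the engine's actual types).
struct ServerCharacterState {
    float Position = 0.0f;               // 1D position along the forward axis
    float CurrentClientTimeStamp = 0.0f; // last client timestamp the server accepted
    float ServerTimeStamp = 0.0f;        // server time of the last movement update
};

constexpr float MaxSpeed = 600.0f; // assumed max speed, units per second

// Forced update: moves the character and refreshes ServerTimeStamp,
// but leaves CurrentClientTimeStamp untouched.
void ForcePositionUpdate(ServerCharacterState& S, float DeltaTime, float ServerNow) {
    S.Position += MaxSpeed * DeltaTime;
    S.ServerTimeStamp = ServerNow;
}

// Regular client move: accepted whenever its timestamp is newer than the
// last accepted client timestamp. Returns true if the move was applied.
bool ServerMove(ServerCharacterState& S, float MoveTimeStamp, float ServerNow) {
    if (MoveTimeStamp <= S.CurrentClientTimeStamp) {
        return false;
    }
    S.Position += MaxSpeed * (MoveTimeStamp - S.CurrentClientTimeStamp);
    S.CurrentClientTimeStamp = MoveTimeStamp;
    S.ServerTimeStamp = ServerNow;
    return true;
}

// The connection is blocked for BlockedSeconds, the server forces an update,
// then the queued client moves covering the same interval all arrive at once.
float SimulateBlockedConnection(float BlockedSeconds) {
    ServerCharacterState S;
    ForcePositionUpdate(S, BlockedSeconds, BlockedSeconds);
    const int NumQueuedMoves = 10;
    for (int i = 1; i <= NumQueuedMoves; ++i) {
        ServerMove(S, BlockedSeconds * i / NumQueuedMoves, BlockedSeconds);
    }
    return S.Position;
}
```

In this model, SimulateBlockedConnection(0.5f) ends up at roughly 600 units after half a second, twice the 300 units that MaxSpeed should allow, because the queued moves still pass the timestamp check.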

Solutions include:

  1. Updating the ClientTimeStamp by the dt passed to ForcePositionUpdate().

  2. More friendly to blocked connections - only consume the dt that TimeSinceUpdate exceeds MAXCLIENTUPDATEINTERVAL. This would only penalize clients by the time they exceed MAXCLIENTUPDATEINTERVAL rather than the entire interval.
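A minimal sketch of the first option, assuming a simplified 1D model (the struct, MaxSpeed, and function bodies are hypothetical stand-ins, not the engine's code). The only change from the current behavior is that the forced update also advances the client timestamp:

```cpp
// Hypothetical, simplified server-side state (not the engine's actual types).
struct ServerCharacterState {
    float Position = 0.0f;               // 1D position along the forward axis
    float CurrentClientTimeStamp = 0.0f; // last client timestamp the server accepted
};

constexpr float MaxSpeed = 600.0f; // assumed max speed, units per second

// Option 1: the forced update consumes client time as well as moving the character.
void ForcePositionUpdate(ServerCharacterState& S, float DeltaTime) {
    S.Position += MaxSpeed * DeltaTime;
    S.CurrentClientTimeStamp += DeltaTime;
}

// Moves whose timestamp falls inside the already-simulated interval are rejected.
// Returns true if the move was applied.
bool ServerMove(ServerCharacterState& S, float MoveTimeStamp) {
    if (MoveTimeStamp <= S.CurrentClientTimeStamp) {
        return false;
    }
    S.Position += MaxSpeed * (MoveTimeStamp - S.CurrentClientTimeStamp);
    S.CurrentClientTimeStamp = MoveTimeStamp;
    return true;
}
```

With this change, a queued move stamped 0.25 that arrives after ForcePositionUpdate(S, 0.5f) is dropped, while a move stamped 0.75 only simulates the remaining 0.25 seconds.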

Thank you,

-Chris

Hi Chris,

I think the first suggestion intuitively makes sense. A colleague brought up the concern about there still being an issue if error tolerances are set to a lenient value (or client auth movement is enabled), but we’re examining that.

Can you clarify your second suggestion?

“More friendly to blocked connections - only consume the dt that TimeSinceUpdate exceeds MAXCLIENTUPDATEINTERVAL”

Do you mean use this delta time for both the ForcePositionUpdate and when incrementing the ClientTimeStamp by this amount, or do something different for each?

I think option #2 is best (just ticking by the amount greater than MAXCLIENTUPDATEINTERVAL). As you say, it respects latency and that moves might be in-flight.

Thanks for pointing this out. I’ve filed UE-30262 for tracking and will implement the fix soon.

As for reproducing this, it seems like causing a client or server hitch > MAXCLIENTUPDATEINTERVAL should be sufficient, or did you have a different repro?

Hi Zak,

Thanks for the response. Sorry I wasn’t more clear.

“Do you mean use this delta time for both the ForcePositionUpdate and when incrementing the ClientTimeStamp by this amount?”

Yes, that’s what I meant. To be explicit, let’s say a client hasn’t responded to the server for TimeSinceUpdate = MAXCLIENTUPDATEINTERVAL + someDeltaTime.

Calling ForcePositionUpdate(TimeSinceUpdate) brings the client up-to-date, but might be harsher than it needs to be. We could balance the need for the simulation to be updated with the desire to honor the client’s (probably in-flight) moves.
Calling ForcePositionUpdate(someDeltaTime) would keep the client at most MAXCLIENTUPDATEINTERVAL behind while still allowing most of the in-flight moves to be consumed, resulting in a smaller client correction.
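As a sketch, the delta passed to the forced update under this scheme could be computed like this. The function name is hypothetical, and the 0.25 second value for the interval is an assumption for the example, not something taken from the engine source:

```cpp
#include <algorithm>

// Assumed value for illustration only.
constexpr float MaxClientUpdateInterval = 0.25f;

// Returns someDeltaTime: only the portion of TimeSinceUpdate that exceeds
// the allowed interval. Returns 0 while the client is still within the interval.
float ComputeForcedUpdateDelta(float TimeSinceUpdate) {
    return std::max(0.0f, TimeSinceUpdate - MaxClientUpdateInterval);
}
```

For a client that has been silent for 0.75 seconds this yields 0.5, so the server simulates (and advances the client timestamp by) 0.5 seconds and leaves the most recent 0.25 seconds open for in-flight moves.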

Thank you,
-Chris.

That’s correct. The bug appears if you block the connection for longer than MAXCLIENTUPDATEINTERVAL.
As a side note, in our playtests we observed quite a few home connections that would occasionally block for times greater than 0.25 seconds.

Thank you,
-Chris.

Hey Zak,

You are right, we hadn’t considered a client stall. As you mention, in the suggested approach, a client that has stalled will never catch up with the server’s CurrentClientTimeStamp.

To address this I have changed the ServerTimeStamp flow. Currently the ServerTimeStamp is used only to trigger the ForcePositionUpdate; it is essentially a “time since valid client move”. By changing it to “time since client move received” I was able to get around the stall issue. I am now updating the ServerTimeStamp at the start of ServerMove_Implementation, before the validity of the client time stamp is checked. This way, when a client has stalled, the first move received (and all subsequent moves) will update the ServerTimeStamp, which stops the server from advancing the CurrentClientTimeStamp.

This method isn’t perfect, though, as the stalled client’s moves are still rejected for the period of time the client was stalled. We might be able to address this by having the client inform the server it was stalled.
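Roughly, the reordering looks like the sketch below. This is a simplified stand-in rather than the actual ServerMove_Implementation: the struct, the free functions, and the 0.25 second interval are assumptions for illustration.

```cpp
// Hypothetical, simplified server-side state (not the engine's actual types).
struct ServerCharacterState {
    float CurrentClientTimeStamp = 0.0f; // last client timestamp the server accepted
    float ServerTimeStamp = 0.0f;        // server time a client move was last received
};

constexpr float MaxClientUpdateInterval = 0.25f; // assumed value

// Records the receipt time first, then checks whether the move is valid.
// Returns true if the move was accepted.
bool ServerMove(ServerCharacterState& S, float MoveTimeStamp, float ServerNow) {
    S.ServerTimeStamp = ServerNow; // updated even if the move is rejected below
    if (MoveTimeStamp <= S.CurrentClientTimeStamp) {
        return false;
    }
    S.CurrentClientTimeStamp = MoveTimeStamp;
    return true;
}

// The forced update is only triggered when no move at all has arrived recently.
bool ShouldForcePositionUpdate(const ServerCharacterState& S, float ServerNow) {
    return (ServerNow - S.ServerTimeStamp) > MaxClientUpdateInterval;
}
```

A stalled client whose timestamp is behind the server-advanced value still has its moves rejected, but each rejected move resets ServerTimeStamp, so the server stops pushing CurrentClientTimeStamp further ahead.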

A different approach would be to preprocess the RPCs. A blocked->unblocked connection will flood the server with many messages. The net tick could look at the total client time stamp delta for all queued RPCs before choosing to process them. In the case of a lag switch, the total client time stamp delta will be very large, and the server could take appropriate action. A stalled client, on the other hand, would have a normal-looking client time stamp delta, and so its moves would be allowed.
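A rough sketch of what that preprocessing check might look like. Everything here is hypothetical (the enum, the function, and the threshold parameter), and it assumes the queued timestamps are in ascending order:

```cpp
#include <vector>

enum class QueueVerdict { ProcessMoves, SuspectedLagSwitch };

// Looks at the client time covered by all queued moves before any of them
// is processed. QueuedTimeStamps is assumed to be sorted ascending.
QueueVerdict ClassifyQueuedMoves(const std::vector<float>& QueuedTimeStamps,
                                 float LastAcceptedClientTimeStamp,
                                 float MaxAllowedDelta) {
    if (QueuedTimeStamps.empty()) {
        return QueueVerdict::ProcessMoves;
    }
    // Total client time the queue would advance the character by.
    const float TotalDelta = QueuedTimeStamps.back() - LastAcceptedClientTimeStamp;
    return (TotalDelta > MaxAllowedDelta) ? QueueVerdict::SuspectedLagSwitch
                                          : QueueVerdict::ProcessMoves;
}
```

A client that stalled and then resumed sends moves whose timestamps continue from where it paused, so the total delta stays small. A queue built up behind a lag switch covers the whole blocked period and exceeds the threshold.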

Lastly, as we are discussing the rate of RPC arrival, I would like to mention a simple client hack that we are trying to work out how to catch. If a malicious client were to change the world sim rate, similar to the command “Slomo 2”, they would run at double the rate on the server. Thoughts?
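One way to think about catching this, as a sketch only: compare how much client time each move claims against how much server time actually passed, and flag the client once the surplus grows too large. The tracker below and its 0.5 second threshold are hypothetical and are not the engine's detection code.

```cpp
// Hypothetical tracker: accumulates (client time claimed - server time elapsed).
struct TimeDiscrepancyTracker {
    float Accumulated = 0.0f; // surplus client time, in seconds
    float Threshold = 0.5f;   // assumed tolerance before flagging

    // ClientDelta: difference between this move's client timestamp and the previous one.
    // ServerDelta: server time elapsed since the previous move was received.
    // Returns true once the client has claimed noticeably more time than has passed.
    bool OnMoveReceived(float ClientDelta, float ServerDelta) {
        Accumulated += ClientDelta - ServerDelta;
        return Accumulated > Threshold;
    }
};
```

A client running at double rate claims twice the elapsed server time per move, so the surplus grows steadily and crosses the threshold. An honest client hovers around zero. Note that this simple version lets a client bank negative surplus by running slow first, which a real implementation would need to bound.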

Thanks, c

Very cool, we will take a look!

Thanks,
c

I’ve seen that sort of delay on my own home network with outdated router firmware. I’ve also seen it in Windows 8 when you click on the network tray icon; it seems to scan for networks, which can block any current connections. Also, obviously, an app hitch will do it.

I’m sure there are plenty of other ways for this to occur. So I’m glad you pointed it out.

I experimented with both approaches briefly, and ran into a problem.

If the server advances the client timestamp, this assumes the client time is monotonically increasing and the issue was only in the network layer dropping packets. It appears that if the client actually has a long hang and their timestamp is effectively “paused” for a while and then resumes, the client will never be able to catch up to the server-advanced timestamp.

This is easy to repro by having the client experience a big hitch, or simply by holding the window’s title bar in Windows and starving the app of updates. I think this is mainly because of the way we advance game time (clamped to a max delta time, and updated with deltas rather than wall-clock time).
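A small sketch of why a hang leaves the client timestamp permanently behind. The clock struct and the 0.25 second clamp are assumptions for illustration, not the engine's actual values:

```cpp
#include <algorithm>

constexpr float MaxDeltaTime = 0.25f; // assumed clamp on a single frame's delta

// Hypothetical clock that advances game time from clamped per-frame deltas.
struct GameClock {
    float GameTime = 0.0f;      // what the client stamps its moves with
    float WallClockTime = 0.0f; // real elapsed time, tracked here only for comparison

    void Tick(float RealDelta) {
        WallClockTime += RealDelta;
        GameTime += std::min(RealDelta, MaxDeltaTime); // a long hang is clamped away
    }
};
```

After one normal 0.125 second frame and one 5 second hang, GameTime is 0.375 while 5.125 seconds of real time have passed. If the server advanced the client timestamp during the hang, the client's clock never makes up the missing 4.75 seconds on its own.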

Let me know what you think.

Version 4.12 includes client speedhack detection and remediation as an engine feature. It’s off by default but simple to enable via ini. Look for it in the release notes; there is a lot more detail there.

Version 4.12 also now tracks the time of the last received client move; this may make the change to detect the last client move easier.

If you come up with something that appears to address both blocked connections and client stalls it would be great to see a pull request or code snippet. I probably won’t get to the JIRA issue for a couple weeks but it is marked for 4.13. Thanks!

(edit: version numbers turned into numbered lists…)