[Bug Report] NetDriver Time and general network assumptions will fail after a certain amount of time

Dear friends at Epic.
I have recently, unfortunately, had to do a bit of spelunking through the connection challenge and replication code only to find that some of the floating point numbers in use by UE4’s networking code cannot support long running servers.

Let's take a look at some of the offending code as an example. This can be found in StatelessConnectHandlerComponent.cpp, in the function IncomingConnectionless:

else if (Driver != nullptr)
{
    bool bChallengeSuccess = false;  
    float CookieDelta = Driver->Time - Timestamp;
    float SecretDelta = Timestamp - LastSecretUpdateTimestamp;
    bool bValidCookieLifetime = CookieDelta > 0.0 && (MAX_COOKIE_LIFETIME - CookieDelta) > 0.f;
    bool bValidSecretIdTimestamp = (SecretId == ActiveSecret) ? (SecretDelta >= 0.f) : (SecretDelta <= 0.f);
    if (bValidCookieLifetime && bValidSecretIdTimestamp)
    {

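After a soak test, the CookieDelta above will regularly be a flat 0, because the DeltaTime added to Driver->Time in Tick() is too small to change Driver->Time between the moment the cookie is timestamped and the moment it is checked. This is the nature of floating point: the gap between representable values grows as the value itself grows, so small increments get rounded away. Since bValidCookieLifetime requires CookieDelta > 0.0, a CookieDelta of 0 fails the challenge.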
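To illustrate the rounding behavior, here is a minimal standalone sketch in plain C++ (not engine code). The helper names FloatSpacingAt and IsIncrementAbsorbed are made up for this example, and it assumes the usual IEEE 754 single-precision float with round-to-nearest.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between TimeSeconds and the next representable float above it.
float FloatSpacingAt(float TimeSeconds)
{
    return std::nextafter(TimeSeconds, std::numeric_limits<float>::infinity()) - TimeSeconds;
}

// True if adding DeltaSeconds to a float clock at TimeSeconds leaves the
// clock unchanged, i.e. the increment is rounded away entirely.
bool IsIncrementAbsorbed(float TimeSeconds, float DeltaSeconds)
{
    assert(DeltaSeconds > 0.0f);
    // volatile forces the sum to be stored as a real float before comparing.
    volatile float Next = TimeSeconds + DeltaSeconds;
    return Next == TimeSeconds;
}
```

For example, FloatSpacingAt(67817.4141f) is 0.0078125 (1/128 of a second). A 1/350 second step (about 0.00286) is less than half of that gap, so IsIncrementAbsorbed(67817.4141f, 1.0f / 350.0f) returns true, while the same step at a clock value of 100.0f is not absorbed.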

The error occurs some time before Driver->Time reaches 67817.4141 seconds (~19 hours) when the server has its NetServerMaxTickRate set to 350. Higher tick rates would theoretically break even sooner, but slower tick rates definitely break as well: with the standard NetServerMaxTickRate of 30, it is only a matter of days before a UE4 server is unusable.
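The timing can be estimated with a rough standalone sketch (again plain C++, not engine code). FirstStalledTimeSeconds is a hypothetical helper, and it assumes every tick adds exactly 1/TickRate to a single-precision clock, which a real server with variable frame times will not match exactly.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Smallest power-of-two clock value (in seconds) at which a float clock
// stops advancing when stepped by 1/TickRate per tick. Float spacing is
// constant between powers of two, so checking the powers is enough.
float FirstStalledTimeSeconds(float TickRate)
{
    assert(TickRate > 0.0f);
    const float DeltaSeconds = 1.0f / TickRate;
    for (int Exponent = 0; Exponent < 127; ++Exponent)
    {
        const float Time = std::ldexp(1.0f, Exponent); // 2^Exponent
        // volatile forces the sum to be stored as a real float before comparing.
        volatile float Next = Time + DeltaSeconds;
        if (Next == Time)
        {
            return Time;
        }
    }
    return std::numeric_limits<float>::infinity();
}
```

Under these assumptions it returns 65536 seconds (about 18.2 hours) for a tick rate of 350 and 1048576 seconds (about 12.1 days) for a tick rate of 30, which lines up with the numbers reported above.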

This is a problem. Now for possible solutions.

RESET THE TIMER
A hard reset is less than desirable, and if there is a way in which one could reset these values without heavily interrupting service (full player disconnects, replication errors, etc), it would be infinitely more functional. I unfortunately have no idea how to achieve anything but a hard reset in this case.

USE DOUBLE PRECISION OR LARGER NUMBERS
This would require a fair number of engine changes, but I think it is possible, and it would increase the amount of time a server could persist enormously: a double carries 53 bits of precision against a float's 24, so the same loss of resolution would not show up for many orders of magnitude longer. It might be worth considering for all of those engine programmers that can get their pull requests to the network code approved in a timely fashion, or who just want to fix it for themselves locally.

To implement the Double Precision solution, you’ll need to edit the following files:

Files:
Engine/Source/Runtime/Engine/Classes/AI/Navigation/AvoidanceManager.h
Engine/Source/Runtime/Engine/Classes/Camera/PlayerCameraManager.h
Engine/Source/Runtime/Engine/Classes/Engine/NetDriver.h
Engine/Source/Runtime/Engine/Classes/Engine/NetworkObjectList.h
Engine/Source/Runtime/Engine/Classes/Engine/World.h
Engine/Source/Runtime/Engine/Classes/GameFramework/Actor.h
Engine/Source/Runtime/Engine/Classes/GameFramework/CharacterMovementComponent.h
Engine/Source/Runtime/Engine/Classes/GameFramework/GameNetworkManager.h
Engine/Source/Runtime/Engine/Classes/GameFramework/PlayerController.h
Engine/Source/Runtime/Engine/Private/AI/Navigation/AvoidanceManager.cpp
Engine/Source/Runtime/Engine/Private/NetworkDriver.cpp
Engine/Source/Runtime/Engine/Private/PacketHandlers/StatelessConnectHandlerComponent.cpp
Engine/Source/Runtime/Engine/Private/PlayerCameraManager.cpp
Engine/Source/Runtime/Engine/Public/PacketHandlers/StatelessConnectHandlerComponent.h

The change in each of them is basically making the time-related floats into doubles.
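As a rough illustration of why the type change helps, here is a standalone sketch that mimics the cookie lifetime check from the snippet earlier, templated on the time type. It is not the engine code: IsCookieValidAfterOneTick is a made-up name, and the MaxCookieLifetime value passed in is a placeholder rather than the engine's MAX_COOKIE_LIFETIME.

```cpp
#include <cassert>

// Stamps a cookie at StartTime, advances the clock by one tick, then runs
// the same lifetime test as the original snippet. TimeT is float or double.
template <typename TimeT>
bool IsCookieValidAfterOneTick(TimeT StartTime, TimeT DeltaSeconds, TimeT MaxCookieLifetime)
{
    assert(DeltaSeconds > TimeT(0));
    TimeT DriverTime = StartTime;
    const TimeT Timestamp = DriverTime; // cookie is stamped "now"
    DriverTime += DeltaSeconds;         // one Tick() later
    const TimeT CookieDelta = DriverTime - Timestamp;
    return CookieDelta > TimeT(0) && (MaxCookieLifetime - CookieDelta) > TimeT(0);
}
```

With a start time of 67817.4141 seconds, a 1/350 second tick and a placeholder lifetime of 2 seconds, the float instantiation returns false (CookieDelta comes out as 0), while the double instantiation returns true.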

Here’s a paste that shows our in-house edits to accomplish this:

This just totally saved me days' worth of scratching my head. Thanks for posting the problem!