Network serialization is sensitive to packet corruption
I started to notice this issue after our update from 4.11 to 4.13, but I'm not sure whether it was already present before.
In our project, we manually serialize some data and send it to clients through a replicated TArray; the clients grab it (if and when it arrives), deserialize it and use the data. So far so good: locally it worked like a charm.
But when we tested it online, we got totally random issues/crashes: after a lot of tests, we saw that sometimes the deserialization output data was just garbage, and we narrowed the problem to the deserialization input data.
We created a new project to test this issue, and confirmed it. Below you can find the issue description, how to reproduce it, source files etc.
A replicated TArray of bytes (uint8) can get altered/corrupted over an internet connection, probably due to UDP packet alteration/corruption. I'm not sure if this can happen with every replicated variable (i.e. whether it's a global problem or one specific to TArray).
How we tested it
To test the issue, we created the following situation:
server: it spawns 4 actors that will each contain a TArray, and randomly generates 4 byte sequences (random length, random data) to use as test cases. The sequences are saved in the GameState, which is replicated too, since the client uses them for the validity check.
Every tick, the server swaps the sequences between the replicated actors, forcing them to be sent to the client. In particular, we chose to cycle them: the first actor holds sequence 1 on the first tick, sequence 2 on the second tick, sequence 3 on the third, and so on.
client: it receives the replicated data on each actor and compares it with the 4 known sequences to see whether it matches or got corrupted somehow.
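For readers who want the logic of this test without opening the attachment, here is a minimal standalone sketch of the cycling and the client-side check. It is not the code from the test project: it uses std::vector instead of TArray so it compiles outside the engine, the names SequenceIndexFor and MatchKnownSequence are hypothetical, and the way the other three actors are offset from the first one is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ByteSequence = std::vector<std::uint8_t>;

// Four actors and four reference sequences, as in the test described above.
constexpr std::size_t kNumActors = 4;

// Server side: index of the reference sequence an actor carries on a given tick.
// Actor 0 carries sequence 0 on tick 0, sequence 1 on tick 1, and so on, wrapping
// around. Offsetting the other actors by their own index is an assumption.
std::size_t SequenceIndexFor(std::size_t actor, std::size_t tick) {
    assert(actor < kNumActors);
    return (actor + tick) % kNumActors;
}

// Client side: returns the index of the known sequence that matches the received
// bytes exactly, or -1 if none matches (the data was altered on the way).
int MatchKnownSequence(const std::array<ByteSequence, kNumActors>& known,
                       const ByteSequence& received) {
    for (std::size_t i = 0; i < known.size(); ++i) {
        if (known[i] == received) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

On the client, every replicated update would go through MatchKnownSequence, and a result of -1 would be logged as a corrupted sequence.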
Testing it locally, it got 0 errors in an hour of continued testing, as expected.
However, testing it over the internet, we got 2 errors after half an hour (we ran only one test with this setup, but as we saw in our project, the first error can show up in as little as 3 minutes; it's totally random).
How to reproduce it
I will attach a .rar with the configs, content and source files. (source attachment)
I can provide you with the packaged version too if needed.
If a packet gets corrupted, the data should be dropped/ignored, and the variable should not be updated.
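As an illustration of that expectation at the application level, here is a minimal sketch that appends a CRC-32 to a manually serialized payload and rejects the data on the receiving side when the checksum does not match. This is an assumption about how one could guard the payload, not something the engine is confirmed to do, and the names Crc32, WithChecksum and TryExtractPayload are hypothetical. It uses std::vector instead of TArray so it compiles standalone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using ByteBuffer = std::vector<std::uint8_t>;

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF).
std::uint32_t Crc32(const ByteBuffer& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::uint8_t byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return ~crc;
}

// Sender side: returns the payload followed by its CRC-32 in little-endian order.
ByteBuffer WithChecksum(const ByteBuffer& payload) {
    ByteBuffer framed = payload;
    const std::uint32_t crc = Crc32(payload);
    for (int shift = 0; shift < 32; shift += 8) {
        framed.push_back(static_cast<std::uint8_t>((crc >> shift) & 0xFFu));
    }
    assert(framed.size() == payload.size() + 4);
    return framed;
}

// Receiver side: writes the payload to outPayload and returns true only if the
// trailing CRC-32 matches. On mismatch, outPayload is left untouched, so the
// caller keeps its previous value.
bool TryExtractPayload(const ByteBuffer& framed, ByteBuffer& outPayload) {
    if (framed.size() < 4) {
        return false;
    }
    const std::size_t payloadSize = framed.size() - 4;
    ByteBuffer payload(framed.begin(),
                       framed.begin() + static_cast<std::ptrdiff_t>(payloadSize));
    std::uint32_t stored = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        stored |= static_cast<std::uint32_t>(framed[payloadSize + i]) << (8 * i);
    }
    if (stored != Crc32(payload)) {
        return false;
    }
    outPayload = std::move(payload);
    return true;
}
```

With something like this, the client would call TryExtractPayload on the replicated bytes before deserializing, and simply skip the update when it returns false.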
At this point, I am curious and have a couple of questions:
Test result example
Here is a log result example:
Server side (test sequences)
client side (wrong sequence received):
As you can see, it was the same sequence in this case, with 2 different errors in it. This happened 25 minutes after the test began.
I was able to recreate the issue in a project on my end. The results are pretty inconsistent, but it does appear to be an issue.
I have created an issue report regarding it, which you can follow here:
Thanks for the report.
answered Nov 02 '16 at 08:00 PM
ImVawx ♦♦ STAFF
To start, if you are intentionally using third-party software to prevent your network from functioning as it normally does, that isn't a bug with the Unreal Engine; I understand it is for your test, but we can't verify bugs in this fashion.
Secondly, I did send an email out to a networking engineer and got an answer to how UE4 handles packet loss:
If packets are dropped, properties will eventually replicate such that the client will eventually match what the server says.
The caveat is that it's possible to change property a on frame 1, b on frame 2, and c on frame 3. If all 3 frames worth of packets were dropped, frame 4 could deliver the properties values on the same frame, even though they changed on separate frames.
On the other hand, if you change values all on the same frame, they are at a minimum guaranteed to arrive on the same frame (but you might also get other properties combined still as explained above).
This means that even though you are seeing a mismatch between your client and server with the byte arrays, UE4 is set up in a way that will make sure to sync the client to what the server has; it just might take some time for another packet to get through.
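The packet-loss behavior described above can be illustrated with a deliberately simplified model. This is a sketch, not the engine's actual replication code: it assumes every frame's packet carries the server's full current state, and the types Properties and Frame and the function SimulateClient are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Three replicated properties, named after the a/b/c example above.
struct Properties {
    int a;
    int b;
    int c;
};

bool operator==(const Properties& lhs, const Properties& rhs) {
    return lhs.a == rhs.a && lhs.b == rhs.b && lhs.c == rhs.c;
}

// One simulated frame: the server state at the end of the frame and whether the
// packet sent on that frame reached the client.
struct Frame {
    Properties serverState;
    bool delivered;
};

// Returns the client's view after each frame. A dropped packet leaves the client
// on its previous values; the next delivered packet brings it up to date with
// everything that changed in between, all at once.
std::vector<Properties> SimulateClient(const std::vector<Frame>& frames) {
    std::vector<Properties> history;
    Properties client{0, 0, 0};
    for (const Frame& frame : frames) {
        if (frame.delivered) {
            client = frame.serverState;
        }
        history.push_back(client);
    }
    return history;
}
```

If a changes on frame 1, b on frame 2 and c on frame 3, and all three packets are dropped, the client stays at its initial values for three frames and then sees all three changes together on frame 4. Note that in this model a lost packet only delays an update; it never produces bytes that the server did not send.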
Lastly, thank you for the detailed write up explaining your issue. We appreciate the time you took to put that together so we can investigate your problem.
Thank you for submitting a bug report, however at this time we believe that the issue you are describing is not actually a bug with the Unreal Engine, and so we are not able to take any further action on this. If you still believe this may be a bug, please provide steps for us to reproduce the issue, and we will continue our investigation.
I had the same problem. It did not happen when the dedicated servers were installed on the local network. However, it occurred frequently when we installed dedicated servers in foreign countries. I installed a dedicated server locally and easily reproduced it with "net pktloss = 30". It seems to occur not only with TArray but also with other replicated property types. This is a big problem that makes running a game service impossible. Why wait until 4.17?
answered May 11 '17 at 02:41 AM