Replicating lots of data: Replication, RPC, or custom sockets?

I am trying to find the quickest way to replicate a voxel world, made up of chunk actors. Each chunk contains a replicated array of visible voxels.

I have noticed that the built-in replication is slow/throttled for these actors when the array is large. It is especially slow when new players join, as it takes over 1 second to replicate each actor. What are some other techniques for replicating large amounts of data?

I know the built-in replication has some restrictions, and the throttling is intentional. I’ve also read it won’t replicate arrays that are larger than 2048 entries.

I have seen some efforts to compress the voxel data before replication with NetSerialize, but I can’t seem to find any working examples. I thought that the engine would try its best to compress (gzip?) the data when replicating anyway?

Should I continue to pursue replication with compression, attempt to use RPCs to sync the data, or use sockets?

Having a quick think about this, I'm wondering if replicating Actors that each store a section of the map might be better. You can have mesh components in these Actors rather than using arrays. This might be better because you can use relevancy and frequency based on where the player is loading in, instead of pushing the entire map whether they can see it or not.

That's my goal. I want the nearby actors to stream quickly, though, as they currently take a long time to transfer. By using replicated actors, I'm letting the engine decide whether or not an actor is relevant based on distance. But right now, all of the nearby actors still take a long time to transfer.

I am looking into using FCompression to compress a uint8 array representing the voxels.
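Before handing anything to a compressor, the chunk needs to be laid out as one contiguous byte buffer. Here is a minimal sketch of that layout step, assuming hypothetical 16×16×16 chunks (the names and dimensions are mine, not from the engine; the exact `FCompression::CompressMemory` signature also varies between engine versions, so check your version's header):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical chunk dimensions -- adjust to whatever your chunks use.
constexpr std::size_t CHUNK_X = 16, CHUNK_Y = 16, CHUNK_Z = 16;

// Flatten (X, Y, Z) into a single index so the whole chunk lives in
// one contiguous uint8 buffer that can be passed straight to a
// compressor (e.g. FCompression::CompressMemory in UE4).
inline std::size_t VoxelIndex(std::size_t X, std::size_t Y, std::size_t Z)
{
    return X + Y * CHUNK_X + Z * CHUNK_X * CHUNK_Y;
}

// One byte per voxel, zero-initialized ("air").
std::vector<uint8_t> MakeChunkBuffer()
{
    return std::vector<uint8_t>(CHUNK_X * CHUNK_Y * CHUNK_Z, 0);
}
```

Keeping the buffer in a fixed, predictable order also helps the compressor, since identical voxel types end up adjacent.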

Yeah, it's a difficult problem. Each Actor creates a new replication layer for itself too, so the more Actors you have, the worse it'll get. Though there might be a sweet spot in there where you have few enough Actors that they replicate quickly, but each one doesn't have such large packets.

That does sound like a plan. I'm guessing you've already optimized for a fixed voxel size so all coordinates fit on a grid. Then you can multiply by the grid size to get a larger range: for instance, if you store a coordinate in a byte and the grid size is 10.0, that's a 2550.0 (25.5 m) range, which is small since it's in cm, but if your grid size is in the 100.0 range then that's 25500.0 (255 m). That could be large enough to optimize each Actor's chunk; it does depend on how many voxels we are talking about.
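The quantization idea above can be sketched in a couple of lines: store each coordinate as one byte of grid units and recover world units (centimetres in UE) by multiplying by the grid size. The function name is hypothetical:

```cpp
#include <cstdint>

// Store a voxel coordinate as one byte of grid units; multiply by
// the grid size (cm per grid step) to get the world-space position.
// With GridSize = 10.0 the byte covers 25.5 m; with 100.0, 255 m.
inline float GridToWorld(uint8_t GridCoord, float GridSize)
{
    return GridCoord * GridSize;
}
```

Three bytes per voxel position instead of three floats is a 4x saving before compression even runs.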

It's possible you may need to hide voxels that aren't visible, especially in large chunks, as you don't want non-visible nearby Actors streaming in over the network needlessly.

I think sockets might be of some use in this case, especially since I'm assuming you want to replicate actual game data alongside this too.

I'm always happy to see other people working on the same stuff as me. Voxels… I am making a voxel game with a plain C++ server and a Unity3D client. Recently I also started implementing an Unreal client, but I'm still using Enet as my server…

Anyway, since I do voxel tests in almost every engine I come across, I always end up with some form of trouble.

I will share my experience with all the engines.

  1. C++/Ogre3D/Enet: This was the easiest part. However, making a game this way ended up being extremely frustrating. I managed to sync a big portion of the world while the player was connecting, then incrementally load chunks around the player, using Enet and compressing the stream with LZ4.

  2. Unity client / C++ Server

    2.1. Enet: Same as 1 but realized how slow C#/Mono is for voxel processing.

    2.2. Unity’s replication: Raged a lot at it.

    2.3. C# .net socket: Failed to find a way to properly communicate between C# and C++, cancelled.

    2.4. Photon: Didn’t even try it…

  3. Unity Client / Unity Server / LZ4 C++ plugin: Managed to compress and stream the voxel data and then start streaming the world around players. However, C# became the bottleneck, again…

  4. Unreal: Tried it and decided that Unreal’s replication is also slow as hell for voxels. However, we have the C++ source of Unreal, and once I understand how to alter the throttling and how the low-level networking protocols work, I will eventually succeed.

Ideas you can try:

  1. Don’t send all voxels; send the seed and formula, and only send DIFFs. This way you send as little data as possible, but you must keep a diff-map. By diff-map I mean: you send the seed and formula and guarantee that generation is deterministic, and for each voxel you also store its diff state (a single bit). If a chunk has more than 30% of its voxels diffed, send the entire chunk; it will be cheaper! Otherwise, send a list of coordinates and values for only the modified voxels. This will work in general. It is the only way I managed to beat my Unity/Enet attempt, and I will probably use the same algorithm no matter which engine I actually use.

  2. Don’t send ANY voxels. Send meshes instead. However, I had a problem: since I had to support unlimited materials per shader pass (using 3D arrays), I had to support material-per-vertex. That meant generating material IDs per vertex and blending between them, so I also needed to store each vertex’s index and use forced triangle lists (with duplicated vertices, because they differed by their ID!). In the end, compressing this mesh data was worthless: I didn’t get ANY ratio, because floating-point arrays are practically incompressible. LZ4 managed to smash 200 MB of voxel data down to 2 MB! And 6 MB of mesh data down to… 4 MB. What?
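Idea 1 above (the diff-map with a 30% cutoff) can be sketched as follows. This is a minimal illustration under my own assumptions: the client regenerates the chunk deterministically from the seed/formula, and all names here are hypothetical:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Decision result: either ship the raw chunk, or just the diffs.
struct ChunkDiff
{
    bool bSendWholeChunk = false;
    std::vector<std::pair<uint32_t, uint8_t>> Modified; // (voxel index, new value)
};

// Compare the deterministically generated chunk against the current
// (player-modified) one. If more than 30% of voxels differ, sending
// the entire chunk is cheaper than per-voxel (index, value) pairs.
// Assumes both buffers have the same length.
ChunkDiff BuildDiff(const std::vector<uint8_t>& Generated,
                    const std::vector<uint8_t>& Current)
{
    ChunkDiff Diff;
    for (uint32_t i = 0; i < Current.size(); ++i)
        if (Current[i] != Generated[i])
            Diff.Modified.push_back({i, Current[i]});

    // 30% threshold from the post: Modified/size > 3/10.
    if (Diff.Modified.size() * 10 > Current.size() * 3)
    {
        Diff.bSendWholeChunk = true;
        Diff.Modified.clear();
    }
    return Diff;
}
```

The real cutoff depends on how many bytes one diff entry costs on the wire versus one raw voxel, so 30% is a tunable heuristic rather than a fixed constant.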

This is my research so far. I don’t yet know the Unreal internals, but in BOTH Ogre3d/C++ AND Unity, I was forced to write custom shaders, so these things were a lot of research.

So, my answer to your question would be: custom sockets and NO throttling.

Thanks for taking the time to share your experience. I have had some initial success using UE4’s built in replication to serve zlib compressed arrays of visible-only voxels.

I’ve been packing the visible voxels into uint8 arrays, the indices first and then run-length-encoded types corresponding to each index, and then using UE4’s built-in ZLIB compression. They transfer at decent speeds now, especially if you slightly raise the maximum transfer rate.
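The run-length step for the type stream can be as simple as emitting (count, value) byte pairs. A minimal sketch (not the exact encoding from the post; runs longer than 255 are split so the count still fits in one byte):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Run-length encode a stream of voxel types into (count, value)
// byte pairs. Works best when the input is ordered so identical
// types sit next to each other.
std::vector<uint8_t> RunLengthEncode(const std::vector<uint8_t>& Types)
{
    std::vector<uint8_t> Out;
    std::size_t i = 0;
    while (i < Types.size())
    {
        const uint8_t Value = Types[i];
        uint8_t Count = 0;
        // Extend the run, capped at 255 so Count fits in a byte.
        while (i < Types.size() && Types[i] == Value && Count < 255)
        {
            ++Count;
            ++i;
        }
        Out.push_back(Count);
        Out.push_back(Value);
    }
    return Out;
}
```

For example, `{1, 1, 1, 2, 2}` encodes to `{3, 1, 2, 2}`. As the reply below notes, if the data is ordered this way anyway, zlib alone will find the repetition, so RLE may be redundant on top of it.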

It’s not the best solution as I can’t prioritize which chunks are replicated and don’t really have much control. TCP sockets seem like they will be the way to go.

I am just playing with untextured blocks right now, learning as I go. Would be interested in having a chat with you though. Feel free to message me on Twitter @staringatlights

BTW, no RLE is needed. If you order your data so it’s repetitive, zlib will smash it down a lot anyway. I’m @petersvp everywhere.

Update: I’ve abandoned the built-in actor replication for voxel chunks in favor of using RPC functions to manually sync the data. Here’s why:

Actor replication does not allow you to prioritize which chunks get sent first or control the speed at which they are sent. It’s also inefficient for the server to constantly evaluate whether each actor should be replicated to each client.

I am now using a single “World” actor for each player that exists on both server and client, and is owned by the client. This provides easy bi-directional communication. On the server, the world actor is responsible for getting the player’s position and sending all nearby chunks to its client counterpart, which then renders them. I have much more control over the order they are sent in and how they are throttled. Chunks are children of UMeshComponent and attach to the world actor.
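The prioritization that built-in replication doesn't give you reduces to sorting the pending chunk queue by distance to the owning player before the World actor streams them out over RPCs. A minimal sketch of that ordering step, with hypothetical names:

```cpp
#include <algorithm>
#include <vector>

// A chunk waiting to be streamed to the client; just its centre
// position in world units for ordering purposes.
struct PendingChunk
{
    float X = 0, Y = 0, Z = 0;
};

// Sort the send queue so the chunks nearest the player go first.
// Squared distance is enough for ordering, so no sqrt is needed.
void SortByDistanceToPlayer(std::vector<PendingChunk>& Pending,
                            float PlayerX, float PlayerY, float PlayerZ)
{
    auto DistSq = [&](const PendingChunk& C)
    {
        const float DX = C.X - PlayerX;
        const float DY = C.Y - PlayerY;
        const float DZ = C.Z - PlayerZ;
        return DX * DX + DY * DY + DZ * DZ;
    };
    std::sort(Pending.begin(), Pending.end(),
              [&](const PendingChunk& A, const PendingChunk& B)
              { return DistSq(A) < DistSq(B); });
}
```

Each tick, the server-side World actor could pop the front few entries and send them via a client RPC, giving per-player throttling as well as per-player ordering.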