[Feedback] Why are development assets stored in a binary format?

I’ve decided to give feedback about something that has been bugging me for some time now. Why are the source/development assets (and especially blueprint assets) stored in a binary format?

I completely agree with wanting cooked assets to be binary: it tends to be faster to load and faster to parse. But for development purposes, a binary format can quickly become a nightmare for any team when more than one person edits the same asset. The only “solution” to this is to use exclusive locks - so now you’re relying on the mechanics of source control, which means that something like git won’t (easily!) work well with Rocket projects.

Obviously this is always an issue with certain file formats (such as sound, image and so on) - I’m just wondering what the decision process was behind making something that can be modified using text (via C++ source code) end up being impossible to merge?

I’m sure this isn’t going to change at this point, I’m just interested as to how we reached this point. And although it hasn’t actually been an issue yet, I know I’m going to hit problems as the team size ramps up and assets are locked after I’ve added or removed UPROPERTYs (for example!). :slight_smile:

Well, I guess that decision was made 10 or 15 years ago.

I am not going to try to justify that decision, but I think you are holding on to several myths about text based formats.

  • Myth: Text based file formats are human readable
  • Myth: Text based assets will have sensible diffs
  • Myth: Text based assets can be merged
  • Myth: Text based assets can be edited with a text based editor
  • Myth: Text based assets don’t need versioning

None of those are true, in general. Might be true for certain assets, not true in general.

Maybe certain assets could be marked as safe for text. For example, copy an actor in the editor and paste it into a text editor. Edit the text in the text editor and paste it back into the editor. Works right? Now break the text and try again…crash right?

I think it is fairly reasonable to have a “save as text” check box for some assets that are certified to be reasonably safe to use a text based format. I would worry a little about proper error control though.

Thanks for your reply! You definitely make some interesting points.

That said, I don’t think that I’m necessarily holding onto several myths. Obviously we’re both making generalizations here but:

  • Text based file formats are human readable

Completely depends on the format. I agree with you that just because it’s ASCII (or whatever) doesn’t make it human readable. You’re right. That said, something like JSON is pretty readable.

  • Text based assets will have sensible diffs
  • Text based assets can be merged

Again, depends on the format. I hate merging XML, for example, but at least it’s possible. JSON, so long as the properties are consistently sorted, is pretty easy to diff and to merge.
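
To show what I mean by consistently sorted properties, here is a rough sketch in Python. The asset and its property names are made up for illustration and are not any real UE4 format. Writing the keys in sorted order, one per line, means a diff only shows the property that actually changed, even if the tool that saved the file emitted the keys in a different order.

```python
import difflib
import json


def canonical_lines(asset):
    """Serialize with sorted keys, one property per line, so diffs stay stable."""
    return json.dumps(asset, sort_keys=True, indent=2).splitlines()


# Hypothetical asset data: the property names are assumptions for this example.
before = {"name": "Door", "health": 100, "mesh": "SM_Door"}
# Same asset after an edit, with the keys written in a different order.
after = {"mesh": "SM_Door", "health": 150, "name": "Door"}

diff = list(
    difflib.unified_diff(canonical_lines(before), canonical_lines(after), lineterm="")
)

# Keep only the added/removed lines, skipping the "---" / "+++" file headers.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

for line in changed:
    print(line)
```

Only the "health" line shows up as changed; the key reordering produces no noise in the diff.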

  • Text based assets can be edited with a text based editor

Well, that’s not really a myth. Whether you’d want to is a matter of opinion but if you can’t edit text in a text editor then… :wink:

  • Text based assets don’t need versioning

Of course they do. I’d argue that text-based assets version better because, generally speaking, you can create deltas - something that is rarely worth doing with binaries.


Knowing that it’s been this way for 10-15 years makes it a lot more understandable to me - legacy being what it is. But it does mean that no large team using UE4 could ever use a DVCS such as Mercurial or git because there is no locking mechanic (well, without extensions) there for obvious reasons.

You are right about error control and that’s definitely a problem - one that’s avoided by keeping everything binary.

I guess I’m just interested as to how large teams will scale well with UE4 given that a lot of code is going to be written in Blueprint - meaning that only one person can work on a given asset at any time. That’s really what I’m thinking about.

Anyway, thanks for your reply - you’ve given me some insight!

What you are missing is that there is no way to guarantee that anything you do with a text editor results in a loadable or usable asset. And worse, it might load fine and generally work fine, but still be invalid in some subtle way, crashing later for mysterious reasons. So it is a myth that you can edit them with a text editor in general. Deltas might be completely worthless. Merges might be completely impossible: “everything is different”. JSON might not be readable in any real sense of the word.
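
A small sketch of the “loads fine but is subtly invalid” case, in Python. The node-graph layout here is invented for the example and is not how UE4 stores anything. The hand-edited text is still perfectly valid JSON, so a parser accepts it, but one link now points at a node that no longer exists.

```python
import json

# Hypothetical text asset: a tiny node graph (made-up format).
original = (
    '{"nodes": ["Begin", "Branch", "End"],'
    ' "links": [["Begin", "Branch"], ["Branch", "End"]]}'
)

# A hand edit renames "Branch" to "Select" but misses one reference to it.
edited = (
    '{"nodes": ["Begin", "Select", "End"],'
    ' "links": [["Begin", "Branch"], ["Select", "End"]]}'
)


def dangling_links(text):
    """Parse the graph and return links that mention unknown nodes."""
    graph = json.loads(text)  # succeeds for both versions
    known = set(graph["nodes"])
    return [link for link in graph["links"] if not set(link) <= known]


print(dangling_links(original))
print(dangling_links(edited))
```

Catching this takes a check that understands the asset, which is exactly the error control a text format would need on top of parsing.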

I think it is a mistake to “write lots of code in blueprint”, but I think you can diff it today.

If you are using a version control system that doesn’t support locking of binary assets, well that clearly isn’t going to work very well. Using a text based format won’t solve that problem. Multiple people working on the same “sufficiently complex” asset at the same time can only end in tears, and this has nothing to do with text vs binary.

Oh, no, I’m not missing that. That’s why I wrote “You are right about error control and that’s definitely a problem - one that’s avoided by keeping everything binary.”

That’s definitely the best argument you could make to me to keep development assets binary. :slight_smile:

I’d love to know more about diffing blueprints - that could alleviate the majority of my worries!

Thank you for your replies by the way, they’ve been super useful. :slight_smile:

The diff feature in Blueprint works really nicely. So that definitely allays a lot of my concerns about these being binary :slight_smile:

I was just looking at blueprint diffing. It’s pretty cool though I feel like the UI could be improved. It’s perfect when only one or two things have changed but quickly becomes very “busy” when lots of changes are made.

Since I have concerns about people (such as mission scripters) not being able to work on the same mission at the same time I’ll try and engineer the game to make sure that doesn’t happen. :slight_smile:

If you want to provide feedback on this, please submit another post so that it can be routed to the right people.

It’s very simple. When you store a number in a text file, think about “12345678”: that is a string of 8 characters, each taking its own space. If chars are 1 byte (ASCII), then that thing is 8 bytes.
That same number, stored in binary as an integer, is 4 bytes (a normal int32), and an int32 can hold numbers up to about 2.1 billion.
It’s even worse with float types, like vertex positions. Something like 12343.5432624 is perfectly possible, and that’s 13 characters, or 13 bytes. A normal float is 4 bytes, although it only keeps about 7 significant digits of that number; a double keeps all of them in 8 bytes.
You don’t need to be a computer scientist to see that 4 is less than 13, and the difference can get MUCH more extreme.
That’s the reason UE and almost all file formats that NEED performance and small size are binary. Apart from being much smaller, they are also much easier to load.
When you load an integer variable from a binary file, it’s just a “copy-paste” of memory with no calculations involved (assuming the file and the machine use the same byte order). If you have that in text, the program has to parse the string and create an integer from it, which is much more resource demanding. As UE needs every bit of memory and speed it can get, every asset file is stored as binary, and probably also compressed.
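
A quick way to check those byte counts, sketched in Python with the standard struct module. The little-endian layout ("<") is an assumption made for the example and says nothing about how UE actually writes its files.

```python
import struct

# Integer: 8 ASCII characters as text, 4 bytes as a little-endian int32.
number = 12345678
int_as_text = str(number).encode("ascii")
int_as_binary = struct.pack("<i", number)
print(len(int_as_text), len(int_as_binary))

# Float: count the characters, then pack as 32-bit and 64-bit floats.
value = 12343.5432624
float_as_text = repr(value).encode("ascii")
as_float32 = struct.pack("<f", value)
as_float64 = struct.pack("<d", value)
print(len(float_as_text), len(as_float32), len(as_float64))

# Reading back: the 32-bit float has lost some digits, the 64-bit one has not.
from_float32 = struct.unpack("<f", as_float32)[0]
from_float64 = struct.unpack("<d", as_float64)[0]
print(from_float32, from_float64)
```

The binary forms are smaller, but the 4-byte float only approximates the original value, so the size comparison with the full decimal text is not quite like for like.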

I think you missed the part where I wrote “I completely agree with wanting cooked assets to be binary” because, yes, that makes total sense (for all of the reasons you just listed - and more!). :slight_smile:

I agree with you. Blueprints are actually scripting source code. Source code is naturally better in text format. It makes version control tools, merge tools and diff tools work perfectly, because these tools are designed for text-based source files.

Binary code formats are very bad. It is just like letting the version control tools manage the .obj files compiled by Visual C++. It is a total nightmare. The .obj files are hard to merge, diff, and check during review. So, more often than not, .obj files are not allowed to be committed to the version control system.

Why not make the Blueprint sources text based during development, and cook them to binary when published?

I totally agree with you. I hope Epic thinks about this carefully.

Merging and pulling compiler’s intermediate files to the repository? Oh, why?

Because there is no source code for Blueprints. The source code for Blueprints is what you see - nodes. That then gets compiled into byte-code. There is no other form of Blueprints.

You can dump Blueprints to C++ code, that will not compile AFAIK, but that’s all.

P.S. That question is very old, and it would be better to start a new one.

Will the new Blueprint merge tool work well with existing version control tools?

Hi,

This is an archived post from our closed beta of UE4. If you would like to post a new question, feel free to post it on the AnswerHub, or if you want to discuss a topic, you can post on the Unreal Engine Forums.

Thank you!

Tim