A question about multi-threading in Blueprints

Just a quick question. Is there a “lock” node in Blueprints currently? I don’t think there is, but I’m asking to make sure. By “lock” node, I mean a standard atomic lock/operation ability, whereby two threads can lock each other out and have guaranteed exclusive access to an object.

I don’t think this exists because, one, I couldn’t find a node for it, and two, I could not prove to myself that there was an issue. Say there is processing in the Tick event and processing in an Overlap event; both could fire at the same time, since they look like external interrupts to the Blueprint and neither the tick time nor the overlap event can be predicted. I made sure that the number of objects shared between the two (tick and overlap) was kept to a minimum, but I do have one place where a collision can occur. It probably lasts only a few nanoseconds, but it can happen.

So I was just curious, and if there is not a “lock” node, I would definitely vote for one! That would let me start thinking in that fashion, and if Epic Games ever moves to a parallelizing compiler, well then heck, I’m already good to go, so just bring it on!

Frankly I would love it. Idle threads being used? Sounds like a good time to me!

Thank you for your attention.

You are probably looking for the “Gate” node. You can use it to open one line of execution on the basis of some triggers and close others. You can check the docs: https://docs.unrealengine.com/latest/INT/Engine/Blueprints/UserGuide/FlowControl/index.html#gate

Thank you for your reply, very much appreciated, but a “gate” is not what I’m looking for. A “gate” cannot guarantee that an “update” to an object is “atomic”: any other thread could “update” the same object before the first thread had completed its processing on that object. An example.

Two Threads, Two Objects; what they are doing makes no difference. Update operation 1, performed by thread 1 on the object, takes 100 nanoseconds (again, as an example). Update operation 2, performed by thread 2 on the same object, takes 25 nanoseconds.

Sequence of events

Event 1 fires, causing thread 1 to perform the operation (an update), and is processing merrily along, happy as a clam, life is good, the customer bought the product, money in the bank, happy happy blueprint!

Event 2 fires 25 nanoseconds later, causing thread 2 to start an update of the same object. It does the update, which only takes 25 nanoseconds, so now we are 50 nanoseconds in from the firing of Event 1.

But wait, Event 1 takes 100 nanoseconds to complete, and we have no idea where it is currently executing. We only know that Thread 2, because of Event 2, has updated the object that Thread 1 is currently updating, so the object is in an indeterminate state. Thread 1 is still processing, thinking it had exclusive access to the object. It didn’t, as Thread 2 has altered its state. Thread 2 is the only thing in this whole mess that actually knows the state of the object, because it just finished. Yet Thread 2 doesn’t know that it should tell Thread 1 (and even if it did, it would still need some kind of lock to pass the information, and Thread 1 would have to go into some form of spin/wait lock in order to reliably receive that information, which we don’t want, because spin locks are hell; wait locks are not so bad).

But if there are lock nodes, then it can be guaranteed that Thread 1 and Thread 2 have mutually exclusive access to that object. Even if the lock never actually has to prevent this condition from occurring, it’s a small price to pay in comparison to “what can happen”, i.e. debugging race conditions, etc. Those can be a nightmare to figure out.
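For readers who end up doing this on the C++ side: a minimal sketch of the guarantee described above, written in plain standard C++ rather than against any Unreal API. `SharedObject`, `update`, and `run_locked_updates` are made-up names for illustration only.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared object; the name and fields are illustrative only.
struct SharedObject {
    std::mutex lock;      // guards every field below
    long long value = 0;
};

// One "update": a read-modify-write that must not interleave with another.
void update(SharedObject& obj, int amount) {
    std::lock_guard<std::mutex> guard(obj.lock);  // blocks until we hold the lock
    long long current = obj.value;                // read
    obj.value = current + amount;                 // modify and write
}                                                 // lock released here

// Starts `threads` workers that each apply `updates_per_thread` updates of +1
// to the same object, waits for them, and returns the final value.
long long run_locked_updates(int threads, int updates_per_thread) {
    assert(threads > 0 && updates_per_thread >= 0);
    SharedObject obj;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&obj, updates_per_thread] {
            for (int i = 0; i < updates_per_thread; ++i) update(obj, 1);
        });
    }
    for (auto& th : pool) th.join();
    return obj.value;
}
```

Because every update holds the mutex for its whole read-modify-write, `run_locked_updates(2, 100000)` always ends at 200000, no matter how the scheduler interleaves the two threads.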

So again, I was only curious, if

  1. there is a lock node, and a gate cannot do it.
  2. if blueprints were multi-threaded
  3. and if they are not, when will it happen!

Again, thank you for your reply, and have a great day!

I don’t think there is a LOCK node, but what if you use a bool variable? E.g. event 1 executes only when this bool variable is true. Whenever event 2 is fired, set the bool to false, which will lock out event 1. After event 2 completes, set the bool to true again, which will allow event 1 to fire again. In this way the two events can be mutually exclusive. I am just thinking of possibilities.

Hello Ambhar!

Thank you again. A variable in and of itself, cannot do it, else operating system theory would have been a lot simpler decades ago!

Going back to the example above, but shortening the time scale. Remember that the processor is switching context outside the domain of the game; that part is important, very important. Multi-core processors just make it more complex, but the theory is pretty much the same, and processors do provide “lock” operations. One of the oldest, which came out of the IBM System/370 architecture, was Compare Double and Swap. It was an atomic operation that was guaranteed to lock across all processors as a side effect of the instruction (think assembler), but I digress.

So we have 2 threads, both racing towards that variable. Thread 1 starts to do the update, and let’s say it takes 500 instructions (not unreasonable, especially if you include the overhead of calling functions, setting up the stack, etc.) to perform this update, which in the Blueprint graph looks like 1 “thing” occurring. But it’s not, it’s TONS of things occurring behind the scenes.

Now Thread 1 gets interrupted, the task switches, all the LGDT (local descriptor tables) get updated, our wonderful pages are now all set up, and we are off to the races…

Not so fast we aren’t. Remember Thread 1? It just got context switched out, but it only executed 250 of the 500 instructions it needed. It may not even be at the point of updating that object yet (in this case the variable that we are talking about using as a lock). Now along comes Thread 2, and it only needs 100 instructions to do the update (ask the compiler writer why; if it’s written in assembler, well, it should be the same).

Even if the number of instructions is the same, assume that Thread 2 was ready to do the update but Thread 1 had already used 90% of its time slice. The same thing is going to occur. Thread 2 will complete its update, Thread 1 will get switched back in thinking that no one has altered anything, it updates the variable, then checks it, and all is well…

Except Thread 2 has come and gone and done its update, and Thread 1 never even knew it happened. Now Thread 1 is totally oblivious to what happened, and again, we don’t know what the outcome will be.
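To make the difference concrete, here is a rough sketch in standard C++ (not Unreal API) of what the hardware compare-and-swap gives you that a plain bool cannot: the "check the flag" and "set the flag" steps happen as one indivisible operation, so no other thread can slip in between them. `SpinLock` and `run_spin_locked_updates` are hypothetical names, and a spin lock is used only because it is the shortest way to show the idea.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical lock built on a single atomic compare-and-swap.
class SpinLock {
public:
    // Succeeds only if the flag was false AND this call flipped it to true.
    // The test and the write cannot be separated by a context switch.
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true);
    }
    void lock() {
        while (!try_lock()) std::this_thread::yield();  // let others run
    }
    void unlock() { locked_.store(false); }

private:
    std::atomic<bool> locked_{false};
};

// Starts `threads` workers that each add 1 to a plain counter
// `updates_per_thread` times while holding the lock; returns the total.
long long run_spin_locked_updates(int threads, int updates_per_thread) {
    assert(threads > 0 && updates_per_thread >= 0);
    SpinLock lock;
    long long counter = 0;  // not atomic itself; protected by `lock`
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, updates_per_thread] {
            for (int i = 0; i < updates_per_thread; ++i) {
                lock.lock();
                counter = counter + 1;  // read-modify-write under the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

With an ordinary bool, two threads can both read "unlocked" before either writes "locked", which is exactly the interleaving described above. With the compare-and-swap, the second `try_lock()` fails until the holder calls `unlock()`.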

Thank you again!

You are correct that low level atomics are not exposed to blueprints in any way.

Np, thanks for explaining your question to the comment above.

Thank you for your answer!

Can’t wait to get this Blueprint done and ready for the Marketplace, then into C++, and see what kind of stuff I can get done there! lol

Have a great day, and again, thank you,

chuckles, and now for an answer on my other question, about a machine node!

Thank you

hmmmm, how to set this to resolved! There it is, I was just blind, and replying in the comment, instead of as an Answer.

Is being able to use multiple threads in BP planned for the future? Now that the BP->C++ converter is quite finished, will that make it easier to support multithreading in BP which then only works if the game is packaged? I think not being able to use multiple threads is like the only huge thing that’s making BPs way slower than C++ now.

You have probably seen my voxel game project, so you know why I want to have multithreading in BP :) (Minecraft-Like Infinite Voxel World in Unreal Engine 4 - YouTube)

I’ve totally seen the voxel project. Very inspirational. We don’t have any plans, and the big debate is whether we expose low-level, unsafe primitives or try to take a more modern approach that is safer (along the lines of Google’s Go). Both is also an option, but exposing primitives tends to limit what the compiler can do and softens safety guarantees.

Do you have a specific algorithm in mind? We might be able to put something together quickly given some context. Talking about multithreading in the abstract can be difficult.

I don’t really have a specific algorithm in mind, I just want to have a way to split computations up over multiple CPU cores.

A nice implementation would be something like this I think:

There would be a certain class which you can select when creating a blueprint. This class has one existing event like Work() (where you can set input variables) and another existing function like FinishWork() where you can only set return variables. From another blueprint (like an Actor) in the event graph you can use the create object node to create an object of the type we created earlier. Then you would call a function like DoWork on that Object which has the input and return variables you specified in the Work() event and the FinishWork() function, and this DoWork node would be a latent node with a “Finished” exec output pin. So once you call that node, it gets the input variables and executes the Work() event on a different thread. So inside of the Work() function you could do something that blocks the whole thread for a minute, but the game thread would not be affected.

The “Finished” exec pin executes once the FinishWork() function is called from within the Work() event in the thread and then you could use the variables that were returned to the Actor.

So like this, you could have an Actor that for example sends 2 int arrays to another thread, it would do some stuff (let’s say compare each index against the other array) and then send some data (like an int for how many values were equal) back to the Actor and the Actor would be able to use the int and do stuff based on it.

So it’s basically that you give it some data, you wait until it’s finished, and then you get some data back. You would of course create an object like this something like 8 times, and once all of them have executed the “Finished” exec pin you continue with whatever you wanted to do in the event graph of the Actor. If you have 8 cores, that would speed the calculation up by (almost) 8 times. And I think it should not be too much work to implement, since the functionality is there in C++; it’s basically just exposing it to BP, right?

In my project, the most expensive stuff is (probably; I can’t profile with the converter enabled, unfortunately) the check for which blocks are visible in a chunk when the chunk is loaded. That’s just looping over all (solid) blocks and checking whether any neighbor block is transparent. With the solution I explained above, I would just split the chunk up into something like 12 parts, let one thread do each part, and once all threads have finished I would continue to do the cheap stuff in the game thread, like adding the blocks to the world.
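As a rough illustration of that visibility check (single-threaded, plain C++, and not taken from the actual project): the `Chunk` layout, the cubic chunk size, and the rule that anything outside the chunk counts as transparent are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chunk: size^3 blocks in a flat array, true = solid.
struct Chunk {
    int size;
    std::vector<bool> solid;

    Chunk(int s, bool fill)
        : size(s), solid(static_cast<std::size_t>(s) * s * s, fill) {
        assert(s > 0);
    }

    std::size_t index(int x, int y, int z) const {
        return (static_cast<std::size_t>(z) * size + y) * size + x;
    }

    // Assumption: positions outside the chunk are treated as transparent.
    bool is_transparent(int x, int y, int z) const {
        if (x < 0 || y < 0 || z < 0 || x >= size || y >= size || z >= size)
            return true;
        return !solid[index(x, y, z)];
    }
};

// A solid block is visible if at least one of its 6 face neighbors
// is transparent.
bool is_visible(const Chunk& c, int x, int y, int z) {
    if (c.is_transparent(x, y, z)) return false;
    static const int offsets[6][3] = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}};
    for (const auto& o : offsets) {
        if (c.is_transparent(x + o[0], y + o[1], z + o[2])) return true;
    }
    return false;
}

// Loops over every block in the chunk and counts the visible ones.
std::size_t count_visible(const Chunk& c) {
    std::size_t visible = 0;
    for (int z = 0; z < c.size; ++z)
        for (int y = 0; y < c.size; ++y)
            for (int x = 0; x < c.size; ++x)
                if (is_visible(c, x, y, z)) ++visible;
    return visible;
}
```

Each block only reads its neighbors and writes nothing shared, which is why this loop is a natural candidate for being cut into independent parts.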

Inside of the Work() function you could of course not access anything apart from the input variables, so it’s 100% safe. You can just do calculations based on what you got and return something based on this.
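The "give it data, wait, get data back" idea can be sketched in standard C++ with `std::async`. This is only an approximation of the proposed Work()/FinishWork() nodes, not how Epic would implement them: `count_equal_range` plays the role of Work(), the returned value plays the role of the FinishWork() outputs, and both function names are made up for this example. It uses the two-int-array comparison mentioned earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical "Work()" body: counts positions in [begin, end) where both
// arrays hold the same value. It only reads its inputs and touches no
// shared state, so it needs no lock.
std::size_t count_equal_range(const std::vector<int>& a,
                              const std::vector<int>& b,
                              std::size_t begin, std::size_t end) {
    std::size_t equal = 0;
    for (std::size_t i = begin; i < end; ++i) {
        if (a[i] == b[i]) ++equal;
    }
    return equal;
}

// Splits the index range into up to `workers` chunks, runs each chunk on
// its own thread, then waits for all of them and sums the results.
std::size_t count_equal_parallel(const std::vector<int>& a,
                                 const std::vector<int>& b,
                                 std::size_t workers) {
    assert(a.size() == b.size() && workers > 0);
    const std::size_t n = a.size();
    const std::size_t chunk = (n + workers - 1) / workers;  // round up
    std::vector<std::future<std::size_t>> pending;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(n, begin + chunk);
        pending.push_back(std::async(std::launch::async, [&a, &b, begin, end] {
            return count_equal_range(a, b, begin, end);
        }));
    }
    std::size_t total = 0;
    for (auto& f : pending) total += f.get();  // the "Finished" point
    return total;
}
```

The caller blocks in `f.get()` here; a latent Blueprint node would instead return immediately and fire its “Finished” pin later, but the data flow (inputs in, results out, nothing shared in between) is the same.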

sorry to hijack this but can you point me to more info about the BP C++ converter?

There is an experimental project setting (Project Settings/Packaging/Nativize Blueprint Assets) in 4.11, with many bug fixes in master, that generates code, compiles it, and links it into the game executable when packaging a build.

Hi

I am working on a multithreading plugin for Blueprints. I have already submitted it to the Marketplace and it should be available soon. Meanwhile, you can have a look at this thread: Multi-threading in Blueprint - Work in Progress - Unreal Engine Forums