TickGroup queue logic causes Apex deadlock

Hi,

When an Actor is spawned during TG_PrePhysics, it will be added to TG_StartPhysics (even if the Actor has TG_PrePhysics as its TickGroup).
In our case this can cause a deadlock in Apex, because the Actor will tick and execute UDestructibleComponent::CreatePhysicsState, which can overlap with ApexScene::fetchResults.

The deadlock shows that it's not allowed to run UDestructibleComponent::CreatePhysicsState during TG_StartPhysics.

FTickFunction::QueueTickFunction has an exception for the case when a TickGroup gets promoted to TG_DuringPhysics.

Shouldn’t this exception also apply to promotions to TG_StartPhysics?
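
To make the question concrete, here is a small stand-alone model of the promotion logic. It is only a sketch and not the engine source: the enum merely mirrors the tick group names used above, and ETickGroupModel, CanLandInGroup, ResolveGroupForNewTick and the bAlsoSkipStartPhysics flag are all hypothetical names. It assumes a tick function queued while a group is running can start in the next group at the earliest, and that a promoted tick function skips forward past any group it is not allowed to land in. Which group the real engine should fall through to is not established here.

```cpp
#include <algorithm>

// Illustrative model only; mirrors the order of the tick groups named above.
enum class ETickGroupModel : int {
    PrePhysics = 0,
    StartPhysics,
    DuringPhysics,
    EndPhysics,
    PostPhysics
};

// Hypothetical rule: may a promoted tick function land in this group?
// DuringPhysics is always skipped (the existing exception described above).
// bAlsoSkipStartPhysics models the change being asked about.
bool CanLandInGroup(ETickGroupModel Group, bool bAlsoSkipStartPhysics) {
    if (Group == ETickGroupModel::DuringPhysics) {
        return false;
    }
    if (bAlsoSkipStartPhysics && Group == ETickGroupModel::StartPhysics) {
        return false;
    }
    return true;
}

// Resolves the group a tick function actually runs in when it is queued
// while the Running group is already executing (assumed behaviour).
ETickGroupModel ResolveGroupForNewTick(ETickGroupModel Desired,
                                       ETickGroupModel Running,
                                       bool bAlsoSkipStartPhysics) {
    const int Last = static_cast<int>(ETickGroupModel::PostPhysics);
    // The running group has already started, so the next group is the earliest.
    const int Earliest = std::min(static_cast<int>(Running) + 1, Last);
    int Actual = std::max(static_cast<int>(Desired), Earliest);
    // Only a promoted tick function is moved past disallowed groups.
    if (Actual != static_cast<int>(Desired)) {
        while (Actual < Last &&
               !CanLandInGroup(static_cast<ETickGroupModel>(Actual),
                               bAlsoSkipStartPhysics)) {
            ++Actual;
        }
    }
    return static_cast<ETickGroupModel>(Actual);
}
```

In this model, an Actor that wants TG_PrePhysics and is spawned during TG_PrePhysics resolves to StartPhysics when the flag is false, which is the case that deadlocks for us, and to EndPhysics when the flag is true.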

Deadlock details

ApexScene::fetchResults first locks all render resources and will later try to lock the physics scene with ApexScene::fetchPhysXStats before it unlocks the render resources.

UDestructibleComponent::CreatePhysicsState will lock the physics scene before flushing the GPhysCommandHandler. If the command handler contains a command that destroys a destructible actor, it is possible that DestructibleScene::scheduleChunkShapesForDelete is called, which will in turn try to lock the render resource of this object.
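
The two paths above take the same two locks in opposite orders, which is a classic lock-order inversion. The following stand-alone sketch shows a minimal check for that pattern. It does not touch Apex or PhysX; the lock labels, LockOrder, HasLockOrderInversion and the two constants are hypothetical names that just encode the acquisition order described in this post.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Order in which one code path acquires its locks, outermost first.
using LockOrder = std::vector<std::string>;

// Returns true when the two paths acquire some pair of locks in opposite
// orders, which is the precondition for a two-thread deadlock.
bool HasLockOrderInversion(const LockOrder& A, const LockOrder& B) {
    for (std::size_t i = 0; i < A.size(); ++i) {
        for (std::size_t j = i + 1; j < A.size(); ++j) {
            // A takes A[i] before A[j]; look for B taking A[j] before A[i].
            const auto LaterInA = std::find(B.begin(), B.end(), A[j]);
            const auto EarlierInA = std::find(B.begin(), B.end(), A[i]);
            if (LaterInA != B.end() && EarlierInA != B.end() &&
                LaterInA < EarlierInA) {
                return true;
            }
        }
    }
    return false;
}

// Acquisition orders as described above (labels are illustrative only).
const LockOrder FetchResultsPath = {"RenderResources", "PhysicsScene"};
const LockOrder CreatePhysicsStatePath = {"PhysicsScene", "RenderResources"};
```

HasLockOrderInversion(FetchResultsPath, CreatePhysicsStatePath) returns true, so whether the deadlock actually triggers depends only on timing. Either making both paths take the locks in the same order or keeping the two paths from overlapping would remove the inversion.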

Hey -

Can you explain how your actor is set up and spawned into the level? Additionally, could you provide reproduction steps to help me investigate the deadlock on my machine?

Cheers

Hey ,

The actor is spawned from C++ with GetWorld()->SpawnActor. The spawn code is executed from a delegate that is tied to the UPrimitiveComponent::OnComponentHit.

As for reproduction steps, I will see if I can set up a fairly empty scene to reproduce the issue. Keep in mind that we have not been able to reproduce the issue on yet, only on consoles, with different reproduction rates between the consoles. But looking at the code, it does appear to be a generic issue.

Here are the callstacks of the threads which are in the deadlock:
(Red is where the render resources are locked, Green is where the physics scene is locked)

78877-apexstack.png

Kind Regards,

Wilco

When you say that this is occurring on console, are you using a or XBoxOne? If you’re also packaging for handheld devices does this occur on those builds as well?

Yes, I was referring to and XboxOne. We don't use any handheld devices.

Hey ,

I have tried several things to reproduce the issue but I have not been able to reproduce the deadlock outside the game.
I did verify that the code flow was similar, so it can happen in theory, but because it's timing-related it will be hard to reproduce in an example environment.

But as you can see from my screenshots, the locking behavior is not symmetrical. So even if you can't reproduce it, it will still be a potential issue.

Hello -

I was curious whether you had the chance to make a sample project that reproduces the issue and that you could share? If not, can you provide step-by-step instructions for me to set up a test on my end?

If creating a sample project isn’t an option, would it be possible to share the current project where the lock occurs? If you’re able to zip the project, you can post it here, or you can upload it to Google Drive or Dropbox and send me a private message on the forums with the download link for privacy.

Hi ,

The number of destructible meshes was only increased to get timing similar to what we get in game. When the timing is right, it can deadlock when spawning only one destructible mesh during a frame.

Kind Regards,

Wilco

Hey -

I was able to test the sample project you sent and have entered a bug report (UE-27660) for further investigation. Based on the sample project, it appears to be tied to how many destructible meshes are spawned at once, so reducing the number spawned should reduce the chances of this lock occurring.
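
One way to reduce how many meshes are spawned at once is to queue spawn requests and run only a few per frame. The sketch below is a stand-alone illustration and not engine code: FSpawnBudgetQueue and its members are hypothetical names, and each queued callback stands in for whatever actually performs the spawn. As noted earlier in the thread, the deadlock can also occur with a single spawn in a frame, so this only lowers the odds and does not remove the underlying lock-order problem.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper that limits how many spawn requests run per frame.
class FSpawnBudgetQueue {
public:
    explicit FSpawnBudgetQueue(std::size_t InMaxPerFrame)
        : MaxPerFrame(InMaxPerFrame) {}

    // Stores a request instead of executing it immediately.
    void Enqueue(std::function<void()> SpawnRequest) {
        Pending.push_back(std::move(SpawnRequest));
    }

    // Intended to be called once per frame; runs at most MaxPerFrame
    // requests in FIFO order and returns how many were executed.
    std::size_t Flush() {
        std::size_t Executed = 0;
        while (Executed < MaxPerFrame && !Pending.empty()) {
            std::function<void()> Request = std::move(Pending.front());
            Pending.pop_front();
            Request();
            ++Executed;
        }
        return Executed;
    }

    std::size_t NumPending() const { return Pending.size(); }

private:
    std::size_t MaxPerFrame;
    std::deque<std::function<void()>> Pending;
};
```

With a budget of 2 and five queued requests, three consecutive Flush calls execute 2, 2 and 1 requests, so the spawns are spread over three frames.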

Cheers