Low FPS, CPU Stalled when Collision or Physics enabled

I am seeing serious CPU stalling after adding 100,000 instanced meshes with collision and physics enabled. I would expect CPU and GPU usage to go up, but the opposite happens: as I increase the number of instances, CPU and GPU usage fall below 15% and the frame rate drops to as low as 16 fps.

I recreated the same very simple setup in 4.8.3 and was surprised to see that collision and physics performed much better in that older editor version.

For the measurements I used a HISM (Hierarchical Instanced Static Mesh) component to spawn the built-in EditorCube mesh (only 12 triangles) with the WorldGridMaterial applied. Both are engine defaults. I also removed the fps limit so I could measure the best possible rendering performance. Here are the results in 4.8.3, then 4.15.1.

4.8.3 - 100,000 cubes
340 fps / 25% cpu / 100% gpu (no collision)
340 fps / 30% cpu / 100% gpu (collision only)
120 fps / 17% cpu /  40% gpu (physics + collision)
------------------------------------------------------------------

4.15.1 - 100,000 cubes
300 fps / 25% cpu / 100% gpu (no collision)
150 fps / 20% cpu /  40% gpu (collision only)
 45 fps / 15% cpu /  28% gpu (physics only)
 38 fps / 14% cpu /  22% gpu (physics + collision)

4.15.1 - 150,000 cubes
300 fps / 28% cpu / 100% gpu (no collision)
100 fps / 16% cpu /  30% gpu (collision only)
 16 fps / 12% cpu /  10% gpu (physics + collision)
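
To make the physics + collision rows easier to compare, they can be converted from fps to time per frame. This is only a small arithmetic sketch in Python, not anything from the engine; the function name and the dictionary labels are made up for illustration, and the fps values are the ones from the tables above.

```python
def frame_time_ms(fps: float) -> float:
    """Return the average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps

# Physics + collision results copied from the tables above.
results_fps = {
    "4.8.3, 100,000 cubes": 120,
    "4.15.1, 100,000 cubes": 38,
    "4.15.1, 150,000 cubes": 16,
}

frame_times = {label: frame_time_ms(fps) for label, fps in results_fps.items()}

for label, ms in frame_times.items():
    print(f"{label}: {ms:.1f} ms per frame")

# Like-for-like comparison: same cube count, different engine version.
slowdown = frame_times["4.15.1, 100,000 cubes"] / frame_times["4.8.3, 100,000 cubes"]
print(f"4.15.1 takes {slowdown:.2f}x as long per frame as 4.8.3 at 100,000 cubes")
```

At the same cube count the frame takes roughly three times as long in 4.15.1, and at 150,000 cubes a frame takes 62.5 ms.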

When I increase the number of cubes from 100k to 150k with physics and collision enabled, the frame rate drops to 16 fps, yet neither the CPU nor the GPU appears to be the bottleneck. So what is? I entered the console command "stat DumpFrame -ms=10" to see what is taking so long, and this is what I found:

LogStats:  55.581ms (   2)  -  TG_EndPhysics - STAT_TG_EndPhysics - STATGROUP_TickGroups - STATCAT_Advanced
LogStats:    54.017ms (   2)  -  CPU Stall - Wait For Event - STAT_EventWait - STATGROUP_CPUStalls - STATCAT_Advanced
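
To put these two numbers in context, here is a rough sketch that parses the two lines and compares the wait time with the frame time. It assumes the dump was captured in the 16 fps case (150,000 cubes), that a single dumped frame is representative of the average, and that the number in parentheses is a call count. The regular expression and helper name are only for illustration and were written against these two lines, not against the full dump format.

```python
import re

# The two lines from the stat dump above, copied verbatim.
dump = """\
LogStats:  55.581ms (   2)  -  TG_EndPhysics - STAT_TG_EndPhysics - STATGROUP_TickGroups - STATCAT_Advanced
LogStats:    54.017ms (   2)  -  CPU Stall - Wait For Event - STAT_EventWait - STATGROUP_CPUStalls - STATCAT_Advanced
"""

# Captures: time in ms, the number in parentheses, the description, the STAT_ name.
LINE_RE = re.compile(r"LogStats:\s+([\d.]+)ms\s+\(\s*(\d+)\)\s+-\s+(.+?)\s+-\s+(STAT_\w+)")

def parse_dump(text):
    """Map each STAT_ name to (milliseconds, count, description)."""
    entries = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            ms, count, description, stat = match.groups()
            entries[stat] = (float(ms), int(count), description)
    return entries

entries = parse_dump(dump)
end_physics_ms = entries["STAT_TG_EndPhysics"][0]
wait_ms = entries["STAT_EventWait"][0]

frame_ms = 1000.0 / 16  # assumption: the dump was taken at 16 fps
print(f"Wait as share of TG_EndPhysics: {wait_ms / end_physics_ms:.1%}")
print(f"Wait as share of a 16 fps frame: {wait_ms / frame_ms:.1%}")
```

Under those assumptions, almost all of TG_EndPhysics is spent waiting on an event, and that wait alone accounts for most of the frame.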

Can somebody please explain why the CPU is stalled here? Is there any way to change this behavior so that more small physics objects can be added to the scene?

i7-4790, 16 GB RAM, GTX 1070, Windows 7
UE 4.15.1 (built from the 20th Feb repo)
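
One possible reading of the low CPU percentages, offered only as a guess: the i7-4790 has 4 cores and 8 logical threads, so if the CPU figure is total utilization averaged over all logical threads (an assumption, since the measurement method is not stated here), one fully busy thread shows up as 12.5%. The sketch below converts the reported physics + collision percentages into "threads' worth of work"; the function name is hypothetical.

```python
LOGICAL_THREADS = 8  # i7-4790: 4 cores with Hyper-Threading

def busy_threads(total_cpu_percent, logical_threads=LOGICAL_THREADS):
    """Convert total CPU utilization into an equivalent number of fully busy threads."""
    return total_cpu_percent / 100.0 * logical_threads

# CPU usage reported above for the physics + collision cases.
reported = {
    "4.8.3, 100,000 cubes": 17,
    "4.15.1, 100,000 cubes": 14,
    "4.15.1, 150,000 cubes": 12,
}

for label, percent in reported.items():
    print(f"{label}: {percent}% is about {busy_threads(percent):.2f} busy threads")
```

If that assumption holds, the 150,000 cube case corresponds to roughly one thread doing work while the rest wait, which would fit the "Wait For Event" stall in the dump. This is an interpretation, not something the dump proves.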

I have attached a demo project that reproduces the situation. It must be opened in 4.15. To change the number of instances, set the blueprint's public values in the Details panel.

Hi Konflict,

You mention that you tested this setup in 4.8. Do you have that project available so I can test the severity of the regression?

Hello TJ Ballard,

Thank you for looking into it! I have just attached the project for 4.8, although it is the same setup as in the 4.15.1 project. While the testing shows that 4.8 performed better (especially when no interaction happens), I would like to stress that the main point of my bug report is the underused CPU and GPU hardware, while rendering is seriously stalled and the fps drops to unbearably low values.

Would it be possible to change how the collision / physics engine behaves so that it makes better use of the CPU for physics? I have seen benchmarks on the internet showing that 100% CPU utilization is very much possible in other PhysX implementations. In UE4, CPU consumption stays very limited under most circumstances. I think this is the more important thing to look into.

Hey Konflict,

I’ve tested this across multiple versions and it appears to break between 4.13 and 4.14, which happens to be when we upgraded to PhysX 3.4. There isn’t really any immediate fix for you, but I’ve entered UE-42956 for investigation. You can track the report’s status as the issue is reviewed by our development staff. Please be aware that this issue may not be prioritized or fixed soon.