Performance - Hardware Usage

Hey,

We are really happy to work with Unreal, a great engine to use.

We have performance problems with our game, yet the hardware seems to be mostly idle. We could use a hand figuring out why.

For testing I set up a new, empty project. It contains only one actor that spawns multiple Static Mesh Components.
The graph (left) is a float debug history with a max value of 0.02222 s (22.22 ms) and shows the frame delta time.
The scene is played in standalone mode.

I know that you can merge actors like this! This test is only about finding out why the hardware is “sleeping”.

Desired frame rate:

109135-settings.png

The scene:

CPU usage:

GPU usage:

109133-gpu.png

As you can see from the pictures, I want to get 120 FPS for this test, but that frame rate is not reached when playing. The CPU and GPU pictures show that the hardware could actually do much more, but simply does not. I would expect the hardware to put more power into the process when needed, and the need is there.

The GPU is only at 17% load and below its maximum frequency, but stat unit shows GPU = 15.2 ms for that frame. I guess the GPU is bound by the CPU render thread here (stat unit: Draw = 14.69 ms), is that correct? But if the CPU is the limiting factor, why does each core peak at only 60%? I would expect at least one core to run at 100% all the time if there is work to do.
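As a rough sanity check of those numbers, here is a small Python sketch. The helper names are made up for illustration; only the 120 FPS target and the two stat unit values come from this post. It only converts between frame times and frame rates and says nothing about which stage is the actual bottleneck.

```python
# Frame-time arithmetic for the stat unit values quoted above.

def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at a given target frame rate."""
    return 1000.0 / target_fps


def fps_ceiling(stage_time_ms: float) -> float:
    """Highest frame rate possible if one stage needs stage_time_ms every frame."""
    return 1000.0 / stage_time_ms


TARGET_FPS = 120.0  # desired frame rate for this test
DRAW_MS = 14.69     # stat unit: Draw (render thread)
GPU_MS = 15.2       # stat unit: GPU

budget = frame_budget_ms(TARGET_FPS)
draw_ceiling = fps_ceiling(DRAW_MS)
gpu_ceiling = fps_ceiling(GPU_MS)

print(f"Budget at {TARGET_FPS:.0f} FPS: {budget:.2f} ms per frame")
print(f"Draw = {DRAW_MS} ms allows at most about {draw_ceiling:.1f} FPS")
print(f"GPU  = {GPU_MS} ms allows at most about {gpu_ceiling:.1f} FPS")
```

So the render thread alone needs well over the roughly 8.3 ms that 120 FPS allows per frame.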

The hard drive is not being accessed and there is more than enough RAM.

[DxDiag][5]

Appreciate any help I can get.
Regards

Hi!
Gotta express my joy at seeing a detailed, specific question with screenshots. That is really nice.

Your performance is totally on par with what I would expect from this scene. Your guess about being render thread bound is totally correct. Your number of draw calls is at least two times higher than what I would consider “maximum acceptable” for myself on a PC scene. I would not expect CPU usage to reach 100% before you start experiencing a highly noticeable bottleneck.

Technically, you should try merging your objects into a single mesh or try using instanced meshes. Either should drastically improve performance in your case.
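To illustrate why instancing helps, here is a deliberately simplified counting model in Python. It is not engine code: the function names, the mesh name "Cube" and the material name "M_Basic" are hypothetical, and a real renderer also splits draw calls by material sections, LODs and render passes. It only shows the idea that identical components can collapse into one batch.

```python
# Simplified model: each component is described by a (mesh, material) pair.

def draw_calls_separate(components):
    """One draw call per component (assumes one material section each)."""
    return len(components)


def draw_calls_instanced(components):
    """One draw call per unique (mesh, material) pair once identical components are instanced."""
    return len(set(components))


# Hypothetical test scene: 5000 components sharing one mesh and one material.
scene = [("Cube", "M_Basic")] * 5000

separate = draw_calls_separate(scene)
instanced = draw_calls_instanced(scene)

print(f"separate components: {separate} draw calls")
print(f"instanced:           {instanced} draw call(s)")
```

In this idealized case the render thread submits one batch instead of thousands, which is where the saving would come from.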

GPU clock speed and load are also not the most conclusive indicators. You should consider giving the GPU profiler a try.

Lastly, make sure that you run your tests in a standalone game. Play in Editor usually has added costs.

Thanks for your reply.
This is just a test scene and has nothing to do with what I want to achieve. I create custom meshes at runtime, and each must remain editable after creation. That makes merging nearly impossible, and instanced meshes are not possible either.
That test was run in standalone, as mentioned.

I am also developing for VR (HTC Vive). For that I need a constant 90 FPS. The 120 FPS in the test scene was just for testing.

I am just wondering why the hardware seems to idle when there is a lot to do to reach the minimum desired FPS.

Edit: I would be happy to reach half of those 5000 draw calls. Performance already drops when reaching 500 in the project.

Make sure you also try running it as a ‘Standalone Game’ and not just PIE. I get different performance and bottlenecks when I try it out in that mode!

“Performance already drops when reaching 500 in the project.”

Could you be more specific about this? Maybe the root of the issue is elsewhere.

Thanks Deadthrey for trying to help.

I don't know how to be more specific. The test scene above is an empty project with just one actor that spawns multiple static mesh components. The data shown in the screenshots was captured after every static mesh component had been placed, so it is a static scene.

In the real project I use RuntimeMeshComponent to build things at runtime, but the result for the hardware usage is the same. So I showed the simple scene with nothing in it, following the rule “if it doesn't work, try a minimal example”.
I also tried the minimal example above with RuntimeMeshComponent boxes, with the same result.
The actual project is not much more, just more code to create the objects placed in the world. After creation each one is just a mesh in the scene: no GameThread usage, just RenderThread/GPU. No changes were made to post-processing. I'm aware of the doubled render time for head-mounted displays (HTC Vive). The required 90 FPS increases the render load as well.
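A back-of-the-envelope estimate of what that means for a draw call budget, as a Python sketch. It rests on loud assumptions: that the roughly 5000 draw calls mentioned earlier belong to the same measurement as Draw = 14.69 ms, that render thread cost scales linearly with the number of draw calls, and that stereo rendering simply doubles the work. The function name is made up for illustration.

```python
# Rough draw call budget, assuming cost scales linearly with draw call count.

def max_draw_calls(budget_ms: float, cost_per_call_ms: float, view_count: int = 1) -> int:
    """Largest number of draw calls that fits into budget_ms when each call
    costs cost_per_call_ms per rendered view (linear model, an assumption)."""
    return int(budget_ms / (cost_per_call_ms * view_count))


MEASURED_CALLS = 5000     # draw calls in the test scene (assumed to match the timing below)
MEASURED_DRAW_MS = 14.69  # stat unit: Draw for that scene

cost_per_call_ms = MEASURED_DRAW_MS / MEASURED_CALLS
budget_90_fps_ms = 1000.0 / 90.0  # about 11.11 ms per frame for the Vive

mono_calls = max_draw_calls(budget_90_fps_ms, cost_per_call_ms)
stereo_calls = max_draw_calls(budget_90_fps_ms, cost_per_call_ms, view_count=2)

print(f"cost per draw call: {cost_per_call_ms * 1000:.2f} microseconds")
print(f"90 FPS, one view:   about {mono_calls} draw calls")
print(f"90 FPS, two views:  about {stereo_calls} draw calls")
```

Under these assumptions the budget would be a bit under 1900 draw calls for VR, so a drop at around 500 in the real project is not explained by this simple model alone.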

Thanks Razer313,

You might have overlooked it, but as mentioned, I tried it in standalone. I noticed that in PIE the GameThread time is always high and the RenderThread time low. In standalone it is the other way around: RenderThread high, GameThread low.

RuntimeMeshComponent is a plugin, if I'm not mistaken.
Are you referring to the plugin or Procedural Mesh Component or Static Mesh Components?

Yeah, you are right. It's the plugin. Sorry for not posting a link, since it is not a standard component.

Still up for some more suggestions on that topic.

It does not matter whether I use RuntimeMeshComponent or simple Static Meshes; the result is the same. The test was made with Static Meshes, which should be looked into first.

Up for any suggestions.

Regards
Rumbleball

OK, well. I came to the solution by trying something else. In the example it was just not that obvious because of the CPU's 8 cores. The render thread presumably runs at the maximum possible speed, but it is only one thread. That thread is passed from core to core, so every core gets a share of the load.
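Here is a tiny Python simulation of that effect. It is an idealized sketch: the function name is made up, and a strict round-robin hand-over every time slice is an assumption, since a real scheduler decides where a thread runs by its own rules. It shows how one thread that is busy 100% of the time never makes any single core look fully loaded.

```python
# One always-busy thread that gets moved to the next core after every time slice.

def simulate_migrating_thread(core_count: int, time_slices: int) -> list:
    """Return the per-core utilization (0.0 to 1.0) when a single busy thread
    is handed to the next core in round-robin order every time slice."""
    busy_slices = [0] * core_count
    for slice_index in range(time_slices):
        busy_slices[slice_index % core_count] += 1
    return [busy / time_slices for busy in busy_slices]


per_core = simulate_migrating_thread(core_count=8, time_slices=8000)
total_cores_used = sum(per_core)

print("per-core utilization:", per_core)
print("cores' worth of work:", total_cores_used)
```

Each of the 8 cores ends up at 12.5% from this one thread, even though the thread itself never waits. Other threads add their own load on top, so a monitor showing every core well below 100% is compatible with a single saturated thread.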

But why does the CPU keep passing the thread around instead of keeping it on one core? This is just a thought, but it may be that the CPU always tries to keep the load equal across all cores for the sake of heat dissipation. Metal deforms if heat is not applied evenly over the whole body. If heat is produced over a larger area, it is also easier to pass it on to the cooling unit. It might also reduce power consumption, which in turn reduces heat production.

Thanks everyone for trying to help, always appreciated.