Performance - Hardware Usage

Hey,

We are really happy to work with Unreal, a great engine to use.

We have performance problems with our game, yet the hardware seems to be mostly idle, so we could use a hand.

For testing I set up a new, empty project. It contains only one actor that spawns multiple Static Mesh Components. The graph (left) is a float debug history with a max value of 0.02222ms that shows the delta frame time. The scene is played in standalone mode.

I know that you can merge actors like this! This test is only meant to find out why the hardware is "sleeping".

Desired frame rate: alt text

The scene: alt text

CPU usage: alt text

GPU usage: alt text

As the pictures show, I want to get 120 frames per second for this test, but that frame rate is not reached when playing. The CPU and GPU pictures show that the hardware could actually do much more, but simply does not. I would expect the hardware to put more power into the process when needed, and the need is there.

The GPU is at only 17% load and below its maximum frequency, but stat unit shows GPU = 15.2ms for that frame. I guess the GPU is bound by the CPU render thread here (stat unit: Draw = 14.69ms). Is that correct? But if the CPU is the limiting factor, why is each core at a maximum of 60%? I would expect at least one core to always run at 100% if there is work to do.

The hard drive is not being accessed and there is more than enough RAM.
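To put the numbers above in perspective, the target frame rate can be converted into a per-frame time budget and compared with the stat unit readings. This is only a back-of-the-envelope sketch; the helper names `frame_budget_ms` and `max_fps` are made up for illustration and are not part of Unreal.

```python
def frame_budget_ms(target_fps):
    """Time available per frame, in milliseconds, for a given target frame rate."""
    return 1000.0 / target_fps


def max_fps(frame_time_ms):
    """Upper bound on the frame rate if one stage alone takes frame_time_ms per frame."""
    return 1000.0 / frame_time_ms


# Target from the question: 120 FPS.
print(f"120 FPS budget: {frame_budget_ms(120):.2f} ms per frame")

# stat unit readings quoted in the question.
for label, ms in (("Draw", 14.69), ("GPU", 15.2)):
    print(f"{label} = {ms} ms -> at most {max_fps(ms):.1f} FPS")
```

With a Draw time of 14.69ms, the render thread alone already exceeds the roughly 8.33ms budget that 120 FPS would allow, regardless of how idle the remaining hardware looks.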

DxDiag

Appreciate any help I can get. Regards

Product Version: UE 4.13
scene.png (1.1 MB)
cpu.png (56.1 kB)
gpu.png (16.5 kB)
settings.png (10.4 kB)
dxdiag.txt (93.7 kB)

asked Sep 30 '16 at 11:19 AM in Rendering

Rumbleball

3 answers

Hi! Gotta express joy of seeing a detailed, specific question with screenshots. That is really nice.

Your performance is totally on par with what I would expect from this scene. Your guess about being render thread bound is totally correct. Your number of draw calls is at least two times higher than what I would consider "maximum acceptable" for myself in a PC scene. I would not expect CPU usage to reach 100% before you start experiencing a highly noticeable bottleneck.

Technically, you should try merging the objects into a single mesh or try using instanced meshes. It should drastically improve performance in your case.

GPU clock speed and load are also not the most conclusive factors. You should consider giving the GPU profiler a try.
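Why merging or instancing helps can be sketched with a deliberately simplified draw-call count. The model below is an assumption for illustration only: it treats every separate component as one draw call and every unique mesh/material pair as one instanced batch, and it ignores that the engine issues additional calls per rendering pass. The component count and the names `"Cube"` and `"M_Basic"` are hypothetical.

```python
def draw_calls_separate(components):
    """Simplified model: every component is submitted as its own draw call."""
    return len(components)


def draw_calls_instanced(components):
    """Simplified model: components sharing mesh and material form one instanced batch."""
    return len(set(components))


# Hypothetical scene: many copies of the same mesh with the same material.
components = [("Cube", "M_Basic")] * 5000

print("separate components:", draw_calls_separate(components))
print("instanced batches:  ", draw_calls_instanced(components))
```

In this toy scene the render thread would submit one batch instead of thousands of individual calls, which is the kind of reduction the suggestion above is aiming for.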

Lastly, ensure that you are doing tests in standalone game. Play in Editor usually has added costs.

answered Sep 30 '16 at 03:44 PM

Deathrey

Rumbleball Oct 04 '16 at 11:03 AM

Thanks for your reply. This is just a test scene and has nothing to do with what I want to achieve. I create custom meshes at runtime, and each one has to stay editable after creation. That makes merging nearly impossible, and instanced meshes are not possible either. The test was run in standalone, as mentioned.

I am also developing for VR (HTC Vive). For that I need a constant 90 FPS. The 120 FPS in the test scene was just for testing.

I am just wondering why the hardware does not seem to do more when there is a lot to do to reach the minimum desired FPS.

Edit: I would be happy to reach half of those 5000 draw calls. Performance already drops when reaching 500 in the project.

Deathrey Oct 09 '16 at 09:57 AM

"Performance already drops when reaching 500 in the project."

Could you be more specific about this? Maybe the root of the issue is elsewhere.

Rumbleball Oct 10 '16 at 08:02 AM

Thanks, Deathrey, for trying to help.

I don't know how to be more specific. The test scene above is an empty project with just one actor that spawns multiple static mesh components. The data shown in the screenshots was captured after every static mesh component had been placed, so it is a static scene.

In the real project I use RuntimeMeshComponent to build things at runtime, but the result for the hardware usage is the same. So I showed the simple scene with nothing in it, following the rule "if it doesn't work, try a minimal example". I also tried the same minimal example with RuntimeMeshComponent boxes, which gives the same result. The actual project is not much more, just more code to create the objects placed in the world. After creation it is just a mesh in the scene: no game thread usage, only render thread and GPU. No changes were made to post-processing. I am aware of the doubled render time for head-mounted displays (HTC Vive). The required 90 frames per second increase the render load as well.

Razer313 Oct 07 '16 at 09:13 PM

Make sure you also try running it as a 'Standalone Game' and not just PIE, I get different performance and bottlenecks when I try it out in that mode!

Rumbleball Oct 10 '16 at 08:05 AM

Thanks Razer313,

you might have overlooked it, but as mentioned I tried it in standalone. I noticed that in PIE the game thread is always high and the render thread low. In standalone it is the other way around: render thread high, game thread low.

Deathrey Oct 10 '16 at 10:56 AM

RuntimeMeshComponent is a plugin, if I'm not mistaken. Are you referring to the plugin, to Procedural Mesh Component, or to Static Mesh Components?

Rumbleball Oct 10 '16 at 11:13 AM

Yeah, you are right. It's the plugin. Sorry for not posting a link, as it is not standard.


Still up for some more suggestions on that topic.

It does not matter whether I use RuntimeMeshComponent or simple Static Meshes. The result is the same. The test was made with Static Meshes, which should be looked into first.

Up for any suggestions.

Regards Rumbleball

answered Nov 03 '16 at 12:33 PM

Rumbleball

OK, well. I came to the solution by trying something else. In the example it was just not that clear because of the CPU's 8 cores. The render thread should run at the maximum possible speed, but it is just one thread. That thread is simply passed from core to core, so every core gets a share of the load.

But why is the thread always passed around instead of being kept on one core? This is just a thought, but it may be that the CPU always tries to keep the load equal across all cores because of heat dissipation. Metal deforms if heat is not applied evenly over the whole body. If the area of heat production is larger, it is also easier to pass the heat to the cooling unit. It might also reduce power consumption, which in turn reduces heat production.
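The averaging effect can be illustrated with a toy model. It is only a sketch under stated assumptions: the strict round-robin migration and the slice counts are invented for illustration and do not describe how the operating system really schedules the render thread, and `per_core_utilization` is a hypothetical helper.

```python
def per_core_utilization(num_cores, num_slices):
    """Toy model: one always-busy thread that moves to the next core every time slice.

    Returns the fraction of time slices during which each core was busy.
    """
    busy_slices = [0] * num_cores
    for slice_index in range(num_slices):
        core = slice_index % num_cores  # assumed round-robin migration
        busy_slices[core] += 1
    return [busy / num_slices for busy in busy_slices]


utilization = per_core_utilization(num_cores=8, num_slices=8000)
print("per-core load:", utilization)
print("total load in units of one core:", sum(utilization))
```

In this model no single core ever shows 100%, although the thread itself never idles: the sum over all cores equals exactly one fully loaded core. Other engine threads running at the same time would add to each core's share, so per-core readings well below 100% are compatible with a single saturated thread.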

Thanks everyone for trying to help, always appreciated.

answered Nov 13 '16 at 10:52 AM

Rumbleball
287 8 20 22

(comments are locked)
10|2000 characters needed characters left
Viewable by all users
Your answer
toggle preview:

Up to 5 attachments (including images) can be used with a maximum of 5.2 MB each and 5.2 MB total.

Follow this question

Once you sign in you will be able to subscribe for any updates here

Answers to this question