CPU render thread Optimization

Does anyone know where to find CPU render thread optimization tools?

I’ve used stat dumpframe -ms=.1 and gone through the logs, but I can’t find what exactly is making the Draw time so high. Note: this is in a 1280x720 window. At 1920x1080 fullscreen, the Draw and GPU times are nearly identical at all times.

Thanks in advance. If someone could point me in a direction, that’d be great.

Cheers,

Sean

Start the game in standalone and then run console command ‘stat startfile’ and move around the scene (or just stand still, whatever) and let it run for a few seconds, then run ‘stat stopfile’. (Make sure to run stat stopfile before terminating the standalone game process, otherwise it will produce a broken profile log). It will create a log file in MyProject/Saved/Profiling/UnrealStats/SomeLongDirectoryName/SomeLongFileName.ue4stats.

If you upload that saved ue4stats file, I will take a look at it for you, and try to help troubleshoot it.

Hello,
Thanks very much for taking the time.

Here is the file:
https://dl.dropboxusercontent.com/u/10548949/SSG_Statfile.7z

How do you open/read it in Excel or OpenOffice Calc?

Cheers.

You can view it using the Session Frontend in UE4 editor (or standalone). Switch to the Profiler tab and then load the file.

I’ll look at it now and see if I can find slow things in your project.

Thanks,

I did end up googling it and am now in the Session Frontend, although I have no idea what I’m looking at. :)

There seems to be a lot of “CPU stall - wait for event”…

Ok, so I found two obvious sources of slowness.

On the game thread, you are spending about 1.4ms inside of the logic of a button widget Blueprint. I’m not sure exactly what it’s doing, but it seems like an unreasonable amount of time to spend working on a button.

Of course, that won’t dramatically impact your rendering performance, but it’s good to reduce whatever you can in the game thread.

Next, and most importantly, is your rendering thread:

It is spending the majority of its time in InitViews, which almost always indicates time spent computing view relevance and more fine-grained occlusion culling.

So my guess is that you have a large number of objects in your scene (around 2000?) and that the engine is having a hard time doing occlusion culling on them.

One thing you can try is toggling on or off HZB occlusion culling and see if it changes the performance profile. For some scenes it will make it worse, for others it will make it better. It defaults to enabled. You can disable it with:

r.HZBOcclusion 0

Or you can use 1 to enable it.
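
If it helps, this is roughly the sequence I would type into the in-game console to compare the two settings. ‘stat unit’ and ‘stat InitViews’ are standard stat commands; the order, and the idea of standing in one fixed spot while toggling, are just my suggestion and not an official procedure. Lines starting with # are notes, not commands, so don’t type them.

```
# Show Frame / Game / Draw / GPU times on screen
stat unit
# Show visibility and culling counters
stat InitViews
# Turn HZB occlusion off and note the Draw time
r.HZBOcclusion 0
# Turn it back on and compare
r.HZBOcclusion 1
```

Keep the camera in the same position for both readings; otherwise the two Draw times are not comparable.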

But if your scene in general is just really hard to cull, you’ll have to change other things. Mark your meshes that don’t move as static instead of movable. Combine meshes that don’t move relative to each other as single meshes. (Don’t build large complex buildings out of single panel static mesh actors, for example – build the whole structure in your modeling tool separately.) If you have many copies of the same mesh, consider using Instanced Static Mesh component instead of regular static mesh component.

You can try making your scene easier to cull by blocking off stuff that you don’t need to see with large simple meshes. You can use distance-based culling and LOD to reduce or remove stuff at a distance.

You can also take more extreme measures. You can try using pre-computed visibility, though this is typically only used for mobile platforms that can’t spend much CPU time on the dynamic occlusion that UE4 uses by default. You’ll have to place visibility/occlusion volumes into your scene yourself, and then run the build visibility action from the Build button/menu.

In the very worst case, where you know everything in your scene definitely always needs to be visible and there are large numbers of objects, you can just turn off occlusion culling entirely in the project preferences. It means that you will potentially spend much more time drawing and rendering stuff, but if you know it’s always going to be visible anyway, then you save time by not bothering to do occlusion culling.

Thanks so much for a comprehensive look.

I am using instanced static meshes extensively. Since this is a real-life building for a work project, the screenshot I sent is everything; that’s how we have to show it. I will test with HZB occlusion culling. The scene does have many, many meshes (~2000), and we are working hard to bring that down with the solutions you mentioned.

Thanks for taking the time. My main query was in regard to the CPU Draw time and how I can isolate the problem, which you seem to have touched on.

Is there any place I can read through for a more comprehensive understanding of the profiler and what these things mean?

Thanks again for your time,

Sean

I don’t think there is any single comprehensive guide. The information is scattered around the documentation and source code.

Visibility culling (and “what’s relevant to spend time on drawing?” in general) is a complicated topic and one of the most important parts of any realtime rendering engine.

In some cases you may find that regular static mesh components are faster than instanced static mesh components and vice versa.

You can see how well you are optimizing things (or making things worse!) by running the command ‘stat InitViews’, which displays measurements for visibility culling and related work. Move the camera around the scene, watch how the numbers respond, and try to get a feel for what you need to work on or what the problem might be.
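
Two other console commands that I believe exist in stock UE4 can make the culling results easier to see alongside ‘stat InitViews’. Neither was run against your project, so treat this as a sketch of things to try rather than a diagnosis. Lines starting with # are notes, not commands.

```
stat InitViews
# Freeze the visibility state at the current camera, then fly around
# to see what was culled; run the command again to unfreeze
FreezeRendering
# Draw bounding boxes for primitives that were occlusion culled
# (assumption: this may only work in editor viewports)
r.VisualizeOccludedPrimitives 1
```

If large parts of the building still show up as visible from viewpoints where they should be hidden, that would point towards the scene being hard to cull rather than towards a single expensive object.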

Alright.
Thanks again for all your help.
Cheers,

Sean