[VR][Vive] FViewExtensionPreDrawCommand pulsing with huge cost

There are already threads about this problem, but the issue seems to occur in different scenarios.
[Serious proformance issue when use 4.15 forward shading pipeline with MSAA? - Rendering - Unreal Engine Forums][1]

[FViewExtensionPreDrawCommand - Rendering - Unreal Engine Forums][2]

This issue only seems to occur when using the HTC Vive.

We create actors dynamically at runtime and merge them together. How big the scene gets is up to the user.
The more objects are created, the more annoying the issue gets. However, it is not only related to the number of objects in the scene. When viewing the objects from afar (no LODs here), the scene might render without issue at 90 FPS. When viewing part of the exact same scene up close, FViewExtensionPreDrawCommand starts to pulse with high cost. The closer the camera is to an object, the stronger the pulsing (more pulses).

From the profiler:

Here you can see the high cost for FViewExtensionPreDrawCommand. The render time itself seems fine at ~4 ms.

This also shows that the issue persists not just for single frames, but for multiple seconds once it has started.
Note: The camera has been kept steady as much as possible.

In the code:

Which results in this function being called:

This in turn tells the views that rendering is about to start and that they can do their work now, which results in VR device transform updates and late updates for objects in the scene (we have no late-update objects on our end).

As stated in one of the linked posts, the issue seems to be related to the VR compositor for the Vive (SteamVR in this case) and to getting data from it, which makes it more confusing. Why does the compositor slow down when the player is close to an object, when neither the compositor nor the tracking system is aware of the scene being rendered at all?

I would do a more precise breakdown of the costly code by placing timers, but it is engine code and we do not have a custom build.
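For anyone who does have a source build, the kind of timer meant here could be as simple as the following sketch. It is plain C++ with `std::chrono`, not the engine's own stat system, and `ScopedTimer` is a hypothetical name used only for illustration.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical helper: measures the lifetime of a scope and prints it on exit.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* label)
        : label_(label), start_(std::chrono::steady_clock::now()) {}

    // Prints the elapsed time when the timed scope ends.
    ~ScopedTimer() { std::printf("%s: %.3f ms\n", label_, elapsedMs()); }

    // Milliseconds since construction.
    double elapsedMs() const {
        return std::chrono::duration<double, std::milli>(
                   std::chrono::steady_clock::now() - start_).count();
    }

private:
    const char* label_;
    std::chrono::steady_clock::time_point start_;
};
```

Placing `ScopedTimer t("pose update");` at the top of a block would print how long that block took once it exits.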

To add some more to this:
I just tested in a new VR template project in UE4.21. On the MotionControllerMap, FViewExtensionPreDrawCommand costs:

  • Vive: 7ms-9ms
  • Oculus: 0.05ms-0.15ms (yeah, that's more like it)

To me, it seems like there is definitely something wrong with the Vive/SteamVR.

Regarding the pulsing behavior:
I would guess that SteamVR notices that the application is running below 90 FPS, measured by how often the game requests transform data. If the application tends to run below 90 FPS, SteamVR/OpenVR tries to prevent frequent switches between 90 FPS and 45 FPS for the sake of the user, by holding the update back long enough to make the application run at 45 FPS for some time. This would also explain why the pulses in the performance graph above all have the same width.
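The jump from 90 FPS straight to 45 FPS is what you would expect if frames can only be presented on a VSync. A minimal sketch of that arithmetic, assuming a fixed refresh rate and ignoring any reprojection or hold-back logic in SteamVR (the function names are made up for illustration):

```cpp
#include <cmath>

// Number of refresh intervals a frame occupies if it can only be shown on a VSync.
int vsyncIntervalsPerFrame(double frameTimeMs, double refreshHz) {
    const double periodMs = 1000.0 / refreshHz;  // 11.11 ms at 90 Hz
    const int n = static_cast<int>(std::ceil(frameTimeMs / periodMs));
    return n < 1 ? 1 : n;
}

// Resulting frame rate: the refresh rate divided by the intervals used per frame.
double effectiveFps(double frameTimeMs, double refreshHz) {
    return refreshHz / vsyncIntervalsPerFrame(frameTimeMs, refreshHz);
}
```

With a 90 Hz display, `effectiveFps(10.0, 90.0)` gives 90 while `effectiveFps(12.0, 90.0)` gives 45, so a frame that is only slightly over budget already costs a whole refresh interval.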

Here is an additional tool for SteamVR that can be used to view frame timings.
https://developer.valvesoftware.com/wiki/SteamVR/Frame_Timing

Digging deeper into this, I came upon a thread in the Steam Dev Group (you can only access this link if you are signed into Steam and subscribed to the Steamworks Dev group): https://steamcommunity.com/groups/steamworks/discussions/20/458604254451860435/. The issue seems to be OpenVR's WaitGetPoses, which, according to that discussion, blocks until 3 ms before the next VSync occurs before returning the poses requested from the VR system.

From openvr.h, class IVRCompositor:


	/** Scene applications should call this function to get poses to render with (and optionally poses predicted an additional frame out to use for gameplay).
	* This function will block until "running start" milliseconds before the start of the frame, and should be called at the last moment before needing to
	* start rendering.
	*
	* Return codes:
	*	- IsNotSceneApplication (make sure to call VR_Init with VRApplicaiton_Scene)
	*	- DoNotHaveFocus (some other app has taken focus - this will throttle the call to 10hz to reduce the impact on that app)
	*/
	virtual EVRCompositorError WaitGetPoses( VR_ARRAY_COUNT(unRenderPoseArrayCount) TrackedDevicePose_t* pRenderPoseArray, uint32_t unRenderPoseArrayCount,
		VR_ARRAY_COUNT(unGamePoseArrayCount) TrackedDevicePose_t* pGamePoseArray, uint32_t unGamePoseArrayCount ) = 0;

This seems to be in place to get the best possible pose for the next frame to render. If you miss that point, the call seems to block until the next VSync.
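To make the blocking behaviour concrete, here is a small model of it. This is not OpenVR code, only a sketch under assumptions: VSyncs fall on exact multiples of the frame period, and the call unblocks a fixed "running start" (3 ms in the thread above) before each one. The function names are hypothetical.

```cpp
#include <cmath>

// Time (ms) at which a blocking pose request issued at callTimeMs would return,
// assuming it unblocks runningStartMs before each VSync.
double poseReturnTimeMs(double callTimeMs, double framePeriodMs, double runningStartMs) {
    // Index of the first VSync whose running-start point has not passed yet.
    const double k = std::ceil((callTimeMs + runningStartMs) / framePeriodMs);
    return k * framePeriodMs - runningStartMs;
}

// How long the caller sits inside the blocking call.
double poseWaitMs(double callTimeMs, double framePeriodMs, double runningStartMs) {
    return poseReturnTimeMs(callTimeMs, framePeriodMs, runningStartMs) - callTimeMs;
}
```

In this model, if the previous call returned at some point and the render thread then spends about 4 ms on work before calling again, the wait comes out as the frame period minus those 4 ms, roughly 7.1 ms at 90 Hz, which is in the range measured for the Vive above.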

See next post…

Well, the call to WaitGetPoses is made during the pre-render work, but it only returns 3 ms before the next frame is to be submitted to the HMD. Submitting works by providing a texture containing the rendered scene. But this means there are only 3 ms for rendering?! What am I missing?

For the RenderThread this is bad: if the render command for the pre-render work is enqueued early in the RenderCommandQueue, all the remaining time is wasted waiting for WaitGetPoses to return. Tracking this even deeper, it turns out that FViewExtensionPreDrawCommand is enqueued pretty late in the engine loop, after the world (actors) has been ticked. Render commands enqueued during Actor::Tick are therefore executed before the RenderThread stalls in OpenVR::WaitGetPoses.

In other words: if FViewExtensionPreDrawCommand shows up with 7 ms for the Vive, you still have 7 ms of spare time on the RenderThread, as long as the extra work is enqueued in the RenderCommandQueue before FViewExtensionPreDrawCommand.
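A toy model of that queue shows why the measured cost is headroom rather than work. It assumes commands run strictly in FIFO order and that the final pre-draw step blocks until a fixed deadline; `drainThenWait` and `RenderThreadResult` are illustrative names, not engine types.

```cpp
#include <vector>

struct RenderThreadResult {
    double waitMs;    // time spent blocked in the final pre-draw step
    double finishMs;  // time at which the pre-draw step returns
};

// Runs the queued work (durations in ms) in order, then blocks until deadlineMs.
RenderThreadResult drainThenWait(const std::vector<double>& workMs, double deadlineMs) {
    double now = 0.0;
    for (double w : workMs) {
        now += w;  // each command runs to completion before the next starts
    }
    const double wait = deadlineMs > now ? deadlineMs - now : 0.0;
    return {wait, now + wait};
}
```

With a 7 ms deadline, an empty queue waits 7 ms, while 5 ms of queued work waits only 2 ms and still finishes at 7 ms. Only once the queued work exceeds the deadline does the finish time move.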

See my other posts above.

TLDR:
If FViewExtensionPreDrawCommand shows up with 7 ms for the Vive, you still have 7 ms of spare time on the RenderThread, as long as the extra work is enqueued in the RenderCommandQueue before FViewExtensionPreDrawCommand.

If you get the frame pulsing shown in the picture in the first post together with a high value for FViewExtensionPreDrawCommand, the RenderThread is (most likely) fine. You need to optimize either the GameThread or the GPU time.

I'm confused: what do we have to do to get the ms down for FViewExtensionPreDrawCommand? I don't even have dynamic shadows.

"If you get the frame pulsing shown in the picture in the first post together with a high value for FViewExtensionPreDrawCommand, the RenderThread is (most likely) fine. You need to optimize either the GameThread or the GPU time."

  • Reduce the geometry of objects
  • Reduce draw calls
  • Reduce calculations on GameThread
  • Reduce Material complexity