Asynchronous Media texture update

In MediaTextureResource.cpp, the rendering thread calls FMediaTextureResource::UpdateDeferredResource.

The function looks like this:

// Lock the texture, copy the decoded frame in, then unlock.
uint8* TextureBuffer = (uint8*)RHILockTexture2D(Texture2D, 0, RLM_WriteOnly, Stride, false);
FMemory::Memcpy(TextureBuffer, CurrentFrame->GetData(), CurrentFrame->Num());
RHIUnlockTexture2D(Texture2D, 0, false);

It’s slow. I’m benchmarking 3 ms for an HD texture (1920x1080). My goal is to play back a 4K movie at 60 fps smoothly, with no hiccups and no skipped frames.

My first optimization was to replace the Lock-Memcpy-Unlock sequence with RHIUpdateTexture2D. It cuts the time by approximately a factor of 3, which is not bad. What this function ultimately calls is Direct3DDeviceIMContext->UpdateSubresource.

Now, there’s this presentation from NVidia:

Slide 14, it says:
Avoid UpdateSubresource() for textures! Especially bad with larger textures!

Instead, NVidia suggests using a ring buffer of staging textures: update and upload the data on the main thread, then call Map and CopyResource on the rendering thread.

I guess the right way to do it is to upload the frames on the main thread as fast as they arrive, and to bind the resource in the game thread to data that is already on the device.

Is that something you are looking at?

Is there something in the engine that I can use to quickly try it?

Or maybe there’s an alternate solution that we haven’t figured out yet?

Thanks,

Some others are interested in this kind of feature here:

Hey, thanks for your feedback. This is good information. The Media Framework is still in a somewhat experimental state, and we are aware that performance optimizations are needed. It’s something we are looking into now. Another aspect that slows things down is the two copies that are currently required: one from the decoder thread, and one on the render thread. We are aiming to eliminate all copies, but that requires a bit of redesign of our material system.

I don’t have any alternative solutions for you right now, but I will update this thread as improvements are being checked in. If you have any other ideas, we’d love to hear about them, and we do appreciate GitHub pull requests as well :slight_smile:

Just letting you know that we’ve started working on this.

I’ve made an Oculus presentation of a convention center and I have video playing on a bunch of monitors in the convention center.

When there is no video on the monitors, the “game” runs at 58 FPS / 17 ms in the Oculus; it looks great and plays smoothly.

When the video files are attached, it runs at 43 FPS / 23 ms and is very steppy when I turn my head.
The size of the video texture map (WMV) doesn’t matter: I made a test file of 10 frames of black that are 9 KB each, and it plays at the same rate as when they are 7 MB each.

Is there any solution to this issue? Is there something different I could be doing to speed up the gameplay?

It’s built in 4.6.1


I think the relevant factor right now is the resolution of the video, not the file size. You could try a smaller resolution and see if that helps.

This won’t get any faster until we change the way the dynamic textures are updated. Unfortunately, everyone is still slammed with a variety of tasks, and progress is slow. I don’t know if this will make it into 4.8, but we will try.

Thank you. I have since come to that conclusion from reading other posts and doing some tests. I’m currently testing to find the maximum resolution I can use while keeping optimal performance.

Not yet - we’re working on it.

I am also trying to improve the performance of my 4K textures. The lock/unlock calls are particularly slow, even on my fast dev machine. Could you give us some pointers on what we can try to work around this until it is in the engine?
Thanks a bunch!

IMediaVideoTexture has a new API for a texture-copying fast path that eliminates one extra copy: BindTexture and UnbindTexture. These are not implemented yet for any player plug-in other than PS4Media, but you could take a look at what we do there.

Another thing we are planning to do is to implement pixel format conversion on the GPU, which, when implemented for WmfMedia, will allow for playback of H.264 on Windows and also improve performance and quality. The basics for that are in place - again, check the PS4Media implementation for pointers.

FYI I have started working on refactoring the API, so that decoders can write directly into texture memory. I’m trying to get this in before GDC. We are also working on another mechanism that will eliminate the last remaining copy from the Render Thread to the GPU on platforms with unified memory. This might not be finished until after GDC.

Starting with 4.13 there will be several ways to update the media texture render target:

  1. Render directly to FRHITexture (only available if player is able to write on render thread; player provides buffer)
  2. RHIUpdateTexture2D (only available if player is able to write on render thread; decoder provides buffer)
  3. Triple buffer with RHILockTexture2D/RHIUnlockTexture2D (available on any thread)
  4. Update the texture sink resource directly (available on any thread; fastest method)

Hi @gmpreussner, could you quickly point us to example code for the first two techniques? EDIT: I see the VLC plugin is using the IMediaTextureSink, which seems to be the standard approach.

All methods use the IMediaTextureSink interface. Note that it currently exposes four sets of methods. In the 4.13 code base they’re pretty well documented.

Search the code base for the respective function calls, e.g. GetTextureSinkTexture or AcquireTextureSinkBuffer, to see where and how they’re used.

Hi @gmpreussner, could you give an example or some documentation for the last method?

Currently, the only player plug-in that uses this method is AvfMedia on macOS and iOS: https://github.com/EpicGames/UnrealEngine/blob/master/Engine/Plugins/Media/AvfMedia/Source/AvfMedia/Private/Player/AvfMediaTracks.cpp#L1203

You’re basically giving the media texture a new FRHITexture2D instance that contains your decoded frame buffer. How to convert or wrap an existing GPU texture target into an FRHITexture2D-compatible resource depends on the platform; AvfMedia uses the FAvfTexture2DResourceWrapper helper class for this.