Media framework: 4K stereo playback performance (HEVC)

We are trying to play 4K stereo videos through UE's Media Framework, but are running into performance problems. 4K mono (H.264) plays just fine, but 4K stereo (HEVC) plays at around 0.5x speed at best and occasionally degrades to a slideshow.

I have not done much testing yet, but I would like to ask up front whether I should even expect this to work. Here is what I know so far:

4K mono video, 4096x2048, mp4, H.264, 80 Mbit/s, 30 fps:
runs smoothly

2K stereo video, 2048x2048 (top-bottom stereo), mp4, HEVC, 30 fps, profile=main, level=6.1, tier=main:
smooth at any bitrate (tested up to 100 Mbit/s)

4K stereo video, 4096x4096 (top-bottom stereo), mp4, HEVC, 30 fps, profile=main, level=6.1, tier=main:
0.5x playback speed at best (at any bitrate, from 0.2 Mbit/s to 100 Mbit/s!)

Interestingly, the video bitrate does not seem to affect playback performance at all in that last case. Could the bottleneck be memory bandwidth for moving the decoded frames around, rather than decoding performance?

Gmpreussner, I think you said in an older post that the implementation is (was?) not very efficient, in that it makes some technically unnecessary copying of the decoded frames. Do I remember correctly? Could that be related to this?

Test computer specs:
Intel i7-6700 @ 3.40 GHz (16 GB RAM)
GeForce GTX 980 (4 GB VRAM)
Windows 10

All videos are encoded using Adobe Media Encoder CC 2017.

There is nothing else happening in the game (film) while a video is playing.

Product Version: UE 4.15

asked Mar 28 '17 at 03:51 PM in Using UE4

hiili

hiili Apr 04 '17 at 02:43 PM

.....Bump?


1 answer

the implementation is (was?) not very efficient, in that it makes some technically unnecessary copying of the decoded frames [..] Could that be related to this?

Could be. 4K stereo puts quite a bit of strain on the CPU and GPU memory bandwidth.

4096 (width) x 4096 (height) x 0.5 (packed UV horizontally) x 4 (bytes per texel) x 30 (fps) = 1 GB / sec
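
To put the three test videos from the question side by side, here is a minimal sketch that applies the same formula to each of them. It assumes that all three take the same decode path with the same packed layout (0.5 x width, 4 bytes per texel), which the thread does not confirm for the H.264 file; the function names are made up for illustration.

```python
def frame_bytes(width, height, horizontal_packing=0.5, bytes_per_texel=4):
    """Size of one decoded frame in bytes, using the formula quoted above.

    Assumption: every format uses the packed layout described in the answer
    (half-width texture, 4 bytes per texel).
    """
    return int(width * height * horizontal_packing * bytes_per_texel)


def decoded_rate(width, height, fps):
    """Bytes per second of decoded video that each pipeline stage has to move."""
    return frame_bytes(width, height) * fps


# The three test cases from the question: (width, height, fps)
test_videos = {
    "4K mono   4096x2048": (4096, 2048, 30),
    "2K stereo 2048x2048": (2048, 2048, 30),
    "4K stereo 4096x4096": (4096, 4096, 30),
}

for name, (w, h, fps) in test_videos.items():
    rate = decoded_rate(w, h, fps)
    print(f"{name}: {frame_bytes(w, h) / 2**20:.0f} MiB/frame, {rate / 1e9:.2f} GB/s")
```

Under these assumptions, the 4K stereo stream moves twice as much decoded data per second as the 4K mono one and four times as much as the 2K stereo one.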

This is what needs to happen to get the video into a texture:

  • load data from disk into CPU memory (80 Mbit/sec)

  • decode into CPU memory frame buffer (1 GB / sec)

  • copy frame buffer into separate buffer (1 GB / sec) <- this is the extra copy I mentioned

  • copy frame buffer from separate buffer to GPU (1 GB / sec)

  • convert from YUV to RGBA on the GPU

The performance-critical parts here are the copies from and to CPU memory, which together amount to roughly 3 GB / sec. The extra copy is needed because the IMFSampleGrabberSinkCallback API that I'm currently using does not allow me to hold on to the decoded buffer. I'm planning to try some other approaches, such as a custom media sample sink, which might allow me to eliminate this extra copy. I'm also investigating asynchronous GPU texture uploads, which we added support for in the DirectX driver some time ago.
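
As a rough budget for the steps listed above, the following sketch adds up the traffic per stage. It follows the answer's own accounting (each stage counted once at the decoded rate), takes the 80 Mbit/sec figure for the compressed stream from the list, and uses hypothetical names throughout. It is a back-of-the-envelope estimate, not a measurement.

```python
DECODED_RATE = 4096 * 4096 * 0.5 * 4 * 30   # bytes/s of decoded 4K stereo, ~1 GB/s
COMPRESSED_RATE = 80e6 / 8                  # 80 Mbit/s compressed stream = 10 MB/s


def cpu_memory_traffic(decoded_rate, extra_copy=True):
    """Sum the decoded-frame traffic through CPU memory, one unit per stage."""
    stages = {
        "decode into frame buffer": decoded_rate,
        "upload buffer to GPU": decoded_rate,
    }
    if extra_copy:
        # The copy forced by the sample grabber callback, per the answer.
        stages["copy into separate buffer"] = decoded_rate
    return sum(stages.values())


with_copy = cpu_memory_traffic(DECODED_RATE, extra_copy=True)
without_copy = cpu_memory_traffic(DECODED_RATE, extra_copy=False)
compressed_share = COMPRESSED_RATE / DECODED_RATE

print(f"with extra copy:    {with_copy / 1e9:.2f} GB/s")
print(f"without extra copy: {without_copy / 1e9:.2f} GB/s")
print(f"compressed stream is {compressed_share:.1%} of one decoded stream")
```

By this accounting the compressed stream is about 1% of a single decoded stream, which would be consistent with the observation in the question that changing the bitrate makes no difference. Eliminating the extra copy would cut the CPU-side traffic by a third.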

Ultimately, going forward with 4K stereo and 8K video, we'll need a decoder that doesn't require any CPU copies at all. We currently have this capability with AvfMedia on macOS / iOS, and we're working with Google and Khronos on Android and Vulkan support, but it will still take some time. I'm currently working on Media Framework 3.0, but I'm not sure how many of these performance optimizations will get done for 4.17.


answered Apr 04 '17 at 04:07 PM

hiili Apr 06 '17 at 12:58 PM

Ok, thank you for the answer!

Lewkas59 Aug 27 '18 at 09:59 AM

Hello, I wonder where I can find more information about HEVC. Have these improvements made it into the engine?

Sorry if this is not the right place to ask this question, and thanks, Lewkas
