Memcpy perf drop in game mode

I’m building a very simple game from a sample. I stream a video onto a large video texture (4K, 3840x2160 pixels). In Editor mode it runs smoothly at 60 fps. I copy each frame with a lock-Memcpy-unlock sequence, and the whole operation takes around 12 ms, which is fine for me. The breakdown is around 2 ms for the lock, 4 ms for the copy and 7 ms for the unlock.
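A minimal standalone sketch of how the copy step alone could be timed outside the engine, to compare build configurations. The 4 bytes per pixel and the helper names `FrameSizeBytes` and `TimeFrameCopyMs` are assumptions for illustration; they are not taken from the project or from any engine API.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: size in bytes of one tightly packed frame.
// BytesPerPixel = 4 is an assumption (e.g. BGRA8); the post does not state the format.
std::size_t FrameSizeBytes(std::size_t Width, std::size_t Height, std::size_t BytesPerPixel)
{
    return Width * Height * BytesPerPixel;
}

// Hypothetical helper: copies Src into Dst with a single memcpy and
// returns the elapsed wall-clock time in milliseconds.
double TimeFrameCopyMs(const std::vector<std::uint8_t>& Src, std::vector<std::uint8_t>& Dst)
{
    assert(Dst.size() >= Src.size()); // destination must be large enough

    const auto Start = std::chrono::steady_clock::now();
    std::memcpy(Dst.data(), Src.data(), Src.size());
    const auto End = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::milli>(End - Start).count();
}
```

Calling `TimeFrameCopyMs` several times on two buffers of `FrameSizeBytes(3840, 2160, 4)` bytes, in both build configurations, would show whether the slowdown comes from the copy itself or from the memory returned by the lock. Plain heap buffers stand in for the locked texture here, so the absolute numbers will not match the ones above.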

When I package the game for Win64, performance drops. The whole operation takes over 20 ms, and most of the increase comes from the Memcpy, which goes from 4 ms to 11 ms.

Are there some packaging options I’m missing? Any ideas where I should look?

Thanks,
Bryan

Do you have any source code we can look at?