Upgrade from 4.15 -> 4.17: Relaunching packaged exe hangs

Hello,

Recently, our team upgraded from UE4 version 4.15 to version 4.17. We have a build process that packages and distributes our prototype game using the Win64 Development configuration. Our prototype is built off of ShooterGame.

After upgrading to 4.17, we noticed that relaunching the packaged executable after closing it causes the Unreal process to hang during async loading. Neither the loading screen nor any other window belonging to the process appears. Task Manager shows a ShooterGame.exe process with about 250 megabytes of allocated memory. As the game attempts to load, its CPU usage slowly declines to 0% and remains there indefinitely.

Some observations:

  • This does not occur on all of our machines and is not exclusive to a particular Windows version. We have seen it on Windows 7, 8, and 10
  • This does not occur in PIE, only in packaged builds
  • This issue only occurs upon launching the executable, closing it, and attempting to reopen it
  • Sometimes, after about an hour, the game will in fact load (obviously we cannot be waiting an hour to playtest)
  • We were able to repro this with a Debug configuration build and saw that it hangs during async loading, before the checksum is performed on loaded files
  • Disabling firewall and antivirus had no impact
  • We never experienced this issue on 4.15
  • There is nothing that stands out in the logs

Is anyone else experiencing this issue? Any advice on where to proceed from here would be helpful.

Thanks

(I keep failing to post this as a comment for some reason. Sorry for the bad “Answer”)

Some of my users are having this issue… some of the time. One guy was having the issue and it magically fixed itself one day, and another just randomly started getting it. Restarts don’t help, verifying files doesn’t help, reinstalling doesn’t help. Nothing about the affected computers points to any hardware issue. It’s proven nearly impossible to track down as I haven’t been able to recreate it on any of my 3 computers. I pushed out a dev build earlier with VeryVerbose logging turned on like so:

[Core.Log]
; LogVoice=VeryVerbose
; LogVoiceDecode=VeryVerbose
; LogVoiceEncode=VeryVerbose
; LogAudio=VeryVerbose
LogGameMode=VeryVerbose
LogPakFile=Verbose
LogOnline=VeryVerbose
LogNetVersion=VeryVerbose
LogContentStreaming=VeryVerbose
LogDerivedDataCache=VeryVerbose
LogModuleManager=VeryVerbose
LogWindows=VeryVerbose
LogConfig=VeryVerbose
LogStaticMesh=VeryVerbose
LogCheckComponents=VeryVerbose
LogClass=VeryVerbose
LogBlueprint=VeryVerbose
; Global=VeryVerbose

For some reason setting Global to VeryVerbose caused packaging to fail every time, but that’s a separate issue entirely.

I got somebody to try launching 6 times today and every log ends exactly the same way:

[2017.10.12-17.24.13:690][  0]LogPakFile: Verbose: FPakPlatformFile::OpenAsyncRead[0000000015262035, 0000000015263309) ../../../ProxyWarVR/Content/HUD/FX_DamageIndicator.uexp
[2017.10.12-17.24.13:690][  0]LogPakFile: Verbose: FPakReadRequest[00000000152620F6, 000000001526243D) Notify complete
[2017.10.12-17.24.13:690][  0]LogPakFile: Verbose: FPakReadRequest[00000000152620F6, 000000001526243D) QueueRequest HOT

And then it just stays like that, with a very HOT QueueRequest. It never tries to load another asset. What is special about FX_DamageIndicator.uexp? No idea. It’s just a cooked particle system. This project started with ShooterGame 4.12 and is now on 4.17.2. FX_DamageIndicator may have been created in 4.15, but I am not sure.

Also posted this on UDN
https://udn.unrealengine.com/questions/396236/upgrade-from-415-417-relaunching-packaged-exe-hang.html

Be sure to pass on any information and let us know if a workaround is found over there… Some of us don’t have access to UDN

Received a reply this morning. Part of the response is copied below:

“As for the issue itself, have you attempted disabling Event Driven Loading? This is sometimes related to Async loading issues. It was not enabled automatically in 4.15 but has been enabled by default in projects as of 4.16.”
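For anyone following along, here is a minimal sketch of what disabling it can look like in config. It assumes the toggle is stored in DefaultEngine.ini under the streaming settings section with the key shown below. The same option should also be exposed in Project Settings under Engine > Streaming, so check the name against your engine version before relying on it.

```ini
; Config/DefaultEngine.ini
; Assumption: section and key name as used by the 4.16+ streaming settings.
[/Script/Engine.StreamingSettings]
; False falls back to the pre-4.16 loading path
s.EventDrivenLoaderEnabled=False
```

The project has to be repackaged after changing this, and it is worth testing the packaged build thoroughly, since the change affects how every asset is loaded.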

I did attempt it… however it caused a crash because I’m using StaticLoadObject to load an image at some point and I gave up. I can try just commenting out that line and see what happens I guess.

Stay tuned.

Ran into a similar issue where our packaged exe would crash with the Event Driven Loader turned off.

Managed to get a package running without Event Driven Loading and it did seem to fix the issue, at least for the 1 person that got back to me. However, it also seems to have prevented TextRenders from loading the font (Roboto). Every digit renders as a square. UMG widgets seem to have loaded fonts just fine.

What a mess

I’ve worked around the font issue and documented it over here for anyone following in my footsteps.

Paul.Rention, has there been any progress on this one? I’m not seeing anything obvious in the 4.18 release notes.

Not yet. They want us to try upgrading to a later version of 4.17 or just 4.18 and see if the issue still persists. I’ll keep you posted.

Thanks. For what it’s worth I had the issue on 4.17.2. I’ll probably make it to 4.18 but it may take a few weeks, let me know if you try it

Do you have the setting checked for generating/using Pak files when packaging?

I’ve been taking a look at this issue today and found that our build machine wasn’t packaging our assets. When pak files are turned on, I wasn’t able to reproduce the problem. This could point towards a load order issue or a race condition that only happens when multiple files are opened during async load. It’s also possible that I just moved the race condition elsewhere. I’ll update again if we continue to see the problem.

If you are referring to “use pak file”, it is checked and has never been unchecked

I’ve always had this problem with pak files on. Do you have blueprint nativization on by chance? I was having this strange issue after disabling event driven loading, and the fix I discovered was to turn blueprint nativization on. This seems to be a trend: either event driven loading or blueprint nativization must be enabled to avoid certain issues. Perhaps the 3 settings interact in some crazy race condition?

We do not have nativization turned on. All of the reproduction uncertainty leads me to believe that the race condition still exists in some low-level piece of code. Perhaps the lockless list code? I haven’t had an opportunity to comb through it yet to see if I can find anything, so that’s really pure speculation on my part.