Failed to connect to Swarm

I just installed UE4 again, and I’m having trouble building the lighting. When I hit Build Lighting Only, I get the error “Failed to connect to Swarm.”

I found the log file for Swarm, and here is what it contains:


Starting up SwarmAgent ...
  ... registering SwarmAgent with remoting service
  ... registering SwarmAgent network channels
  ... initializing SwarmAgent
  ...... checking certificate
 ......... certificate check has failed
 ...... initializing cache
 ......... using cache folder 'C:/Users/Ryan/AppData/Local/UnrealEngine/4.7/Saved/Swarm\SwarmCache'
 ......... recreating SwarmAgent cache staging area
 ...... initializing connection to SwarmCoordinator
 ......... using SwarmCoordinator on RENDER-01
[PingCoordinator] Determined that we couldn't ping the coordinator
[PingCoordinator] Coordinator ping failed
 ......... SwarmCoordinator failed to be initialized
 ...... initializing local performance monitoring subsystem

I’ve been doing some googling and saw a lot of recommendations to turn off the firewall and antivirus, add exceptions, etc. I tried completely turning off both Windows Firewall and Avast, and I still get the same error and log. I also saw some suggestions to delete the SwarmCache folder in AppData, but that did nothing either.

Still having this issue, and I’ve been trying some other stuff. Apparently you are supposed to be able to start SwarmAgent.exe and get a GUI, but when I try, it just crashes without ever showing one.

Hi FracOMac,

In the folder X:\Program Files\Unreal Engine\[Engine Version #]\Engine\Binaries\DotNET\ (for example, C:\Program Files\Unreal Engine\4.7\Engine\Binaries\DotNET)

you’ll find SwarmAgent.exe and SwarmCoordinator.exe. These two programs will be needed to get a build farm going.

You should be able to open both of these programs. If they are crashing like you said, you may need to reinstall the engine or use the launcher > Engine Version > Drop Down > Verify.

Once these are up, you can change the settings in Swarm Agent > Settings tab so that other computers can connect. Once that has been set up, Coordinator needs to be opened on the machine that will act as the host. This will allow you to see the other computers that are now connected to use Swarm together. This is not an automatic process and may require some tweaking to get connected. It’s imperative that Coordinator always be open when using Swarm Agent across multiple systems.

Let me know.

I did a verify and after that ran there is no change. I can launch SwarmCoordinator fine but when I launch SwarmAgent it shows up briefly as a taskbar icon and then vanishes. Is there any way I can get a more detailed log message/output?

In Swarm Agent you can go to the tab labeled Settings > Log Settings > Verbosity and change it to the setting you would like. The default is “Informative”, and there are four more levels above that.

Often times with Swarm Agent it doesn’t stay in the normal view of the taskbar, but instead in the extended area of the taskbar.

33321-swarmtaskbar.png

Are you able to bring it up this way? It seems odd that you’re able to get to the logs, but are getting the crash on the actual SwarmAgent.exe program?

Unfortunately, since SwarmAgent won’t even start, I can’t change the settings through that. Is there anywhere that documents the config file so I could edit it to turn on the higher log levels?

It’s not in the extended taskbar; it shows up for a moment (like in the picture you show) and then vanishes. A few times I’ve gotten a Windows crash prompt with “SwarmAgent has stopped working”.

As for where I’ve gotten the logs I’ve just been looking at the actual .log file it generates after it tries to run (in the SwarmCache\Logs folder)

Hi FracOMac,

do you have SwarmAgent.Options.xml in the SwarmAgent.exe folder?

If so, then you could increase the log level by editing that file. Please set <Verbosity>SuperVerbose</Verbosity>.

Let me know if you find out something more. I’ll try to figure out something on our side.

Thanks,

Jarek
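
For anyone who would rather script the change than edit the file by hand, here is a minimal Python sketch. It is not part of Swarm: the helper name set_option is hypothetical, and the sample XML below is a made-up stand-in, since the real layout of SwarmAgent.Options.xml may differ. It simply sets the text of the first element with a given tag. Back up the real file first, because ElementTree may rewrite its formatting.

```python
import xml.etree.ElementTree as ET

def set_option(xml_text, tag, value):
    """Return xml_text with the first <tag> element's text set to value.

    Hypothetical helper: it assumes the option is a plain element
    somewhere in the document, which may not match the real file.
    """
    root = ET.fromstring(xml_text)
    # The option may be the root itself or nested at any depth.
    node = root if root.tag == tag else root.find(f".//{tag}")
    if node is None:
        raise KeyError(f"no <{tag}> element found")
    node.text = value
    return ET.tostring(root, encoding="unicode")

# Made-up sample; the wrapper element name is an assumption.
sample = "<Options><Verbosity>Informative</Verbosity></Options>"
updated = set_option(sample, "Verbosity", "SuperVerbose")
print(updated)
```

To apply it to the real file, read the file contents into a string, pass them through set_option, and write the result back while SwarmAgent is not running.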

Ok, got log with a few extra bits!


Starting up SwarmAgent ...
 ... registering SwarmAgent with remoting service
 ... registering SwarmAgent network channels
 ... initializing SwarmAgent
 ...... checking certificate
 ......... certificate check has failed
 ...... initializing cache
 ......... using cache folder 'F:\Program Files\Unreal Engine\4.7\Engine\Binaries\DotNET\SwarmCache'
 ......... recreating SwarmAgent cache staging area
 ...... initializing connection to SwarmCoordinator
 ......... using SwarmCoordinator on RENDER-01
[PingRemoteHost] Failed to ping RENDER-01 with RENDER-01
[PingRemoteHost] Exception details: An exception occurred during a Ping request.
[PingCoordinator] Determined that we couldn't ping the coordinator
[PingCoordinator] Coordinator ping failed
 ......... SwarmCoordinator failed to be initialized
 ...... initializing local performance monitoring subsystem

Hmm, nothing of much help here.

Could you open up Engine\Source\Programs\UnrealSwarm\UnrealSwarm.sln solution in Visual Studio, build the agent in debug mode and try running it from there?

With a little bit of luck, VS should break in the place it’s crashing and we can investigate further.

Thanks,

Jarek

Not sure why I didn’t think of doing that in the first place! I’m just not used to having the source of an awesome program like UE4 :)

So the inner exception is “No such host is known”, which sounds like an issue with whatever RENDER-01 is supposed to be. Digging around a little, I realized that it should be the machine that SwarmCoordinator is running on, which in my case is my own.

And so, I added


 <LocalCoordinatorRemotingHost>localhost</LocalCoordinatorRemotingHost>

to my SwarmAgent.Options.xml and everything is working now! Thanks for the help guys!
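
The “No such host is known” exception is a name-resolution failure, so a quick way to confirm it outside of Swarm is to ask the OS resolver directly. A minimal Python sketch, assuming Python is available on the agent machine (the helper name can_resolve is hypothetical and has nothing to do with Swarm itself):

```python
import socket

def can_resolve(hostname):
    """Return True if the OS resolver can map hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False

print("localhost resolves:", can_resolve("localhost"))
```

Calling can_resolve("RENDER-01") on a machine where no host of that name exists should return False, which matches the ping failure in the log above.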

Actually, I was wrong, it’s not entirely working. I can start SwarmAgent now and it seems mostly happy, but when I start it from the UE4 editor I get the following log message (beginning removed due to length limits):


 [PingRemoteHost] Successfully pinged localhost with localhost
 [Ping] Communication with the coordinator failed, job distribution will be disabled until the connection is established
 Exception details: System.Net.Sockets.SocketException (0x80004005): No connection could be made because the target machine actively refused it 127.0.0.1:8009
 Server stack trace: 
   at System.Net.Sockets.Socket.Connect(IPAddress[] addresses, Int32 port)
   at System.Runtime.Remoting.Channels.RemoteConnection.CreateNewSocket(AddressFamily family)
   at System.Runtime.Remoting.Channels.RemoteConnection.CreateNewSocket()
   at System.Runtime.Remoting.Channels.SocketCache.GetSocket(String machinePortAndSid, Boolean openNew)
   at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.SendRequestWithRetry(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream)
   at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.ProcessMessage(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)
   at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)
 Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at SwarmCoordinatorInterface.ISwarmCoordinator.Ping(AgentInfo UpdatedInfo)
   at Agent.Agent.PingCoordinator(Boolean ForcePing)
 ......... SwarmCoordinator failed to be initialized
 ...... initializing local performance monitoring subsystem

Sorry for the long reply chain, but perhaps I’m not having an issue with Swarm Agent but actually with the coordinator? It seems like the coordinator isn’t being started properly by the UE4 editor. I can start the agent and the coordinator now outside of the editor and they seem to be able to talk.

Hi FracOMac,
it looks like something on your machine is blocking port 8009.

Are you sure you’ve disabled all firewalls, etc.?

Maybe you have some other process running that is using this port? For example, Apache Tomcat uses it by default.

Jarek
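
The “actively refused” error in the log generally means nothing accepted the TCP connection on 127.0.0.1:8009 at that moment. One way to check this independently of Swarm is a small Python sketch like the one below (the helper name port_is_listening is hypothetical; the host and port are taken from the log above):

```python
import socket

def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

print("127.0.0.1:8009 accepting connections:",
      port_is_listening("127.0.0.1", 8009))
```

If this prints False while SwarmCoordinator is supposed to be running, the coordinator is probably not listening on that port. If it prints True while the coordinator is closed, some other program is occupying the port.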

Check which version of the .NET Framework is installed.

My problem was resolved after installing .NET Framework 4.5.

I have the exact same issue.

In my case, the file Engine/Binaries/Win64/AgentInterface.dll was missing. Re-running setup.bat and generateprojectfiles.bat solved the problem.