Is Swarm Agent disconnecting a computer during the build process?

Swarm Agent seems to be disconnecting or running into some sort of error during the build process… The first comp does the build while the 2nd comp runs the coordinator… somewhere during the build process the 2nd comp doesn’t seem to make it to the Processing Mappings stage. Here is a screenshot, followed by the attached Swarm log.

According to the log there are error messages… and back when I used to use Swarm Agent, the 2nd computer would also go through the “BLUE” (Processing Mappings) bar… it seems like it’s not doing that at all… like it’s losing that computer… in fact, that screenshot is the end of the 2nd computer, as in it does not proceed… that is start to end… the first computer goes on for miles, so to speak.

In the log I get this happening roughly as often as I see no blue bars… It’s a lot of…

“10:49:45 PM: [PushChannel] Pushing the channel …/DotNET/AutoReporter.exe has failed!”

“10:49:45 PM: [PushChannel] Pushing the channel …/DotNET/AutoReporter.XmlSerializers.dll has failed!”

Another thing: when Swarm used to work properly, the original computer that was running Unreal was also the coordinator… now it’s reversed. I reversed it because, I have no idea why, but the 2 comps couldn’t see each other in Swarm Agent anymore… it was weird… so after playing around with it I reversed it, so the other one is the Swarm coordinator and the original still has Unreal but no Swarm coordinator… So yeah… I have no idea why all these issues are happening right now… at this point, as I said, they see each other and start the build together, but no blue bar… and lots of errors now.
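
To compare how often the [PushChannel] failures quoted above occur against the missing blue bars, a small script can tally them from the log. This is only a rough sketch: the pattern is modelled on the two quoted lines, other Swarm log lines may be formatted differently, and `count_push_failures` is a made-up helper name, not part of Swarm.

```python
import re
from collections import Counter

# Assumption: failures look like the two lines quoted in the post, i.e.
# "[PushChannel] Pushing the channel <channel> has failed!"
PUSH_FAILED = re.compile(
    r"\[PushChannel\] Pushing the channel (?P<channel>.+?) has failed!"
)


def count_push_failures(lines):
    """Count failed channel pushes per channel name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PUSH_FAILED.search(line)
        if match:
            counts[match.group("channel")] += 1
    return counts
```

Passing it an open log file (for example `count_push_failures(open(path, encoding="utf-8"))`, where `path` is wherever the agent log was saved) gives one count per channel, which makes it easier to see whether the same few files fail every time.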

Hello Kurylo3D,

Whenever you allocate a task to another computer, that computer completes the task you assign it and then sends the results back to your base computer. All of the information on the base computer is then compiled to give you a finished build.

I believe what you are seeing is intended behavior, because your build reaches 100%. If you saw an error message or the build did not complete, then I would be inclined to think there was an issue somewhere.

I’m not quite sure either. I’m investigating this further. It sounds as if the communication between the two computers and swarm is somehow being interrupted. I’ll get back to you.

Thanks, a little more info: one comp is Windows 8.1, and the other is Windows 7.

After some research into known issues on this, I believe the issue you are having is compatibility between the different versions of Windows. Do you happen to have another computer with either Windows 8.1 or 7 on it to test?

I mean no offense, but this thing was working for months on these very 2 computers with no problems at all. In fact it was working on 3 computers… the third one, however, was pointless since it’s old as dirt and really wasn’t helping. The third one had Windows Vista, the 2nd computer had Windows 7, and the first, which was the coordinator, was running Unreal and had Windows 8.1. So it was working just fine for a long time.

The problem started around 4.9. To answer your question, I should be getting a 4th computer, a Windows 8.1 Alienware laptop, maybe at the end of this month, but I need this working with all my computers again… not going to chalk it up to the OS… since it was working just fine with no issues.

I will say that after the problem first appeared, as I said before, I made the Windows 7 machine the coordinator instead, since that’s the only way they would see each other in the agent. The Windows 8.1 machine still runs Unreal and launches the build. When I first switched it they didn’t see each other either, but I messed with homegroups and stuff and then they saw each other, but obviously it’s still broken. So I do not know what to tell you.

None taken. We are here for support and this is part of narrowing down what is causing issues. I did not realize that you had this working previously in another version of the engine.

I will continue to investigate this problem. If this project is near completion, I do recommend keeping it in 4.8. The reason is that major features such as AA have been changed in 4.9. I’m not saying this is related to your issue. However, with all the new framework implemented in 4.9, there are bound to be issues that need to be addressed before proceeding with development.

I am going to pool more knowledge, and machines :), on this issue and see if I am able to reproduce what you are seeing.

Thanks. I honestly don’t know exactly if it was 4.9 or one of the updates to 4.9… Not quite sure… when I get the laptop I’ll continue testing if we don’t find a solution. For now I’m just leaving it to build overnight every night…

We are currently in the process of pooling more resources and computers in order to test if we are able to reproduce this on our end. Thank you for your patience on this.

I have been unable to track down what could be causing this issue. I have successfully synced and built to a few computers. Are there any updates with your builds?

Nope. I don’t plan on trying to fix this until I get my new laptop to add to the mix. My only assumption is that before, I was not using homegroups to connect my computers (as far as I knew)… and now I sort of am, with one created by my Windows 8.1 machine. Though the only reason I did this was because Swarm couldn’t see the other computer… and once I did the homegroup thing it could see the other computer, but the bakes obviously fail.

What was funny was that the Windows computers still both saw each other and could be navigated between… but Swarm didn’t see them.

Ok. What I will do is mark this thread as answered for tracking purposes. As soon as you can test this, please post again on this thread. I will receive a notification when you do, and at that time I will continue to work this issue with you.

Alright, I now have 3 computers testing and the same problem. Originally I thought my problem was that my computer running Unreal was running an old version of Swarm Agent from 4.7… which was true. But now it is not… every machine is using Swarm 4.9, with 4.9 on the building machine as well… The problem still exists… I will try to go into as much detail as possible.

Computer “Revy”: The computer running Unreal Editor and starting the build. Core i7 4 GHz with 16 GB RAM, GTX 980. Running Windows 8.1. It’s the computer that first created the homegroup. Swarm Agent for 4.9.

Computer “Motoko”: The computer running the Swarm coordinator. Core i7 2.7 GHz, 6 GB RAM, NVIDIA GTX 780. Running Windows 7. It is also running a Swarm Agent. Both coordinator and agent are 4.9.

Computer “Rock”: New laptop. Core i7 2.5 GHz with 24 GB RAM and GTX 980M. Running Windows 8.1. This computer is only running a Swarm Agent, 4.9. All of its power settings are disabled so it never powers off, since it’s constantly plugged in… the only thing that ever happens is the screen powers off.

The network itself is set up with the “Revy” machine creating the homegroup and the other 2 connecting to the homegroup with the homegroup password. All computers see each other and are able to navigate each other on the network via clicking on them in the Network window and clicking on the homegroup. But here’s a problem with this new laptop… First, it can’t navigate REVY without using the homegroup… for instance I can’t just go to Network → REVY… I have to literally click on the homegroup user REVY. On top of that… actually, that’s not entirely true… I can navigate REVY via ROCK’s Network, but only after REVY connects to ROCK via Network… makes no sense… after it does that it connects.

MOTOKO doesn’t see ITSELF on the homegroup at all… ROCK and REVY see all three computers on the homegroup… MOTOKO sees both computers on the network and can navigate them via Network → ROCK… So every computer has a different connectivity/networking issue with the others… on top of the following problems…

Now I will tell you they all connect to the coordinator fine eventually, though the coordinator behaves stupidly sometimes. For one, I spent 30 minutes last night trying to get each computer seen in the coordinator, running, and available. I am assuming this is the cause since it happened 3x: whenever my “Rock” laptop goes to screensaver mode, the coordinator sees it as “Dead”, unassigned… So basically I need to kick off the build process before the screensaver turns on or something… Again, even when it’s dead and unassigned, I could still navigate the network to each other’s computer on both computers no problem… Restarting agents did nothing, it would just go back to dead, unassigned… and pinging the coordinator from “Rock” didn’t help either… In fact, the pinging of the coordinator would fail according to Rock’s log… like it wouldn’t be able to connect.
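
Since Windows file browsing works while the agent’s ping of the coordinator fails, one way to narrow it down might be a plain TCP reachability check from “Rock” to “Motoko”. This is only a rough sketch, not anything Swarm itself provides: `can_reach` is a made-up helper, and the port has to be supplied by whoever runs it (I’m not asserting which port the coordinator listens on).

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running something like `can_reach("MOTOKO", port)` once while the laptop is awake and once after the screensaver has kicked in would show whether the connection itself drops, or whether only Swarm loses track of the machine.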

But here was the other issue… the Swarm coordinator itself was acting stupid… for example, when I would restart all Swarm agents via the coordinator, the Swarm agent for MOTOKO would stay on “restarting” and never go to available, or busy for that matter… all computers took longer than usual to restart agents, but this one in particular never came back… I had to restart the ■■■■ computer…

Anyway, eventually, after I had all 3 computers available and unassigned, I was finally able to hit the build button. As you can see from the screenshots and the logs… same problem. Here are the logs for all computers and the screenshots in order… as well as a screenshot of the coordinator.

Again, just to reiterate… I had it working with the original 2 and a different laptop during 4.7… Once one of the 4.9 versions came out, it stopped working… and that’s when I switched to using a homegroup for my network so the coordinator could see the computers… Obviously there are still some issues…
