Closed
Bug 555988
Opened 14 years ago
Closed 14 years ago
re-imaged try slaves (mac, linux) aren't working properly with puppet
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: lsblakk, Assigned: jabba)
References
Details
(Whiteboard: [waiting on releng])
moz2-linux-slave51 has no buildbot installation on it, neither does the mac slave.

from /var/log/system.log on moz2-darwin9-slave68:

Mar 30 08:08:21 moz2-darwin9-slave68 com.apple.launchd[86] (buildbot-tac.firstrun.com): Ignored this key: UserName
Mar 30 08:08:21 moz2-darwin9-slave68 com.apple.launchd[86] (buildbot.start.slave): Ignored this key: UserName
Mar 30 08:08:21 moz2-darwin9-slave68 ARDAgent [96]: ********ARDAgent Launched********
Mar 30 08:08:21 moz2-darwin9-slave68 ARDAgent [96]: ********ARDAgent Ready********
Mar 30 08:08:22 moz2-darwin9-slave68 buildbot-tac.firstrun.com[91]: /builds/slave/buildbot.tac already exists, not doing anything
Mar 30 08:08:22 moz2-darwin9-slave68 buildbot-tac.firstrun.com[91]: /Users/cltbld/.buildbot.tac.control says not to run, not doing anything
Mar 30 08:08:22 moz2-darwin9-slave68 com.apple.launchd[86] (buildbot.start.slave[92]): posix_spawnp("/tools/buildbot/bin/buildbot", ...): No such file or directory
Mar 30 08:08:22 moz2-darwin9-slave68 com.apple.launchd[86] (buildbot.start.slave[92]): Exited with exit code: 1
Mar 30 08:08:23 moz2-darwin9-slave68 /System/Library/CoreServices/coreservicesd[59]: SFLSharePointsEntry::CreateDSRecord: dsCreateRecordAndOpen(Administrator's Public Folder) returned -14135
Mar 30 08:08:23 moz2-darwin9-slave68 /System/Library/CoreServices/coreservicesd[59]: SFLSharePointsEntry::CreateDSRecord: dsCreateRecordAndOpen(cltbld's Public Folder) returned -14135
Mar 30 08:08:42 moz2-darwin9-slave68 org.nagios.nrpe[128]: launchproxy[128]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 56378
Mar 30 08:09:07 moz2-darwin9-slave68 org.nagios.nrpe[128]: launchproxy[128]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 56556
Mar 30 08:09:18 moz2-darwin9-slave68 kernel[0]: AppleYukon2: 00000000,00000001 sk98osx sky2 - - sk98osx_sky2::replaceOrCopyPacket tried N times
Mar 30 08:09:29 moz2-darwin9-slave68 org.nagios.nrpe[128]: launchproxy[128]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 56693
Mar 30 08:10:55 moz2-darwin9-slave68 org.nagios.nrpe[223]: launchproxy[223]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 57193
Mar 30 08:14:08 moz2-darwin9-slave68 org.nagios.nrpe[235]: launchproxy[235]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 46060
Mar 30 08:14:32 moz2-darwin9-slave68 org.nagios.nrpe[235]: launchproxy[235]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 46204
Mar 30 08:15:04 moz2-darwin9-slave68 login[246]: USER_PROCESS: 246 ttys000
Mar 30 08:15:54 moz2-darwin9-slave68 org.nagios.nrpe[258]: launchproxy[258]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 46658
Mar 30 08:18:21 moz2-darwin9-slave68 com.apple.launchd[86] (buildbot.start.slave[272]): posix_spawnp("/tools/buildbot/bin/buildbot", ...): No such file or directory
Mar 30 08:18:21 moz2-darwin9-slave68 com.apple.launchd[86] (buildbot.start.slave[272]): Exited with exit code: 1
Mar 30 08:18:44 moz2-darwin9-slave68 org.nagios.nrpe[276]: launchproxy[276]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 49158
Mar 30 08:19:07 moz2-darwin9-slave68 org.nagios.nrpe[276]: launchproxy[276]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 49345
Mar 30 08:19:32 moz2-darwin9-slave68 org.nagios.nrpe[276]: launchproxy[276]: /usr/local/nagios/sbin/nrpe: Connection from: 10.2.71.20 on port: 49493
Mar 30 08:20:09 moz2-darwin9-slave68 sshd[290]: USER_PROCESS: 294 ttys001
Reporter
Comment 1•14 years ago
log excerpt from moz2-linux-slave51 /var/log/messages:

Mar 30 08:19:09 moz2-linux-slave51 puppetd[2514]: (//Node[moz2-linux-slave51.build.mozilla.org]/staging-buildslave/buildbot/File[/etc/default/buildbot]) Failed to retrieve current state of resource: No specified source was found from /N/centos5/etc/default/buildbot
Mar 30 08:19:09 moz2-linux-slave51 puppetd[2514]: (//Node[moz2-linux-slave51.build.mozilla.org]/staging-buildslave/buildbot/Service[buildbot]) Dependency file[/etc/init.d/buildbot] has 1 failures
Mar 30 08:19:09 moz2-linux-slave51 puppetd[2514]: (//Node[moz2-linux-slave51.build.mozilla.org]/staging-buildslave/buildbot/Service[buildbot]) Dependency file[/etc/init.d/buildbot-tac] has 1 failures
Mar 30 08:19:09 moz2-linux-slave51 puppetd[2514]: (//Node[moz2-linux-slave51.build.mozilla.org]/staging-buildslave/buildbot/Service[buildbot]) Dependency file[/etc/default/buildbot] has 1 failures
Mar 30 08:19:09 moz2-linux-slave51 puppetd[2514]: (//Node[moz2-linux-slave51.build.mozilla.org]/staging-buildslave/buildbot/Service[buildbot]) Skipping because of failed dependencies
Comment 3•14 years ago
The problem on the mac slave is that it can't mount /N to get at the files to deploy.
Comment 4•14 years ago
Should this be an IT bug, as previous issues with mounting /N were traced to firewall issues?
Comment 5•14 years ago
Can somebody please verify that the firewall is set up correctly for these slaves?
Assignee: nobody → server-ops
Component: Release Engineering → Server Operations
QA Contact: release → mrz
Assignee
Comment 6•14 years ago
Derek, can you check the firewall to see if these hosts have the access they need?
Assignee: server-ops → dmoore
Reporter
Comment 7•14 years ago
Hey there - any news on if these hosts are connecting properly through the firewalls?
Comment 8•14 years ago
Were these slaves moved out of the sandbox network? (Used to be sm-try slaves)
Reporter
Comment 9•14 years ago
It's been a couple of weeks - still can't use these slaves since they are not synching properly with puppet. These slaves were originally in the sandbox network and should not be in build.mozilla.org - please confirm that they are out of the sandbox and check any firewall settings for them to be able to mount /N for puppet access.
Severity: normal → major
Reporter
Comment 10•14 years ago
Correction:
> These slaves were originally in the sandbox network and should NOW be in
> build.mozilla.org
Comment 11•14 years ago
A couple of weeks ago I had issues with darwin slaves that were not able to reach the mounted shared drive. IT, please see this comment from bug 555790 to check whether the same thing is happening here:
(In reply to comment #6)
> Fixed by updating static routes on the NFS server to reflect the new Castro
> build netmask.
Updated•14 years ago
Assignee: dmoore → server-ops
Updated•14 years ago
Assignee: server-ops → dmoore
Comment 12•14 years ago
(In reply to comment #9)
> It's been a couple of weeks - still can't use these slaves since they are not
> synching properly with puppet.
>
> These slaves were originally in the sandbox network and should not be in
> build.mozilla.org - please confirm that they are out of the sandbox and check
> any firewall settings for them to be able to mount /N for puppet access.

Which hosts are you talking about? You can actually figure it out on your own - anything in 10.2.76.0/24 is "sandbox". Anything in 10.2.71.0/24 or 10.2.90.0/23 is the colo build network. Anything in 10.250.48.0/22 is the castro build network.

"build.mozilla.org" is ambiguous since hosts under that domain could live in one of three different networks.
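
For anyone who wants to check a host against the ranges listed in comment 12, here is a minimal sketch using Python's standard ipaddress module. The function name classify and the NETWORKS table are made up for illustration; only the CIDR ranges and their informal labels come from the comment. 10.2.71.20 is the address that shows up in the nrpe lines of the mac slave log; 10.2.76.15 is a hypothetical address chosen only to exercise the sandbox range.

```python
import ipaddress

# CIDR ranges as quoted in comment 12. The labels are the informal
# names used there; this table is illustrative, not an official map.
NETWORKS = {
    "sandbox": ["10.2.76.0/24"],
    "colo build": ["10.2.71.0/24", "10.2.90.0/23"],
    "castro build": ["10.250.48.0/22"],
}


def classify(address: str) -> str:
    """Return the label of the network containing address, or "unknown"."""
    ip = ipaddress.ip_address(address)
    for name, cidrs in NETWORKS.items():
        for cidr in cidrs:
            if ip in ipaddress.ip_network(cidr):
                return name
    return "unknown"


if __name__ == "__main__":
    # 10.2.71.20 appears in the nrpe log lines; 10.2.76.15 is hypothetical.
    for addr in ("10.2.71.20", "10.2.76.15"):
        print(addr, "->", classify(addr))
```

Resolving a slave's hostname to its IP first (for example with socket.gethostbyname) and passing the result to classify would show which of the three networks a given build.mozilla.org host actually lives in.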
Updated•14 years ago
Assignee: dmoore → jdow
Updated•14 years ago
Whiteboard: [waiting on releng]
Comment 13•14 years ago
No response in a while, closing.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•9 years ago
Product: mozilla.org → mozilla.org Graveyard