Closed Bug 629692 Opened 13 years ago Closed 13 years ago

slave-side slave-alloc support for w7/32

Categories

(Release Engineering :: General, defect, P3)

x86
Windows 7
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

(Whiteboard: [slavealloc])

Attachments

(1 file)

+++ This bug was initially created as a clone of Bug #616351 +++

Similar to bug 616003, but for w7 systems.  These systems are not managed by any config management system, so they'll need to be touched by hand.
Depends on: 629694
OS: All → Windows 7
Priority: -- → P3
Hardware: All → x86
Whiteboard: [slavealloc]
Depends on: 632103
Depends on: 622980
I'm experimenting with using the same deployment script as used in bug 651264 to deploy this.
That didn't work, but it's a start..
So the problem here is that VirtualStore makes writes to C:\runslave.py succeed, but hides the file under the user's home directory, so it looks like things work, but the file isn't actually where it was placed.

The way around this is to run the batch script as an administrator.

The administrator account is disabled, though, so runas /user:administrator doesn't work.  The secret sauce is to open the start menu, right-click on 'command prompt', and select "Run As Administrator", then do the tinyurl wget and run the resulting batch script.

I'll make sure this works on the staging machines, then post the full instructions here.
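For anyone checking whether a write to C:\ was silently redirected, here is a minimal Python sketch. It assumes the usual per-user VirtualStore location under %LOCALAPPDATA%; the helper names and the cltbld profile path in the usage note are illustrative, not something taken from the deployment script.

```python
import ntpath
import os


def virtualstore_path(target, local_appdata):
    """Return where a virtualized write to `target` would land.

    Assumes the per-user store lives at <local_appdata>\\VirtualStore and
    mirrors the original path minus its drive letter.
    """
    _drive, rest = ntpath.splitdrive(target)
    return ntpath.join(local_appdata, "VirtualStore", rest.lstrip("\\/"))


def was_redirected(target, local_appdata, exists=os.path.exists):
    """True if a copy of `target` is sitting in the VirtualStore.

    A copy there means some non-elevated process wrote to `target`
    and the write was tucked away under the user's profile instead.
    """
    return exists(virtualstore_path(target, local_appdata))
```

On a slave this would be called as was_redirected(r"C:\runslave.py", os.environ["LOCALAPPDATA"]); a True result suggests the batch script was not run from an elevated prompt.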
Here's the deployrunslave.bat that I'm using to deploy this:

REM W7
REM note that W7 will let you write to C:\, but tuck the file away somewhere stupid; so we have to run this as Administrator
wget -O c:\runslave.py http://hg.mozilla.org/build/puppet-manifests/raw-file/tip/modules/buildslave/files/runslave.py
wget -O "c:\Users\cltbld\Desktop\startTalos.bat" http://hg.mozilla.org/users/dmitchell_mozilla.com/puppet-manifests/raw-file/tip/modules/buildslave/files/startTalos-w7.bat

This must be run as an administrator, but there's no administrator account on W7, so the RUNAS command doesn't work.  Instead, you need to click the start menu, right click the command prompt, choose "run as administrator", and then run the batch script from that window.  Which means VNC.
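A quick way to confirm that a prompt really is elevated before running the batch script is sketched below. It relies on shell32's IsUserAnAdmin via ctypes; the non-Windows fallback is there only so the snippet runs elsewhere, and the function name is my own.

```python
import ctypes
import os


def is_elevated():
    """Best-effort check that the current process has admin rights."""
    try:
        # Windows only: shell32.IsUserAnAdmin() returns nonzero when elevated.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except AttributeError:
        # ctypes.windll does not exist off Windows; fall back to a root check.
        return hasattr(os, "geteuid") and os.geteuid() == 0


if __name__ == "__main__":
    if not is_elevated():
        print("Not elevated: writes to C:\\ may be redirected to VirtualStore")
```

Running this from the same command prompt first avoids deploying into the VirtualStore by accident.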

I'd like to deploy this next week, but I need some help figuring out how to dodge in and out of running slaves.  With SSH I don't need to worry - I can just install runslave.py and on the next boot the slave will use it.  But does just logging into a running slave with VNC cause test failures?  Timing problems?  Or should I just VNC into slaves and try to catch them when they're idle?  Armen, any thoughts?
I have normally avoided doing any intervention like this when the slaves are running but instead hit "graceful shutdown" and used RDP instead of VNC.

I'm not sure whether it is an option to place runslave.py in a location that doesn't need admin privileges, so that you can do all of it through SSH.
With VNC, failures can happen if you steal focus during some of the tests.
Doesn't the reboot at the end of a job mean the graceful shutdown on the slave has no effect? The best approach I know of is ssh'ing in and modifying/moving buildbot.tac so that the slave doesn't start up again after the reboot.
I don't recall that graceful shutdown reboots the machine but my memory could be failing me.
I have indeed moved the buildbot.tac to be double sure.
Actually I meant that we reboot at the end of job, before the graceful stop has a chance to act. The master sees the slave go away and thinks the graceful stop happened (dustin will know what actually happens here ;-), and is free to hand out another job when the slave reconnects.
yep - I've had to delete each buildbot.tac by hand via SSH, wait for a reboot, and then whack it.

I think I'll just need a downtime for w764, since there's no SSH.
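The buildbot.tac trick discussed above (move the file aside over SSH, let the slave reboot, do the work, then put it back) can be sketched in Python as follows. The ".disabled" suffix and the function names are hypothetical conventions for illustration; on the slaves this was done by hand.

```python
import os

# Hypothetical suffix used to park the file; any name buildbot won't pick up works.
DISABLED_SUFFIX = ".disabled"


def disable_slave(basedir):
    """Rename buildbot.tac so the slave cannot start after the next reboot.

    Returns True if the file was moved, False if there was nothing to move.
    """
    tac = os.path.join(basedir, "buildbot.tac")
    if os.path.exists(tac):
        os.replace(tac, tac + DISABLED_SUFFIX)
        return True
    return False


def enable_slave(basedir):
    """Put buildbot.tac back so the slave starts again on the next boot."""
    tac = os.path.join(basedir, "buildbot.tac")
    parked = tac + DISABLED_SUFFIX
    if os.path.exists(parked):
        os.replace(parked, tac)
        return True
    return False
```

Because both functions report whether they changed anything, they are safe to run twice on the same slave.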
This is done except for

 talos-r3-w7-001 - needs reimage - bug 649835
 talos-r3-w7-017 - still building (darn mochitests)
 talos-r3-w7-032 - reboots bug
 talos-r3-w7-036 - reboots bug
 talos-r3-w7-048 - still building (darn mochitests)

For the reboots bug and reimage, I'll make notes in the slave tracking spreadsheet and the bug; for the other two, I'll check again later tonight.
talos-r3-w7-001 will get reimaged with the appropriate snapshot taken in bug 656042.
 talos-r3-w7-017
 talos-r3-w7-048

done now.  I'm calling this fixed - remaining work is annotated in other bugs.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Oops, I forgot to check in the new startTalos.bat file.  Armen, is this OK?  Hopefully, since it's deployed..
Assignee: nobody → dustin
Status: REOPENED → ASSIGNED
Attachment #531727 - Flags: review?(armenzg)
Comment on attachment 531727 [details] [diff] [review]
m629692-puppet-manifests-r1.patch

Good to have both checked-in.

Will the xp's bat rename affect anything else that you did for it?
Attachment #531727 - Flags: review?(armenzg) → review+
This will need to be redeployed when bug 656450 is solved, too.
Depends on: 656450
Comment on attachment 531727 [details] [diff] [review]
m629692-puppet-manifests-r1.patch

These files are just stuck here for lack of a better place to put them, so renaming won't hurt anything.
Attachment #531727 - Flags: checked-in+
(In reply to comment #16)
> This will need to be redeployed when bug 656450 is solved, too.

Redeployed everywhere but -034, which is being slow.  I'll grab that tomorrow.
Ah, it finished right after I posted that.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering