Closed Bug 552703 Opened 14 years ago Closed 14 years ago

Sendchanges should be done on the slaves

Categories

(Release Engineering :: General, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: rail)

References

Details

(Whiteboard: [automation])

Attachments

(4 files, 5 obsolete files)

Too many headaches having them done on the master.

IIRC, the big issue doing them on the slaves is being able to execute 'buildbot sendchange ....' reliably on Windows.
A MasterShellCommand may also be worth testing out.
Realistically, we probably won't get to this soon.
Priority: -- → P4
Assignee: nobody → rail
To ease this task it would be great to add python\scripts directory to PATH.

Could you review the attached OPSI package?
Attachment #436141 - Flags: review?(bhearsum)
Comment on attachment 436141 [details] [diff] [review]
Add Python Scripts directory to PATH

What risk is there of appending the python script path multiple times?

How uniform are our PATHs?  Could we override it completely instead of just appending to it?
(In reply to comment #4)
> (From update of attachment 436141 [details] [diff] [review])
> What risk is there of appending the python script path multiple times?
> 
> How uniform are our PATHs?  Could we override it completely instead of just
> appending to it?

Sounds like a good opportunity to sync them up, if they aren't already!
(In reply to comment #4)
> (From update of attachment 436141 [details] [diff] [review])
> What risk is there of appending the python script path multiple times?

supp should handle such issues:

  Supp <varname> <list separator> <supplement>

  This command interprets the String value of variable <varname> a list of values
  separated by <list separator> and adds the String <supplement> to this list (if
  it not already contained).
Comment on attachment 436141 [details] [diff] [review]
Add Python Scripts directory to PATH

Yeah, definitely don't want to just append here. Let's pull the $PATH from a production ix machine and production VM, compare, and go from there. If they're the same, just hardcode whatever it is plus the scripts dir. If they're different, we'll have to look at that a bit first.

Either way, it probably makes more sense to create a generic 'set-path' package, or such.
Attachment #436141 - Flags: review?(bhearsum) → review-
Priority: P4 → P2
(In reply to comment #7)
> (From update of attachment 436141 [details] [diff] [review])
> Yeah, definitely don't want to just append here. 

This is not a simple append, this is s a conditional append. The path won't be added if we have it already. supp command handle this very well.

I tried to deploy the package on win32-slave03 in staging multiple times and there were no multiple entries. I tried to run the "setup" action with and without setting the packaged as installed. All wen smooth.

> Let's pull the $PATH from a
> production ix machine and production VM, compare, and go from there. If they're
> the same, just hardcode whatever it is plus the scripts dir. If they're
> different, we'll have to look at that a bit first.

IMHO, it's too complicated for OPSI. I'd prefer KISS principle here. supp command works very fine for one path.

> Either way, it probably makes more sense to create a generic 'set-path'
> package, or such.

IMHO, the proposed package should be deployed once, then removed from OPSI server. This PATH trick should be a part of mozillabuild OPSI package in its next version.

The same logic can be applied for other paths we want to add. The patch should be added by the corresponding OPSI packages.
Status: NEW → ASSIGNED
Comment on attachment 436141 [details] [diff] [review]
Add Python Scripts directory to PATH

Okay, given your comment this is fine, Rail. However, we won't delete this package after deployment because we may need it for future machines.
Attachment #436141 - Flags: review- → review+
Attached patch retry.py (WIP) (obsolete) — Splinter Review
Attaching WIP patches.
Attached patch buildbotcustom changes (WIP) (obsolete) — Splinter Review
Attachment #440523 - Flags: feedback?(catlee)
Attachment #440524 - Flags: feedback?(catlee)
Attachment #440523 - Flags: feedback?(catlee) → feedback+
Comment on attachment 440524 [details] [diff] [review]
buildbotcustom changes (WIP)

Let's see if 'env' can be set in SendChangeStep directly.

Also, if it can reference 'toolsdir' that would be a bit cleaner than using relative paths.
Attachment #440524 - Flags: feedback?(catlee) → feedback-
Attached patch buildbotcustom changes (obsolete) — Splinter Review
Avoid setting PYTHONPATH.
Attachment #440524 - Attachment is obsolete: true
Attachment #442402 - Flags: review?(catlee)
Attached patch retry.py (obsolete) — Splinter Review
Put all files into one directory to avoid setting PYTHONPATH.
Attachment #440523 - Attachment is obsolete: true
Attachment #442404 - Flags: review?(catlee)
Comment on attachment 442404 [details] [diff] [review]
retry.py

>--- /dev/null
>+++ b/buildfarm/utils/retry.py
>@@ -0,0 +1,102 @@
>+#!/usr/bin/python
>+"""retrier.py [options] cmd [arg arg1 ...]

Tiny nit: update the docstring to reflect the name of the script
Attachment #442404 - Flags: review?(catlee) → review+
Comment on attachment 442402 [details] [diff] [review]
buildbotcustom changes

>-    def interrupt(self, reason):
>-        self.retries = 0
>-        if self._interrupt:
>-            self._interrupt("Cancelled sendchange")
>-            self._interrupt = None
>+    def evaluateCommand(self, cmd):
>+        if None != re.search('WARNING', cmd.logs['stdio'].getText()):
>+            return WARNINGS
>+        return self.super_class.evaluateCommand(self, cmd)

Looks good, except let's remove evaluateCommand for now.
Attachment #442402 - Flags: review?(catlee) → review+
Attached patch buildbotcustom changes (obsolete) — Splinter Review
evaluateCommand removed
Attachment #442402 - Attachment is obsolete: true
Attachment #443042 - Flags: review?(catlee)
Attached patch retry.pySplinter Review
(In reply to comment #15)
> Tiny nit: update the docstring to reflect the name of the script

Done
Attachment #442404 - Attachment is obsolete: true
Attachment #443046 - Flags: review?(catlee)
Attachment #443042 - Flags: review?(catlee) → review+
Attachment #443046 - Flags: review?(catlee) → review+
Blocks: 563531
Comment on attachment 436141 [details] [diff] [review]
Add Python Scripts directory to PATH

changeset:   48:53cd812008d4
Attachment #436141 - Flags: checked-in+
(In reply to comment #19)
> (From update of attachment 436141 [details] [diff] [review])
> changeset:   48:53cd812008d4

This package seems to be setup on production-opsi. During today's downtime should this be deployed to production win32-slave* and mw32-ix-slave* ? Do I need to force reboots so that those slaves pick it up immediately ?
python-scripts-path set to install on 
* mw32-ix-slave02 thru 25
* win32-slave01/02/05-20/22 thru 59

To do:
* win2k3-ref-image
* win32-ix-ref
Whiteboard: [automation]
Blocks: 564914
Still need to get it on the ref platform and the try slaves.
(In reply to comment #22)
> Still need to get it on the ref platform and the try slaves.

Might not need to do the try slaves since we're not planning on reconfig'ing the old try master.
Comment on attachment 443042 [details] [diff] [review]
buildbotcustom changes

changeset:   727:e6bcbc6649ad
Attachment #443042 - Flags: checked-in+
Comment on attachment 443046 [details] [diff] [review]
retry.py

changeset:   2401:c8b509ff1079
Attachment #443046 - Flags: checked-in+
reconfig'ed pm01 and pm02 for this.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #445345 - Flags: review?(catlee) → review+
Comment on attachment 445345 [details] [diff] [review]
update sendchanges to go to pm, pm02 rather than localhost

Landed this and reconfiged the masters for it.3
Attachment #445345 - Flags: review?
Attachment #445345 - Flags: review+
Attachment #445345 - Flags: checked-in+
Had to back this out as a quick fix for 10.5 machines:
18:41 < bhearsum|buildduty> Executing: ['buildbot', 'sendchange', '--master',                             'production-master.build.mozilla.org:9010',                             '--username', 'sendchange-unittest', '--branch',                             'mozilla-central-macosx-debug-unittest',                             '--revision', '6a9d0521a319', 
'http://stage.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx-debug/1273857449/firefox-3.7a5pre.en-US.mac.dmg', 
'http://stage.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx-debug/1273857449/firefox-3.7a5pre.en-US.mac.tests.zip']
18:41 < bhearsum|buildduty> Traceback (most recent call last):
18:41 < bhearsum|buildduty>   File "/tools/buildbot/bin/buildbot", line 4, in 
                            <module>
18:41 < bhearsum|buildduty>     import pkg_resources
18:41 < bhearsum|buildduty> ImportError: No module named pkg_resources


Buildbot 0.8.0 probably broke it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The script runs the target program (buildbot) searching it in the PATH, so has no idea which buildbot should be executed if there is no one in the PATH or if there are more than one instances of buildbot installed.

IMO, to keep the script simple and generic across platforms we should adjust the slaves so, that running "buildbot whatever" runs properly.
(In reply to comment #30)
> The script runs the target program (buildbot) searching it in the PATH, so has
> no idea which buildbot should be executed if there is no one in the PATH or if
> there are more than one instances of buildbot installed.
> 
> IMO, to keep the script simple and generic across platforms we should adjust
> the slaves so, that running "buildbot whatever" runs properly.

Yeah, absolutely. Backing out this patch was the quickest way to get the slaves working again -- I don't think this patch is at fault, ultimately.
Simple scripts to test buildbot setup on Linux and Mac slaves:
-----------------------------
#!/bin/bash

MAC_SNOW_MINIS="['moz2-darwin10-slave%02i' % x for x in range(1,51)]"
MAC_MINIS="['moz2-darwin9-slave%02i' % x for x in [2,5,6,7] + range(9,27) + range(29,68)]"
XSERVES="['bm-xserve%02i' % x for x in [6,7,9,11,12,15,16,17,18,19,21,22]]"
LINUX_VMS="['moz2-linux-slave%02i' % x for x in [1,2] + range(5,17) + range(18,51)]"
LINUX64_VMS="['moz2-linux64-slave%02i' % x for x in range(1,7) + range(8,13)]"
LINUX_IXS="['mv-moz2-linux-ix-slave%02i' % x for x in range(2,25)]"

BAD_SLAVES=
for slave in `python -c "print ' '.join($MAC_SNOW_MINIS + $MAC_MINIS + $XSERVES + $LINUX_VMS + $LINUX64_VMS +  $LINUX_IXS) " `;do
    ssh-keyscan $slave.build.mozilla.org >> ./host_keys
    ./ssh.exp cltbldpasswd $slave.build.mozilla.org
    if [ "$?" != "0" ]; then
        BAD_SLAVES="$BAD_SLAVES $slave"
    fi
done

echo $BAD_SLAVES
-----------------------------
#!/usr/bin/expect -f
# ssh.exp
set password [lrange $argv 0 0]
set hostname [lrange $argv 1 1]
set timeout -1

spawn ssh -o UserKnownHostsFile=./host_keys cltbld@$hostname "bash --login -c 'buildbot --version'"
match_max 100000
expect "*?assword:*"
send -- "$password\r"
send -- "\r"
expect eof
-----------------------------

Results:

MAC_SNOW_MINIS = ['moz2-darwin10-slave%02i' % x for x in range(1,51)]
1. moz2-darwin10-slave16 doesn't respond
2. moz2-darwin10-slave{30,39}: Name or service not known

MAC_MINIS = ['moz2-darwin9-slave%02i' % x for x in [2,5,6,7] + range(9,27) + range(29,68)]
All passed

XSERVES = ['bm-xserve%02i' % x for x in [6,7,9,11,12,15,16,17,18,19,21,22]]
All passed

LINUX_VMS=['moz2-linux-slave%02i' % x for x in [1,2] + range(5,17) + range(18,51)]
All passed

LINUX64_VMS = ['moz2-linux64-slave%02i' % x for x in range(1,7) + range(8,13)]
All passed

LINUX_IXS= ['mv-moz2-linux-ix-slave%02i' % x for x in range(2,25)]
All passed

Randomly picked ~10 win32 builders have been tested manually. "buildbot --version" in cmd ran properly.

Could we retry landing this patch?
Refreshed patch. Keeping r+.
Attachment #443042 - Attachment is obsolete: true
Attachment #451594 - Flags: review+
Blocks: 571571
Comment on attachment 451594 [details] [diff] [review]
buildbotcustom changes

788:2d9073bc6645
Attachment #451594 - Flags: checked-in+
No longer blocks: 571571
Blocks: 571571
All done here?
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Oops, just found out that the change from "localhost" to FQDN in the configs is something I need to do on SeaMonkey as well...

Would be helpful to notify me of such changes when possible - thanks!
Blocks: 589135
Blocks: 652391
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: