Last Comment Bug 772446 - Migrate desktop linux32, linux64 firefox builds to mock slaves (and migrate Firefox Linux builders to centos6)
: Migrate desktop linux32, linux64 firefox builds to mock slaves (and migrate F...
Status: RESOLVED FIXED
:
Product: Release Engineering
Classification: Other
Component: General Automation (show other bugs)
: other
: All All
: -- normal (vote)
: ---
Assigned To: John Hopkins (:jhopkins)
: Chris AtLee [:catlee]
Mentors:
: 633275 772063 (view as bug list)
Depends on: 802114 772563 773379 786088 790444 790506 792859 793016 793217 793576 793782 794063 794339 813039
Blocks: 661306 662417 670707 740690 780022 793088 794185 794378
  Show dependency treegraph
 
Reported: 2012-07-10 07:46 PDT by Chris AtLee [:catlee]
Modified: 2013-08-12 21:54 PDT (History)
21 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
buildbot-configs wip (3.48 KB, patch)
2012-07-23 14:24 PDT, John Hopkins (:jhopkins)
no flags Details | Diff | Review
buildbotcustom wip (9.28 KB, patch)
2012-07-23 14:25 PDT, John Hopkins (:jhopkins)
catlee: feedback+
Details | Diff | Review
puppet config (1.08 KB, patch)
2012-08-23 06:19 PDT, John Hopkins (:jhopkins)
rail: review-
Details | Diff | Review
buildbot-configs wip (6.37 KB, patch)
2012-08-23 12:08 PDT, John Hopkins (:jhopkins)
rail: feedback+
Details | Diff | Review
buildbotcustom wip (69.33 KB, patch)
2012-08-23 12:09 PDT, John Hopkins (:jhopkins)
rail: feedback+
Details | Diff | Review
proposed buildbot-configs patch (10.50 KB, patch)
2012-08-24 11:16 PDT, John Hopkins (:jhopkins)
bhearsum: review-
Details | Diff | Review
proposed buildbotcustom patch (70.27 KB, patch)
2012-08-24 11:18 PDT, John Hopkins (:jhopkins)
rail: review-
Details | Diff | Review
puppet config v2 (1.98 KB, patch)
2012-08-24 11:43 PDT, Rail Aliiev [:rail] PTO Jun 27
bugspam.Callek: review+
Details | Diff | Review
puppet config v3 (1.71 KB, patch)
2012-08-27 06:12 PDT, Rail Aliiev [:rail] PTO Jun 27
rail: review+
Details | Diff | Review
mock.py diff (16.17 KB, patch)
2012-08-27 10:23 PDT, Rail Aliiev [:rail] PTO Jun 27
no flags Details | Diff | Review
puppet config v3.1 (1.65 KB, patch)
2012-08-31 13:33 PDT, Rail Aliiev [:rail] PTO Jun 27
rail: review+
rail: checked‑in+
Details | Diff | Review
tools: python failover for valgrind (964 bytes, patch)
2012-09-04 07:51 PDT, Rail Aliiev [:rail] PTO Jun 27
catlee: review+
rail: checked‑in+
Details | Diff | Review
configs (39.23 KB, patch)
2012-09-04 07:59 PDT, Rail Aliiev [:rail] PTO Jun 27
no flags Details | Diff | Review
buildbotcustom (103.33 KB, patch)
2012-09-04 08:18 PDT, Rail Aliiev [:rail] PTO Jun 27
no flags Details | Diff | Review
don't clobber $PROPERTIES_FILE (929 bytes, patch)
2012-09-06 10:32 PDT, Rail Aliiev [:rail] PTO Jun 27
catlee: review+
rail: checked‑in+
Details | Diff | Review
configs (43.07 KB, patch)
2012-09-06 13:12 PDT, Rail Aliiev [:rail] PTO Jun 27
no flags Details | Diff | Review
buildbotcustom (103.24 KB, patch)
2012-09-06 13:13 PDT, Rail Aliiev [:rail] PTO Jun 27
catlee: review-
Details | Diff | Review
stdio of alive tests 1 (70.30 KB, text/html)
2012-09-06 13:51 PDT, Rail Aliiev [:rail] PTO Jun 27
no flags Details
mockify compare leak steps (4.25 KB, patch)
2012-09-12 11:49 PDT, Chris AtLee [:catlee]
no flags Details | Diff | Review
buildbot-configs (44.81 KB, patch)
2012-09-14 09:06 PDT, John Hopkins (:jhopkins)
catlee: review+
jhopkins: checked‑in+
Details | Diff | Review
buildbotcustom (109.60 KB, patch)
2012-09-14 09:07 PDT, John Hopkins (:jhopkins)
catlee: review+
jhopkins: checked‑in+
Details | Diff | Review
buildbot-configs follow-up patch to includ /tools/buildbot/bin in PATH for linux32/64 builds (2.68 KB, patch)
2012-09-18 12:34 PDT, John Hopkins (:jhopkins)
no flags Details | Diff | Review
buildbot-configs follow-up patch to include /tools/buildbot/bin in PATH for linux32/64 builds + debug (5.35 KB, patch)
2012-09-19 08:31 PDT, John Hopkins (:jhopkins)
catlee: review+
jhopkins: checked‑in+
Details | Diff | Review

Description Chris AtLee [:catlee] 2012-07-10 07:46:22 PDT

    
Comment 1 Chris Cooper [:coop] 2012-07-11 09:21:03 PDT
Bug 670707 is the one I was talking about on Monday when I mentioned that there were *lots* of requested updates to the linux platforms. Should those dependencies be moved here instead?

Since we're going to be moving to CentOS 6 (or 6.2) on AWS, we may get many of these for free, but since we're essentially revving the ref platform here, we should make sure the outstanding requests are incorporated.
Comment 2 Chris Cooper [:coop] 2012-07-22 09:42:56 PDT
*** Bug 772063 has been marked as a duplicate of this bug. ***
Comment 3 Gary Kwong [:gkw] [:nth10sd] 2012-07-22 13:48:41 PDT
> Since we're going to be moving to CentOS 6 (or 6.2) on AWS, we may get many
> of these for free, but since we're essentially revving the ref platform
> here, we should make sure the outstanding requests are incorporated.

Are we getting python upgrades to 2.7.x as well? (build machines run fuzzers and 2.7.x does help the harness, which is running python 2.5.1 and has a series of ugly hacks)
Comment 4 John Hopkins (:jhopkins) 2012-07-23 13:27:57 PDT
Gary: Releng do want to move to Python 2.7 across all platforms but this will take some time.  Linux will get it first, and I can confirm that the mock work is using Python 2.7.
Comment 5 Gary Kwong [:gkw] [:nth10sd] 2012-07-23 13:32:57 PDT
(In reply to John Hopkins (:jhopkins) from comment #4)
> Gary: Releng do want to move to Python 2.7 across all platforms but this
> will take some time.  Linux will get it first, and I can confirm that the
> mock work is using Python 2.7.

That is great news. Thanks for the update John!
Comment 6 John Hopkins (:jhopkins) 2012-07-23 14:24:58 PDT
Created attachment 645072 [details] [diff] [review]
buildbot-configs wip
Comment 7 John Hopkins (:jhopkins) 2012-07-23 14:25:27 PDT
Created attachment 645073 [details] [diff] [review]
buildbotcustom wip
Comment 8 John Hopkins (:jhopkins) 2012-07-23 14:33:30 PDT
Comment on attachment 645073 [details] [diff] [review]
buildbotcustom wip

I've added an "addMockCommand" wrapper which is used a little differently than the MockCommand class.  In my case, I used it to wrap the MozillaCheck class rather than writing a new MockMozillaCheck class from scratch.  We should be able to obsolete the original MockCommand class if this method is preferred.
Comment 9 Chris AtLee [:catlee] 2012-08-02 07:38:51 PDT
Comment on attachment 645073 [details] [diff] [review]
buildbotcustom wip

Review of attachment 645073 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! Have you tested that the dep/nightly l10n jobs still work ok?

::: steps/unittest.py
@@ +730,5 @@
> +        def start(self):
> +            self.set_mock_command()
> +            self.super_class.start(self)
> +
> +    return C

I think it would make sense to put this function in some other module, since it's not specific to unit tests.
Comment 10 Chris AtLee [:catlee] 2012-08-02 07:39:45 PDT
Comment on attachment 645072 [details] [diff] [review]
buildbot-configs wip

Review of attachment 645072 [details] [diff] [review]:
-----------------------------------------------------------------

can you re-submit this patch against an unmodified buildbot-configs? It looks like there's some earlier work below that's not showing up in the diff.
Comment 11 Chris AtLee [:catlee] 2012-08-02 09:30:32 PDT
*** Bug 633275 has been marked as a duplicate of this bug. ***
Comment 12 John Hopkins (:jhopkins) 2012-08-23 06:19:15 PDT
Created attachment 654592 [details] [diff] [review]
puppet config

puppet config patch from /etc/puppet/environments/jhopkins/env on releng-puppet1.
Comment 13 John Hopkins (:jhopkins) 2012-08-23 06:23:41 PDT
I've built green Linux mozilla-central on-change and nightly+l10n mock builds in staging using the buildbot-configs/buildbotcustom patches at http://hg.mozilla.org/users/jhopkins_mozilla.com/

I started verifying that non-Linux platforms still build ok but have hit a couple of issues that I'm working on today.
Comment 14 Rail Aliiev [:rail] PTO Jun 27 2012-08-23 09:16:30 PDT
Comment on attachment 654592 [details] [diff] [review]
puppet config

Can you move package definitions from toplevel::slave::build::xvfb to packages::xvfb and include it *inside* of toplevel::slave::build::xvfb class. Also, move "include packages::supervisord" inside toplevel::slave::build::xvfb class.

packages::xvfb should be a meta package, which contains all needed packages (["xorg-x11-server-Xvfb", "xorg-x11-xauth"]) and checks for operating system (like other packages).
Comment 15 John Hopkins (:jhopkins) 2012-08-23 12:08:41 PDT
Created attachment 654723 [details] [diff] [review]
buildbot-configs wip
Comment 16 John Hopkins (:jhopkins) 2012-08-23 12:09:13 PDT
Created attachment 654725 [details] [diff] [review]
buildbotcustom wip
Comment 17 Rail Aliiev [:rail] PTO Jun 27 2012-08-23 12:45:26 PDT
Comment on attachment 654723 [details] [diff] [review]
buildbot-configs wip

Review of attachment 654723 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +1094,5 @@
> +BRANCHES['mozilla-central']['platforms']['linux']['use_mock'] = True
> +BRANCHES['mozilla-central']['platforms']['linux']['mock_target'] = 'mozilla-centos6-i386'
> +BRANCHES['mozilla-central']['platforms']['linux']['slaves'] = SLAVES['mock']
> +BRANCHES['mozilla-central']['platforms']['linux']['l10n_slaves_key'] = 'mock'
> +BRANCHES['mozilla-central']['platforms']['linux']['mock_packages'] = \

I think, you can move mock_target and mock_packages to PLATFORM_VARS. They won't be used unless use set use_mock.

@@ +1109,5 @@
> +                                'gcc45_0moz3', 'yasm', 'ccache', # <-- from releng repo
> +                                ]
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PATH'] = '/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin:/home/cltbld/bin'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON26'] = None
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON27'] = '/tools/python27/bin/python2.7'

We should rename this variable to something more sane. ATM, we use PYTHON26 when we need to run non default python. Something like PYTHON_BIN or just PYTHON will be better.

@@ +1116,5 @@
> +BRANCHES['mozilla-central']['platforms']['linux64']['use_mock'] = True
> +BRANCHES['mozilla-central']['platforms']['linux64']['mock_target'] = 'mozilla-centos6-x86_64'
> +BRANCHES['mozilla-central']['platforms']['linux64']['slaves'] = SLAVES['mock']
> +BRANCHES['mozilla-central']['platforms']['linux64']['l10n_slaves_key'] = 'mock'
> +BRANCHES['mozilla-central']['platforms']['linux64']['mock_packages'] = \

The same here.

@@ +1131,5 @@
> +                                'gcc45_0moz3', 'yasm', 'ccache', # <-- from releng repo
> +                                ]
> +BRANCHES['mozilla-central']['platforms']['linux64']['env']['PATH'] = '/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin:/home/cltbld/bin'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON26'] = None
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON27'] = '/tools/python27/bin/python2.7'

The same here.
Comment 18 Rail Aliiev [:rail] PTO Jun 27 2012-08-23 13:04:32 PDT
Comment on attachment 654725 [details] [diff] [review]
buildbotcustom wip

Review of attachment 654725 [details] [diff] [review]:
-----------------------------------------------------------------

Woot! Looks good in overall see my comments inline.

::: process/factory.py
@@ +967,5 @@
> +        ))
> +        self.addStep(MockInit(
> +            target=self.mock_target,
> +        ))
> +        # XXX: Hardcoding these directories here sucks.

Agree. Maybe it'd be better to move them to configs? Something like

1) add them to configs:

GLOBAL_VARS = {
...
 'mock_copyin_files': [('/home/cltbld/.ssh', '/home/mock_mozilla/.ssh'), ('/home/cltbld/.android', '/builds/.android'), ...],
...
}

2) Pass mock_copyin_files to the corresponding factories in misc.py

3) Add mock_copyin_files for MozillaBuildFactory/MercurialBuildFactory __init__

4) Loop wrapping the values with WithProperties to be more flexible:
for source, target in self.mock_copyin_files:
  self.addStep(ShellCommand(
		command=['mock_mozilla', '-r', self.mock_target, '--copyin', WithProperties(source), WithProperties(target)],
		haltOnFailure=True,
		))

@@ +3731,3 @@
>              name='get_build_id',
>              workdir=self.absMozillaSrcDir,
> +            extract_fn=parse_build_id,

Hmm, why do you need a parsing function here?

@@ +3749,5 @@
>                           WithProperties('%(inipath)s'),
>                           'App', 'Name'],
>                  property='appName',
>                  name='get_app_name',
> +                workdir=WithProperties('%(basedir)s/' + self.absMozillaSrcDir),

Did it work for Windows? Why you didn't use MockProperty here like in get_build_id?

::: steps/mock.py
@@ +1,1 @@
> +# ***** BEGIN LICENSE BLOCK *****

Assuming that this files contains just copy/pasted classes, I skipped this file. If you changed something here I can review it as well.

Please make sure that you removed trailing spaces.
Comment 19 John Hopkins (:jhopkins) 2012-08-24 11:16:35 PDT
Created attachment 655065 [details] [diff] [review]
proposed buildbot-configs patch
Comment 20 John Hopkins (:jhopkins) 2012-08-24 11:18:44 PDT
Created attachment 655068 [details] [diff] [review]
proposed buildbotcustom patch
Comment 21 John Hopkins (:jhopkins) 2012-08-24 11:28:26 PDT
The above patches enable linux/linux64 mock builds on mozilla-central, try, and all project branches.

Staging test results so far:

| Linux mozilla-central nightly                | OK |
| Firefox mozilla-central linux l10n nightly   | fail make_libmar |
| WINNT 5.2 mozilla-central nightly            | OK |
| Firefox mozilla-central win32 l10n nightly   | did not run |
| OS X 10.7 mozilla-central nightly            | OK |
| Firefox mozilla-central macosx64 l10n nightly| fail make_libmar |
| TB Linux comm-central build (non-mock)       | OK |
Comment 22 Rail Aliiev [:rail] PTO Jun 27 2012-08-24 11:43:51 PDT
Created attachment 655077 [details] [diff] [review]
puppet config v2
Comment 23 Ben Hearsum (:bhearsum) 2012-08-24 14:23:57 PDT
Comment on attachment 655065 [details] [diff] [review]
proposed buildbot-configs patch

Review of attachment 655065 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +1122,5 @@
> +BRANCHES['mozilla-central']['platforms']['linux']['use_mock'] = True
> +BRANCHES['mozilla-central']['platforms']['linux']['slaves'] = SLAVES['mock']
> +BRANCHES['mozilla-central']['platforms']['linux']['l10n_slaves_key'] = 'mock'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PATH'] = '/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin:/home/cltbld/bin'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON26'] = None

All of this should be done in the PLATFORMS dictionary - not overridden down here. It is supposed to be the "default" as much as possible. Override aurora, beta, release, and esr10 to the values they need. Once you do that, you don't need to add anything at all to the project branch interpolation loop, and only need to override 'slaves' for try (because use_mock will automatically get set).

This patch looks fine otherwise.
Comment 24 Rail Aliiev [:rail] PTO Jun 27 2012-08-26 07:35:12 PDT
Comment on attachment 655077 [details] [diff] [review]
puppet config v2

Throwing Callek's way to double check if the patch made in puppetagain style.
Comment 25 Justin Wood (:Callek) 2012-08-26 12:32:18 PDT
Comment on attachment 655077 [details] [diff] [review]
puppet config v2

Review of attachment 655077 [details] [diff] [review]:
-----------------------------------------------------------------

None of my comments block this landing, so provided you test[ed] it somewhere, you're good to go on the puppet side.

::: modules/mockbuild/manifests/service.pp
@@ +1,1 @@
> +class mockbuild::service {

nit: name it services since the current naming makes me thing we're setting "mockbuild" itself up as a service on the system.

::: modules/packages/manifests/xvfb.pp
@@ +1,1 @@
> +class packages::xvfb {

(question, not nit) would this make sense in the existing packages::x_libs or is it better as its own thing?
Comment 26 Rail Aliiev [:rail] PTO Jun 27 2012-08-27 05:32:55 PDT
(In reply to Justin Wood (:Callek) from comment #25)
> nit: name it services since the current naming makes me thing we're setting
> "mockbuild" itself up as a service on the system.

Agreed.
 
> (question, not nit) would this make sense in the existing packages::x_libs
> or is it better as its own thing?

I'd prefer to have this package separately because unlike x_libs it runs as an X server.
Comment 27 Rail Aliiev [:rail] PTO Jun 27 2012-08-27 06:12:02 PDT
Created attachment 655570 [details] [diff] [review]
puppet config v3

s/service/services/g
carrying on r+ from Callek
Comment 28 Rail Aliiev [:rail] PTO Jun 27 2012-08-27 10:16:10 PDT
Comment on attachment 655068 [details] [diff] [review]
proposed buildbotcustom patch

Review of attachment 655068 [details] [diff] [review]:
-----------------------------------------------------------------

Still need some fixes before it lands. I'll attach a patch for mock.py as a separate file.

::: process/factory.py
@@ +719,1 @@
>                   mock_packages=[],

mock_packages=None is better here

@@ +957,5 @@
> +    def addMockSteps(self):
> +        self.addStep(MockInit(
> +            target=self.mock_target,
> +        ))
> +        # XXX: Hardcoding these directories here sucks.

See comment#18

@@ +2957,5 @@
>                   mergeLocales=True, mozconfigBranch="production",
>                   testPrettyNames=False,
>                   callClientPy=False,
>                   clientPyConfig=None,
> +                 use_mock=False, mock_target='mozilla-f16-x86_64', mock_packages=[],

Setting default value to [] is dangerous, please use mock_packages=None

@@ +3159,5 @@
>          ))
>          # WinCE is the only platform that will do repackages with
>          # a mozconfig for now. This will be fixed in bug 518359
>          # For backward compatibility where there is no mozconfig
> +        self.addStep(MockCommand( **self.processCommand(

A nit, please remove an extra whitespace before**.

@@ +3175,2 @@
>          )))
> +        self.addStep(MockCommand( **self.processCommand(

The same here.
Comment 29 Rail Aliiev [:rail] PTO Jun 27 2012-08-27 10:23:56 PDT
Created attachment 655636 [details] [diff] [review]
mock.py diff

Mostly a cleanup up to pass pep8 and most of pylint errors/warnings.

the only significant change is how mock_args gets its parameters. Instead of using a list (it's dangerous!) I'd use None and check it like this:
self.mock_args = mock_args or ['--unpriv']

The patch should be applied on top of the existing one.
Comment 30 John Hopkins (:jhopkins) 2012-08-27 11:54:56 PDT
(In reply to Rail Aliiev [:rail] from comment #28)
> Comment on attachment 655068 [details] [diff] [review]
> proposed buildbotcustom patch

Aside from the whitespace nit, those comments all refer to existing code.  I either moved the code around or copied the style in each case.

Why is [] dangerous?
Comment 31 Rail Aliiev [:rail] PTO Jun 27 2012-08-27 12:14:11 PDT
(In reply to John Hopkins (:jhopkins) from comment #30)
> (In reply to Rail Aliiev [:rail] from comment #28)
> > Comment on attachment 655068 [details] [diff] [review]
> > proposed buildbotcustom patch
> 
> Aside from the whitespace nit, those comments all refer to existing code.  I
> either moved the code around or copied the style in each case.

Yeah, but once we touch that code, let's review it! :)
 
> Why is [] dangerous?


def foo(a=[]):
    a.append(5)
    return a

>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]

Even though you don't use append() ATM, someone may use it in the future.
Comment 32 John Hopkins (:jhopkins) 2012-08-28 09:08:38 PDT
DONE:
- mock_packages=[] changed to mock_packages=None (rail's feedback)
- steps/mock.py patch applied (rail's patch)
- whitespace nit fixed (rail's feedback)
- use_mock and slaves key made default (bhearsum's feedback)
 - thunderbird-specific
  - set use_mock=False in thunderbird_config.py
  - thunderbird slaves already set to non-mock slaves
  - will revisit thunderbird when i have time to test.  that's the only reason for not doing it now.

TODO:
- address remaining bhearsum's configs feedback: https://bugzilla.mozilla.org/show_bug.cgi?id=772446#c23
 - remaining work:
  - l10n_slaves_key, PATH env, PYTHON26 env
- rail's feedback on hard-coded filenames in https://bugzilla.mozilla.org/show_bug.cgi?id=772446#c18
- do another round of nightly test builds (all desktop platforms) with all the above changes
Comment 33 John Hopkins (:jhopkins) 2012-08-28 09:11:00 PDT
Releng: The changes made to address the above are pushed to my user repos (in the 'mock' branch):

http://hg.mozilla.org/users/jhopkins_mozilla.com/buildbot-configs/
http://hg.mozilla.org/users/jhopkins_mozilla.com/buildbotcustom/

staging master is http://dev-master01.build.scl1.mozilla.com:8900/ in /builds/buildbot/jhopkins/build1/  Feel free to use it to continue this work while I'm away.
Comment 34 Rail Aliiev [:rail] PTO Jun 27 2012-08-31 13:33:24 PDT
Created attachment 657426 [details] [diff] [review]
puppet config v3.1

s/packages::supervisord/packages::mozilla::supervisor/
carry over r=dustin
Comment 35 John Hopkins (:jhopkins) 2012-09-03 09:58:50 PDT
Reassigning to Rail who is continuing this work.
Comment 36 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 05:39:08 PDT
Comment on attachment 657426 [details] [diff] [review]
puppet config v3.1

http://hg.mozilla.org/build/puppet/rev/bf7262454ff1
Comment 37 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 07:51:39 PDT
Created attachment 658086 [details] [diff] [review]
tools: python failover for valgrind
Comment 38 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 07:59:35 PDT
Created attachment 658092 [details] [diff] [review]
configs

Comments incoming
Comment 39 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 08:18:00 PDT
Created attachment 658098 [details] [diff] [review]
buildbotcustom

comments incoming
Comment 40 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 08:24:47 PDT
Comment on attachment 658092 [details] [diff] [review]
configs

Review of attachment 658092 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +99,5 @@
> +    'mock_copyin_files': [
> +        ('/home/cltbld/.ssh', '/home/mock_mozilla/.ssh'),
> +        ('/home/cltbld/.android', '/builds/.android'),
> +        ('/home/cltbld/.mozpass.cfg', '/builds/.mozpass.cfg'),
> +    ],

I put this variable to GLOBAL_VARS to avoid boiler plate code, even thought this variable sounds more like a platform variable.

@@ +171,5 @@
> +                        'mpfr', # required for system compiler
> +                        'xorg-x11-font*', # fonts required for PGO
> +                        'imake', # required for makedepend!?!
> +                        'gcc45_0moz3', 'yasm', 'ccache', # <-- from releng repo
> +                        'valgrind', 'valgrind-devel',

Had to add valgrind packages to make the builder happy. The build itself is red (th same as in production).

@@ +1172,5 @@
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['slaves'] = SLAVES['linux']
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['env']['SYMBOL_SERVER_SSH_KEY'] = "/home/cltbld/.ssh/ffxbld_dsa"
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['env']['PATH'] = "/tools/jdk6/bin:/opt/local/bin:/tools/python/bin:/tools/buildbot/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/"
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['env']['PYTHON26'] = "/tools/python-2.6.5/bin/python"
> +# mock disabled block stop

A lot of noise to avoid evil loops.

@@ -1413,5 @@
> -        BRANCHES[b]['platforms'][p]['env']['SYMBOL_SERVER_SSH_KEY'] = "/home/cltbld/.ssh/ffxbld_dsa"
> -        BRANCHES[b]['platforms'][p]['env']['PATH'] = "/tools/jdk6/bin:/opt/local/bin:/tools/python/bin:/tools/buildbot/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/"
> -        del BRANCHES[b]['platforms'][p]['use_mock']
> -        del BRANCHES[b]['platforms'][p]['mock_target']
> -        del BRANCHES[b]['platforms'][p]['mock_packages']

this block is flattened above
Comment 41 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 08:50:37 PDT
Comment on attachment 658098 [details] [diff] [review]
buildbotcustom

Review of attachment 658098 [details] [diff] [review]:
-----------------------------------------------------------------

::: misc.py
@@ +1661,5 @@
> +                'category': name,
> +                'nextSlave': _nextSlowSlave,
> +                'properties': {'branch': name, 'platform': platform, 'slavebuilddir': reallyShort('%s-%s-xulrunner' % (name, platform)), 'product': 'xulrunner'},
> +            }
> +            branchObjects['builders'].append(mozilla2_xulrunner_builder)

A lot of noise here because of fixed indentation.

::: process/factory.py
@@ +1194,5 @@
>      def addDoBuildSteps(self):
>          workdir=WithProperties('%(basedir)s/build')
>          if self.platform.startswith('win'):
>              workdir="build"
> +        command = self.makeCmd + ['-f', 'client.mk', 'build',

Unrelated cleanup!

@@ +2568,5 @@
>                  log_eval_func=lambda c,s: regex_log_evaluator(c, s, upload_errors),
>                  locks=[upload_lock.access('counting')],
>              ))
>  
> +        talosBranch = "%s-%s-talos" % (self.branchName, self.complete_platform)

The same here

@@ +6197,2 @@
>          self.get_basedir_cmd = ['bash', '-c', 'pwd']
> +        self.env = env

Had to add "env" member to pass it to mock command

::: process/release.py
@@ +1105,5 @@
>              useBetaChannelForRelease=releaseConfig.get('useBetaChannelForRelease', False),
>              signingServers=getSigningServers('linux'),
>              useChecksums=releaseConfig.get('enablePartialMarsAtBuildTime', True),
>              mozRepoPath=moz_repo_path,
> +            python=branchConfig['platforms']['linux']['env'].get('PYTHON26', 'python'),

Don't depend on this variable which is absent on mock slaves.

::: steps/mock.py
@@ +99,5 @@
> +        self.command = [self.mock_login, '-v', '-r', self.target,
> +                        '--cwd', WithProperties(mock_workdir)] + \
> +            self.mock_args + ['--shell'] + \
> +            [WithProperties('/usr/bin/env %s %s' % (environment,
> +                                                    string_command))]

This file is mostly a copy/paste from steps/misc.py. The only significant change is how we pass the final command to ShellCommand.

In the past we used to pass the whole command as a string, what caused at least 2 problems:

1) mock_workwir can't contain spaces. Not a n issue anymore since we pass this parameter as a separate list item, so ShellCommand passes it to the final command as a separate argument, avoiding shell.

2) there was no way to execute complex shell commands (sh -c "...", especially with && and ; inside). Now shell command gets the last argument as a quoted (with undocumented :/ pipes.quote()) command prefixed with /usr/bin/env. Unfortunately it's not so easy to preserve the environment, since consolehelper and mock clean it up.

I also added -v to make mock more verbose.
Comment 42 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 08:59:10 PDT
= Summary =

== Working builds (m-c) ==

* linux/linux64/android/armv6 builds/pgo builds/nightly builds
* linux/linux64 l10n dep and nightly repacks
* win/mac builds (their factories are not affected by mock)

== Failing builds ==
* linux/linux64 debug fail during second AliveTest:
 *** glibc detected *** /builds/slave/m-cen-lnx-dbg/build/obj-firefox/dist/bin/firefox-bin: corrupted double-linked list: 0x0b57c8a8 ***
* xulrunner - the same error as in production
* valgrind - the same error as in production


== Not tested ==
* PROJECT builds
* mobile l10n: http://hg.mozilla.org/build/buildbotcustom/file/08ab818fd5a1/misc.py#l1167 
* blocklist update
Comment 43 Rail Aliiev [:rail] PTO Jun 27 2012-09-04 13:31:54 PDT
Comment on attachment 658086 [details] [diff] [review]
tools: python failover for valgrind

http://hg.mozilla.org/build/tools/rev/261314a760ee
Comment 44 Rail Aliiev [:rail] PTO Jun 27 2012-09-06 10:32:45 PDT
Created attachment 658921 [details] [diff] [review]
don't clobber $PROPERTIES_FILE

Valgrind build failed:

+ python /builds/slave/m-cen-lnx-valgrind/scripts/clobberer/clobberer.py -s scripts -s /builds/slave/m-cen-lnx-valgrind/buildprops.json http://clobberer.pvt.build.mozilla.org/index.php mozilla-central 'Linux mozilla-central valgrind' m-cen-lnx-valgrind bld-centos6-hp-002 http://dev-master01.build.scl1.mozilla.com:8033/
Checking clobber URL: http://clobberer.pvt.build.mozilla.org/index.php?master=http%3A%2F%2Fdev-master01.build.scl1.mozilla.com%3A8033%2F&slave=bld-centos6-hp-002&builddir=m-cen-lnx-valgrind&branch=mozilla-central&buildername=Linux+mozilla-central+valgrind
m-cen-lnx-valgrind:Our last clobber date:  2012-09-01 00:47:20
m-cen-lnx-valgrind:Server clobber date:    2012-09-01 22:00:19
m-cen-lnx-valgrind:Server is forcing a clobber, initiated by Callek@gmail.com
m-cen-lnx-valgrind:Clobbering...
Removing objdir/
Removing buildprops.json
Removing build/
Skipping last-clobber
Removing src/
Skipping scripts
+ cd /builds/slave/m-cen-lnx-valgrind/scripts/..
+ python /builds/slave/m-cen-lnx-valgrind/scripts/buildfarm/maintenance/purge_builds.py -s 4 -n info -n 'rel-*' -n m-cen-lnx-valgrind
187.61 GB of space available
+ python /builds/slave/m-cen-lnx-valgrind/scripts/buildfarm/utils/hgtool.py --rev c64a9f342156 --tbox http://hg.mozilla.org/mozilla-central src
Traceback (most recent call last):
  File "/builds/slave/m-cen-lnx-valgrind/scripts/buildfarm/utils/hgtool.py", line 72, in <module>
    js = json.load(open(options.propsfile))
IOError: [Errno 2] No such file or directory: '/builds/slave/m-cen-lnx-valgrind/buildprops.json'
+ exit 2
Comment 45 Rail Aliiev [:rail] PTO Jun 27 2012-09-06 13:12:21 PDT
Created attachment 658972 [details] [diff] [review]
configs

Refreshed configs with mock_copyin_files moved to PLATFORMS.
Comment 46 Rail Aliiev [:rail] PTO Jun 27 2012-09-06 13:13:49 PDT
Created attachment 658973 [details] [diff] [review]
buildbotcustom

Refreshed buildbotcustom with mock_copyin_files moved to PLATFORMS.
Comment 47 Rail Aliiev [:rail] PTO Jun 27 2012-09-06 13:51:56 PDT
Created attachment 658986 [details]
stdio of alive tests 1

ATM, leak test on linux/linux64 turns orange because of failed alive test 1 (0 based). This is our major stopper here...
Comment 48 Karl Tomlinson (ni?:karlt) 2012-09-06 17:05:00 PDT
> *** glibc detected *** /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/firefox-bin: corrupted double-linked list: 0x0000000001cffaa0 ***
> ======= Backtrace: =========
> /lib64/libc.so.6(+0x75366)[0x7fb28c772366]
> /lib64/libc.so.6(+0x78181)[0x7fb28c775181]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x180b35c)[0x7fb28929135c]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x180be1c)[0x7fb289291e1c]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x17c2e8a)[0x7fb289248e8a]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x180bf55)[0x7fb289291f55]
> /usr/lib64/libX11.so.6(XCloseDisplay+0xa2)[0x7fb2866dbc82]

> TEST-UNEXPECTED-FAIL | automation.py | Exited with code 1 during test run

Are we meant to get a stack trace from our crash reporter or similar here?

I'd expect glibc to raise SIGABRT, which would normally trigger our crash
reporter.  Is it actually the firefox process exiting with exit code 1 or some
parent process?
Exit code 1 is not what I'd expect from SIGABRT.

If libxul.so is not stripped then

addr2line -if -e /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so 0x180b35c

etc, will be helpful.

I fear it might be stripped and symbols in a special format which might need our
own tools, but I don't know what they are.  It may be easier to persuade the
build not to strip.

These could be helpful, whatever:

addr2line -if -e /usr/lib/debug/lib64/libc-2.11.so.debug 0x75366
addr2line -if -e /usr/lib/debug/lib64/libc-2.11.so.debug 0x78181
Comment 49 Karl Tomlinson (ni?:karlt) 2012-09-06 19:27:49 PDT
I picked glibc-debuginfo-2.12-1.80.el6_3.5.x86_64.
Don't know if that is the right one, but it looks close enough.

% addr2line -if -e usr/lib/debug/lib64/libc-2.12.so.debug 0x75366 
ptmalloc_lock_all
/usr/src/debug/glibc-2.12-2-gc4ccff1/malloc/arena.c:279
% addr2line -if -e usr/lib/debug/lib64/libc-2.12.so.debug 0x78181 
_int_free
/usr/src/debug/glibc-2.12-2-gc4ccff1/malloc/malloc.c:4938

I had forgotten we don't use jemalloc in debug builds, so this doesn't tell us much more than a call to free or perhaps delete, which I should have expected.

The libxul.so stack would be the next thing to get, but if it is a previous memory corruption, we may need more investigation.  If the stack doesn't tell us much and you can give me access to the system, I can investigate.

Do these builds run OK on other systems?
Do builds from CentOS 5 run OK on this system?
Comment 50 Karl Tomlinson (ni?:karlt) 2012-09-06 19:29:27 PDT
Installing valgrind and gdb on the system, if not already there, would be helpful, too.
Comment 51 Rail Aliiev [:rail] PTO Jun 27 2012-09-07 09:02:18 PDT
Comment on attachment 658921 [details] [diff] [review]
don't clobber $PROPERTIES_FILE

http://hg.mozilla.org/build/tools/rev/358ec93eaea2
Comment 52 Rail Aliiev [:rail] PTO Jun 27 2012-09-07 12:10:27 PDT
Assigning it to catlee for a week
Comment 53 Chris AtLee [:catlee] 2012-09-07 13:26:36 PDT
Comment on attachment 658973 [details] [diff] [review]
buildbotcustom

Review of attachment 658973 [details] [diff] [review]:
-----------------------------------------------------------------

r- because I'm pretty sure the talos branch name is busted.

::: process/factory.py
@@ +326,5 @@
>                   buildsBeforeReboot=None, branchName=None, baseWorkDir='build',
>                   hashType='sha512', baseMirrorUrls=None, baseBundleUrls=None,
>                   signingServers=None, enableSigning=True, env={},
>                   balrog_api_root=None, balrog_credentials_file=None,
> +                 use_mock=False, mock_target='mozilla-f16-x86_64',

do we really want a default mock_target here? and if so, should it be -f16- ?

@@ +474,5 @@
> +                    target=self.mock_target,
> +                ))
> +                self.addStep(MockInit(
> +                            target=self.mock_target,
> +                ))

nit: use consistent indentation

@@ +990,5 @@
> +        if self.use_mock and self.mock_packages:
> +            self.addStep(MockInstall(
> +                target=self.mock_target,
> +                packages=self.mock_packages,
> +            ))

nit: add step names

@@ -1193,5 @@
> -            bldtgt = 'profiledbuild'
> -        else:
> -            bldtgt = 'build'
> -
> -        command = self.makeCmd + ['-f', 'client.mk', bldtgt,

\o/

@@ -2525,5 @@
>  
> -        if self.profiledBuild and self.branchName not in ('mozilla-1.9.1', 'mozilla-1.9.2', 'mozilla-2.0'):
> -            talosBranch = "%s-%s-pgo-talos" % (self.branchName, self.complete_platform)
> -        else:
> -            talosBranch = "%s-%s-talos" % (self.branchName, self.complete_platform)

is this right?

if branchName == "mozilla-central" and profiledBuild == True, then we want "mozilla-central-win32-pgo-talos" don't we?

@@ -3395,4 @@
>           name='repack_installers_pretty',
>           description=['repack', 'installers', 'pretty'],
> -         command=['sh', '-c',
> -                  WithProperties('make installers-%(locale)s LOCALE_MERGEDIR=$PWD/merged')],

any idea why this used to be 'sh -c ...' instead of a raw 'make installers'?

does $PWD get expanded properly?

@@ -3598,4 @@
>              name='make_bsdiff',
>              command=['sh', '-c',
>                       'if [ ! -e dist/host/bin/mbsdiff ]; then ' +
> -                     'make tier_base; make tier_nspr; make -C config;' +

tier_nspr isn't needed?

@@ +6181,5 @@
>      def __init__(self, scriptRepo, scriptName, cwd=None, interpreter=None,
>              extra_data=None, extra_args=None,
>              script_timeout=1200, script_maxtime=None, log_eval_func=None,
> +            reboot_command=None, hg_bin='hg', platform=None,
> +            use_mock=False, mock_target='mozilla-f16-x86_64',

same question/resolution here for default mock_target value.

@@ +6290,5 @@
> +
> +            self.addStep(MockInstall(
> +                target=self.mock_target,
> +                packages=self.mock_packages,
> +            ))

nit: step names

::: steps/mock.py
@@ +61,5 @@
> +            target=target,
> +        )
> +
> +    def set_mock_command(self):
> +        #This variable is used to decide whether to wrap the

you're leaving me hanging here...
Comment 54 Chris AtLee [:catlee] 2012-09-12 10:47:13 PDT
Ok, leak tests are now running - onto the next hurdle!

compare leaks doesn't work on the 32-bit slaves...they probably need to run under mock as well
Comment 55 Chris AtLee [:catlee] 2012-09-12 11:49:48 PDT
Created attachment 660532 [details] [diff] [review]
mockify compare leak steps
Comment 56 Chris AtLee [:catlee] 2012-09-12 13:22:42 PDT
Back to John to get us through the final stretch!
Comment 57 John Hopkins (:jhopkins) 2012-09-14 09:06:25 PDT
Created attachment 661244 [details] [diff] [review]
buildbot-configs
Comment 58 John Hopkins (:jhopkins) 2012-09-14 09:07:22 PDT
Created attachment 661245 [details] [diff] [review]
buildbotcustom
Comment 59 John Hopkins (:jhopkins) 2012-09-18 12:34:30 PDT
Created attachment 662271 [details] [diff] [review]
buildbot-configs follow-up patch to includ /tools/buildbot/bin in PATH for linux32/64 builds

This patch is needed to ensure that 'buildbot' is in the path for the sendchange step.

Before patch: http://dev-master01.build.scl1.mozilla.com:8033/builders/Linux%20mozilla-central%20build/builds/1
After patch: http://dev-master01.build.scl1.mozilla.com:8033/builders/Linux%20mozilla-central%20build/builds/2
Comment 60 Mike Hommey [:glandium] 2012-09-19 06:41:09 PDT
For the record, you don't have to wait for bug 772563 before actually switching to mock. The nightlies can be broken on (really) old distros for a couple days, it's not a problem.
Comment 61 John Hopkins (:jhopkins) 2012-09-19 06:46:10 PDT
Thanks Mike.

I'm currently going through test builds and looking at why some have warnings.  We'd like to deploy this week if possible.
Comment 62 John Hopkins (:jhopkins) 2012-09-19 08:31:28 PDT
Created attachment 662573 [details] [diff] [review]
buildbot-configs follow-up patch to include /tools/buildbot/bin in PATH for linux32/64 builds + debug
Comment 63 Chris AtLee [:catlee] 2012-09-19 08:35:43 PDT
Comment on attachment 662573 [details] [diff] [review]
buildbot-configs follow-up patch to include /tools/buildbot/bin in PATH for linux32/64 builds + debug

Review of attachment 662573 [details] [diff] [review]:
-----------------------------------------------------------------

nit: /usr/local/bin is listed twice
r+ with that cleaned up
Comment 64 Nick Thomas [:nthomas] 2012-09-19 17:53:52 PDT
NB: When we enable Thunderbird builds on mock slaves we need a patch like the one on bug 772563 to block updates to deprecated platforms.
Comment 65 John Hopkins (:jhopkins) 2012-09-20 06:41:30 PDT
Comment on attachment 661244 [details] [diff] [review]
buildbot-configs

Landed in https://hg.mozilla.org/build/buildbot-configs/rev/18d1ebb7084a
Comment 66 John Hopkins (:jhopkins) 2012-09-20 06:41:56 PDT
Comment on attachment 661245 [details] [diff] [review]
buildbotcustom

Landed in http://hg.mozilla.org/build/buildbotcustom/rev/a1389c53fede
Comment 67 Axel Hecht [:Pike] 2012-09-21 07:15:10 PDT
(In reply to Chris AtLee [:catlee] from comment #53)
> Comment on attachment 658973 [details] [diff] [review]
> buildbotcustom
> 
> Review of attachment 658973 [details] [diff] [review]:
<..>
> 
> @@ -3395,4 @@
> >           name='repack_installers_pretty',
> >           description=['repack', 'installers', 'pretty'],
> > -         command=['sh', '-c',
> > -                  WithProperties('make installers-%(locale)s LOCALE_MERGEDIR=$PWD/merged')],
> 
> any idea why this used to be 'sh -c ...' instead of a raw 'make installers'?
> 
> does $PWD get expanded properly?


Apparently not, see bug 793088

Note You need to log in before you can comment on or make changes to this bug.