Closed Bug 772446 Opened 12 years ago Closed 11 years ago

Migrate desktop linux32, linux64 firefox builds to mock slaves (and migrate Firefox Linux builders to centos6)

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: jhopkins)

References

Details

Attachments

(7 files, 16 obsolete files)

1.65 KB, patch
rail
: review+
Details | Diff | Splinter Review
964 bytes, patch
catlee
: review+
Details | Diff | Splinter Review
929 bytes, patch
catlee
: review+
Details | Diff | Splinter Review
70.30 KB, text/html
Details
44.81 KB, patch
catlee
: review+
jhopkins
: checked-in+
Details | Diff | Splinter Review
109.60 KB, patch
catlee
: review+
jhopkins
: checked-in+
Details | Diff | Splinter Review
5.35 KB, patch
catlee
: review+
jhopkins
: checked-in+
Details | Diff | Splinter Review
      No description provided.
Bug 670707 is the one I was talking about on Monday when I mentioned that there were *lots* of requested updates to the linux platforms. Should those dependencies be moved here instead?

Since we're going to be moving to CentOS 6 (or 6.2) on AWS, we may get many of these for free, but since we're essentially revving the ref platform here, we should make sure the outstanding requests are incorporated.
Blocks: 670707
Summary: Migrate desktop firefox builds to mock slaves → Migrate desktop linux firefox builds to mock slaves
Depends on: 773379
> Since we're going to be moving to CentOS 6 (or 6.2) on AWS, we may get many
> of these for free, but since we're essentially revving the ref platform
> here, we should make sure the outstanding requests are incorporated.

Are we getting python upgrades to 2.7.x as well? (build machines run fuzzers and 2.7.x does help the harness, which is running python 2.5.1 and has a series of ugly hacks)
Gary: Releng do want to move to Python 2.7 across all platforms but this will take some time.  Linux will get it first, and I can confirm that the mock work is using Python 2.7.
(In reply to John Hopkins (:jhopkins) from comment #4)
> Gary: Releng do want to move to Python 2.7 across all platforms but this
> will take some time.  Linux will get it first, and I can confirm that the
> mock work is using Python 2.7.

That is great news. Thanks for the update John!
Attached patch buildbot-configs wip (obsolete) — Splinter Review
Attached patch buildbotcustom wip (obsolete) — Splinter Review
Comment on attachment 645073 [details] [diff] [review]
buildbotcustom wip

I've added an "addMockCommand" wrapper which is used a little differently than the MockCommand class.  In my case, I used it to wrap the MozillaCheck class rather than writing a new MockMozillaCheck class from scratch.  We should be able to obsolete the original MockCommand class if this method is preferred.
Attachment #645073 - Flags: feedback?(catlee)
Attachment #645072 - Flags: feedback?(catlee)
Comment on attachment 645073 [details] [diff] [review]
buildbotcustom wip

Review of attachment 645073 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! Have you tested that the dep/nightly l10n jobs still work ok?

::: steps/unittest.py
@@ +730,5 @@
> +        def start(self):
> +            self.set_mock_command()
> +            self.super_class.start(self)
> +
> +    return C

I think it would make sense to put this function in some other module, since it's not specific to unit tests.
Attachment #645073 - Flags: feedback?(catlee) → feedback+
Comment on attachment 645072 [details] [diff] [review]
buildbot-configs wip

Review of attachment 645072 [details] [diff] [review]:
-----------------------------------------------------------------

can you re-submit this patch against an unmodified buildbot-configs? It looks like there's some earlier work below that's not showing up in the diff.
Attachment #645072 - Flags: feedback?(catlee)
Depends on: 772563
Summary: Migrate desktop linux firefox builds to mock slaves → Migrate desktop linux32, linux64 firefox builds to mock slaves
Blocks: 662417
Attached patch puppet config (obsolete) — Splinter Review
puppet config patch from /etc/puppet/environments/jhopkins/env on releng-puppet1.
Attachment #654592 - Flags: review?(rail)
I've built green Linux mozilla-central on-change and nightly+l10n mock builds in staging using the buildbot-configs/buildbotcustom patches at http://hg.mozilla.org/users/jhopkins_mozilla.com/

I started verifying that non-Linux platforms still build ok but have hit a couple of issues that I'm working on today.
Comment on attachment 654592 [details] [diff] [review]
puppet config

Can you move package definitions from toplevel::slave::build::xvfb to packages::xvfb and include it *inside* of toplevel::slave::build::xvfb class. Also, move "include packages::supervisord" inside toplevel::slave::build::xvfb class.

packages::xvfb should be a meta package, which contains all needed packages (["xorg-x11-server-Xvfb", "xorg-x11-xauth"]) and checks for operating system (like other packages).
Attachment #654592 - Flags: review?(rail) → review-
Attached patch buildbot-configs wip (obsolete) — Splinter Review
Attachment #645072 - Attachment is obsolete: true
Attachment #654723 - Flags: feedback?(rail)
Attached patch buildbotcustom wip (obsolete) — Splinter Review
Attachment #645073 - Attachment is obsolete: true
Attachment #654725 - Flags: feedback?(rail)
Comment on attachment 654723 [details] [diff] [review]
buildbot-configs wip

Review of attachment 654723 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +1094,5 @@
> +BRANCHES['mozilla-central']['platforms']['linux']['use_mock'] = True
> +BRANCHES['mozilla-central']['platforms']['linux']['mock_target'] = 'mozilla-centos6-i386'
> +BRANCHES['mozilla-central']['platforms']['linux']['slaves'] = SLAVES['mock']
> +BRANCHES['mozilla-central']['platforms']['linux']['l10n_slaves_key'] = 'mock'
> +BRANCHES['mozilla-central']['platforms']['linux']['mock_packages'] = \

I think, you can move mock_target and mock_packages to PLATFORM_VARS. They won't be used unless use set use_mock.

@@ +1109,5 @@
> +                                'gcc45_0moz3', 'yasm', 'ccache', # <-- from releng repo
> +                                ]
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PATH'] = '/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin:/home/cltbld/bin'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON26'] = None
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON27'] = '/tools/python27/bin/python2.7'

We should rename this variable to something more sane. ATM, we use PYTHON26 when we need to run non default python. Something like PYTHON_BIN or just PYTHON will be better.

@@ +1116,5 @@
> +BRANCHES['mozilla-central']['platforms']['linux64']['use_mock'] = True
> +BRANCHES['mozilla-central']['platforms']['linux64']['mock_target'] = 'mozilla-centos6-x86_64'
> +BRANCHES['mozilla-central']['platforms']['linux64']['slaves'] = SLAVES['mock']
> +BRANCHES['mozilla-central']['platforms']['linux64']['l10n_slaves_key'] = 'mock'
> +BRANCHES['mozilla-central']['platforms']['linux64']['mock_packages'] = \

The same here.

@@ +1131,5 @@
> +                                'gcc45_0moz3', 'yasm', 'ccache', # <-- from releng repo
> +                                ]
> +BRANCHES['mozilla-central']['platforms']['linux64']['env']['PATH'] = '/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin:/home/cltbld/bin'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON26'] = None
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON27'] = '/tools/python27/bin/python2.7'

The same here.
Attachment #654723 - Flags: feedback?(rail) → feedback+
Comment on attachment 654725 [details] [diff] [review]
buildbotcustom wip

Review of attachment 654725 [details] [diff] [review]:
-----------------------------------------------------------------

Woot! Looks good in overall see my comments inline.

::: process/factory.py
@@ +967,5 @@
> +        ))
> +        self.addStep(MockInit(
> +            target=self.mock_target,
> +        ))
> +        # XXX: Hardcoding these directories here sucks.

Agree. Maybe it'd be better to move them to configs? Something like

1) add them to configs:

GLOBAL_VARS = {
...
 'mock_copyin_files': [('/home/cltbld/.ssh', '/home/mock_mozilla/.ssh'), ('/home/cltbld/.android', '/builds/.android'), ...],
...
}

2) Pass mock_copyin_files to the corresponding factories in misc.py

3) Add mock_copyin_files for MozillaBuildFactory/MercurialBuildFactory __init__

4) Loop wrapping the values with WithProperties to be more flexible:
for source, target in self.mock_copyin_files:
  self.addStep(ShellCommand(
		command=['mock_mozilla', '-r', self.mock_target, '--copyin', WithProperties(source), WithProperties(target)],
		haltOnFailure=True,
		))

@@ +3731,3 @@
>              name='get_build_id',
>              workdir=self.absMozillaSrcDir,
> +            extract_fn=parse_build_id,

Hmm, why do you need a parsing function here?

@@ +3749,5 @@
>                           WithProperties('%(inipath)s'),
>                           'App', 'Name'],
>                  property='appName',
>                  name='get_app_name',
> +                workdir=WithProperties('%(basedir)s/' + self.absMozillaSrcDir),

Did it work for Windows? Why you didn't use MockProperty here like in get_build_id?

::: steps/mock.py
@@ +1,1 @@
> +# ***** BEGIN LICENSE BLOCK *****

Assuming that this files contains just copy/pasted classes, I skipped this file. If you changed something here I can review it as well.

Please make sure that you removed trailing spaces.
Attachment #654725 - Flags: feedback?(rail) → feedback+
Attached patch proposed buildbot-configs patch (obsolete) — Splinter Review
Attachment #654723 - Attachment is obsolete: true
Attachment #655065 - Flags: review?(bhearsum)
Attached patch proposed buildbotcustom patch (obsolete) — Splinter Review
Attachment #654725 - Attachment is obsolete: true
Attachment #655068 - Flags: review?(bhearsum)
The above patches enable linux/linux64 mock builds on mozilla-central, try, and all project branches.

Staging test results so far:

| Linux mozilla-central nightly                | OK |
| Firefox mozilla-central linux l10n nightly   | fail make_libmar |
| WINNT 5.2 mozilla-central nightly            | OK |
| Firefox mozilla-central win32 l10n nightly   | did not run |
| OS X 10.7 mozilla-central nightly            | OK |
| Firefox mozilla-central macosx64 l10n nightly| fail make_libmar |
| TB Linux comm-central build (non-mock)       | OK |
Attached patch puppet config v2 (obsolete) — Splinter Review
Attachment #654592 - Attachment is obsolete: true
Attachment #655077 - Flags: review?(jhopkins)
Comment on attachment 655065 [details] [diff] [review]
proposed buildbot-configs patch

Review of attachment 655065 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +1122,5 @@
> +BRANCHES['mozilla-central']['platforms']['linux']['use_mock'] = True
> +BRANCHES['mozilla-central']['platforms']['linux']['slaves'] = SLAVES['mock']
> +BRANCHES['mozilla-central']['platforms']['linux']['l10n_slaves_key'] = 'mock'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PATH'] = '/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin::/tools/git/bin/:/tools/python27/bin/:/tools/python27-mercurial/bin:/home/cltbld/bin'
> +BRANCHES['mozilla-central']['platforms']['linux']['env']['PYTHON26'] = None

All of this should be done in the PLATFORMS dictionary - not overridden down here. It is supposed to be the "default" as much as possible. Override aurora, beta, release, and esr10 to the values they need. Once you do that, you don't need to add anything at all to the project branch interpolation loop, and only need to override 'slaves' for try (because use_mock will automatically get set).

This patch looks fine otherwise.
Attachment #655065 - Flags: review?(bhearsum) → review-
Comment on attachment 655077 [details] [diff] [review]
puppet config v2

Throwing Callek's way to double check if the patch made in puppetagain style.
Attachment #655077 - Flags: review?(jhopkins) → review?(bugspam.Callek)
Comment on attachment 655077 [details] [diff] [review]
puppet config v2

Review of attachment 655077 [details] [diff] [review]:
-----------------------------------------------------------------

None of my comments block this landing, so provided you test[ed] it somewhere, you're good to go on the puppet side.

::: modules/mockbuild/manifests/service.pp
@@ +1,1 @@
> +class mockbuild::service {

nit: name it services since the current naming makes me thing we're setting "mockbuild" itself up as a service on the system.

::: modules/packages/manifests/xvfb.pp
@@ +1,1 @@
> +class packages::xvfb {

(question, not nit) would this make sense in the existing packages::x_libs or is it better as its own thing?
Attachment #655077 - Flags: review?(bugspam.Callek) → review+
(In reply to Justin Wood (:Callek) from comment #25)
> nit: name it services since the current naming makes me thing we're setting
> "mockbuild" itself up as a service on the system.

Agreed.
 
> (question, not nit) would this make sense in the existing packages::x_libs
> or is it better as its own thing?

I'd prefer to have this package separately because unlike x_libs it runs as an X server.
Attached patch puppet config v3 (obsolete) — Splinter Review
s/service/services/g
carrying on r+ from Callek
Attachment #655077 - Attachment is obsolete: true
Attachment #655570 - Flags: review+
Comment on attachment 655068 [details] [diff] [review]
proposed buildbotcustom patch

Review of attachment 655068 [details] [diff] [review]:
-----------------------------------------------------------------

Still need some fixes before it lands. I'll attach a patch for mock.py as a separate file.

::: process/factory.py
@@ +719,1 @@
>                   mock_packages=[],

mock_packages=None is better here

@@ +957,5 @@
> +    def addMockSteps(self):
> +        self.addStep(MockInit(
> +            target=self.mock_target,
> +        ))
> +        # XXX: Hardcoding these directories here sucks.

See comment#18

@@ +2957,5 @@
>                   mergeLocales=True, mozconfigBranch="production",
>                   testPrettyNames=False,
>                   callClientPy=False,
>                   clientPyConfig=None,
> +                 use_mock=False, mock_target='mozilla-f16-x86_64', mock_packages=[],

Setting default value to [] is dangerous, please use mock_packages=None

@@ +3159,5 @@
>          ))
>          # WinCE is the only platform that will do repackages with
>          # a mozconfig for now. This will be fixed in bug 518359
>          # For backward compatibility where there is no mozconfig
> +        self.addStep(MockCommand( **self.processCommand(

A nit, please remove an extra whitespace before**.

@@ +3175,2 @@
>          )))
> +        self.addStep(MockCommand( **self.processCommand(

The same here.
Attachment #655068 - Flags: review?(bhearsum) → review-
Attached patch mock.py diff (obsolete) — Splinter Review
Mostly a cleanup up to pass pep8 and most of pylint errors/warnings.

the only significant change is how mock_args gets its parameters. Instead of using a list (it's dangerous!) I'd use None and check it like this:
self.mock_args = mock_args or ['--unpriv']

The patch should be applied on top of the existing one.
(In reply to Rail Aliiev [:rail] from comment #28)
> Comment on attachment 655068 [details] [diff] [review]
> proposed buildbotcustom patch

Aside from the whitespace nit, those comments all refer to existing code.  I either moved the code around or copied the style in each case.

Why is [] dangerous?
(In reply to John Hopkins (:jhopkins) from comment #30)
> (In reply to Rail Aliiev [:rail] from comment #28)
> > Comment on attachment 655068 [details] [diff] [review]
> > proposed buildbotcustom patch
> 
> Aside from the whitespace nit, those comments all refer to existing code.  I
> either moved the code around or copied the style in each case.

Yeah, but once we touch that code, let's review it! :)
 
> Why is [] dangerous?


def foo(a=[]):
    a.append(5)
    return a

>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]

Even though you don't use append() ATM, someone may use it in the future.
Depends on: 786088
DONE:
- mock_packages=[] changed to mock_packages=None (rail's feedback)
- steps/mock.py patch applied (rail's patch)
- whitespace nit fixed (rail's feedback)
- use_mock and slaves key made default (bhearsum's feedback)
 - thunderbird-specific
  - set use_mock=False in thunderbird_config.py
  - thunderbird slaves already set to non-mock slaves
  - will revisit thunderbird when i have time to test.  that's the only reason for not doing it now.

TODO:
- address remaining bhearsum's configs feedback: https://bugzilla.mozilla.org/show_bug.cgi?id=772446#c23
 - remaining work:
  - l10n_slaves_key, PATH env, PYTHON26 env
- rail's feedback on hard-coded filenames in https://bugzilla.mozilla.org/show_bug.cgi?id=772446#c18
- do another round of nightly test builds (all desktop platforms) with all the above changes
Releng: The changes made to address the above are pushed to my user repos (in the 'mock' branch):

http://hg.mozilla.org/users/jhopkins_mozilla.com/buildbot-configs/
http://hg.mozilla.org/users/jhopkins_mozilla.com/buildbotcustom/

staging master is http://dev-master01.build.scl1.mozilla.com:8900/ in /builds/buildbot/jhopkins/build1/  Feel free to use it to continue this work while I'm away.
s/packages::supervisord/packages::mozilla::supervisor/
carry over r=dustin
Attachment #657426 - Flags: review+
Attachment #655570 - Attachment is obsolete: true
Reassigning to Rail who is continuing this work.
Assignee: jhopkins → rail
Attached patch configs (obsolete) — Splinter Review
Comments incoming
Attachment #655065 - Attachment is obsolete: true
Attachment #658092 - Flags: review?(catlee)
Attached patch buildbotcustom (obsolete) — Splinter Review
comments incoming
Attachment #655068 - Attachment is obsolete: true
Attachment #655636 - Attachment is obsolete: true
Attachment #658098 - Flags: review?(catlee)
Comment on attachment 658092 [details] [diff] [review]
configs

Review of attachment 658092 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +99,5 @@
> +    'mock_copyin_files': [
> +        ('/home/cltbld/.ssh', '/home/mock_mozilla/.ssh'),
> +        ('/home/cltbld/.android', '/builds/.android'),
> +        ('/home/cltbld/.mozpass.cfg', '/builds/.mozpass.cfg'),
> +    ],

I put this variable to GLOBAL_VARS to avoid boiler plate code, even thought this variable sounds more like a platform variable.

@@ +171,5 @@
> +                        'mpfr', # required for system compiler
> +                        'xorg-x11-font*', # fonts required for PGO
> +                        'imake', # required for makedepend!?!
> +                        'gcc45_0moz3', 'yasm', 'ccache', # <-- from releng repo
> +                        'valgrind', 'valgrind-devel',

Had to add valgrind packages to make the builder happy. The build itself is red (th same as in production).

@@ +1172,5 @@
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['slaves'] = SLAVES['linux']
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['env']['SYMBOL_SERVER_SSH_KEY'] = "/home/cltbld/.ssh/ffxbld_dsa"
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['env']['PATH'] = "/tools/jdk6/bin:/opt/local/bin:/tools/python/bin:/tools/buildbot/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/"
> +BRANCHES['mozilla-release']['platforms']['android-armv6']['env']['PYTHON26'] = "/tools/python-2.6.5/bin/python"
> +# mock disabled block stop

A lot of noise to avoid evil loops.

@@ -1413,5 @@
> -        BRANCHES[b]['platforms'][p]['env']['SYMBOL_SERVER_SSH_KEY'] = "/home/cltbld/.ssh/ffxbld_dsa"
> -        BRANCHES[b]['platforms'][p]['env']['PATH'] = "/tools/jdk6/bin:/opt/local/bin:/tools/python/bin:/tools/buildbot/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/"
> -        del BRANCHES[b]['platforms'][p]['use_mock']
> -        del BRANCHES[b]['platforms'][p]['mock_target']
> -        del BRANCHES[b]['platforms'][p]['mock_packages']

this block is flattened above
Comment on attachment 658098 [details] [diff] [review]
buildbotcustom

Review of attachment 658098 [details] [diff] [review]:
-----------------------------------------------------------------

::: misc.py
@@ +1661,5 @@
> +                'category': name,
> +                'nextSlave': _nextSlowSlave,
> +                'properties': {'branch': name, 'platform': platform, 'slavebuilddir': reallyShort('%s-%s-xulrunner' % (name, platform)), 'product': 'xulrunner'},
> +            }
> +            branchObjects['builders'].append(mozilla2_xulrunner_builder)

A lot of noise here because of fixed indentation.

::: process/factory.py
@@ +1194,5 @@
>      def addDoBuildSteps(self):
>          workdir=WithProperties('%(basedir)s/build')
>          if self.platform.startswith('win'):
>              workdir="build"
> +        command = self.makeCmd + ['-f', 'client.mk', 'build',

Unrelated cleanup!

@@ +2568,5 @@
>                  log_eval_func=lambda c,s: regex_log_evaluator(c, s, upload_errors),
>                  locks=[upload_lock.access('counting')],
>              ))
>  
> +        talosBranch = "%s-%s-talos" % (self.branchName, self.complete_platform)

The same here

@@ +6197,2 @@
>          self.get_basedir_cmd = ['bash', '-c', 'pwd']
> +        self.env = env

Had to add "env" member to pass it to mock command

::: process/release.py
@@ +1105,5 @@
>              useBetaChannelForRelease=releaseConfig.get('useBetaChannelForRelease', False),
>              signingServers=getSigningServers('linux'),
>              useChecksums=releaseConfig.get('enablePartialMarsAtBuildTime', True),
>              mozRepoPath=moz_repo_path,
> +            python=branchConfig['platforms']['linux']['env'].get('PYTHON26', 'python'),

Don't depend on this variable which is absent on mock slaves.

::: steps/mock.py
@@ +99,5 @@
> +        self.command = [self.mock_login, '-v', '-r', self.target,
> +                        '--cwd', WithProperties(mock_workdir)] + \
> +            self.mock_args + ['--shell'] + \
> +            [WithProperties('/usr/bin/env %s %s' % (environment,
> +                                                    string_command))]

This file is mostly a copy/paste from steps/misc.py. The only significant change is how we pass the final command to ShellCommand.

In the past we used to pass the whole command as a string, what caused at least 2 problems:

1) mock_workwir can't contain spaces. Not a n issue anymore since we pass this parameter as a separate list item, so ShellCommand passes it to the final command as a separate argument, avoiding shell.

2) there was no way to execute complex shell commands (sh -c "...", especially with && and ; inside). Now shell command gets the last argument as a quoted (with undocumented :/ pipes.quote()) command prefixed with /usr/bin/env. Unfortunately it's not so easy to preserve the environment, since consolehelper and mock clean it up.

I also added -v to make mock more verbose.
= Summary =

== Working builds (m-c) ==

* linux/linux64/android/armv6 builds/pgo builds/nightly builds
* linux/linux64 l10n dep and nightly repacks
* win/mac builds (their factories are not affected by mock)

== Failing builds ==
* linux/linux64 debug fail during second AliveTest:
 *** glibc detected *** /builds/slave/m-cen-lnx-dbg/build/obj-firefox/dist/bin/firefox-bin: corrupted double-linked list: 0x0b57c8a8 ***
* xulrunner - the same error as in production
* valgrind - the same error as in production


== Not tested ==
* PROJECT builds
* mobile l10n: http://hg.mozilla.org/build/buildbotcustom/file/08ab818fd5a1/misc.py#l1167 
* blocklist update
Attachment #658086 - Flags: review?(catlee) → review+
Comment on attachment 658086 [details] [diff] [review]
tools: python failover for valgrind

http://hg.mozilla.org/build/tools/rev/261314a760ee
Attachment #658086 - Flags: checked-in+
Valgrind build failed:

+ python /builds/slave/m-cen-lnx-valgrind/scripts/clobberer/clobberer.py -s scripts -s /builds/slave/m-cen-lnx-valgrind/buildprops.json http://clobberer.pvt.build.mozilla.org/index.php mozilla-central 'Linux mozilla-central valgrind' m-cen-lnx-valgrind bld-centos6-hp-002 http://dev-master01.build.scl1.mozilla.com:8033/
Checking clobber URL: http://clobberer.pvt.build.mozilla.org/index.php?master=http%3A%2F%2Fdev-master01.build.scl1.mozilla.com%3A8033%2F&slave=bld-centos6-hp-002&builddir=m-cen-lnx-valgrind&branch=mozilla-central&buildername=Linux+mozilla-central+valgrind
m-cen-lnx-valgrind:Our last clobber date:  2012-09-01 00:47:20
m-cen-lnx-valgrind:Server clobber date:    2012-09-01 22:00:19
m-cen-lnx-valgrind:Server is forcing a clobber, initiated by Callek@gmail.com
m-cen-lnx-valgrind:Clobbering...
Removing objdir/
Removing buildprops.json
Removing build/
Skipping last-clobber
Removing src/
Skipping scripts
+ cd /builds/slave/m-cen-lnx-valgrind/scripts/..
+ python /builds/slave/m-cen-lnx-valgrind/scripts/buildfarm/maintenance/purge_builds.py -s 4 -n info -n 'rel-*' -n m-cen-lnx-valgrind
187.61 GB of space available
+ python /builds/slave/m-cen-lnx-valgrind/scripts/buildfarm/utils/hgtool.py --rev c64a9f342156 --tbox http://hg.mozilla.org/mozilla-central src
Traceback (most recent call last):
  File "/builds/slave/m-cen-lnx-valgrind/scripts/buildfarm/utils/hgtool.py", line 72, in <module>
    js = json.load(open(options.propsfile))
IOError: [Errno 2] No such file or directory: '/builds/slave/m-cen-lnx-valgrind/buildprops.json'
+ exit 2
Attachment #658921 - Flags: review?(catlee)
Attached patch configs (obsolete) — Splinter Review
Refreshed configs with mock_copyin_files moved to PLATFORMS.
Attachment #658092 - Attachment is obsolete: true
Attachment #658092 - Flags: review?(catlee)
Attachment #658972 - Flags: review?(catlee)
Attached patch buildbotcustom (obsolete) — Splinter Review
Refreshed buildbotcustom with mock_copyin_files moved to PLATFORMS.
Attachment #658098 - Attachment is obsolete: true
Attachment #658098 - Flags: review?(catlee)
Attachment #658973 - Flags: review?(catlee)
Attached file stdio of alive tests 1
ATM, leak test on linux/linux64 turns orange because of failed alive test 1 (0 based). This is our major stopper here...
> *** glibc detected *** /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/firefox-bin: corrupted double-linked list: 0x0000000001cffaa0 ***
> ======= Backtrace: =========
> /lib64/libc.so.6(+0x75366)[0x7fb28c772366]
> /lib64/libc.so.6(+0x78181)[0x7fb28c775181]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x180b35c)[0x7fb28929135c]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x180be1c)[0x7fb289291e1c]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x17c2e8a)[0x7fb289248e8a]
> /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so(+0x180bf55)[0x7fb289291f55]
> /usr/lib64/libX11.so.6(XCloseDisplay+0xa2)[0x7fb2866dbc82]

> TEST-UNEXPECTED-FAIL | automation.py | Exited with code 1 during test run

Are we meant to get a stack trace from our crash reporter or similar here?

I'd expect glibc to raise SIGABRT, which would normally trigger our crash
reporter.  Is it actually the firefox process exiting with exit code 1 or some
parent process?
Exit code 1 is not what I'd expect from SIGABRT.

If libxul.so is not stripped then

addr2line -if -e /builds/slave/m-cen-lnx64-dbg/build/obj-firefox/dist/bin/libxul.so 0x180b35c

etc, will be helpful.

I fear it might be stripped and symbols in a special format which might need our
own tools, but I don't know what they are.  It may be easier to persuade the
build not to strip.

These could be helpful, whatever:

addr2line -if -e /usr/lib/debug/lib64/libc-2.11.so.debug 0x75366
addr2line -if -e /usr/lib/debug/lib64/libc-2.11.so.debug 0x78181
I picked glibc-debuginfo-2.12-1.80.el6_3.5.x86_64.
Don't know if that is the right one, but it looks close enough.

% addr2line -if -e usr/lib/debug/lib64/libc-2.12.so.debug 0x75366 
ptmalloc_lock_all
/usr/src/debug/glibc-2.12-2-gc4ccff1/malloc/arena.c:279
% addr2line -if -e usr/lib/debug/lib64/libc-2.12.so.debug 0x78181 
_int_free
/usr/src/debug/glibc-2.12-2-gc4ccff1/malloc/malloc.c:4938

I had forgotten we don't use jemalloc in debug builds, so this doesn't tell us much more than a call to free or perhaps delete, which I should have expected.

The libxul.so stack would be the next thing to get, but if it is a previous memory corruption, we may need more investigation.  If the stack doesn't tell us much and you can give me access to the system, I can investigate.

Do these builds run OK on other systems?
Do builds from CentOS 5 run OK on this system?
Installing valgrind and gdb on the system, if not already there, would be helpful, too.
Attachment #658921 - Flags: review?(catlee) → review+
Assigning it to catlee for a week
Assignee: rail → catlee
Comment on attachment 658973 [details] [diff] [review]
buildbotcustom

Review of attachment 658973 [details] [diff] [review]:
-----------------------------------------------------------------

r- because I'm pretty sure the talos branch name is busted.

::: process/factory.py
@@ +326,5 @@
>                   buildsBeforeReboot=None, branchName=None, baseWorkDir='build',
>                   hashType='sha512', baseMirrorUrls=None, baseBundleUrls=None,
>                   signingServers=None, enableSigning=True, env={},
>                   balrog_api_root=None, balrog_credentials_file=None,
> +                 use_mock=False, mock_target='mozilla-f16-x86_64',

do we really want a default mock_target here? and if so, should it be -f16- ?

@@ +474,5 @@
> +                    target=self.mock_target,
> +                ))
> +                self.addStep(MockInit(
> +                            target=self.mock_target,
> +                ))

nit: use consistent indentation

@@ +990,5 @@
> +        if self.use_mock and self.mock_packages:
> +            self.addStep(MockInstall(
> +                target=self.mock_target,
> +                packages=self.mock_packages,
> +            ))

nit: add step names

@@ -1193,5 @@
> -            bldtgt = 'profiledbuild'
> -        else:
> -            bldtgt = 'build'
> -
> -        command = self.makeCmd + ['-f', 'client.mk', bldtgt,

\o/

@@ -2525,5 @@
>  
> -        if self.profiledBuild and self.branchName not in ('mozilla-1.9.1', 'mozilla-1.9.2', 'mozilla-2.0'):
> -            talosBranch = "%s-%s-pgo-talos" % (self.branchName, self.complete_platform)
> -        else:
> -            talosBranch = "%s-%s-talos" % (self.branchName, self.complete_platform)

is this right?

if branchName == "mozilla-central" and profiledBuild == True, then we want "mozilla-central-win32-pgo-talos" don't we?

@@ -3395,4 @@
>           name='repack_installers_pretty',
>           description=['repack', 'installers', 'pretty'],
> -         command=['sh', '-c',
> -                  WithProperties('make installers-%(locale)s LOCALE_MERGEDIR=$PWD/merged')],

any idea why this used to be 'sh -c ...' instead of a raw 'make installers'?

does $PWD get expanded properly?

@@ -3598,4 @@
>              name='make_bsdiff',
>              command=['sh', '-c',
>                       'if [ ! -e dist/host/bin/mbsdiff ]; then ' +
> -                     'make tier_base; make tier_nspr; make -C config;' +

tier_nspr isn't needed?

@@ +6181,5 @@
>      def __init__(self, scriptRepo, scriptName, cwd=None, interpreter=None,
>              extra_data=None, extra_args=None,
>              script_timeout=1200, script_maxtime=None, log_eval_func=None,
> +            reboot_command=None, hg_bin='hg', platform=None,
> +            use_mock=False, mock_target='mozilla-f16-x86_64',

same question/resolution here for default mock_target value.

@@ +6290,5 @@
> +
> +            self.addStep(MockInstall(
> +                target=self.mock_target,
> +                packages=self.mock_packages,
> +            ))

nit: step names

::: steps/mock.py
@@ +61,5 @@
> +            target=target,
> +        )
> +
> +    def set_mock_command(self):
> +        #This variable is used to decide whether to wrap the

you're leaving me hanging here...
Attachment #658973 - Flags: review?(catlee) → review-
Depends on: 790444
Depends on: 790506
Ok, leak tests are now running - onto the next hurdle!

compare leaks doesn't work on the 32-bit slaves...they probably need to run under mock as well
Attached patch mockify compare leak steps (obsolete) — Splinter Review
Back to John to get us through the final stretch!
Assignee: catlee → jhopkins
Attached patch buildbot-configsSplinter Review
Attachment #658972 - Attachment is obsolete: true
Attachment #658972 - Flags: review?(catlee)
Attachment #661244 - Flags: review?(catlee)
Attached patch buildbotcustomSplinter Review
Attachment #658973 - Attachment is obsolete: true
Attachment #660532 - Attachment is obsolete: true
Attachment #661245 - Flags: review?(catlee)
Attachment #661244 - Flags: review?(catlee) → review+
Attachment #661245 - Flags: review?(catlee) → review+
For the record, you don't have to wait for bug 772563 before actually switching to mock. The nightlies can be broken on (really) old distros for a couple days, it's not a problem.
Thanks Mike.

I'm currently going through test builds and looking at why some have warnings.  We'd like to deploy this week if possible.
Comment on attachment 662573 [details] [diff] [review]
buildbot-configs follow-up patch to include /tools/buildbot/bin in PATH for linux32/64 builds + debug

Review of attachment 662573 [details] [diff] [review]:
-----------------------------------------------------------------

nit: /usr/local/bin is listed twice
r+ with that cleaned up
Attachment #662573 - Flags: review?(catlee) → review+
NB: When we enable Thunderbird builds on mock slaves we need a patch like the one on bug 772563 to block updates to deprecated platforms.
Attachment #662573 - Flags: checked-in+
Depends on: 792859
Depends on: 793016
(In reply to Chris AtLee [:catlee] from comment #53)
> Comment on attachment 658973 [details] [diff] [review]
> buildbotcustom
> 
> Review of attachment 658973 [details] [diff] [review]:
<..>
> 
> @@ -3395,4 @@
> >           name='repack_installers_pretty',
> >           description=['repack', 'installers', 'pretty'],
> > -         command=['sh', '-c',
> > -                  WithProperties('make installers-%(locale)s LOCALE_MERGEDIR=$PWD/merged')],
> 
> any idea why this used to be 'sh -c ...' instead of a raw 'make installers'?
> 
> does $PWD get expanded properly?


Apparently not, see bug 793088
Blocks: 793088
Depends on: 793576
Depends on: 793217
Depends on: 793642
No longer depends on: 793642
Depends on: 793782
Depends on: 794063
Blocks: 794185
Depends on: 794339
Summary: Migrate desktop linux32, linux64 firefox builds to mock slaves → Migrate desktop linux32, linux64 firefox builds to mock slaves (and migrate Firefox Linux builders to centos6)
Blocks: 794378
Depends on: 802114
Depends on: 813039
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.