Closed Bug 858797 Opened 12 years ago Closed 10 years ago

fx desktop builds in mozharness

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Assigned: jlund)

References

Details

(Whiteboard: [mozharness])

Attachments

(13 files, 7 obsolete files)

68.77 KB, patch
mozilla
: review+
jlund
: checked-in+
Details | Diff | Splinter Review
318 bytes, patch
mozilla
: review+
Details | Diff | Splinter Review
14.52 KB, patch
Details | Diff | Splinter Review
52.35 KB, patch
mozilla
: review+
jlund
: checked-in+
Details | Diff | Splinter Review
1.66 KB, patch
mozilla
: review+
jlund
: checked-in+
Details | Diff | Splinter Review
17.90 KB, patch
Details | Diff | Splinter Review
16.49 KB, patch
Details | Diff | Splinter Review
1.85 KB, text/x-patch
Details
368.57 KB, text/x-patch
Details
16.74 KB, patch
bhearsum
: review+
jlund
: checked-in+
Details | Diff | Splinter Review
18.96 KB, patch
bhearsum
: review+
jlund
: checked-in+
Details | Diff | Splinter Review
2.12 KB, patch
catlee
: review+
Callek
: review+
jlund
: checked-in+
Details | Diff | Splinter Review
4.74 KB, patch
catlee
: review+
Details | Diff | Splinter Review
May cover Android as well.

This means m-c level repos will use mozharness to build fx desktop.
This will require porting the bulk of MercurialBuildFactory, TryBuildFactory, and NightlyBuildFactory to mozharness.
Depends on: 862018
Depends on: 862017
Product: mozilla.org → Release Engineering
Assignee: aki → jlund
Status update: 

My priorities have been with getting Talos tests to run in metro browser but I wanted to share where I am at with this. I have ~75% of the steps ported from Buildbot to Mozharn for a vanilla (not nightly, xul, asan specific, etc) Linux 32 bit build. Of that 75, 50% has been tested and ran. 50% is just after the actual compile step. 

for the actual implementation, code can be seen here: https://github.com/lundjordan/mozharness/tree/fx-desktop-builds

in particular:
https://github.com/lundjordan/mozharness/blob/fx-desktop-builds/scripts/fx_nightly_build.py
https://github.com/lundjordan/mozharness/blob/fx-desktop-builds/mozharness/mozilla/building/buildbase.py
https://github.com/lundjordan/mozharness/blob/fx-desktop-builds/configs/builds/linux_builds.py

along with some other minor changes to purge, mock mixins.
======================================= Status Update:

The following Linux Mozilla-central Builder variants *have* been ported to Mozharness:
    * Linux mozilla-central
    * Linux mozilla-central leak test
    * Linux x86-64 mozilla-central
    * Linux x86-64 mozilla-central asan
    * Linux x86-64 mozilla-central debug asan
    * Linux x86-64 mozilla-central leak test
    * Linux x86-64 mozilla-central debug static analysis

The following Linux Mozilla-central Builder variants *have not* been ported
    * Linux mozilla-central pgo-build
    * Linux mozilla-central nightly
    * Firefox mozilla-central linux l10n nightly
    * Firefox mozilla-central linux l10n dep
    * Linux mozilla-central xulrunner nightly
    * Linux x86-64 mozilla-central pgo-build
    * Linux x86-64 mozilla-central nightly
    * Firefox mozilla-central linux64 l10n nightly
    * Linux x86-64 mozilla-central valgrind
    * Firefox mozilla-central linux64 l10n dep
    * Linux x86-64 mozilla-central hsts preload update
    * Linux x86-64 mozilla-central xulrunner nightly
    * Linux x86-64 mozilla-central asan nightly
    * Linux x86-64 mozilla-central debug asan nightly

I will be publishing several blogs explaining the port process and how mozharness replaces the fx_build factories/steps in buildbot but for this update, I'll summarize:

Fx Mozharness Build logic can still be tracked here:
    https://github.com/lundjordan/mozharness/tree/fx-desktop-builds

Mozharness files:
    scripts/fx_desktop_build.py
    mozharness/mozilla/building
    ├── buildbase.py
    configs/builds
    ├── releng_base_linux_32_builds.py
    ├── releng_base_linux_64_builds.py
    └── releng_sub_linux_configs
        ├── 32_debug.py
        ├── 64_asan_and_debug.py
        ├── 64_asan.py
        ├── 64_debug.py
        └── 64_stat_and_debug.py

Explanation of above config files: releng_base_linux_64_builds.py is used for all linux 64 bit builds. Additional configs from the releng_sub_linux_configs dir can be added to this (eg: 64_asan_and_debug.py). These sub configs should only contain keys/values that replace releng_base_linux_64_builds keys/values.

Buildbot uses either ScriptFactory or SigningScriptFactory objects in buildbotcustom/misc.py to handle the invocation of 'scripts/fx_desktop_build.py'

The arguments for this mozharness script are defined in buildbot-configs/mozilla/config.py. An example of this would be:
'linux64-asan': {
    'mozharness_config': {
        'script_name': 'scripts/fx_desktop_build.py',
        'extra_args': [
            '--config', 'builds/releng_base_linux_64_builds.py',
            '--custom-build-type', 'asan',
        ],
        'reboot_command': ['scripts/external_tools/count_and_reboot.py',
                           '-f', '../reboot_count.txt','-n', '1', '-z'],
    }
}

The above '--custom-build-type' is explained by the help string:
  --custom-build-type=BUILD_TYPE
                        Sets the build type and will determine appropriate
                        additional config to use. Examples include: ['debug',
                        'asan', 'stat-and-debug', 'asan-and-debug']

More build-types will be implemented as builders are added. The motivation for this option was to remove the need to specify every config needed, thus cleaning up and simplifying the arg list. To combat against losing explicity, mozharness will be more verbose in what configs are driving the script and how 'self.config' is ultimately formed. More on that to come.

What's next?
    I see three possible options.
        1) Continue implementing the builders in the m-c branch that have yet to be ported (ie: pgo, nightly, xulrunner, etc).
        2) Start implementing Builders in other platforms (ie: mac, win)
        3) Implement the builders that have been completed in m-c, in other branches (ie: m-r, try, mozilla-esr24, etc).

Does anyone have any suggestions which would be best (or preferable if they were to do it)?

Will post today's latest logs in next comment, later tonight.
Example Logs:

Builder: Linux mozilla-central
Logs: http://people.mozilla.org/~jlund/linux64_m-c_build.1016.txt

Builder: Linux x86-64 mozilla-central asan
Logs: http://people.mozilla.org/~jlund/linux64_m-c_build_asan.1016.txt

Also, after talking with Coop, I will be implementing the PGO and Nightly variants for Linux which will cover most linux builds. Following that I will most likely start with another platform, eg: macosx
======================================= Status Update:

pgo has been implemented thus these variants have been ported:
    * Linux mozilla-central pgo-build
    * Linux x86-64 mozilla-central pgo-build

currently working on nightly builds for these variants:
    * Linux mozilla-central nightly
    * Linux x86-64 mozilla-central nightly
    * Linux x86-64 mozilla-central asan nightly
    * Linux x86-64 mozilla-central debug asan nightly

I will not be working on l10n, xulrunner, hsts, and valgrind after nightly but instead moving on to mac os x first
====================================== Status Update:

Nightly variants implemented. Among this brings branch and build env (production, preprod, and staging) support + major refactoring that included fixes to coderot, clean ups, and code reuse through additional classes.

buildbotcustom and buildbot-configs have also been updated to support only the completed variants with mozharness on a per branch basis while still building all the other non implemented ones through buildbot.

Now that a major pipeline is done for opt, pgo, and nightlies, mac and windows should only require config changes and minimal script adjustments. This will be completed for Q1 2014 while at least running in cedar/cypress.

Following that, Try and Release builds are to be implemented next.
Jordan: no update since last year. How's progress?
(In reply to Chris Cooper [:coop] from comment #6)
> Jordan: no update since last year. How's progress?

I should have been keeping this more current. I'll improve on that. 

Work on this has been put aside for the last few weeks. For metro talos, buildduty, and PTO reasons. However I have been itching away at it while I can:

I have migrated more branch/pool support. I have added some internal changes to Mozharness recently. Namely script.py and config.py. These, along with purge, signing, buildbot, and mock will be out for review by end of week.

This will allow me to then land my two main scripts and related configs.

I will be focusing on this solely for the next few weeks so progress should roll out much quicker.
Also, for the curious, I started a blog series on this: http://jordan-lund.ghost.io/first-assignment/
Depends on: 973491
Depends on: 973492
Depends on: 973493
Patches starting to trickle in to support upcoming desktop script merge (see dep bugs). 

Next up will be patch for config.py and script.py. However, I'll need nosetests before landing these. I wrote a blog post outlining the features this would add: http://jordan-lund.ghost.io/diving-into-list-config-files/

Following this, I'll be landing the main build scripts in Mozharness.
Depends on: 974777
Depends on: 978307
Depends on: 978319
Depends on: 978507
Blocks: 978510
Depends on: 978512
Status of this has been staying updated in the respective dep bugs but here is a high level road map:

- All Linux Builders have been implemented and tested on my dev master staging environment.
- All Branches have been implemented for every platform.
- Buildbot now supports Desktop Mozharness builds and Buildbot based ones: we can turn this on via a by branch, platform, and even build variant basis
- Windows is still a work in progress

Landing plan:

1) get linux platforms on a few branches using production pool environment
2) implement TryBuildFactory prior to m-c branches using linux mozharn builders
3) roll out linux builders across branches bit by bit
4) add windows to production + branch roll out
5) add mac to production (possibly depending on x-compiling work)

(3 and 4) will be worked on simultaneously.

(2) was presented as an issue today by Callek as without it, m-c and try-server results will differ since m-c would use mozharness and try would be buildbot-based. Because of this, this AFAIK should be implemented
Depends on: 989827
Depends on: 990282
https://tbpl.mozilla.org/?tree=Cedar&fromchange=4f35eade7dc6&tochange=9db662707aba - they seem a little confused by a DONTBUILD rev that lands while they are still pending on the rev before.
interesting! thanks for the catch

After looking at the logs: https://tbpl.mozilla.org/php/getParsedLog.php?id=37327684&tree=Cedar

the buildid and builduid are set correctly but the revision logic is incorrect. The revision is originally grabbed from buildbot props and that cset is correct. You can see it in properties and sourcestamp.

However it seems to be ignored by hgtool.py

IIUC hgtool.py is supposed to look for it in sourcestamp: http://mxr.mozilla.org/build/source/tools/buildfarm/utils/hgtool.py#75

but when I make the call to hgtool:
hgtool.py --bundle https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles/cedar.hg https://hg.mozilla.org/projects/cedar /builds/slave/ced-lx-00000000000000000000000/build/source

hgtool.py seems to ignore the revision and does not 'hg pull -r <rev>' or 'hg update -C -r <rev>' like it does in buildbot-based builds.

I wonder if `which hgtool.py` is an older version of hgtool.py on this machine. I think it's about time that I use a clone checkout of tools hgtool.py. This will have to be done anyway since I am not sure if hgtool.py is on every machines PATH.
status update =====

I have been on Buildduty this past week and I will be on PTO from now till April 15th.

Currently comment 12 has been addressed and Cedar is builds are running green again. I have a ported TryBuildFactory and win32 opt build support. This needs to be tested and landed first in their respective bugs attached to this tracking bug.
status update =====

I have swapped buildduty schedule with armen so I am again on buildduty. I will try to land a few patches in between so this keeps churning.
Depends on: 999905
Depends on: 999929
status update =====

bug 978507 is gaining traction. We are going to strip out all make targets called from automation, and integrate it into a proxy/script/mach call.

this will dramatically remove the majority of logic in mozharness.

There are a number of benefits to this, see Bug 978211 for details on that.

continuing trying to land current work on mozharness linux and windows builds across all branches will be put on hold due to this process change.
Depends on: 978211
I just want to commit what I have even though much of this will be ripped out. It will keep mozharness up to date and show whats needed for upcoming build config logic


- adds a working patch for porting TryBuildFactory:
* adds 'who' item for try builds
* adds query_revision method and beefs up revision look up logic so that try can clone by revision


- adds support for windows builds (win32 90% done):
* adds 'sh' to fetch_and_unpack call for window's sake
* adds windows32 config
* switches to run_command_m over run_command for non mock dep builds
* improves generate_build_stats logic and adds _count_vsize to it for win builds

- fixes bitrot against buildbot based builds:
* uses mshal's tooltool_cache changest
* adds .boto to mock files
* nightly builds should do 'make -k check'


- general improvements:
* switches to native 'hg' for cloning tools and then clones src with hgtool from tools repo
* removes unused 'download_base_url'
* combines ccache_env to env since some items were already there and it simplifies things
* switches to a list of items for post build make targes instead of booleans like   'enable_package_tests'. This removes duplication and simplifies things
* uses stage_platform now. Allowing 'platform' to stay as 'linux', 'windows', and 'mac' and 'stage_platform' now representing platform+variant
* moves old_packages list to default_config since it is the same platform wide
* removes 'mock_files' from all sub linux platform config files since it is redundant
* _pre_config_lock will now check if arguments were passed that resulted in BuildConfig
  adding more config files. Helps with explicitness.
* cat mozconfig after it's been added
* removes dep on buildbot_config in purgemixin.
Attachment #8417497 - Flags: review?(aki)
Comment on attachment 8417497 [details] [diff] [review]
140505-858797_mozharn_desktop_builds-windows_and_try-mozharness.patch

My review comments are mainly around the shared modules, since we're planning on revamping a lot of the other files soonish anyway.

>diff --git a/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py b/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py
>index bc54f09..38390b7 100644
>--- a/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py
>+++ b/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py
>@@ -3,7 +3,7 @@ MOZ_OBJDIR = 'obj-firefox'
> config = {
>     'default_actions': [
>         'clobber',
>-        'pull',
>+        'clone-tools',

I think you're cloning tools to get hgtool, but then how do you get the source?
Guessing you want to have a clone-tools then pull?

>diff --git a/mozharness/base/vcs/hgtool.py b/mozharness/base/vcs/hgtool.py
>index bd0564e..067554b 100644
>--- a/mozharness/base/vcs/hgtool.py
>+++ b/mozharness/base/vcs/hgtool.py
>@@ -46,6 +46,8 @@ class HgtoolVCS(ScriptMixin, LogMixin):
>         #  revision: revision,
>         #  ssh_username: ssh_username,
>         #  ssh_key: ssh_key,
>+        #  clone_by_revision: clone_by_revision,
>+        #  clone_with_purge: clone_with_purge,
>         # }
>         self.vcs_config = vcs_config
>         self.hgtool = self.query_exe('hgtool.py', return_type='list')
>@@ -65,11 +67,18 @@ class HgtoolVCS(ScriptMixin, LogMixin):
>         revision = c.get('revision')
>         branch = c.get('branch')
>         share_base = c.get('vcs_share_base', os.environ.get("HG_SHARE_BASE_DIR", None))
>+        clone_by_rev = c.get('clone_by_revision')
>+        clone_with_purge = c.get('clone_with_purge')
>         env = {'PATH': os.environ.get('PATH')}
>+        if c.get('env'):
>+            env.update(c['env'])

This is probably fine.  Sometimes 'env' is for a specific [non-hgtool] step, which isn't ideal here.  Maybe we should make self.config['env'] be the env for the entire script, and any specialized envs are in other config keys.

>--- a/mozharness/mozilla/purge.py
>+++ b/mozharness/mozilla/purge.py
>-    purge_tool = os.path.join(external_tools_path, 'purge_builds.py')
>-    clobber_tool = os.path.join(external_tools_path, 'clobberer.py')
>-
>+    purge_tool = [
>+        'python', os.path.join(external_tools_path, 'purge_builds.py')
>+    ]
>+    clobber_tool = [
>+        'python', os.path.join(external_tools_path, 'clobberer.py')
>+    ]

What's the rationale here?
purge_builds.py has #!/usr/bin/env python; it should do the right thing.  Is it broken on Windows?
clobberer.py has /usr/bin/python, which is wrong; I'd like that fixed to /usr/bin/env python.

This probably works, unless the correct python isn't in PATH, in which case we need to fall back to either self.query_exe('python') or self.query_python_path(), depending on whether we have a virtualenv.

Ideally /usr/bin/env python works everywhere, and we fix clobberer.py.  If there's another reason, this probably doesn't hurt.
Attachment #8417497 - Flags: review?(aki) → review+
(In reply to Aki Sasaki [:aki] from comment #17)
> Comment on attachment 8417497 [details] [diff] [review]
> 140505-858797_mozharn_desktop_builds-windows_and_try-mozharness.patch
> 
> My review comments are mainly around the shared modules, since we're
> planning on revamping a lot of the other files soonish anyway.

in retrospect a lot of my issues with windows involved PATH / env. Previously I copied: 
    'PATH': "${MOZILLABUILD}nsis-2.46u;${MOZILLABUILD}python27;${MOZILLABUILD}buildbotve\\scripts;${PATH}",

from our win32 config in mozilla/config.py. This works fine for Buildbot as I guess it uses a bash like shell where as Mozharness just uses straight up python.

I fixed this error after I made purge.py and hgtool.py changes that are in this patch. I will re-visit these changes and see if we still need them once we get back to windows builds need. For now, I am only going to apply non-shared files.

> >diff --git a/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py b/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py
> >index bc54f09..38390b7 100644
> >--- a/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py
> >+++ b/configs/builds/releng_sub_linux_configs/64_asan_and_debug.py
> >@@ -3,7 +3,7 @@ MOZ_OBJDIR = 'obj-firefox'
> > config = {
> >     'default_actions': [
> >         'clobber',
> >-        'pull',
> >+        'clone-tools',
> 
> I think you're cloning tools to get hgtool, but then how do you get the
> source?
> Guessing you want to have a clone-tools then pull?
>

in the preflight_ 'build' action I clone src: http://hg.mozilla.org/build/mozharness/file/4f2efaa47950/mozharness/mozilla/building/buildbase.py#l1337

I could make _checkout_src() its own action?

IIRC I created clone_tools() instead of using pull() so that I can explicitly say that I wanted to use 'hg' of over 'hgtool'. pull() uses 'default_vcs' to decide but I set that to 'hgtool' for cloning src.
(In reply to Jordan Lund (:jlund) from comment #18)
> in the preflight_ 'build' action I clone src:
> http://hg.mozilla.org/build/mozharness/file/4f2efaa47950/mozharness/mozilla/
> building/buildbase.py#l1337
> 
> I could make _checkout_src() its own action?

Not forcing someone to pull source before building is good.

> IIRC I created clone_tools() instead of using pull() so that I can
> explicitly say that I wanted to use 'hg' of over 'hgtool'. pull() uses
> 'default_vcs' to decide but I set that to 'hgtool' for cloning src.

You can also

self.config['repos'] = [{
    "repo": "https://hg.mozilla.org/build/tools",
    "vcs": "hg",
}, {
    "repo": "https://hg.mozilla.org/mozilla-central",
    "vcs": "hgtool",
    ...
}, {
    ...
}]

Not really important at this point, but for future use.
Comment on attachment 8417497 [details] [diff] [review]
140505-858797_mozharn_desktop_builds-windows_and_try-mozharness.patch

pushed to default: https://hg.mozilla.org/build/mozharness/rev/1ae4317b104e

I lied in previous comment. I ended up including hgtool.py changes but I did include modifying the env
Attachment #8417497 - Flags: checked-in+
status update =====

work for build config automation (bug 978211) has landed. We now have ported build logic out of buildbot and implemented it across mozharness and our mozconfigs(through mach).

We are rolling out this new build impl (mozharness + mach) before EOW on cedar for all linux variants. platforms and remaining branches will land in iterations assuming we have no rollbacks
Component: General Automation → Mozharness
Attached file dmast_diff (obsolete) —
Attached file builderlist_diff (obsolete) —
for mozharness builds:

- adds ability to pass custom interpreter, timeout, and maxtime to ScriptFactory
- adds return codes that buildbot can grok
- makes only nightly builds that are run against pgo enabled platforms, run with pgo
- will allow for 'try' builders to be created against mozharness once 'try' config flips the switch in buildbot-configs (no-op)

dumpmaster diff: https://bugzilla.mozilla.org/attachment.cgi?id=8455139
builderlist diff: https://bugzilla.mozilla.org/attachment.cgi?id=8455140
Attachment #8455142 - Flags: review?(bhearsum)
for mozharness builds:

- preps all mac and windows builders to have the ability to go through mozharness if the branch has it enabled[1]
- cedar is still the only branch that has mozharness builds enabled so that will be the only affected branch
- sets timeout, maxtime to match existing builds through buildbot
- adds explicit interpreter for mozharness to be called from


dumpmaster diff: https://bugzilla.mozilla.org/attachment.cgi?id=8455139
builderlist diff: https://bugzilla.mozilla.org/attachment.cgi?id=8455140

[1]: http://mxr.mozilla.org/build/source/buildbot-configs/mozilla/project_branches.py#142
Attachment #8455147 - Flags: review?(bhearsum)
changes:

- sets stage username/ssh key differently for try builds
- removes un-needed symbols env keys for variants that don't uploadsymbols (e.g. asan)
- consolidates symbols env host/path into one key for clarity and necessity (windows munges path into a msys path even though it is unix)
- adds partial gen logic back into mozharness for now
- linux mulet build support
- mac debug, debug+nonunified, and nonunified support
- windows debug+nonunified support
- changes python shebang in clobberer.py to use env
- makes query_virtualenv_path be able to return None so query_python_path() can actually use query_exe('python') as the binary if virtualenv is not set
- fixes pgo env key typo
- allows for build props to be generated anytime after a build or even if the build action is not in current script run. (as long as the last build wasn't clobbered)
- if mach build returns anything but 0, set the build to a failure (2)
- fixes condition typo where enable_max_vsize was never getting hit
- makes purge.py windows friendly[1]
- changes 'source' to 'src' for the source repo name to be a stopgap of path names that are too long in windows.


[1] comment 17 touched on this previously but I can confirm now that I need these changes. even with using env python in shebang python file and python being in PATH, the command line env I get when running this requires me to specify the python interpreter path explicitly. I tried to do this in a way that was requested in comment 17
Attachment #8455153 - Flags: review?(aki)
accompanying clobberer.py change that I made in mozharness external_tools.
Attachment #8455155 - Flags: review?(aki)
(In reply to Jordan Lund (:jlund) from comment #26)
> Created attachment 8455153 [details] [diff] [review]
> 140713_bug_858797_fx_desktop_builds_mac_windows-mozharness.patch
> 
> changes:

there is a typo in this attachment. the following dev change shouldn't be here:
   - "repo_base": "https://hg.mozilla.org/users/mshal_mozilla.com"

I'll upload a new patch with it removed if requested.
Attachment #8455155 - Flags: review?(aki) → review+
Comment on attachment 8455147 [details] [diff] [review]
140713_bug_858797_fx_desktop_builds-mac_windows-bbot-cfgs.patch

Review of attachment 8455147 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/config.py
@@ +154,5 @@
>                  ],
>                  'reboot_command': ['scripts/external_tools/count_and_reboot.py',
>                                     '-f', '../reboot_count.txt', '-n', '1', '-z'],
> +                'script_timeout': 3 * 3600,
> +                'script_maxtime': 5.5 * 3600,

Does maxtime support being a float? I don't think I've ever seen this.
Attachment #8455147 - Flags: review?(bhearsum) → review+
Comment on attachment 8455142 [details] [diff] [review]
140713_bug_858797_fx_desktop_builds-mac_windows-bbotcustom.patch

Review of attachment 8455142 [details] [diff] [review]:
-----------------------------------------------------------------

::: misc.py
@@ +737,4 @@
>  
>      factory = factory_class(
>          scriptRepo=scriptRepo,
> +        interpreter=mh_cfg.get('mozharness_python'),

I have no issue with this, but why is it needed?

@@ +1105,5 @@
>      # if do_nightly:
>      if config['enable_nightly'] and pf.get('enable_nightly', True):
> +        nightly_extra_args = base_extra_args + config['mozharness_desktop_extra_options']['nightly']
> +        # if this builder is a pgo platform, make the nightly build use pgo
> +        if (config['pgo_strategy'] in ('periodic', 'try') and

What about "per-checkin"? Should this be just a plain 'if config["pgo_strategy"]' instead?
Comment on attachment 8455153 [details] [diff] [review]
140713_bug_858797_fx_desktop_builds_mac_windows-mozharness.patch

>+            env['MOZ_SIGN_CMD'] = moz_sign_cmd.replace('\\', '\\\\\\\\')

Boo.  I thought we were going to get rid of this type of thing when we got rid of pymake.  Makes sense, though.


>+    def _query_props_set_by_mach(self, console_output, error_level):
>+        mach_properties_path = os.path.join(
>+            self.query_abs_dirs()['abs_obj_dir'], 'mach_build_properties.json'
>+        )
>+        self.info("setting properties set by mach build. Looking in path: %s"
>+                  % mach_properties_path)
>+        if os.path.exists(mach_properties_path):
>+            with open(mach_properties_path) as build_property_file:

with self.opened() ?  Not a must, but if it's helpful, it's there.

>+                build_props = json.load(build_property_file)
>+                if console_output:
>+                    self.info("Properties set from 'mach build'")
>+                    pprint.pformat(build_props)

It's good that you want to output this, but does this even get output to console?
pprint.pprint() would print to console only, which is better but not what we want.
I think you want self.info(pprint.pformat(build_props))

Looks like I should have caught this earlier, but better late than never.
Attachment #8455153 - Flags: review?(aki) → review+
(In reply to Ben Hearsum [:bhearsum] from comment #29)
> Comment on attachment 8455147 [details] [diff] [review]
> 140713_bug_858797_fx_desktop_builds-mac_windows-bbot-cfgs.patch
> 
> Review of attachment 8455147 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mozilla/config.py
> @@ +154,5 @@
> >                  ],
> >                  'reboot_command': ['scripts/external_tools/count_and_reboot.py',
> >                                     '-f', '../reboot_count.txt', '-n', '1', '-z'],
> > +                'script_timeout': 3 * 3600,
> > +                'script_maxtime': 5.5 * 3600,
> 
> Does maxtime support being a float? I don't think I've ever seen this.

oops, I missed the int conversion we set here: http://mxr.mozilla.org/build/source/buildbotcustom/process/factory.py#1362

however, buildbot seems to not care as all my tests did the right thing:
 in dir c:\builds\moz2_slave\m-cen-w32-ntly-000000000000000\. (timeout 10800 secs) (maxTime 19800 secs)

Either way, I'll add the int() call before landing
(In reply to Ben Hearsum [:bhearsum] from comment #30)
> Comment on attachment 8455142 [details] [diff] [review]
> 140713_bug_858797_fx_desktop_builds-mac_windows-bbotcustom.patch
> 
> Review of attachment 8455142 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: misc.py
> @@ +737,4 @@
> >  
> >      factory = factory_class(
> >          scriptRepo=scriptRepo,
> > +        interpreter=mh_cfg.get('mozharness_python'),
> 
> I have no issue with this, but why is it needed?

the default python is not always python 2.7 (mach build requirement) on every slave[1]. setting it explicitly lets me use sys.executable and also use same version of python everywhere. We use the mozharness_python idiom for unittests AFAIK

> 
> @@ +1105,5 @@
> >      # if do_nightly:
> >      if config['enable_nightly'] and pf.get('enable_nightly', True):
> > +        nightly_extra_args = base_extra_args + config['mozharness_desktop_extra_options']['nightly']
> > +        # if this builder is a pgo platform, make the nightly build use pgo
> > +        if (config['pgo_strategy'] in ('periodic', 'try') and
> 
> What about "per-checkin"? Should this be just a plain 'if
> config["pgo_strategy"]' instead?

my understanding was that we only use per-checkin to decide on unittest-branch name (for sendhchanges)[2]. For deciding if the build was actually going to be pgo or not, I copied the same logic from here[3]. Does that make sense or do I still have it wrong?

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=602908#c62
[2] http://mxr.mozilla.org/build/source/buildbotcustom/misc.py#1715
[3] http://mxr.mozilla.org/build/source/buildbotcustom/misc.py#1850
Comment on attachment 8455142 [details] [diff] [review]
140713_bug_858797_fx_desktop_builds-mac_windows-bbotcustom.patch

Review of attachment 8455142 [details] [diff] [review]:
-----------------------------------------------------------------

> my understanding was that we only use per-checkin to decide on unittest-branch name (for sendhchanges)[2]. For deciding if the build was actually going to be pgo or not, I copied the same logic from here[3]. Does that make sense or do I still have it wrong?

I'm going to assume you're right about this.
Attachment #8455142 - Flags: review?(bhearsum) → review+
interdiff of upcoming mozharness patch
interdiff of upcoming mozharness patch for realz
Attachment #8456314 - Attachment is obsolete: true
see interdiff here for what has changed: https://bugzilla.mozilla.org/attachment.cgi?id=8456324

summary:

- fights bitrot of branch changes (mainly b2g branches)
- rms symbol server target, path, and user (those values now live in build cfgs)
- windows config had a reference to old m-c checkout path 'source' instead of 'src'.
- base_name for windows is used for resultsname in graph server post. Needs underscores instead of spaces
- switches python's 'open()' to use 'self.opened()'
- fixes lost pprint output that wasn't grabbed by self.info() in mach prop gen
- find cmds are now run and compiled in bash. windows still needs some slash love with find_dir
- allows for worst_level self.return_code to persist if its higher than a target
- give other actions the ability to generate_build_props() incase the action is run standalone
- installer_url is needed and if it is not set, sendchange will fail with python errors. This burns the job but we don't want the job to burn so we catch it and set it orange
- unfortunately os.env is munged from buildbot -> mozharness so we need hgtool.py to use vcs_config['env'] if it exists. I prefer this as I don't want to rely on env set from buildbot. comment 17 said this was probably OK. We still fallback to os.env if env is not set in vcs_config.
- rms mshal m-c reference
Attachment #8455153 - Attachment is obsolete: true
Attachment #8456326 - Flags: review?(aki)
(In reply to Jordan Lund (:jlund) from comment #36)
> Created attachment 8456324 [details] [diff] [review]
> 140715_bug_858797_fx_desktop_builds_mac_windows-mozharness-interdiff2.patch
> 
> interdiff of upcoming mozharness patch for realz

this snuck in here and it is part of a patch I have for Bug 1034925. You can see that in my full mozharn patch currently in r? it's not there.
(In reply to Jordan Lund (:jlund) from comment #38)
> (In reply to Jordan Lund (:jlund) from comment #36)
> > Created attachment 8456324 [details] [diff] [review]
> > 140715_bug_858797_fx_desktop_builds_mac_windows-mozharness-interdiff2.patch
> > 
> > interdiff of upcoming mozharness patch for realz
> 
> this snuck in here and it is part of a patch I have for Bug 1034925. You can
> see that in my full mozharn patch currently in r? it's not there.

bah, sorry for the noise. I didn't mean to hit send

diff --git a/mozharness/mozilla/updates/balrog.py b/mozharness/mozilla/updates/balrog.py
index ab36301..304ab3b 100644
--- a/mozharness/mozilla/updates/balrog.py
+++ b/mozharness/mozilla/updates/balrog.py
@@ -42,4 +42,4 @@ class BalrogMixin(object):
             self.run_command, args=(cmd,),
         )
         if return_code not in [0]:
-            self.return_code = 3
+            self.return_code = 1
Attachment #8456326 - Flags: review?(aki) → review+
Comment on attachment 8456326 [details] [diff] [review]
140715_bug_858797_fx_desktop_builds_mac_windows-mozharness.patch

https://hg.mozilla.org/build/mozharness/rev/da2e0cf4a096

will also do two follow up steps right now:
1) clobber linux cedar builders since we changed source path: s/source/src
2) do a full run on cedar while pulling from m-c as it depends on latest m-c
Attachment #8456326 - Flags: checked-in+
part of this was backed out:

http://hg.mozilla.org/build/mozharness/rev/bbfb3b43c216

the motivation for purge.py and clobberer.py patch was to fix python calls with windows. I am not sure how pre-pending 'python' or changing the shebang to 'env python' broke things but I'd imagine either the interpreter changed or subprocess wasn't a fan.

without any of my changes:
  https://tbpl.mozilla.org/php/getParsedLog.php?id=44451144&full=1&branch=b2g-inbound

after I landed changes and things started to break:

  failed examples (all similar but slightly different):
     https://tbpl.mozilla.org/php/getParsedLog.php?id=44449434&tree=B2g-Inbound
     https://tbpl.mozilla.org/php/getParsedLog.php?id=44438590&full=1&branch=b2g-inbound
     https://tbpl.mozilla.org/php/getParsedLog.php?id=44446887&tree=B2g-Inbound

  successful examples (not every build failed):
    https://tbpl.mozilla.org/php/getParsedLog.php?id=44450704&full=1&branch=b2g-inbound

Now it's the fun time of figuring out what went wrong. I am thinking of landing a patch that will log the path and version of python when query_exe('python') is used and also log sys.executable from within external_tools/clobberer.py so I can contrast the two in automation. Maybe I can just do this on ash but I'll want some quick cycles and lots of jobs to test against.

see: "Bug 1042790 - clobberer step is failing to clobber"  for fall out.
(In reply to Jordan Lund (:jlund) from comment #41)
> Now it's the fun time of figuring out what went wrong.

It looks like we stopped clobbering /builds/slave/* and started clobbering /builds/* which is bad.
That'll remove mock_mozilla, which is where our mock environments are; our hg-shared clean clones; our buildbot slave.  lost+found.

The slaves that hit this may have become permanently broken until reimage, or maybe puppet will save them.
(In reply to Aki Sasaki [:aki] from comment #42)
> (In reply to Jordan Lund (:jlund) from comment #41)
> > Now it's the fun time of figuring out what went wrong.
> 
> It looks like we stopped clobbering /builds/slave/* and started clobbering
> /builds/* which is bad.
> That'll remove mock_mozilla, which is where our mock environments are; our
> hg-shared clean clones; our buildbot slave.  lost+found.
> 
> The slaves that hit this may have become permanently broken until reimage,
> or maybe puppet will save them.

boo, really?

how come we have mock_mozilla and lost+found in /builds/slave:
[cltbld@bld-linux64-spot-1046.build.releng.use1.mozilla.com ~]$ ls /builds/slave/mock_mozilla/
[cltbld@bld-linux64-spot-1046.build.releng.use1.mozilla.com ~]$ ls /builds/slave/lost+found/

both exist.
(In reply to Jordan Lund (:jlund) from comment #43)
> (In reply to Aki Sasaki [:aki] from comment #42)
> > (In reply to Jordan Lund (:jlund) from comment #41)
> > > Now it's the fun time of figuring out what went wrong.
> > 
> > It looks like we stopped clobbering /builds/slave/* and started clobbering
> > /builds/* which is bad.
> > That'll remove mock_mozilla, which is where our mock environments are; our
> > hg-shared clean clones; our buildbot slave.  lost+found.
> > 
> > The slaves that hit this may have become permanently broken until reimage,
> > or maybe puppet will save them.
> 
> boo, really?
> 
> how come we have mock_mozilla and lost+found in /builds/slave:
> [cltbld@bld-linux64-spot-1046.build.releng.use1.mozilla.com ~]$ ls
> /builds/slave/mock_mozilla/
> [cltbld@bld-linux64-spot-1046.build.releng.use1.mozilla.com ~]$ ls
> /builds/slave/lost+found/
> 
> both exist.

Hm, really?

a) Weird, is that its own partition?
b) That probably means my theory is wrong, and there's more debugging needed.
aki: so this doesn't block fx desktop build rollout, how do you feel about having a:

if self.config.get("base_clobber_cmd"): # only impl in windows configs e.g. '{sys.executable} path/clobber'
    base_cmd = self.config.get("clobber_cmd")
else:
    base_cmd = ['path/to/clobberer.py']
[11:54]	<jlund|buildduty>	aki: speaking of which, how about letting this slide: https://bugzilla.mozilla.org/show_bug.cgi?id=858797#c45
[11:54]	<jlund|buildduty>	;)
[11:55]	<aki>	jlund|buildduty: it would be less ugly if you made that a default to a --clobberer-path option
[11:55]	<aki>	--clobber-cmd even
[11:56]	<aki>	also, you refer to both base_clobber_cmd and clobber_cmd ?
[11:56]	* aki	comments in bug
[11:57]	<jlund|buildduty>	aki: ya, sorry that was pretty bad even for pseudo code :)
[11:57]	<aki>	also, not solvable via self.config['exes'] ?
[11:58]	<jlund|buildduty>	aki: well I think you'd have to update every ['exes'] config that uses purge.py
[11:59]	<jlund|buildduty>	unless you did self.query_exe('path/to/clobberer.py')
[11:59]	<jlund|buildduty>	where the path is the key and is returned by default
[11:59]	<aki>	jlund|buildduty: or change this line to check exes: http://hg.mozilla.org/build/mozharness/file/2913947db89d/mozharness/mozilla/purge.py#l27
[11:59]	<aki>	no?
[11:59]	<aki>	in fact, if this is a true-ism across windows, i'd be fine with an is_windows() check there
[12:00]	<aki>	this should only be hit in releng automation, so this isn't a bad place to hardcode paths
[12:01]	<jlund|buildduty>	yes, I think this is true-ism across windows at least in our current automation setup.
[12:01]	<jlund|buildduty>	+ buildbot factories prepend python against the call too
[12:02]	<aki>	i think it would be ok to do an is_windows check, alter the cmd for windows, and then if we ever get a mixed pool where that's not true anymore, revisit at that point
[12:02]	<jlund|buildduty>	sgtm
[12:02]	<jlund|buildduty>	ty
[12:02]	<aki>	should be a 4-liner, i think. the other 2 lines are to deal with http://hg.mozilla.org/build/mozharness/file/2913947db89d/mozharness/mozilla/purge.py#l90
[12:03]	<aki>	need a patch, or do you know what needs changing?
[12:04]	<aki>	actually, you can remove that line and just make clobber_tool a list always
[12:04]	<jlund|buildduty>	aki: I'll whip something up this evening. tx!
[12:04]	<aki>	and cmd = clobber_tool
[12:04]	<aki>	np
this is what I was thinking. I think it achieves what we described maybe not line for line. BTW last patch from me while you're getting paid to review :)
Attachment #8462723 - Flags: review?(escapewindow+mozbugs)
Comment on attachment 8462723 [details] [diff] [review]
240714_bug_mozharness_purge_windows_support.patch

I'd push for a query_clobber_cmd() type method (or other non-dup solution) if we had three or more copies of this block.  This wfm for one or two.
Attachment #8462723 - Flags: review?(escapewindow+mozbugs) → review+
Comment on attachment 8462723 [details] [diff] [review]
240714_bug_mozharness_purge_windows_support.patch

checked in: https://hg.mozilla.org/build/mozharness/rev/6eea1941b891

I'll fire a bunch of suites on cypress that were failing before this is a no-op for existing jobs.
Attachment #8462723 - Flags: checked-in+
Attachment #8455139 - Attachment is obsolete: true
Attachment #8455140 - Attachment is obsolete: true
Attachment #8455142 - Attachment is obsolete: true
Attachment #8455147 - Attachment is obsolete: true
Attached file builderlist_diff
Attached file dump_diff
previous buildbot patches never landed last week due to a mozharness backout and my buildduty shift. 

On friday, massimo landed some repack changes so my original patch no longer made sense anymore as we have more than one mozharness script per plaftorm: normal build + repack build

So this patch does the following:

* in line with 'mozharness_desktop_l10n', this now uses 'mozharness_desktop_build' instead of generic 'mozharness_python'
* shares the interpreter: mozharness_python
* shares the reboot cmd: reboot_command
* rms need for 'mozharness_desktop_build_platforms' since we no longer use the same 'mozharness_config' key with other platforms like b2g/spider (I think repack can do the same too with 'mozharness_desktop_l10n_platforms' but that is a separate issue)

this should be a no-op from last patch. We are still just enabling win+mac on cedar.

latest builderlist and dump_master diff can be found here:
dumpmaster: https://bugzilla.mozilla.org/attachment.cgi?id=8463820&action=edit
buildderlist: https://bugzilla.mozilla.org/attachment.cgi?id=8463819&action=edit

accompanying buildbotcustom patch coming on next attachment.
Attachment #8463825 - Flags: review?(bhearsum)
this patch makes the required changes of buildbot-config patch from comment 54

I uploaded an interdiff to just show changes from last review: https://bugzilla.mozilla.org/attachment.cgi?id=8463818&action=edit

the interdiff from last time:

* since we no longer use generic mozharness_config, makeMHFactory needs to be able to accepts explicitly given mozharness_configs (mh_cfg)
* makeMHFactory's interpreter and reboot_command can now come from pf (like l10n) or mh_cfg (like b2g) 
* no longer uses mozharness_desktop_build_platforms as we have: s/mozharness_config/mozharness_desktop_build

the patch looks confusing because of some indent changes but basically we are...

before we would have to say:
    if mozharness_config in pf:
          if pf in mozharness_desktop_build_platforms:
              # it a desktop build
              # create builders and keep going to find out what we missed
              pass
          else:
              # it a b2g/spider
              # create some builders and don't keep going as all of
              # b2g/spider uses mozharness
              continue

now we:
   if mozharness_desktop_build in pf:
       # it a desktop build
       # create builders and keep going to find out what we missed
       pass
   elif mozharness_config in pf:
        # it a b2g/spider
        # create some builders and don't keep going as all of
        # b2g/spider uses mozharness
        continue
Attachment #8463830 - Flags: review?(bhearsum)
> * in line with 'mozharness_desktop_l10n', this now uses
> 'mozharness_desktop_build' instead of generic 'mozharness_python'

s/mozharness_python/mozharness_config
Attachment #8463825 - Flags: review?(bhearsum) → review+
Attachment #8463830 - Flags: review?(bhearsum) → review+
I r+'ed mostly based on the builder list diff.
Blocks: 1042790
Comment on attachment 8463825 [details] [diff] [review]
140728_bug_858797_desktop_builds-bitrot-against-repacks_bbotcfgs.patch

http://hg.mozilla.org/build/buildbot-configs/rev/61e780040be9
Attachment #8463825 - Flags: checked-in+
Comment on attachment 8463830 [details] [diff] [review]
140728_bug_858797_desktop_builds-bitrot-against-repacks_bbotcustom.patch

http://hg.mozilla.org/build/buildbotcustom/rev/24c866909b09
Attachment #8463830 - Flags: checked-in+
looks like clobberer on windows still isn't happy.

this is due to buildbot passing msys paths into buildprops for basedir.

This patch essentially s/buildbot_config['basedir']/dirs['base_work_dir']. I had something like this before but it was swallowed up as part of a previous backout.
Attachment #8464195 - Flags: review?
something here is now in production
> looks like clobberer on windows still isn't happy.

https://tbpl.mozilla.org/php/getParsedLog.php?id=44820752&tree=Cedar is the error log
Comment on attachment 8464195 [details] [diff] [review]
140728_bug_858797_desktop_builds-makes_purge-use-base_work_dir-mozharness.patch

Review of attachment 8464195 [details] [diff] [review]:
-----------------------------------------------------------------

stamping so jordan can land, expecting someone with more context to review post-landing of course.
Attachment #8464195 - Flags: review+
Comment on attachment 8464195 [details] [diff] [review]
140728_bug_858797_desktop_builds-makes_purge-use-base_work_dir-mozharness.patch

as per callek's post review request in comment 63

this is on default now: https://hg.mozilla.org/build/mozharness/rev/2092f46dc264

re-triggering windows build: https://tbpl.mozilla.org/?tree=Cedar&jobname=WINNT%205.2%20cedar%20build
Attachment #8464195 - Flags: review?(catlee)
Attachment #8464195 - Flags: review?
Attachment #8464195 - Flags: checked-in+
above patch fixed issue and now we are failing at checkout_source


this is due to MOZ_BUILD_DATE being a unicode string in our env.

this wasn't hit on my dev-master as the buildid was never set before forcing a
build and left to mozharness to generate it. But now that buildprops.json does
have buildid in it, we are using that and it's a unicode str.

revision is also something that isn't added on a 'force build' in dev-master so
it is popping up in production like buildid/builduid via buildprops.json.
We should only be using -r {rev} if we set --clone-by-revision (we do this in
try for example).

so this patch:

* stringifies buildid and builduid and revision
* only uses revision in cloning source if clone_by_revsion is true

callek: this is no rush, the cedar tree is broken right now so I can't test mozharness builds: https://tbpl.mozilla.org/?tree=Cedar&rev=97d2978152bf

I can also bump to someone else if you're swamped (you are on buildduty after all)
Attachment #8464443 - Flags: review?(bugspam.Callek)
Blocks: 1046138
see comment 65 for details of this patch.

callek was not confident enough to make the call whether we should supply
'-r {rev}' or not when we checkout source of a desktop build. Looking at this deeper, it turns out that it doesn't matter if we do or not. What matters is if we pass --clone-by-rev (we do --clone-by-rev for try which causes: [1][2] )

this logic is based off:

anytime we clone source in our existing bbot factories it is done here [3]. you can see in MBF (what NightlyBuildFactory uses) we do not pass rev [4]. But it does not matter because hgtool.py will find the rev if it exists or not
via our PROPERTIES_FILE:
http://mxr.mozilla.org/build/source/puppet/modules/packages/files/hgtool.py#35
http://mxr.mozilla.org/build/source/puppet/modules/packages/files/hgtool.py#88

you can see in this log m-i desktop build job: [5] we do not pass '-r {rev}' but
hgtool.py knows that we want rev: 569dda025a06 even though its initial tip is:
075e6f5c332d

So I am going to explicitly leave '-r' in so it is (1) obvious what is going
on and so (2) we do not rely on PROPERTIES_FILE. that is the only diff from patch in comment 65

[1]http://mxr.mozilla.org/build/source/tools/lib/python/util/hg.py#342
[2]http://mxr.mozilla.org/build/source/tools/lib/python/util/hg.py#207
[3] http://mxr.mozilla.org/build/source/buildbotcustom/process/factory.py#719
[4]http://mxr.mozilla.org/build/source/buildbotcustom/process/factory.py#1273
[5] https://pastebin.mozilla.org/5714285
Attachment #8464443 - Attachment is obsolete: true
Attachment #8464443 - Flags: review?(bugspam.Callek)
Attachment #8464821 - Flags: review?(catlee)
Comment on attachment 8464821 [details] [diff] [review]
140728_bug_858797_desktop_builds-rev-buildid-builduid-fixes2.patch

Review of attachment 8464821 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/building/buildbase.py
@@ +802,4 @@
>  
>          if c.get('tinderbox_build_dir'):
>              # TODO find out if we should fail here like we are
> +            if not who and not revision:

should be:
if not who or not revision


try builds are the only ones concerned with this since try will make c.get('tinderbox_build_dir') == True. This check is probably not needed anyway since we default to nobody@m.o for who and revision, unlike before we checkout source where revision may not be included in sourcestamp, will have a value by the time we reach this point or we would have already have failed.
Comment on attachment 8464195 [details] [diff] [review]
140728_bug_858797_desktop_builds-makes_purge-use-base_work_dir-mozharness.patch

Review of attachment 8464195 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8464195 - Flags: review?(catlee) → review+
Comment on attachment 8464821 [details] [diff] [review]
140728_bug_858797_desktop_builds-rev-buildid-builduid-fixes2.patch

Review of attachment 8464821 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/building/buildbase.py
@@ +632,4 @@
>          # now let's see if it's in buildbot_properties
>          elif in_buildbot_props:
>              self.info("Determining buildid from buildbot properties")
> +            self.buildid = str(self.buildbot_config['properties']['buildid'])

if the problem is that these are unicode, you should explicitly encode them to a string somehow.

.encode("ascii", "replace") should work well enough here I think.
Attachment #8464821 - Flags: review?(catlee) → review+
something(s) here went to production today
I am going to resolve this bug. mozharness builds have been live across all our platforms on cedar for a while now. Next steps are rolling out remaining branches. That work will be tracked in follow up bugs
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: