Closed Bug 1080370 Opened 7 years ago Closed 7 years ago

Enable e10s opt browser chrome tests on all desktop platforms

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(e10s+)

RESOLVED FIXED
Tracking Status
e10s + ---

People

(Reporter: markh, Assigned: armenzg)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 6 obsolete files)

+++ This bug was initially created as a clone of Bug #1061014 +++

Bug 1061014 enabled bc tests in e10s on opt linux builds.  Things now appear in a state where we can enable it on all desktop opt platforms \o/

See the tip of holly - all opt platforms are green and I've retriggered each run 5 times - all retries are green (although OS X 10.8 has the retries pending as I file this)

I'm attaching a patch which I cargo-culted from bug 1061014 - I've not tested it, but it looks like it should be close to working :)  It would be great if someone can test it (and the sooner we do it, the less likely we are to find other regressions causing e10s orange people aren't noticing...)
Attachment #8502288 - Flags: feedback?(jmathies)
Attachment #8502288 - Flags: feedback?(armenzg)
Summary: Enable e10s Linux opt browser chrome tests → Enable e10s opt browser chrome tests on all desktop platforms
Comment on attachment 8502288 [details] [diff] [review]
enable_e10s_bc_all_platforms.patch

Since xp currently doesn't generate good log output (child side is ignored) I think you can skip off xp tests -

+        branch['platforms']['win32']['xp-ix']['opt_unittest_suites'] += MOCHITEST_BC_3_E10S[:]

and just run win7/win8 for windows.
Attachment #8502288 - Flags: review?(armenzg)
Attachment #8502288 - Flags: feedback?(jmathies)
Attachment #8502288 - Flags: feedback?(armenzg)
Attachment #8502288 - Flags: feedback+
Comment on attachment 8502288 [details] [diff] [review]
enable_e10s_bc_all_platforms.patch

This patch is going in the right direction, however, it needs some adjustments.

In order to test that the patch is good you need to do the following:
1) setup a buildbot environment [1]
2) run test-masters.sh [2]
3) list_builder_differences.sh [3]

For #2, you would see the failures that I'm seeing and you can create a master to tinker with.
For instance, one of the failing masters can be created like this:
source ~/.mozilla/releng/venv/bin/activate
./setup-master.py master bm109-tests1-windows && cd master
buildbot checkconfig .

You can see the failures in [4].

Adjust config.py and run buildbot checkconfig . until it says "The config files is good!".

At that point you can run ./test-masters.sh until all masters pass.
If they all pass, then you should go to step #3.
#3 will yield a list of builders that are being added/removed.



[1] This is WIP, however, it should do the trick as of now
hg clone http://hg.mozilla.org/build/braindump && cd braindump
wget -Omypatch.txt https://bugzilla.mozilla.org/attachment.cgi?id=8502068
patch -p1 < mypatch.txt
buildbot-related/community/setup_buildbot_environment.sh

[2]
# This path gets created after running setup-buildbot
cd ~/.mozilla/releng/repos/buildbot-configs
wget -Opatch_to_test.diff https://bugzilla.mozilla.org/attachment.cgi?id=8502288
patch -p1 < patch_to_test.diff
source ~/.mozilla/releng/venv/bin/activate
./test-masters.sh

[3]
buildbot-related/list_builder_differences.sh -j path_to_patch.diff
Read more in http://armenzg.blogspot.ca/2014/09/which-builders-get-added-to-buildbot.html

[4]
(venv)armenzg-thinkpad buildbot-configs hg:[default!] $ ./setup-master.py master bm109-tests1-windows
(venv)armenzg-thinkpad buildbot-configs hg:[default!] $ cd master/
(venv)armenzg-thinkpad master hg:[default!] $ buildbot checkconfig .
Traceback (most recent call last):
  File "/home/armenzg/.mozilla/releng/venv/local/lib/python2.7/site-packages/buildbot-0.8.2_hg_c0c54553a432_production_0.8-py2.7.egg/buildbot/scripts/runner.py", line 1040, in doCheckConfig
    ConfigLoader(basedir=configFileName)
  File "/home/armenzg/.mozilla/releng/venv/local/lib/python2.7/site-packages/buildbot-0.8.2_hg_c0c54553a432_production_0.8-py2.7.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
    self.loadConfig(configFile, check_synchronously_only=True)
  File "/home/armenzg/.mozilla/releng/venv/local/lib/python2.7/site-packages/buildbot-0.8.2_hg_c0c54553a432_production_0.8-py2.7.egg/buildbot/master.py", line 652, in loadConfig
    exec f in localDict
  File "./master.cfg", line 10, in <module>
    import config
  File "/home/armenzg/repos/buildbot-configs/master/config.py", line 2016, in <module>
    branch['platforms']['win8']['xp-ix']['opt_unittest_suites'] += MOCHITEST_BC_3_E10S[:]
KeyError: 'win8'
Attachment #8502288 - Flags: review?(armenzg) → review-
I'm afraid I couldn't get those instructions to work - I'm on Windows and I couldn't get it to work in an msys or native environment.  However, I did find the problem you found in the patch and have fixed it, as well as removing XP as suggested by Jim.  Can you please see if this version works?
Attachment #8502288 - Attachment is obsolete: true
Attachment #8502870 - Flags: feedback?(armenzg)
Attached patch e10s_bc3.diff (obsolete) — Splinter Review
This patch enables e10s chunked browser chrome jobs on all platforms for all trunk trees and it is set to ride the trains.

markh, this looks green:
https://tbpl.mozilla.org/?tree=Holly&jobname=mochitest-e10s-
however I see a lot of oranges:
https://tbpl.mozilla.org/?tree=Holly&onlyunstarred=1

Are we ready for making these changes?
Are there any patches on Holly that have to land on m-c trees?
Would you like to enable this on try first and push there before enabling across all platforms?
Attachment #8502870 - Attachment is obsolete: true
Attachment #8502870 - Flags: feedback?(armenzg)
Attachment #8503283 - Flags: review?(catlee)
Thanks Armen! One clarification: we don't want this to ride the trains. There's some e10s stuff that is #ifdef NIGHTLY-only. If we let it ride the trains, it will just turn orange. This is by design.

As far as I know, this is ready to land on all trees. I don't think try gives us any additional information over holly. The opt browser-chrome tests are green on all platforms.
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #4)
> markh, this looks green:
> https://tbpl.mozilla.org/?tree=Holly&jobname=mochitest-e10s-
> however I see a lot of oranges:
> https://tbpl.mozilla.org/?tree=Holly&onlyunstarred=1

Thanks!  As Bill mentioned, there is alot of orange but best I can see, no browser-chrome orange on opt builds - which is all we want to enable for now.
Attached patch e10s_bc3.diff (obsolete) — Splinter Review
Locking it to mc version.
Passing to nthomas as Monday is off for catlee.
Attachment #8503283 - Attachment is obsolete: true
Attachment #8503283 - Flags: review?(catlee)
Attachment #8503616 - Flags: review?(nthomas)
Comment on attachment 8503616 [details] [diff] [review]
e10s_bc3.diff

>diff --git a/mozilla-tests/config.py b/mozilla-tests/config.py
>+# Enable e10s browser-chrome mochitests on trunk branches, opt builds only.
>+for platform in PLATFORMS.keys():
>+    for name, branch in items_at_least(BRANCHES, 'gecko_version', mc_gecko_version):
>+        for slave_platform in PLATFORMS[platform]['slave_platforms']:
>+            if platform in branch['platforms'] and slave_platform in branch['platforms'][platform]:
>+                branch['platforms'][platform][slave_platform]['opt_unittest_suites'] += MOCHITEST_BC_3_E10S[:]

Something here isn't quite right. There's no 
  PLATFORM_UNITTEST_VARS['win32']['xp-ix'['mochitest-e10s-browser-chrome']
added by your patch, but builders like 'Windows XP 32-bit graphics opt test mochitest-e10s-browser-chrome-1' are being added, which I suspect leads to runtime errors/bad tests.
Attachment #8503616 - Flags: review?(nthomas) → review-
Attached patch e10s_bc3.2.diff (obsolete) — Splinter Review
I need some clarification as to what we want to do in here.
I think I misunderstood originally and this is what I believe is what we're trying to accomplish.

* Enable e10s bc3 for all platforms on *Holly*

Once we see non-Linux jobs go green we can request enabling them on all trunk trees and do not ride the trains until ready.

The reason I'm only enabling non-Linux platforms on Holly is because afaict we have not run bc3 e10s on all platforms on *Holly*. I only see Linux jobs in here:
https://tbpl.mozilla.org/?tree=Holly&jobname=mochitest-e10s-browser-chrome

If I got this wrong please help me understand.

In my following attachment you will see exactly which builders get added.
Assignee: nobody → armenzg
Attachment #8503616 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attached file differences.txt (obsolete) —
Currently *all* tests on holly run with e10s. You can see that here, for example:
  http://hg.mozilla.org/projects/holly/file/b7b8001b2a30/testing/config/mozharness/windows_config.py
Notice the --e10s option being passed as part of mochitest_options.

The fact that we have a separate M-e10s line as well is an unfortunate redundancy.

As you can see on holly, the opt bc tests are green on all platforms. So we want to enable them on all trees now. Enabling them only on holly does us no good (and, in fact, is just further waste).
Attached patch e10s_bc3.3.diffSplinter Review
That helps! Thank you for clarifying it.

This patch does not add XP to the list of builders.
Attachment #8504782 - Attachment is obsolete: true
Attachment #8504875 - Flags: review?(nthomas)
Attached file differences.txt
Attachment #8504783 - Attachment is obsolete: true
BTW we should have XP dumping log output pretty soon too (bug 1037445), so the XP exclusions are temporary.
I'm told that we can't add more jobs to the releng systems because of an issue of a maximum number of builders.

Is there a bug filed for that so we can block on it?

In another words, we can't land this until releng manages to bump that limit.
Comment on attachment 8504875 [details] [diff] [review]
e10s_bc3.3.diff

>diff --git a/mozilla-tests/config.py b/mozilla-tests/config.py
>+                'mochitest-e10s-browser-chrome': {
>+                    'config_files': ["unittests/win_unittest.py"],
>+                },

This is missing for snowleopard. (Man, this duplicaton is silly!)

I don't hit any problems with test_masters with this patch, IIRC that will fail on the builder limit. That could be because you're adding mac and windows tests, not linux ones.

Please address comment #11 in a follow up - ie you're adding more tests on holly which duplicate browser chrome that is already using e10s.
Attachment #8504875 - Flags: review?(nthomas) → review+
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #15)
> I'm told that we can't add more jobs to the releng systems because of an
> issue of a maximum number of builders.
> 
> Is there a bug filed for that so we can block on it?
> 
> In another words, we can't land this until releng manages to bump that limit.

In the short term, can we steal some from holly?  eg, holly runs many mochitests twice (once "normally" and once for e10s - even though "normally" really is also with e10s in that branch).  ISTM we could even remove the b-c tests on all platforms there too - IOW, have holly release exactly the same number of builders that we are adding to central.
Attachment #8506935 - Flags: review?(nthomas)
Attached file differences.txt
wfu?
Attachment #8506937 - Flags: feedback?(mhammond)
Comment on attachment 8506937 [details]
differences.txt

This looks fine to me.  Bill, are you OK with that?
Attachment #8506937 - Flags: feedback?(wmccloskey)
Attachment #8506937 - Flags: feedback?(mhammond)
Attachment #8506937 - Flags: feedback+
Comment on attachment 8506935 [details] [diff] [review]
e10s_bc3.holly.diff

urgh, these configs. 

FTR, current # of builders for slave classes, the limit is 4096 - 12  = 4084
# in-house hardware
1144  panda
1414  t-xp32-ix
1584  t-w732-ix
1630  t-w864-ix
1166  t-snow-r4
1095  talos-mtnlion-r5
 246  talos-linux32-ix    
 284  talos-linux64-ix

# AWS virtual
1918  tst-linux32
3966  tst-linux64-spot    # The only place we have a problem, 64bit tests on AWS
 529  tst-emulator64
Attachment #8506935 - Flags: review?(nthomas) → review+
Comment on attachment 8506935 [details] [diff] [review]
e10s_bc3.holly.diff

nthomas thanks for the info! What do you use to measure the number of builders?

Landing as the first f+ came by:
https://hg.mozilla.org/build/buildbot-configs/rev/4a04be67eb76

Let me know if we need to make any further adjustment.
Attachment #8506935 - Flags: checked-in+
I think on Friday we started running e10s on more platforms successfully:
https://tbpl.mozilla.org/?jobname=e10s
Attachment #8506937 - Flags: feedback?(wmccloskey) → feedback+
My last change is now live as well.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #23)
> nthomas thanks for the info! What do you use to measure the number of
> builders?

It was some manual hackery - modified http://hg.mozilla.org/build/buildbot-configs/file/default/mozilla-tests/tests_master.cfg#l242 to print s and count, did a checkconfig, then pulled the values out by hand.
No longer blocks: 1087642
Depends on: 1087642
markh, we need to request to reveal the Windows 8 64-bit e10s jobs which had been hidden due to bug 1087642.
Would you be able to follow with the sheriffs to make sure that we're ready to make them visible?
bc1 and bc3 are a bit intermittently orange.

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Windows%208.*e10s&showall=1
Flags: needinfo?(mhammond)
Are they "more orange" than Windows in general?  Most of the failures in the above links have open bugs, so it looks to me like they aren't.  I'm not sure what action we can take other than to disable the problematic tests in that configuration.
Flags: needinfo?(mhammond)
I would say Win8 triggers a bit more intermittent oranges than Win7 and WinXp shows no intermittent oranges.
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Windows.*e10s&showall=1

I think it is best if we ask sheriff directly if there are any issues on showing Win8 e10s bc jobs.

:RyanVM, what do you say?
Flags: needinfo?(ryanvm)
Fine with me to unhide. We only hid them due to the perma-fail at the time.
Flags: needinfo?(ryanvm)
Depends on: 1089673
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.