[meta] Turn e10s-multi on in Nightly

RESOLVED FIXED in Firefox 54

Status


Core
General
RESOLVED FIXED
Reported: 11 months ago
Updated: 4 months ago

People

(Reporter: mrbkap, Assigned: krizsa)

Tracking

(Depends on: 2 bugs)

Version: unspecified
Target Milestone: mozilla54
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox54 fixed)

Details

(Whiteboard: [e10s-multi:M1])

Attachments

(1 attachment)

(Reporter)

Description

11 months ago
We would like to turn e10s-multi on in Nightly (we'll start with 2 content processes and increase from there).

One question Erin asked over email is if we want to segment the Nightly population. It would probably make sense to do so. We should also probably figure out what percentage of Nightly users are already using processCount > 1.
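For context, the change under discussion is a pref flip. A minimal sketch of what a Nightly-only default could look like in the tree's default prefs file (the `#ifdef` guard and values here are assumptions for illustration, not the actual patch):

```js
// Default number of web content processes.
// Guarded so that only Nightly gets e10s-multi for now; users on other
// channels can still opt in by editing dom.ipc.processCount in about:config.
#ifdef NIGHTLY_BUILD
pref("dom.ipc.processCount", 2);
#else
pref("dom.ipc.processCount", 1);
#endif
```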
(Assignee)

Updated

9 months ago
Depends on: 1312022
(Assignee)

Comment 1

9 months ago
Created attachment 8808105 [details] [diff] [review]
Turning on 2 content processes in Nightly. v1

As we discussed, this patch turns it on for the full population. Since the goal is to get more bug reports, and the purpose of Nightly is to test features like this, I would not split the population. We can do that later on Aurora or Beta, when we want to gather performance numbers and crash stats before release.
Attachment #8808105 - Flags: review?(mrbkap)
(Assignee)

Comment 2

9 months ago
Ryan, could you please take a look at the current state: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b2458ebae8d&selectedJob=30546722

Some of the intermittents we had might have gotten a bit worse; it's hard to tell. I've re-triggered some tests. Could you help me decide whether we're good to go, or whether we should disable some of them and re-enable them later?
Flags: needinfo?(ryanvm)
(Assignee)

Comment 3

9 months ago
Oh, and ignore the test_browserElement_oop_PrivateBrowsing.html failures; that test should have been disabled already...
(Assignee)

Comment 4

9 months ago
bc4 on linux debug and bc7 on linux64 debug seem a bit too orange, but the failures are typically timeouts from known intermittents; I'm not sure what to do about them.

We discussed this on IRC for a bit. The linux32 browser_hsts-priming failures are extremely frequent on m-c tip as well (and probably headed for disabling anyway). There are some leaks that are still concerning, and it's not clear what the situation on linux64 debug will be once browser_tab_dragdrop2.js and browser_tabkeynavigation.js get sorted out. Anyway, I think we're getting close, but it would be good to take another look after some of those bigger issues get cleaned up.
Flags: needinfo?(ryanvm)
(Reporter)

Updated

9 months ago
Attachment #8808105 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 6

9 months ago
BC7 failures were all over the place with the last push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=38f16c7a63ffba51bfdbcf4ddba5b4153792b4b6&selectedJob=30678695

but I think that was unrelated, because today, after rebasing the patch, the issue is gone:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6999b4394c7fe32f78b0e78695034023f7651a7e

Comment 7

9 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/085035586d2b
Turn e10s-multi on in Nightly. r=mrbkap

Comment 8

9 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/adfcc194af1c
Turning off a test for e10s-multi temporarily. r=me
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/e1f230913306b5ab63a64438c12f76e8fbde8c62 for landing just 5 days too early.

Originally, 52 was going to merge to Aurora on the 7th, and landing now would have been fine, but the merge got pushed back a week. So this landed and massively destabilized our tests, shutting off quite a few of them for both single and multi e10s right before next Monday's merge, while people are trying to shove in things at the last minute that they actually want to ship in that branch (which is going to be an ESR). We really *really* don't want to disable tests that we will never remember to re-enable on Aurora, where we'll be back to single e10s.

Happy to see this land again next Monday morning (Pacific time; you can tell it's safe to land when you see a mozilla-central push talking about tagging and bumping the version number). Even happier if by then it doesn't have some of the bustage that is already apparent, like the leak in https://treeherder.mozilla.org/logviewer.html#?job_id=38930104&repo=mozilla-inbound (one of the clearest examples of why I don't want to see it landed right now: that failure is bustage which landed a few pushes before you, with your leak hiding below it), the failure in https://treeherder.mozilla.org/logviewer.html#?job_id=38930048&repo=mozilla-inbound, the leaks (and probably the timeout itself) in https://treeherder.mozilla.org/logviewer.html#?job_id=38931136&repo=mozilla-inbound, and the failures in https://treeherder.mozilla.org/logviewer.html#?job_id=38938876&repo=mozilla-inbound.
browser_tab_dragdrop2.js was fixed by turning browser_tab_dragdrop.js off (when enabled, it caused browser_tab_dragdrop2.js to fail very frequently).

The hsts tests are really frustrating: we disabled all of them for linux32-debug this week, and now linux64-debug is failing very frequently as of yesterday (possibly related to backing this out?).
As a note, this had a large impact on Talos:
== Change summary for alert #4055 (as of November 09 2016 13:46 UTC) ==

Regressions:

270%  tabpaint summary osx-10-10 opt e10s       58.11 -> 214.79
263%  tabpaint summary windowsxp pgo e10s       50.45 -> 183.01
223%  tabpaint summary windows7-32 pgo e10s     49.69 -> 160.26
217%  tabpaint summary linux64 pgo e10s         59.5 -> 188.54
214%  tabpaint summary linux64 opt e10s         68.86 -> 216.27
205%  tabpaint summary windows8-64 opt e10s     59.03 -> 180.29
196%  tabpaint summary windowsxp opt e10s       73.17 -> 216.68
177%  tabpaint summary windows7-32 opt e10s     71.44 -> 197.61
 67%  tps summary windows7-32 pgo e10s          33.25 -> 55.64
 58%  tps summary windows7-32 opt e10s          40.02 -> 63.24
 42%  tps summary osx-10-10 opt e10s            39.78 -> 56.29
 27%  tps summary windowsxp opt e10s            39.5 -> 50.02
 25%  tps summary windowsxp pgo e10s            33.74 -> 42.19
 16%  damp summary windows7-32 pgo e10s         222.94 -> 259.04
 14%  damp summary windows8-64 opt e10s         267.11 -> 303.89
 13%  damp summary osx-10-10 opt e10s           302.7 -> 342.71
 11%  damp summary linux64 opt e10s             293.99 -> 326.18
 10%  damp summary linux64 pgo e10s             245.15 -> 270.28
  9%  tps summary linux64 opt e10s              41.7 -> 45.47
  8%  damp summary windows7-32 opt e10s         300.24 -> 325.07
  8%  tps summary linux64 pgo e10s              36.23 -> 39.22

Improvements:

  3%  tps summary windows8-64 opt e10s     37.1 -> 35.86

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4055


please address these talos issues prior to landing this again.
(Assignee)

Comment 12

9 months ago
(In reply to Joel Maher ( :jmaher) from comment #11)
> please address these talos issues prior to landing this again.

Let's branch this off: bug 1317312.
(Assignee)

Updated

7 months ago
Depends on: 1317312
No longer depends on: 1251963, 1290167, 1294389
(Assignee)

Updated

7 months ago
Depends on: 1324428
(Assignee)

Comment 13

7 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=55581ed910f5a53008bcc5e0bfe35a2cd1e4a51a&selectedJob=65004753

I'm going to do another try push now that the Talos regressions are fixed. I had to fix browser_service_workers_status.js and force a single content process in one more test; otherwise this patch is pretty much what I tried last time. Then I will work on the new intermittent failures a bit if necessary and re-enable the tests.

Comment 14

7 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2bd53e4e662b
Turn e10s-multi on in Nightly. r=me
Backed out for various test failures in debug builds with e10s:

https://hg.mozilla.org/integration/mozilla-inbound/rev/0bbd4e32a321fd93c963155185fbbf015e82dd85

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=2bd53e4e662bcdd32c53cb4e09ceff088e8f6369
Flags: needinfo?(gkrizsanits)

Updated

7 months ago
Depends on: 1328358

Updated

7 months ago
Depends on: 1328359

Updated

7 months ago
Depends on: 1328360

Updated

7 months ago
Depends on: 1328362

Updated

7 months ago
Depends on: 1328366

Updated

7 months ago
Depends on: 1328368

Updated

7 months ago
Depends on: 1328371

Updated

7 months ago
Depends on: 1328372

Updated

7 months ago
Depends on: 1328374

Updated

7 months ago
Depends on: 1328376

Updated

7 months ago
Depends on: 1328377

Updated

7 months ago
Depends on: 1328379

Updated

7 months ago
Depends on: 1328380

Updated

7 months ago
Depends on: 1328381

Updated

7 months ago
Depends on: 1328382

Updated

7 months ago
Depends on: 1328384

Updated

7 months ago
Depends on: 1328387

Updated

7 months ago
Depends on: 1328389

Updated

7 months ago
Depends on: 1328390

Updated

7 months ago
Depends on: 1328392

Updated

7 months ago
Depends on: 1328395

Updated

7 months ago
Depends on: 1328396

Updated

7 months ago
Depends on: 1328426

Updated

7 months ago
Depends on: 1328427

Updated

7 months ago
Depends on: 1328428
As a note, this caused a performance regression in the damp (devtools) test:
== Change summary for alert #4699 (as of January 03 2017 17:16 UTC) ==

Regressions:

 12%  damp summary linux64 opt e10s     329.61 -> 369.16

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4699

As this is backed out, we are not filing a new bug.
(Assignee)

Updated

7 months ago
Depends on: 1330018
(Assignee)

Comment 17

6 months ago
(In reply to Joel Maher ( :jmaher) from comment #16)
> as a note, this caused a performance regression in the damp (devtools) test:
> == Change summary for alert #4699 (as of January 03 2017 17:16 UTC) ==
> 
> Regressions:
> 
>  12%  damp summary linux64 opt e10s     329.61 -> 369.16
> 
> For up to date results, see:
> https://treeherder.mozilla.org/perf.html#/alerts?id=4699

I think we will have to swallow this, but I will take a look at whether I can make any improvement, other than making sure that devtools does not start a new process (which might be what we want in the end).
Flags: needinfo?(gkrizsanits)
I backed this out for failures like https://treeherder.mozilla.org/logviewer.html#?job_id=70408216&repo=mozilla-inbound

https://hg.mozilla.org/integration/mozilla-inbound/rev/bde3fc40b9b5
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 19

6 months ago
Hey Kris, do you know anything about these oop extension tests and why they fail with multiple content processes? Also, do you mind if I turn them off temporarily? Once they're fixed we can turn them back on again. This is blocking us.
Flags: needinfo?(gkrizsanits) → needinfo?(kmaglione+bmo)
(Assignee)

Comment 20

6 months ago
(In reply to Wes Kocher (:KWierso) from comment #18)
> I backed this out for failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=70408216&repo=mozilla-
> inbound
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/bde3fc40b9b5

I have not seen this on ash (I might have overlooked it); is this a frequent/perma orange, or a seen-once intermittent?
Flags: needinfo?(wkocher)
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #20)
> (In reply to Wes Kocher (:KWierso) from comment #18)
> > I backed this out for failures like
> > https://treeherder.mozilla.org/logviewer.html#?job_id=70408216&repo=mozilla-
> > inbound
> > 
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/bde3fc40b9b5
> 
> I have not seen this on ash (might have overlooked it), is this a
> frequent/perma orange or a seen once intermittent?

It's essentially permafailing, at least on Windows debug: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=e2f6478f748157bf82a5fd0e940a6043af076a77&bugfiler&group_state=expanded&noautoclassify&filter-searchStr=windows%20mochi%20s(5%20debug
Flags: needinfo?(wkocher)
There were a couple failures on linux as well: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=e2f6478f748157bf82a5fd0e940a6043af076a77&bugfiler&group_state=expanded&noautoclassify&filter-searchStr=mochi%20s(10%20debug&selectedJob=70397287
The test_ext_cookies failure looks like a manifestation of bug 1309637. I'd be OK with disabling that test in Windows debug builds until that bug is fixed, but I'm a bit worried that e10s-multi is triggering that bug.

In the case of test_ext_storage_content, it looks like we're somehow running the test twice, in parallel, and the two instances are conflicting. I can't reproduce this locally on Linux, but it's definitely worrisome, and we can't just disable this test. I'll look into it some more.

I suspect the test_ext_i18n and test_ext_unload_frame problems may be similar, but I can't reproduce them locally either.

Updated

6 months ago
Flags: needinfo?(kmaglione+bmo)

Comment 24

6 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0c891a3aff93
Turn e10s-multi on in Nightly. r=me
OK, it actually looks like this is a cascade of failures caused by the test_ext_cookies timeout, that leads to extra windows and tabs staying open, and breaking the other tests. So let's just disable that test on Windows debug builds.
(Assignee)

Comment 26

6 months ago
(In reply to Pulsebot from comment #24)
> Pushed by gkrizsanits@mozilla.com:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/0c891a3aff93
> Turn e10s-multi on in Nightly. r=me

Just to make this clear: this is the push that got backed out; Pulsebot was just slower than the backout.

(In reply to Kris Maglione [:kmag] from comment #25)
> OK, it actually looks like this is a cascade of failures caused by the
> test_ext_cookies timeout, that leads to extra windows and tabs staying open,
> and breaking the other tests. So let's just disable that test on Windows
> debug builds.

Thanks Kris, I'll do that. Although it seems like this failure might happen on Linux as well (as Wes pointed out), so I might have to turn it off for all debug builds temporarily.
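Turning a test off in its mochitest manifest is a one-line annotation. A sketch of what that could look like (the manifest entry and the exact `skip-if` condition here are illustrative assumptions):

```ini
# Bug 1309637 - private browsing cookie checks time out with e10s-multi
[test_ext_cookies.html]
skip-if = debug
```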

Comment 27

6 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/529ae909938a
Turn e10s-multi on in Nightly. r=me
(Assignee)

Comment 28

6 months ago
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #26)
> Thanks Kris, I'll do that. Although it seems like this failure might happen
> on linux as well (as Wes pointed it out), so I might have to turn it off for
> all debug builds temporarily.

Actually, I see failures in release mode as well: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&selectedJob=70726841

Not too frequent yet... I might have to do a followup and turn it off in release mode as well. It would be nice if there were a way to turn off only the oop version...
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #28)
> Not too frequent yet... might have to do a followup and turn it off in
> release mode as well. It would be nice if there were a way to turn off the
> oop version only...

There is, but the failures aren't only in OOP mode. Those just happen to show up first.
I had to back this out for frequent asan failures like https://treeherder.mozilla.org/logviewer.html#?job_id=70784059&repo=mozilla-inbound

https://hg.mozilla.org/integration/mozilla-inbound/rev/dbef9f0ee2d5
Flags: needinfo?(gkrizsanits)
Not just asan: https://treeherder.mozilla.org/logviewer.html#?job_id=70787548&repo=mozilla-inbound
OK, we can't disable that entire test, but we can disable the private browsing section that's failing until bug 1309637 is fixed:

http://searchfox.org/mozilla-central/rev/30fcf167af036aeddf322de44a2fadd370acfd2f/toolkit/components/extensions/test/mochitest/test_ext_cookies.html#182-215
(Assignee)

Comment 33

6 months ago
(In reply to Kris Maglione [:kmag] from comment #32)
> OK, we can't disable that entire test, but we can disable the private
> browsing section that's failing until bug 1309637 is fixed:
> 
> http://searchfox.org/mozilla-central/rev/
> 30fcf167af036aeddf322de44a2fadd370acfd2f/toolkit/components/extensions/test/
> mochitest/test_ext_cookies.html#182-215

By disabling you mean should I just remove that part of the test and it will be put back once bug 1309637 is fixed?
Flags: needinfo?(gkrizsanits)
I'd rather add an `if (false)` to the beginning of that block than remove it, and add a comment about bug 1309637, but yes.
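The pattern Kris describes can be sketched as follows; the function and label names below are illustrative stand-ins, not the real test's code:

```javascript
// Keep the private-browsing section in the test file, but gate it off with a
// pointer to the bug tracking re-enablement, so the code isn't lost.
async function runCookieTests() {
  const covered = [];

  // Non-private-browsing cookie checks keep running as before.
  covered.push("normal-cookies");

  // Disabled until bug 1309637 is fixed: the private-browsing cookie checks
  // fail under e10s-multi. Re-enable by deleting this guard.
  if (false) {
    covered.push("private-browsing-cookies");
  }

  return covered;
}
```

Keeping the block behind `if (false)` (rather than deleting it) means the eventual re-enable is a one-line diff and the disabled code still gets syntax-checked.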

Updated

6 months ago
Duplicate of this bug: 1332791

Updated

6 months ago
Duplicate of this bug: 1332799

Updated

6 months ago
Depends on: 1332809

Comment 37

6 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4b8933f62ea
Turn e10s-multi on in Nightly. r=me

Comment 38

6 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f6c9241b40ec
Followup for a typo in the manifest file. r=me
(Assignee)

Comment 39

6 months ago
(In reply to Kris Maglione [:kmag] from comment #34)
> I'd rather add an `if (false)` to the beginning of that block than remove
> it, and add a comment about bug 1309637, but yes.

This is not going to work; this test is failing in quite a few ways. I'm afraid I must disable the entire test. Or, if you think this is not just a bug in the test but actually a broken feature important enough to hold back the landing, please let me know so I can plan accordingly.
Flags: needinfo?(kmaglione+bmo)
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/2baeb0ea2983 for the cookies again.
Only the private browsing parts are failing. There are other parts of the test file that deal with private browsing that can be disabled as well, but we can't disable the entire test.
Flags: needinfo?(kmaglione+bmo)
(Assignee)

Comment 42

6 months ago
(In reply to Kris Maglione [:kmag] from comment #41)
> Only the private browsing parts are failing. There are other parts of the
> test file that deal with private browsing that can be disabled as well, but
> we can't disable the entire test.

Even this one? https://treeherder.mozilla.org/logviewer.html#?job_id=70963350&repo=mozilla-inbound&lineNumber=2967

I see two blocks where we open a private window; I disabled them both, and this failure happens before those. Anyway, let's talk about it on Monday and we'll sort something out.

Updated

6 months ago
No longer depends on: 1332809
Duplicate of this bug: 1332809
Hm. No, I guess not. I was assuming that was related to the private browsing cookies, but from the screenshot, it looks like we're actually just winding up with an extra tab open during that test.

And it looks like that's probably a race in test_ext_contentscript_permission.html, which doesn't wait for its tab removal to succeed before ending the test.
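The race described above can be sketched in isolation; `removeTabAsync`, `racyTest`, and `fixedTest` below are illustrative stand-ins, not the real browser test APIs:

```javascript
// Simulated asynchronous tab teardown: the tab only disappears on a later
// event-loop turn, much as real cross-process tab removal completes later.
function removeTabAsync(state, tab) {
  return new Promise(resolve => {
    setTimeout(() => {
      state.tabs = state.tabs.filter(t => t !== tab);
      resolve();
    }, 0);
  });
}

// The buggy shape: fire-and-forget removal, so the test can "finish" while
// the tab is still open, confusing whatever test runs next.
async function racyTest(state) {
  removeTabAsync(state, "test-tab");
}

// The fix: await the removal before ending the test.
async function fixedTest(state) {
  await removeTabAsync(state, "test-tab");
}
```

The fix is purely a matter of awaiting the promise the removal already returns, so leftover tabs can't leak into the next test's window state.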

Updated

6 months ago
Depends on: 1332868
Just so it's not all about test_ext_cookies.html, there's also https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=c1f03c71ed58936e84f1793b5e1944d2cc55f95d&group_state=expanded&filter-searchStr=21ccd22701da8ee54aab8604116ea43d547d2c36&selectedJob=70967962

Comment 46

6 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aefa445b9c77
Turn e10s-multi on in Nightly. r=me
Please nominate this for the release notes when it is ready to ride the trains.
Assignee: nobody → gkrizsanits
I duped all the leak bugs over to this one, as they haven't shown up in a while, except for bug 1328374, which has happened a few times around the time window when this landed.
https://hg.mozilla.org/mozilla-central/rev/aefa445b9c77
Status: NEW → RESOLVED
Last Resolved: 6 months ago
status-firefox54: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
(Assignee)

Updated

6 months ago
No longer depends on: 1289723
(Assignee)

Updated

6 months ago
No longer depends on: 1324428
(Assignee)

Updated

6 months ago
No longer depends on: 1328362

Updated

6 months ago
Depends on: 1275447

Updated

6 months ago
Depends on: 1254841
Depends on: 1337778
Depends on: 1341353
No longer depends on: 1341353
Depends on: 1340921
Depends on: 1301415
Depends on: 1066789