Closed Bug 1312022 Opened 9 years ago Closed 9 years ago

Keep alive content processes for bc and dt tests for e10s-multi

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla52

Tracking Flags:

Tracking

Status

firefox52

---

fixed

People

(Reporter: gkrizsanits, Assigned: gkrizsanits)

References

Details

(Whiteboard: [e10s-multi:M1])

Attachments

(4 files, 3 obsolete files)

pref to keep alive content processes. v1 9 years ago Gabor Krizsanits (INACTIVE) 4.77 KB, patch		Details \| Diff \| Splinter Review
keep alive content processes for bc/dt if e10s-multi is enabled. v1 9 years ago Gabor Krizsanits (INACTIVE) 1.44 KB, patch		Details \| Diff \| Splinter Review
part1: pref to keep alive content processes. v2 9 years ago Gabor Krizsanits (INACTIVE) 5.07 KB, patch	mrbkap : review+	Details \| Diff \| Splinter Review
part2: keep alive content processes for bc/dt if e10s-multi is enabled. v2 9 years ago Gabor Krizsanits (INACTIVE) 1.44 KB, patch	mrbkap : review+	Details \| Diff \| Splinter Review
part3: Leak detector hack. v1 9 years ago Gabor Krizsanits (INACTIVE) 2.39 KB, patch	mccr8 : review-	Details \| Diff \| Splinter Review
rem process reusing code. v1 9 years ago Gabor Krizsanits (INACTIVE) 1.16 KB, patch	mrbkap : review+	Details \| Diff \| Splinter Review
part3: Fixing some leaky tests. v1 9 years ago Gabor Krizsanits (INACTIVE) 3.78 KB, patch	mrbkap : review+	Details \| Diff \| Splinter Review

Gabor Krizsanits (INACTIVE)

Assignee

Description

•

9 years ago

Because of the slow process startup times bc and dt tests time out frequently. Until some serious perf optimization to enable 2 content processes on nightly we'll try to keep alive both content processes for these suits.

Gabor Krizsanits (INACTIVE)

Assignee

Comment 1

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=af9d6d6545741dbea4bc53499b2a38883f33a189

Blocks: 1303113

Whiteboard: [e10s-multi:M1]

Gabor Krizsanits (INACTIVE)

Assignee

Comment 2

•

9 years ago

Attached patch pref to keep alive content processes. v1 (obsolete) — Details — Splinter Review

Gabor Krizsanits (INACTIVE)

Assignee

Comment 3

•

9 years ago

Attached patch keep alive content processes for bc/dt if e10s-multi is enabled. v1 (obsolete) — Details — Splinter Review

Gabor Krizsanits (INACTIVE)

Assignee

Updated

•

9 years ago

Attachment #8803377 - Attachment is patch: true

Gabor Krizsanits (INACTIVE)

Assignee

Comment 4

•

9 years ago

(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #1) > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=af9d6d6545741dbea4bc53499b2a38883f33a189 Wrong try flags and tree is closed :( will try it later...

Gabor Krizsanits (INACTIVE)

Assignee

Comment 5

•

9 years ago

looks promising: https://treeherder.mozilla.org/#/jobs?repo=try&revision=208987d81c04&selectedJob=29698810 some known issues, and a few interesting crashes that I have to look into...

Gabor Krizsanits (INACTIVE)

Assignee

Comment 6

•

9 years ago

Seems like the previous patch still had some side effects for the cases where the pref is not set. Trying to be a bit more careful: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f97c638c2872

Gabor Krizsanits (INACTIVE)

Assignee

Comment 7

•

9 years ago

Attached patch part1: pref to keep alive content processes. v2 — Details — Splinter Review

Attachment #8803376 - Attachment is obsolete: true

Attachment #8805079 - Flags: review?(mrbkap)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 8

•

9 years ago

Attached patch part2: keep alive content processes for bc/dt if e10s-multi is enabled. v2 — Details — Splinter Review

Attachment #8803377 - Attachment is obsolete: true

Attachment #8805080 - Flags: review?(mrbkap)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 9

•

9 years ago

Attached patch part3: Leak detector hack. v1 (obsolete) — Details — Splinter Review

Sometimes we get a window/docshell leak report, because the last 2 windows and last docshell is destroyed later then we decide to report the leak... Based on bug 1073352 comment 28 (and the following) it seems like it's not a trivial to fix this, on the other hand I don't think it's super important either. If there was an actual leak, since we use this in debug builds, it will show up anyway, if it's just the problem Andrew describes then I'm not even sure it's worth to fix it. Bottom line is, that I wouldn't block on this. (this is not super frequent, a couple of times per platform, example: https://treeherder.mozilla.org/logviewer.html#?job_id=29804952&repo=try#L18749) The patch just does not report the leak if it's the last one or two windows/docshell in the content process. Just in case I want to let Andrew to take a look at this temporary hack as well.

Attachment #8805084 - Flags: review?(mrbkap)

Attachment #8805084 - Flags: feedback?(continuation)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 10

•

9 years ago

Attached patch rem process reusing code. v1 — Details — Splinter Review

Not sure if I should revert this change yet but I have a patch for it.

Andrew McCreight (out of office until 8/21) [:mccr8]

Comment 11

•

9 years ago

Comment on attachment 8805084 [details] [diff] [review] part3: Leak detector hack. v1 Review of attachment 8805084 [details] [diff] [review]: ----------------------------------------------------------------- I'd really like to see some more analysis of what the leak actually is before approving something like this. I fixed a number of these leaks when getting e10s leak checking working, and they were not super difficult to fix. If I understand it correctly, this patch allows us to always leak two of each. If somebody lands a test that permanently leaks 1 or 2, then it'll be green, but at some later point an intermittent leak will show up, when we hit the extra leak from your patches here, which will be difficult to track down. How frequent is "sometimes"? This will ignore leaks of any docshells and windows when there are only 1 or 2 alive, so this could include others, before the last two. Isn't this ignoring these leaks in the parent process, too? Or is there some check I'm missing? As a minor nitpick for the patch, it would be less fragile and easier to understand if you matched against "DOMWINDOW == 0 " instead of just "0 ".

Attachment #8805084 - Flags: feedback?(continuation) → review-

Andrew McCreight (out of office until 8/21) [:mccr8]

Comment 12

•

9 years ago

> If there was an actual leak, since we use this in debug builds, it will show up anyway, These kinds of leaks don't show up with the regular shutdown leak checking. The purpose of this particular leak check is that you can have a case where you leak a window forever after closing a tab, but it still gets cleaned up somehow before shutdown.

Gabor Krizsanits (INACTIVE)

Assignee

Comment 13

•

9 years ago

(In reply to Andrew McCreight [:mccr8] from comment #11) > Comment on attachment 8805084 [details] [diff] [review] > part3: Leak detector hack. v1 > > Review of attachment 8805084 [details] [diff] [review]: > ----------------------------------------------------------------- > > I'd really like to see some more analysis of what the leak actually is > before approving something like this. I fixed a number of these leaks when > getting e10s leak checking working, and they were not super difficult to fix. > > If I understand it correctly, this patch allows us to always leak two of > each. If somebody lands a test that permanently leaks 1 or 2, then it'll be > green, but at some later point an intermittent leak will show up, when we > hit the extra leak from your patches here, which will be difficult to track > down. How frequent is "sometimes"? After reading your comments I think I misunderstood a bit the purpose of this check and how this all works. The tracking problem could be easily solved by printing something else than an error message, but I can see now why is it important to fix this. (Not to mention that the patch did not do exactly what I intended to do) So the frequency is low enough to make it hard to reproduce it locally (I couldn't so far) but high enough to make it impossible to turn on 2 content processes on nightly. But I'll try to come up with something. Do you have any hints how to hunt down these issues? (In reply to Andrew McCreight [:mccr8] from comment #12) > > If there was an actual leak, since we use this in debug builds, it will show up anyway, > These kinds of leaks don't show up with the regular shutdown leak checking. > The purpose of this particular leak check is that you can have a case where > you leak a window forever after closing a tab, but it still gets cleaned up > somehow before shutdown. Right, it makes sense now.

Gabor Krizsanits (INACTIVE)

Assignee

Comment 14

•

9 years ago

Attachment #8805084 - Flags: review?(mrbkap)

Andrew McCreight (out of office until 8/21) [:mccr8]

Comment 15

•

9 years ago

> Do you have any hints how to hunt down these issues? I think I usually used refcount logging, or DMD's heap scan mode to try to figure out who is holding references to things. If you can't reproduce locally, that will make it more difficult to investigate. The line after the UNEXPECTED-FAIL prints out information about the leaked windows: TEST-INFO | toolkit/mozapps/extensions/test/browser/browser_webext_options.js | windows(s) leaked: [pid = 5132] [serial = 207], [pid = 5132] [serial = 205] You can then search in the log for, say, "[pid = 5132] [serial = 205]" to see exactly where the window was created. That might help...

Andrew McCreight (out of office until 8/21) [:mccr8]

Comment 16

•

9 years ago

One thing that might help reproduce the issue is to make 4 or 5 copies of the test that is leaking, and just run them all in a row.

Gabor Krizsanits (INACTIVE)

Assignee

Comment 17

•

9 years ago

Attached patch part3: Fixing some leaky tests. v1 — Details — Splinter Review

It turns out that it's just a bunch of leaky tests. I have no idea why they did not show up on the radar sooner, but I guess I was just the unlucky one who's patch gave them the final touch that triggered the leaks to be detected. By leaks I mean we finish the test sooner than we should and that _sometimes_ causing some trouble (usually because incorrect usage of nested generator functions). I think I fixed the most common offenders, the rest might be just some rare intermittent, let's see if more stuff needs to get fixed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=98d8e8d88b3affdd7f54fcb2cb0062835162be32

Attachment #8805084 - Attachment is obsolete: true

Attachment #8805540 - Flags: review?(mrbkap)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 18

•

9 years ago

Comment on attachment 8805085 [details] [diff] [review] rem process reusing code. v1 Review of attachment 8805085 [details] [diff] [review]: ----------------------------------------------------------------- I think there is no reason to leave in this hack with the current setup, let's remove it.

Attachment #8805085 - Flags: review?(mrbkap)

Blake Kaplan (:mrbkap) (inactive)

Comment 19

•

9 years ago

Comment on attachment 8805079 [details] [diff] [review] part1: pref to keep alive content processes. v2 Do we shut the processes down on parent process shutdown?

Attachment #8805079 - Flags: review?(mrbkap) → review+

Blake Kaplan (:mrbkap) (inactive)

Comment 20

•

9 years ago

Comment on attachment 8805080 [details] [diff] [review] part2: keep alive content processes for bc/dt if e10s-multi is enabled. v2 Do we need a testing peer's review on this?

Attachment #8805080 - Flags: review?(mrbkap) → review+

Blake Kaplan (:mrbkap) (inactive)

Updated

•

9 years ago

Attachment #8805085 - Flags: review?(mrbkap) → review+

Blake Kaplan (:mrbkap) (inactive)

Updated

•

9 years ago

Attachment #8805540 - Flags: review?(mrbkap) → review+

Gabor Krizsanits (INACTIVE)

Assignee

Comment 21

•

9 years ago

(In reply to Blake Kaplan (:mrbkap) from comment #19) > Comment on attachment 8805079 [details] [diff] [review] > part1: pref to keep alive content processes. v2 > > Do we shut the processes down on parent process shutdown? Yes, also our tests would report any lingering processes left behind. (In reply to Blake Kaplan (:mrbkap) from comment #20) > Comment on attachment 8805080 [details] [diff] [review] > part2: keep alive content processes for bc/dt if e10s-multi is enabled. v2 > > Do we need a testing peer's review on this? Good point, I'm sure keeping someone in the loop is important... Ted, we want to enable two content processes on nightly, and since content process startup is still not perfect, and in debug build it's pretty bad even, to avoid timeouts in bc/dt tests we keep the first two content processes alive. Once we optimize the process startup this will be ofc turned off again. The patches here are inactive until dom.ipc.processCount is set to 2 by default, so I'm quite confident landing them, but before we start using this on mc I want to give you a chance to review the approach we're about to take here and raise any concerns you might have with it.

Flags: needinfo?(ted)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 22

•

9 years ago

I have to update patch 1 a bit to work together with bug 1304140. I talked to smaug on IRC about this, and we agreed that we just never want to keep those special processes alive. try build as it will land on mc (if I didn't break something in the rebase process...): https://treeherder.mozilla.org/#/jobs?repo=try&revision=c19ed80effa5ea988b7b8d0dadbbcde47ed396bd

Depends on: 1304140

Pulsebot

Comment 23

•

9 years ago

Pushed by gkrizsanits@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/37fbc13ab60a Pref to keep some content processes alive. r=mrbkap https://hg.mozilla.org/integration/mozilla-inbound/rev/11d5d68a6a4a Keep alive content processes on bc and dt. r=mrbkap https://hg.mozilla.org/integration/mozilla-inbound/rev/423cf7cd8adc Fix some leaky tests. r=mrbkap https://hg.mozilla.org/integration/mozilla-inbound/rev/0c1f5793e613 Revert reusing processes between bt tests. r=mrbkap

Phil Ringnalda (:philor)

Comment 24

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/37fbc13ab60a https://hg.mozilla.org/mozilla-central/rev/11d5d68a6a4a https://hg.mozilla.org/mozilla-central/rev/423cf7cd8adc https://hg.mozilla.org/mozilla-central/rev/0c1f5793e613

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla52

Ryan VanderMeulen [:RyanVM]

Updated

•

9 years ago

Blocks: 1305606

(not currently active) Ted Mielczarek

Comment 25

•

9 years ago

Thanks for the heads-up (sorry I didn't look at this sooner). This seems like a sensible approach. It actually seems like longer-term we might want to make some variant of this the default behavior--if the user closes the last tab in a process, we should probably keep it around for a while in expectation of them opening a new tab. Is something like that planned?

Flags: needinfo?(ted)

(not currently active) Ted Mielczarek

Updated

•

9 years ago

Assignee: nobody → gkrizsanits

You need to log in before you can comment on or make changes to this bug.