Closed
Bug 1312022
Opened 8 years ago
Closed 8 years ago
Keep alive content processes for bc and dt tests for e10s-multi
Categories
(Core :: General, defect)
Core
General
Tracking
()
RESOLVED
FIXED
mozilla52
Tracking | Status | |
---|---|---|
firefox52 | --- | fixed |
People
(Reporter: gkrizsanits, Assigned: gkrizsanits)
References
Details
(Whiteboard: [e10s-multi:M1])
Attachments
(4 files, 3 obsolete files)
5.07 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
1.44 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
1.16 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
3.78 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
Because of the slow process startup times bc and dt tests time out frequently. Until some serious perf optimization to enable 2 content processes on nightly we'll try to keep alive both content processes for these suits.
Assignee | ||
Comment 1•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=af9d6d6545741dbea4bc53499b2a38883f33a189
Blocks: 1303113
Whiteboard: [e10s-multi:M1]
Assignee | ||
Comment 2•8 years ago
|
||
Assignee | ||
Comment 3•8 years ago
|
||
Assignee | ||
Updated•8 years ago
|
Attachment #8803377 -
Attachment is patch: true
Assignee | ||
Comment 4•8 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #1) > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=af9d6d6545741dbea4bc53499b2a38883f33a189 Wrong try flags and tree is closed :( will try it later...
Assignee | ||
Comment 5•8 years ago
|
||
looks promising: https://treeherder.mozilla.org/#/jobs?repo=try&revision=208987d81c04&selectedJob=29698810 some known issues, and a few interesting crashes that I have to look into...
Assignee | ||
Comment 6•8 years ago
|
||
Seems like the previous patch still had some side effects for the cases where the pref is not set. Trying to be a bit more careful: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f97c638c2872
Assignee | ||
Comment 7•8 years ago
|
||
Attachment #8803376 -
Attachment is obsolete: true
Attachment #8805079 -
Flags: review?(mrbkap)
Assignee | ||
Comment 8•8 years ago
|
||
Attachment #8803377 -
Attachment is obsolete: true
Attachment #8805080 -
Flags: review?(mrbkap)
Assignee | ||
Comment 9•8 years ago
|
||
Sometimes we get a window/docshell leak report, because the last 2 windows and last docshell is destroyed later then we decide to report the leak... Based on bug 1073352 comment 28 (and the following) it seems like it's not a trivial to fix this, on the other hand I don't think it's super important either. If there was an actual leak, since we use this in debug builds, it will show up anyway, if it's just the problem Andrew describes then I'm not even sure it's worth to fix it. Bottom line is, that I wouldn't block on this. (this is not super frequent, a couple of times per platform, example: https://treeherder.mozilla.org/logviewer.html#?job_id=29804952&repo=try#L18749) The patch just does not report the leak if it's the last one or two windows/docshell in the content process. Just in case I want to let Andrew to take a look at this temporary hack as well.
Attachment #8805084 -
Flags: review?(mrbkap)
Attachment #8805084 -
Flags: feedback?(continuation)
Assignee | ||
Comment 10•8 years ago
|
||
Not sure if I should revert this change yet but I have a patch for it.
Comment 11•8 years ago
|
||
Comment on attachment 8805084 [details] [diff] [review] part3: Leak detector hack. v1 Review of attachment 8805084 [details] [diff] [review]: ----------------------------------------------------------------- I'd really like to see some more analysis of what the leak actually is before approving something like this. I fixed a number of these leaks when getting e10s leak checking working, and they were not super difficult to fix. If I understand it correctly, this patch allows us to always leak two of each. If somebody lands a test that permanently leaks 1 or 2, then it'll be green, but at some later point an intermittent leak will show up, when we hit the extra leak from your patches here, which will be difficult to track down. How frequent is "sometimes"? This will ignore leaks of any docshells and windows when there are only 1 or 2 alive, so this could include others, before the last two. Isn't this ignoring these leaks in the parent process, too? Or is there some check I'm missing? As a minor nitpick for the patch, it would be less fragile and easier to understand if you matched against "DOMWINDOW == 0 " instead of just "0 ".
Attachment #8805084 -
Flags: feedback?(continuation) → review-
Comment 12•8 years ago
|
||
> If there was an actual leak, since we use this in debug builds, it will show up anyway,
These kinds of leaks don't show up with the regular shutdown leak checking. The purpose of this particular leak check is that you can have a case where you leak a window forever after closing a tab, but it still gets cleaned up somehow before shutdown.
Assignee | ||
Comment 13•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #11) > Comment on attachment 8805084 [details] [diff] [review] > part3: Leak detector hack. v1 > > Review of attachment 8805084 [details] [diff] [review]: > ----------------------------------------------------------------- > > I'd really like to see some more analysis of what the leak actually is > before approving something like this. I fixed a number of these leaks when > getting e10s leak checking working, and they were not super difficult to fix. > > If I understand it correctly, this patch allows us to always leak two of > each. If somebody lands a test that permanently leaks 1 or 2, then it'll be > green, but at some later point an intermittent leak will show up, when we > hit the extra leak from your patches here, which will be difficult to track > down. How frequent is "sometimes"? After reading your comments I think I misunderstood a bit the purpose of this check and how this all works. The tracking problem could be easily solved by printing something else than an error message, but I can see now why is it important to fix this. (Not to mention that the patch did not do exactly what I intended to do) So the frequency is low enough to make it hard to reproduce it locally (I couldn't so far) but high enough to make it impossible to turn on 2 content processes on nightly. But I'll try to come up with something. Do you have any hints how to hunt down these issues? (In reply to Andrew McCreight [:mccr8] from comment #12) > > If there was an actual leak, since we use this in debug builds, it will show up anyway, > These kinds of leaks don't show up with the regular shutdown leak checking. > The purpose of this particular leak check is that you can have a case where > you leak a window forever after closing a tab, but it still gets cleaned up > somehow before shutdown. Right, it makes sense now.
Assignee | ||
Comment 14•8 years ago
|
||
Comment on attachment 8805084 [details] [diff] [review] part3: Leak detector hack. v1 Review of attachment 8805084 [details] [diff] [review]: ----------------------------------------------------------------- Cancelling r?, until I figure out how to fix this leak.
Attachment #8805084 -
Flags: review?(mrbkap)
Comment 15•8 years ago
|
||
> Do you have any hints how to hunt down these issues?
I think I usually used refcount logging, or DMD's heap scan mode to try to figure out who is holding references to things. If you can't reproduce locally, that will make it more difficult to investigate.
The line after the UNEXPECTED-FAIL prints out information about the leaked windows:
TEST-INFO | toolkit/mozapps/extensions/test/browser/browser_webext_options.js | windows(s) leaked: [pid = 5132] [serial = 207], [pid = 5132] [serial = 205]
You can then search in the log for, say, "[pid = 5132] [serial = 205]" to see exactly where the window was created. That might help...
Comment 16•8 years ago
|
||
One thing that might help reproduce the issue is to make 4 or 5 copies of the test that is leaking, and just run them all in a row.
Assignee | ||
Comment 17•8 years ago
|
||
It turns out that it's just a bunch of leaky tests. I have no idea why they did not show up on the radar sooner, but I guess I was just the unlucky one who's patch gave them the final touch that triggered the leaks to be detected. By leaks I mean we finish the test sooner than we should and that _sometimes_ causing some trouble (usually because incorrect usage of nested generator functions). I think I fixed the most common offenders, the rest might be just some rare intermittent, let's see if more stuff needs to get fixed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=98d8e8d88b3affdd7f54fcb2cb0062835162be32
Attachment #8805084 -
Attachment is obsolete: true
Attachment #8805540 -
Flags: review?(mrbkap)
Assignee | ||
Comment 18•8 years ago
|
||
Comment on attachment 8805085 [details] [diff] [review] rem process reusing code. v1 Review of attachment 8805085 [details] [diff] [review]: ----------------------------------------------------------------- I think there is no reason to leave in this hack with the current setup, let's remove it.
Attachment #8805085 -
Flags: review?(mrbkap)
Comment 19•8 years ago
|
||
Comment on attachment 8805079 [details] [diff] [review] part1: pref to keep alive content processes. v2 Do we shut the processes down on parent process shutdown?
Attachment #8805079 -
Flags: review?(mrbkap) → review+
Comment 20•8 years ago
|
||
Comment on attachment 8805080 [details] [diff] [review] part2: keep alive content processes for bc/dt if e10s-multi is enabled. v2 Do we need a testing peer's review on this?
Attachment #8805080 -
Flags: review?(mrbkap) → review+
Updated•8 years ago
|
Attachment #8805085 -
Flags: review?(mrbkap) → review+
Updated•8 years ago
|
Attachment #8805540 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 21•8 years ago
|
||
(In reply to Blake Kaplan (:mrbkap) from comment #19) > Comment on attachment 8805079 [details] [diff] [review] > part1: pref to keep alive content processes. v2 > > Do we shut the processes down on parent process shutdown? Yes, also our tests would report any lingering processes left behind. (In reply to Blake Kaplan (:mrbkap) from comment #20) > Comment on attachment 8805080 [details] [diff] [review] > part2: keep alive content processes for bc/dt if e10s-multi is enabled. v2 > > Do we need a testing peer's review on this? Good point, I'm sure keeping someone in the loop is important... Ted, we want to enable two content processes on nightly, and since content process startup is still not perfect, and in debug build it's pretty bad even, to avoid timeouts in bc/dt tests we keep the first two content processes alive. Once we optimize the process startup this will be ofc turned off again. The patches here are inactive until dom.ipc.processCount is set to 2 by default, so I'm quite confident landing them, but before we start using this on mc I want to give you a chance to review the approach we're about to take here and raise any concerns you might have with it.
Flags: needinfo?(ted)
Assignee | ||
Comment 22•8 years ago
|
||
I have to update patch 1 a bit to work together with bug 1304140. I talked to smaug on IRC about this, and we agreed that we just never want to keep those special processes alive. try build as it will land on mc (if I didn't break something in the rebase process...): https://treeherder.mozilla.org/#/jobs?repo=try&revision=c19ed80effa5ea988b7b8d0dadbbcde47ed396bd
Depends on: 1304140
Comment 23•8 years ago
|
||
Pushed by gkrizsanits@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/37fbc13ab60a Pref to keep some content processes alive. r=mrbkap https://hg.mozilla.org/integration/mozilla-inbound/rev/11d5d68a6a4a Keep alive content processes on bc and dt. r=mrbkap https://hg.mozilla.org/integration/mozilla-inbound/rev/423cf7cd8adc Fix some leaky tests. r=mrbkap https://hg.mozilla.org/integration/mozilla-inbound/rev/0c1f5793e613 Revert reusing processes between bt tests. r=mrbkap
Comment 24•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/37fbc13ab60a https://hg.mozilla.org/mozilla-central/rev/11d5d68a6a4a https://hg.mozilla.org/mozilla-central/rev/423cf7cd8adc https://hg.mozilla.org/mozilla-central/rev/0c1f5793e613
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Comment 25•8 years ago
|
||
Thanks for the heads-up (sorry I didn't look at this sooner). This seems like a sensible approach. It actually seems like longer-term we might want to make some variant of this the default behavior--if the user closes the last tab in a process, we should probably keep it around for a while in expectation of them opening a new tab. Is something like that planned?
Flags: needinfo?(ted)
Updated•8 years ago
|
Assignee: nobody → gkrizsanits
You need to log in
before you can comment on or make changes to this bug.
Description
•