Closed
Bug 1349699
Opened 8 years ago
Closed 8 years ago
Add some IPC lifetime assertions
Categories
(Core :: IPC, defect)
Core
IPC
Tracking
()
RESOLVED
FIXED
People
(Reporter: billm, Assigned: billm)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files)
6.43 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
1.17 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
For a long time I've wanted to add some assertions to ensure that IPC channels are closed properly. These patches do that.
Assignee | ||
Comment 1•8 years ago
|
||
Already talked this one over with dvander and got an r+.
Attachment #8850163 -
Flags: review+
Assignee | ||
Comment 2•8 years ago
|
||
This patch tries to make sure that we have closed the Chromium IPC channel before we destroy the mLink object. If we destroy mLink while the channel is still open, then any incoming data will cause us to call virtual methods on mLink, and that will crash.
Attachment #8850164 -
Flags: review?(dvander)
Attachment #8850164 -
Flags: review?(dvander) → review+
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/16e000b3fe0c
Assert when destroying a MessageLoop that a live MessageChannel is attached to (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b9eadd61e26
Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8a50c2d76afc
Fix ASAN builds
Comment 5•8 years ago
|
||
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/a842ddcd299b for being entirely too successful at crashing on Android, https://treeherder.mozilla.org/logviewer.html#?job_id=88487361&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88486735&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487441&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487426&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487430&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487447&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487442&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487435&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88487429&repo=mozilla-inbound https://treeherder.mozilla.org/logviewer.html#?job_id=88492698&repo=mozilla-inbound
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/13f1d3918fe5
Assert when destroying a MessageLoop that a live MessageChannel is attached to (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/5cc766c0864e
Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)
Assignee | ||
Comment 7•8 years ago
|
||
Thanks for the backout. I'll try disabling on Android for now. It looks like it's broken. I filed bug 1353262 about this.
Comment 8•8 years ago
|
||
Wups, backed out again in https://hg.mozilla.org/integration/mozilla-inbound/rev/ebc1182ae0a5 because there were (much, much) later Win8 opt xpcshell crashes too, https://treeherder.mozilla.org/logviewer.html#?job_id=88492147&repo=mozilla-inbound
Assignee | ||
Comment 9•8 years ago
|
||
Can you take a look at the crashes in comment 8, David? The ProtocolName is showing up as PGPUChild. It looks like we're asserting because the compositor thread is shutting down before the PGPUChild channel has been closed. I'm not sure why though.
Flags: needinfo?(dvander)
Comment 10•8 years ago
|
||
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d28d4ab5091b
Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Comment 11•8 years ago
|
||
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/48ad3b64dea6
Revert "Bug 1349699 - Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)"
Depends on: 1334655
Assignee | ||
Comment 12•8 years ago
|
||
David suggested calling Close() unconditionally here:
http://searchfox.org/mozilla-central/source/gfx/ipc/GPUProcessHost.cpp#155-164
Flags: needinfo?(dvander)
Comment 13•8 years ago
|
||
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f26e6ae5421
Assert when destroying a MessageLoop that a live MessageChannel is attached to (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/1b5b065f9a16
Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/7c11fae5ee51
Close GPU channel unconditionally (r=dvander)
Comment 14•8 years ago
|
||
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb79e576b617
Revert "Bug 1349699 - Close GPU channel unconditionally (r=dvander)"
https://hg.mozilla.org/integration/mozilla-inbound/rev/356d65d040d4
Revert "Bug 1349699 - Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)"
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5d165f89e76
Revert "Bug 1349699 - Assert when destroying a MessageLoop that a live MessageChannel is attached to (r=dvander)"
Assignee | ||
Comment 15•8 years ago
|
||
The "Close GPU channel unconditionally" patch seems to regress Android:
https://treeherder.mozilla.org/logviewer.html#?job_id=89705278&repo=mozilla-inbound
Randall, do your patches fix that problem too?
Flags: needinfo?(rbarker)
Comment 16•8 years ago
|
||
I rebased my patch set and applied the patches in comment 13. I removed the android guards and ran the image/test/mochitest/test_discardAnimatedImage.html several times in the 4.3 emulator with out seeing the issues in comment 15. I also ran all the image/test/mochitest test in the emulator and all passed.
Flags: needinfo?(rbarker)
Assignee | ||
Comment 17•8 years ago
|
||
It looks like bug 1354499 fixed the real problem behind the test_discardAnimatedImage.html failure. I'll try to land again.
Comment 18•8 years ago
|
||
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b52ae1c94254
Close GPU channel unconditionally (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/0e4de1c13808
Assert when destroying a MessageLoop that a live MessageChannel is attached to (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/8123d3e11f7d
Assert that the Chromium channel is closed when MessageLoop is destroyed (r=dvander)
Comment 19•8 years ago
|
||
bugherder |
Comment 20•8 years ago
|
||
Do you think we can uplift these patches to beta?
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 21•8 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #20)
> Do you think we can uplift these patches to beta?
Seems a bit risky. We'll at least need to fix the crashes it causes (bug 1356365). Then we'd have to backport those fixes to Aurora.
Flags: needinfo?(wmccloskey)
Assignee | ||
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 22•8 years ago
|
||
A friend of mine bisected a regression on his website to
https://hg.mozilla.org/integration/mozilla-inbound/rev/48ad3b64dea6
from this bug. Which seems strange, since that commit just removes some code. On the other hand it's in ipc code, and the bug is somehow related to his antivirus, which perhaps is relevant to ipc?
Maybe someone will have an idea. Some more details:
1. This is on https://www.kitely.com/
2. When the bug happens, loading the page on Firefox stalls and takes tens of seconds before the page is rendered and loaded, while on other browsers its very snappy.
3. This happens on my friend's machine but not mine. It appears to show up in his google analytics reports, where Firefox load time is 3.6x slower than other browsers, so it's not just a problem on his machine, and he's worried about losing users that are on Firefox.
4. Disabling his antivirus (Avast) gets rid of the bug. (He doesn't have any Firefox plugin or addon from the antivirus, or other suspicious addon or plugin.)
5. The bug is a regression, it used to be fine, and the regression seems to be on the firefox side as older firefox is still ok while newer is broken (so it's not a regression in the antivirus or something else).
6. Nothing odd shows up in the network tab in devtools. Using the profiler, the stall time is reported as spent in XRE_InitChildProcess.
Assignee | ||
Comment 23•8 years ago
|
||
Can you file a separate bug for that issue, Alon? It seems like it should go under "External software affecting Firefox" in Bugzilla. Maybe someone who watches that component will know how to reach out to the AV vendor.
Comment 24•8 years ago
|
||
I'll do that, thanks. I just first wanted to see if there was something obvious I was missing here, since bisection pointed to this bug.
Comment 25•7 years ago
|
||
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
Updated•7 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•