Closed
Bug 1295622
Opened 7 years ago
Closed 7 years ago
e10s crash in IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was
Categories
(Toolkit :: about:memory, defect)
Tracking
()
RESOLVED
FIXED
mozilla51
People
(Reporter: philipp, Assigned: jld)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
1.96 KB,
patch
|
n.nethercote
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-ad715afe-629e-467d-88ed-765e02160816. ============================================================= Crashing Thread (0) Frame Module Signature Source 0 mozglue.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp:33 1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp:434 2 xul.dll mozilla::dom::ContentChild::ProcessingError(mozilla::ipc::HasResultCodes::Result, char const*) dom/ipc/ContentChild.cpp:2359 this signature seems to regress with firefox 49 & it's currently making up ~2% of all content crashes in 49.0b3.
![]() |
||
Updated•7 years ago
|
Crash Signature: [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ] → [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was%20 ]
![]() |
||
Updated•7 years ago
|
Crash Signature: [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was%20 ] → [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ]
![]() |
||
Comment 1•7 years ago
|
||
Accurate crash list: https://crash-stats.mozilla.com/search/?product=Firefox&signature=~Msg_PMemoryReportRequestConstructor&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature Andrew, is there someone on your team who could take a look at this?
Crash Signature: [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ] → [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ]
Component: General → DOM
Flags: needinfo?(overholt)
Updated•7 years ago
|
Component: DOM → about:memory
Product: Core → Toolkit
Comment 3•7 years ago
|
||
I feel like I might have run into this once. I opened my normal set of windows (3 gmails, irc, bugzilla, telegram, twitter, etc) and then ran an about:memory measurement. The browser crashed. The key difference from my usual workflow was that I did not run a minimize before the measurement. Not sure if that helps here.
Assignee | ||
Comment 4•7 years ago
|
||
So this blames PContent::Msg_PMemoryReportRequestConstructor and says the problem is that the handler returned false, which implicates [1]; if this is the non-minimize case, then that might be the “already in progress” check in [2]. I think the problem here is [1] turning an nsresult error into an IPDL false — errors here ought to be reported somehow (add an nsresult to PMemoryReportRequest::__delete__, maybe?), but it shouldn't kill the process. (I had a lot less experience with the IPC framework when I wrote that code.) [1] http://searchfox.org/mozilla-central/rev/b35975ec17c8/dom/ipc/ContentChild.cpp#1070 [2] http://searchfox.org/mozilla-central/rev/b35975ec17c8/xpcom/base/nsMemoryReporterManager.cpp#1807
Assignee | ||
Comment 5•7 years ago
|
||
(With thanks to the Metadata tab on crash-stats, where the ipc_channel_error key has the non-truncated version of the error message.)
Flags: needinfo?(continuation)
Comment 6•7 years ago
|
||
Thanks for looking at this, Jed!
![]() |
||
Updated•7 years ago
|
Looks like there are about 170 crashes on beta 5 in the last week. We could still take a patch for beta.
Assignee | ||
Comment 9•7 years ago
|
||
This is the simple “band-aid” fix. It's not actually tested, because I don't have STR (and it's more complicated than just starting multiple memory reports in succession; the timeout stuff might be involved somehow, and see also bug 1297164 as mentioned by :erahm on IRC the other day), but it ensures that RecvPMemoryReportRequestConstructor can't ever return `false`, which will prevent this crash. Also, I added an NS_WARN_IF so that we get *some* indication (at least on debug builds) that things went wrong even in the minimizeMemoryUsage case, where we're currently not even printing a warning, it looks like. Longer-term the error handling (or lack thereof) of all the memory reporting stuff should be looked at, but a fix at that level would need more time than I currently have (and would probably be difficult and high-risk to uplift, whereas this is very simple and low-risk).
Attachment #8785534 -
Flags: review?(n.nethercote)
![]() |
||
Comment 10•7 years ago
|
||
Comment on attachment 8785534 [details] [diff] [review] bug1295622-remote-memory-report-fail-hg0.diff Review of attachment 8785534 [details] [diff] [review]: ----------------------------------------------------------------- In bug 1297658 I just streamlined (i.e. removed) a lot of the error checking within individual memory reporters because it wasn't useful. I think that's orthogonal to this, but Jed you might be interested in at least looking at that bug's patch. ::: dom/ipc/ContentChild.cpp @@ +1062,5 @@ > } > > + // Bug 1295622: don't kill the process just because this failed. > + NS_WARN_IF(NS_FAILED(rv)); > + return true; The process is killed if this function fails? Really?
Attachment #8785534 -
Flags: review?(n.nethercote) → review+
Assignee | ||
Comment 11•7 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #10) > In bug 1297658 I just streamlined (i.e. removed) a lot of the error checking > within individual memory reporters because it wasn't useful. I think that's > orthogonal to this, but Jed you might be interested in at least looking at > that bug's patch. Thanks for the pointer. > ::: dom/ipc/ContentChild.cpp > @@ +1062,5 @@ > > } > > > > + // Bug 1295622: don't kill the process just because this failed. > > + NS_WARN_IF(NS_FAILED(rv)); > > + return true; > > The process is killed if this function fails? Really? Yes, if it fails by returning false rather than by signaling an error in-band with IPC messages. https://developer.mozilla.org/en-US/docs/Mozilla/IPDL/Tutorial#Shutdown_and_Error_Handling tries to explain. It's been like that for longer than I've been a contributor to this area (as confirmed by the MDN history — which makes me wonder how I managed to miss it when I read through the tutorial the first time), so I don't know the original rationale.
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 12•7 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/c9df0314cff2 Don't crash the content process if a memory report fails. r=njn
Keywords: checkin-needed
Comment 13•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/c9df0314cff2
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Comment 14•7 years ago
|
||
This seems to be a very low-risk patch: could we uplift it to 50 and 49? I almost just filed this bug because personally experience it almost every night if I leave nightly open. This is the kind of thing that might really annoy our dedicated/active userbase.
Flags: needinfo?(jld)
Comment 15•7 years ago
|
||
And FWIW, I didn't ever use about:memory, so this was triggered somehow else.
Assignee | ||
Comment 16•7 years ago
|
||
Comment on attachment 8785534 [details] [diff] [review] bug1295622-remote-memory-report-fail-hg0.diff Approval Request Comment [Feature/regressing bug #]: e10s memory reporting (since bug 946407?) [User impact if declined]: Crashes — and comment #14 reports this can happen without explicitly trying to take a memory report. [Describe test coverage new/current, TreeHerder]: There are e10s memory reporting tests in mochitest-other; the patch has been stable on m-c for a little while. [Risks and why]: Very low; this applies to cases where we would have crashed, and ignores an error which is already being ignored in other cases. [String/UUID change made/needed]: None
Flags: needinfo?(jld)
Attachment #8785534 -
Flags: approval-mozilla-beta?
Attachment #8785534 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 17•7 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #15) > And FWIW, I didn't ever use about:memory, so this was triggered somehow else. That's interesting — a PMemoryReportRequest has to have been created to get to this crash, so *something* was doing a memory report. Looking through the source for things that might be using nsIMemoryReporterManager and/or nsIMemoryInfoDumper to get reports (and that aren't tests or about:memory itself) finds the memory.dump_reports_on_oom pref (off by default, used by the JS engine allocator?) and nsThread::SaveMemoryReportNearOOM -> nsXULAppInfo::SaveMemoryReport.
Comment on attachment 8785534 [details] [diff] [review] bug1295622-remote-memory-report-fail-hg0.diff Looks low risk, should fix a common crash in beta. This should make it into the beta 9 build on Thursday.
Attachment #8785534 -
Flags: approval-mozilla-beta?
Attachment #8785534 -
Flags: approval-mozilla-beta+
Attachment #8785534 -
Flags: approval-mozilla-aurora?
Attachment #8785534 -
Flags: approval-mozilla-aurora+
Comment 19•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/d632eadae7e3
Comment 20•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/e64579bcbcca
You need to log in
before you can comment on or make changes to this bug.
Description
•