Closed Bug 1295622 Opened 9 years ago Closed 9 years ago

e10s crash in IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was

Categories

(Toolkit :: about:memory, defect)

49 Branch
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
e10s + ---
firefox49 --- fixed
firefox50 --- fixed
firefox51 --- fixed

People

(Reporter: philipp, Assigned: jld)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is report bp-ad715afe-629e-467d-88ed-765e02160816. ============================================================= Crashing Thread (0) Frame Module Signature Source 0 mozglue.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp:33 1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp:434 2 xul.dll mozilla::dom::ContentChild::ProcessingError(mozilla::ipc::HasResultCodes::Result, char const*) dom/ipc/ContentChild.cpp:2359 this signature seems to regress with firefox 49 & it's currently making up ~2% of all content crashes in 49.0b3.
Crash Signature: [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ] → [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was%20 ]
Crash Signature: [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was%20 ] → [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ]
Crash Signature: [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ] → [@ IPCError-content | (msgtype=0x46000A,name=PContent::Msg_PMemoryReportRequestConstructor) Processing error: message was ]
Component: General → DOM
Flags: needinfo?(overholt)
Are these just OOMs?
Flags: needinfo?(overholt) → needinfo?(continuation)
Component: DOM → about:memory
Product: Core → Toolkit
I feel like I might have run into this once. I opened my normal set of windows (3 gmails, irc, bugzilla, telegram, twitter, etc) and then ran an about:memory measurement. The browser crashed. The key difference from my usual workflow was that I did not run a minimize before the measurement. Not sure if that helps here.
So this blames PContent::Msg_PMemoryReportRequestConstructor and says the problem is that the handler returned false, which implicates [1]; if this is the non-minimize case, then that might be the “already in progress” check in [2]. I think the problem here is [1] turning an nsresult error into an IPDL false — errors here ought to be reported somehow (add an nsresult to PMemoryReportRequest::__delete__, maybe?), but it shouldn't kill the process. (I had a lot less experience with the IPC framework when I wrote that code.) [1] http://searchfox.org/mozilla-central/rev/b35975ec17c8/dom/ipc/ContentChild.cpp#1070 [2] http://searchfox.org/mozilla-central/rev/b35975ec17c8/xpcom/base/nsMemoryReporterManager.cpp#1807
(With thanks to the Metadata tab on crash-stats, where the ipc_channel_error key has the non-truncated version of the error message.)
Flags: needinfo?(continuation)
Thanks for looking at this, Jed!
Tentatively assigning to Jed.
Assignee: nobody → jld
Looks like there are about 170 crashes on beta 5 in the last week. We could still take a patch for beta.
This is the simple “band-aid” fix. It's not actually tested, because I don't have STR (and it's more complicated than just starting multiple memory reports in succession; the timeout stuff might be involved somehow, and see also bug 1297164 as mentioned by :erahm on IRC the other day), but it ensures that RecvPMemoryReportRequestConstructor can't ever return `false`, which will prevent this crash. Also, I added an NS_WARN_IF so that we get *some* indication (at least on debug builds) that things went wrong even in the minimizeMemoryUsage case, where we're currently not even printing a warning, it looks like. Longer-term the error handling (or lack thereof) of all the memory reporting stuff should be looked at, but a fix at that level would need more time than I currently have (and would probably be difficult and high-risk to uplift, whereas this is very simple and low-risk).
Attachment #8785534 - Flags: review?(n.nethercote)
Comment on attachment 8785534 [details] [diff] [review] bug1295622-remote-memory-report-fail-hg0.diff Review of attachment 8785534 [details] [diff] [review]: ----------------------------------------------------------------- In bug 1297658 I just streamlined (i.e. removed) a lot of the error checking within individual memory reporters because it wasn't useful. I think that's orthogonal to this, but Jed you might be interested in at least looking at that bug's patch. ::: dom/ipc/ContentChild.cpp @@ +1062,5 @@ > } > > + // Bug 1295622: don't kill the process just because this failed. > + NS_WARN_IF(NS_FAILED(rv)); > + return true; The process is killed if this function fails? Really?
Attachment #8785534 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #10) > In bug 1297658 I just streamlined (i.e. removed) a lot of the error checking > within individual memory reporters because it wasn't useful. I think that's > orthogonal to this, but Jed you might be interested in at least looking at > that bug's patch. Thanks for the pointer. > ::: dom/ipc/ContentChild.cpp > @@ +1062,5 @@ > > } > > > > + // Bug 1295622: don't kill the process just because this failed. > > + NS_WARN_IF(NS_FAILED(rv)); > > + return true; > > The process is killed if this function fails? Really? Yes, if it fails by returning false rather than by signaling an error in-band with IPC messages. https://developer.mozilla.org/en-US/docs/Mozilla/IPDL/Tutorial#Shutdown_and_Error_Handling tries to explain. It's been like that for longer than I've been a contributor to this area (as confirmed by the MDN history — which makes me wonder how I managed to miss it when I read through the tutorial the first time), so I don't know the original rationale.
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/c9df0314cff2 Don't crash the content process if a memory report fails. r=njn
Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
This seems to be a very low-risk patch: could we uplift it to 50 and 49? I almost just filed this bug because personally experience it almost every night if I leave nightly open. This is the kind of thing that might really annoy our dedicated/active userbase.
Flags: needinfo?(jld)
And FWIW, I didn't ever use about:memory, so this was triggered somehow else.
Comment on attachment 8785534 [details] [diff] [review] bug1295622-remote-memory-report-fail-hg0.diff Approval Request Comment [Feature/regressing bug #]: e10s memory reporting (since bug 946407?) [User impact if declined]: Crashes — and comment #14 reports this can happen without explicitly trying to take a memory report. [Describe test coverage new/current, TreeHerder]: There are e10s memory reporting tests in mochitest-other; the patch has been stable on m-c for a little while. [Risks and why]: Very low; this applies to cases where we would have crashed, and ignores an error which is already being ignored in other cases. [String/UUID change made/needed]: None
Flags: needinfo?(jld)
Attachment #8785534 - Flags: approval-mozilla-beta?
Attachment #8785534 - Flags: approval-mozilla-aurora?
(In reply to Benjamin Smedberg [:bsmedberg] from comment #15) > And FWIW, I didn't ever use about:memory, so this was triggered somehow else. That's interesting — a PMemoryReportRequest has to have been created to get to this crash, so *something* was doing a memory report. Looking through the source for things that might be using nsIMemoryReporterManager and/or nsIMemoryInfoDumper to get reports (and that aren't tests or about:memory itself) finds the memory.dump_reports_on_oom pref (off by default, used by the JS engine allocator?) and nsThread::SaveMemoryReportNearOOM -> nsXULAppInfo::SaveMemoryReport.
Comment on attachment 8785534 [details] [diff] [review] bug1295622-remote-memory-report-fail-hg0.diff Looks low risk, should fix a common crash in beta. This should make it into the beta 9 build on Thursday.
Attachment #8785534 - Flags: approval-mozilla-beta?
Attachment #8785534 - Flags: approval-mozilla-beta+
Attachment #8785534 - Flags: approval-mozilla-aurora?
Attachment #8785534 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: