Closed
Bug 1269036
Opened 9 years ago
Closed 8 years ago
crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition from ABORT: __delete__()d actor
Categories
(Core :: DOM: Content Processes, defect, P1)
Tracking
()
People
(Reporter: dbaron, Assigned: kanru)
References
Details
(Keywords: crash, regression, Whiteboard: e10st?)
Crash Data
Attachments
(2 files)
58 bytes,
text/x-review-board-request
|
smaug
:
review+
ritu
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
|
Details |
58 bytes,
text/x-review-board-request
|
smaug
:
review+
|
Details |
[Tracking Requested - why for this release]:
[Tracking Requested - why for this release]:
This bug was filed from the Socorro interface and is
report bp-2df9207b-3c07-494c-a804-ec4722160429.
=============================================================
Note that this bug's crash signature should change when bug 1269033 is fixed.
Crashes on nightly with the signature mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError started happening on low frequency on nightly starting with build 20160408030212, although given the frequencies it's not certain that it was a regression from the previous nightly:
https://crash-stats.mozilla.com/signature/?product=Firefox&release_channel=nightly&platform=Windows&date=%3E%3D2016-01-01&signature=mozalloc_abort+%7C+NS_DebugBreak+%7C+mozilla%3A%3Aipc%3A%3ALogicError&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=abort_message&_columns=url&page=1
Rank Build id Count %
4 20160408030212 3 8.82 %
1 20160409030219 6 17.65 %
5 20160410030224 3 8.82 %
2 20160411030231 5 14.71 %
6 20160413030239 3 8.82 %
8 20160418030305 2 5.88 %
9 20160420030213 1 2.94 %
7 20160421030302 3 8.82 %
3 20160423030220 5 14.71 %
10 20160424030601 1 2.94 %
11 20160427030215 1 2.94 %
12 20160428030218 1 2.94 %
They've now merged over to aurora, and are showing up with somewhat higher numbers, although that's partly due to a single user crashing many times:
https://crash-stats.mozilla.com/signature/?product=Firefox&release_channel=aurora&platform=Windows&date=%3E%3D2016-01-01&signature=mozalloc_abort+%7C+NS_DebugBreak+%7C+mozilla%3A%3Aipc%3A%3ALogicError&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=abort_message&_columns=url&_columns=install_time&page=1
Rank Build id Count %
1 20160429004052 14 82.35 %
2 20160428004042 3 17.65 %
In all cases the abort message is:
[Parent NNNN] ###!!! ABORT: __delete__()d actor: file c:/builds/moz2_slave/m-cen-w64-ntly-000000000000000/build/src/ipc/glue/ProtocolUtils.cpp, line 377
The top of the stack of the report I linked to is:
Frame Module Signature Source
0 mozglue.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp
1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp
2 xul.dll mozilla::ipc::LogicError(char const*) ipc/glue/ProtocolUtils.cpp
3 xul.dll mozilla::dom::PBrowser::Transition(mozilla::dom::PBrowser::State, mozilla::ipc::Trigger, mozilla::dom::PBrowser::State*) obj-firefox/ipc/ipdl/PBrowser.cpp
4 xul.dll mozilla::dom::PBrowserParent::SendDestroy() obj-firefox/ipc/ipdl/PBrowserParent.cpp
5 xul.dll mozilla::dom::TabParent::DestroyInternal() dom/ipc/TabParent.cpp
6 xul.dll mozilla::dom::TabParent::Destroy() dom/ipc/TabParent.cpp
7 xul.dll nsFrameLoader::DestroyComplete() dom/base/nsFrameLoader.cpp
8 xul.dll mozilla::dom::TabParent::ActorDestroy(mozilla::ipc::IProtocolManager<mozilla::ipc::IProtocol>::ActorDestroyReason) dom/ipc/TabParent.cpp
though I haven't yet verified that the other crashes have the same stack.
I'd say this is borderline for being considered frequent enough to be a topcrash.
A one day regression window for when it started on nightly would be:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b6683e141c47c022598c0caac3ea8ba8c6236d42&tochange=d9b1a9829c8ee2862955043f28183efa07de3d2b
A broader three day window would be:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d9f50aa0a1aaf90499b85c31e0f329b762e80fdd&tochange=d9b1a9829c8ee2862955043f28183efa07de3d2b
since it's not clear that the frequency is high enough that we can be confident in a one day window.
Reporter | ||
Comment 1•9 years ago
|
||
The regression window points to the fact that it was actually just a change in crash signature from:
https://hg.mozilla.org/mozilla-central/rev/0e3178883b51
Reporter | ||
Comment 2•9 years ago
|
||
I'll wait to see what the signatures look like after bug 1269033 before trying to make further progress here.
Comment 3•9 years ago
|
||
Tracking for 48 and 49 - dbaron can you help find an owner for this or do you want to follow up per Comment 2?
Reporter | ||
Comment 5•9 years ago
|
||
The thing I wanted to follow up on was confirming that this was the bulk of the mozilla::ipc::LogicError crashes, which it is. (Although not all.)
Flags: needinfo?(dbaron)
Reporter | ||
Comment 6•9 years ago
|
||
(I filed bug 1269854 on the rest of the crashes.)
Summary: crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError → crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition from ABORT: __delete__()d actor
Updated•9 years ago
|
Flags: needinfo?(jmathies)
Updated•9 years ago
|
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → kchen
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError]
[@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition] → [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError]
[@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition]
[@ mozalloc_abort | Abort | NS_DebugBreak | mozilla::dom::PBrowserParent::SendDestroy]
Updated•8 years ago
|
Whiteboard: e10st?
Updated•8 years ago
|
Updated•8 years ago
|
Version: unspecified → Trunk
Updated•8 years ago
|
status-firefox47:
--- → wontfix
status-firefox50:
--- → affected
Updated•8 years ago
|
Updated•8 years ago
|
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError]
[@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition]
[@ mozalloc_abort | Abort | NS_DebugBreak | mozilla::dom::PBrowserParent::SendDestroy] → [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError]
[@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition]
[@ mozalloc_abort | Abort | NS_DebugBreak | mozilla::dom::PBrowserParent::SendDestroy]
[@ A…
Comment 9•8 years ago
|
||
This is a fairly high volume crash in early beta, so if we come up with a fix, I'm hoping we can uplift it.
Comment 10•8 years ago
|
||
A user reported via Twitter [1] that Firefox crashed with this signature when he clicked a phone number on some address card in Google's search result page. [2]
[1] https://twitter.com/dandelionspiral/status/765121569681125377 (Chinese)
[2] https://crash-stats.mozilla.com/report/index/514feb9d-85df-4688-9e0e-8cecc2160815
Assignee | ||
Comment 11•8 years ago
|
||
Did he/she clicked on the "phone number" in the card? The only clickable part I see is the label part which is a search link.
Assignee | ||
Updated•8 years ago
|
Crash Signature: Abort | __delete__()d actor | mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition] → Abort | __delete__()d actor | mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::dom::PBrowser::Transition]
[@ Abort | __delete__()d actor | mozalloc_abort | NS_DebugBreak | mozilla::dom::PBrowser::Transition]
Comment 12•8 years ago
|
||
Is this related to the Windows option (or plugin/extension of some kind, I can't recall) that tries to make all phone numbers clickable? Philip does this look familiar to you?
Flags: needinfo?(madperson)
Assignee | ||
Comment 13•8 years ago
|
||
I talked to the user in comment 10. The clickable phone number is made by google to enable calling with hangouts. It's a javascript:void(0) link and it will open a new window when you click on it.
I traced the relevant code before and this confirmed by theory that when for some reason the window opener in browser.xul failed to create a new window and attach the remote browser to it, we crash with this signature.
I haven't traced to where we could fail to create a new window but I can make a band-aid patch that stops this crash. It will hide the real problem though.
Comment 14•8 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #12)
> Is this related to the Windows option (or plugin/extension of some kind, I
> can't recall) that tries to make all phone numbers clickable? Philip does
> this look familiar to you?
the skype click-to-play extension doesn't seem to be involved in those kind of crashes, in case you had that in mind.
Flags: needinfo?(madperson)
Comment hidden (mozreview-request) |
Comment 17•8 years ago
|
||
mozreview-review |
Comment on attachment 8784331 [details]
Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
https://reviewboard.mozilla.org/r/73826/#review71820
::: dom/ipc/ContentParent.cpp:5008
(Diff revision 1)
> }
>
> TabParent* newTab = TabParent::GetFrom(aNewTab);
> MOZ_ASSERT(newTab);
>
> + auto destroyNewTabOnError = MakeScopeExit([&] {
(Got to say that this kind of lambdas make the code harder to read, but I see this is very useful here.)
Comment 18•8 years ago
|
||
mozreview-review |
Comment on attachment 8784331 [details]
Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
https://reviewboard.mozilla.org/r/73826/#review71830
Yeah, I think this makes sense. Did you try artificially to make window opening to fail?
Attachment #8784331 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 19•8 years ago
|
||
(In reply to Olli Pettay [:smaug] (vacation Aug 25-28) from comment #18)
> Comment on attachment 8784331 [details]
> Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
>
> https://reviewboard.mozilla.org/r/73826/#review71830
>
> Yeah, I think this makes sense. Did you try artificially to make window
> opening to fail?
Yes, I tried to make OpenWindowWithTabParent fail.
Comment 21•8 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/7f1978918cb1
Never call PBrowserChild::Send__delete__ in content process. r=smaug
Keywords: checkin-needed
Comment 22•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox51:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Comment 23•8 years ago
|
||
hi, this crash is causing 0.5% of browser crashes on 49.0b at the moment. do you think the patch would be upliftable in terms of scope/risk?
Flags: needinfo?(kchen)
Assignee | ||
Comment 24•8 years ago
|
||
(In reply to [:philipp] from comment #23)
> hi, this crash is causing 0.5% of browser crashes on 49.0b at the moment. do
> you think the patch would be upliftable in terms of scope/risk?
I think it's rather safe to uplift.
Flags: needinfo?(kchen)
Assignee | ||
Comment 25•8 years ago
|
||
Comment on attachment 8784331 [details]
Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
Approval Request Comment
[Feature/regressing bug #]: n/a
[User impact if declined]: user may experience browser crashes when websites try to open a new window
[Describe test coverage new/current, TreeHerder]: landed on m-c, baked on Nightly for several days
[Risks and why]: This patch moves TabParent deallocation from the content process to the parent process. If the TabParent is not deallocated correctly it will cause small amount leak. I've tested the fail to open window scenario manually and confirmed the TabParent is correctly destroyed so I think the risk of leak is small.
[String/UUID change made/needed]: n/a
Attachment #8784331 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 27•8 years ago
|
||
Comment on attachment 8784331 [details]
Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
Approval Request Comment: see comment 25
Flags: needinfo?(kchen)
Attachment #8784331 -
Flags: approval-mozilla-beta?
Comment on attachment 8784331 [details]
Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
The 4th signature shows a number of occurrences on Beta49 and since the fix landed in Nightly, there are no reports of this crash. Let's uplift to Aurora50.
Attachment #8784331 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 29•8 years ago
|
||
bugherder uplift |
Comment 30•8 years ago
|
||
Comment on attachment 8784331 [details]
Bug 1269036 - Never call PBrowserChild::Send__delete__ in content process.
Fix for top 20 browser crash in beta, let's uplift for beta 9.
Attachment #8784331 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 31•8 years ago
|
||
bugherder uplift |
Comment 32•8 years ago
|
||
Backed out from Beta for causing Linux timeouts in test_HighPriority.html and subsequent tests.
https://treeherder.mozilla.org/logviewer.html#?job_id=1568007&repo=mozilla-beta
https://hg.mozilla.org/releases/mozilla-beta/rev/089c3a6535cb
Flags: needinfo?(kchen)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 34•8 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #32)
> Backed out from Beta for causing Linux timeouts in test_HighPriority.html
> and subsequent tests.
> https://treeherder.mozilla.org/logviewer.html#?job_id=1568007&repo=mozilla-
> beta
>
> https://hg.mozilla.org/releases/mozilla-beta/rev/089c3a6535cb
We also need to destroy TabParent properly for mozbrowser iframes. This shouldn't affect beta user because mozbrowser is not exposed to content. IMO we should reland the fixes to beta after the tests are fixed.
Flags: needinfo?(kchen)
Assignee | ||
Comment 35•8 years ago
|
||
Comment 36•8 years ago
|
||
mozreview-review |
Comment on attachment 8787547 [details]
Bug 1269036 - Also destroy TabParent properly in case of mozbrowser.
https://reviewboard.mozilla.org/r/76264/#review74410
I assume you went through all the cases when opening window might fail.
Attachment #8787547 -
Flags: review?(bugs) → review+
Comment 37•8 years ago
|
||
FYI Firefox 49 goes to release candidate builds next week. We'll need to weigh risks of re-landing on 49.
Comment 38•8 years ago
|
||
Fairly high number of crashes, I'm ok to re-land this on beta if you can do it before the beta to release merge. Kan-ru can you land it? If not, hopefully a sheriff will see this Monday morning .
Flags: needinfo?(kchen)
Comment 39•8 years ago
|
||
e10s only bug. With beta 50% and release 100%; a basic assumption is double the crashes if not fixed for release.
Comment 40•8 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d75552d5fa0d
Also destroy TabParent properly in case of mozbrowser. r=smaug
Comment 41•8 years ago
|
||
bugherder uplift |
Updated•8 years ago
|
Flags: needinfo?(kchen)
Comment 42•8 years ago
|
||
bugherder |
You need to log in
before you can comment on or make changes to this bug.
Description
•