Closed
Bug 1037214
Opened 11 years ago
Closed 11 years ago
crash in mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::dom::FragmentOrElement::GetTextContentInternal(nsAString_internal&)
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
VERIFIED
FIXED
mozilla34
People
(Reporter: u279076, Assigned: Dexter)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file, 2 obsolete files)
10.96 KB,
patch
|
Sylvestre
:
approval-mozilla-aurora+
lmandel
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-8f3bc16c-96cb-45a9-90b7-be9ef2140704.
=============================================================
0 mozalloc.dll mozalloc_abort(char const * const) memory/mozalloc/mozalloc_abort.cpp
1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp
2 xul.dll mozilla::dom::FragmentOrElement::GetTextContentInternal(nsAString_internal &) content/base/src/FragmentOrElement.cpp
3 xul.dll mozilla::dom::NodeBinding::get_textContent obj-firefox/dom/bindings/NodeBinding.cpp
4 xul.dll mozilla::dom::GenericBindingGetter(JSContext *,unsigned int,JS::Value *) dom/bindings/BindingUtils.cpp
5 mozjs.dll js::Invoke(JSContext *,JS::CallArgs,js::MaybeConstruct) js/src/vm/Interpreter.cpp
6 xul.dll xpc::XrayWrapper<js::CrossCompartmentWrapper,xpc::DOMXrayTraits>::getPropertyDescriptor(JSContext *,JS::Handle<JSObject *>,JS::Handle<jsid>,JS::MutableHandle<JSPropertyDescriptor>) js/xpconnect/wrappers/XrayWrapper.cpp
7 xul.dll xpc::XrayWrapper<js::CrossCompartmentWrapper,xpc::XPCWrappedNativeXrayTraits>::get(JSContext *,JS::Handle<JSObject *>,JS::Handle<JSObject *>,JS::Handle<jsid>,JS::MutableHandle<JS::Value>) js/xpconnect/wrappers/XrayWrapper.cpp
8 mozjs.dll js::jit::DoGetPropFallback js/src/jit/BaselineIC.cpp
9 @0x81000000
=============================================================
More reports:
https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort%28char%20const*%20const%29%20|%20NS_DebugBreak%20|%20mozilla%3A%3Adom%3A%3AFragmentOrElement%3A%3AGetTextContentInternal%28nsAString_internal%26%29
This looks like a new crash in Firefox 31. I'm only seeing reports for Firefox 31 Beta, 32 Aurora, and 33 Nightly. Most of these crashes appear to be on Windows 7 32-bit. Two comments mention opening a new tab. The top URL is facebook.com but that might be a red herring.
It's not really high volume right now (it doesn't even show up in the top 300). I'm only reporting it because it would seem to be a Firefox 31 regression and has a fairly high explosiveness rating.
https://crash-analysis.mozilla.com/rkaiser/2014-07-09/2014-07-09.firefox.31.explosiveness.html
Comment 1•11 years ago
|
||
The code is:
if(!nsContentUtils::GetNodeTextContent(this, true, aTextContent))
NS_RUNTIMEABORT("OOM");
and we're hitting the NS_RUNTIMEABORT.
We used to just silently return the wrong textContent instead, before the changes in bug 950076.
We should probably throw OOM to script here instead of aborting the process.
Blocks: 950076
Updated•11 years ago
|
Comment 2•11 years ago
|
||
Hmm, outside of the actual fix here, if this is an OOM case, we should see to crash with an abort signature if we crash at all. I see this will probably fixed to not actually crash (yay) but do we have a similar pattern elsewhere (doing an OOM abort that doesn't get OOM signatures)? If so, we should see to change those places (ask bsmedberg what to do so we correctly detect the OOM in crash-stats).
Updated•11 years ago
|
Assignee | ||
Comment 3•11 years ago
|
||
It might be worthwhile discussing this with :jst, since he suggested to abort instead of returned things in a broken state in the first place ( https://bugzilla.mozilla.org/show_bug.cgi?id=950076#c22 )
(In reply to Boris Zbarsky [:bz] from comment #1)
> The code is:
>
> if(!nsContentUtils::GetNodeTextContent(this, true, aTextContent))
> NS_RUNTIMEABORT("OOM");
>
> and we're hitting the NS_RUNTIMEABORT.
>
> We used to just silently return the wrong textContent instead, before the
> changes in bug 950076.
>
> We should probably throw OOM to script here instead of aborting the process.
Comment 5•11 years ago
|
||
Actually, I just looked at what jst suggested. He suggested aborting in the nsXBLPrototypeBinding code, not in general, in bug 950076 comment 22.
In cases when it's being called with script on the stack, as here, the right thing is to throw an exception.
Flags: needinfo?(jst) → needinfo?(alessio.placitelli)
Assignee | ||
Comment 6•11 years ago
|
||
I see. I'm sorry, I completely misunderstood :jst's suggestion. I can take care of the fix if you wish! What do you mean by throwing OOM to the script? Can you provide me with an example?
Robert, the error propagation part of the patch in bug 950076 (https://bugzilla.mozilla.org/attachment.cgi?id=8394385) adds a NS_RUNTIMEABORT("OOM") in a couple more places. Should I ping :bsmedberg to understand what to do so we correctly detect the OOM in crash-stats?
Flags: needinfo?(alessio.placitelli)
Comment 7•11 years ago
|
||
(In reply to Alessio Placitelli [:Dexter] from comment #6)
> Robert, the error propagation part of the patch in bug 950076
> (https://bugzilla.mozilla.org/attachment.cgi?id=8394385) adds a
> NS_RUNTIMEABORT("OOM") in a couple more places. Should I ping :bsmedberg to
> understand what to do so we correctly detect the OOM in crash-stats?
Yes, I think something should be done there that actually 1) reports some OOM signature, 2) reports the size of the failing allocation. That said, Benjamin is out atm, but :dmajor should be able to help as well. David?
Flags: needinfo?(dmajor)
Comment 8•11 years ago
|
||
> What do you mean by throwing OOM to the script? Can you provide me with an example?
Sure. You mark the attrbute as [Throws] in Node.webidl (it's already SetterThrows, which you'll want to remove; [Throws] means both getter and setter throw).
Then you add an ErrorResult& argument to nsINode::GetTextContent and nsINode::GetTextContentInternal. In the latter, you do aResult.Throw(NS_ERROR_OUT_OF_MEMORY) in the failure case. And then you fix any compile errors that arise from internal callers who need to pass in an ErrorResult now.
Assignee | ||
Comment 9•11 years ago
|
||
Thanks Boris and Robert for your replies.
@Boris, did you mean something along the line of this patch?
@Robert, perfect, I'll wait for :dmajor to come by and leave a comment then :)
Attachment #8459301 -
Flags: feedback?(bzbarsky)
Comment 10•11 years ago
|
||
Comment on attachment 8459301 [details] [diff] [review]
bug_1037214.patch
Yep, though you don't need the "mozilla::" prefix in nsGenericHTMLElement::GetItemValueText.
r=me
Attachment #8459301 -
Flags: review+
Attachment #8459301 -
Flags: feedback?(bzbarsky)
Attachment #8459301 -
Flags: feedback+
Comment 11•11 years ago
|
||
I've only skimmed this bug and 950076, but I think the concern from the crash-stats team is about making sure we can properly categorize these in crash reports. Generally the options for error handling are, from best to worst:
1. Handle the OOM gracefully and safely
2. Crash very close to the failure, with an OOM signature and a size
3. Crash in some other way
These NS_RUNTIMEABORT("OOM") are a case of #3. An improvement would be NS_ABORT_OOM(size) but only if you know the size of the failed allocation.
If you have a pattern where the code doing the null-check is several stack frames away from the allocator, then your size information is probably lost. Or, said another way: the decision to crash should be made at a point where you still know the size. Any code above that should pass down its preference as a fallible_t parameter.
Flags: needinfo?(dmajor)
Assignee | ||
Comment 12•11 years ago
|
||
Thanks for the review Boris. As for the "mozilla::" prefix, I just kept it to be locally consistent with the one in nsGenericHTMLElement::SetItemValueText. Do you think it would be better to remove it (from nsGenericHTMLElement::GetItemValueText ?).
(In reply to Boris Zbarsky [:bz] from comment #10)
> Comment on attachment 8459301 [details] [diff] [review]
> bug_1037214.patch
>
> Yep, though you don't need the "mozilla::" prefix in
> nsGenericHTMLElement::GetItemValueText.
>
> r=me
Comment 13•11 years ago
|
||
ni(bz) for comment 12.
Does Alessio's patch address your concerns in comment 11, David?
Assignee: nobody → alessio.placitelli
Flags: needinfo?(dmajor)
Flags: needinfo?(bzbarsky)
Comment 14•11 years ago
|
||
Probably not, since there are other GetNodeTextContent callers not affected by this patch that abort if it fails.
Flags: needinfo?(bzbarsky)
Comment 15•11 years ago
|
||
Alessio, please post a link to a try run and if it's all green, set the checkin-needed keyword.
Flags: needinfo?(alessio.placitelli)
Comment 16•11 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #13)
> Does Alessio's patch address your concerns in comment 11, David?
This patch is an improvement over the current tree, in that it removes one NS_RUNTIMEABORT("OOM").
I don't like that there are still a bunch of other aborts from bug 950076 still remaining. Those other aborts should either be handled or replaced with NS_ABORT_OOM(size). If that's better done in another bug, I'd be happy to open it. Or if you tell me that the others are so rare that I'll never see them in reports, I'd be fine with that as well.
Flags: needinfo?(dmajor)
Assignee | ||
Comment 17•11 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #15)
> Alessio, please post a link to a try run and if it's all green, set the
> checkin-needed keyword.
Andrew, I'm afraid there's still work that needs to be done!
Flags: needinfo?(alessio.placitelli)
Assignee | ||
Comment 18•11 years ago
|
||
How can I know if there's a script on the stack or not? Or, in other words, is there any other place where the applied by the attached patch would be useful?
Unfortunately I'm pretty much new in this area, so I'm not quite sure what's the best way to tackle the other |NS_RUNTIMEABORT("OOM")| without introducing other bugs. Does anybody have any suggestion?
Would it make sense to have two versions of |nsContentUtils::GetNodeTextContent|, one which is fallible and the other which is not? (i.e., we need to add the infallible version).
Flags: needinfo?(bzbarsky)
Comment 19•11 years ago
|
||
> How can I know if there's a script on the stack or not?
I don't think you should care.
The relevant question is more "is there a sane thing the caller can do if I return a failure here?" Script can: it can throw. For other callers you have to look at it on a case-by-case basis.
> I'm not quite sure what's the best way to tackle the other |NS_RUNTIMEABORT("OOM")|
I would claim the best way is in a separate bug. .textContent is likely to be far and away the most likely way this code is called and the most likely way for it to end up with a large string that triggers OOM.
> Would it make sense to have two versions of |nsContentUtils::GetNodeTextContent|
That really depends on the callers. We should look through them and decide what makes sense for them, but in a separate bug.
Flags: needinfo?(bzbarsky)
Comment 20•11 years ago
|
||
> I just kept it to be locally consistent with the one in
> nsGenericHTMLElement::SetItemValueText
I don't think the prefix should be there either.
Comment 21•11 years ago
|
||
I did a try push without that prefix (but on top of some of my local patches that I also wanted to test) at https://tbpl.mozilla.org/?tree=Try&rev=5a9308fe2b00
Comment 22•11 years ago
|
||
Looks like the patch needs to also fix the caller in accessible/mac/mozHTMLAccessible.mm
Assignee | ||
Comment 23•11 years ago
|
||
This takes care of the problem in accessible/mac/mozHTMLAccessible.mm.
This is a new try run, I just enabled the tests for Mac OSX since the others were all green in the previous test run and I just modified Mac code here:
https://tbpl.mozilla.org/?tree=Try&rev=adb15dd886ce
Attachment #8459301 -
Attachment is obsolete: true
Assignee | ||
Comment 24•11 years ago
|
||
(In reply to On vacation Aug 5-18. Please do not request review. from comment #19)
> > I'm not quite sure what's the best way to tackle the other |NS_RUNTIMEABORT("OOM")|
>
> I would claim the best way is in a separate bug.
Thanks, I've just opened a new bug to discuss about how to fix the other aborts (Bug 1048012). Does that mean I can flag this bug as checkin-needed?
Flags: needinfo?(bzbarsky)
Comment 26•11 years ago
|
||
Looks like it has xpcshell failures?
https://tbpl.mozilla.org/php/getParsedLog.php?id=45140119&tree=Try
Keywords: checkin-needed
Assignee | ||
Comment 27•11 years ago
|
||
As per IRC chat with :RyanVM, I'm flagging this again as checkin-needed.
This new patch contains no changes, it was just rebased against latest m-c. The test was failing due to bug 1045421.
Attachment #8466778 -
Attachment is obsolete: true
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 28•11 years ago
|
||
The relative try run: https://tbpl.mozilla.org/?tree=Try&rev=bfc7f06a9833
Comment 29•11 years ago
|
||
Keywords: checkin-needed
Comment 30•11 years ago
|
||
Follow-up for non-unified bustage:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf5d28075da4
https://hg.mozilla.org/mozilla-central/rev/4361cb2e3378
https://hg.mozilla.org/mozilla-central/rev/cf5d28075da4
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Updated•11 years ago
|
status-firefox34:
--- → affected
Reporter | ||
Comment 32•11 years ago
|
||
There are zero crashes reported with Firefox 34.0 built later than 2014-08-05.
Status: RESOLVED → VERIFIED
Comment 33•10 years ago
|
||
Alessio, can we get an uplift request for aurora (33) and beta (32)? Thanks
Flags: needinfo?(alessio.placitelli)
Assignee | ||
Comment 34•10 years ago
|
||
Comment on attachment 8467740 [details] [diff] [review]
bug_1037214.patch
Approval Request Comment
[Feature/regressing bug #]: bug 950076
[User impact if declined]: The handled OOM condition would make Firefox crash with the wrong crash signature.
[Describe test coverage new/current, TBPL]: This landed on m-c almost two weeks ago, no problems have been raised about it.
[Risks and why]: None
[String/UUID change made/needed]: None
Attachment #8467740 -
Flags: approval-mozilla-beta?
Attachment #8467740 -
Flags: approval-mozilla-aurora?
Flags: needinfo?(alessio.placitelli)
Updated•10 years ago
|
Attachment #8467740 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 35•10 years ago
|
||
Comment on attachment 8467740 [details] [diff] [review]
bug_1037214.patch
beta+
Attachment #8467740 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 36•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/f0bdcf487ac4
https://hg.mozilla.org/releases/mozilla-beta/rev/353ade05d903
Is this bad enough to warrant esr31 consideration as well?
Reporter | ||
Comment 37•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #36)
> https://hg.mozilla.org/releases/mozilla-aurora/rev/f0bdcf487ac4
> https://hg.mozilla.org/releases/mozilla-beta/rev/353ade05d903
>
> Is this bad enough to warrant esr31 consideration as well?
I don't think so. Based on crashstats, there are 17 reports of this signature for Firefox 31.0esr in the last week, ranking somewhere around #364.
Comment 38•10 years ago
|
||
No crashes on Firefox 32 beta or Aurora 33.0a2 later than 2014-08-17, marking as verified based on crash-stats.
Comment 39•10 years ago
|
||
This does not meet ESR landing criteria.
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•