Closed Bug 1132499 Opened 5 years ago Closed 4 years ago
Large OOM in NS
_Convert UTF8to UTF16::NS _Convert UTF8to UTF16
42.88 KB, image/png
2.70 KB, patch
|Details | Diff | Splinter Review|
3.07 KB, patch
|Details | Diff | Splinter Review|
This bug was filed from the Socorro interface and is report bp-8be10840-0330-494c-882c-ef0402150212. =============================================================
Component: General → DOM: Core & HTML
Product: Firefox → Core
Trying to allocate a 23MB string for the .value of an input...
bp-edfd0dbf-5bef-4c1e-83e5-96e062150212 12/02/2015 11:12 a.m.
38.0b9 is indeed affected... bp-506ccc5c-92a6-402c-8775-93d702150501 01/05/2015 10:48 a.m.
bp-836aeb44-c13b-4620-ab0a-ecd192150606 06/06/2015 04:37 p.m. Crashing Thread Frame Module Signature Source 0 xul.dll NS_ABORT_OOM(unsigned int) xpcom/base/nsDebugImpl.cpp 1 xul.dll NS_ConvertUTF8toUTF16::NS_ConvertUTF8toUTF16(nsACString_internal const&) xpcom/string/nsString.h
These are coming from different sources; we'll need bug 1130960 in order to sort this out.
Depends on: 1130960
Bah, even after bug 1130960, a bunch of these are still getting the generic signature because of a failed stack walk. From a quick glance the stragglers seem to be nsTextEditorState::GetValue like in comment 1. bz, do you know anyone who could own this? I'm kinda worried it might be a lot of work; the UTF8 converters don't seem to have any notion of fallible currently.
Henri or Simon might have some ideas here.
Hmm. The string code owned by dbaron with bsmedberg as the active peer. I'm not sure either of them would have time offhand either, but worth checking with them to see whom they'd suggest.
(In reply to Olli Pettay [:smaug] from comment #7) > Henri or Simon might have some ideas here. This looks like expected behavior arising from "infallible" malloc. I guess we could re-introduce fallible variants for these conversions and use the fallible variants when persisting form state, since form state allows the user or Web content to cause a large allocation.
Crash Signature: [@ OOM | large | NS_ABORT_OOM(unsigned int) | NS_ConvertUTF8toUTF16::NS_ConvertUTF8toUTF16(nsACString_internal const&)] → [@ OOM | large | NS_ABORT_OOM(unsigned int) | NS_ConvertUTF8toUTF16::NS_ConvertUTF8toUTF16(nsACString_internal const&)] [@ OOM | large | NS_ABORT_OOM | NS_ConvertUTF8toUTF16::NS_ConvertUTF8toUTF16]
like bug 996303?
The view of Firefox menu just before a crash. The crash occurred after "Exit" button was pressed. Crash-report: https://crash-stats.mozilla.com/report/index/21d470ea-105c-449f-813e-6b7212151123
The visual artifacts during HTML rendering before these crashes are the as for crash [@ nsWindow::ClearTranslucentWindow ] https://crash-stats.mozilla.com/report/index/8a6a8e4c-220c-4a47-a261-ea2952151204 Should a separate bug be filed on the latest crash-report?
Ruvim, yes, please file a new report about https://crash-stats.mozilla.com/report/index/8a6a8e4c-220c-4a47-a261-ea2952151204 since that seems to be unrelated issue
¡Hola! This happens on 44 as well. Report ID Date Submitted bp-4363e587-724f-4a90-b2e7-d72fe2151229 28/12/2015 09:03 p.m. ¡Gracias!
Ugh. There are really bad crashes around here. Anything trying to allocate megabytes or more should surely be made fallible and fail gracefully, but this signature has crashes that even go into the hundreds of megabytes - I'm pretty sure that finding that much contigous space will be hard on most systems we run on, see e.g.: bp-a8367b21-5d54-4712-a2d6-7c1572160226 bp-fb776bf8-1aad-4ae0-8a2c-e81442160226 bp-a284e16e-2bf4-453b-b2d0-ddbf62160226 bp-c6814f59-b13e-45c3-a156-04e8a2160226 bp-d43e90c3-bf98-4349-a080-240cc2160225 bp-42cf730a-67b2-4f98-ad5d-2f5ed2160225 And those are only from the first page of results I saw when taking a look. We really need to find out what's up here and do something about it.
Summary: crash in OOM | large | NS_ABORT_OOM(unsigned int) | NS_ConvertUTF8toUTF16::NS_ConvertUTF8toUTF16(nsACString_internal const&) → Large OOM in NS_ConvertUTF8toUTF16::NS_ConvertUTF8toUTF16
Those are ... not useful stacks. :( In particular, they don't indicate which caller of ConvertUTF8toUTF16 would need to be doing a fallible conversion...
(In reply to Boris Zbarsky [:bz] from comment #16) > Those are ... not useful stacks. :( In particular, they don't indicate > which caller of ConvertUTF8toUTF16 would need to be doing a fallible > conversion... Yes, someone will need to load the stacks into WinDBG and find out if that can walk the stacks more usefully than minidump_stackwalk and can get decent stack traces for this.
(In reply to Robert Kaiser (:firstname.lastname@example.org) from comment #17) > (In reply to Boris Zbarsky [:bz] from comment #16) > > Those are ... not useful stacks. :( In particular, they don't indicate > > which caller of ConvertUTF8toUTF16 would need to be doing a fallible > > conversion... > > Yes, someone will need to load the stacks into WinDBG and find out if that > can walk the stacks more usefully than minidump_stackwalk and can get decent > stack traces for this. I will try to do that today if nobody else beats me to it.
For https://crash-stats.mozilla.com/report/index/c6814f59-b13e-45c3-a156-04e8a2160226 , the stack trace is kind of bogus (windbg thinks jemalloc is calling into NS_ConvertUTF8toUTF16), but examining stack memory, it looks like a call from nsTextEditorState::GetValue to NS_ConvertUTF8toUTF16 might have led to the OOM, which seems plausible: https://dxr.mozilla.org/mozilla-central/rev/4ea7408b3eef059aa248f4b00328f8fdb4475112/dom/html/nsTextEditorState.cpp#1921 windbg reports that https://crash-stats.mozilla.com/report/index/a8367b21-5d54-4712-a2d6-7c1572160226 is calling NS_ConvertUTF8toUTF16 from somewhere in the networking code, but nsTextEditorState::GetValue is also living on the stack there. The story is the same for https://crash-stats.mozilla.com/report/index/c679306d-81b7-4f6f-a3bb-a68a92160302 , though I think we're actually coming through FragmentOrElement::SaveSubtreeState, somehow. So I'm going to tentatively say we should fix nsTextEditorState::GetValue to be fallible. Which is unfortunate, as it's currently not. :( And IIUC, fixing it to be fallible means we might not save the users's form state sometimes...which is not so great, but I suppose that's better than crashing. We could just fix the assignment call to be fallible, without propagating that error out (ew)?
In other words, we'd need to do something like this, which has not been tested, but does compile. I have very little confidence in this being correct. WDYT of the overall idea?
Attachment #8726348 - Flags: feedback?(bzbarsky)
Also, why are we storing the value as a CString? It looks like we have to do a conversion into CString and out of...and that's all we do with it. We even store the CString as an allocated object! Are/were we trying to save space or something?
Comment on attachment 8726348 [details] [diff] [review] try to make nsTextEditorState::GetValue fallible Going to defer to ehsan on the editor state stuff...
Attachment #8726348 - Flags: feedback?(bzbarsky) → feedback?(ehsan)
Comment on attachment 8726348 [details] [diff] [review] try to make nsTextEditorState::GetValue fallible Review of attachment 8726348 [details] [diff] [review]: ----------------------------------------------------------------- Spoke with Nathan about this on IRC. It looks like we should first convert nsTextEditorState::mValue to be an nsString or something and see how much of these issues that will avoid. Correctly propagating the error out of GetValue is a huge pain specially since the BindToFrame/UnbindFromFrame machinery is not really designed to be fallible.
Attachment #8726348 - Flags: feedback?(ehsan)
This change is just a minor tidying; we need to distinguish between "have a value" and "don't have a value" in nsTextEditorState::GetValue, but we can do better than heap-allocating the string.
Attachment #8726746 - Flags: review?(ehsan)
Attachment #8726348 - Attachment is obsolete: true
I don't know the full history here; roc asked about this exact thing in bug 534785 when reviewing the creation of nsTextEditorState. The answer then was "historical reasons" (the original code is hg@1)...maybe to try and save some space? Regardless, since the only thing we do here is convert from/to incoming/outgoing nsStrings, which at least sometimes appear to be causing OOMs, we might as well hold it as an nsString all the time. This change will ideally eliminate allocations, as we'll be able to use nsString's buffer sharing underneath the hood.
Attachment #8726747 - Flags: review?(ehsan)
> the original code is hg@1 Don't let that stop you! That said, the char* mValue in nsHTMLInputElement was introduced in bug 34297. The actual patch attached there has it as nsString*, afaict. I have no idea why what landed has char*, and I doubt jst remembers...
Little spike around 50% of increase in 45.0.1 between 2016-03-23 and 2016-03-28.
Liz, this is a crash spike on 45.0.1. Perhaps we should consider uplifting this to Beta 46.
Nathan, what do you think about uplift here? I would want some confirmation that the fix worked and isn't causing other problems, unless you think it's super safe. We are heading into beta 7 so it's a bit late in the cycle.
Flags: needinfo?(lhenry) → needinfo?(nfroyd)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #31) > Nathan, what do you think about uplift here? I would want some confirmation > that the fix worked and isn't causing other problems, unless you think it's > super safe. We are heading into beta 7 so it's a bit late in the cycle. Judging by: https://crash-stats.mozilla.com/report/list?range_unit=days&range_value=28&signature=OOM+%7C+large+%7C+NS_ABORT_OOM+%7C+NS_ConvertUTF8toUTF16%3A%3ANS_ConvertUTF8toUTF16#tab-sigsummary there are no more crashes from this signature in 47 or later, which suggests this patch has done some good. I don't know how to show that there aren't other problems, but I think the patches themselves are straightforward and reasonably safe.
Comment on attachment 8726746 [details] [diff] [review] part 1 - convert nsTextEditorState::mValue to use Maybe instead of heap allocation This request is for both of the patches in this bug, given that they seem to have fixed some OOM crashes. Approval Request Comment [Feature/regressing bug #]: bug 34297, bug 534785, I guess? [User impact if declined]: OOM crashes [Describe test coverage new/current, TreeHerder]: I know we have a fair number of editing-related tests; I don't know that we have any tests that definitively exercise the code these patches modify, as I'm not familiar with this code. I could try to investigate, if need be. [Risks and why]: Low risk; these patches simply remove a memory-allocating conversion. [String/UUID change made/needed]: None.
Attachment #8726746 - Flags: approval-mozilla-beta?
Comment on attachment 8726746 [details] [diff] [review] part 1 - convert nsTextEditorState::mValue to use Maybe instead of heap allocation Should prevent some OOM crashes. OK for uplift for the beta 8 build.
Attachment #8726746 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.