Closed Bug 1431370 Opened 8 years ago Closed 8 years ago

Crash in OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | nsTArray_base<T>::EnsureCapacity<T> | nsTArray_base<T>::InsertSlotsAt<T> | nsTArray_Impl<T>::SetLength<T> | mozilla::safebrowsing::DoRiceDeltaDecode

Categories

(Toolkit :: Safe Browsing, defect, P1)

57 Branch
All
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- disabled
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- fixed
firefox60 --- fixed

People

(Reporter: philipp, Assigned: francois)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is report bp-898cc5c2-de34-4fe2-908e-63c630180117. ============================================================= Top 10 frames of crashing thread: 0 mozglue.dll mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:33 1 mozglue.dll mozalloc_handle_oom memory/mozalloc/mozalloc_oom.cpp:54 2 mozglue.dll moz_xmalloc memory/mozalloc/mozalloc.cpp:86 3 xul.dll nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity<nsTArrayInfallibleAllocator> xpcom/ds/nsTArray-inl.h:136 4 xul.dll nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::InsertSlotsAt<nsTArrayInfallibleAllocator> xpcom/ds/nsTArray-inl.h:289 5 xul.dll nsTArray_Impl<unsigned int, nsTArrayInfallibleAllocator>::SetLength<nsTArrayInfallibleAllocator> xpcom/ds/nsTArray.h:1859 6 xul.dll mozilla::safebrowsing::DoRiceDeltaDecode toolkit/components/url-classifier/ProtocolParser.cpp:1002 7 xul.dll mozilla::safebrowsing::ProtocolParserProtobuf::ProcessEncodedAddition toolkit/components/url-classifier/ProtocolParser.cpp:1025 8 xul.dll mozilla::safebrowsing::ProtocolParserProtobuf::ProcessAdditionOrRemoval toolkit/components/url-classifier/ProtocolParser.cpp:902 9 xul.dll mozilla::safebrowsing::ProtocolParserProtobuf::ProcessOneResponse toolkit/components/url-classifier/ProtocolParser.cpp:866 ============================================================= this out of memory crash signature on windows seems to be introduced with v4 of safebrowsing shipping with firefox. it's a low volume crash after all though.
Assignee: nobody → francois
Status: NEW → ASSIGNED
Priority: -- → P1
Comment on attachment 8946816 [details] Bug 1431370 - Make DoRiceDeltaDecode allocation fallible to fix startup OOM crash. https://reviewboard.mozilla.org/r/216728/#review222724 ::: toolkit/components/url-classifier/ProtocolParser.cpp:1004 (Diff revision 1) > > // Setup the output buffer. The "first value" is included in > // the output buffer. > - aDecoded.SetLength(aEncoding.num_entries() + 1); > + if (!aDecoded.SetLength(aEncoding.num_entries() + 1, mozilla::fallible)) { > + NS_WARNING("Not enough memory to decode the RiceDelta input."); > + return NS_ERROR_OUT_OF_MEMORY; I traced the error reporting up the callstack and found this: https://searchfox.org/mozilla-central/rev/c56f656febb1aa58c72b218699e24a76425bd284/toolkit/components/url-classifier/ProtocolParser.cpp#782 We set mUpdateStatus = NS_ERROR_FAILURE; early on, and then: if (NS_SUCCEEDED(rv)) { mUpdateStatus = rv; } but if any fails we just give a warning *and do not clear the success status*. Amazingly, this appears to be by design: https://searchfox.org/mozilla-central/rev/c56f656febb1aa58c72b218699e24a76425bd284/toolkit/components/url-classifier/ProtocolParser.cpp#766 So we consider an update to be successful even if we failed to decode it almost entirely. Does the rest of the code work correctly given that invariant, or is it exactly the reason why we're having *this* bug now?
Attachment #8946816 - Flags: review?(gpascutto) → review+
Depends on: 1434662
Comment on attachment 8946816 [details] Bug 1431370 - Make DoRiceDeltaDecode allocation fallible to fix startup OOM crash. https://reviewboard.mozilla.org/r/216728/#review222724 > I traced the error reporting up the callstack and found this: > https://searchfox.org/mozilla-central/rev/c56f656febb1aa58c72b218699e24a76425bd284/toolkit/components/url-classifier/ProtocolParser.cpp#782 > > We set mUpdateStatus = NS_ERROR_FAILURE; early on, and then: > if (NS_SUCCEEDED(rv)) { mUpdateStatus = rv; } but if any fails we just give a warning *and do not clear the success status*. > > Amazingly, this appears to be by design: https://searchfox.org/mozilla-central/rev/c56f656febb1aa58c72b218699e24a76425bd284/toolkit/components/url-classifier/ProtocolParser.cpp#766 > > So we consider an update to be successful even if we failed to decode it almost entirely. > > Does the rest of the code work correctly given that invariant, or is it exactly the reason why we're having *this* bug now? I couldn't find any discussion of that design decision on https://reviewboard.mozilla.org/r/69246/ so I filed bug 1434662 to follow up and test that it works as it should.
Pushed by fmarier@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bdb502c30c7c Make DoRiceDeltaDecode allocation fallible to fix startup OOM crash. r=gcp
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Please request Beta approval on this when you get a chance.
Flags: needinfo?(francois)
Comment on attachment 8946816 [details] Bug 1431370 - Make DoRiceDeltaDecode allocation fallible to fix startup OOM crash. Approval Request Comment [Feature/Bug causing the regression]: 1431370 [User impact if declined]: OOM crash on startup [Is this code covered by automated tests?]: No [Has the fix been verified in Nightly?]: Yes, manual test by me. [Needs manual test from QE? If yes, steps to reproduce]: [List of other uplifts needed for the feature/fix]: None [Is the change risky?]: No [Why is the change risky/not risky?]: It's simply making a single memory allocation fallible. [String changes made/needed]: None
Flags: needinfo?(francois)
Attachment #8946816 - Flags: approval-mozilla-beta?
Comment on attachment 8946816 [details] Bug 1431370 - Make DoRiceDeltaDecode allocation fallible to fix startup OOM crash. Fixes a startup crash. Taking for 59b8.
Attachment #8946816 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to François Marier [:francois] from comment #7) > [Feature/Bug causing the regression]: 1431370 > [User impact if declined]: OOM crash on startup > [Is this code covered by automated tests?]: No > [Has the fix been verified in Nightly?]: Yes, manual test by me. > [Needs manual test from QE? If yes, steps to reproduce]: François, is this something QE should be testing manually? If so, could you please share the testcase you mentioned?
Flags: qe-verify?
Flags: needinfo?(francois)
(In reply to Andrei Vaida [:avaida] – please ni? me from comment #10) > François, is this something QE should be testing manually? If so, could you > please share the testcase you mentioned? It's pretty hard to test manually, so I think we can simply keep an eye on the crash reports and see if we're still getting crashes with that signature.
Flags: needinfo?(francois)
Flags: qe-verify? → qe-verify-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: