Closed Bug 1576260 Opened 4 months ago Closed 3 months ago

Crash in [@ OOM | large | NS_ABORT_OOM | nsTSubstring<T>::SetLength] in mozilla::dom::SnappyCompress

Categories

(Core :: Storage: localStorage & sessionStorage, defect, P2, critical)

x86
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- disabled
firefox68 --- disabled
firefox69 --- disabled
firefox70 + fixed
firefox71 --- fixed

People

(Reporter: philipp, Assigned: janv)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-64b5308d-e61c-462d-97e0-453fd0190702.

Top 10 frames of crashing thread:

0 xul.dll NS_ABORT_OOM xpcom/base/nsDebugImpl.cpp:604
1 xul.dll nsTSubstring<char>::SetLength xpcom/string/nsTSubstring.cpp:942
2 xul.dll static bool mozilla::dom::SnappyCompress dom/localstorage/SnappyUtils.cpp:25
3 xul.dll void mozilla::dom::LSValue::LSValue dom/localstorage/LSValue.h:47
4 xul.dll void mozilla::dom::SnapshotWriteOptimizer::Enumerate dom/localstorage/LSSnapshot.cpp:71
5 xul.dll nsresult mozilla::dom::LSSnapshot::Checkpoint dom/localstorage/LSSnapshot.cpp:856
6 xul.dll nsresult mozilla::dom::LSSnapshot::Run dom/localstorage/LSSnapshot.cpp:928
7 xul.dll mozilla::CycleCollectedJSContext::ProcessStableStateQueue xpcom/base/CycleCollectedJSContext.cpp:434
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1283
9 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486

this was a mid to high volume crash signature (~60 reports per day with this stack) on firefox 69 beta until lsng was turned off with bug 1570644, when it basically stopped.

can we look into it before the feature starts shipping to release?

Flags: needinfo?(jvarga)

:philipp, since this bug is a regression, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(madperson)
Flags: needinfo?(madperson)

Looking.

Flags: needinfo?(jvarga)
Priority: -- → P2

Jan discussed with asuth and hsivonen about incremental unicode conversion and investigated incremental reading/writing in Snappy.
He's working on first patches.

Assignee: nobody → jvarga
Status: NEW → ASSIGNED

Incremental conversion and compression is quite complex task and not very suitable for a beta uplift. However, we can fix the crash by switching to fallible string allocations.

Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/85fcb1b22a2a
Make LSValue initialization fallible; r=asuth
Status: ASSIGNED → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71

Comment on attachment 9092666 [details]
Bug 1576260 - Make LSValue initialization fallible; r=asuth

Beta/Release Uplift Approval Request

  • User impact if declined: Users experience content process crashes.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch is not trivial, but the idea is simple. We switched from infallible to fallible memory allocations. This should eliminate crashes related to utf16 to utf8 conversion and snappy compression in new LocalStorage implementation.
  • String changes made/needed: None
Attachment #9092666 - Flags: approval-mozilla-beta?

Comment on attachment 9092666 [details]
Bug 1576260 - Make LSValue initialization fallible; r=asuth

Should help with a high-frequency crash. Approved for 70.0b8.

Attachment #9092666 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

ESR68 is showing a number of crashes which look similar (mozilla::dom::PBackgroundStorageParent::OnMessageReceived(IPC::Message const&) on the stack) - is this something we can backport there also? It grafts cleanly.

Flags: needinfo?(jvarga)

Note that mozilla::dom::PBackgroundStorageParent::OnMessageReceived(IPC::Message const&) is the old LocalStorage implementation. We would have to enable new LocalStorage implementation on that branch to benefit from the patch in this bug. Enabling LSNG have many dependencies, like bug 1563023 and I don't think we are ready to uplift that to ESR68.

Flags: needinfo?(jvarga)

Oh definitely not! Is there anything we can do about these crashes for the old LocalStorage implementation?

It would be too much work. We actually don't work on the old implementation much, only if it's really necessary.

If the crash rate is high and there's an easy boolean all-process-wide method that indicates if memory/address space is tight, it's possible to optionally violate spec and throw. But it would be a hack that would potentially break sites.

You need to log in before you can comment on or make changes to this bug.