Closed
Bug 1475218
Opened 6 years ago
Closed 6 years ago
Crash in OOM | large | NS_ABORT_OOM | nsTSubstring<T>::SetLength with local storage on Fennec
Categories
(Core :: DOM: Core & HTML, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla63
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox-esr60 | --- | unaffected |
firefox61 | --- | wontfix |
firefox62 | + | fixed |
firefox63 | --- | fixed |
People
(Reporter: marcia, Assigned: janv)
References
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(1 file)
4.03 KB,
patch
|
asuth
:
review+
RyanVM
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-d916dd12-3690-434d-a774-6039b0180712.
=============================================================
Seen while looking at Fennec release crash stats: https://bit.ly/2NJvHvq. Rising crash in 61.0 that has moved into the #11 slot. So far it appears 62/63 are not affected. 1153 crashes/517 installs so far.
No useful comments. Correlations show as follows:
(100.0% in signature vs 04.64% overall) moz_crash_reason = MOZ_CRASH(OOM)
(100.0% in signature vs 40.37% overall) address = 0x0
(100.0% in signature vs 74.72% overall) reason = SIGSEGV
(43.76% in signature vs 16.19% overall) Module "AudioFlinger::Client (deleted)" = true
(90.15% in signature vs 65.30% overall) Module "libmozavcodec.so" = true
(90.15% in signature vs 65.30% overall) Module "libmozavutil.so" = true
Top 10 frames of crashing thread:
0 libxul.so NS_ABORT_OOM xpcom/base/nsDebugImpl.cpp:614
1 libxul.so nsTSubstring<char16_t>::SetLength xpcom/string/nsTSubstring.cpp:814
2 libxul.so IPC::ParamTraits<nsTSubstring<char16_t> >::Read ipc/glue/IPCMessageUtils.h:460
3 libxul.so mozilla::ipc::PBackgroundParent::OnMessageReceived ipc/ipdl/PBackgroundParent.cpp:1428
4 libmozglue.so RedBlackTree<extent_node_t, ExtentTreeSzTrait>::Insert memory/build/rb.h:394
5 libmozglue.so RedBlackTree<arena_chunk_map_t, ArenaAvailTreeTrait>::MoveRedRight
6 libmozglue.so RedBlackTree<arena_chunk_map_t, ArenaAvailTreeTrait>::LeanRight
7 libmozglue.so RedBlackTree<arena_chunk_map_t, ArenaAvailTreeTrait>::Insert memory/build/rb.h:380
8 libxul.so wcsrtombs
9 libxul.so wcsrtombs
=============================================================
Comment 1•6 years ago
|
||
Those stacks are iffy, but they seem to be going through local storage [1] deserialization. We might want to add `SetLength` to our skip list as it's a pretty generic OOM.
[1] https://crash-stats.mozilla.com/sources/highlight/?url=https://gecko-generated-sources.s3.amazonaws.com/5a57e7442378a146debe7bfd25911e7fa92fb69450d1c02295bd7f388511dbeb36d2426eaf83b7118a896c114089ee94f5023d5d74092ecae47da153d1c8561a/ipc/ipdl/PBackgroundParent.cpp#L-1428
Component: XPCOM → DOM
Updated•6 years ago
|
Priority: -- → P3
Comment 2•6 years ago
|
||
This sounds very similar to bug 1462162, where heavy local storage activity ends up using a lot of memory in this deserialization code. It is a little different, in that there is no preallocated process involved, because it is Fennec.
Jan, will bug 1462162 help with the non-e10s case?
Updated•6 years ago
|
Flags: needinfo?(jvarga)
Assignee | ||
Comment 3•6 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> This sounds very similar to bug 1462162, where heavy local storage activity
> ends up using a lot of memory in this deserialization code. It is a little
> different, in that there is no preallocated process involved, because it is
> Fennec.
>
> Jan, will bug 1462162 help with the non-e10s case?
My patch optimizes broadcasting of local storage changes across multiple content processes in the e10-multi case (not just the preallocated process).
non-e10s case is not handled in the patch.
However, I think we could do something special for Fennec, since if there are no multiple content processes, we don't need to synchronize local storage caches at all, there's just one cache.
Flags: needinfo?(jvarga)
Updated•6 years ago
|
See Also: → 1462162
Summary: Crash in OOM | large | NS_ABORT_OOM | nsTSubstring<T>::SetLength → Crash in OOM | large | NS_ABORT_OOM | nsTSubstring<T>::SetLength with local storage on Fennec
Reporter | ||
Comment 4•6 years ago
|
||
Updated the status flags since there are a handful of crashes in 62 and at least one recently in 63.
Updated•6 years ago
|
Reporter | ||
Comment 5•6 years ago
|
||
In Fennec 62b9 this is the #3 browser top crash.
Comment 6•6 years ago
|
||
Jan, can you take another look here or help find someone to investigate? This crash signature has increased a lot in recent betas.
tracking-firefox62:
--- → +
Flags: needinfo?(jvarga)
Assignee | ||
Comment 7•6 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #6)
> Jan, can you take another look here or help find someone to investigate?
> This crash signature has increased a lot in recent betas.
I just requested a beta approval in bug 1462162. It has to land on the 62 branch first, then we will see if it fixes this crash too. If it doesn't fix the crash, I'll add additional optimizations for Fennec.
Flags: needinfo?(jvarga)
Assignee | ||
Comment 8•6 years ago
|
||
Andrew, since Fennec doesn't use e10s, should we just avoid creating PBackgroundLocalStorageCache actors completely in that case ?
There can be only one Local Storage Cache, so there's nothing to synchronize.
Flags: needinfo?(bugmail)
Comment 9•6 years ago
|
||
Agreed that in Fennec the broadcast mechanism is just overhead and it makes sense to disable the broadcasts/broadcast mechanism.
That said, the allocation sizes here that are failing are "just" 1.2M and 2.2M and the signatures here also are showing up under case PBackgroundStorage::Msg_AsyncUpdateItem__ID, and that's one we can't be rid of (because that's the actual LocalStorageg storage command, not the redundant broadcast). If this is one of those cases :janv observed like the vidyo web beta concatenating a log and repeatedly storing it in localstorage, it's possible the content side allocations are under control thanks to string ropes just making many small allocations.
Given armeabi-v7a is a 32-bit arch, maybe we should consider some kind of proactive intervention (https://github.com/WICG/interventions) for large LocalStorage values. Maybe only when the address space is "at risk". Or maybe we should consider it simply because, as in bugs like bug 857888, we don't expect devices running Fennec to have the same level of available storage and so storing huge values is mis-use. In any event, I'll try and factor this into my review of next-gen localstorage... it's likely the situation for fennec got worse once we switched to unified LocalStorage handling via PBackground...
Flags: needinfo?(bugmail)
Comment 10•6 years ago
|
||
Andrew are you taking this on? Or maybe Jan?
It's still the #4 top crash in beta 62 so I expect it will look pretty bad on release. On release 61 we can see around 16000 crashes per week, which is very high.
Comment 11•6 years ago
|
||
Assigning to Jan because I know he'd much rather focus on the reviews I owe him. :) Leaving needinfo up for next steps.
Assignee: nobody → jvarga
Status: NEW → ASSIGNED
Flags: needinfo?(bugmail)
Assignee | ||
Comment 12•6 years ago
|
||
Agreed.
Assignee | ||
Comment 13•6 years ago
|
||
Andrew, this patch just disables PBackgroundLocalStorageCache on Android. I think it should be enough for now. We'll see how it affects number of OOM crashes, then we can decide what to do next.
try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=07e0bc5ecfc5fd36b90eeb3c109987a1504391da
Flags: needinfo?(jvarga)
Attachment #8998890 -
Flags: review?(bugmail)
Reporter | ||
Comment 14•6 years ago
|
||
Hello Andrew and Jan - We are getting later in the 62 cycle. Are we going to try to land this to see if it improves the crash situation before 62 goes to release? Thanks.
Updated•6 years ago
|
Attachment #8998890 -
Flags: review?(bugmail) → review+
Comment 15•6 years ago
|
||
Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2181b2ee1fc
Disable PBackgroundLocalStorageCache on Android; r=asuth
Assignee | ||
Updated•6 years ago
|
Flags: needinfo?(jvarga)
Comment 16•6 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Comment 17•6 years ago
|
||
Please request Beta approval on this when you get a chance.
status-firefox-esr52:
--- → unaffected
status-firefox-esr60:
--- → unaffected
Assignee | ||
Comment 18•6 years ago
|
||
Ok, looks like nightly is now getting less crashes, going to request beta approval.
Flags: needinfo?(jvarga)
Assignee | ||
Comment 19•6 years ago
|
||
Comment on attachment 8998890 [details] [diff] [review]
patch
Approval Request Comment
[Feature/Bug causing the regression]: bug 1285898
[User impact if declined]: Fennec users may still experience crashes related to LocalStorage.
[Is this code covered by automated tests?]: Yes (mochitests and wpt tests)
[Has the fix been verified in Nightly?]: Yes (and it seems number of crashes is declining in Nightly)
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No (very low risk).
[Why is the change risky/not risky?]: Changes are small, only affect Fennec and landed on m-c almost a week ago.
[String changes made/needed]: None.
Attachment #8998890 -
Flags: approval-mozilla-beta?
Comment 20•6 years ago
|
||
Comment on attachment 8998890 [details] [diff] [review]
patch
Helps reduce OOM crashes on Fennec. Approved for Fennec 62.0b21.
Attachment #8998890 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 21•6 years ago
|
||
bugherder uplift |
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•