Closed
Bug 1262456
Opened 8 years ago
Closed 8 years ago
crash in OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | std::priv::_Deque_base<T>::_M_initialize_map
Categories
(Firefox for Android Graveyard :: Audio/Video, defect)
Tracking
(firefox47 fixed, firefox48 fixed, fennec46+)
RESOLVED
FIXED
Firefox 48
People
(Reporter: snorp, Assigned: esawin)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files)
2.96 KB,
patch
|
snorp
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
2.56 KB,
patch
|
snorp
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-c1fe5a7f-fc67-4fd0-ad5c-9b4ea2160330. ============================================================= Most of the reports seem to be OOMing in media decoder, but unknown if that is the cause.
Reporter | ||
Comment 1•8 years ago
|
||
Looks like fallout from bug 1248792
Assignee: nobody → esawin
Blocks: 1248792
Reporter | ||
Updated•8 years ago
|
tracking-fennec: --- → 46+
Assignee | ||
Comment 2•8 years ago
|
||
The early return in a double-booked shutdown situation is problematic. A forced shutdown through the destructor would return early if the shutdown procedure had already been initiated, in which case the object would be destructed without a proper shutdown of the decoder thread. We should also prevent any illegal state transitions (flushing could potentially override a shutdown) when trying to set the state to give the call site feedback on the success of the transition.
Attachment #8739187 -
Flags: review?(snorp)
Reporter | ||
Comment 3•8 years ago
|
||
Comment on attachment 8739187 [details] [diff] [review] 0001-Bug-1262456-1.1-Prevent-interruption-of-the-decoder-.patch Review of attachment 8739187 [details] [diff] [review]: ----------------------------------------------------------------- Nice catch!
Attachment #8739187 -
Flags: review?(snorp) → review+
Comment 5•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/8bce23713e8f
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Assignee | ||
Comment 6•8 years ago
|
||
The crash is still occurring. The device distribution looks suspicious, with over 30% of all crashes on the GT-I9060I (Samsung Galaxy Grand Neo), which afaik is not a mainstream device like the S-series.
Assignee | ||
Comment 7•8 years ago
|
||
Let's replace the queue container adapter with the underlying container deque to gain access to its clear implementation (std::queue is not perfectly suited for our case because of the flushing mechanics). This will help diagnosing the actual issue here, since deque::clear does not require a temporary container and therefore avoids the allocation which should give us a different crash stack.
Attachment #8742409 -
Flags: review?(snorp)
Reporter | ||
Updated•8 years ago
|
Attachment #8742409 -
Flags: review?(snorp) → review+
Comment 9•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/cbf41516e4a3
Assignee | ||
Comment 10•8 years ago
|
||
Closing this and will post a new bug with the new signature. So far nothing related has popped up on Nightly, we should uplift this to get clearer results.
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Assignee | ||
Comment 11•8 years ago
|
||
Comment on attachment 8739187 [details] [diff] [review] 0001-Bug-1262456-1.1-Prevent-interruption-of-the-decoder-.patch Approval Request Comment [Feature/regressing bug #]: Media playback, blocks uplift of bug 1273523 [User impact if declined]: Top media crash [Describe test coverage new/current, TreeHerder]: On Nightly and Aurora for weeks [Risks and why]: Low, only affects media decoder shutdown [String/UUID change made/needed]: none
Attachment #8739187 -
Flags: approval-mozilla-beta?
Assignee | ||
Updated•8 years ago
|
Attachment #8742409 -
Flags: approval-mozilla-beta?
Comment on attachment 8739187 [details] [diff] [review] 0001-Bug-1262456-1.1-Prevent-interruption-of-the-decoder-.patch Need this patch for medium/high volume top Fennec 47 crash, this patch has been in 48 for over a month, seems safe. Beta47+
Attachment #8739187 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
status-firefox47:
--- → affected
Hi Margaret, Snorp, FYI. We had conflicts uplifting patch from bug 1273523 to Beta and Eugen would like to uplift this patch instead of adding a new rebased patch to Beta47 in that bug. I have approved it for uplift to Beta47. If you have any concerns, please let me know. At this point in Beta47 cycle, I am only taking fixes for critical recent regressions and severe stability/security issues. I'd appreciate all the help I can get in evaluating risk associated with this uplift. Thanks!
Flags: needinfo?(snorp)
Flags: needinfo?(margaret.leibovic)
Attachment #8742409 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 14•8 years ago
|
||
I defer to Snorp and Eugen here, this is outside my area of expertise.
Flags: needinfo?(margaret.leibovic)
Snorp commented on IRC that he thinks we should take these two patches.
Comment 16•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/3d2d1c251cc8 https://hg.mozilla.org/releases/mozilla-beta/rev/12f8a8080555
Comment 17•8 years ago
|
||
No crashes have been encountered on the 47.0b10 build in today's testing session. The testing took place on the following devices: Samsung Galaxy Note 5(Android 5.1.1), Xiaomi mi i4 (Android 5.0.2), Samsung Galaxy S5 (Android 5.0.2), Nexus 5 (Android 6.0.1), Nexus 6P (Android N Beta), Alcatel OneTouch, and on tablets: Xiaomi MI pad2 (Android 5.1.1), Nexus 9 (Android 5.1), Nexus 9 (Android 6.1). I will leave this issue open in order to gather more information in the Crash-stats reports.
Comment 18•8 years ago
|
||
As per bug 1273523 comment 10, I suspect this is an stlport bug, not a Mozilla bug. Look at the results of this search: https://crash-stats.mozilla.com/search/?oom_allocation_size=2147483656&_facets=oom_allocation_size&_facets=signature&_facets=platform&_facets=proto_signature&_facets=version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=process_type&_columns=platform_pretty_version#facet-oom_allocation_size Look at the different facet tabs and you can see all these crashes are coming from std::priv::_Deque_base<T>::_M_initialize_map, which is from the std:deque usage in mozilla::MediaCodecDataDecoder::PeekNextSample, and they're all Android. There are also crashes in version 46.0b13. Looking at the stack traces, the decoder is just calling clear() on the std::deque, which seems totally reasonable. Smells like an stlport bug to me. That would explain why the steps taken so far don't seem to help. If I'm right, the good news is that Nathan Froyd recently did the hard work to get us off stlport, onto a more modern C++ stdlib. The bad news is that we backporting stlport removal is likely hard or impossible...
Assignee | ||
Comment 19•8 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #18) > Looking at the stack traces, the decoder is just calling clear() on the > std::deque, which seems totally reasonable. Smells like an stlport bug to > me. That would explain why the steps taken so far don't seem to help. In bug 1273523 we fix the UB (possibly heap corruption) caused by calling deque::pop_front() on an empty deque, which would explain the subsequent deque::clear() call to behave abnormally. Previously, we've only asserted this constraint in debug builds.
Comment 20•8 years ago
|
||
> In bug 1273523 we fix the UB (possibly heap corruption) caused by calling
> deque::pop_front() on an empty deque, which would explain the subsequent
> deque::clear() call to behave abnormally. Previously, we've only asserted
> this constraint in debug builds.
Ah. In that case my analysis is mistaken and the problem is in our code. Thank you for the explanation.
Comment 21•8 years ago
|
||
(In reply to Ritu Kothari (:ritu) from comment #15) > Snorp commented on IRC that he thinks we should take these two patches. Clearing needinfo since the question was answered and the issue was resolved.
Flags: needinfo?(snorp)
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•