Crash in std::__ndk1::__deque_base<T>::clear

RESOLVED FIXED in Firefox 47

Status

()

defect
--
critical
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: esawin, Assigned: esawin)

Tracking

({crash})

48 Branch
Firefox 49
All
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox47 fixed, firefox48 fixed, firefox49 fixed)

Details

(crash signature)

Attachments

(1 attachment, 3 obsolete attachments)

Assignee

Description

3 years ago
This bug was filed from the Socorro interface and is 
report bp-a57f78dd-e5df-4d94-9a2d-4f8642160515.
=============================================================

This is the new signature for bug 1262456.
Assignee

Comment 1

3 years ago
I can (sometimes) reproduce this on the GT-I9060I when watching YouTube videos.

This device has a tendency to fail in QueueSample (probably {Dequeue|QueueInput}Buffer) and we don't handle this case correctly in ProcessOutput, where we expect !mDurations.empty().

We should break out of the decoder loop when QueueSample fails and we should return early from ProcessOutput if mDurations.empty() holds.
Assignee

Comment 2

3 years ago
We shouldn't just ignore when we fail to queue a sample.
Attachment #8753443 - Flags: review?(snorp)
Assignee

Comment 3

3 years ago
Calling GetOutputDuration with mDurations.empty() will corrupt the heap by the call to mDurations.pop_front(). We should check for this and return early.
Attachment #8753444 - Flags: review?(snorp)
Assignee

Comment 4

3 years ago
I've noticed that the GT-I9060I seems to intermittently fail in QueueSample but succeed in subsequent calls to it.

We can give the decoder a few tries before giving up to prevent playback from failing on these devices.
Attachment #8753445 - Flags: review?(snorp)
Attachment #8753443 - Flags: review?(snorp) → review+
Assignee

Comment 5

3 years ago
Comment on attachment 8753445 [details] [diff] [review]
0003-Bug-1273523-3.1-Attempt-to-queue-sample-a-few-times-.patch

Taking this patch back, retrying doesn't actually help.
Attachment #8753445 - Attachment is obsolete: true
Attachment #8753445 - Flags: review?(snorp)
Comment on attachment 8753444 [details] [diff] [review]
0002-Bug-1273523-2.1-Don-t-process-output-if-no-duration-.patch

Review of attachment 8753444 [details] [diff] [review]:
-----------------------------------------------------------------

Why would we be in this situation? The only way to get here is if we are somehow getting more frames out than we have given as input, no?
Comment on attachment 8753443 [details] [diff] [review]
0001-Bug-1273523-1.1-Stop-decoding-on-sample-queueing-fai.patch

Review of attachment 8753443 [details] [diff] [review]:
-----------------------------------------------------------------

Actually I'm taking back my r+. Failure to queue the sample is a fairly normal error. You can get that just by having all of the slots full. It shouldn't bail out of the decoder loop just for that.
Attachment #8753443 - Flags: review+ → review-
Comment hidden (obsolete)
Assignee

Comment 9

3 years ago
You're right, we only need patch 2, patch 3 doesn't help and patch 1 will only make a lot of playbacks fail for no reason.
Assignee

Updated

3 years ago
Attachment #8753443 - Attachment is obsolete: true
FWIW, I've seen numerous occurrences of bad behaviour from std::deque in terms of memory usage.

For example, consider this crash report:

https://crash-stats.mozilla.com/report/index/9b1773c9-1569-4756-b347-ea86e2160511

We're initializing the deque. Why are we trying to allocate 2,144,532,512 bytes?! That's only 2,951,136 bytes short of 2^31, which is suggestive of an integer overflow of some kind.

Then again, we recently moved away from stlport, so maybe we'll get better behaviour from std::deque from now on?
Comment on attachment 8753444 [details] [diff] [review]
0002-Bug-1273523-2.1-Don-t-process-output-if-no-duration-.patch

Review of attachment 8753444 [details] [diff] [review]:
-----------------------------------------------------------------

Why would we be in this situation? The only way to get here is if we are somehow getting more frames out than we have given as input, no?

EDIT: Eugen and I talked about this, and apparently the problem is that we can fail to queue an input buffer, but the decoder will still give us an output buffer. Eugen, please add a comment describing the issue.
Attachment #8753444 - Flags: review?(snorp) → review+
Assignee

Comment 12

3 years ago
Added comment.
Attachment #8753444 - Attachment is obsolete: true
Attachment #8753871 - Flags: review+

Comment 14

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/0ebe261ee2f7
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 49
Assignee

Updated

3 years ago
Depends on: 1274488
Assignee

Comment 15

3 years ago
Comment on attachment 8753871 [details] [diff] [review]
0001-Bug-1273523-1.2-Don-t-process-output-if-no-duration-.patch

Approval Request Comment
[Feature/regressing bug #]: Media playback
[User impact if declined]: Media playback may crash Fennec on certain popular devices (#1 crash on beta)
[Describe test coverage new/current, TreeHerder]: Due to broken Nightly updates (bug 1274488) this has just landed on today's Nightly
[Risks and why]: Low, it's a simple a safe change to fix heap corruption
[String/UUID change made/needed]: None
Attachment #8753871 - Flags: approval-mozilla-beta?
Attachment #8753871 - Flags: approval-mozilla-aurora?
Comment on attachment 8753871 [details] [diff] [review]
0001-Bug-1273523-1.2-Don-t-process-output-if-no-duration-.patch

This is definitely a medium/high volume crash on Fennec47, ~600 crashes in a week on 47.0b6, let's uplift to Aurora48, Beta47.
Attachment #8753871 - Flags: approval-mozilla-beta?
Attachment #8753871 - Flags: approval-mozilla-beta+
Attachment #8753871 - Flags: approval-mozilla-aurora?
Attachment #8753871 - Flags: approval-mozilla-aurora+
Assignee

Updated

3 years ago
Depends on: 1262456
Assignee

Comment 20

3 years ago
I could rebase the patch on Beta, but it's much safer (Aurora-tested) to uplift bug 1262456 instead. I've requested it for uplifting.
Flags: needinfo?(esawin)

Comment 22

3 years ago
No crashes have been encountered on the 47.0b10 build in today's testing session. 

The testing took place on the following devices: Samsung Galaxy Note 5(Android 5.1.1), Xiaomi mi i4 (Android 5.0.2), Samsung Galaxy S5 (Android 5.0.2), Nexus 5 (Android 6.0.1), Nexus 6P (Android N Beta), 	Alcatel OneTouch, and on tablets: Xiaomi MI pad2 (Android 5.1.1), Nexus 9 (Android 5.1), Nexus 9 (Android 6.1).

I will leave this issue open in order to gather more information in the Crash-stats reports.
You need to log in before you can comment on or make changes to this bug.