Crash in shutdownhang | __psynch_cvwait | <name omitted> | <name omitted> | nsThread::GetEvent | nsThread::ProcessNextEvent | nsThread::Shutdown | mozilla::image::DecodePool::Observe

RESOLVED FIXED in Firefox 56

Status

()

P1
critical
Rank:
16
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: marcia, Assigned: aosmond)

Tracking

({crash, regression})

56 Branch
mozilla57
Unspecified
Mac OS X
crash, regression
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify -

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 fixed, firefox57 fixed)

Details

(Whiteboard: gfx-noted, crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is 
report bp-a10c212a-f048-4a31-87a0-286ef0170729.
=============================================================

Seen while looking at nightly Mac crash stats - crashes started using 20170728100358 when Nightly was still in 56: http://bit.ly/2vMNsEn. So far about 10 crashes all on 10.12

webrtc::vp8::kVP8Log2Range is at the bottom of the stack. There is also "MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.)"
Blocking on xpcom-shutdown. That stack looks suspect to me. kVP8Log2Range() and ToHex() calls firefox start()? Anyone know what that might mean?
Rank: 16
Priority: -- → P1
This looks maybe related to mozilla::image::DecodePool to me, not webrtc.
Moving to Imagelib based on Comment 2.
Component: WebRTC → ImageLib
Hi Timothy, do you have any idea about this? I don't see pushlog that is related with DecodePool from https://hg.mozilla.org/mozilla-central/pushloghtml/6.
Flags: needinfo?(tnikkel)
You landed a patch that had to do with shutdown recently, right?
Flags: needinfo?(aosmond)
(Assignee)

Comment 6

a year ago
Yes but this doesn't seem related to that fix. The one of the decoder threads is in the ICO decoder which is enough of a smoking gun for me to suspect a regression from bug 1315554.
Assignee: nobody → aosmond
Blocks: 1315554
Status: NEW → ASSIGNED
Flags: needinfo?(aosmond)
(Assignee)

Updated

a year ago
Flags: needinfo?(tnikkel)
Whiteboard: gfx-noted
(Assignee)

Comment 7

a year ago
Certain traces suggest it is stuck looping in the do { ... } while(pos > 0) loop in StreamingLexer::Clone. Perhaps it somehow returned a state other than READY, in which case we really ought to bail. I haven't figured out a set of parameters and SourceBuffer states which allows this to happen yet though.
(Assignee)

Comment 8

a year ago
Created attachment 8896983 [details] [diff] [review]
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready., v1
(Assignee)

Updated

a year ago
Attachment #8896983 - Flags: review?(tnikkel)
Attachment #8896983 - Flags: review?(tnikkel) → review+

Comment 10

a year ago
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c956cbe4e5f
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready. r=tnikkel

Comment 11

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/4c956cbe4e5f
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
status-firefox57: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
status-firefox55: --- → unaffected
status-firefox-esr52: --- → unaffected
Version: Trunk → 56 Branch
I'm not seeing any crashes on 56 for this signature since it went to Beta. Can we let this fix ride the trains or should it be considered for uplift anyway?
Flags: needinfo?(aosmond)
(Assignee)

Comment 13

a year ago
Comment on attachment 8896983 [details] [diff] [review]
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready., v1

Approval Request Comment
[Feature/Bug causing the regression]: 1315554
[User impact if declined]: Browser may crash when decoding certain ICO resources. I expect they are corrupted or truncated, so relatively rare but not unheard of.
[Is this code covered by automated tests?]: The code is exercised but not all of the failure cases are reached. The failure path is simple however and simply stops decoding immediately (which itself a well tested path).
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No. I do not know which images caused this problem in the field.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: The change has a small, well defined scope. It changes some asserts to conditions we enforce at runtime (and also assert), so we can fail gracefully in release.
[String changes made/needed]: None.
Flags: needinfo?(aosmond)
Attachment #8896983 - Flags: approval-mozilla-beta?
Comment on attachment 8896983 [details] [diff] [review]
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready., v1

Crash fix for a new regression in 56. Let's uplift it for beta 8 even if it is a rarely recorded crash. It seems preferable to fail to decode rather than to crash.
Attachment #8896983 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 15

a year ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/45080fa9b68c
status-firefox56: affected → fixed
(In reply to Andrew Osmond [:aosmond] from comment #13)
> [Is this code covered by automated tests?]: The code is exercised but not
> all of the failure cases are reached. The failure path is simple however and
> simply stops decoding immediately (which itself a well tested path).
> [Has the fix been verified in Nightly?]: Yes.
> [Needs manual test from QE? If yes, steps to reproduce]: No. I do not know
> which images caused this problem in the field.

Setting qe-verify- based on Andrew's assessment on manual testing needs and the fact that this fix has some automated coverage.
Flags: qe-verify-
(Assignee)

Updated

a year ago
Depends on: 1399079
You need to log in before you can comment on or make changes to this bug.