Closed Bug 1388590 Opened 3 years ago Closed 3 years ago

Crash in shutdownhang | __psynch_cvwait | <name omitted> | <name omitted> | nsThread::GetEvent | nsThread::ProcessNextEvent | nsThread::Shutdown | mozilla::image::DecodePool::Observe

Categories

(Core :: ImageLib, defect, P1)

56 Branch
Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: marcia, Assigned: aosmond)

References

Details

(Keywords: crash, regression, Whiteboard: gfx-noted)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-a10c212a-f048-4a31-87a0-286ef0170729.
=============================================================

Seen while looking at nightly Mac crash stats - crashes started using 20170728100358 when Nightly was still in 56: http://bit.ly/2vMNsEn. So far about 10 crashes all on 10.12

webrtc::vp8::kVP8Log2Range is at the bottom of the stack. There is also "MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.)"
Blocking on xpcom-shutdown. That stack looks suspect to me. kVP8Log2Range() and ToHex() calls firefox start()? Anyone know what that might mean?
Rank: 16
Priority: -- → P1
This looks maybe related to mozilla::image::DecodePool to me, not webrtc.
Moving to Imagelib based on Comment 2.
Component: WebRTC → ImageLib
Hi Timothy, do you have any idea about this? I don't see pushlog that is related with DecodePool from https://hg.mozilla.org/mozilla-central/pushloghtml/6.
Flags: needinfo?(tnikkel)
You landed a patch that had to do with shutdown recently, right?
Flags: needinfo?(aosmond)
Yes but this doesn't seem related to that fix. The one of the decoder threads is in the ICO decoder which is enough of a smoking gun for me to suspect a regression from bug 1315554.
Assignee: nobody → aosmond
Blocks: 1315554
Status: NEW → ASSIGNED
Flags: needinfo?(aosmond)
Flags: needinfo?(tnikkel)
Whiteboard: gfx-noted
Certain traces suggest it is stuck looping in the do { ... } while(pos > 0) loop in StreamingLexer::Clone. Perhaps it somehow returned a state other than READY, in which case we really ought to bail. I haven't figured out a set of parameters and SourceBuffer states which allows this to happen yet though.
Attachment #8896983 - Flags: review?(tnikkel)
Attachment #8896983 - Flags: review?(tnikkel) → review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c956cbe4e5f
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/4c956cbe4e5f
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Version: Trunk → 56 Branch
I'm not seeing any crashes on 56 for this signature since it went to Beta. Can we let this fix ride the trains or should it be considered for uplift anyway?
Flags: needinfo?(aosmond)
Comment on attachment 8896983 [details] [diff] [review]
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready., v1

Approval Request Comment
[Feature/Bug causing the regression]: 1315554
[User impact if declined]: Browser may crash when decoding certain ICO resources. I expect they are corrupted or truncated, so relatively rare but not unheard of.
[Is this code covered by automated tests?]: The code is exercised but not all of the failure cases are reached. The failure path is simple however and simply stops decoding immediately (which itself a well tested path).
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No. I do not know which images caused this problem in the field.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: The change has a small, well defined scope. It changes some asserts to conditions we enforce at runtime (and also assert), so we can fail gracefully in release.
[String changes made/needed]: None.
Flags: needinfo?(aosmond)
Attachment #8896983 - Flags: approval-mozilla-beta?
Comment on attachment 8896983 [details] [diff] [review]
StreamingLexer::Clone should bail if SourceBufferIterator::Advance returns not ready., v1

Crash fix for a new regression in 56. Let's uplift it for beta 8 even if it is a rarely recorded crash. It seems preferable to fail to decode rather than to crash.
Attachment #8896983 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Andrew Osmond [:aosmond] from comment #13)
> [Is this code covered by automated tests?]: The code is exercised but not
> all of the failure cases are reached. The failure path is simple however and
> simply stops decoding immediately (which itself a well tested path).
> [Has the fix been verified in Nightly?]: Yes.
> [Needs manual test from QE? If yes, steps to reproduce]: No. I do not know
> which images caused this problem in the field.

Setting qe-verify- based on Andrew's assessment on manual testing needs and the fact that this fix has some automated coverage.
Flags: qe-verify-
Depends on: 1399079
You need to log in before you can comment on or make changes to this bug.