Closed Bug 1636615 Opened 2 months ago Closed 2 months ago

Crash in [@ mozilla::MozPromise<T>::Private::Resolve<T>] on OSX

Categories

(Core :: Audio/Video: Playback, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla78
Tracking Status
firefox-esr68 --- unaffected
firefox75 --- unaffected
firefox76 --- unaffected
firefox77 --- unaffected
firefox78 blocking verified

People

(Reporter: mccr8, Assigned: bwc)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug is for crash report bp-357d3ef4-a78d-477b-805a-9fc8c0200507.

Top 10 frames of crashing thread:

0 XUL void mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true>::Private::Resolve<nsTArray<RefPtr<mozilla::MediaData> > > xpcom/threads/MozPromise.h:1071
1 XUL mozilla::AppleVTDecoder::OutputFrame dom/media/platforms/apple/AppleVTDecoder.cpp:440
2 XUL mozilla::PlatformCallback dom/media/platforms/apple/AppleVTDecoder.cpp:303
3 VideoToolbox VTDecoderSessionSetPixelBufferAttributes 
4 VideoToolbox VTDecoderSessionSetPixelBufferAttributes 
5 VideoToolbox VTRemoteVideoDecoderGetClassID 
6 VideoToolbox VCH263VideoDecoder_CreateInstance 
7 VideoToolbox VCH263VideoDecoder_CreateInstance 
8 VideoToolbox VTRemoteVideoDecoderGetClassID 
9 CoreMedia figThreadMain 

This is a frequent OSX-only crash that started on with the 20200506093716 build and is happening inside mozilla::AppleVTDecoder::OutputFrame.

This is the set of patches for that build: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=385f49adaf00d02fc8e04da1e0031e3477182d67&tochange=1fa1d4f4b0e247804927d474cfecdf6c6f185203

Bug 1626364 is in the range and touched AppleVTDecoder.cpp so I'm going to guess that is related to this regression.

Could you take a look, Byron? Thanks.

Flags: needinfo?(docfaraday)

[Tracking Requested - why for this release]: OSX-specific top crash regression.

What this crash indicates is that we resolved the promise either there https://searchfox.org/mozilla-central/source/dom/media/platforms/apple/AppleVTDecoder.cpp#191 or there: https://searchfox.org/mozilla-central/source/dom/media/platforms/apple/AppleVTDecoder.cpp#201

Yet decoding started and we would have normally returned a frame.

It's likely playback is now broken on mac.

Priority: -- → P1

(In reply to Jean-Yves Avenard [:jya] from comment #3)

What this crash indicates is that we resolved the promise either there https://searchfox.org/mozilla-central/source/dom/media/platforms/apple/AppleVTDecoder.cpp#191 or there: https://searchfox.org/mozilla-central/source/dom/media/platforms/apple/AppleVTDecoder.cpp#201

Yet decoding started and we would have normally returned a frame.

It's likely playback is now broken on mac.

This seems to only effect OS X versions 10.12 and earlier, so maybe VideoToolbox on those versions still decodes frames sometimes when there's an error. This comment hints that this can happen:

https://searchfox.org/mozilla-central/rev/e9131060a0322f6c181b4933ac74e6440b6e723d/dom/media/platforms/apple/AppleVTDecoder.cpp#198-199

However, bug 1626364 did not change how we treat most errors returned synchronously by VTDecompressionSessionDecodeFrame. It did change how we process the dropped frame error:

Old: https://searchfox.org/mozilla-central/rev/afd1feade0231932c1b5c9a3f262ac9c9fec2bad/dom/media/platforms/apple/AppleVTDecoder.cpp#187-198

New: https://searchfox.org/mozilla-central/rev/e9131060a0322f6c181b4933ac74e6440b6e723d/dom/media/platforms/apple/AppleVTDecoder.cpp#187-206

We used to ignore the kVTDecodeInfo_FrameDropped error entirely. Perhaps it is possible (or even expected) to still get a frame when this error occurs on older versions of VideoToolbox.

The other change from bug 1626364 was handling of errors here:

New: https://searchfox.org/mozilla-central/rev/e9131060a0322f6c181b4933ac74e6440b6e723d/dom/media/platforms/apple/AppleVTDecoder.cpp#290-293

Old: https://searchfox.org/mozilla-central/rev/afd1feade0231932c1b5c9a3f262ac9c9fec2bad/dom/media/platforms/apple/AppleVTDecoder.cpp#282-284

It may be possible that older versions of VideoToolbox will call this callback again sometimes after calling it with an error.

I guess we could make this call a ResolveIfExists, but that seems a bit of a cop-out:

https://searchfox.org/mozilla-central/rev/e9131060a0322f6c181b4933ac74e6440b6e723d/dom/media/platforms/apple/AppleVTDecoder.cpp#440

It may be the only thing we can do in this situation though, if VideoToolbox is behaving erratically. The question is whether VideoToolbox only behaves this way occasionally. Looking at the crash-stats, I see the same install crashing multiple times in a number of cases, which implies that it is a pretty reproducible problem (at least for those users).

So this is interesting. Here's a report with a more complete stack (all of the complete stacks I've found look like this):

https://crash-stats.mozilla.org/report/index/497872fa-b71f-4fb1-9651-825470200508

Note the appearance of VCH263VideoDecoder_CreateInstance in that stack. That hints that this is the first frame. That says to me that re-initting when we see this problem may lead to a re-occurrance of the problem. Maybe we should simply use ResolveIfExists, or put a check in PlatformCallback that bails if mPromise is already settled.

Thoughts?

Flags: needinfo?(docfaraday) → needinfo?(jyavenard)

(In reply to Byron Campen [:bwc] from comment #7)

So this is interesting. Here's a report with a more complete stack (all of the complete stacks I've found look like this):

https://crash-stats.mozilla.org/report/index/497872fa-b71f-4fb1-9651-825470200508

Note the appearance of VCH263VideoDecoder_CreateInstance in that stack. That hints that this is the first frame. That says to me that re-initting when we see this problem may lead to a re-occurrance of the problem. Maybe we should simply use ResolveIfExists, or put a check in PlatformCallback that bails if mPromise is already settled.

Thoughts?

This would be the simplest workaround, but I'm concerned that it could be a more generic issue and somehow we end up dropping video frames that would otherwise play properly.

So the easy fix may not be the right one. We should only be dropping frames that need to be dropped.

Flags: needinfo?(jyavenard)

(In reply to Jean-Yves Avenard [:jya] from comment #8)

(In reply to Byron Campen [:bwc] from comment #7)

So this is interesting. Here's a report with a more complete stack (all of the complete stacks I've found look like this):

https://crash-stats.mozilla.org/report/index/497872fa-b71f-4fb1-9651-825470200508

Note the appearance of VCH263VideoDecoder_CreateInstance in that stack. That hints that this is the first frame. That says to me that re-initting when we see this problem may lead to a re-occurrance of the problem. Maybe we should simply use ResolveIfExists, or put a check in PlatformCallback that bails if mPromise is already settled.

Thoughts?

This would be the simplest workaround, but I'm concerned that it could be a more generic issue and somehow we end up dropping video frames that would otherwise play properly.

So the easy fix may not be the right one. We should only be dropping frames that need to be dropped.

True, but if the decoder tells us it has dropped the frame when it has not, what should we do? We would need to ignore VT when it tells us it has dropped a frame, which would mean we would need to have a backup plan for rejecting that promise. We can't just set a timeout, because blocking decode for 30ms (just an example) every time a frame is dropped would probably not be a good idea. We would probably need to first get rid of the sync wait, and shift over to an API that uses a separate promise for each frame (multiple of which can be unsettled at a time). Then maybe we could reject old unsettled promises when a more recent frame is decoded (or an error occurs). Of course, we still aren't sure whether webrtc.org is compatible with such an approach.

Try looks fine.

Assignee: nobody → docfaraday
Attachment #9147293 - Attachment description: Bug 1636615: Workaround for bug to stop crashes. → Bug 1636615: Workaround VT decoder sometimes telling us that it dropped a frame, and then decoding it anyway.
Severity: -- → S1

We do this by keeping the unexpected frame in mReorderQueue, and outputting it later when the caller is expecting frames.

Try looks good. Doing some manual testing on my end.

Seems to work ok. I'll be able to do some testing on old OS X next week once some hardware arrives.

Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eb882d1719d2
Work around Apple VT decoder saying it dropped a frame, and then decoding that frame anyway. r=jya
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78
Attachment #9147293 - Attachment is obsolete: true
Status: RESOLVED → VERIFIED

I've verified that the crash happens very reliably when using Amazon Chime on OS X 10.12, and have also verified that the crash goes away with the changeset from this bug. H264 decode on Amazon Chime seems to work well using the try push in comment 16. I am seeing some strange behavior with current nightly, however. Might be an unrelated regression.

Verified that this revision (which landed very soon after the fix from this bug; this was a backout that un-busted a crash on newtab) works ok.

https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=332e06a545d2

The fix for bug 1632489 is causing the weirdness I'm seeing. Have filed bug 1638942 for this.

You need to log in before you can comment on or make changes to this bug.