Closed Bug 1976929 Opened 11 months ago Closed 5 months ago

WebCodecs VideoDecoder frequent reset & close causes tab crash

Categories

(Core :: Audio/Video: Web Codecs, defect, P2)

Firefox 140
defect

Tracking

()

RESOLVED FIXED
148 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox-esr140 --- fixed
firefox140 --- wontfix
firefox146 --- wontfix
firefox147 --- verified
firefox148 --- verified

People

(Reporter: andreas, Assigned: aosmond)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Crash Data

Attachments

(4 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:140.0) Gecko/20100101 Firefox/140.0

Steps to reproduce:

I have a wasm app that makes use of WebCodec for video decoding. Both upon close and reset of the VideoDecoder I experience tab crashes.

There's no actionable information in the log - my own logs indicate that close/reset is the last thing that happened.

I unfortunately do not have a minimal repro case at this point, but I have a build that repros it very well here:
https://rerun.io/viewer/pr/10590?url=https%3A%2F%2Fstatic.rerun.io%2Frrd%2F0.23.0%2Ftest_video_00d14499a43e355706609cb7f8f902e2b9718b71.rrd
After the application & file is loaded scrub on the timeline. This reliably repros the crash for me on Windows and I have reports from coworkers getting it on Linux as well.

Actual results:

Intermittent tab crashes, as far as I can tell from logs these happen right after a call to either close or reset

Something I forgot to mention: I haven't checked if the codec is relevant, but it's an H264 Annex B video in this case with codec string avc1.640015

I was able to reproduce the tab crash on Win11x64 using Firefox build 140.0.4 and the link from description (https://rerun.io/viewer/pr/10590?url=https%3A%2F%2Fstatic.rerun.io%2Frrd%2F0.23.0%2Ftest_video_00d14499a43e355706609cb7f8f902e2b9718b71.rrd)
Mark issue as new for engineering input. Thank you.

Status: UNCONFIRMED → NEW
Component: Untriaged → General
Ever confirmed: true
OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64
Component: General → Audio/Video: Web Codecs
Product: Firefox → Core

The severity field is not set for this bug.
:padenot, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(padenot)

Reporter, thanks for filing!

Would you mind getting us a couple of those crashes from about:crashes, and paste them here? This will get us stacks so we can assign the right person to look into this. Usually we can match them up by time it occurred. Thanks!

Flags: needinfo?(padenot) → needinfo?(andreas)

Chun-Min, something funny with promises in the `MediaChangeMonitor```, can you have a look

Crash Signature: @ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() @ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator()
Flags: needinfo?(cchang)
Crash Signature: @ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() @ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() → mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator()
Crash Signature: mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() → [ @ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() ] [ @ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() ]
Crash Signature: [ @ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() ] [ @ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() ] → [@ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() ] [@ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() ]
Blocks: VideoDecoder
Flags: needinfo?(cchang)
Flags: needinfo?(cchang)
Severity: -- → S3
Priority: -- → P2

I am able to consistently reproduce this in CI trying to land some ffmpeg + Android work....

The owner decides to call MediaChangeMonitor::Decode, after which we call MediaChangeMonitor::CheckForChange:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#972

If we don't have a decoder yet, we create one:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1279

After which we ensure a promise is stored in mDecodePromise:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#986

When the underlying decoder is initialized, we call MediaChangeMonitor::DecodeFirstSample:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1197

Which calls Decode on the underlying decoder, which is probably running on another thread and/or in another process:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1261

The owner decides to call MediaChangeMonitor::Shutdown, which rejects mDecodePromise:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1075

The Decode promise from earlier finally returns with a result, but mDecodePromise was already rejected:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1267

Crash Signature: [@ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() ] [@ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() ] → [@ mozilla::MediaChangeMonitor::CreateDecoderAndInit::<T>::operator() ] [@ mozilla::MediaChangeMonitor::DecodeFirstSample::<T>::operator() ] [@ mozilla::MediaChangeMonitor::DecodeFirstSample::$::operator() ]
Duplicate of this bug: 1916625
See Also: → 1686433

My first thought is to just delay rejecting the promises:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1072-1079

To happen when the shutdown promise returns:
https://searchfox.org/firefox-main/rev/33bba5cfe4a89dda0ee07fa9fbac578353713fd3/dom/media/platforms/wrappers/MediaChangeMonitor.cpp#1087

But maybe that causes other problems further up in the state machine, because they might resolve the promise instead of reject....

The alternative is to just stop using MozPromiseHolder::Resolve and MozPromiseHolder::Reject, at least without checking if the promise is empty first. That is probably okay.

Since the owner can call MediaChangeMonitor::Shutdown and reject any
outstanding promises, we shouldn't assume mDecodePromise will still be
set in any promise chains after dispatching. This will allow us to
shutdown gracefully.

Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Flags: needinfo?(cchang)

The patch uplifts cleanly to ESR115, I think it is impacted, even if we aren't getting crash reports for it.

Keywords: regression
OS: Windows 10 → All
Regressed by: 1374210
Hardware: x86_64 → All
Blocks: 2008354

Since the owner can call MediaChangeMonitor::Shutdown and reject any
outstanding promises, we shouldn't assume mDecodePromise will still be
set in any promise chains after dispatching. This will allow us to
shutdown gracefully.

Attachment #9536125 - Flags: approval-mozilla-esr140?

Since the owner can call MediaChangeMonitor::Shutdown and reject any
outstanding promises, we shouldn't assume mDecodePromise will still be
set in any promise chains after dispatching. This will allow us to
shutdown gracefully.

Attachment #9536126 - Flags: approval-mozilla-esr115?

Since the owner can call MediaChangeMonitor::Shutdown and reject any
outstanding promises, we shouldn't assume mDecodePromise will still be
set in any promise chains after dispatching. This will allow us to
shutdown gracefully.

Original Revision: https://phabricator.services.mozilla.com/D278077

Attachment #9536128 - Flags: approval-mozilla-release?

firefox-release Uplift Approval Request

  • User impact if declined: High volume content process crash
  • Code covered by automated testing: yes
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: low
  • Explanation of risk level: Just adds a check to only fulfill valid promises. Well understood on how we got into that state.
  • String changes made/needed: N/A
  • Is Android affected?: yes

firefox-esr140 Uplift Approval Request

  • User impact if declined: High volume content process crash
  • Code covered by automated testing: yes
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: low
  • Explanation of risk level: Just adds a check to only fulfill valid promises. Well understood on how we got into that state.
  • String changes made/needed: N/A
  • Is Android affected?: yes

firefox-esr115 Uplift Approval Request

  • User impact if declined: High volume content process crash
  • Code covered by automated testing: yes
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: low
  • Explanation of risk level: Just adds a check to only fulfill valid promises. Well understood on how we got into that state.
  • String changes made/needed: N/A
  • Is Android affected?: yes

I only added 147 because of the fix-optional tag. It should be low risk, but could easily ride a dot release because we've lived with this crash for a long time, what's one more release :).

Pushed by aosmond@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/1f4c1dae040f https://hg.mozilla.org/integration/autoland/rev/65e0e57bfc63 Don't assume MediaChangeMonitor::mDecodePromise is set after dispatching. r=media-playback-reviewers,padenot
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 148 Branch
Attachment #9536126 - Attachment is obsolete: true
Attachment #9536126 - Flags: approval-mozilla-esr115?

The patch landed in nightly and beta is affected.
:aosmond, is this bug important enough to require an uplift?

For more information, please visit BugBot documentation.

Flags: needinfo?(aosmond)

Beta has already merged to release.

Flags: needinfo?(aosmond)
QA Whiteboard: [qa-triage-done-c149/b148]
Attachment #9536125 - Flags: approval-mozilla-esr140? → approval-mozilla-esr140+
Attachment #9536128 - Flags: approval-mozilla-release? → approval-mozilla-release+

I was able to reproduce the tab crash on Win11x64 using FF build 140.0.4 (crashID: https://crash-stats.mozilla.org/report/index/eb85364a-740b-4894-98b0-63fdd0260116 )
Verified issue as fixed meaning that tab is not crashing on Win11x64 using Firefox builds: 149.0a1, 148.0b2 and 147.0.1 (build from Treeherder).

Attached image 2026-01-16_12h37_29.png

I have a question though, even if it does not crash anymore, the link from description is not playing on latest builds like it did on 140.0.4.
Andrew, is this expected? Thank you.

Flags: needinfo?(aosmond)

I would not have expected my patch to fix a playback issue unless it was the crash itself that was stopping the playback. Please file a new bug about the playback failing, and if you are able to find a regression window for this, that would be awesome! Thanks.

Flags: needinfo?(aosmond)
See Also: → 2011428

Logged bug 2011428 for comment#27.
Marking this as verified as tab is not crashing anymore on 148.0b4, 149.0a1, 147.0.1 and 140.8.0esr(20260119133002)- from treeherder.
Not marking as verified as release 140.8.0esr is still pending.

QA Whiteboard: [qa-triage-done-c149/b148] → [qa-triage-done-c149/b148][qa-verification-done-c149/b148]
QA Contact: mchiorean
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: