Closed Bug 1765350 Opened 2 years ago Closed 2 years ago

MOZ_DIAGNOSTIC_ASSERT(!IsUsed()) VAAPI: Crash in [@ mozilla::VideoFrameSurface<T>::ReleaseVAAPIData]

Categories

(Core :: Audio/Video: Playback, defect)

Unspecified
Linux
defect

Tracking

()

RESOLVED FIXED
103 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox101 --- disabled
firefox102 --- fixed
firefox103 --- fixed

People

(Reporter: gsvelto, Assigned: stransky)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/4135848e-1e17-4ee4-9d28-824520220416

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(!IsUsed())

Top 10 frames of crashing thread:

0 libxul.so mozilla::VideoFrameSurface<59>::ReleaseVAAPIData dom/media/platforms/ffmpeg/FFmpegVideoFramePool.cpp:66
1 libxul.so mozilla::VideoFramePool<59>::ReleaseUnusedVAAPIFrames dom/media/platforms/ffmpeg/FFmpegVideoFramePool.cpp:90
2 libxul.so mozilla::FFmpegVideoDecoder<59>::DoDecode dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:815
3 libxul.so mozilla::FFmpegDataDecoder<59>::DoDecode dom/media/platforms/ffmpeg/FFmpegDataDecoder.cpp:193
4 libxul.so mozilla::FFmpegDataDecoder<59>::ProcessDecode dom/media/platforms/ffmpeg/FFmpegDataDecoder.cpp:147
5 libxul.so mozilla::detail::ProxyRunnable<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true>, RefPtr<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true> >  xpcom/threads/MozPromise.h:1538
6 libxul.so mozilla::TaskQueue::Runner::Run xpcom/threads/TaskQueue.cpp:196
7 libxul.so nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:310
8 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1174
9 libxul.so mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:300

Crashes appear to start April 6, but there are two different 20220404093932 installs and an even earlier 20220330093158 build (installed on 3/30, too). Looks like a regression, but maybe also some content (or tests?) that changed is involved, too.

Keywords: regression

We have very few crashes, should we reduce the severity?

Flags: needinfo?(jmathies)

Might be related to bug 1721617/bug 1743638.
1 user ran into this even after bug 1770407: bp-f9206526-a8c8-4cdf-af06-385fb0220529

Component: Audio/Video → Audio/Video: Playback
See Also: → 1721617, 1743638
Summary: Crash in [@ mozilla::VideoFrameSurface<T>::ReleaseVAAPIData] → VAAPI: Crash in [@ mozilla::VideoFrameSurface<T>::ReleaseVAAPIData]

That's MOZ_DIAGNOSTIC_ASSERT(!IsUsed()) assertion here:

https://searchfox.org/mozilla-central/rev/7751fef9eeb3db0a07ae4680daa2a62bd8f49882/dom/media/platforms/ffmpeg/FFmpegVideoFramePool.cpp#73

That means we're recycling frames which are still used by compositor.

Flags: needinfo?(jmathies)
Summary: VAAPI: Crash in [@ mozilla::VideoFrameSurface<T>::ReleaseVAAPIData] → MOZ_DIAGNOSTIC_ASSERT(!IsUsed()) VAAPI: Crash in [@ mozilla::VideoFrameSurface<T>::ReleaseVAAPIData]
See Also: 1721617, 1743638

Found the bug here. The sequence is:

  1. DMABufSurfaceYUV::Serialize() (ref = 1)
  2. DMABUFSurfaceImage::~DMABUFSurfaceImage() (ref = 0)
  3. DMABufSurfaceYUV::ImportSurfaceDescriptor() (ref = 1)

The correct (expected) sequence is:

  1. DMABufSurfaceYUV::Serialize() (ref = 1)
  2. DMABufSurfaceYUV::ImportSurfaceDescriptor() (ref = 2)
  3. DMABUFSurfaceImage::~DMABUFSurfaceImage() (ref = 1)

so when Serialize() -> IPC -> ImportSurfaceDescriptor() takes too long, DMABUFSurfaceImage is called meanwhile so we're marked as unused temporary.

Note that 'unused' in this case means the frame may be reused for decoding another frame, so in the worst case we'll see image artifacts in decoded video. The surfaces are still referenced by surface pool.

In this patch we refrence dmabuf surface when it's exported via. IPC to keep
the reference active in case it's released on exporter side before it's received.

Assignee: nobody → stransky
Status: NEW → ASSIGNED
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/389e2c28a97f
[Linux] Reference dmabuf surface when it's exported via. IPC r=emilio
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 103 Branch

Comment on attachment 9278880 [details]
Bug 1765350 [Linux] Reference dmabuf surface when it's exported via. IPC r?emilio

Beta/Release Uplift Approval Request

  • User impact if declined: When VA-API is enabled it may crash due to wrong dmabuf surface releases.
    VA-API is almost ready to use and would be pity to miss this last fix in whole next ESR line.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Reference the surface by sender and not by receiver, should not cause any problem.
  • String changes made/needed: none
  • Is Android affected?: No
Attachment #9278880 - Flags: approval-mozilla-beta?

Comment on attachment 9278880 [details]
Bug 1765350 [Linux] Reference dmabuf surface when it's exported via. IPC r?emilio

Approved for 102 beta 5, thanks.

Attachment #9278880 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Regressions: 1809116
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: