Closed Bug 1759137 Opened 5 months ago Closed 4 months ago

FFMPEG 5.0 Crash in [@ av_buffer_ref]

Categories

(Core :: Audio/Video: Playback, defect, P3)

Firefox 100
Desktop
Linux
defect

RESOLVED FIXED
Tracking Status
firefox-esr91 --- unaffected
firefox98 --- unaffected
firefox99 + wontfix
firefox100 --- fixed
firefox101 --- fixed
firefox102 --- fixed

People

(Reporter: calixte, Assigned: stransky)

References

(Blocks 2 open bugs, Regression)

Details

(Keywords: crash, regression)

Attachments

(1 file)

Maybe Fission related. (DOMFissionEnabled=1)

Crash report: https://crash-stats.mozilla.org/report/index/0abffca8-511c-4acd-96e7-7f8d40220308

Reason: SIGSEGV / SI_KERNEL

Top 10 frames of crashing thread:

0 libavutil.so.57 av_buffer_ref 
1 libxul.so mozilla::VideoFrameSurfaceVAAPI::LockVAAPIData dom/media/platforms/ffmpeg/FFmpegVideoFramePool.cpp:43
2 libxul.so mozilla::VideoFramePool::GetVideoFrameSurface dom/media/platforms/ffmpeg/FFmpegVideoFramePool.cpp:143
3 libxul.so mozilla::FFmpegVideoDecoder<59>::CreateImageVAAPI dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:1093
4 libxul.so mozilla::FFmpegVideoDecoder<59>::DoDecode dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:844
5 libxul.so mozilla::FFmpegDataDecoder<59>::DoDecode dom/media/platforms/ffmpeg/FFmpegDataDecoder.cpp:192
6 libxul.so mozilla::FFmpegDataDecoder<59>::ProcessDecode dom/media/platforms/ffmpeg/FFmpegDataDecoder.cpp:146
7 libxul.so mozilla::detail::ProxyRunnable<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true>, RefPtr<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true> >  xpcom/threads/MozPromise.h:1538
8 libxul.so mozilla::TaskQueue::Runner::Run xpcom/threads/TaskQueue.cpp:206
9 libxul.so nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:310

There are 489 crashes (from 10 installations) in nightly 99 starting with buildid 20220308092232. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1750760.

[1] https://hg.mozilla.org/mozilla-central/rev?node=0c6b7084cdb5

Flags: needinfo?(stransky)
Priority: -- → P3

Looks like ffmpeg 5.0 uses hw_frames_ctx differently or we use a different path to init va-api so it's null.

Flags: needinfo?(stransky)
Summary: Crash in [@ av_buffer_ref] → FFMPEG 5.0 Crash in [@ av_buffer_ref]
Has Regression Range: --- → yes

(In reply to Martin Stránský [:stransky] (ni? me) from comment #1)

Looks like ffmpeg 5.0 uses hw_frames_ctx differently or we use a different path to init va-api so it's null.

avcodec.h from both FFmpeg 4.4 and 5.0 contains this:

    /**
     * A reference to the AVHWFramesContext describing the input (for encoding)
     * or output (decoding) frames. The reference is set by the caller and
     * afterwards owned (and freed) by libavcodec - it should never be read by
     * the caller after being set.
     *
     * - decoding: This field should be set by the caller from the get_format()
     *             callback. The previous reference (if any) will always be
     *             unreffed by libavcodec before the get_format() call.
     *
     *             If the default get_buffer2() is used with a hwaccel pixel
     *             format, then this AVHWFramesContext will be used for
     *             allocating the frame buffers.
<snip>
     */
    AVBufferRef *hw_frames_ctx;

The code doesn't initialize it in get_format as directed. I guess FFmpeg 4.4 was lenient enough to initialize it for you, but with 5.0 the field remains NULL.
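To make the failure mode concrete, here is a minimal sketch of the kind of guard that avoids taking a reference on a null context. It does not use FFmpeg itself: `FakeBufferRef`, `FakeCodecContext`, `FakeBufferRefAcquire` and `LockFramesContext` are hypothetical stand-ins for `AVBufferRef`, `AVCodecContext`, `av_buffer_ref` and the Firefox call site, and this is not the patch that landed.

```cpp
#include <cassert>

// Hypothetical stand-in for FFmpeg's AVBufferRef (only a refcount is modelled).
struct FakeBufferRef {
  int refcount;
};

// Hypothetical stand-in for the one AVCodecContext field discussed here.
struct FakeCodecContext {
  FakeBufferRef* hw_frames_ctx;
};

// Stand-in for av_buffer_ref(): dereferences its argument unconditionally,
// so passing nullptr is undefined behaviour (a crash in practice).
FakeBufferRef* FakeBufferRefAcquire(FakeBufferRef* buf) {
  buf->refcount++;
  return buf;
}

// Guarded call site: report failure instead of referencing a null context.
FakeBufferRef* LockFramesContext(const FakeCodecContext& ctx) {
  if (!ctx.hw_frames_ctx) {
    return nullptr;  // caller can fall back to a non-hardware path
  }
  return FakeBufferRefAcquire(ctx.hw_frames_ctx);
}
```

With a null `hw_frames_ctx` the guarded function returns `nullptr`; with a valid one it bumps the reference count and returns the same buffer.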

Duplicate of this bug: 1759596
Crash Signature: [@ av_buffer_ref] → [@ av_buffer_ref] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda]
No longer blocks: 1750760

stransky: seems like it would be really good to get a patch for this crash if possible, so we can uplift it and don't end up shipping 99 with the issue.

Crash Signature: [@ av_buffer_ref] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda] → [@ av_buffer_ref] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda]
Flags: needinfo?(stransky)

(In reply to James Graham [:jgraham] from comment #4)

stransky: seems like it would be really good to get a patch for this crash if possible, so we can uplift it and don't end up shipping 99 with the issue.

Please note the low user counts. VAAPI hasn't shipped, but some users have turned it on for testing purposes.

This issue may be the cause of issue 1760414, where logging continues up to the point where VAAPI locks the dmabuf surface, and then the RDD process seems to die.

We use AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX for HW decode so we need to reference hw_device_ctx instead of hw_frames_ctx.
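As a rough illustration of the distinction being drawn, the sketch below maps a bitmask of hardware configuration methods to the context the caller is expected to provide. The enum names and numeric values are hypothetical, modelled on FFmpeg's `AV_CODEC_HW_CONFIG_METHOD_*` flags rather than copied from its headers, and the code only illustrates the two methods, not the patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags modelled on AV_CODEC_HW_CONFIG_METHOD_*.
// The numeric values are illustrative only.
enum HwConfigMethod : uint32_t {
  kMethodHwDeviceCtx = 1u << 0,  // caller sets a device context up front
  kMethodHwFramesCtx = 1u << 1,  // caller sets a frames context in get_format()
};

enum class CallerSupplies { DeviceContext, FramesContext, Nothing };

// Decide which context the caller has to provide for a given method bitmask.
// The device-context method is preferred when both bits are set.
CallerSupplies WhatCallerSupplies(uint32_t methods) {
  if (methods & kMethodHwDeviceCtx) {
    return CallerSupplies::DeviceContext;
  }
  if (methods & kMethodHwFramesCtx) {
    return CallerSupplies::FramesContext;
  }
  return CallerSupplies::Nothing;
}
```

Under the device-context method the caller only hands over a device; the frames context is then expected to be created by the library, which is why reading it from the caller side can yield null.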

Assignee: nobody → stransky
Status: NEW → ASSIGNED
Flags: needinfo?(stransky)
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/2f7f1c9d7bc8
[Linux] Reference hw_device_ctx instead of hw_frames_ctx r=alwu
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch

Hi, I'm using the latest nightly (2b624fdb002e6012209c725042f072a50bd4c4b6) but still see this crash. One of my latest crash reports is here: https://crash-stats.mozilla.org/report/index/b19b8525-15b0-4e9f-a8c7-349900220326

I'm also still getting this crash. Using Arch Linux.

https://crash-stats.mozilla.org/report/index/2029e4da-953a-4664-96e5-016da0220313

Sorry for the double comment, forgot to add: I am mainly seeing it when looking at videos on Twitter. Sites like YouTube seem to just work.

Status: RESOLVED → REOPENED
Crash Signature: [@ av_buffer_ref] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda] → [@ av_buffer_ref] [@ libavutil.so.57@0x1e353] [@ libavutil.so.57@0x1e365] [@ libavutil.so.56@0x3081d] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda]
Resolution: FIXED → ---
Target Milestone: 100 Branch → ---

YouTube uses VP9 (when it's not preferring AV1) so that would be served by ffvpx, I think. I just had the crash again with a YouTube livestream, which uses H.264 and thus the system FFmpeg 5. (bp-e5ef9f46-6d4f-4184-ab8d-b990d0220327)

Crash Signature: [@ av_buffer_ref] [@ libavutil.so.57@0x1e353] [@ libavutil.so.57@0x1e365] [@ libavutil.so.56@0x3081d] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda] → [@ av_buffer_ref] [@ libavutil.so.57@0x1e353] [@ libavutil.so.57@0x1e365] [@ libavutil.so.56@0x3081d] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda] [@ vaapi_buffer_free | buffer_pool_free | buffer_replace]

Bug 1757791 may be a duplicate of bug 1759137

Duplicate of this bug: 1760414

This is still an issue.

Flags: needinfo?(stransky)

bp-e5ef9f46-6d4f-4184-ab8d-b990d0220327 is fixed by Bug 1758610. It's caused by a different AVFrame layout, which is handled by using different VideoFramePool modules.

Status: REOPENED → RESOLVED
Closed: 5 months ago → 5 months ago
Flags: needinfo?(stransky)
Resolution: --- → FIXED
See Also: → 1758610
Target Milestone: --- → 100 Branch

Just tested in 101.0a1 (2022-04-04) (64-bit)
still crashing.
bp-38e3c1d9-d779-4cc0-9ce1-986120220401

(In reply to abonnements from comment #20)

bp-6bf9720c-5bf4-4c11-90ed-6cfe80220405

This is something different, please file another bug for it.

Hm. This one here was marked as a duplicate of 1759596. I guess it is not.

I reopened bug 1759596 for it.
Thanks.

Crash Signature: [@ av_buffer_ref] [@ libavutil.so.57@0x1e353] [@ libavutil.so.57@0x1e365] [@ libavutil.so.56@0x3081d] [@ mozalloc_abort | abort | libavcodec.so.59@0x33bcda] [@ vaapi_buffer_free | buffer_pool_free | buffer_replace] → [@ av_buffer_ref] [@ libavutil.so.57@0x1e353] [@ libavutil.so.57@0x1e365] [@ libavutil.so.56@0x3081d] [@ vaapi_buffer_free | buffer_pool_free | buffer_replace]
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → NEW

The fix here is wrong and we need to revert it due to Bug 1762725.
I added an assertion there, so let's keep an eye on that.

By the way, after reverting this one (Bug 1762725) I can't reproduce it with ffmpeg 5.0.1. hw_frames_ctx is supposed to be initialized by ffmpeg because it holds the frame pool.

Regressions: 1762725
Depends on: 1766693
No longer depends on: 1766693
Duplicate of this bug: 1766693
OS: Unspecified → Linux
Hardware: Unspecified → Desktop
Target Milestone: 100 Branch → ---

Let's close this one to avoid confusion, as it created the Bug 1762725 regression.

Status: NEW → RESOLVED
Closed: 5 months ago → 4 months ago
Resolution: --- → FIXED

If any new crash pops up in av_buffer_ref (though I don't expect one), let's track that as a new bug.
