Crash in [@ put_8tap_4h_43]
Categories
(Core :: Audio/Video: Playback, defect)
Tracking
()
People
(Reporter: RyanVM, Assigned: az)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
|
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
|
Details | Review |
Crash report: https://crash-stats.mozilla.org/report/index/8002f140-24bf-4294-b6ed-78f510240219
Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Top 10 frames of crashing thread:
0 libmozavcodec.dylib put_8tap_4h_43
1 libmozavcodec.dylib put_regular4_hv_neon media/ffvpx/libavcodec/aarch64/vp9dsp_init_aarch64.c:100
2 libmozavcodec.dylib mc_chroma_unscaled media/ffvpx/libavcodec/vp9recon.c
2 libmozavcodec.dylib inter_pred_8bpp media/ffvpx/libavcodec/vp9_mc_template.c:418
3 libmozavcodec.dylib inter_recon media/ffvpx/libavcodec/vp9recon.c:593
3 libmozavcodec.dylib ff_vp9_inter_recon_8bpp media/ffvpx/libavcodec/vp9recon.c:648
4 libmozavcodec.dylib ff_vp9_decode_block media/ffvpx/libavcodec/vp9block.c:1397
5 libmozavcodec.dylib decode_sb media/ffvpx/libavcodec/vp9.c
6 libmozavcodec.dylib decode_sb media/ffvpx/libavcodec/vp9.c:1130
7 libmozavcodec.dylib decode_sb media/ffvpx/libavcodec/vp9.c:1135
Comment 1•1 year ago
|
||
Adding another signature that looks quite close to the first one, crash addresses also look similar.
Comment 2•1 year ago
|
||
And more signatures, I suspect we might be looking at a single installation sending all of these.
Comment 3•1 year ago
|
||
And one more... these crashes all seem to be coming from a single ARM-based machine.
Updated•1 year ago
|
Comment 4•1 year ago
|
||
And one more... these crashes all seem to be coming from a single ARM-based machine.
I see:
macOS 11 5 38.5%
macOS 14 5 38.5%
macOS 13 3 23.1%
So it's likely more than 1 machine.
Comment 5•1 year ago
•
|
||
https://crash-pings.mozilla.org/ now shows this to be the nr. 1 topcrasher on macOS, affecting at least 250 clients and causing thousands of crashes. We're not seeing this on Socorro because it's in the RDD process so people never submit reports. Raising to S2.
Comment 6•1 year ago
|
||
Added a couple more signatures which are clearly related.
Comment 7•11 months ago
|
||
Since the crash volume is low (less than 15 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.
For more information, please visit BugBot documentation.
Comment 8•11 months ago
|
||
The crash volume is much higher than what we can see on Socorro. Crash ping data indicates that there are thousand of instances per week with hundreds of users being affected.
Comment 9•11 months ago
|
||
Symbols for Linux 64-bit ARM builds show that this is a problem there too, so likely tied to the AArch64-specific code paths and not to any particular operating system. It might be worth looking into ffmpeg's tracker if they're already aware of this issue or not.
Comment 10•11 months ago
|
||
From our mastodon conversation I gather that firefox allocates the frame buffers as '1280 * 640 + 640 * 320 + 640 * 320 == 300 * 4096', i.e. the buffer size matches the pixel data exactly? This triggers the issue.
There is a mismatch in FFmpeg's VP9 decoder on when to emulate edges and how much the 8tap NEON mc filter over reads from the reference frame. This is hidden within FFmpeg since its avcodec_default_get_buffer2() over allocates the planes by at least 31-bytes, see https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/HEAD:/libavcodec/get_buffer.c#l122
A quick workaround could be to do the same in firefox. Fixing the issue in FFmpeg might take a few days.
Updated•11 months ago
|
Comment 11•11 months ago
|
||
Since nightly and release are affected, beta will likely be affected too.
For more information, please visit BugBot documentation.
Comment 12•11 months ago
|
||
I've sent fixes for the root issue in FFmpeg's VP9 decoder:
https://ffmpeg.org//pipermail/ffmpeg-devel/2024-December/337723.html
https://ffmpeg.org//pipermail/ffmpeg-devel/2024-December/337724.html
https://ffmpeg.org//pipermail/ffmpeg-devel/2024-December/337725.html
Comment 13•11 months ago
|
||
Thank you very much for your help here Janne.
| Assignee | ||
Comment 14•11 months ago
|
||
Updated•11 months ago
|
| Assignee | ||
Comment 15•11 months ago
|
||
We're not currently using libavcodec/arm/vp9mc_neon.S so it's not included in the attached patch.
Comment 16•11 months ago
|
||
media/ffvpx/libavcodec/vp9recon.c is generating a code coverage warning. Do we not test VP9 decoding in CI?
Comment 17•10 months ago
|
||
(In reply to Janne Grunau from comment #10)
There is a mismatch in FFmpeg's VP9 decoder on when to emulate edges and how much the 8tap NEON mc filter over reads from the reference frame. This is hidden within FFmpeg since its avcodec_default_get_buffer2() over allocates the planes by at least 31-bytes, see https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/HEAD:/libavcodec/get_buffer.c#l122
A quick workaround could be to do the same in firefox.
Hi guys. I've been discussing the patch with Janne upstream. Thanks to him for finding the root cause of the crash. Upstream, buffers are allocated with extra padding (31 bytes) to allow minor overreads in individual decoder implementations. Individual decoder implementations (including SIMD optimized variants for particular target systems) may use this at their discretion to do minor overreads, as we see here.
Janne's patch fixes the issue, but other issues may remain in other corners of the FFmpeg codebase. A more generic approach would be to over-allocate the buffer allocated by Firefox in the AVCodecContext.get_buffer() callback also - by the same 31 bytes also. This would be identical to what upstream does and would therefore be more future-proof.
Would you guys be open to such a change?
Comment 18•10 months ago
|
||
Janne's original patch-set was merged upstream yesterday. Providing padding in the buffer returned from the get_buffer() callback would still be desirable but VP9 now no longer relies on that. (Other codecs may, though.)
Comment 19•10 months ago
|
||
We should very much do the same as upstream indeed, thanks for taking the time to mention it.
I filed this as https://bugzilla.mozilla.org/show_bug.cgi?id=1940034.
Comment 20•10 months ago
|
||
Comment 21•10 months ago
|
||
| bugherder | ||
| Reporter | ||
Updated•10 months ago
|
Comment 22•10 months ago
|
||
The patch landed in nightly and beta is affected.
:az, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox135towontfix.
For more information, please visit BugBot documentation.
| Assignee | ||
Comment 23•10 months ago
|
||
Comment on attachment 9444751 [details]
Bug 1881185 - Add and apply ffmpeg patch to fix VP9 decoding edge emulation and potential over-reads on aarch64.
Beta/Release Uplift Approval Request
- User impact if declined/Reason for urgency: Top crasher on mac
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Patch from upstream where quite a few people have already looked at it
- String changes made/needed:
- Is Android affected?: Yes
| Reporter | ||
Comment 24•10 months ago
|
||
Comment on attachment 9444751 [details]
Bug 1881185 - Add and apply ffmpeg patch to fix VP9 decoding edge emulation and potential over-reads on aarch64.
Approved for 135.0b4.
Comment 25•10 months ago
|
||
| uplift | ||
| Reporter | ||
Updated•10 months ago
|
| Reporter | ||
Updated•10 months ago
|
| Reporter | ||
Updated•10 months ago
|
Description
•