Closed Bug 1749609 Opened 3 years ago Closed 3 years ago

RDD/VAAPI: Crash in [@ __GI_flock ] with Nightly 96 on AMDGPU + Wayland

Categories

(Core :: Security: Process Sandboxing, defect, P3)

Firefox 96
x86_64
Linux
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr91 --- unaffected
firefox96 --- disabled
firefox97 --- disabled
firefox98 --- disabled

People

(Reporter: coolx67, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Crash Data

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0

Steps to reproduce:

Setup:
Arch Linux, GNOME 41.2, Wayland, AMDGPU (Radeon Pro Duo / Fiji)
Troubleshooting information:
https://gist.github.com/romanstingler/214039ca451935294f5bfe0997e82746

Set media.ffmpeg.vaapi.enabled = true
Open http://demo.nimius.net/video_test/

Actual results:

opening the page immediately crashes the process and generates a coredump
https://gist.github.com/romanstingler/4fdf9f12683279b748c1e92bee6bfad1

I tested the nightly builds around the regression and found the last good and the first broken one:

good
https://archive.mozilla.org/pub/firefox/nightly/2021/11/2021-11-22-09-39-12-mozilla-central/firefox-96.0a1.en-US.linux-x86_64.tar.bz2

broken
https://archive.mozilla.org/pub/firefox/nightly/2021/11/2021-11-23-09-42-49-mozilla-central/firefox-96.0a1.en-US.linux-x86_64.tar.bz2

Expected results:

Doesn't crash on 96 (release) or 97 (beta), but crashes on every Nightly after the broken build above.

I got a lot of crashes like this one:
https://crash-stats.mozilla.org/report/index/23ba7cc8-b113-4a80-93d5-44f830220111#allthreads

The Bugbug bot thinks this bug should belong to the 'Core::Audio/Video: Playback' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
Severity: -- → S3
Priority: -- → P3

Thanks for the report!

Build ID 20211123094249 (2021-11-23)

Please update to Nightly 98. Does the crash still occur?
RDD sandbox fixes happened in 97, not 96.

Crash Signature: [@ __GI_flock ]
Keywords: crash
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Summary: Regression Nightly96 VAAPI crash on AMDGPU + Wayland → RDD/VAAPI: Crash in [@ __GI_flock ] with Nightly 96 on AMDGPU + Wayland

Can you please use mozregression tool to find exact commits?
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Use_Mozregression_tool
I expect the bug was introduced between 95 and 96 then.

Also please run Firefox with MOZ_LOG="Dmabuf:5, PlatformDecoderModule:5" env variable, crash it and attach the log here.
Thanks.
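
The logging request above could be scripted roughly as follows. This is only a sketch: the helper name run_with_vaapi_log and the log file path are hypothetical, MOZ_ENABLE_WAYLAND=1 is taken from the mozregression command later in this bug, and the binary path is whatever build is being tested.

```shell
# Log modules exactly as requested in the comment above.
MOZ_LOG_MODULES="Dmabuf:5, PlatformDecoderModule:5"
# Hypothetical output path; any writable location works.
VAAPI_LOG_FILE="${TMPDIR:-/tmp}/firefox-vaapi.log"

# Hypothetical helper: start the given Firefox binary on the test page
# and capture both stdout and stderr in the log file.
run_with_vaapi_log() {
  firefox_bin="$1"
  MOZ_ENABLE_WAYLAND=1 MOZ_LOG="$MOZ_LOG_MODULES" \
    "$firefox_bin" "http://demo.nimius.net/video_test/" >"$VAAPI_LOG_FILE" 2>&1
}
```

Calling run_with_vaapi_log with the path to an unpacked nightly binary, reproducing the crash, and then attaching the resulting file would cover the request.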

Flags: needinfo?(coolx67)

There are plenty of crashes in versions 97 and 98. Confirming. This is the sandbox preventing a flock() syscall.
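
For readers unfamiliar with the call: flock() takes an advisory lock on an open file. The sketch below only illustrates that operation outside any sandbox, using the util-linux flock(1) command-line wrapper. It does not reproduce the RDD sandbox or the crash, and the lock file and the result strings are made up for the example.

```shell
# Temporary file to lock; purely illustrative.
lock_file="$(mktemp)"

if command -v flock >/dev/null 2>&1; then
  # -n: fail instead of waiting if the lock is already held.
  # -c: run the command while the lock is held.
  result="$(flock -n "$lock_file" -c 'echo lock-acquired')"
else
  # util-linux flock(1) is not available on this system.
  result="flock-not-installed"
fi

echo "$result"
rm -f "$lock_file"
```

In a process whose sandbox policy does not allow the syscall, the equivalent call does not get this far, which is consistent with the [@ __GI_flock ] signature.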

Status: UNCONFIRMED → NEW
Component: Audio/Video: Playback → Security: Process Sandboxing
Ever confirmed: true

The uses are coming either from here or here.

The component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit auto_nag documentation.

Priority: P3 → --

(In reply to Roman Stingler from comment #0)

Doesn't crash on 96 (release) or 97 (beta)

Please attach about:support of Release and Beta here.
You might have changed other prefs which might explain the differences.

I assume rdd-ffvpx tried to use VP8 VAAPI right after VAAPI has been allowed in RDD: https://hg.mozilla.org/integration/autoland/rev/669ca27af67f

Regressed by: 1698778
Blocks: 1743926
No longer blocks: egl-linux-vaapi
No longer regressed by: 1698778

Firefox Beta 97.0b2 troubleshooting information

Flags: needinfo?(coolx67)

ff-release-troubleshooting-info 96.0

(In reply to Darkspirit from comment #2)

Thanks for the report!

Build ID 20211123094249 (2021-11-23)

Please update to Nightly 98. Does the crash still occur?
RDD sandbox fixes happened in 97, not 96.

Yes happens also with latest nightly 98.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #3)

Can you please use mozregression tool to find exact commits?
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Use_Mozregression_tool
I expect the bug was introduced between 95 and 96 then.

Also please run Firefox with MOZ_LOG="Dmabuf:5, PlatformDecoderModule:5" env variable, crash it and attach the log here.
Thanks.

No, I can't use mozregression: I can't launch it under Wayland as a non-root user, and as root I'm not allowed to open the binaries mozregression provides.

So I bisected manually instead. Here is the MOZ_LOG output from the latest Beta:
https://gist.github.com/romanstingler/343dcef20ba83451f74c048d5e23ea5c

Attachment #9258743 - Attachment mime type: application/octet-stream → text/plain

blackout@Workstation ~ % QT_QPA_PLATFORM=wayland mozregression-gui
Warning: Ignoring XDG_SESSION_TYPE=wayland on Gnome. Use QT_QPA_PLATFORM=wayland to run on Wayland anyway.

(mozregression-gui:32611): GLib-GIO-ERROR **: 22:30:06.319: Settings schema 'org.gnome.settings-daemon.plugins.xsettings' does not contain a key named 'antialiasing'
[1]    32610 trace trap (core dumped)  QT_QPA_PLATFORM=wayland mozregression-gui

For anyone interested, here is the Nightly dmabuf log:
https://gist.github.com/romanstingler/e99cb04e8be174163c7bb46fa9fcc94f

Thanks!

It's possible to install mozregression as a user. The commands would have been:
$ pip3 install --user --upgrade mozregression
$ MOZ_ENABLE_WAYLAND=1 ~/.local/bin/mozregression --good 95 --bad 96 --pref media.ffmpeg.vaapi.enabled:true -a http://demo.nimius.net/video_test/

Range from comment 0 is:
mozregression --good 20211122093912 --bad 20211123094249

0:02.49 INFO: Last good revision: ace2f4af2c29de1886e1e627d0fdb583e7573b59 (2021-11-22 09:39:12)
0:02.49 INFO: First bad revision: 71332992f78f548b219f6405ca970828bd22c071 (2021-11-23 09:42:49)
0:02.49 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ace2f4af2c29de1886e1e627d0fdb583e7573b59&tochange=71332992f78f548b219f6405ca970828bd22c071
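
The build IDs in that range are the timestamps embedded in the two archive URLs from comment 0, with the hyphens removed. A small sketch of that derivation follows; the helper name build_id_from_url is hypothetical, and the final line only prints the command rather than running it.

```shell
# Hypothetical helper: print the 14-digit build ID embedded in a
# nightly archive URL (the YYYY-MM-DD-HH-MM-SS directory component).
build_id_from_url() {
  printf '%s\n' "$1" \
    | sed -n 's|.*/\([0-9]\{4\}\(-[0-9]\{2\}\)\{5\}\)-mozilla-central/.*|\1|p' \
    | tr -d '-'
}

# URLs of the good and broken nightlies from comment 0.
good_url="https://archive.mozilla.org/pub/firefox/nightly/2021/11/2021-11-22-09-39-12-mozilla-central/firefox-96.0a1.en-US.linux-x86_64.tar.bz2"
bad_url="https://archive.mozilla.org/pub/firefox/nightly/2021/11/2021-11-23-09-42-49-mozilla-central/firefox-96.0a1.en-US.linux-x86_64.tar.bz2"

good_id="$(build_id_from_url "$good_url")"
bad_id="$(build_id_from_url "$bad_url")"

# Print the resulting command without executing it.
echo "mozregression --good $good_id --bad $bad_id"
```

For the two URLs above this prints the same range as quoted: mozregression --good 20211122093912 --bad 20211123094249.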

(In reply to Darkspirit from comment #12)

Thanks a lot. I had previously installed the binary from the Arch Linux repository.

I ran the bisection and this is the final result:

 5:00.96 INFO: Narrowed integration regression window from [e662ecf3, 669ca27a] (4 builds) to [e24b5167, 669ca27a] (2 builds) (~1 steps left)
 5:00.96 INFO: No more integration revisions, bisection finished.
 5:00.96 INFO: Last good revision: e24b51679d83b9edd509529057ff176c364c6414
 5:00.96 INFO: First bad revision: 669ca27af67f297133783f30c87b709470b0d3fa
 5:00.96 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=e24b51679d83b9edd509529057ff176c364c6414&tochange=669ca27af67f297133783f30c87b709470b0d3fa

Priority: -- → P1

The spike here appears to have disappeared.

Priority: P1 → P3

Closing because no crashes have been reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME