Closed Bug 1749324 Opened 2 years ago Closed 2 years ago

nvidia-vaapi-driver: Crash in [@ __GI___sched_get_priority_max]

Categories

(Core :: Security: Process Sandboxing, defect, P3)

Unspecified
Linux
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox98 --- disabled

People

(Reporter: gsvelto, Assigned: jld)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/be4ac4b5-0245-4763-a024-54e990220110

Reason: SIGSYS / 0x00000001

Top 10 frames of crashing thread:

0 libc.so.6 __GI___sched_get_priority_max 
1 libcuda.so.1 <.text ELF section in libcuda.so.495.46> 
2 libcuda.so.1 <.text ELF section in libcuda.so.495.46> 
3 libcuda.so.1 cudbgApiInit 
4 libcuda.so.1 <.init ELF section in libcuda.so.495.46> 
5 ld-linux-x86-64.so.2 call_init 
6 ld-linux-x86-64.so.2 _dl_init 
7 libc.so.6 __GI__dl_catch_exception 
8 ld-linux-x86-64.so.2 dl_open_worker 
9 libc.so.6 __GI__dl_catch_exception 

These are obviously calls to sched_get_priority_max() (syscall 142). They seem to come from deep within the CUDA libraries (cringe) being pulled in by the VA-API code.
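
For anyone not familiar with how these show up as crashes: the RDD process runs under a seccomp-bpf filter, and a syscall the policy doesn't allow gets turned into SIGSYS. A minimal standalone sketch (not Firefox's actual sandbox code, and with the architecture check omitted) that reproduces the same failure mode:

/* sandbox_sigsys.c - install a tiny seccomp-bpf filter that traps
 * sched_get_priority_max(), then call it; the process dies with SIGSYS,
 * matching the crash signature above. */
#include <stddef.h>
#include <stdio.h>
#include <sched.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

int main(void)
{
    struct sock_filter filter[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* Trap (SIGSYS) on sched_get_priority_max, allow everything else. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_sched_get_priority_max, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(filter) / sizeof(filter[0]),
        .filter = filter,
    };

    prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
    prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);

    /* Dies here with SIGSYS. */
    printf("%d\n", sched_get_priority_max(SCHED_FIFO));
    return 0;
}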

Assignee: nobody → jld
Priority: -- → P1

Adding a regressed-by since this is spiking now.

Regressed by: 1745225
Has Regression Range: --- → yes

32097 crashes off a single install on just one day O.o
Looks like a prime example of bug 1746232

All crashes occurred with Nvidia.
https://github.com/elFarto/nvidia-vaapi-driver crashes the RDD process when used without the MOZ_DISABLE_RDD_SANDBOX=1 env var.
Duplicate of bug 1748460.

Blocks: 1748460
Type: defect → enhancement
Summary: Crash in [@ __GI___sched_get_priority_max] → nvidia-vaapi-driver: Crash in [@ __GI___sched_get_priority_max]
Type: enhancement → defect

Regressed by: bug 1745225

Here they are trying to add AV1 support: https://github.com/elFarto/nvidia-vaapi-driver/issues/31

I'm going to take a look at changing the library to fail init if it detects that it's running in the sandbox. The only issue I can see with that is that the CUDA library runs before any of our code does, so I'll need to take a close look and make sure we're not linking directly against libcuda.
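
For illustration, one way such a check could look, assuming /proc/self/status is still readable from inside the sandbox (this is just a sketch, not necessarily the check that ends up shipping):

/* Hypothetical sketch of a "refuse to initialize inside the sandbox" check.
 * The Seccomp field of /proc/self/status is 2 when a seccomp-bpf filter
 * (such as Firefox's RDD sandbox) is active. */
#include <stdbool.h>
#include <stdio.h>

static bool running_under_seccomp_filter(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return false; /* can't tell, assume unsandboxed */

    char line[256];
    bool filtered = false;
    while (fgets(line, sizeof(line), f)) {
        int mode;
        if (sscanf(line, "Seccomp: %d", &mode) == 1) {
            filtered = (mode == 2); /* 2 == SECCOMP_MODE_FILTER */
            break;
        }
    }
    fclose(f);
    return filtered;
}

The driver's init entry point could then bail out early with an error (so vaInitialize fails cleanly) instead of tripping the sandbox later when libcuda is loaded.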

I'm also somewhat confused as to how va_infoMessage ended up calling dlopen.

I've added a sandbox check in v0.0.3 of the library; I'll be releasing it shortly.

See Also: → 1746232
Severity: S2 → S3
Priority: P1 → --
Priority: -- → P3

Closing because no crashes have been reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME