1754511 - Crash in [@ RtlpWaitOnCriticalSection | RtlpEnterCriticalSectionContended | RtlEnterCriticalSection | <unknown in nvencmfth264x.dll> | CSerialWorkQueue::QueueItem::ExecuteWorkItem]

Ryan VanderMeulen [:RyanVM]

Reporter

Description

•

3 years ago

Maybe Fission related. (DOMFissionEnabled=1)

I see reports for this going back to September. Not sure how actionable this is on our end.

Crash report: https://crash-stats.mozilla.org/report/index/d297f57f-4230-4ec3-9d2d-389670220128

Reason: EXCEPTION_ACCESS_VIOLATION_WRITE

Top 10 frames of crashing thread:

0 ntdll.dll RtlpWaitOnCriticalSection 
1 ntdll.dll RtlpEnterCriticalSectionContended 
2 ntdll.dll RtlEnterCriticalSection 
3 nvencmfth264x.dll <unknown in nvencmfth264x.dll> 
4 nvencmfth264x.dll <unknown in nvencmfth264x.dll> 
5 rtworkq.dll int CSerialWorkQueue::QueueItem::ExecuteWorkItem 
6 rtworkq.dll virtual long CSerialWorkQueue::QueueItem::OnWorkItemAsyncCallback::Invoke 
7 rtworkq.dll ThreadPoolWorkCallback 
8 ntdll.dll TppWorkpExecuteCallback 
9 ntdll.dll TppWorkerThread

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 1

•

3 years ago

Is this possibly related to bug 1751964?

Flags: needinfo?(jolin)

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 2

•

3 years ago

This is spiking on release quite a bit. Can we find an owner for this?

Flags: needinfo?(jmathies)

Jim Mathies [:jimm]

Updated

•

3 years ago

Blocks: media-triage

Flags: needinfo?(jmathies)

Jim Mathies [:jimm]

Comment 3

•

3 years ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #1)

Is this possibly related to bug 1751964?

That's not in release yet.

Jim Mathies [:jimm]

Comment 4

•

3 years ago

We thought this might be tied to the WebRTC update, but that shipped in 96. This appears to be an issue in an NVIDIA hardware encoding library (nvencmfth264x.dll) that crashes in the content process. Unfortunately, the stacks don't give us much to go on.

Flags: needinfo?(jolin)

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 5

•

3 years ago

Looking at just the Nightly reports, it looks like the first ones started coming in back during the 94 Nightly cycle, but the big spike started in 96. And then it rode to release on 97. Does that line up with any feature work that comes to mind? Note that this is currently the #10 top content process crash.

Jim Mathies [:jimm]

Updated

•

3 years ago

Blocks: webrtc-triage
No longer blocks: media-triage

Jim Mathies [:jimm]

Comment 6

•

3 years ago

So then this points to the webrtc update we did in 96.

FYI in 99, we disabled hardware encoding for webrtc due to win32k lockdown, so this should fall off then. Eventually we'll move that to the RDD, so it might show back up there.

Will triage with the webrtc team. Note though the stacks here don't tell us much. Might be worth my posting about this to the nvidia list as well.

Jim Mathies [:jimm]

Updated

•

3 years ago

Blocks: media-triage
No longer blocks: webrtc-triage

Jim Mathies [:jimm]

Comment 7

•

3 years ago

Hardware encoding in content was disabled in 100, and we're working on moving it to the RDD, so this should go away over time.
We have the ability to block specific hardware encoders and could consider that here if the crash comes back in the RDD.

No longer blocks: media-triage

Severity: S2 → S4

Priority: -- → P3

Jeff Muizelaar [:jrmuizel]

Comment 8

•

3 years ago

I've put together a prototype of a tool that filters out sensitive information from minidumps: https://github.com/jrmuizel/minidump-filter

We should be able to use it to send a minidump to Nvidia so that they can perhaps help us out.

Jeff Muizelaar [:jrmuizel]

Updated

•

3 years ago

Flags: needinfo?(jmuizelaar)

BugBot [:suhaib / :marco/ :calixte]

Comment 9

•

3 years ago

The bug is linked to a topcrash signature, which matches the following criterion:

Top 10 content process crashes on beta

:jimm, could you consider increasing the severity of this top-crash bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jmathies)

Keywords: topcrash

BugBot [:suhaib / :marco/ :calixte]

Comment 10

•

3 years ago

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

For more information, please visit auto_nag documentation.

Keywords: topcrash (removed)

Jim Mathies [:jimm]

Comment 11

•

3 years ago

FYI, dueling bots here. When the second bot removed the topcrash keyword, it should have also cleared the needinfo for triage.

Flags: needinfo?(jmathies) → needinfo?(mcastelluccio)

Marco Castelluccio [:marco]

Comment 12

•

3 years ago

Good point, filed https://github.com/mozilla/relman-auto-nag/issues/1684.
Hopefully this is not happening too frequently, as usually topcrashes stay topcrashes for longer than a week!

Flags: needinfo?(mcastelluccio)

BugBot [:suhaib / :marco/ :calixte]

Comment 13

•

3 years ago

The bug is linked to a topcrash signature, which matches the following criterion:

Top 10 content process crashes on beta

:jimm, could you consider increasing the severity of this top-crash bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jmathies)

Keywords: topcrash

Marco Castelluccio [:marco]

Comment 14

•

3 years ago

For some reason this crash is oscillating in and out of topcrash status. We should either increase the delay between the bot's last keyword change and the next one to reduce the noise, or use two thresholds: add the topcrash keyword when the crash is in the top 10, but don't remove it until it drops below, say, the top 20.
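The two-threshold idea can be sketched roughly as follows. This is only an illustration, not the auto_nag implementation: the function name `next_topcrash_state`, the constants, and the sample ranks are all hypothetical, and the 20 is just the "say, top 20" figure from this comment.

```python
from typing import List, Optional

ADD_RANK = 10     # hypothetical: add the keyword when the crash is in the top 10
REMOVE_RANK = 20  # hypothetical: remove it only once the crash falls out of the top 20


def next_topcrash_state(has_keyword: bool, rank: Optional[int]) -> bool:
    """Decide whether the bug should carry the topcrash keyword.

    `rank` is the signature's 1-based position in the crash list,
    or None if the signature is not ranked at all.
    """
    if rank is not None and rank <= ADD_RANK:
        return True
    if rank is None or rank > REMOVE_RANK:
        return False
    # Between the two thresholds: leave the keyword as it is.
    return has_keyword


# Made-up ranks hovering around 10, then dropping off.
ranks = [9, 11, 10, 12, 9, 25]

state = False
history: List[bool] = []
for r in ranks:
    state = next_topcrash_state(state, r)
    history.append(state)

# What a single top-10 cutoff would do with the same ranks.
single_threshold = [r <= ADD_RANK for r in ranks]
```

With these sample ranks, the single cutoff flips the keyword on every check, while the two-threshold version sets it once and clears it only when the rank falls to 25.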

Flags: needinfo?(jmathies)

BugBot [:suhaib / :marco/ :calixte]

Comment 15

•

3 years ago

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

For more information, please visit auto_nag documentation.

Keywords: topcrash (removed)

BugBot [:suhaib / :marco/ :calixte]

Comment 16

•

1 year ago

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED

Closed: 1 year ago

Resolution: --- → WORKSFORME


Categories

(Core :: Audio/Video, defect, P3)

Tracking

()

People

(Reporter: RyanVM, Unassigned, NeedInfo)

References

Details

(Keywords: crash)

Crash Data

Security

(public)
