Closed Bug 1676999 Opened 4 years ago Closed 4 years ago

Crash in [@ DdQueryDisplaySettingsUniqueness]

Categories

(Core :: Audio/Video: Playback, defect, P3)

Unspecified
Windows 10
defect

Tracking

RESOLVED FIXED
85 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox82 --- unaffected
firefox83 --- unaffected
firefox84 --- fixed
firefox85 --- fixed

People

(Reporter: mccr8, Assigned: jya)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files)

Maybe Fission related. (DOMFissionEnabled=1)

Crash report: https://crash-stats.mozilla.org/report/index/c8bcea65-8811-4916-be1a-7f39a0201112

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 gdi32.dll DdQueryDisplaySettingsUniqueness 
1 dxgi.dll long CDXGIFactory::SampleAdapters 
2 dxgi.dll long CDXGIFactory::Initialize 
3 dxgi.dll long CreateDXGIFactoryImpl 
4 dxgi.dll long CreateDXGIFactoryActualImpl1 
5 dxgi.dll CreateDXGIFactory1 
6 xul.dll mozilla::gfx::DeviceManagerDx::GetDXGIAdapter gfx/thebes/DeviceManagerDx.cpp:503
7 xul.dll mozilla::gfx::DeviceManagerDx::CreateContentDevice gfx/thebes/DeviceManagerDx.cpp:805
8 xul.dll mozilla::gfx::DeviceManagerDx::CreateContentDevices gfx/thebes/DeviceManagerDx.cpp:481
9 xul.dll mozilla::RDDParent::RecvInitVideoBridge dom/media/ipc/RDDParent.cpp:190

643 crashes from 6 installations on Nightly. I don't know if this is a dupe of something. These are crashes in the RDD process.

These all seem to be from the same machine (6 different build IDs, 1 graphics adapter type).

Has Regression Range: --- → yes

jya, regressed by your bug 1595994.

Flags: needinfo?(jyavenard)

Hmm. This is a worry.

I have no clue.

Having Fission enabled shouldn't cause anything like this. We have a single video bridge between the RDD and the GPU process; having more content processes is independent of that.

The crash address in all those reports is 0x180090; not what I would expect to see on a 64-bit system.

I wonder if this could be a sandboxing issue.

Matt, any ideas?

Flags: needinfo?(jyavenard) → needinfo?(matt.woodrow)

To test the sandboxing issue, could we temporarily disable the sandbox on Windows Nightly to see if that reduces the crash rate?

Flags: needinfo?(bobowencode)

(In reply to Jean-Yves Avenard [:jya] from comment #4)

To test the sandboxing issue, could we temporarily disable the sandbox on Windows Nightly to see if that reduces the crash rate?

One of the crash reports had "win32k lockdown" in the comment.
So I tried enabling it via the pref that still exists, security.sandbox.rdd.win32k-disable, and sure enough I get the same crash.
My strong suspicion is that this is a small number of installs, possibly only 1, that has that pref set.
It falls back to using another process, so they probably won't notice.

I guess we could remove the pref altogether for the moment, especially given the noise.

For what it's worth, I don't think disabling one of the sandboxes entirely in Nightly is ever something we should be doing to test things like this.

Flags: needinfo?(matt.woodrow)
Flags: needinfo?(bobowencode)
Severity: -- → S4
Priority: -- → P3

We believe with very high confidence that, despite the very high number of crashes, only two installs are affected. Otherwise that would mean a very large number of people with exactly the same graphics adapter running exactly the same drivers.

What we believe is happening is the following:
1- Someone goes to Facebook and there's a video in the feed.
2- The RDD is started and crashes.
3- This returns a video playback error.
4- Facebook sees that as a decoding error and loads the video again.
5- Go back to step 2 and repeat.

So you end up with a very high number of crashes, though they wouldn't really be noticeable for the user.

So what we are going to do for this is:
1- Remove the security.sandbox.rdd.win32k-disable pref
2- Launch the RDD; if we get a crash, re-spawn the RDD process but disable HW acceleration
3- If RDD crashes again, we'll disable the RDD process.
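
To make the plan above concrete, here is a minimal sketch of the fallback as a small state machine. It is an illustrative model only, not the Firefox implementation: the names `RddMode`, `RddLaunchPolicy` and `simulate_page_retries` are hypothetical, and the assumption that a page simply retries playback after each decoding error is taken from the description of the crash loop earlier in this comment.

```python
from dataclasses import dataclass
from enum import Enum


class RddMode(Enum):
    """Hypothetical launch modes, in the order the plan steps through them."""
    HW_ACCELERATED = "RDD with hardware acceleration"
    SOFTWARE_ONLY = "RDD without hardware acceleration"
    DISABLED = "RDD process disabled"


@dataclass
class RddLaunchPolicy:
    """Illustrative model of the fallback plan; not actual Firefox code."""
    mode: RddMode = RddMode.HW_ACCELERATED
    crash_count: int = 0

    def on_crash(self) -> RddMode:
        """Record one RDD crash and step down to the next mode."""
        self.crash_count += 1
        if self.mode is RddMode.HW_ACCELERATED:
            self.mode = RddMode.SOFTWARE_ONLY
        elif self.mode is RddMode.SOFTWARE_ONLY:
            self.mode = RddMode.DISABLED
        return self.mode


def simulate_page_retries(policy, retries, crashing_modes):
    """Simulate a page that retries a video up to `retries` times.

    `crashing_modes` is the set of modes in which the RDD process crashes
    on this (hypothetical) machine. Returns the total number of RDD crashes.
    """
    for _ in range(retries):
        # Playback succeeds once the RDD is disabled or the current mode
        # no longer crashes, so the page stops retrying.
        if policy.mode is RddMode.DISABLED or policy.mode not in crashing_modes:
            break
        policy.on_crash()
    return policy.crash_count


if __name__ == "__main__":
    policy = RddLaunchPolicy()
    always_crashes = {RddMode.HW_ACCELERATED, RddMode.SOFTWARE_ONLY}
    crashes = simulate_page_retries(policy, retries=643, crashing_modes=always_crashes)
    print(crashes, policy.mode.value)
```

Under these assumptions the number of RDD crashes is bounded by two per session no matter how often the page retries, which is the point of steps 2 and 3.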

Assignee: nobody → jyavenard

The RDD process can no longer work without access to win32k; enabling this pref would lead to a crash on Nightly and a failure to work elsewhere.

Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ab085905410d
P1. Remove preference. r=bobowen
https://hg.mozilla.org/integration/autoland/rev/3d3d8e8473ff
P2. Disable windows hardware acceleration if the RDD crashed. r=mattwoodrow,mjf
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 85 Branch

The patch landed in nightly and beta is affected.
:jya, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jyavenard)

Comment on attachment 9189138 [details]
Bug 1676999 - P1. Remove preference. r?bobowen

Beta/Release Uplift Approval Request

  • User impact if declined: Users may play with that pref, and that would trigger a massive number of crash reports.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce: see https://bugzilla.mozilla.org/show_bug.cgi?id=1676999#c5
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): removing unused code, making it impossible to be called.
  • String changes made/needed: none
Flags: needinfo?(jyavenard)
Attachment #9189138 - Flags: approval-mozilla-beta?

Let's do P1 first.

We could do P2, but it will need rework as it depends on bug 1518344. We can do a backport, but it may not be worth it since, for now, what would most likely cause the RDD to crash is preffed off in beta.

Comment on attachment 9189138 [details]
Bug 1676999 - P1. Remove preference. r?bobowen

Approved for 84.0b7.

Attachment #9189138 - Flags: approval-mozilla-beta? → approval-mozilla-beta+