Closed Bug 1554438 Opened 5 years ago Closed 5 years ago

Crash should pref "gfx.direct3d11.use-double-buffering" be changed

Categories

(Core :: Graphics: Layers, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox67 --- unaffected
firefox68 --- wontfix
firefox69 --- verified

People

(Reporter: jya, Assigned: jya)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

Common case:
Start Firefox with the pref "gfx.direct3d11.use-double-buffering" set to false.
then set it to true. Shutdown.

It will assert on debug build, or hang

The reason for this is in:
https://searchfox.org/mozilla-central/source/gfx/ipc/GPUParent.cpp#272
if the pref is true we call:
widget::WinCompositorWindowThread::Start();

and when shutting down:
https://searchfox.org/mozilla-central/source/gfx/ipc/GPUParent.cpp#527
it calls
widget::WinCompositorWindowThread::ShutDown();

however, WinCompositorWindowThread::ShutDown()
assert that https://searchfox.org/mozilla-central/source/widget/windows/WinCompositorWindowThread.cpp#54
sWinCompositorWindowThread is set.

This is only true if widget::WinCompositorWindowThread::Start() has been called.

More troublesome case (this caused the backout of bug 1550422).
In all.js the pref gfx.direct3d11.use-double-buffering is set to have a default value of true.

But in gfxPrefs it is set to have a default value of false.

gfxPrefs was always initialised well after Preferences.
For StaticPrefs however it occurs earlier. We have an interval during which this StaticPrefs will be initialised to its default (false), and then all.js is read which would reset the value to true.
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=248345210&repo=try&lineNumber=1923
So we have a race where we see the GPUParent being created with the pref value set to false, and then at shutdown it's false.

And this cause the assertion.

I haven't determined on why we get a different values for the pref on this run. There's nothing in the code that could switch the pref.

So for now will just address this situation, and prevent the pref changing to cause a problem

The value of the pref may change between start and shutdown.

So we shouldn't rely on that pref to determine if we need to shutdown on thread started earlier if that pref was true.

It is theorically possible that the WinCompositorWindowThread failed to start. Should this happen, attempting to shut it down will cause shutdown to hang forever.

Depends on D32601

Attachment #9067584 - Attachment description: Bug 1554438 - P2. Only shutdown thread if it was started. r?Bas,?sotaro → Bug 1554438 - P2. Only shutdown thread if it was started. r?Bas,sotaro

We also have a similar case that cause bug 1438408.

If gfxVars::UseWebRender() change its value between the time GPUParent was started and then shutdown, we could attempt to call widget::WinCompositorWindowThread::ShutDown() when widget::WinCompositorWindowThread::Start() didn't.

Causing a crash or hang.

I believe the code should be changed such that we shouldn't assume that the start conditions are identical to the shutdown one. And we shutdown thread whenever they exists and not based on assuming they were started earlier (with invalid assumptions at that)

See Also: → 1438408
Attachment #9067584 - Attachment description: Bug 1554438 - P2. Only shutdown thread if it was started. r?Bas,sotaro → Bug 1554438 - P3. Shutdown threads whenever they exist. r?Bas,sotaro,mattwoodrow
Attachment #9067584 - Attachment description: Bug 1554438 - P3. Shutdown threads whenever they exist. r?Bas,sotaro,mattwoodrow → Bug 1554438 - P2. Only shutdown thread if it was started. r?Bas,sotaro,mattwoodrow

We only attempted to shutdow WinCompositorWindowThread if the WebRender thread hadn't not been started.

However, it is possible that the value of gfxVars::UseWebRender() changed since GPUParent::Init got called.

So don't assume anything, shutdown the thread if it still exists.

Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e40827b3a352
P1. Don't rely on a live pref to determine order of actions. r=sotaro
https://hg.mozilla.org/integration/autoland/rev/5247954abf59
P2. Only shutdown thread if it was started. r=sotaro
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69

I think I've figured out what was going on.
But that's an even weirder cause that I could have imagined.

So I've confirmed through try that the pref gfx.direct3d11.use-double-buffering doesn't change during the lifetime of the GPUParent.

However, during GPUParent::RecvInit() IsWin10OrLater() returns true.

But during GPUParent::ActorDestroy IsWin10OrLater() returns false !

Hence why this entire bug occurred in the first place as the code tested for gfxPrefs::Direct3D11UseDoubleBuffering() && IsWin10OrLater()

Will open a new bug about this.

wierd becomes wierder

Will close when P3 gets r+ and lands.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Actually, earlier comment was totally wrong.

Here is why it used to work, but following the move to StaticPrefs it no longer does.

gfxPrefs is initialised in GPUParent::Init()
https://searchfox.org/mozilla-central/rev/4606c7974a68cab416c038acaedcae49eed93822/gfx/ipc/GPUParent.cpp#113

where gfxPrefs::Singleton() is called.

As preferences weren't initialised in the GPU process, the default value for gfxPrefs of gfx.direct3d11.use-double-buffering was false.

The actual settings of the gfxPrefs values was done in GPUParent::RecvInit(): https://searchfox.org/mozilla-central/rev/4606c7974a68cab416c038acaedcae49eed93822/gfx/ipc/GPUParent.cpp#180

Now with the work done in bug 1550422, StaticPrefs is initialised very early and set at the same time as the GPU process is started.
So the default value of StaticPrefs::Direct3D11UseDoubleBuffering() is true.

In all those tests, GPUParent::RecvInit is never called, and as such neither wr::RenderThread::Start() is called, nor widget::WinCompositorWindowThread::Start().

When GPUParent::ActorShutdown() is called.
gfxPrefs::Direct3D11UseDoubleBuffering() would still return false, and as such didn't attempt to call widget::WinCompositorWindowThread::Shutdown() (https://searchfox.org/mozilla-central/rev/4606c7974a68cab416c038acaedcae49eed93822/gfx/ipc/GPUParent.cpp#522)

As wr::RenderThread::Get() would return nullptr, we fall on the test:
https://searchfox.org/mozilla-central/rev/4606c7974a68cab416c038acaedcae49eed93822/gfx/ipc/GPUParent.cpp#526
"else if (gfxPrefs::Direct3D11UseDoubleBuffering() && IsWin10OrLater()) {"

which with gfxPrefs's default would be false.
But StaticPrefs::Direct3D11UseDoubleBuffering() && IsWin10OrLater() instead returned true and would attempt to shutdown the thread that was never created.

And so: boom

Comment on attachment 9067583 [details]
Bug 1554438 - P1. Don't rely on a live pref to determine order of actions. r?Bas,sotaro

Beta/Release Uplift Approval Request

  • User impact if declined: If the user change the value of the pref gfx.direct3d11.use-double-buffering while firefox is running.
    Close firefox would make it hang forever.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Disable webrender if it's enabled.
    Start firefox.
    Change the preference gfx.direct3d11.use-double-buffering to false.
    Close firefox.

It will hang or assert if it's a debug build

  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): We only attempt to shutdown code if it got started
  • String changes made/needed:
Attachment #9067583 - Flags: approval-mozilla-beta?
Attachment #9067584 - Flags: approval-mozilla-beta?
Attachment #9067591 - Flags: approval-mozilla-beta?
Flags: qe-verify+
Assignee: nobody → jyavenard
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fd4d1986773e
P3. Shutdown threads whenever they exist. r=mattwoodrow

Is that pref ever changed by our code or UI? If not I don't think this warrants uplift.

Flags: needinfo?(jyavenard)
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED

(In reply to Julien Cristau [:jcristau] from comment #12)

Is that pref ever changed by our code or UI? If not I don't think this warrants uplift.

it's a visible pref in about:config

Flags: needinfo?(jyavenard)

Comment on attachment 9067583 [details]
Bug 1554438 - P1. Don't rely on a live pref to determine order of actions. r?Bas,sotaro

Doesn't seem worth worrying about then.

Attachment #9067583 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #9067584 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #9067591 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
QA Whiteboard: [qa-triaged]

I managed to reproduce the issue using an older version of Nightly from 25-05-2019 on Windows 10 x64.
I retested everything using the latest Nightly 69.0a1 on Windows 10 x64, Ubuntu 18.04 x64 and macOS 10.13. The bug doesn't seem to be reproducing anymore.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: