Closed Bug 1450293 Opened 2 years ago Closed 2 years ago

Avoid starting the GPU process for the initial navigator:blank window

Categories

(Core :: Graphics, enhancement, P5)

enhancement

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: florian, Assigned: florian)

References

Details

(Keywords: feature, perf, Whiteboard: [gfx-noted])

Attachments

(1 file, 1 obsolete file)

The general idea in bug 1336227 / bug 1447719 is to open an about:blank window at the right size and position as early as possible during startup, and to fill it later with a real browser.xul document. This gives us a nice 50+% win on the time to reach first paint, which matters a lot for perceived performance.

I would like this initial blank window (loading only about:blank) to show before we start the GPU process.

There are 2 reasons for that:

- creating the process is very expensive (at least on Windows, especially when an antivirus is installed). See https://perfht.ml/2pLpOCW for a dramatically slow example. Even without antivirus, starting the GPU process is often more than half of the remaining time to first paint.

- Starting GPU-enabled stuff before we had a chance to run the SanityTest that's supposed to disable GPU usage when its driver is crashy sounds like a recipe for startup crashes on release users with odd driver versions.

Any help to figure out how we can draw about:blank without GPU process and turn it on later will be greatly appreciated.
Keywords: feature, perf
Priority: -- → P5
Whiteboard: [gfx-noted]
Attached patch Patch (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=15758b0932337c9816f5ecc46f07453ea666f882

Locally this creates the main process compositor thread before painting the blank window, but the GPU process is started after showing the blank window and after the startupCrashDetectionBegin profiler marker. This also gets rid of the black flash when layers.acceleration.disabled is true (ie. bug 1448146).
Attachment #8964988 - Flags: review?(jmuizelaar)
Assignee: nobody → florian
Status: NEW → ASSIGNED
(In reply to Florian Quèze [:florian] from comment #1)

> Locally this creates the main process compositor thread before painting the
> blank window, but the GPU process is started after showing the blank window
> and after the startupCrashDetectionBegin profiler marker.

But the SanityTest code runs after the compositor process is already started, I'm not sure if that's OK or not. I haven't been able to see in startup profiles what triggers the creation of the GPU process when this patch is applied.
(In reply to Florian Quèze [:florian] from comment #2)
> (In reply to Florian Quèze [:florian] from comment #1)
> 
> > Locally this creates the main process compositor thread before painting the
> > blank window, but the GPU process is started after showing the blank window
> > and after the startupCrashDetectionBegin profiler marker.
> 
> But the SanityTest code runs after the compositor process is already
> started, I'm not sure if that's OK or not. I haven't been able to see in
> startup profiles what triggers the creation of the GPU process when this
> patch is applied.

Can you try making the SanityTest fail locally and make sure the right things still happen?
Flags: needinfo?(florian)
Comment on attachment 8964988 [details] [diff] [review]
Patch

Review of attachment 8964988 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsWindowGfx.cpp
@@ +225,5 @@
> +  // Avoid starting the GPU process for the initial navigator:blank window.
> +  if (mIsEarlyBlankWindow) {
> +    // Even though we are not really painting, record the time when the window
> +    // gets on screen for the first time.
> +    mozilla::StartupTimeline::RecordOnce(mozilla::StartupTimeline::FIRST_PAINT);

It seems like we should probably separate these metrics. It's still useful for us to record when the browser paints actual content. There should probably be FIRST_PAINT and FIRST_WINDOW.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

> ::: widget/windows/nsWindowGfx.cpp
> @@ +225,5 @@
> > +  // Avoid starting the GPU process for the initial navigator:blank window.
> > +  if (mIsEarlyBlankWindow) {
> > +    // Even though we are not really painting, record the time when the window
> > +    // gets on screen for the first time.
> > +    mozilla::StartupTimeline::RecordOnce(mozilla::StartupTimeline::FIRST_PAINT);
> 
> It seems like we should probably separate these metrics. It's still useful
> for us to record when the browser paints actual content. There should
> probably be FIRST_PAINT and FIRST_WINDOW.

Seems similar to the request in bug 1449305. SIMPLE_MEASURES_DELAYEDSTARTUPSTARTED covers the first MozAfterPaint event of a browser window, I think that's close enough.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)
> (In reply to Florian Quèze [:florian] from comment #2)
> > (In reply to Florian Quèze [:florian] from comment #1)
> > 
> > > Locally this creates the main process compositor thread before painting the
> > > blank window, but the GPU process is started after showing the blank window
> > > and after the startupCrashDetectionBegin profiler marker.
> > 
> > But the SanityTest code runs after the compositor process is already
> > started, I'm not sure if that's OK or not. I haven't been able to see in
> > startup profiles what triggers the creation of the GPU process when this
> > patch is applied.
> 
> Can you try making the SanityTest fail locally and make sure the right
> things still happen?

I did some more testing around the SanityTest, and it's running more asynchronously than I thought. I tried setting browser.startup.blankWindow to false and profiled startup a couple times: the actual "runSanityTest" code runs at a time when we have already opened the actual browser window and already styled it. The fact that the SanityTest code currently starts the compositor seems to just be a side effect of it doing the very first openWindow call. So I don't think I'm significantly affecting the SanityTest behavior.
Flags: needinfo?(florian)
Comment on attachment 8964988 [details] [diff] [review]
Patch

Review of attachment 8964988 [details] [diff] [review]:
-----------------------------------------------------------------

I still think it's worth renaming the telemetry probe to reflect that we're not painting any more.

Can you add some more documentation about what's happening here. Especially about the magic of using navigator:blank to change the behaviour of the window. Also, jimm is probably a more appropriate reviewer for this than me.
Attachment #8964988 - Flags: review?(jmuizelaar)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #7)

> I still think it's worth renaming the telemetry probe to reflect that we're
> not painting any more.

We are not running the painting code anymore, but from the user's point of view, something appears on screen, so it's still fair to say we reached first paint.

If we want to keep recording the time it takes us to paint something non-blank for the first time, and the delayed startup probe (which is right after the first MozAfterPaint event) isn't enough, then I would prefer adding another FIRST_UI_PAINT probe, and that could possibly be a follow-up.

I would like to keep the existing probe name to use telemetry data to confirm the assumption that avoiding the GPU process startup is going to save time for cold startup.

> Can you add some more documentation about what's happening here. Especially
> about the magic of using navigator:blank to change the behaviour of the
> window.

Do you mean a comment in the nsWindow::SetWindowClass method?
Comment on attachment 8964988 [details] [diff] [review]
Patch

Jimm, it would be nice if you could confirm whether this fixes the black paint you were seeing in bug 1447549 comment 5 (there's a try push in comment 1).
Attachment #8964988 - Flags: review?(jmathies)
Comment on attachment 8964988 [details] [diff] [review]
Patch

Review of attachment 8964988 [details] [diff] [review]:
-----------------------------------------------------------------

Other than the telemetry timing, this seems fine. The window class stuff is a bit new to me but I assume you know what you're doing there since it's front end.

::: widget/windows/nsWindowGfx.cpp
@@ +225,5 @@
> +  // Avoid starting the GPU process for the initial navigator:blank window.
> +  if (mIsEarlyBlankWindow) {
> +    // Even though we are not really painting, record the time when the window
> +    // gets on screen for the first time.
> +    mozilla::StartupTimeline::RecordOnce(mozilla::StartupTimeline::FIRST_PAINT);

r- on this, you're adding timing to a paint metric when we're not painting, throwing those values off. Lets nix this. If you want to know about the specific timing here, I like Jeff's suggestion of creating a new probe.
Attachment #8964988 - Flags: review?(jmathies) → review-
Attached patch Patch v2Splinter Review
Removed the telemetry stuff. I'll open another bug to add a probe, probably in front-end code, although that won't be able to capture exactly the same thing. I'll land the probe a day before landing this patch, so that we can see the impact of this patch on the nightly population.
Attachment #8967510 - Flags: review?(jmathies)
Attachment #8964988 - Attachment is obsolete: true
Depends on: 1453782
Attachment #8967510 - Flags: review?(jmathies) → review+
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/50091ebf4918
Avoid starting the GPU process for the initial navigator:blank window, r=jimm.
https://hg.mozilla.org/mozilla-central/rev/50091ebf4918
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Depends on: 1454638
(In reply to Jeff Muizelaar [:jrmuizel] from comment #14)

I think this means we create the GPUParent instance at the same time as we create the compositor thread on the parent process, rather than when we actually create the GPU process.
(In reply to Florian Quèze [:florian] from comment #15)

My guess in comment 15 was incorrect. I just investigated using the profiler (I put long PR_Sleep calls at the beginning and end of the measurements to see when we start and stop measuring) : https://perfht.ml/2FnfPJF

What's happening here is that the GPU process starts right after we are done doing the no-op paint on the blank window. But it sits there idle for a while, and finishes its initialization only once the refresh driver starts ticking on the browser.xul window.

So the regression detected in comment 14 is 'real' in the sense that the GPU startup time is indeed longer. But there's no user-perceptible impact of this regression. This may even be an improvement for the cases where starting the GPU process takes a long time (I saw some profiles where starting the GPU process was taking several seconds, likely due to antivirus software scanning the new process), because we now start the process significantly before we first need it, so it can take a long time to initialize without blocking anything.

So this is probably a win, although it's unfortunate that it makes our existing measurements relatively irrelevant. In a follow-up we may want to think about changing these probes or adding new ones that measure something relevant to user perceptions.
You need to log in before you can comment on or make changes to this bug.