Closed Bug 1464756 Opened 6 years ago Closed 3 years ago

[wdspec] new_session fixture takes a long time due to a delay during startup of Firefox QuantumRender

Categories

(Testing :: geckodriver, defect, P3)

All
Linux
defect

Tracking

(firefox61 affected, firefox62 affected)

RESOLVED WORKSFORME
Tracking Status
firefox61 --- affected
firefox62 --- affected

People

(Reporter: whimboo, Unassigned)

References

Details

+++ This bug was initially created as a clone of Bug #1449538 +++

For several wdspec tests as run with Firefox QuantumRender we have some high-frequent intermittent failures. Those happen for tests which use the `new_session` fixture, and cause a restart of Firefox. Then the startup of Firefox has a ~5s delay before the `sessionstore-window-restored` observer notification is fired. As result some test modules time out because we restart Firefox a lot.

Here an example from bug 1459548:

https://treeherder.mozilla.org/logviewer.html#?job_id=180430429&repo=mozilla-inbound&lineNumber=50967-51017

> [task 2018-05-27T00:21:43.165Z] 00:21:43     INFO - PID 5710 | 1527380503161	Marionette	DEBUG	Received observer notification profile-after-change
> [task 2018-05-27T00:21:43.186Z] 00:21:43     INFO - PID 5710 | [8814, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012: file /builds/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 2910
> [task 2018-05-27T00:21:43.267Z] 00:21:43     INFO - PID 5710 | ATTENTION: default value of option force_s3tc_enable overridden by environment.
> [task 2018-05-27T00:21:43.274Z] 00:21:43     INFO - PID 5710 | [8814, Main Thread] WARNING: GLX_swap_control unsupported, ASAP mode may still block on buffer swaps.: file /builds/worker/workspace/build/src/gfx/gl/GLContextProviderGLX.cpp, line 219
> [task 2018-05-27T00:21:43.275Z] 00:21:43     INFO - PID 5710 | [8814, Main Thread] WARNING: SGI_video_sync unsupported. Falling back to software vsync.: file /builds/worker/workspace/build/src/gfx/thebes/gfxPlatformGtk.cpp, line 754
> [task 2018-05-27T00:21:43.331Z] 00:21:43     INFO - PID 5710 | 1527380503328	Marionette	DEBUG	Received observer notification command-line-startup
> [task 2018-05-27T00:21:43.332Z] 00:21:43     INFO - PID 5710 | 1527380503328	Marionette	DEBUG	Received observer notification nsPref:changed
> [task 2018-05-27T00:21:43.332Z] 00:21:43     INFO - PID 5710 | 1527380503329	Marionette	DEBUG	Init aborted (running=false, enabled=true, finalUIStartup=false)
> [..]
> [task 2018-05-27T00:21:44.875Z] 00:21:44     INFO - PID 5710 | [GLX] window 1e00011 has VisualID 0x1a4
> [task 2018-05-27T00:21:44.877Z] 00:21:44     INFO - PID 5710 | [Parent 8814, Renderer] WARNING: robust_buffer_access_behavior marked as unsupported: file /builds/worker/workspace/build/src/gfx/gl/GLContextFeatures.cpp, line 915
> [task 2018-05-27T00:21:44.879Z] 00:21:44     INFO - PID 5710 | [Parent 8814, Renderer] WARNING: Robustness supported, strategy is not LOSE_CONTEXT_ON_RESET!: file /builds/worker/workspace/build/src/gfx/gl/GLContext.cpp, line 1024
> [task 2018-05-27T00:21:44.879Z] 00:21:44     INFO - PID 5710 | [Parent 8814, Renderer] WARNING: robustness marked as unsupported: file /builds/worker/workspace/build/src/gfx/gl/GLContextFeatures.cpp, line 915
[..]
> [task 2018-05-27T00:21:51.347Z] 00:21:51     INFO - PID 5710 | 1527380511338	Marionette	DEBUG	Received observer notification sessionstore-windows-restored

Milan, is there anyone who could help us with that? I wonder if this could be related to bug 1423671, but not sure if that also affects opening of the very first window.
Flags: needinfo?(milaninbugzilla)
Blocks: 1459548
Blocks: 1456641
Blocks: 1459549
Blocks: 1462694
Summary: [wdspec] new_session fixture takes a long time due to long shutdown times of Firefox QuantumRender → [wdspec] new_session fixture takes a long time due to a delay during startup of Firefox QuantumRender
Blocks: 1462012
Blocks: 1448623
Blocks: 1418778
Note that Milan is no longer at Mozilla and will probably have limited to respond to these sorts of queries.

At the moment opening new windows is still slow; there is some work in progress in bug 1418202 to speed it up on Windows, but nothing in progress for Linux. I'm redirecting the needinfo to Sotaro who probably has a better idea of future plans in this area.
Flags: needinfo?(milaninbugzilla) → needinfo?(sotaro.ikeda.g)
"Perf" key word?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1)
> Note that Milan is no longer at Mozilla and will probably have limited to
> respond to these sorts of queries.

Oh, I never noticed that he left. Thanks for the information.

> At the moment opening new windows is still slow; there is some work in
> progress in bug 1418202 to speed it up on Windows, but nothing in progress
> for Linux. I'm redirecting the needinfo to Sotaro who probably has a better
> idea of future plans in this area.

This is only a suspicion and I'm not sure if the window opening plays into account here. Seeing that bug 1459548, and bug 1459549 have been fixed with a webrender push last night, I will have a look at more recent failures and check if timing has been improved.
Blocks: 1465915
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1)
> Note that Milan is no longer at Mozilla and will probably have limited to
> respond to these sorts of queries.
> 
> At the moment opening new windows is still slow; there is some work in
> progress in bug 1418202 to speed it up on Windows, but nothing in progress
> for Linux. I'm redirecting the needinfo to Sotaro who probably has a better
> idea of future plans in this area.

Compiling shaders are very slow with WebRender. WebRender compiles shaders when they become necessary. But actually more than 12 shaders are compiled during first Firefox rendering! The following issue talks about it.
 https://github.com/servo/webrender/issues/1802

There are several candidates for reducing shader compilation time.

- [1] Share ProgramBinary between browser window
- [2] Use ProgramBinary disk cache.
- [3] Precompile to d3d bytecode for windows.
- [4] Reduce number of shader programs.
- [5] Optimize glsl

[1] is done by Bug 1391159 only on windows, since I did not saw performance improvement on linux and mac.
[2] is done by Bug 1418202 on windows. It was just landed on nightly.
Shader compilation took very very long time on Windows with ANGLE. Then [1] and [2] makes first Firefox rendering fast on windows.

Before [2], I looked into [3]. But it was not viable option, since ProgramBinary of ANGLE has dependency to graphic driver's device id. It was discussed in the following.
  https://github.com/servo/webrender/issues/2580

[4] still seems like viable option. But I am not sure how we could reduce a number of ProgramBinaries.
[5] is mentioned in https://github.com/servo/webrender/issues/1802. But it looks tricky for WebRender, since WebRender generate final shader program source just before compilation.
Flags: needinfo?(sotaro.ikeda.g)
Sotaro, everything above seem to be related to Windows. But the wdspec tests we run in CI are only executed on Linux 32/64 yet. So I wonder if any of those things will also affect Linux.
Flags: needinfo?(sotaro.ikeda.g)
OS: Unspecified → Linux
Hardware: Unspecified → All
As in https://github.com/servo/webrender/issues/1802, shader compile latency is still a problem on linux and mac. But the latency seems not changed recently.

If some problems of this bug are addressed on linux recently, it seems to be related recent WebRender optimizations. But I am not sure about it.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Sotaro Ikeda [:sotaro] from comment #6)
> As in https://github.com/servo/webrender/issues/1802, shader compile latency
> is still a problem on linux and mac. But the latency seems not changed
> recently.

When I read through that issue I only can retrieve that we have issues on Mac and Windows. As it has been mentioned on Linux there would be a shader cache, but maybe that is not available/enabled in CI?

> If some problems of this bug are addressed on linux recently, it seems to be
> related recent WebRender optimizations. But I am not sure about it.

Yes, in some cases we have seen improvements but not all the known timeouts are gone. So it was only a small improvement.
Blocks: 1507121

I'm going to mark this bug as WFM given that we haven't seen any failure like this within the last 3 years. So I consider things have been improved a lot over that time span.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.