Sometimes Firefox has a 6s startup delay before the first top-level-window is ready
Categories
(Core :: Widget: Cocoa, defect, P3)
Tracking
()
People
(Reporter: whimboo, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: perf)
Attachments
(1 obsolete file)
I noticed this problem when running the Webdriver spec tests on MacOS. These tests start Firefox a lot and in some cases the startup takes very long. The delay specifically happens after evaluating the command line arguments and before opening the first top-level window:
0:07.44 pid:28455 1593026170711 geckodriver::marionette DEBUG Waiting 60s to connect to browser on 127.0.0.1:2828
0:08.34 pid:28455 1593026171613 Marionette TRACE Marionette enabled
0:14.44 pid:28455 1593026177720 Marionette TRACE Received observer notification toplevel-window-ready
In this case the delay is about 6s. Here the appropriate gecko profile:
https://share.firefox.dev/31cvbPZ
As it looks like during that time there is basically no activity and all the time is spend in mach_msg_trap.
To reproduce use the following mach command:
mach wpt --webdriver-arg=-vv testing/web-platform/tests/webdriver/tests/
If the geckodriver binary cannot be found change into testing/geckodriver and run cargo build. Afterward use --webdriver-binary to specify the path to th binary.
Markus, given that seems to be MacOS related, do you have an idea what's going wrong here?
Comment 1•5 years ago
|
||
It's blocking inside -[IMKInputSessionXPCInvocation invocationAwaitXPCReply], which means that it's waiting for some system process. It's unexpected that this would take 6 seconds, but it's also outside our control.
You could try to get a full system profile ("all processes") with Instruments and see if anything stands out.
| Reporter | ||
Comment 2•5 years ago
|
||
I tried to use Instruments but it didn't actually give me some helpful information. The WindowServer process stands out for prepare core animation update state, but that only consumes about 600ms CPU time during that 6s period.
So when I checked the main thread of the Firefox profile above the stack shows that it's all under ProcessPendingGetURLAppleEvents. By searching for this method I just found bug 1433358, and I wonder if I might see something similar.
| Reporter | ||
Updated•5 years ago
|
Updated•5 years ago
|
| Reporter | ||
Comment 3•5 years ago
|
||
I still see this in 1 of 6 cases when running specific Marionette harness dependent tests (including wpt) in a loop, where Firefox gets restarted pretty often. Stephen is there something else I could provide to actually make some progress on this bug? It's kinda frustrating to me to have to wait those extra 6s over and over again when running tests.
Comment 4•5 years ago
|
||
It is conceivable that we could be smarter about the kinds of events that we're processing (i.e. change NSAnyEventMask[1] to something more specific to GetURL events), but I wouldn't know off-hand what that could be. And there is no guarantee that it isn't the GetURL events themselves that ultimately lead or contribute to this slowdown. Another approach might be to skip processing of GetURL events when run as part of this test suite, assuming that those aren't needed for the tests.
| Reporter | ||
Comment 5•5 years ago
|
||
Would you mind explaining what those GetURL events are, and are used for? Not sure if I get this right from the code comment as referenced in your last comment. So what I think... When a user clicks a URL eg in the Terminal and Firefox is not running yet, a new instance is created and those GetURL events are processed to append possible URLs as command line args for being opened in individual tabs?
Comment 6•5 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] (away 12/07 - 12/11) from comment #5)
Would you mind explaining what those
GetURLevents are, and are used for? Not sure if I get this right from the code comment as referenced in your last comment. So what I think... When a user clicks a URL eg in the Terminal and Firefox is not running yet, a new instance is created and thoseGetURLevents are processed to append possible URLs as command line args for being opened in individual tabs?
Yes, that's the gist of it.
| Reporter | ||
Comment 7•5 years ago
|
||
I observed this for a while when running mochitests, marionette tests, and webdriver tests. As it looks like this delay only happens for me when running the webdriver (web platform) tests. I could try with Instruments the next days in case this would give more details?
(In reply to Stephen A Pohl [:spohl] from comment #4)
Another approach might be to skip processing of GetURL events when run as part of this test suite, assuming that those aren't needed for the tests.
Could this be done via a preference?
Comment 8•5 years ago
|
||
Comment 9•5 years ago
|
||
I added a pref of widget.macos.disable-geturl-processing-at-startup and kicked off a try build. Are you able to check if you are no longer seeing this delay with the try build?
https://treeherder.mozilla.org/jobs?repo=try&revision=20ed7b38ff8a9a1644c5f4b63d3b7c1cbafc92c1
| Reporter | ||
Comment 10•5 years ago
|
||
So I tried to do a before/after check and it's strange but I'm not able to reproduce the problem today. I will have a look early next week again. As such keeping the needinfo.
| Reporter | ||
Comment 11•5 years ago
|
||
After such a long time I have seen this delay yesterday again, but only once. Maybe it depends on how long a machine hasn't been shutdown or restarted. I will keep watching and once it happens more often I will check your try build.
| Reporter | ||
Comment 12•4 years ago
|
||
Maybe there is a better chance to detect that when we got bug 1726465 fixed.
| Reporter | ||
Comment 13•3 years ago
|
||
I think that I actually got a gecko profile for this particular scenario for a Firefox startup on Windows. It's not a 5s delay but still around 3s of inactivity before the browsing window is shown.
https://share.firefox.dev/3ecqAG7
Stephen and Florian, could you maybe have a look? Note that this profile is still with a build with the background hang monitor changes from bug 965392 not being backed out. Thanks.
| Reporter | ||
Comment 14•3 years ago
|
||
Also please note that the browser-idle-startup-tasks-finished notification is send after 13s! So it's a quite slow startup of Firefox.
| Reporter | ||
Comment 15•3 years ago
|
||
Finally I was also able to get a profile with startup filters set to * for all the threads: https://share.firefox.dev/3wKkEdE
Comment 16•3 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #13)
I think that I actually got a gecko profile for this particular scenario for a Firefox startup on Windows. It's not a 5s delay but still around 3s of inactivity before the browsing window is shown.
https://share.firefox.dev/3ecqAG7
Stephen and Florian, could you maybe have a look?
There's a composite taking 3.7s on the Renderer thread on the GPU process: https://share.firefox.dev/3pYh5wH
The machine has 2 physical cores (4 logical cores). The parent process main thread has a higher priority than the GPU process Renderer thread, and during that time we create 4 content processes and have activity in the WebExtension process, all at the same priority as the GPU process Renderer thread, so it's likely the composite could happen significantly faster if we increased the priority of the renderer thread.
| Reporter | ||
Comment 17•3 years ago
|
||
Please note that such a delay doesn't always happen. It's very infrequent so not sure what's going on. I'm happy to also record a profile when I do not see such a delay in case of it would help to identify the cause.
| Reporter | ||
Comment 18•3 years ago
|
||
I talked to Florian and as he told me Bas might have a patch that I could try soon that should fix this issue. Moving the needinfo flag to him.
Comment 19•3 years ago
|
||
Here's the build and the patch I promised Florian. Bit of a hack but should do the trick.
https://treeherder.mozilla.org/jobs?repo=try&revision=6fd11a4056cd3c4270470623cde0e4b1f3cd42a8
Comment 20•3 years ago
|
||
(In reply to Bas Schouten (:bas.schouten) from comment #19)
Here's the build and the patch I promised Florian. Bit of a hack but should do the trick.
https://treeherder.mozilla.org/jobs?repo=try&revision=6fd11a4056cd3c4270470623cde0e4b1f3cd42a8
Here is a warm startup profile of the try build on the 2018 reference laptop (the slow Dell one) https://share.firefox.dev/3EmHexv
As you can see, the "Current thread priority" is 12 or 13 on the Renderer thread. The parent process main thread has a current thread priority of 12 once it has a window (before that, the priority varies between 9 and 12).
There's a very big gap in the sampling in the parent process, the WebExtension process and the preallocated content process, but we can see from the awake markers that they ran anyway, it's just the profiler sampler that likely didn't get any CPU. I have never seen a gap of this size in the sampling, so it's likely a side effect of the GPU process having higher priority. Might still be an acceptable tradeoff if it means things gets painted faster.
| Reporter | ||
Comment 21•3 years ago
|
||
Florian I assume that I do not have to test anything before your questions for Bas haven't been answered?
| Reporter | ||
Updated•3 years ago
|
| Reporter | ||
Comment 23•2 years ago
•
|
||
We actually switched in Marionette to use the final-ui-startup notification. Hereby we can also see delays at least for CI jobs. This is covered by bug 1794745.
At least on MacOS I haven't seen a delay of such type anymore for quite some time. So I would propose to close this bug as WFM. I'll reopen in case I have some reproducible testcase or a recorded profile from such a delay during startup.
Updated•2 years ago
|
Description
•