browsingContext.create is slow on macos arm64 runners on GitHub actions
Categories
(Remote Protocol :: WebDriver BiDi, defect, P2)
Tracking
(Not tracked)
People
(Reporter: alexrudenko, Unassigned, NeedInfo)
References
(Blocks 1 open bug)
Details
(Whiteboard: [webdriver:m11])
See https://github.com/puppeteer/puppeteer/actions/runs/8829855155/job/24242835681?pr=12318 It is a test run done on new macos arm64 runners on GitHub actions (https://github.com/actions/runner-images/blob/macos-14-arm64/20240422.3/images/macos/macos-14-arm64-Readme.md#browsers)
Several tests time out and in general the test run is very slow, 21 minutes for one test shard (usually up to 5 min). When run with the trace option, the logs indicate that browsingContext.create always blocks for 5 seconds: https://gist.github.com/OrKoN/1bcde28b31e2dba4ba50909deecdd010
Comment 1•22 days ago
|
||
I've triggered a new job with debug enabled. Lets see if this gives some more details:
https://github.com/puppeteer/puppeteer/actions/runs/8829855155/job/24363405593?pr=12318
Comment 2•22 days ago
|
||
And here is already some output:
1714376280846 RemoteAgent TRACE [12] ProgressListener Start: expectNavigation=false resolveWhenStarted=false unloadTimeout=5000 waitForExplicitStart=false
1714376280846 RemoteAgent TRACE [12] ProgressListener Setting unload timer (5000ms)
1714376285849 RemoteAgent DEBUG WebDriverBiDiConnection 7462f260-895e-4517-8a97-128c1d17c323 <- {"type":"event","method":"script.realmCreated","params":{"realm":"8f9e717e-c133-42cb-9051-61e1a4366967","origin":"null","context":"0c801b4e-d1d1-455f-b76b-495f31e4e308","type":"window"}}
It looks like the hang comes from waiting for about:blank
to be loaded. Sadly I do not see any update of the web progress listener status and wonder if we somehow miss loading this page.
Alex, is it easy to run a job with the MOZ_LOG="BCWebProgress:5"
environment variable set?
Reporter | ||
Comment 3•22 days ago
|
||
I believe the same log was already in the linked gist https://gist.github.com/OrKoN/1bcde28b31e2dba4ba50909deecdd010
To set extra env variables we can use the PR here https://github.com/puppeteer/puppeteer/pull/12328
Reporter | ||
Comment 4•22 days ago
|
||
I have triggered a run with the env var here https://github.com/puppeteer/puppeteer/actions/runs/8875780005?pr=12328
Reporter | ||
Comment 5•22 days ago
|
||
Comment 6•22 days ago
|
||
Here an example:
https://github.com/puppeteer/puppeteer/actions/runs/8875780005/job/24365989726?pr=12328#step:9:350
1714380471274 RemoteAgent DEBUG WebDriverBiDiConnection adc3c975-c114-4598-bfaa-f6881514cfc9 -> {"id":8,"method":"browsingContext.create","params":{"type":"tab","userContext":"3da5f8e5-eb20-4d60-b6a2-3b6706ee4274"}}
1714380471274 RemoteAgent TRACE Module root/browsingContext.sys.mjs found for ROOT
1714380471274 RemoteAgent TRACE Received command browsingContext.create for destination ROOT
1714380471274 RemoteAgent TRACE Module root/browsingContext.sys.mjs found for ROOT
1714380471275 RemoteAgent TRACE [10] NavigationListener onStateChange, stateFlags: 196612, status: 0, isStart: false, isStop: false, isNetwork: false, isBindingAborted: false, targetURI: about:blank
1714380471275 RemoteAgent TRACE Received command browsingContext._awaitVisibilityState for destination WINDOW_GLOBAL
1714380471275 RemoteAgent TRACE Module windowglobal/browsingContext.sys.mjs found for WINDOW_GLOBAL
1714380471277 RemoteAgent TRACE [10] NavigationListener onStateChange, stateFlags: 131088, status: 0, isStart: false, isStop: true, isNetwork: false, isBindingAborted: false, targetURI: about:blank
1714380471314 RemoteAgent DEBUG WebDriverBiDiConnection adc3c975-c114-4598-bfaa-f6881514cfc9 <- {"type":"event","method":"browsingContext.contextCreated","params":{"children":null,"context":"4e53e36f-31c1-4b45-ba2f-f6dc0adf121f","url":"about:blank","userContext":"3da5f8e5-eb20-4d60-b6a2-3b6706ee4274","parent":null}}
2024-04-29T08:47:51.315Z puppeteer:webDriverBiDi:RECV ◀ [
'{"type":"event","method":"browsingContext.contextCreated","params":{"children":null,"context":"4e53e36f-31c1-4b45-ba2f-f6dc0adf121f","url":"about:blank","userContext":"3da5f8e5-eb20-4d60-b6a2-3b6706ee4274","parent":null}}'
]
1714380471330 RemoteAgent TRACE Received DOM event TabSelect for [object XULElement]
[Parent 5860: Main Thread]: I/BCWebProgress OnLocationChange({isTopLevel:1, isLoadingDocument:1}, {URI:about:blank, originalURI:about:blank}, about:blank, 0) on {top:1, id:a, url:about:blank}
1714380471336 RemoteAgent TRACE Module windowglobal-in-root/realm-created.sys.mjs not found for WINDOW_GLOBAL
2024-04-29T08:47:51.337Z puppeteer:webDriverBiDi:RECV ◀ [
1714380471336 RemoteAgent DEBUG WebDriverBiDiConnection adc3c975-c114-4598-bfaa-f6881514cfc9 <- {"type":"event","method":"script.realmCreated","params":{"realm":"9279068c-35a3-483d-854f-cba368b3ed6a","origin":"null","context":"306dcc2d-3624-41f7-b432-d87b10d858bd","type":"window"}}
'{"type":"event","method":"script.realmCreated","params":{"realm":"9279068c-35a3-483d-854f-cba368b3ed6a","origin":"null","context":"306dcc2d-3624-41f7-b432-d87b10d858bd","type":"window"}}'
1714380471336 RemoteAgent TRACE Module windowglobal-in-root/window-global-handler-created.sys.mjs not found for WINDOW_GLOBAL
]
1714380471337 RemoteAgent DEBUG WebDriverBiDiConnection adc3c975-c114-4598-bfaa-f6881514cfc9 <- {"type":"event","method":"browsingContext.domContentLoaded","params":{"context":"306dcc2d-3624-41f7-b432-d87b10d858bd","timestamp":1714380471277,"url":"about:blank","navigation":null}}
[Parent 5860: Main Thread]: I/BCWebProgress OnSecurityChange(<null>, <null>, 4) on {top:1, id:a, url:about:blank}
1714380471337 RemoteAgent TRACE [306dcc2d-3624-41f7-b432-d87b10d858bd] No navigation found to stop for url: about:blank
[Parent 5860: Main Thread]: I/BCWebProgress OnStateChange({isTopLevel:1, isLoadingDocument:1}, {URI:about:blank, originalURI:about:blank}, STATE_STOP|STATE_IS_NETWORK|STATE_IS_WINDOW, NS_OK) on {top:1, id:a, url:about:blank}
1714380471337 RemoteAgent DEBUG WebDriverBiDiConnection adc3c975-c114-4598-bfaa-f6881514cfc9 <- {"type":"event","method":"browsingContext.load","params":{"context":"306dcc2d-3624-41f7-b432-d87b10d858bd","timestamp":1714380471277,"url":"about:blank","navigation":null}}
2024-04-29T08:47:51.338Z puppeteer:webDriverBiDi:RECV ◀ [
'{"type":"event","method":"browsingContext.domContentLoaded","params":{"context":"306dcc2d-3624-41f7-b432-d87b10d858bd","timestamp":1714380471277,"url":"about:blank","navigation":null}}'
]
2024-04-29T08:47:51.338Z puppeteer:webDriverBiDi:RECV ◀ [
'{"type":"event","method":"browsingContext.load","params":{"context":"306dcc2d-3624-41f7-b432-d87b10d858bd","timestamp":1714380471277,"url":"about:blank","navigation":null}}'
]
1714380471407 RemoteAgent TRACE [12] ProgressListener Start: expectNavigation=false resolveWhenStarted=false unloadTimeout=5000 waitForExplicitStart=false
1714380471407 RemoteAgent TRACE [12] ProgressListener Setting unload timer (5000ms)
2024-04-29T08:47:56.506Z puppeteer:webDriverBiDi:RECV ◀ [
1714380476506 RemoteAgent TRACE [12] ProgressListener No navigation detected: about:blank
'{"type":"success","id":8,"result":{"context":"4e53e36f-31c1-4b45-ba2f-f6dc0adf121f"}}'
1714380476506 RemoteAgent TRACE [12] ProgressListener Stop: has error=false
]
1714380476506 RemoteAgent DEBUG WebDriverBiDiConnection adc3c975-c114-4598-bfaa-f6881514cfc9 <- {"type":"success","id":8,"result":{"context":"4e53e36f-31c1-4b45-ba2f-f6dc0adf121f"}}
We seem to start the progress listener after the page about:blank
has already been loaded? Also I see the browsingContext.load
event emitted twice.
Updated•22 days ago
|
Comment 7•21 days ago
|
||
We discussed this in triage yesterday.
First observation is that browsingContext.create waits for the visibilityState of the previous tab to change (here) before we start waiting for the navigation (here).
It would probably be safer to start waiting for the initial navigation right away. However our waitForInitialNavigationCompleted
helper is supposed to detect situations where the initial page has already been loaded and resolve early (here).
To verify that, locally I artificially delayed the call to waitForInitialNavigationCompleted
by 500ms, 1000ms, 2000ms, and in all cases the helper resumed early, without waiting for the 5s timeout of the ProgressListener. Based on this test, I think that calling the helper "late" works fine.
Therefore I tried to delay the call to waitForInitialNavigationCompleted
on Puppeteer as well, by setting browser.tabs.remote.unloadDelayMs
to a few hundred ms and pushing to github CI. This should force waitForInitialNavigationCompleted
to start late as well, and give enough time to the initial navigation to complete, but it still seems to fail with macos-latest.
Comment 8•21 days ago
|
||
(In reply to Julian Descottes [:jdescottes] from comment #7)
Therefore I tried to delay the call to
waitForInitialNavigationCompleted
on Puppeteer as well, by settingbrowser.tabs.remote.unloadDelayMs
to a few hundred ms and pushing to github CI. This should forcewaitForInitialNavigationCompleted
to start late as well, and give enough time to the initial navigation to complete, but it still seems to fail with macos-latest.
Note that changing this preference does delay the visibilitychange event even further, so I wouldn't expect that anything changes for Puppeteer in GitHub CI. Based on my comment did you try it out locally by setting browser.tabs.remote.unloadDelayMs
to some higher values? Were you able to reproduce the timeout in waiting for the initial navigation?
Comment 9•20 days ago
|
||
(In reply to Henrik Skupin [:whimboo][⌚️UTC+1] from comment #8)
(In reply to Julian Descottes [:jdescottes] from comment #7)
Therefore I tried to delay the call to
waitForInitialNavigationCompleted
on Puppeteer as well, by settingbrowser.tabs.remote.unloadDelayMs
to a few hundred ms and pushing to github CI. This should forcewaitForInitialNavigationCompleted
to start late as well, and give enough time to the initial navigation to complete, but it still seems to fail with macos-latest.Note that changing this preference does delay the visibilitychange event even further, so I wouldn't expect that anything changes for Puppeteer in GitHub CI. Based on my comment did you try it out locally by setting
browser.tabs.remote.unloadDelayMs
to some higher values? Were you able to reproduce the timeout in waiting for the initial navigation?
Yes, that's what I meant with the previous paragraph: I tried locally with 500ms, 1000ms, 2000ms and it didn't reproduce the issue. The helper successfully detects that the navigation is already done and resolves immediately.
Note that changing this preference does delay the visibilitychange event even further, so I wouldn't expect that anything changes for Puppeteer in GitHub CI.
I think it should have an impact, and it's worrying that it doesn't.
Basically by increasing this value, we delay the "gap" between the call to addTab and the call to waitForInitialNavigationCompleted. For instance if we set the value to 1000 ms, it should give enough time to the new tab to do its initial navigation and waitForInitialNavigationCompleted
should then detect that the initial navigation is over and resolve. As I said, I verified locally that the helper was correctly returning early in those cases, so I don't really understand why it doesn't happen on Github CI.
Comment 10•15 days ago
|
||
I assume that with bug 1894305 now fixed, we should trigger some new debug jobs in GitHub CI for Puppeteer (with the Firefox cache removed first)?
Comment 11•8 days ago
|
||
I retriggered a run at https://github.com/puppeteer/puppeteer/actions/runs/9059188428/job/24886294787?pr=12336
My previous push had a typo in MOZ_LOG so it didn't include the Process logs.
Comment 12•8 days ago
|
||
I've re-triggered the failing job with debug logs enabled:
https://github.com/puppeteer/puppeteer/actions/runs/9059188428/job/24888261056?pr=12336
Comment 13•8 days ago
|
||
Thanks! So it seems it takes a long time to create a new process when we try to create new tabs:
2024-05-13T08:38:30.1111650Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:30.1211520Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:30.1313910Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:30.2590990Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:30.2595100Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:30.2596060Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:31.1948660Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:31.1949320Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:31.1950120Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:31.3429020Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:31.3457780Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:31.3458420Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:32.2271950Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:32.2373640Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:32.2475650Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:32.4686340Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:32.4725860Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:32.4768310Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:33.2278670Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:33.2283020Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:33.2287190Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:33.4980610Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:33.4981200Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:33.4981830Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
2024-05-13T08:38:34.3286360Z [Parent 5255: Main Thread]: D/Process Starting process allocate on idle
2024-05-13T08:38:34.3287130Z 1715589514319 RemoteAgent TRACE [12] ProgressListener No navigation detected: about:blank
2024-05-13T08:38:34.3307020Z 2024-05-13T08:38:34.320Z puppeteer:webDriverBiDi:RECV ◀ [
2024-05-13T08:38:34.3307680Z 1715589514319 RemoteAgent TRACE [12] ProgressListener Stop: has error=false
2024-05-13T08:38:34.3369870Z '{"type":"success","id":8,"result":{"context":"5e1f91fa-3853-493c-9146-eb0e5c25d226"}}'
2024-05-13T08:38:34.3371320Z 1715589514320 RemoteAgent DEBUG WebDriverBiDiConnection 33fdce95-6f81-4977-8679-0cec0e867265 <- {"type":"success","id":8,"result":{"context":"5e1f91fa-3853-493c-9146-eb0e5c25d226"}}
2024-05-13T08:38:34.3475890Z ]
2024-05-13T08:38:34.3478510Z 1715589514322 RemoteAgent DEBUG WebDriverBiDiConnection 33fdce95-6f81-4977-8679-0cec0e867265 -> {"id":9,"method":"browsingContext.setViewport","params":{"context":"5e1f91fa-3853-493c-9146-eb0e5c25d226","viewport":{"width":800,"height":600},"devicePixelRatio":null}}
2024-05-13T08:38:34.3570390Z 2024-05-13T08:38:34.321Z puppeteer:webDriverBiDi:SEND ► [
2024-05-13T08:38:34.3571080Z 1715589514322 RemoteAgent TRACE Module root/browsingContext.sys.mjs found for ROOT
2024-05-13T08:38:34.3616340Z '{"id":9,"method":"browsingContext.setViewport","params":{"context":"5e1f91fa-3853-493c-9146-eb0e5c25d226","viewport":{"width":800,"height":600},"devicePixelRatio":null}}'
2024-05-13T08:38:34.3620350Z 1715589514322 RemoteAgent TRACE Received command browsingContext.setViewport for destination ROOT
2024-05-13T08:38:34.3653510Z ]
2024-05-13T08:38:34.3654270Z 1715589514322 RemoteAgent TRACE Module root/browsingContext.sys.mjs found for ROOT
2024-05-13T08:38:34.3655360Z [Parent 5255: Main Thread]: D/Process Trying to start process now
2024-05-13T08:38:34.3656560Z [Parent 5255: Main Thread]: D/Process Starting delayed process start, delay=1000
We try to start a process, but we get Process Starting delayed process start, delay=1000 for 5 seconds.
Olli, do you know what could delay the creation of new processes? This is happening on macos arm workers on Github
Comment 14•6 days ago
|
||
If I don't remember wrong Randell worked on some process related changes in the past. So maybe he is a better person to ask. Randell, if that is not the case please let us know, and maybe you know a better person to ask for. Thanks!
Comment 15•6 days ago
|
||
(I don't know what "macos arm workers on Github" means).
I assume this is on opt/shippable builds?
Do we have performance profiles for this?
Comment 16•6 days ago
|
||
(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #15)
(I don't know what "macos arm workers on Github" means).
The workers provided by Github to run Github actions, see https://github.com/actions/runner-images/blob/macos-14-arm64/20240422.3/images/macos/macos-14-arm64-Readme.md#browsers
I assume this is on opt/shippable builds?
Yes.
Do we have performance profiles for this?
Not yet I'll try to get one.
Comment 17•6 days ago
|
||
I just saw that there are as well MacOS 13 ARM images:
https://github.com/actions/runner-images/blob/main/images/macos/macos-13-arm64-Readme.md
I wonder if those might show the same problem or not. The current puppeteer checks all run for MacOS 13. So maybe it might be a MacOS 14 issue.
Description
•