Closed Bug 1693296 Opened 4 years ago Closed 4 years ago

Browser.close() has to ensure that Firefox will be closed even when something prevents it

Categories

(Remote Protocol :: CDP, defect, P2)

defect
Points:
2

Tracking

(firefox90 fixed)

RESOLVED FIXED
90 Branch
Tracking Status
firefox90 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Whiteboard: [bidi-m1-mvp])

Attachments

(1 file)

As seen on bug 1656962 we do not wait until the call to Services.startup.quit() has been processed, and as such return too early. We clearly have to await the quit-application observer topic as we do similarly in Marionette:

https://searchfox.org/mozilla-central/rev/0379f315c75a2875d716b4f5e1a18bf27188f1e6/testing/marionette/driver.js#2887-2891

As you can see below the is an around 100ms time gap on MacOS with an opt build between calling Services.startup.quit() and the quit-application notification received:

0:08.49 GECKO(25146) *** time calling quit: 1613579047383
0:08.59 GECKO(25146) *** time quit notification: 1613579047479

This might be different for Windows, and we will see in a try build where I will leave the dump statements in.

Maybe that helps us on Windows to delay the response enough.

Given that browser chrome tests don't allow the application to quit, we cannot actually test the Browser.close() command. We would have to rely on Puppeteer unit tests.

Here excerpts from the pup job logs and what changes.

before:

[task 2021-02-17T17:54:48.122Z] PID 398 | 1613584488121	RemoteAgent	TRACE	(connection {0f2c623c-2e4b-4f03-b3d4-8ee2795f2191})-> {"method":"Browser.close","id":3}
[task 2021-02-17T17:54:48.150Z] PID 398 | 1613584488149	RemoteAgent	TRACE	<-(connection {0f2c623c-2e4b-4f03-b3d4-8ee2795f2191}) {"id":3,"result":{}}
[task 2021-02-17T17:54:48.154Z] PID 398 | 1613584488154	RemoteAgent	TRACE	<-(connection {26198ad8-0c58-4f9f-b8d1-55434d0ddd87}) {"method":"Target.targetDestroyed","params":{"targetId":"6a7d76ae-86b5-4178-87c6-e28db2c60792"}}
[task 2021-02-17T17:54:48.154Z] PID 398 | 1613584488154	RemoteAgent	TRACE	<-(connection {0f2c623c-2e4b-4f03-b3d4-8ee2795f2191}) {"method":"Target.targetDestroyed","params":{"targetId":"6a7d76ae-86b5-4178-87c6-e28db2c60792"}}

after:

[task 2021-02-17T17:13:52.391Z] PID 392 | 1613582032391	RemoteAgent	TRACE	(connection {2dd0adde-4dd4-4b90-84b7-a399863b59f1})-> {"method":"Browser.close","id":3}
[task 2021-02-17T17:13:52.392Z] PID 392 | *** time before registering observer: 1613582032392
[task 2021-02-17T17:13:52.392Z] PID 392 | *** time before calling quit: 1613582032392
[task 2021-02-17T17:13:52.414Z] PID 392 | 1613582032413	RemoteAgent	TRACE	<-(connection {257efb97-1d06-40c9-9f2a-900b85080572}) {"method":"Target.targetDestroyed","params":{"targetId":"c43f4b98-eb30-49cb-b776-0feab3062ab7"}}
[task 2021-02-17T17:13:52.414Z] PID 392 | 1613582032413	RemoteAgent	TRACE	<-(connection {2dd0adde-4dd4-4b90-84b7-a399863b59f1}) {"method":"Target.targetDestroyed","params":{"targetId":"c43f4b98-eb30-49cb-b776-0feab3062ab7"}}
[task 2021-02-17T17:13:52.426Z] PID 392 | *** time received quit notification: 1613582032425
[task 2021-02-17T17:13:52.429Z] PID 392 | 1613582032429	RemoteAgent	TRACE	<-(connection {2dd0adde-4dd4-4b90-84b7-a399863b59f1}) {"id":3,"result":{}}

You can clearly see that we weren't waiting long enough and returned before all the targets had been destroyed. This is no longer the case with the patch. On Linux it's just some milliseconds but it might look different on Windows. Sadly we don't have anything to test with in our CI.

We should get this landed and then I will try to re-enable Windows tests in the Puppeteer CIi.

Note that with the patch the ordering of received events will be different to what Chrome is actually sending out. Not sure if that would cause a problem with Puppeteer. I'll get this discussed with Jan from the Puppeteer team.

Also Puppeteer's browser.close() implementation is kinda brute-force and doesn't wait until the browser actually has been shutdown. It's doing the following to things:

https://github.com/puppeteer/puppeteer/blob/main/src/common/Browser.ts#L524-L531

  async close(): Promise<void> {
    await this._closeCallback.call(null);
    this.disconnect();
  }

The _closeCallback is just call to CDP's Browser.close(), which seems to return immediately. And then the connection gets disconnected.

It would be better if there is a wait until the connection get actually dropped by the browser. And if it's not closing the process might need to be killed. But maybe the latter is somewhere else in the Puppeteer code.

I wonder if we should also add the code for Services.obs.notifyObservers() and force a shutdown of the browser in case some code prevents Firefox from shutting down. That would make it still a graceful shutdown, and Puppeteer wouldn't have to kill the process.

Julian, what do you think?

Flags: needinfo?(jdescottes)
Points: --- → 2
Priority: P2 → P3

Mathias, would you mind having a look at comment 4? Could you foresee troubles with CDP/Puppeteer clients when the order of those events is changed? Seeing Target.targetDestroyed before Browser.close() returns would actually be nice.

Flags: needinfo?(mathias)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #6)

I wonder if we should also add the code for Services.obs.notifyObservers() and force a shutdown of the browser in case some code prevents Firefox from shutting down. That would make it still a graceful shutdown, and Puppeteer wouldn't have to kill the process.

Discussed that with Julian and it would be actually good to have. I will implement that call and trigger a forced shutdown in case some component denies the shutdown of Firefox.

Flags: needinfo?(jdescottes)

Ensuring all Target.targetDestroyed events are sent before Browser.close returns sounds reasonable to me. I don’t have the historical background as to why this is not already the case (was it on purpose?), but I’m asking around.

Flags: needinfo?(mathias)
Attachment #9203756 - Attachment description: Bug 1693296 - [remote] Browser.close() has to wait until application is about to close. → Bug 1693296 - [remote] Hardening Browser.close() for a clean shutdown.

(In reply to Mathias Bynens from comment #9)

Ensuring all Target.targetDestroyed events are sent before Browser.close returns sounds reasonable to me. I don’t have the historical background as to why this is not already the case (was it on purpose?), but I’m asking around.

Thanks. I will just go ahead and get it implemented that way. I'll wait with the landing of the patch through until around Friday in the hope that you might have some further details for us.

Chatted with Mathias and due to Chrome internals there won't be a change on their side. As such I will remove the code that delays the response from my current patch.

Summary: Browser.close() has to wait until application is about to close → Browser.close() has to ensure that Firefox will be closed even when something prevents it

With my most-recent patch there are test failures for some Puppeteer launcher tests:
https://treeherder.mozilla.org/jobs?repo=try&revision=1e72bdfabb61c6af1018d7f1097023c779d26cde&selectedTaskRun=QbjO-3iZTYaEmeV8Q4WyCw.0

I'm not able to reproduce them locally, and given that this bug is not needed I would defer remaining work for now.

Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Component: CDP: Browser → CDP
Assignee: nobody → hskupin
Status: NEW → ASSIGNED

Still not working on it but only pushed a rebase to check if things might have changed:
https://treeherder.mozilla.org/jobs?repo=try&revision=c3b7063889b4895184d2dfaf54d2f9076a792877

Assignee: hskupin → nobody
Status: ASSIGNED → NEW

It's still failing.

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/23031db2450d [remote] Hardening Browser.close() for a clean shutdown. r=remote-protocol-reviewers,jdescottes
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch
Priority: P3 → P2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: