Open Bug 1656962 Opened 1 year ago Updated 10 days ago

(Puppeteer) Firefox process is not handled properly on Windows

Categories

(Remote Protocol :: Agent, defect, P3)

Unspecified
Windows
defect
Points:
2

Tracking

(Not tracked)

People

(Reporter: impossibus, Unassigned)

Details

There are a number of issues with the Puppeteer launcher for Firefox on Windows. Here are two examples.

Problems with start-up in CI: https://github.com/puppeteer/puppeteer/issues/5673
Processes not being cleaned up: https://github.com/puppeteer/puppeteer/issues/6298

I think this is because Puppeteer's process management does not account for the Firefox Windows launcher process. In my investigation so far, adding the --wait-for-process args doesn't seem to fix the CI issue mentioned above, so we might need to dig into this a bit more.

Severity: -- → S3
Alias: [puppeteer-beta-reserve]
Whiteboard: [puppeteer-beta-reserve]
Whiteboard: [puppeteer-beta-reserve] → [puppeteer-beta2-mvp]

I just created a PR on the Puppeteer Github repository to check if the problem is still around; and it is:
https://github.com/puppeteer/puppeteer/pull/6895

I assume (without having had a look at the code yet) that shutting down the browser will always be done via [Browser.close()[https://chromedevtools.github.io/devtools-protocol/tot/Browser/#method-close]. Here is our implementation:

https://searchfox.org/mozilla-central/rev/0379f315c75a2875d716b4f5e1a18bf27188f1e6/remote/cdp/domains/parent/Browser.jsm#31-33

I tried to build a custom Windows64 build with the launcher process disabled, but as it looks like this doesn't work anymore:

https://treeherder.mozilla.org/jobs?repo=try&revision=c566d6fe18e2840bcc722d9d4363a4425010d526&selectedTaskRun=B_660yb4RySf3KGXbj0CCA.0

Aaron, I assume that this a bug, and should get fixed? Or did I do something wrong?

Flags: needinfo?(aklotz)
Whiteboard: [puppeteer-beta2-mvp] → [bidi-m1-mvp]
Points: --- → 2
No longer blocks: 1693763

It's no longer blocking us from getting Windows CI jobs running for Puppeteer, but I would kind like to understand and fix the underlying problem here.

Priority: P2 → P3
No longer depends on: 1693296

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #3)

Aaron, I assume that this a bug, and should get fixed? Or did I do something wrong?

As discussed, this is a bug that should be fixed.

Flags: needinfo?(aklotz)

(In reply to Aaron Klotz [:aklotz] from comment #5)

As discussed, this is a bug that should be fixed.

I filed bug 1697282 for that.

Also as discussed I will get more information first in how processes are handled by Puppeteer before asking Aaron for advice.

(In reply to Maja Frydrychowicz :impossibus (was :maja_zf) (needinfo me) from comment #0)

Problems with start-up in CI: https://github.com/puppeteer/puppeteer/issues/5673

That was fixed by not killing Firefox anymore as workaround, but causing an in-app shutdown.

Processes not being cleaned up: https://github.com/puppeteer/puppeteer/issues/6298

This is still valid, and even with --wait-for-process, which is present for a while as argument in the Puppeteer Launcher.

When following the lifetime of a process the following is done:

  1. Creating a runner and starting it
  2. The process is spawned via childProcess.spawn() (detached is kept as false)
  3. Finally the process gets killed via this.proc.kill('SIGKILL')

I tried to test the Puppeteer code path with a disabled launcher process but due to bug 1697282 it's not possible right now. As such not sure of the launcher process is the problem here, or something else causes Firefox to not completely shutdown.

Aaron, it would be great to get some help here. Maybe you can already see the problem. If not hints/tips for testing various scenarios would be welcome too. Thanks a lot!

Flags: needinfo?(aklotz)

The launcher process does not start the browser under a job object, so killing the launcher process (which is obviously the one whose handle you have) won't have any effect on the browser itself.

I suppose there are a few options here, but it really depends on effort and whether we want to add this functionality to Puppeteer/node or to the launcher process itself.

If we landed a change to the launcher process to use a job object, we could probably make this all Just Work (TM). I would add the caveat that we would only do this for Windows 8 and higher, so if Windows 7 matters, then this isn't an option.

Other options probably involve adding code (and possibly necessitating the addition of new dependent packages) to Puppeteer itself that call various Windows APIs to discover the browser process and terminate it:

  • Puppeteer could use a different API that supports Windows job objects to start Firefox (again, has the same Windows 7 limitation as above);
  • Using toolhelp for enumerating processes, finding the one whose parent is the launcher, and then terminating that;
  • Using FindWindow for resolving a Firefox window, obtaining its pid, and then terminating that (hard to distinguish between multiple instances of Firefox if that is a concern)
Flags: needinfo?(aklotz)
Whiteboard: [bidi-m1-mvp] → [bidi-m2-mvp]
Priority: P3 → --
Priority: -- → P3
Whiteboard: [bidi-m2-mvp]

Moving outside of the milestone because Puppeteer CI jobs are running on Windows, so there is no need to address this immediately.

You need to log in before you can comment on or make changes to this bug.