Closed Bug 1791675 Opened 2 years ago Closed 2 years ago

purgeHTTPCache background task is closed very soon after main process shutdown on Windows

Categories

(Core :: Networking: Cache, defect, P2)

x86_64
Windows 10
defect

Tracking

()

RESOLVED FIXED
108 Branch
Tracking Status
firefox108 --- fixed

People

(Reporter: saschanaz, Assigned: saschanaz)

References

Details

(Whiteboard: [necko-triaged][necko-priority-review])

Attachments

(2 files, 2 obsolete files)

See https://treeherder.mozilla.org/jobs?repo=try&revision=e745b8b7507e1fb48b59a1a7998d96157808197c. On macOS/Linux the task waits 5 sec and do the removal, but on Windows it just closes without removal.

I wonder this is a malfunctioning of the Windows implementation of PR_DetachProcess. Per the docs CloseHandle() should not close the process, though. 🤔

Blocks: 1786256
No longer blocks: 1705676

That CloseHandle() inside detach function is only about releasing the reference AFAICT, not really about "making the process independent from the parent one".

CREATE_NEW_PROCESS_GROUP might work at the creation time.

Sounds like Taskcluster had a similar issue? https://searchfox.org/mozilla-central/rev/b1e5f2c7c96be36974262551978d54f457db2cae/taskcluster/scripts/run-task#707

Great investigation. This does seem problematic.
I see the update service uses the taskScheduler to do this which might work for Windows and OSX. Given that purgeHTTPCache is only relevant on Windows this might be the easiest fix.

It's hard to say if PR_DetachProcess is working as expected. NSPR is less well documented than I would prefer.

Let me know if you'd like to work on a fix. Either way, thank you for the test!

Severity: -- → S3
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-review]

Given that we want a shared API to open this task which I intend to use on other OSes too, I think It'll be more ideal to have a simple cross-platform way. Using a task scheduler sounds like a bit of overkill, but that could work as a workaround if there's no other choice.

And yes, I intend to work on this since this is blocking my patch.

Huh, I was unaware of this. I expect it will bite us as we try to replace pingsender with a background task (Bug 1746983).

I don't think a scheduled (OS-level) task is the best fit for this type of work. Such tasks are per-user and not easily made per-profile. Coordination is possible, but is uncharted territory. Better to ensure we can spawn a process with the expected settings.

To that end, Subprocess.jsm is a more modern implementation of process interaction -- but I don't see support for detaching processes. We might want to add it.

But on Windows, with a recent Nightly, a process launched with nsIProcess.run(false, ...) is detached. That is, in a browser console, I can quite Firefox and leave e.g., notepad, running with:

var p = Cc["@mozilla.org/file/local;1"].createInstance(Ci.nsIFile);
p.initWithPath("c:\\Windows\\System32\\notepad.exe");

var P = Cc["@mozilla.org/process/util;1"].createInstance(Ci.nsIProcess);
P.init(p);
var args = [];

P.run(false, args, args.length);

Kagami, does not behave in the same way for you? If not, this suggests some subtle behaviour with the Windows console.

Blocks: 1746983
Flags: needinfo?(krosylight)

Ah, so, that works for me if I start Firefox by double clicking the icon, but doesn't if I start it via ./mach run.
Setting p.noShell = true; doesn't help unfortunately. Nor does passing CREATE_NEW_PROCESS_GROUP or DETACHED_PROCESS into CreateProcess.

Same here, nsIProcess and Subprocess.call didn't help and neither CREATE_NEW_PROCESS_GROUP.

I confirmed mozilla::ShellExecuteByExplorer makes things work.

Per the comments I wonder any security configuration is blocking us from getting a separate process.

Flags: needinfo?(krosylight)

I have a suspicion it's how mach is spawning processes that's causing these issues. Haven't investigated though.

Confirming CREATE_BREAKAWAY_FROM_JOB does the work.

My theory is that bug 1740619 started to put all the subprocesses into a job which will be terminated later by the launcher, on Windows. And we don't want that for background tasks for obvious reasons.

Now we need an API to do this cleanly, since I don't want to touch ancient nspr for this 🙂

(In reply to Valentin Gosu [:valentin] (he/him) from comment #9)

I have a suspicion it's how mach is spawning processes that's causing these issues. Haven't investigated though.

Well, sort-of: the issue is that --wait-for-browser interacts with the launcher process, which absolutely impacts how these subprocesses are launched and collected.

Essentially nobody will be running with --wait-for-browser, so I'm not sure we need to put too much effort into this, but I see patches in progress so if it's easy, let's make it work in all configurations. I wonder if a little API (in BackgroundTasksUtils.jsm, perhaps?) that got the nsIProcess invocation Just Right would help, so that it was always obvious how to arrange the command line, arguments, flags, etc.

I wonder if a little API (in BackgroundTasksUtils.jsm, perhaps?) that got the nsIProcess invocation Just Right would help, so that it was always obvious how to arrange the command line, arguments, flags, etc.

Yes, I have a local patch to do that 👍 (which I planned to put on bug 1788986, but maybe putting it here makes sense).

We have two duplicated implementations of assembleCmdLine(), one for NSPR and one for nsIProcess. The latter is a copied version added by bug 1601905, which somehow skipped empty string processing because that bug was mainly about whitespaces. This patch fills the hole.

See bug 1605131 to eventually consolidate them.

Depends on D157941

Attachment #9295813 - Attachment description: WIP: Bug 1791675 - Part 1: Add nsIProcess.breakAwayFromJobOnWindows → Bug 1791675 - Part 1: Add nsIProcess.breakAwayFromJobOnWindows r=#xpcom-reviewers
Attachment #9295910 - Attachment description: WIP: Bug 1791675 - Part 2: Quote empty string args passed to nsIProcess → Bug 1791675 - Part 2: Quote empty string args passed to nsIProcess r=mossop
Attachment #9295485 - Attachment description: WIP: Bug 1791675 - PoC → Bug 1791675 - Part 3: Make sure purgeHTTPCache runs after shutdown r=valentin!,nalexander!
Attachment #9295813 - Attachment description: Bug 1791675 - Part 1: Add nsIProcess.breakAwayFromJobOnWindows r=#xpcom-reviewers → Bug 1791675 - Part 2: Add nsIProcess.breakAwayFromJobOnWindows r=#xpcom-reviewers
Attachment #9295910 - Attachment description: Bug 1791675 - Part 2: Quote empty string args passed to nsIProcess r=mossop → Bug 1791675 - Part 3: Quote empty string args passed to nsIProcess r=mossop
Attachment #9295485 - Attachment description: Bug 1791675 - Part 3: Make sure purgeHTTPCache runs after shutdown r=valentin!,nalexander! → Bug 1791675 - Part 4: Make sure purgeHTTPCache runs after shutdown r=valentin!,nalexander!
See Also: → 1794581
Blocks: 1639548
Attachment #9297884 - Attachment is obsolete: true
Attachment #9295813 - Attachment is obsolete: true
Pushed by krosylight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7a3e0d31138c
Part 3: Quote empty string args passed to nsIProcess r=mossop,kmag
https://hg.mozilla.org/integration/autoland/rev/421f472fe4ff
Part 4: Make sure purgeHTTPCache runs after shutdown r=webdriver-reviewers,necko-reviewers,valentin,nalexander,whimboo
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 108 Branch
Regressions: 1801020
See Also: → 1859615
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: