Closed Bug 1597803 Opened 10 months ago Closed 10 months ago

PingSender runs in the install directory, prevents staged update replacement

Categories

(Toolkit :: Telemetry, defect, P1)

Desktop
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: agashlin, Assigned: gsvelto)

Details

Attachments

(1 file)

Sorry if this is a long walk, I'm trying to explain how I got here in case there's a better interpretation.

tl;dr: I think that, if pingsender is still running when the updater attempts to rename the install directory, the update will fail. This might be happening for me due to a particularly slow network.


For the past few days I've had a number of failures when the updater is trying to replace the install directory with the new staged update, the logs indicate that this is because the old directory is still in use:

Begin moving destDir (C:\Program Files\Firefox Nightly) to tmpDir (C:\Program Files\Firefox Nightly.bak)
rename_file: proceeding to rename the directory
rename_file: failed to rename file - src: C:\Program Files\Firefox Nightly, dst:C:\Program Files\Firefox Nightly.bak, err: 13
PerformReplaceRequest: destDir rename attempt 1 failed. File: C:\Program Files\Firefox Nightly. Last error: 32, err: 7
...

13 is Posix errno EACCES, 32 is win32 ERROR_SHARING_VIOLATION. 7 is our general code for the rename failing, WRITE_ERROR.

This retries 10 times, sleeping 100ms between, and then fails, which causes Firefox to eventually show the manual install prompt once it restarts. If I instead exit Firefox, and then manually start it (so that some time passes before it starts and attempts to run the updater), it will succeed.

Today it was able to succeed after 4 failures while I had Process Monitor logging (though I mistakenly lost it before saving). I noticed that, between the unsuccessful and successful attempt to open the directory, pingsender.exe was in the process of shutting down and it closed a handle on C:\Program Files\Firefox Nightly. (When I get a chance to recreate the procmon log I need to check if this was the last of the three pingsender.exe processes that had been launched.)

I don't think pingsender opens this directory for any particular reason, it just happens to be the working directory of Firefox and so it gets inherited. For this to not be a problem with the updater itself we change it's working directory to a system directory.

It's weird that this just started happening for me recently. My best guess is that it's because I started testing a VPN Nov 12, perhaps this added enough latency to keep those pingsenders from finishing. I intend to perform some more tests without the VPN but with some artificial delay to see if I can reproduce it reliably.

+gsvelto for keyword "pingsender"

You are right that pingsender does not want a handle inside the appdir (except for arg0 I suppose, since maybe we need a handle to our own executable? (I dunno)) as all of its operations are in the data dirs instead.

(In reply to Adam Gashlin (he/him) [:agashlin] from comment #0)

I don't think pingsender opens this directory for any particular reason, it just happens to be the working directory of Firefox and so it gets inherited. For this to not be a problem with the updater itself we change it's working directory to a system directory.

It's weird that this just started happening for me recently. My best guess is that it's because I started testing a VPN Nov 12, perhaps this added enough latency to keep those pingsenders from finishing. I intend to perform some more tests without the VPN but with some artificial delay to see if I can reproduce it reliably.

Yes, this is most likely the cause: I've just tested with a dummy application, if I run it and try to rename the directory that holds it while it's running Windows rejects the operation. I guess that the solution here is to avoid running pingsender when we know we'll be doing an update.

Hey Molly, you know updater stuff. Is there a way that, say, TelemetrySend.jsm could know that we're gonna update between this shutdown and the next startup so we should ignore the usePingsender flag? Is there a particular case this describes (maybe the "Restart to update" button is the only case where this would be a problem?) that we can identify even more narrowly?

Flags: needinfo?(mhowell)

You can have files open in the directory (otherwise the updater.exe couldn't work at all), it's the handle on the directory itself, I think. So having pingsender.exe running should be fine as long as it changes to a different working directory, that's probably the simplest solution.

(In reply to Chris H-C :chutten from comment #3)

Hey Molly, you know updater stuff. Is there a way that, say, TelemetrySend.jsm could know that we're gonna update between this shutdown and the next startup so we should ignore the usePingsender flag? Is there a particular case this describes (maybe the "Restart to update" button is the only case where this would be a problem?) that we can identify even more narrowly?

Mostly it just looks like any other explicitly-requested restart; I don't know if it would make sense to disable the pingsender for any such restart. If not, there's a couple options. You could listen for the update-staged notification, which means that a restart hasn't been requested yet but the next time it does we'll be applying an update. Or, you can query the update service to see if there's an active update in one of the applied states, and that would indicate the same thing. Both of those things only apply when update staging is enabled, but this bug doesn't exist if it isn't (we don't mess with the directory in that case, only individual files).

Flags: needinfo?(mhowell)

(In reply to Adam Gashlin (he/him) [:agashlin] from comment #4)

You can have files open in the directory (otherwise the updater.exe couldn't work at all), it's the handle on the directory itself, I think. So having pingsender.exe running should be fine as long as it changes to a different working directory, that's probably the simplest solution.

OK, that should be trivial to do.

(( Presuming from Gabriele's comment that he's working on this actively ))

Assignee: nobody → gsvelto
Priority: -- → P1

(In reply to Chris H-C :chutten from comment #7)

(( Presuming from Gabriele's comment that he's working on this actively ))

I'll give it a spin next week.

Status: NEW → ASSIGNED
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5079547664c5
Prevent pingsender from holding on to the current working directory on Windows r=chutten
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
You need to log in before you can comment on or make changes to this bug.