1780004 - isOtherInstanceRunning() is unexpectedly false when /tmp/MozillaUpdateLock* is removed (e.g. by OS)

Reporter

Description

•

2 years ago

The update manager relies on a file in a temporary location (i.e. MozillaUpdateLock) to detect whether other instances are active. This is not reliable, because the file can freely be deleted by the OS. Consequently, Firefox binaries can be replaced even with existing Firefox instances still running.

For example, on macOS temporary files are deleted after three days (according to atime/mtime of the file).

The user impact is that after ~3-4 days, a new Firefox instance triggers an application update (replacement) and breaks existing Firefox instances. A Firefox launch or quit does not touch atime/mtime of the MozillaUpdateLock, so the clock starts ticking since the first Firefox launch (you can verify this by using stat -x /private/tmp/MozillaUpdateLock-31210A081F86E80E, which shows that all timestamps are not changed).

STR:

Start Firefox
Open another instance of Firefox, e.g.
'/Applications/Firefox Nightly.app/Contents/MacOS/firefox' --no-remote -profile profdir
(or web-ext run -f nightly, to load Firefox with the web-ext tool)
Visit a privileged page and open the console there (or just use the Browser Console if you've enabled browser chrome debugging), and run the following snippet:
Cc["@mozilla.org/updates/update-sync-manager;1"].getService(Ci.nsIUpdateSyncManager).isOtherInstanceRunning();
Look in /tmp/ for the MozillaUpdateLock file. E.g. /private/tmp/MozillaUpdateLock-31210A081F86E80E
Wait 4 days (on macOS), without closing Firefox.
Repeat step 4 (to observe that the file has been removed by macOS)
Open a new instance of Firefox (with a new profile directory).
Repeat step 3 to see whether thew Firefox instance is aware of the other instances.
Run lsof | grep MozillaUpdateLock to see the open file handles.

Expected:

At step 3, the result of isOtherInstanceRunning() is true.
At step 4, the MozillaUpdateLock file exists.
At step 6, the MozillaUpdateLock file has been removed by the OS.
At step 8, the result of isOtherInstanceRunning() should be true.
At step 9, all three Firefox instances should have a reference to the same MozillaUpdateLock file AND the same inode.

Actual:

Step 3, 4, 6 are as expected (confirming that the test setup is valid).
At step 8, isOtherInstanceRunning() is false. because of this, Firefox will auto-update itself and cause bugs like https://github.com/mozilla/web-ext/issues/2383
At step 9, while the first two Firefox instances share the same MozillaUpdateLock file, the third one has a new file, as seen by the different inode.

Example of lsof output. Note that 61581958 is the inode of the first two launches, and 61650972 is the inode of the last launch (firefox-b is firefox-bin here via web-ext).

firefox   63019  user  14r      REG                1,4          0            61581958 /private/tmp/MozillaUpdateLock-31210A081F86E80E
firefox-b 63300  user  14r      REG                1,4          0            61581958 /private/tmp/MozillaUpdateLock-31210A081F86E80E
firefox-b 69254  user  14r      REG                1,4          0            61650972 /private/tmp/MozillaUpdateLock-31210A081F86E80E

BugBot [:suhaib / :marco/ :calixte]

Updated

•

2 years ago

Keywords: regression

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

2 years ago

Set release status flags based on info from the regressing bug 1553982

status-firefox102: --- → affected

status-firefox103: --- → affected

status-firefox104: --- → affected

status-firefox-esr102: --- → affected

status-firefox-esr91: --- → affected

bhearsum@mozilla.com (:bhearsum)

Comment 2

•

2 years ago

This feels like it's quite edge case-y given how long an instance needs to be running for to trigger it in the example given. Nonetheless, it seems like we ought to renew this lock from time to time (perhaps tying it in with update checks could make sense, given that those are somewhat related, and already on a fairly long timer). We even have code for doing so (that's currently only used by tests AFAICT).

Severity: -- → S3

Priority: -- → P3

Donal Meehan [:dmeehan]

Updated

•

2 years ago

status-firefox102: affected → wontfix

status-firefox103: affected → wontfix

status-firefox-esr91: affected → wontfix

Rob Wu [:robwu]

Reporter

Comment 3

•

2 years ago

(In reply to bhearsum@mozilla.com (:bhearsum) from comment #2)

This feels like it's quite edge case-y given how long an instance needs to be running for to trigger it in the example given.

3-4 days is not an edge case. A device can be suspended without quitting Firefox, to easily end up with a multi-day browsing session.
If my SQL query is correctly written (I am no expert), then CDF of longest session lengths of clients of Firefox Desktop in the past 28 days (1% sample) shows that 10% of the Nightly users have 5+ day sessions. So despite the daily nags to download the update, 10% don't update (or at least they don't restart after an update).

Nonetheless, it seems like we ought to renew this lock from time to time (perhaps tying it in with update checks could make sense, given that those are somewhat related, and already on a fairly long timer). We even have code for doing so (that's currently only used by tests AFAICT).

That sounds reasonable.

Another option is to write to a location where the lock file is not automatically nuked by a periodic task. Or is that not an option for some reason?

BugBot [:suhaib / :marco/ :calixte]

Comment 4

•

2 years ago

:mhowell, since you are the author of the regressor, bug 1553982, could you take a look?
For more information, please visit auto_nag documentation.

Flags: needinfo?(mhowell)

bhearsum@mozilla.com (:bhearsum)

Comment 5

•

2 years ago

(In reply to Rob Wu [:robwu] from comment #3)

(In reply to bhearsum@mozilla.com (:bhearsum) from comment #2)

This feels like it's quite edge case-y given how long an instance needs to be running for to trigger it in the example given.

3-4 days is not an edge case. A device can be suspended without quitting Firefox, to easily end up with a multi-day browsing session.
If my SQL query is correctly written (I am no expert), then CDF of longest session lengths of clients of Firefox Desktop in the past 28 days (1% sample) shows that 10% of the Nightly users have 5+ day sessions. So despite the daily nags to download the update, 10% don't update (or at least they don't restart after an update).

Nonetheless, it seems like we ought to renew this lock from time to time (perhaps tying it in with update checks could make sense, given that those are somewhat related, and already on a fairly long timer). We even have code for doing so (that's currently only used by tests AFAICT).

That sounds reasonable.

Another option is to write to a location where the lock file is not automatically nuked by a periodic task. Or is that not an option for some reason?

I talked to Molly (the original author of this) for some context. She said it was chosen because there was no real equivalent of ProgramData (where we put these on Windows) and wanting to have some level of confidence that the file would eventually get deleted. She also suggested that we could bump atime whenever we check the lock state - which seems like a good idea to me.

Flags: needinfo?(mhowell)

Whiteboard: [fidedi-ope]

Jira Integration Bot

Updated

•

2 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/FIDEDI-277

BugBot [:suhaib / :marco/ :calixte]

Comment 6

•

2 years ago

Set release status flags based on info from the regressing bug 1553982

status-firefox105: --- → affected

Dianna Smith [:diannaS]

Updated

•

2 years ago

status-firefox104: affected → wontfix

Ryan VanderMeulen [:RyanVM]

Comment 7

•

2 years ago

Should we bump the priority here given comment 3?

status-firefox105: affected → wontfix

status-firefox106: --- → affected

Flags: needinfo?(bhearsum)

bhearsum@mozilla.com (:bhearsum)

Comment 8

•

2 years ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #7)

Should we bump the priority here given comment 3?

Going to defer this to Kirk, who has much more context/history.

Flags: needinfo?(bhearsum) → needinfo?(bytesized)

Robin Steuber (she/her) [:bytesized]

Comment 9

•

2 years ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #7)

Should we bump the priority here given comment 3?

I'm inclined to say no. I would like to fix this soonish. But AFAIK this is only used by the update system. And it times out waiting for other instances to close after 6 hours. If the user's session has already been running for 3 days, it doesn't feel urgent that we make sure that we wait another 6 hours for other instances to close.

That being said, I will see about bumping this a bit further up my to do list.

Flags: needinfo?(bytesized)

Rob Wu [:robwu]

Reporter

Comment 10

•

2 years ago

(In reply to Kirk Steuber (he/him) [:bytesized] from comment #9)

But AFAIK this is only used by the update system. And it times out waiting for other instances to close after 6 hours. If the user's session has already been running for 3 days, it doesn't feel urgent that we make sure that we wait another 6 hours for other instances to close.

I filed this bug because I was fed up with my browser becoming unusable after having launched an independent new instance. Your comment suggests that even if the identified issue were to be fixed, that the updater would still update after 6 more hours.

If this is indeed the case, then I'd like to know what I can do to start a new instance of Firefox without replacing/updating the binary. I'd like to use whatever supported mechanism you recommend in the web-ext CLI tool, to launch Firefox without side effects.

Robin Steuber (she/her) [:bytesized]

Comment 11

•

2 years ago

(In reply to Rob Wu [:robwu] from comment #10)

I filed this bug because I was fed up with my browser becoming unusable after having launched an independent new instance. Your comment suggests that even if the identified issue were to be fixed, that the updater would still update after 6 more hours.

Actually, it might not even get you the additional 6 hours. We check if there is currently another instance running when we download and stage a new update. If there is only one instance running at that point, we will do that without waiting the additional 6 hours. When Firefox is launched with a new profile and there is an update staged, it will immediately be installed.

If this is indeed the case, then I'd like to know what I can do to start a new instance of Firefox without replacing/updating the binary. I'd like to use whatever supported mechanism you recommend in the web-ext CLI tool, to launch Firefox without side effects.

Sadly, until we are able to fix Bug 1480452, this is pretty much the expected behavior. Really the only way to get around it would be to use an Enterprise Policy like DisableAppUpdate or ManualAppUpdateOnly to prevent Firefox from ever updating automatically, putting the burden on the user to make sure it stays up to date.

Pascal Chevrel:pascalc

Updated

•

2 years ago

status-firefox106: affected → wontfix

Bugzilla

isOtherInstanceRunning() is unexpectedly false when /tmp/MozillaUpdateLock* is removed (e.g. by OS)

Categories

(Toolkit :: Application Update, defect, P3)

Tracking

()

People

(Reporter: robwu, Unassigned)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [fidedi-ope])

Crash Data

Security

(public)

User Story

Description

Updated

Comment 1

Comment 2

Updated

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Updated

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Updated