Closed Bug 1991490 Opened 3 months ago Closed 3 months ago

Stalled out download of trainhop XPI can cause shutdown hangs / crashes

Categories

(Firefox :: New Tab Page, task)

task

Tracking

()

RESOLVED FIXED
145 Branch
Tracking Status
firefox143 --- wontfix
firefox144 + fixed
firefox145 + fixed

People

(Reporter: mconley, Assigned: mconley, NeedInfo)

References

(Blocks 1 open bug)

Attachments

(3 files)

With our most recent train-hop to 143, we noticed a small but significant spike in crashes with the signature AsyncShutdownTimeout | quit-application | newtabTrainhopAddon scheduleUpdateTrainhopAddonState shutting down

https://crash-stats.mozilla.org/signature/?product=Firefox&version=143.0.1&signature=AsyncShutdownTimeout%20%7C%20quit-application%20%7C%20newtabTrainhopAddon%20scheduleUpdateTrainhopAddonState%20shutting%20down&date=%3E%3D2025-09-15T13%3A26%3A00.000Z&date=%3C2025-09-29T13%3A26%3A00.000Z#graphs

My current hypothesis is that this is caused when a shutdown begins while a train-hop XPI download either hasn't yet begun, or has only just begun at deferred task finalization time, and doesn't complete before the AsyncShutdown timeout.

STR:

  1. Download the attached Python script, and run it. This will launch a local HTTP server at http://localhost:8080.
  2. Set browser.newtabpage.trainhopAddon.xpiBaseURL to http://localhost:8080/
  3. Install the Nimbus devtools
  4. Launch Nimbus devtools, switch the feature to newtabTrainhopAddon, check "isRollout", and paste the following into the text area:
{
  "addon_version": "145.999.0",
  "xpi_download_path": "newtab-145.0.0-build1/newtab.xpi"
}
  5. Wait a few seconds for the browser to open the network request to the server. The server should indicate that with a message like this:
127.0.0.1 - - [29/Sep/2025 13:25:49] "GET /newtab-145.0.0-build1/newtab.xpi HTTP/1.1" 200 -
127.0.0.1 - - [29/Sep/2025 13:25:49] Stalled after sending 10485760 / 20971520 bytes
  6. Shut down the browser

ER:

The browser should shut down.

AR:

The browser hangs on shutdown, and eventually crashes.

:mconley Thanks for working out an STR to double-check whether a slow download can trigger this kind of shutdown crash. This seems to at least confirm that it is one of the ways we may be hitting these shutdown crashes.

Would you mind trying out whether adding an onDownloadProgress listener to the install listener set up inside AboutNewTabResourceMapping's _installTrainhopAddon method, one that checks whether we are past the shutdown confirmation and explicitly cancels the in-progress installation, would prevent this shutdown crash?

I haven't tried this locally (so it may need to be tweaked a bit), but the following diff is what I was thinking of:

diff --git a/browser/components/newtab/AboutNewTabResourceMapping.sys.mjs b/browser/components/newtab/AboutNewTabResourceMapping.sys.mjs
index 35abab0d0fbc..0a5dcf09a45c 100644
--- a/browser/components/newtab/AboutNewTabResourceMapping.sys.mjs
+++ b/browser/components/newtab/AboutNewTabResourceMapping.sys.mjs
@@ -548,6 +548,18 @@ export var AboutNewTabResourceMapping = {
       );
       const deferred = Promise.withResolvers();
       newInstall.addListener({
+        onDownloadProgress() {
+          const isPastShutdownConfirmed =
+            Services.startup.isInOrBeyondShutdownPhase(
+              Ci.nsIAppStartup.SHUTDOWN_PHASE_APPSHUTDOWNCONFIRMED
+            );
+          if (isPastShutdownConfirmed) {
+            this.logger.debug(
+              "train-hop add-on download cancelled on appShutdownConfirmed barrier"
+            );
+            newInstall.cancel();
+          }
+        },
         onDownloadEnded() {
           if (
             newInstall.addon.id !== BUILTIN_ADDON_ID ||

Normally, the network:offline-about-to-go-offline topic would suffice here. However,
in cases where we have an AsyncShutdown blocker that's preventing us from reaching
network:offline-about-to-go-offline AND that blocker is waiting on an XPIInstall,
the quit-application-granted topic can help break that circular dependency.
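
To make the circular dependency described above concrete, here is a deliberately simplified model of it. This is not Firefox code: the function name, the fixed topic ordering, and the idea that cancellation happens instantly are all assumptions made for illustration. The only details taken from this bug are the two topic names and the fact that the crash signature places the blocker in the quit-application phase.

```javascript
// Simplified model of the shutdown ordering. All names are hypothetical.
const SHUTDOWN_SEQUENCE = [
  "quit-application-granted",
  "quit-application", // the AsyncShutdown blocker waits in this phase
  "network:offline-about-to-go-offline",
];

// Walks the shutdown sequence for an install that cancels its download on
// any topic in `cancelTopics`. Returns the topic at which shutdown gets
// stuck, or null if shutdown runs to completion.
function simulateShutdown(cancelTopics) {
  // Assume a stalled XPI download that never finishes by itself.
  let downloadPending = true;
  for (const topic of SHUTDOWN_SEQUENCE) {
    if (cancelTopics.includes(topic)) {
      // Cancelling the download settles the install the blocker waits on.
      downloadPending = false;
    }
    if (topic === "quit-application" && downloadPending) {
      // The blocker never resolves, so later topics are never reached.
      return topic;
    }
  }
  return null;
}
```

In this model, simulateShutdown(["network:offline-about-to-go-offline"]) gets stuck at "quit-application", because the only cancel trigger sits behind the blocker. Adding "quit-application-granted" to the list lets the sequence complete, which is the shape of the fix described above.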

Assignee: nobody → mconley
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → 145 Branch
Duplicate of this bug: 1992135

firefox-beta Uplift Approval Request

  • User impact if declined: Users that are enrolled in a train-hop run the risk of hitting a shutdown hang / crash if they shut down and the XPI download does not complete before the AsyncShutdown timer goes off.
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing:
  • Risk associated with taking this patch: low
  • Explanation of risk level: We're taking advantage of a pre-existing download cancel mechanism which was watching for network shutdown. We're simply adding a new way of entering that mechanism - shutdown being granted.
  • String changes made/needed: None.
  • Is Android affected?: yes
Attachment #9517769 - Flags: approval-mozilla-beta?

Original Revision: https://phabricator.services.mozilla.com/D267029

Copying crash signatures from duplicate bugs.

Crash Signature: [@ AsyncShutdownTimeout | quit-application | newtabTrainhopAddon scheduleUpdateTrainhopAddonState shutting down]
Attachment #9517769 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

:mconley, there is still crash volume for Fx144.
Any ideas?

Flags: needinfo?(mconley)