Open Bug 1543354 Opened 1 year ago Updated 11 days ago

AsyncShutdown timeout in asyncEmitManifestEntry("background") - abort background startup on shutdown

Categories

(WebExtensions :: General, defect, P1)

66 Branch
defect

Tracking

(firefox66 wontfix, firefox67 wontfix, firefox68 affected)

Tracking Status
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- affected

People

(Reporter: robwu, Assigned: robwu, NeedInfo)

References

(Blocks 1 open bug)

Details

Crash Data

Attachments

(1 file)

Firefox shutdown is blocked on extension shutdown,
Extension shutdown is blocked on extension startup,
extension startup is blocked on background startup,
background startup is blocked on an extension's background page.

The background startup can take a long while; see telemetry: WEBEXT_BACKGROUND_PAGE_LOAD_MS, for Firefox Beta 67:

  • The median is 1.42 seconds
  • The 75th percentile is 3.03 seconds
  • The 95th percentile is 15.21 seconds.
  • 0.51% takes over 50 seconds.

This long delay is responsible for 499 of the AsyncShutdown timeout crashes in the past week.

ext-backgroundPage.js is already prepared to reliably handle early shut down when delayed startups are enabled (especially after the changes from bug 1501375 that landed in 67).

We should also stop blocking startup in other cases, i.e. avoid await this.build(). Many of our tests rely on the background page to have started up before proceeding with the tests, so to avoid intermittent test failures, we need to update tests and await the background page's startup event where necessary to avoid intermittents.

Tracking for 67 as this is an actionable item to mitigate bug 1464938

Attached file background-slow.zip

Test case to 100% reproduce the crash.
Example: bp-ee28781a-14c3-46f0-b104-5caef0190410

STR:

  1. Visit about:config and set the following preferences:
    • extensions.webextensions.base-content-security-policy to script-src 'self' http://127.0.0.1:7569
    • extensions.webextensions.default-content-security-policy to script-src 'self' http://127.0.0.1:7569
  2. Open the terminal, and use netcat to start accepting connections at port 7569: nc -l 7569
  3. Load the attached extension.
  4. Quit Firefox.

Alternative STR (fewer manual steps, using web-ext):

  1. Extract attached extension to a directory.
  2. Start netcat as before, nc -l 7569
  3. Run Firefox: web-ext run --pref=extensions.webextensions.base-content-security-policy="script-src 'self' http://*" --pref=extensions.webextensions.default-content-security-policy="script-src 'self' http://*" --pref=extensions.webextensions.background-delayed-startup=false
  4. Quit Firefox.

Expected:

  • Firefox quits

Actual:

  • Firefox hangs. After one minute a crash report dialog appears.

The test case requires some pref changes to make reproduction easier, but the scenario can also be simulated without pref changes, by using a https URL in background.html and putting the https origin in the content_security_policy field in manifest.json.

Interestingly, bp-ee28781a-14c3-46f0-b104-5caef0190410 has a different async shutdown timeout signature than reported.
This is because the test is a temporary add-on, and stuck at https://searchfox.org/mozilla-central/rev/dd7e27f4a805e4115d0dbee70e1220b23b23c567/toolkit/mozapps/extensions/internal/XPIProvider.jsm#2340

Another interesting observation is that the state is Startup: Run manifest instead of Startup: Run manifest: asyncEmitManifestEntry("background") (to see the state: enable debugging at about:debugging, press Ctrl-Shift-J to open the global JS console and run ChromeUtils.import('resource://gre/modules/addons/XPIProvider.jsm').XPIProvider.activeAddons.get('@reproduction-for-bug-1543354').scope.extension.state ).
This is because the extra state is discarded upon running this line: https://searchfox.org/mozilla-central/rev/dd7e27f4a805e4115d0dbee70e1220b23b23c567/toolkit/components/extensions/Extension.jsm#1976

EDIT: Reported the "interesting observation" as bug 1543384

Crash Signature: [@ AsyncShutdownTimeout | quit-application-granted | XPIProvider shutdown ]
Priority: -- → P1
Assignee: nobody → rob

Removing await before this.build is risky, because there are many tests (and some non-test code) that rely on [the "ready" event](https://searchfox.org/mozilla-central/search?q=on(%22ready%22) or the startup promise that is currently blocked on the background page.

I believe that the long-term goal is to not block on background startup, but that takes quite some effort and it is not something that I dare to uplift.

To fix this crash bug on a shorter time scale, I am thinking of updating the implementation to still wait for the background load on startup, but skip the promise as soon as shutdown commences.
I already had an example of this concept in a previous patch, see waitUnlessShutdown at https://phabricator.services.mozilla.com/D19701?id=62129 (use Ctrl-F, Extension.jsm)

Andrew, do you consider my proposed await extension.waitUnlessShutdown(this.build()); an acceptable way to resolve this bug, which is the top cause of the AsyncShutdown timeout crash?

Flags: needinfo?(aswan)

I'm still not excited about the way the addon manager and extension interact in your earlier patch.
If nothing else, how about just adding an observer for quit-application-granted?
I have a gut feeling that there must be a better way to handle this but I don't know what it is. Forwarding to Kris for his input.

Flags: needinfo?(aswan) → needinfo?(kmaglione+bmo)
See Also: → 1545513

A few crashes only in beta today and crashes went down in bug 1464938, wontfix 67.

This is small enough that I feel like it should not be a P1. But it's the only remaining blocker on this overall child shutdown bug.

You need to log in before you can comment on or make changes to this bug.