Open Bug 1833831 Opened 1 year ago Updated 3 months ago

In automation, add-on manager in msix builds spend ~400ms or more on startup doing mainthread IO `stat` calls on every language pack we ship to get the last modified date of all of them (and then uninstalls all of them!?)

Categories

(Toolkit :: Add-ons Manager, task, P3)

Desktop
All
task

Tracking

()

Performance Impact low
Tracking Status
firefox115 --- affected

People

(Reporter: Gijs, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: main-thread-io, perf:startup, reproducible)

So there's a first problem, which is we do this for every single langpack:

[task 2023-05-17T13:04:09.772Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-xh@firefox.mozilla.org before opening first browser window - 
[task 2023-05-17T13:04:09.772Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   get nsIFile.lastModifiedTime
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   getModTime (resource://gre/modules/addons/XPIProvider.jsm:591:12)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   addFile (resource://gre/modules/addons/XPIProvider.jsm:824:9)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   scanForChanges (resource://gre/modules/addons/XPIProvider.jsm:1438:16)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   XREMain::XRE_main

That's not good but at least, on our very-fast-disk infra machines, it's not very slow, "only" 20 or so ms. It's probably a bunch slower on spinny disk user machines, but I don't have data for that right now.

The second bit is worse, however, we do this which takes about 300ms on the same very-fast-disk machines:

[task 2023-05-17T13:04:09.810Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-ach@firefox.mozilla.org before opening first browser window - 
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   nsLocalFile::ResolveAndStat
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   nsIFile.moveTo
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   _installFile (resource://gre/modules/addons/XPIInstall.jsm:1106:14)
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   moveUnder (resource://gre/modules/addons/XPIInstall.jsm:1145:11)
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   uninstallAddon (resource://gre/modules/addons/XPIInstall.jsm:3383:16)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   addMetadata (resource://gre/modules/addons/XPIDatabase.jsm:3062:13)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   processFileChanges (resource://gre/modules/addons/XPIDatabase.jsm:3504:20)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   XREMain::XRE_main
[task 2023-05-17T13:04:09.813Z] 13:04:09     INFO - (PoisonIOInterposer) stat - C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-ach@firefox.mozilla.org
[task 2023-05-17T13:04:09.813Z] 13:04:09     INFO - Not taking screenshot here: see the one that was previously logged
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\trash before opening first browser window - 
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO -   nsLocalFile::ResolveAndStat
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   nsIFile.moveTo
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   _installFile (resource://gre/modules/addons/XPIInstall.jsm:1106:14)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   moveUnder (resource://gre/modules/addons/XPIInstall.jsm:1145:11)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   uninstallAddon (resource://gre/modules/addons/XPIInstall.jsm:3383:16)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   addMetadata (resource://gre/modules/addons/XPIDatabase.jsm:3062:13)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   processFileChanges (resource://gre/modules/addons/XPIDatabase.jsm:3504:20)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   XREMain::XRE_main
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO - (PoisonIOInterposer) stat - C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\trash
[task 2023-05-17T13:04:09.817Z] 13:04:09     INFO - Not taking screenshot here: see the one that was previously logged
[task 2023-05-17T13:04:09.818Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\trash before opening first browser window - 
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   set nsIFile.permissions
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   setFilePermissions (resource://gre/modules/addons/XPIInstall.jsm:1046:27)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   recursiveRemove (resource://gre/modules/addons/XPIInstall.jsm:992:24)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   uninstallAddon (resource://gre/modules/addons/XPIInstall.jsm:3383:16)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   addMetadata (resource://gre/modules/addons/XPIDatabase.jsm:3062:13)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   processFileChanges (resource://gre/modules/addons/XPIDatabase.jsm:3504:20)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   XREMain::XRE_main

So we try to move the add-on file to trash (after, btw, checking if trash exists, if so recursively removing it and all its contents, and then recreating it, all also on the main thread), which involves another stat call on the add-on dir, plus 2 of them on the trash dir, and presumably the actual IO of moving the file (not captured by the profiler or test).

I don't know exactly how msix stuff works so I have a lot of questions like:

  • why do we need to check the LM time of all these add-ons
  • why are we uninstalling all these add-ons
  • does that even work, ie does the user running Firefox normally have permissions to mess with the app dir, where these langpacks live?
  • what's up with the "trash" dir - do we really need to empty it, synchronously on the main thread? Could we not just pick a unique name and create a separate one for the lifetime of the process, and queue an async off-mainthread idle priority task to clean up the old stuff?

I'm hoping Ben or Luca know some of the background here. :-)

Flags: needinfo?(lgreco)
Flags: needinfo?(bhearsum)

I don't think I have a complete answer for you, but here's what I know of the MSIX specific parts:

(In reply to :Gijs (he/him) from comment #0)

I don't know exactly how msix stuff works so I have a lot of questions like:

  • does that even work, ie does the user running Firefox normally have permissions to mess with the app dir, where these langpacks live?

We definitely cannot remove files from the install directory in MSIX. Those directories are only modifiable by very very high privileged users (even if Firefox was running elevated it still wouldn't be able to make these changes). However, I want to note that the log appears to be showing stat's on these files from a profile directory (eg: C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-ach@firefox.mozilla.org). The install directory would be somewhere under C:\Program Files\WindowsApps (as we see for the non-langpack bundled extensions: C:\Program Files\WindowsApps\Mozilla.MozillaFirefoxNightly_115.2305.1809.0_x64__gmpnhwe7bv608\VFS\ProgramFiles\MozillaFirefoxNightly Package Root\browser\features\formautofill@mozilla.org.xpi). It kindof sounds like these langpacks have been copied to the profile directory - which seems to be intentional based on what I see in https://bugzilla.mozilla.org/show_bug.cgi?id=1726214#c12.

In any case - anything we're doing that is ultimately trying to remove langpacks from the install directory of an MSIX install is doomed to failure, and probably should never happen unless there's other side effects that we care about.

Nick - I think you probably ought to weigh in here as well as you did much of the original work.

Flags: needinfo?(bhearsum) → needinfo?(nalexander)

(In reply to :Gijs (he/him) from comment #0)

So there's a first problem, which is we do this for every single langpack:

[task 2023-05-17T13:04:09.772Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-xh@firefox.mozilla.org before opening first browser window - 
[task 2023-05-17T13:04:09.772Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   get nsIFile.lastModifiedTime
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   getModTime (resource://gre/modules/addons/XPIProvider.jsm:591:12)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   addFile (resource://gre/modules/addons/XPIProvider.jsm:824:9)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   scanForChanges (resource://gre/modules/addons/XPIProvider.jsm:1438:16)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.774Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.775Z] 13:04:09     INFO -   XREMain::XRE_main

That's not good but at least, on our very-fast-disk infra machines, it's not very slow, "only" 20 or so ms. It's probably a bunch slower on spinny disk user machines, but I don't have data for that right now.

The second bit is worse, however, we do this which takes about 300ms on the same very-fast-disk machines:

[task 2023-05-17T13:04:09.810Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-ach@firefox.mozilla.org before opening first browser window - 
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   nsLocalFile::ResolveAndStat
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   nsIFile.moveTo
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   _installFile (resource://gre/modules/addons/XPIInstall.jsm:1106:14)
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   moveUnder (resource://gre/modules/addons/XPIInstall.jsm:1145:11)
[task 2023-05-17T13:04:09.811Z] 13:04:09     INFO -   uninstallAddon (resource://gre/modules/addons/XPIInstall.jsm:3383:16)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   addMetadata (resource://gre/modules/addons/XPIDatabase.jsm:3062:13)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   processFileChanges (resource://gre/modules/addons/XPIDatabase.jsm:3504:20)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.812Z] 13:04:09     INFO -   XREMain::XRE_main
[task 2023-05-17T13:04:09.813Z] 13:04:09     INFO - (PoisonIOInterposer) stat - C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\langpack-ach@firefox.mozilla.org
[task 2023-05-17T13:04:09.813Z] 13:04:09     INFO - Not taking screenshot here: see the one that was previously logged
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\trash before opening first browser window - 
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.814Z] 13:04:09     INFO -   nsLocalFile::ResolveAndStat
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   nsIFile.moveTo
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   _installFile (resource://gre/modules/addons/XPIInstall.jsm:1106:14)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   moveUnder (resource://gre/modules/addons/XPIInstall.jsm:1145:11)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   uninstallAddon (resource://gre/modules/addons/XPIInstall.jsm:3383:16)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   addMetadata (resource://gre/modules/addons/XPIDatabase.jsm:3062:13)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   processFileChanges (resource://gre/modules/addons/XPIDatabase.jsm:3504:20)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.815Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO -   XREMain::XRE_main
[task 2023-05-17T13:04:09.816Z] 13:04:09     INFO - (PoisonIOInterposer) stat - C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\trash
[task 2023-05-17T13:04:09.817Z] 13:04:09     INFO - Not taking screenshot here: see the one that was previously logged
[task 2023-05-17T13:04:09.818Z] 13:04:09     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected stat on C:\Users\task_168432394098143\AppData\Local\Temp\tmpyvbcg1x8.mozrunner\extensions\trash before opening first browser window - 
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO - Stack trace:
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   0x7ffcad362450
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   set nsIFile.permissions
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   setFilePermissions (resource://gre/modules/addons/XPIInstall.jsm:1046:27)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   recursiveRemove (resource://gre/modules/addons/XPIInstall.jsm:992:24)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   uninstallAddon (resource://gre/modules/addons/XPIInstall.jsm:3383:16)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   addMetadata (resource://gre/modules/addons/XPIDatabase.jsm:3062:13)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   processFileChanges (resource://gre/modules/addons/XPIDatabase.jsm:3504:20)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   checkForChanges (resource://gre/modules/addons/XPIProvider.jsm:3043:17)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   startup (resource://gre/modules/addons/XPIProvider.jsm:2496:9)
[task 2023-05-17T13:04:09.819Z] 13:04:09     INFO -   callProvider (resource://gre/modules/AddonManager.jsm:240:21)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   _startProvider (resource://gre/modules/AddonManager.jsm:545:16)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:609:9)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   startup (resource://gre/modules/AddonManager.jsm:3665:9)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   observe (resource://gre/modules/addonManager.js:72:9)
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   XPCWrappedJS method call
[task 2023-05-17T13:04:09.820Z] 13:04:09     INFO -   XREMain::XRE_main

So we try to move the add-on file to trash (after, btw, checking if trash exists, if so recursively removing it and all its contents, and then recreating it, all also on the main thread), which involves another stat call on the add-on dir, plus 2 of them on the trash dir, and presumably the actual IO of moving the file (not captured by the profiler or test).

I don't know exactly how msix stuff works so I have a lot of questions like:

  • why do we need to check the LM time of all these add-ons

We distribute a single multi-locale MSIX file by packaging all of the langpack XPIs as "distribution XPIs" in the MSIX itself. Processing all of these addons is not snappy (https://bugzilla.mozilla.org/show_bug.cgi?id=1726214) so we try to only process the required languages. But I imagine that we can avoid a great deal of the I/O by looking for N specific XPIs rather than iterating all XPIs and processing the N required ones.

  • why are we uninstalling all these add-ons

This I can't speak to; I'll defer to Luca. This is all standard "distribution XPI" stuff: it's my understanding that the distribution XPIs are copied out to allow for in-product addon updates to proceed.

  • does that even work, ie does the user running Firefox normally have permissions to mess with the app dir, where these langpacks live?

No, Firefox can't write to the distribution directory in an MSIX package. (Or most installs, which are elevated, i.e., in c:/Program Files.)

  • what's up with the "trash" dir - do we really need to empty it, synchronously on the main thread? Could we not just pick a unique name and create a separate one for the lifetime of the process, and queue an async off-mainthread idle priority task to clean up the old stuff?

Again, this I know nothing about. Luca?

I'm hoping Ben or Luca know some of the background here. :-)

Flags: needinfo?(nalexander)

does that even work, ie does the user running Firefox normally have permissions to mess with the app dir, where these langpacks live?

I took a look into the changes originally introduced from Bug 1726214, and I see that the distribution addons are being installed in the profile (see here on searchfox), which explains at least where installed langpacks live and how they could be successfully removed.

why are we uninstalling all these add-ons?

I’m digging more into it, but in the meantime I wanted to see if I could gather some more context around the failure and determine what are the pre-conditions to force my local build to consistently hitting it.

In particular, locally I've been able to force this test file to hit installDistributionAddon, and the test reported a failure when Gecko was hitting the disk to copy the file into the profile, but it wasn't hitting the recursiveRemove that I see in comment 0 logs and in the linked profiling data (while I don't see the copyTo call in the profiling data attached to comment 0).

From the stacktrace in comment 0 I see that recursiveRemove is coming from addMetadata (through a call to uninstallAddon), and that should only be happening if we get into the catch block inside addMetadata, which seems to suggest that we hit one of the error conditions handled in that try…catch (“install disabled by enterprise policy”, “error loading the manifest”, “addon id mismatch” or the “xpi isn’t signed correctly”), and at the moment the exception raised when "the xpi is not signed correctly" seems the one that I would guess we may more likely be hitting in this case (but I can’t exclude I may still be missing some detail about how we get to the observed behaviors).

And so I’m currently wondering: are the langpack xpis installed as distribution addons by this test actually signed?

A quick way to confirm if that may be part of how we hit the observed behaviors while running this test case may also be to try to disable the langpack signature checks using the “extensions.langpacks.signatures.required” pref in a push to try.

what's up with the "trash" dir - do we really need to empty it, synchronously on the main thread? Could we not just pick a unique name and create a separate one for the lifetime of the process, and queue an async off-mainthread idle priority task to clean up the old stuff?

I wouldn’t exclude that we may handle the trash dir cleanup in a way that doesn’t at least hit us on startup.

But I’m also wondering if for these kind of multi language builds where we have all langpacks xpi files available in the application dir, we may want to consider to “install them in place” instead of copying them into the profile directory, given that the langpacks installed from a previous Firefox version are very likely going to be incompatible with a new Firefox major version upgrade being installed and we would have to remove them from disk and then copy the new ones (while we could keep them in place and installing/uninstalling them by just changing the addon db data and not copying and removing any xpi file).

Similarly, at install time instead of iterating over the distribution/extensions directory looking for the locales directory from which we would be picking the xpi files, we could consider to prebuild a list and use the prebuilt list to determine which set of locales we should negotiate with (and avoid the for loop that iterates over local- directories all together, which seems similar to what Nick is also describing in Comment 2).

Flags: needinfo?(lgreco)

(In reply to Luca Greco [:rpl] [:luca] [:lgreco] from comment #3)

why are we uninstalling all these add-ons?

(...)
From the stacktrace in comment 0 I see that recursiveRemove is coming from addMetadata (through a call to uninstallAddon), and that should only be happening if we get into the catch block inside addMetadata, which seems to suggest that we hit one of the error conditions handled in that try…catch (“install disabled by enterprise policy”, “error loading the manifest”, “addon id mismatch” or the “xpi isn’t signed correctly”), and at the moment the exception raised when "the xpi is not signed correctly" seems the one that I would guess we may more likely be hitting in this case (but I can’t exclude I may still be missing some detail about how we get to the observed behaviors).

"Error loading the manifest" is also a possible error, if the path has changed. E.g. in bug bug 1830507 we witnessed built-in add-ons being broken because the path of the application binary changed.

The severity field is not set for this bug.
:zombie, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(tomica)

Luca: the MSIX XPIs are signed. Here's an MSIX Nightly job; that job depends on signed Linux jobs like. A sample XPI like has content like:

$ unzip target.langpack.xpi
...
  inflating: manifest.json
  inflating: META-INF/cose.manifest
  inflating: META-INF/cose.sig
  inflating: META-INF/manifest.mf
  inflating: META-INF/mozilla.sf
  inflating: META-INF/mozilla.rsa
Flags: needinfo?(lgreco)

(In reply to Nick Alexander :nalexander [he/him] from comment #6)

Luca: the MSIX XPIs are signed. Here's an MSIX Nightly job; that job depends on signed Linux jobs like. A sample XPI like has content like:

Thanks for the confirmation, I've pushed to try a temporary patch which un-skips the test file and adds a debug log in the try/catch which seems we may be hitting, so that I can both confirm that I can consistently hit the expected failure in the CI job and see if I can gather some more details about which unexpected issue we are hitting.

I'll keep the needinfo assigned to me to come back to my push to try and double-check if I managed to gather some more details useful to pinpoint the actual underlying issue.

Flags: needinfo?(tomica)

Push to try seems to confirm that we are hitting the try...catch block in addMetadata as mentioned in comment 3, but based on the error message not for a missing signature, the error being hit is actually the "does not contain a valid manifest" (which should be the one raised from loadManifest here):

...
Add-on langpack-ach@firefox.mozilla.org is invalid: Error: File C:\Users\task_168657028430879\AppData\Local\Temp\tmpg88_km2s.mozrunner\extensions\langpack-ach@firefox.mozilla.org does not contain a valid manifest

quoted from the errors captured in the failure logs here before hitting the error due to the unexpected mainthread io, but it is also logged for other tests running before (and so that seems to suggest we keep hitting that install failure from each test).

I tried to manually unpack the msix and check the langpacks xpi files from the distribution/extensions directory, but those seems to be contains the manifest.json as expected (and as also shown in Nick comment 6).

I would also expect ".xpi" to be included in the path to the file which seems to not be including a manifest from the "Error File ... does not contain a valid manifest", but the error message doesn't include an extension and so based on just the current logs I got it looks like it is a sub-directory instead of an xpi file.

I'll dig more into that.

I think that I finally managed to pin point what is actually going on, the unexpected langpacks directories in the profile extensions directory seems to be created by the test harness and not gecko itself.

Following some additional detail from this last round of investigation:

  • I confirmed locally that in the profile where the test was running included in the profile extensions directory a sub-directory named after the locale xpi (as noticed in the log added in my previous push to try and mentioned in comment 8), one for each of the langpacks I manually included into the distribution/extensions directory (following the naming convention locale-NN/langpack-NN@firefox.mozilla.org.xpi). The langpack xpi was in the subdirectory (e.g. profileDir/extensions/langpack-it@firefox.mozilla.org/langpack-it@firefox.mozilla.org.xpi instead of profileDir/extensions/langpack-it@firefox.mozilla.org.xpi)

  • the unexpected langpack directory in the profile extensions dir was not created by Gecko as part of installing the langpacks included in the distribution, it was instead copied before Gecko is started on that test profile from the test harness python code, in particular this part of runtests.py is the one mistakenly collecting all the locale-NN dirs from the distribution/extensions directory and installing then in the test profile using the wrong convention (which will then hit the Add-on ... is invalid error and will be removed by gecko as we noticed already while investigating this issue):

Locally I've been trying this tweak to the getExtensionsToInstall function defined in runtests.py to avoid collecting subdirectories from the distributions extensions dir:

I wanted to double-check what failure we would hit when running browser_startup_mainthreadio.js with the changes in a new push to try, but the ones I just pushed seems to be hitting a failure build-signing-linux64-shippable/opt job, which seems to be preventing to run the jobs that will sign the langpacks and so I'm not sure any of these two push to try will be running the job that would be hitting the failure being investigated:

(I haven't double-checked yet if my draft patch on runtests.py has anything to do with the build-signing-linux64-shippable/opt job failure)

I'll try to run another push to try later today, but in the meantime I wanted to provide all the details collected so far, in case you may also give a try to confirm this theory and get to see what failure we would be hitting with runtests.py tweaked to don't create the langpacks directory in the test profile extensions dir.

Flags: needinfo?(lgreco)

The Performance Impact Calculator has determined this bug's performance impact to be low. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: Windows
Impact on browser: Causes noticeable startup delay
Websites affected: Rare
[x] Able to reproduce locally

Performance Impact: ? → low
Keywords: reproducible

From Luca's investigation in comment 9, this seems to be an issue with the test harness, and not with actual shipping code, so triaging this as P3 Task. If there's indication that this does in fact impact Firefox, feel free to ping us or re-prioritize.

Type: defect → task
Priority: -- → P3

Is there anything that should be done to move this forward or close it out?

Flags: needinfo?(nalexander)

AIUI we should at least fix the test harness to not put the add-ons somewhere it shouldn't and thus triggering this IO. Then we can re-enable the test hitting this for msix builds (bug 1833639).

There may also be follow-up fodder in terms of the timing of cleaning up trash, and perhaps establishing how often we hit this type of thing "in the wild" so that we have some idea of whether it's worth optimizing this at all.

Nick may have further ideas.

Severity: -- → N/A

Sorry for the long delay. Let's mutate this one to be about improving the test harness.

We could include a Glean event recording how long installing distribution addons takes in the wild, but I'm not going to file for that because it's not something we're likely to actively monitor at this time. (That would be a follow-up to the underlying changes in https://bugzilla.mozilla.org/show_bug.cgi?id=1726214, if and when we want to do that.)

Flags: needinfo?(nalexander)
Summary: Add-on manager in msix builds spend ~400ms or more on startup doing mainthread IO `stat` calls on every language pack we ship to get the last modified date of all of them (and them uninstalls all of them!?) → In automation, add-on manager in msix builds spend ~400ms or more on startup doing mainthread IO `stat` calls on every language pack we ship to get the last modified date of all of them (and then uninstalls all of them!?)
You need to log in before you can comment on or make changes to this bug.