Closed Bug 1664144 Opened 4 years ago Closed 3 years ago

sideloaded updates in the profile directory may not startup correctly

Categories

(Toolkit :: Add-ons Manager, defect, P2)

80 Branch
defect

Tracking

()

RESOLVED FIXED
85 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox80 --- wontfix
firefox81 --- wontfix
firefox82 --- wontfix
firefox83 --- wontfix
firefox84 --- wontfix
firefox85 --- fixed

People

(Reporter: bugzilla, Assigned: mixedpuppy)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0

Steps to reproduce:

We have an add-on that uses ExtensionSupport.registerWindowListener():
https://pep-security.lu/dev/repos/pEp_for_Thunderbird/file/5d5772ff8e86/addon/experiments.js#l120
On onUnloadWindow() we run our function unpaint().

Actual results:

We noticed that in TB 81 beta 3 (after add-on installation?) we randomly receive an unload event although the 3pane window isn't closed at all. This leads to add-on malfunction. It will also affect all other WE experiments which listen to closing windows.

Expected results:

No window unload should be notified unless the window is really closed. We haven't seen such behaviour in TB 68 or TB 78. We should have noticed since this lead to a complete add-on malfunction.

John, dan you please take a look. Your "WE Exp support API" uses such listeners, so all the add-ons building on that might fail now.

Flags: needinfo?(john)
Keywords: regression

Related to bug 1661216 which showed up on TB 81 beta 3 recently?

I cannot tell. Investigating at the moment.

Flags: needinfo?(john)

Ignore that. The issue doesn't show on a TB 78.2.2 build from here:
https://treeherder.mozilla.org/#/jobs?repo=comm-esr78&revision=8ec9b9d4f520697628102d97c199a9136c7e3c34

John, can you confirm this bug?

78.2.2 has bug 1661216 backed out (currently it is only in beta). Still investigating.

As per comment #4, we were using a Tb 78.2.2 build with the other bug included. So bug 1661216 doesn't appear to be the culprit. Confirmation is only about running TB 81 beta 3 with one of your add-ons which uses the window notification to see whether notifications are received for windows which aren't actually closed.

Since https://hg.mozilla.org/comm-central/log/tip/common/src/ExtensionSupport.jsm hasn't changed in a while, this might well be some M-C issue where incorrect notifications are dispatched.

You might want to see if you can add a breakpoint via the developer toolbox for when the unload is being hit, and see if you can trace it to its origins. You might need to put the breakpoint in ExtensionSupport.jsm.

Excellent tip, I'll do that right now. console.trace() should do it or else we have JS code that gives a stack.

This time it only failed on the fourth attempt. Here is the stack:

unpaint jar:file:///C:/Users/jorgk/AppData/Roaming/Thunderbird/Profiles/lhqs21u5.off42-peptest.ch for 75/extensions/pEp4Tb@pEp.security.xpi!/experiments.js:233
close jar:file:///C:/Users/jorgk/AppData/Roaming/Thunderbird/Profiles/lhqs21u5.off42-peptest.ch for 75/extensions/pEp4Tb@pEp.security.xpi!/experiments.js:152
unload resource://gre/modules/ExtensionCommon.jsm:909
unload resource://gre/modules/ExtensionParent.jsm:528
unload resource://gre/modules/ExtensionParent.jsm:609
shutdown resource://gre/modules/ExtensionParent.jsm:520
shutdown resource://gre/modules/ExtensionParent.jsm:615
shutdownExtension resource://gre/modules/ExtensionParent.jsm:800
shutdown resource://gre/modules/Extension.jsm:2660
shutdown resource://gre/modules/Extension.jsm:1736
callBootstrapMethod resource://gre/modules/addons/XPIProvider.jsm:1820
_shutdown resource://gre/modules/addons/XPIProvider.jsm:1949
awaitPromise resource://gre/modules/addons/XPIProvider.jsm:228
processFileChanges resource://gre/modules/addons/XPIDatabase.jsm:3329
getNewSideloads resource://gre/modules/addons/XPIProvider.jsm:3035

It's not related to ExtensionSupport.registerWindowListener() at all, but to another unpaint() we have here:
https://pep-security.lu/dev/repos/pEp_for_Thunderbird/file/5d5772ff8e86/addon/experiments.js#l148
in the close() function which is registered here:
https://pep-security.lu/dev/repos/pEp_for_Thunderbird/file/5d5772ff8e86/addon/experiments.js#l98

Hmm, so how to proceed here? Somehow intermittently this callOnClose() which is treated as add-on removal. Hence we try to remove all traces of the add-on from all open windows.

We change the bug summary accordingly. Still, other add-ons unloading their add-on could be affected.

Summary: ExtensionSupport.registerWindowListener() delivers onUnloadWindow when window is actually not unloaded → Add-on manager seemingly randomly delivers callOnClose() callback leading to add-on self-unload

The issue happens after replacing the add-on directly the profile's extensions directory and starting TB with -purgecaches. That is working well in TB 68 and 78. Maybe the bug is invalid and we have to take care of this case in the add-on itself. In that case, sorry about the noise.

If you install an add-on over the top of an existing one, or if you upgrade, then the old one will be unloaded and the new one will be loaded.

However, I have a vague feeling that sometime in the past I found that the close of the old one can happen after the addition of the new one. For my add-on, I think I now do all the UI addition & subsequent removal in experiment event listeners, e.g. https://github.com/thunderbird-conversations/thunderbird-conversations/blob/5cb15b53074f84dbe62338c4da0ba509a502d421/addon/experiment-api/msgWindowApi.js#L247-L267

If I remember correctly, doing the add & cleanup in those listeners meant everything was handled better anyway.

For module cleanup, I do it in onShutdown, as then you can detect if the app is shutting down and avoid the unnecessary work: https://github.com/thunderbird-conversations/thunderbird-conversations/blob/5cb15b53074f84dbe62338c4da0ba509a502d421/addon/experiment-api/api.js#L228-L248

Thanks, Mark. Yes, the "close of the old one can happen after the addition of the new one" seems to be the case here. We'd have to take a closer look to confirm.

So basically you're saying that we should react to callOnClose() but instead do removal at onShutdown(). But how to remove what we loaded into the window to replace it with a new version? Like here:
https://pep-security.lu/dev/repos/pEp_for_Thunderbird/file/5d5772ff8e86/addon/experiments.js#l201 - add it
https://pep-security.lu/dev/repos/pEp_for_Thunderbird/file/5d5772ff8e86/addon/experiments.js#l232 - remove it.

Anyway, maybe we should move this discussion elsewhere.

So this is what happens: After copying the XPI into the profile's extension folder and starting TB with -purgecaches, all that, for a quick development cycle, some version of the add-on (old or now) happily does some stuff, and then many seconds into the process, callOnClose() and shutdown() arrive while TB is happily running. That really looks like a bug and would also make Conversations malfunction. As far as we can see, those calls only sometimes arrive and they don't seem to arrive for 68/78.

Summary: Add-on manager seemingly randomly delivers callOnClose() callback leading to add-on self-unload → callOnClose() and shutdown() arrive sometimes after add-on was already started

So this can happen when the add-on is being updated by placing a newer version into the profile's extension folder? Has it been observed if this also happens, when the add-on is updated using the default or a custom updateURL?

I am fighting with this add-on toolkit code all day long and if this is really an issue with the loader, we probably cannot fix that on our own, but you need to file that against m-c.

(In reply to John Bieling (:TbSync) from comment #16)

So this can happen when the add-on is being updated by placing a newer version into the profile's extension folder?

Yes, but only observed in TB 81 lately. Copying/purgecaches as been used in the development cycle for many months in TB 68/78.

Has it been observed if this also happens, when the add-on is updated using the default or a custom updateURL?

Not tested.

I am fighting with this add-on toolkit code all day long and if this is really an issue with the loader, we probably cannot fix that on our own, but you need to file that against m-c.

Thanks for looking. Correct, we need to give a precise statement of what is going wrong and then file a bug in M-C unless we're doing something illegal. Meanwhile I suggest to replace an add-on which has the two callbacks the way that was described (during the development cycle) and see if the behaviour can be confirmed. The behavious appears firstly changed from 68/78 and secondly at random which hints towards a bug.

Some logging:
16:17:01.657 [DEBUG] pEpHdrView: init()
16:17:01.659 [INFO] pEpForThunderbird.init(): Window count now 1 pEpForThunderbird.js:80:14
Then a lot of stuff happens, including XHR calls, sending out a message, etc. And then one second after startup:
16:17:02.654 WebExtensions: pEp4Tb: close() called experiments.js:150
16:17:02.655 WebExtensions: pEp4Tb: onShutdown() called experiments.js:85
That kills the add-on.

Bisecting led us to bug 1639705 comment #15. The behaviour changed between the Dailies of 21-07-2020 and 22-07-2020.
Running thunderbird.exe -p -no-remote -allow-downgrade -purgecaches after copying the add-on to the profile's extension folder leads to a call of the onShutdown() handler. Setting pref extensions.startupScanScopes to 5 instead of default 4 makes the issue go away. In summary, this appears to be a bug.

Regressed by: 1639705
Summary: callOnClose() and shutdown() arrive sometimes after add-on was already started → callOnClose() and shutdown() arrive on first launch with -purgecaches after add-on has been copied to extensions folder
Has Regression Range: --- → yes

This add-on load code is toolkit and must be fixed in m-c. Either move this bug to toolkit or create a new one.

Component: Add-Ons: Extensions API → Add-ons Manager
Product: Thunderbird → Toolkit
Summary: callOnClose() and shutdown() arrive on first launch with -purgecaches after add-on has been copied to extensions folder → callOnClose() and shutdown() arrive on first launch with -purgecaches after add-on has been copied to extensions folder (skip to comment #19)
Version: Thunderbird 81 → 80 Branch

Set release status flags based on info from the regressing bug 1639705

Is there a STR with Firefox (or xpcshell test)? Otherwise, this probably belongs in Thunderbird.

Flags: needinfo?(bugzilla)
Attached file addon.xpi

Thanks for your comment. Yes, the issue is reproducible in Firefox, but you have to set some preferences to make it look more like Thunderbird.

STR:

Set prefs as follows:
extensions.experiments.enabled true
xpinstall.signatures.required false
extensions.startupScanScopes 4

Start FF Daily and install the add-on. Close FF. Visit your profile's extensions directory, modify the file date of the XPI (using touch, for example). Start FF again. Check the console:
WebExtensions: pEp4Tb: onStartup() called
WebExtensions: pEp4Tb: onShutdown() called

The second call is not expected and undesired.

Flags: needinfo?(bugzilla)

Hello,

I’ve managed to reproduce the issue based on the STR provided in Comment 23 on the latest Nightly (82.0a1/20200917142508), Beta Developer Edition (81.0b9/20200910180444) and Unbranded Release (80.0.1/20200831163820) under Windows 10 Pro 64-bit and Ubuntu 16.04 LTS.

Launching the browser after changing the file date of the .xpi, both calls mentioned in Comment 23 are shown in the console. See the attached screenshot.

As the version status for the issue has already been set, I’ll only change the bug to New. Thank you !

Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached image 2020-09-18_10h56_27.png
Summary: callOnClose() and shutdown() arrive on first launch with -purgecaches after add-on has been copied to extensions folder (skip to comment #19) → callOnClose() and onShutdown() arrive on first launch with -purgecaches after add-on has been copied to extensions folder (skip to comment #19)
Flags: needinfo?(mixedpuppy)

In this situation we used to uninstall then install, however that presented other problems. The fix for those issues however, is not restarting the addon when doing the update call.

https://searchfox.org/mozilla-central/rev/b58ca45005fe02077c92779483d1b60e9a49687c/toolkit/mozapps/extensions/internal/XPIDatabase.jsm#3431

Assignee: nobody → mixedpuppy
Severity: -- → S2
Flags: needinfo?(mixedpuppy)
Priority: -- → P1

--- deleted old comment where I thought this was a non-issue ----

Summary: callOnClose() and onShutdown() arrive on first launch with -purgecaches after add-on has been copied to extensions folder (skip to comment #19) → sideloaded updates in the profile directory may not startup correctly

The STR in comment 23 can be modified to exclude the change to the startupScanScopes preference, it works (fails) without that.

Attachment #9177749 - Attachment description: Bug 1664144 fix addon update during startup with different startupScanScopes → Bug 1664144 fix startup of addons updated during app startup

This is poorly documented, but if its possible that an extension is changed on disk outside the control of the app, the app needs to know to check for changes on startup. That is, you either need to perform updates from within the app or change the extensions.startupScanScopes pref (in this case, change it from 4 to 5 to include SCOPE_PROFILE). Reporter, if you change that pref, is the problem fixed?

Flags: needinfo?(bugzilla)

Yes, as per comment #19 (quote):

Setting pref extensions.startupScanScopes to 5 instead of default 4 makes the issue go away.

Flags: needinfo?(bugzilla)

(hit Save Changes too quickly)
It still appears peculiar that a shutdown() event is dispatched for a running add-on regardless of pref values.

Shane, should that be a P2 given that we shipped multiple releases with this bug?

Flags: needinfo?(mixedpuppy)
Flags: needinfo?(mixedpuppy)
Priority: P1 → P2
Blocks: 1672248
Blocks: 1680360

Comment on attachment 9178509 [details]
Bug 1664144 check sideloaded addons early in startup during initial scan to address startup cache consistency

Revision D91778 was moved to bug 1680360. Setting attachment 9178509 [details] to obsolete.

Attachment #9178509 - Attachment is obsolete: true
Pushed by scaraveo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/20e9f56119c8
fix startup of addons updated during app startup r=zombie,rpl
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 85 Branch
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: