Closed Bug 1367823 Opened 3 years ago Closed 3 years ago

Delay Telemetry experiments addon uninstalls until after "sessionstore-windows-restored"

Categories

(Toolkit :: Telemetry, defect, P2)

55 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 - wontfix
firefox56 --- fixed

People

(Reporter: u279076, Assigned: aswan)

References

Details

(Keywords: regression)

Attachments

(1 file)

Attached image Screenshot
[Originally discovered via bug 1367246]

With recent changes in Nightly (possibly bug 1356826 and/or bug 1358846) Telemetry Experiments now display a warning to the user about being sideloaded. Initially when the add-on is deployed it is enabled but some time later a warning icon will appear in the menu button. If you click through you'll be brought to the addons manager which shows the experiment has been disabled and a doorhanger (see screenshot) appears asking if the user wants to enable it.

Obviously this will severely impact the ability of Telemetry Experiments to run successfully and provide reliable data. There was some assumption on bug 1367246 that Telemetry Experiments were being deprecated but I've not yet seen public documentation of this decision nor of some deprecation deadline having been passed. In addition, there are still experiments and plans for experiments in the pipeline.
Kris, flagging you so that you're aware of this issue.
Flags: needinfo?(kmaglione+bmo)
Assignee: nobody → kmaglione+bmo
Flags: needinfo?(kmaglione+bmo)
Priority: -- → P1
Whiteboard: triaged
I can't reproduce this issue. Can you reproduce it in a fresh profile, or give me more details?
(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #3)
> I can't reproduce this issue. Can you reproduce it in a fresh profile, or
> give me more details?
Flags: needinfo?(anthony.s.hughes)
Georg or dexter, can you take a look at this? I believe we do have experiments planned for 55 beta and release. Thanks.
Flags: needinfo?(gfritzsche)
Flags: needinfo?(alessio.placitelli)
(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #3)
> I can't reproduce this issue. Can you reproduce it in a fresh profile, or
> give me more details?

This doesn't always happen immediately. The warning usually appears after a Nightly restart and some amount of idle time. Sometimes I've seen it show up after a few minutes, sometimes the following day. I suspect that's because of the random amo ping timer.

You should be able to test this by setting up a local Telemetry Experiment Server instance and apply the code from bug 1367246. Since the client for my experiment is Windows-only you'll need to test with a Windows client.
Flags: needinfo?(anthony.s.hughes)
I think the addons team is best equipped to look at this, as we don't know the code that triggers these warnings.
We should not show any of these warnings for the experiments addon type.
Flags: needinfo?(gfritzsche)
Flags: needinfo?(alessio.placitelli)
(In reply to Anthony Hughes (:ashughes) [GFX][QA][Mentor] from comment #6)
> This doesn't always happen immediately. The warning usually appears after a
> Nightly restart and some amount of idle time. Sometimes I've seen it show up
> after a few minutes, sometimes the following day. I suspect that's because
> of the random amo ping timer.

I don't think this has anything to do with AMO.  We explicitly skip the update check for experiments:
http://searchfox.org/mozilla-central/rev/507743376d1ba753d14ab6b9305b7c6358570be8/toolkit/mozapps/extensions/internal/XPIProvider.jsm#5574-5578

I'm also surprised by comment 6 -- the check for sideloaded extensions happens at startup so having it happen after a lengthy amount of time is strange.  The notification in the screenshot is typically shown after selecting an item in the hamburger menu.  Can you confirm that that's what you did right before taking the screenshot?  And based on comment 6 do I understand correctly that the hamburger menu item for the experiment was not there at startup but it appeared after some period of time?
Flags: needinfo?(anthony.s.hughes)
(In reply to Andrew Swan [:aswan] from comment #8)

Hi Andrew,

I saw the (!) icon on top of the Menu button in the toolbar. When I clicked the Menu button there was a message in the menu panel saying my experiment addon was disabled. Clicking that took me to the about:addons page which showed me the doorhanger in the screenshot. I'll try to run through the steps again locally with the latest Nightly to see if I can get you updated steps to reproduce using my code. 

In case it matters, let me remind you that I am testing this on a Windows 10 client as that's what my experiment means to target. I've not tried to reproduce this on any other platform.
Flags: needinfo?(anthony.s.hughes)
I've done some further testing and haven't been able to reproduce it just letting my Nightly idle. Here are the steps that reproduce the behaviour reliably:
1) Make sure there's no experiments.json in your profile then start Nightly
2) Force an Experiments ping through Browser Console
3) Load about:addons through the Menu button and confirm there's an experiment listed as active
> Experiment appears as active
4) Restart Nightly then load Menu > Addons to check the experiment still appears active
> Experiment appears as active
5) Quit Nightly and delete experiments.json from the profile then start Nightly
> (!) appears in Menu button and panel
6) Load about:addons
> Doorhanger warning appears
While this probably isn't a truly valid user flow, it is what I've used in the past to test/retest experiments and start anew without starting on a new profile every time. 

To reproduce the idle behaviour I've run through steps 1-4 and am I'm going to leave my Nightly running again overnight to see if the experiment is still active in the morning.
The experiment was still running when I checked this morning so I guess the normal scenario here is not an issue anymore. There is still the issue where an experiment will be flagged as sideloaded if someone deletes their experiments.json file though.

As a note, when the experiment is first installed the addon XPI state has a parameter "foreignInstall:false" as reported in the Browser Console. When the warning appears this parameter has changed to "foreignInstall:true". The user has to go to about:addons to clear the warning despite the experiment no longer being installed. As noted in comment 10, deleting the experiments.json file is how we used to retest experiment installations on the same profile.
Based on recent comments, this does not sound like an add-ons manager bug.
It still sounds suspicious though I'll keep following and if you're able to reproduce without touching experiments.json, please ping me and/or Kris.
Assignee: kmaglione+bmo → nobody
Component: Add-ons Manager → Telemetry
(In reply to Andrew Swan [:aswan] from comment #12)
> Based on recent comments, this does not sound like an add-ons manager bug.
> It still sounds suspicious though I'll keep following and if you're able to
> reproduce without touching experiments.json, please ping me and/or Kris.

Andrew, this kind of warning comes from addon manager code or its front-end, i wouldn't know where that is triggered.
That code should not trigger any of these warnings for experiments.
I'm not sure how this is not an addon manager bug, can you expand?

If you can point to any misuse of AOM APIs or so from experiments code, i'm happy to take a look though.
Flags: needinfo?(aswan)
(In reply to Georg Fritzsche [:gfritzsche] from comment #13)
> Andrew, this kind of warning comes from addon manager code or its front-end,
> i wouldn't know where that is triggered.
> That code should not trigger any of these warnings for experiments.
> I'm not sure how this is not an addon manager bug, can you expand?

Well the boundary between telemetry experiments and the add-ons manager is hazy (there's a bunch of special case code for experiments in eg XPIProvider.jsm that I don't really understand, which I assume was added and is maintained by the folks who know more about experiments).  Also, the most recent STR (see comment 11) point toward experiments.json, I'm unfamiliar with the code that handles that file, and that code appears to be in browser/experiments/Experiments.jsm.  The trail right now leads into Experiments.jsm where it would ideally be followed by somebody who actually knows that code, I'd be happy to help if the trail eventually leads back into the add-ons manager.
Flags: needinfo?(aswan)
(In reply to Andrew Swan [:aswan] from comment #14)
> The trail right now leads into
> Experiments.jsm where it would ideally be followed by somebody who actually
> knows that code, I'd be happy to help if the trail eventually leads back
> into the add-ons manager.

I don't know where that warning comes from (in AOM) and it seems to have steps to reproduce.
The code that triggers this warning is in AOM, which is not my expertise.
Can you or somebody else with addons expertise investigate what is causing this or provide next steps?

FWIW:
We originally built the experiments code in the addon manager, but don't maintain it (as far as i understand).
There were no recent relevant changes to the experiments code, so i assume changes to addons code caused this (hard to be sure).
I think changing experiments.json is for STR and not the root cause.
If specific Experiments.jsm code is triggering this, i'm happy to take a look.
Flags: needinfo?(aswan)
(In reply to Georg Fritzsche [:gfritzsche] from comment #15)
> I think changing experiments.json is for STR and not the root cause.

Removing experiments.json is almost certainly the remote cause. The sideload notifications are caused by us detecting the presence of an add-on at startup that isn't present in the database, and removing experiments.json is exactly the kind of thing that I would expect to cause that.
Catching up with aswan on IRC, there is a theory of what could happen here with the STR:
With experiments.json removed, we hit this code:
http://searchfox.org/mozilla-central/source/browser/experiments/Experiments.jsm#1200-1208

If we would trigger uninstall and remove addon state from the AOM, but not remove the actual XPI, it could look like a side-loaded extension.

Anthony, would you mind reproducing this with the following logging enabled:
- pref extensions.logging.enabled;true
- pref experiments.logging.level;"Trace" [1]

It should be specifically helpful to see the log output around this log line, if we see it:
"..._evaluateExperiments() - unknown add-ons in AddonManager: ..."

[1] https://wiki.mozilla.org/QA/Telemetry/Logging
Flags: needinfo?(aswan) → needinfo?(anthony.s.hughes)
https://pastebin.mozilla.org/9023786: Log w/ active experiment, prior to deleting experiments.json
https://pastebin.mozilla.org/9023787: Log after deleting experiments.json and restarting Nightly
Flags: needinfo?(anthony.s.hughes)
Hmm:

1496863833136   addons.xpi WARN     Attempted to remove unaccelerated-compositor-process-nightly55@experiments.mozilla.org from app-profile but it was already gone

followed shortly thereafter by

1496863834569   addons.xpi DEBUG    New add-on unaccelerated-compositor-process-nightly55@experiments.mozilla.org in app-profile

Arg, this looks like uninstall() is getting called before the DirectoryInstallLocation has been fully initialized.

We should guard against this in the add-ons manager but it would also be good if experiments could defer this activity until after startup.  Georg, is that feasible?
Flags: needinfo?(gfritzsche)
(In reply to Andrew Swan [:aswan] from comment #19)
> Arg, this looks like uninstall() is getting called before the
> DirectoryInstallLocation has been fully initialized.
> 
> We should guard against this in the add-ons manager but it would also be
> good if experiments could defer this activity until after startup.  Georg,
> is that feasible?

We should not start this activity until "profile-after-change". Is that too early?
http://searchfox.org/mozilla-central/rev/1a054419976437d0778a2b89be1b00207a744e94/browser/experiments/ExperimentsService.js#79

The problem is that experiments should be able to run early in startup to allow active experiments to act early in the session (for customizations, pref flips, ...).
As far as i know we haven't defined exactly how early that needs to be.
Flags: needinfo?(gfritzsche)
(In reply to Georg Fritzsche [:gfritzsche] from comment #20)
> (In reply to Andrew Swan [:aswan] from comment #19)
> > Arg, this looks like uninstall() is getting called before the
> > DirectoryInstallLocation has been fully initialized.
> > 
> > We should guard against this in the add-ons manager but it would also be
> > good if experiments could defer this activity until after startup.  Georg,
> > is that feasible?
> 
> We should not start this activity until "profile-after-change". Is that too
> early?
> http://searchfox.org/mozilla-central/rev/
> 1a054419976437d0778a2b89be1b00207a744e94/browser/experiments/
> ExperimentsService.js#79
> 
> The problem is that experiments should be able to run early in startup to
> allow active experiments to act early in the session (for customizations,
> pref flips, ...).
> As far as i know we haven't defined exactly how early that needs to be.

We try to defer as much as possible to after sessionstore-windows-restored.
Having a new experiment installed early is understandable, but can removing old unwanted experiments be deferred?  That would address this specific bug.
We decided for aswan to break the AOM part out into a new bug, as that seems fairly well scoped now.
Thanks for the data Anthony.

For the Experiments part, just delaying uninstalls until after "sessionstore-windows-restored" would be good enough.
I can take a look next week how much work that would actually be with the current Experiments code and if that's feasible.
Priority: P1 → P2
Flags: needinfo?(gfritzsche)
(In reply to Georg Fritzsche [:gfritzsche] from comment #22)
> We decided for aswan to break the AOM part out into a new bug, as that seems
> fairly well scoped now.

This is bug 1371363.
See Also: → 1371363
Summary: Telemetry Experiment deactivated, warns about being side-loaded → Delay Telemetry experiments addon uninstalls until after "sessionstore-windows-restored"
Flags: needinfo?(gfritzsche)
Whiteboard: triaged
Flags: needinfo?(gfritzsche)
From email discussion with Anthony, sounds like whatever's left here will only show to a few users (edge case if someone's deleted their experiments.json file). So this doesn't need to block the experiment deployment.
This was fixed at the same time as bug 1358907, specifically the code here:
https://hg.mozilla.org/mozilla-central/rev/12cfb7c5610f#l1.21
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
(In reply to Andrew Swan [:aswan] from comment #25)
> This was fixed at the same time as bug 1358907, specifically the code here:
> https://hg.mozilla.org/mozilla-central/rev/12cfb7c5610f#l1.21

Hi Andrew, should we uplift the relevant fixes to Beta55? There are several experiments planned for 55 cycle and if the data is unreliable without these fixes, it makes sense to uplift them to 55. Wdyt of the risk?
Flags: needinfo?(aswan)
Without getting into how risky the uplift would be, I don't think this bug affects the quality of data gathered from telemetry experiments.  It is just about startup performance and the bogus sideloaded extension warning described in the original bug.  My sense is that those consequences aren't severe enough to justify uplift.
Flags: needinfo?(aswan)
Flags: needinfo?(gfritzsche)
Assignee: nobody → aswan
Depends on: 1358907
Target Milestone: --- → mozilla56
Blocks: 1394903
You need to log in before you can comment on or make changes to this bug.