Open Bug 1363759 Opened 4 years ago Updated 3 years ago

SHIELD "shield-study" pings don't have addons in the environment (breaks about:telemetry)

Categories

(Shield :: Shield Study, defect)

defect
Not set
normal

Tracking

(Not tracked)

People

(Reporter: benjamin, Unassigned)

Details

STR:
Run a build with an active SHIELD experiment
Quit the build
Run it again
Visit about:telemetry
Choose "Archived Ping Data"

Actual:
Archived ping data is broken and there is no selector for an archived ping to load
Message in the JS console: TypeError: addons is undefined[Learn More]  aboutTelemetry.js:720:5

This is https://hg.mozilla.org/mozilla-central/annotate/b21b974d60d3075ae24f6fb1bae75d0f122f28fc/toolkit/content/aboutTelemetry.js#l720

I loaded the ping by hand using TelemetryArchive, and it's a shield-study ping. This ping has an environment section, but no addons block. I'm going to attach the entire ping contents.

I suspect that this is because shield is sending the ping "too late" after addons have already started shutting down, and that's throwing an exception which ends up messing up the environment, but I'm not sure of this because it's all in shutdown and I can't see the console.

I believe there are several things to fix here, but I'd like gfritzsche or Dexter to decide how best to organize the bug tree:
* unexpected exceptions in aboutTelemetry.js rendering shouldn't affect the history view or prevent me from at least viewing the raw data
* the aboutTelemetry.js code should be much more forgiving of missing data
* either the shield code needs to send its ping sooner or the telemetry code needs to handle this correctly
(In reply to Benjamin Smedberg [:bsmedberg] from comment #0)
> STR:
> Run a build with an active SHIELD experiment
> Quit the build
> Run it again
> Visit about:telemetry
> Choose "Archived Ping Data"
...
> I loaded the ping by hand using TelemetryArchive, and it's a shield-study
> ping. This ping has an environment section, but no addons block. I'm going
> to attach the entire ping contents.
> 
> I suspect that this is because shield is sending the ping "too late" after
> addons have already started shutting down, and that's throwing an exception
> which ends up messing up the environment, but I'm not sure of this because
> it's all in shutdown and I can't see the console.

Michael, can you point me to where those shield-study pings are submitted from?
And what event/notification/... they hang off from?
Do you have simple steps on how we could run a SHIELD study to check what is happening?
Flags: needinfo?(mkelly)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #0)
> * unexpected exceptions in aboutTelemetry.js rendering shouldn't affect the
> history view or prevent me from at least viewing the raw data

Filed bug 1363828.

> * the aboutTelemetry.js code should be much more forgiving of missing data

Good point. We're about to do a re-design of about:telemetry as part of GSoC, this is best left for that project.
The shield-study pings are sent by individual SHIELD study add-ons; the system add-on only sends heartbeat pings. Some studies use https://github.com/mozilla/shield-studies-addon-utils/ to send pings, but they may also be sending the pings manually. We'll have to find a specific study add-on to test with.

gregglind: What studies are currently/recently active, and where can we find their source code / XPIs?
Flags: needinfo?(mkelly) → needinfo?(glind)
The addon in question is the Flash study, which code lives at https://github.com/squarewave/shield-study-essential-flash

The code in this repository doesn't send the shield-study ping, so I assumed it was some other common lib. I don't see a reference to the string "shield-studies-addon-utils" in this repository, though.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4)
> The code in this repository doesn't send the shield-study ping, so I assumed
> it was some other common lib. I don't see a reference to the string
> "shield-studies-addon-utils" in this repository, though.

It's in the devDependencies in package.json. But, from looking at the code, it manually sends telemetry pings of the type 'flash-shield-study': https://github.com/squarewave/shield-study-essential-flash/blob/master/lib/dataCollection.js#L279

It keeps the normal study-lifecycle hooks from the util library though, which calls a "report" function that sends pings with the type 'shield-study': https://github.com/mozilla/shield-studies-addon-utils/blob/master/lib/index.js#L194

So if the offending pings are 'shield-study' pings, then the culprit is the util library; if they're 'flash-shield-study' pings, then it's the custom ping code in the add-on.
Flags: needinfo?(glind)
as noted in comment 0 this is a "shield-study" ping (which has an environment block but incomplete), not a flash-shield-study ping which doesn't have an environment block.

The ping is being sent via this:

this.on("normal-shutdown"...)

I don't know when this happens in the addon SDK. But since the SDK is going away RSN, what are shield studies going to do instead?
(In reply to Benjamin Smedberg [:bsmedberg] from comment #6)
> I don't know when this happens in the addon SDK. But since the SDK is going
> away RSN, what are shield studies going to do instead?

gregglind is working on a version of the utils that won't rely on the SDK right now. Studies in the future will be bootstrapped add-ons (signed using a special key so that they work past the WebExtensions cutoff; see bug 1336576), potentially with embedded WebExtensions for features that can be handled by them.
Priority: -- → P3
I'm moving this to the SHIELD product for now, the assumption being that SHIELD submits those pings too late in shutdown.
Feel free to move this back if needed.

SHIELD should submit the pings blocking AsyncShutdown.sendTelemetry:
https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/AsyncShutdown.jsm

(the observer topic that shuts down Telemetry is "profile-before-change-telemetry")
Component: Telemetry → General
Priority: P3 → --
Product: Toolkit → Shield
Component: General → Shield Study
You need to log in before you can comment on or make changes to this bug.