Closed Bug 1471647 Opened 5 years ago Closed 5 years ago

Telemetry for ASan Nightly Project

Categories

(Firefox :: General, enhancement)

x86_64
All
enhancement
Not set
critical

Tracking

()

RESOLVED FIXED
Firefox 63
Tracking Status
firefox63 --- fixed

People

(Reporter: decoder, Assigned: decoder)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

As discussed on IRC, we need basic telemetry for the ASan Nightly Project in order to be able to see how many users are actually using it (preferably before we go live).

Even though the builds receive their own updates, they still identify as "nightly" update channel and have Telemetry currently disabled. Before we can enable Telemetry, we probably have to change some code to be able to override the channel that is reported via Telemetry so these results don't get mixed with the regular Nightly results.


One option would be to add an override pref and patch toolkit/components/telemetry/TelemetryController.jsm to use this pref instead of UpdateUtils.getUpdateChannel if it is set. We can then set the pref by default in the MOZ_ASAN_REPORTER builds.
Playing this question from the other side: Why -don't- we want these results mixed with other Nightly data? Could we not treat these as nightly users running on different platforms?
(In reply to Chris H-C :chutten from comment #1)
> Playing this question from the other side: Why -don't- we want these results
> mixed with other Nightly data? Could we not treat these as nightly users
> running on different platforms?

That is a decision I cannot make. Note that ASan Nightly has different properties when it comes to speed and memory consumption, compared to the regular Nightly builds. So e.g. their performance is not comparable. Also, how would they identify as a different platform? Their OS/Platform/Channel settings are all identical to regular Nightly.
I wonder if the difference will be more/less of a difference than the situation we're already in with people changing about:config flags however they'd like. Nightly is the Wild West, I'm not sure ASan will make too much difference so long as we put something in the Environment that lets us know.

Of course, this is just my perspective. If this has already been decided, I apologize for the noise.
(In reply to Chris H-C :chutten from comment #3)
> I wonder if the difference will be more/less of a difference than the
> situation we're already in with people changing about:config flags however
> they'd like. Nightly is the Wild West, I'm not sure ASan will make too much
> difference so long as we put something in the Environment that lets us know.

Fair enough. ASAN builds consume much more memory and are way slower. As Christian mentioned, since Telemetry is disabled, they don't know how many people are running them. If it's a significant number (compared to the size of the Nightly channel), I would expect alerts to be raised. Is this a concern for cerberus? If so, would having a flag in the Environment help? Would we need to filter out these builds in other analyses as well?

> Of course, this is just my perspective. If this has already been decided, I
> apologize for the noise.

All perspectives are welcome :)
Flags: needinfo?(chutten)
(In reply to Alessio Placitelli [:Dexter] from comment #4)
> If it's a significant number (compared to the size
> of the Nightly channel), I would expect alerts to be raised. 

Right now, the number should be extremely low, but we have no clue what will happen once we announce the project to a wider audience. My initial talking to :mreid seemed to confirm that this will trigger alerts on Telemetry and that in the current state, it will be hard to differentiate those builds from the regular ones.

Also, it is currently Linux only but we are already working on Windows support, which might increase the number of users further.


> 
> > Of course, this is just my perspective. If this has already been decided, I
> > apologize for the noise.
> 
> All perspectives are welcome :)

I agree, thanks for the input! And I would be the first to cheer if we could just enable Telemetry without any further code changes ;)
If presented as nightly, nightly-asan data will be aggregated and the aggregated data will be used by cerberus. Whether it would raise alerts is a question of whether or not the data from nightly-asan is noticeably different from that of nightly, and if there's enough of it to quickly shift the distribution of performance probes. Given the operating assumptions that the population is presently small and adoption would not be immediate.

Either way, I presume we'd -want- alerting (even imperfect alerting like cerberus provides) on ASAN if only to identify changes that make nightly-asan worse enough that we'll lose population.

I'm in favour of turning Telemetry on for nightly-asan without modification. :mreid, you're cited here as a reason for being cagey. Would you mind sharing your rationale for it?
Flags: needinfo?(chutten) → needinfo?(mreid)
My rationale is that ASan is expected to have markedly different performance characteristics from plain Firefox, so telemetry data from this population will likely add noise to signals we currently use to detect regressions on nightly.

Without an easy way to differentiate this cohort from the rest of nightly, we hinder our ability to identify regressions in both populations. 

Further, ASan builds actually do have their own update channel (as I understand it), so their build ids may not directly relate to those of the official nightly channel, so it seems sensible to differentiate this channel using a separate name.

I guess the question is whether this should truly be considered a separate update channel. If so, then we should report it as such. If not, we should probably identify this population some other way - tagging the environment similarly to what we do with experiments / normandy might do the trick.
Flags: needinfo?(mreid)
(In reply to Mark Reid [:mreid] from comment #7)
> 
> Further, ASan builds actually do have their own update channel (as I
> understand it), so their build ids may not directly relate to those of the
> official nightly channel, so it seems sensible to differentiate this channel
> using a separate name.


This is correct, these are entirely separate builds with their own update channel, but it was easier for RelEng to achieve this with leaving the actual update channel on "nightly" and differentiate this on their side (I think using a different MAR channel id).
The attached two patches realize the solution with distinguishing the builds by channel name. If you happen to decide that this isn't required (or that you want to distinguish them by some other means), let me know and I'll adjust accordingly. Thanks!
(In reply to Mark Reid [:mreid] from comment #7)
> Further, ASan builds actually do have their own update channel (as I
> understand it), so their build ids may not directly relate to those of the
> official nightly channel, so it seems sensible to differentiate this channel
> using a separate name.

Christian, do you know who might be able to confirm this?

> I guess the question is whether this should truly be considered a separate
> update channel. If so, then we should report it as such. If not, we should
> probably identify this population some other way - tagging the environment
> similarly to what we do with experiments / normandy might do the trick.

Yes, I agree with you on this. Let's also consider that exposing some other way (e.g. tagging the environment) will require changes to the datasets to expose such field as well. This is not a big deal, but it's a dependency that needs to be taken care of.
Flags: needinfo?(choller)
(In reply to Alessio Placitelli [:Dexter] from comment #14)
> (In reply to Mark Reid [:mreid] from comment #7)
> > Further, ASan builds actually do have their own update channel (as I
> > understand it), so their build ids may not directly relate to those of the
> > official nightly channel, so it seems sensible to differentiate this channel
> > using a separate name.
> 
> Christian, do you know who might be able to confirm this?

Whoops, looks like this was confirmed in comment 8. Chris, does this and comment 7 help with your concerns?
Flags: needinfo?(choller) → needinfo?(chutten)
Since it is using a separate channel to update it only makes sense that it advertise a separate channel through Telemetry. All's good.
Flags: needinfo?(chutten)
Comment on attachment 8988607 [details]
Bug 1471647 - Enable Telemetry on ASan Nightly Reporter builds.

https://reviewboard.mozilla.org/r/253844/#review261234

::: browser/config/mozconfigs/linux64/nightly-asan-reporter:24
(Diff revision 3)
>  
>  # Need this to add source information into platform.ini
>  export MOZILLA_OFFICIAL=1
>  
> +# Enable Telemetry
> +# The build will identify itself as "nightly-asan" using the override pref

nit: let's be a bit more specific here, mentioning that the channel reported by telemetry will be "nightly-asan", while the normal channel (used for updates, etc.) will be "whateveritis" :)
Comment on attachment 8988609 [details]
Bug 1471647 - Add a pref to override the Telemetry update channel.

https://reviewboard.mozilla.org/r/253846/#review261236

::: toolkit/components/telemetry/tests/unit/test_TelemetryUtils.js:178
(Diff revision 1)
>    Assert.ok(ObjectUtils.deepEqual(EXPECTED_DATA_TEST, packedKeyedHistogramsTest),
>              "Packed data must be correct.");
>  });
> +
> +add_task(async function testUpdateChannelOverride() {
> +  if (Preferences.has(TelemetryUtils.Preferences.OverrideUpdateChannel)) {

This is not needed in mozilla central, right? Let's remove lines from 178 to 182

::: toolkit/components/telemetry/tests/unit/test_TelemetryUtils.js:185
(Diff revision 1)
> +    // that makes use of the override pref. For testing purposes, unset the pref.
> +    Preferences.set(TelemetryUtils.Preferences.OverrideUpdateChannel, "");
> +  }
> +
> +  // Check that we return the same channel as UpdateUtils, by default
> +  Assert.equal(TelemetryUtils.getUpdateChannel(), UpdateUtils.getUpdateChannel(false));

nit: let's add a message to this check, so that we can quickly tell what's going on from the logs if this fails. So this should become 

`Assert.equal(TelemetryUtils.getUpdateChannel(), UpdateUtils.getUpdateChannel(false), "The telemetry reported channel must match the one from UpdateChannel, by default");`

::: toolkit/components/telemetry/tests/unit/test_TelemetryUtils.js:190
(Diff revision 1)
> +  Assert.equal(TelemetryUtils.getUpdateChannel(), UpdateUtils.getUpdateChannel(false));
> +
> +  // Now set the override pref and check that we return the correct channel
> +  const OVERRIDE_TEST_CHANNEL = "nightly-test";
> +  Preferences.set(TelemetryUtils.Preferences.OverrideUpdateChannel, OVERRIDE_TEST_CHANNEL);
> +  Assert.equal(TelemetryUtils.getUpdateChannel(), OVERRIDE_TEST_CHANNEL);

As above, let's add a message to this check as well
Comment on attachment 8988609 [details]
Bug 1471647 - Add a pref to override the Telemetry update channel.

https://reviewboard.mozilla.org/r/253846/#review261236

> This is not needed in mozilla central, right? Let's remove lines from 178 to 182

We do have treeherder ASAN builds, but they are not currently running xpcshell tests. This might change in the future, so let's keep this lines.
Comment on attachment 8988607 [details]
Bug 1471647 - Enable Telemetry on ASan Nightly Reporter builds.

https://reviewboard.mozilla.org/r/253844/#review261624
Attachment #8988607 - Flags: review?(alessio.placitelli) → review+
Comment on attachment 8988609 [details]
Bug 1471647 - Add a pref to override the Telemetry update channel.

https://reviewboard.mozilla.org/r/253846/#review261628
Attachment #8988609 - Flags: review?(alessio.placitelli) → review+
Pushed by choller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ccb58e704f47
Add a pref to override the Telemetry update channel. r=Dexter
https://hg.mozilla.org/integration/autoland/rev/8db2e9b079d9
Enable Telemetry on ASan Nightly Reporter builds. r=Dexter
https://hg.mozilla.org/mozilla-central/rev/ccb58e704f47
https://hg.mozilla.org/mozilla-central/rev/8db2e9b079d9
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 63
You need to log in before you can comment on or make changes to this bug.