Closed Bug 1406392 Opened 7 years ago Closed 7 years ago

Make nsITelemetry.canRecordExtended reflect release/pre-release channels

Categories

(Toolkit :: Telemetry, enhancement, P1)

enhancement

Tracking

()

VERIFIED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: gfritzsche, Assigned: Dexter)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

This is per the preferences changes plan in here: https://docs.google.com/document/d/12U_s9zHvpt7iGEMF5DazSzzx3whGbCnVWEsZTyclta0/edit# canRecordExtended should now be "true" on pre-release channels, "false" on release channels. Fennec/Android may not be able to support this behavior yet. We will need to override this for testing, so we probably have to add a new pref "toolkit.telemetry.testing.prerelease" or so.
No longer depends on: 1406391
Assignee: nobody → alessio.placitelli
Priority: P2 → P1
Here's the different cases I identified: - New profile, beta We can change [1] so that, for platforms where |IS_UNIFIED_TELEMETRY| is set, we set |canRecordExtended| according to the new default setting. We retain the legacy behaviour when |IS_UNIFIED_TELEMETRY| is false (e.g. check Utils.isTelemetryEnabed). - Old profile, extended enabled (beta) We can apply the same logic from the "new profile" case (previous one). - Old profile, extended disabled (beta) Here, things get tricky. We should make a call about how to handle this case. * Determining the new state * canRecordExtended needs to be set to true on pre-release channels, false on release channels. We can use the logic at [2]. According to this, it looks like we already set the pref to the correct state. * Potential impact * - C++ code relying on Telemetry::canRecordExtended to perform specific Nightly-only measurements? - JS test coverage fallout? * Open questions * - Should we make canRecordExtended a read only nsITelemetry property? [1] - http://searchfox.org/mozilla-central/rev/ed1d5223adcdc07e9a2589ee20f4e5ee91b4df10/toolkit/components/telemetry/TelemetryController.jsm#627 [2] - http://searchfox.org/mozilla-central/source/modules/libpref/Preferences.cpp#5176
Hey Ted! This bug is about changing Telemetry behaviour so that "opt-in" probes always get recorded on pre-release channels and never in the Release channel. Is there any build time definition that identifies a "Release" channel build? By looking at [1], I can only find RELEASE_OR_BETA or EARLY_BETA_OR_EARLIER, which do not quite fit my requirements. Looking at the "update channel" is not really ideal: it doesn't seem very reliable as it can change for different reasons (e.g. be "default", "release-cck", ...). [1] - https://wiki.mozilla.org/Platform/Channel-specific_build_defines
Flags: needinfo?(ted)
(In reply to Alessio Placitelli [:Dexter] from comment #2) > Hey Ted! This bug is about changing Telemetry behaviour so that "opt-in" > probes always get recorded on pre-release channels and never in the Release > channel. Clearing this as Ted replied over IRC: "there's a MOZ_UPDATE_CHANNEL define that gets set directly from the --enable-update-channel option". It looks like this is being used for other features as well in the code base, so using this approach for now.
Flags: needinfo?(ted)
The main reason things look like "RELEASE_OR_BETA" now is that the final beta build is effectively the release candidate and gets released, so it's not possible to tell at build time whether a build is "just a beta" or a release. I'm not sure what MOZ_UPDATE_CHANNEL looks like in those cases, though, as obviously we do have separate beta and release populations.
Hey Liz, sorry to bother, I'm not sure if you're the right person to ask this question to: do you know what AppConstants.MOZ_UPDATE_CHANNEL [1] looks like on release candidate builds distributed to the beta population? [1] - http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/toolkit/modules/AppConstants.jsm#310
Flags: needinfo?(lhenry)
No problem - We build that release candidate from the mozilla-release repo so seems likely it is labelled as "release". I'm just pulling from m-r now to check.
Flags: needinfo?(lhenry)
Looks like it is pulling this value from app.update.channel so it would be release, for the RCs. Does that help?
Flags: needinfo?(alessio.placitelli)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #9) > Looks like it is pulling this value from app.update.channel so it would be > release, for the RCs. Does that help? Yes! Thanks Liz, sorry again for the trouble!
Flags: needinfo?(alessio.placitelli)
Comment on attachment 8918913 [details] Bug 1406392 - Make nsITelemetry.canRecordExtended true on pre release channels. https://reviewboard.mozilla.org/r/189808/#review195412 ::: toolkit/components/telemetry/TelemetryController.jsm:631 (Diff revision 2) > const enabled = Utils.isTelemetryEnabled; > Telemetry.canRecordBase = enabled || IS_UNIFIED_TELEMETRY; > + if (IS_UNIFIED_TELEMETRY) { > + // Enable extended Telemetry on pre-release channels and disable it > + // on Release. > + const isPrereleaseChannel = AppConstants.MOZ_UPDATE_CHANNEL != "release"; I'm still not entirely convinced this is super reliable, but I did not find any better option. This seems to be used in other parts of the [codebase](http://searchfox.org/mozilla-central/search?q=MOZ_UPDATE_CHANNEL+%3D%3D&case=false&regexp=false&path=) anyway.
Comment on attachment 8918913 [details] Bug 1406392 - Make nsITelemetry.canRecordExtended true on pre release channels. https://reviewboard.mozilla.org/r/189808/#review195418 A surprisingly-straightforward change for such a monumental shift. ::: toolkit/components/telemetry/TelemetryController.jsm:626 (Diff revision 2) > } > > // Configure base Telemetry recording. > // Unified Telemetry makes it opt-out. If extended Telemetry is enabled, base recording > // is always on as well. > const enabled = Utils.isTelemetryEnabled; Since we're now looking at IS_UNIFIED_TELEMETRY below we can switch this over to something like if (IS_UNIFIED_TELEMETRY) { base = true; extended = <that new logic>; } else { base = extended = Utils.isTelemetryEnabled; }
Attachment #8918913 - Flags: review?(chutten) → review+
Comment on attachment 8918913 [details] Bug 1406392 - Make nsITelemetry.canRecordExtended true on pre release channels. https://reviewboard.mozilla.org/r/189808/#review195412 > I'm still not entirely convinced this is super reliable, but I did not find any better option. This seems to be used in other parts of the [codebase](http://searchfox.org/mozilla-central/search?q=MOZ_UPDATE_CHANNEL+%3D%3D&case=false&regexp=false&path=) anyway. It's all we have. If it means we lose extended data on RC builds, then it means we lose extended data on RC builds.
Comment on attachment 8918913 [details] Bug 1406392 - Make nsITelemetry.canRecordExtended true on pre release channels. https://reviewboard.mozilla.org/r/189808/#review195856 ::: toolkit/components/telemetry/TelemetryController.jsm:629 (Diff revision 3) > // Unified Telemetry makes it opt-out. If extended Telemetry is enabled, base recording > // is always on as well. > - const enabled = Utils.isTelemetryEnabled; > - Telemetry.canRecordBase = enabled || IS_UNIFIED_TELEMETRY; > - Telemetry.canRecordExtended = enabled; > + if (IS_UNIFIED_TELEMETRY) { > + // Enable extended Telemetry on pre-release channels and disable it > + // on Release/ESR. > + const isPrereleaseChannel = !["release", "esr"].includes(AppConstants.MOZ_UPDATE_CHANNEL); One thing that i'm not sure off-hand about: What does this approach mean for e.g. Linux distributions? ::: toolkit/components/telemetry/docs/internals/preferences.rst:153 (Diff revision 3) > > ``toolkit.telemetry.send.overrideOfficialCheck`` > > If true, allows sending pings on unofficial builds. Requires a restart. > > +``toolkit.telemetry.send.overridePreRelease`` *.testing.* ::: toolkit/components/telemetry/docs/internals/preferences.rst:155 (Diff revision 3) > > If true, allows sending pings on unofficial builds. Requires a restart. > > +``toolkit.telemetry.send.overridePreRelease`` > + > + If true, allows recording opt-in Telemetry on the Release channel. Requires a restart. Do we need to override the other way too? Recording only "release"/"opt-out" on pre-release builds?
Comment on attachment 8919641 [details] Bug 1406392 - Whitelist the "overridePreRelease" testing pref in ContentPrefs.cpp. https://reviewboard.mozilla.org/r/190542/#review195864 Bummer. Ah well, we do what we must.
Attachment #8919641 - Flags: review?(chutten) → review+
Comment on attachment 8918913 [details] Bug 1406392 - Make nsITelemetry.canRecordExtended true on pre release channels. https://reviewboard.mozilla.org/r/189808/#review195856 > One thing that i'm not sure off-hand about: > What does this approach mean for e.g. Linux distributions? I've been running some analyses this morning to confirm that, as I had some doubts about the reliability of `MOZ_UPDATE_CHANNEL`. First, a note: while the `app.update.channel` pref can be easily changed, `AppConstants.MOZ_UPDATE_CHANNEL` requires building Firefox again or injecting code. The value of `AppConstants.MOZ_UPDATE_CHANNEL` is populated at compilation time and it's used for the initial value of `app.update.channel` as well. It means that, for environments reporting weird channels through the pref, we might still have a reasonable value in `AppConstants.MOZ_UPDATE_CHANNEL`. I ran a simple [STMO query](https://sql.telemetry.mozilla.org/queries/47690#table) to check the distribution of the reported channel in the *Longitudinal* dataset together with the partner information (that is populated by Linux repacks), if available. It turns out that: - most of the channels behave as expected: we have *release*, *beta*, *aurora* (still!) and *nightly*; - users on Fedora are reporting the *default* channel, which is what we have in our local builds; - users on an openSuse build are reporting *esr52*; - there's a long, long tail of weird/random values in the reported channel. I think puttng weird values in there is purposedly done with the intent to manually disable updates. With that said, I think we should be generally ok with this approach. However, a safest path might be to *whitelist* the channels instead of *blacklisting* them: let's just turn *extended telemetry* on for Nightly, Beta and Aurora. However, I'm not entirely sure how we can completely fix the problem on Linux builds: is the Fedora *default* channel Release? Is it Nightly? Is it a single channel for all the different builds because they're using a different update mechanism? Proposal: change this to a "whitelisting" approach and file follow-up bugs to investigate a fix for the Fedora/openSUSE Linux distributions.
Comment on attachment 8918913 [details] Bug 1406392 - Make nsITelemetry.canRecordExtended true on pre release channels. https://reviewboard.mozilla.org/r/189808/#review195856 > Do we need to override the other way too? > Recording only "release"/"opt-out" on pre-release builds? I don't think that's needed, the current tests work just fine the way it is, since that pref is always true there. Were you thinking of some different test coverage that requires it?
To my knowledge "aurora" is still the correct update channel for developer edition. That it happens to contain beta repacks is of no concern to the update code. I'm good with switching to a whitelisting approach for now. I suspect this will be as far as we get and we will just never receive extended telemetry from Linux rebuilds.
For a history of the pain involved on using the channel name (because Linux distros use "default"), take a look at bug 1348576, and the ugly workaround I had to come up with: https://hg.mozilla.org/mozilla-central/rev/171c6ef99177 Some more details about the channel name: it always use the `default` value of app.update.channel, regardless of what's in about:config. It's possible to programmatically change the default value, but not in a permanent way, and with legacy extensions out, I think it's safe to say that external addons won't be able to muck with that.. However, we do sometimes change that on purpose through code for testing, and to test new releases. QA changes the channel name from release to release-test or something like that to test updates from one version to the other before they are out in the wild.. So bear in mind that while they are doing that (for example), it might also means that their telemetry code changes. What _is_ more reliable is the build-time defines like NIGHTLY, BETA, etc, because they come from the tree being built, and not the setting in about:config. (Not sure if that's a valid suggestion for this case or not.. I'm just doing a braindump here with what I know about the channel name..)
Although if your whitelisting is towards nightly/beta only, it might work just fine.. If it was to whitelist in the other direction (release) then it gets all the complications of "default" for linux, "release-yahoo", etc..
(In reply to :Felipe Gomes (needinfo me!) from comment #22) > For a history of the pain involved on using the channel name (because Linux > distros use "default"), take a look at bug 1348576, and the ugly workaround > I had to come up with: > https://hg.mozilla.org/mozilla-central/rev/171c6ef99177 Ouch, thanks for sharing that Felipe! I see your pain :( > Some more details about the channel name: it always use the `default` value > of app.update.channel, regardless of what's in about:config. It's possible > to programmatically change the default value, but not in a permanent way, > and with legacy extensions out, I think it's safe to say that external > addons won't be able to muck with that.. However, we do sometimes change > that on purpose through code for testing, and to test new releases. > QA changes the channel name from release to release-test or something like > that to test updates from one version to the other before they are out in > the wild.. So bear in mind that while they are doing that (for example), it > might also means that their telemetry code changes. While this is true for the "app.update.channel" pref, I think that "AppConstants.MOZ_UPDATE_CHANNEL" makes things a bit trickier: that is set at build time and its value is used as the default value for "app.update.channel" (that, as you mentioned, can be changed programmatically!). While even the latter could potentially still be messed up by legacy add-ons, this change will live in FF58. In that version these unexpected behaviours should hopefully be less of a problem! :) > What _is_ more reliable is the build-time defines like NIGHTLY, BETA, etc, > because they come from the tree being built, and not the setting in > about:config. (Not sure if that's a valid suggestion for this case or not.. > I'm just doing a braindump here with what I know about the channel name..) I *completely* agree with you. However, I could not find any reasonable define that covered Nightly, Aurora and Beta. See comment 2 and comment 3. (In reply to :Felipe Gomes (needinfo me!) from comment #23) > Although if your whitelisting is towards nightly/beta only, it might work > just fine.. If it was to whitelist in the other direction (release) then it > gets all the complications of "default" for linux, "release-yahoo", etc.. Right! I'm changing this to a whitelisting approach to make it less prone to surprises :-) Thanks for your feedback/braindump Felipe!
Ok, looks like I've sorted out all the test problems. It's time for me to try to land this and write to fx-data-dev :)
Pushed by alessio.placitelli@gmail.com: https://hg.mozilla.org/integration/autoland/rev/5e5b474946b7 Make nsITelemetry.canRecordExtended true on pre release channels. r=chutten https://hg.mozilla.org/integration/autoland/rev/f41df8e8f3f9 Whitelist the "overridePreRelease" testing pref in ContentPrefs.cpp. r=chutten
Depends on: 1411093
Noticed these perf improvements: == Change summary for alert #10131 (as of October 23 2017 07:40 UTC) == Improvements: 2% Explicit Memory summary linux32-stylo-disabled opt stylo-disabled 216,254,747.86 -> 211,005,127.99 2% Heap Unclassified summary linux32-stylo-disabled opt stylo-disabled 41,769,126.09 -> 40,771,218.28 2% Heap Unclassified summary windows10-64 pgo 47,424,956.10 -> 46,360,353.10 2% JS summary windows10-64 opt 110,321,956.90 -> 108,037,978.22 2% Heap Unclassified summary windows10-64 opt 47,408,142.75 -> 46,433,676.12 2% Heap Unclassified summary windows10-64 opt 47,432,539.16 -> 46,477,979.39 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=10131
Depends on: 1411269
No longer depends on: 1411269
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #36) > Noticed these perf improvements: > > == Change summary for alert #10131 (as of October 23 2017 07:40 UTC) == > > Improvements: > > 2% Explicit Memory summary linux32-stylo-disabled opt stylo-disabled > 216,254,747.86 -> 211,005,127.99 > 2% Heap Unclassified summary linux32-stylo-disabled opt stylo-disabled > 41,769,126.09 -> 40,771,218.28 > 2% Heap Unclassified summary windows10-64 pgo > 47,424,956.10 -> 46,360,353.10 > 2% JS summary windows10-64 opt > 110,321,956.90 -> 108,037,978.22 > 2% Heap Unclassified summary windows10-64 opt > 47,408,142.75 -> 46,433,676.12 > 2% Heap Unclassified summary windows10-64 opt > 47,432,539.16 -> 46,477,979.39 > > For up to date results, see: > https://treeherder.mozilla.org/perf.html#/alerts?id=10131 We think this is due to Telemetry being disabled on dev builds due to their channel being "default" rather than "Nightly". We're looking into this and filed bug 1411269.
Verified as fixed in FF58 - Nightly and in Release and Beta channels builds provided by Allesio. The status of the pref was verified in the following platforms and FF versions: Desktop (windows 7 x64) Release - false Beta - true Nightly - true Desktop (MacOS 10.13) Release - fale Beta - true Nightly - true Desktop (Ubuntu 14.04) Release - false Beta - true Nightly - true Android 7.1.2 Release - false Beta - true Nightly - true
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: