Fennec is sending pings with canary client ID

VERIFIED FIXED in Firefox 63

Status

()

defect
P1
normal
VERIFIED FIXED
9 months ago
8 months ago

People

(Reporter: janerik, Assigned: janerik)

Tracking

unspecified
mozilla65
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(geckoview62 unaffected, geckoview63 unaffected, firefox63+ verified, firefox64+ verified, firefox65+ verified)

Details

Attachments

(3 attachments)

After seeing the data in bug 1499753 I figured that the patch from bug 1489520 is causing the client ID to be set to a canary client ID on Fennec.
Fennec does not use unified Telemetry and therefore does not use the preference we check for.
Assignee: nobody → jrediger
Priority: -- → P1
On Beta, we are seeing probably >90% of clients report the canary id: https://sql.telemetry.mozilla.org/queries/59633/source#154232.

We can't know exactly because we can't distinguish individual clients anymore in the canary population, so relative ping counts is our closest comparison.
Hi Chris, does this also impact GV? If yes, we may need an uplift to GV63 relbranch as well.
Flags: needinfo?(cpeterson)
(In reply to Ritu Kothari (:ritu) from comment #3)
> Hi Chris, does this also impact GV? If yes, we may need an uplift to GV63
> relbranch as well.

I don't know.

Frank, does this canary ID issue affect telemetry reported from GeckoView? Or is this Fennec-specific?
Flags: needinfo?(cpeterson) → needinfo?(fbertsch)
No, this does not affect GV, since that client-id lives in the app.

I did whip up this query to ensure that is the case: https://sql.telemetry.mozilla.org/queries/59636/source#154237

There are no clients reporting the canary client-id from GV.
Flags: needinfo?(fbertsch)
(In reply to Frank Bertsch [:frank] from comment #5)
> No, this does not affect GV, since that client-id lives in the app.

Thanks. Setting geckoview63=unaffected.
Blocks: 1501425
Patch attached is reviewed and passes xpcshell tests on a Linux and Android build on try.
A manual try with the built apk in the emulator succeeded as well.

Queued for landing so we get nightly builds to test with tomorrow and opened a follow-up bug for verification.
Pushed by jrediger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/59205b705c5c
Set valid client ID on Fennec if canary is detected r=chutten
Looking at the final green try run and searching through the X1, X2, X3 logs, i can't see find "test_clientid_canary_reset_canary_on_nonunified" in the Android run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f232f87bc636a2cee5aee7655b47a5f783a6eca9&selectedJob=207313966

The failing test is in X12:
https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&selectedJob=207351808&revision=59205b705c5cbba29276d8a342179ff0caebf071

Looking at the log, i see:
> TEST-PASS | toolkit/components/telemetry/tests/unit/test_TelemetryClientID_reset.js | test_clientid_canary_reset_canary_on_nonunified - [test_clientid_canary_reset_canary_on_nonunified : 172] Client ID should be valid and random - "c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0" != "d64d2582-bcbc-445f-9d93-430f200b9e8a"
> "CONSOLE_MESSAGE: (info) 1540333607501	Toolkit.Telemetry	TRACE	TelemetryEnvironment::observe - aTopic: nsPref:changed, aData: toolkit.telemetry.cachedClientID"
> "CONSOLE_MESSAGE: (info) 1540333607676	Toolkit.Telemetry	TRACE	TelemetryEnvironment::observe - aTopic: nsPref:changed, aData: toolkit.telemetry.cachedClientID"
> TEST-UNEXPECTED-FAIL | toolkit/components/telemetry/tests/unit/test_TelemetryClientID_reset.js | test_clientid_canary_reset_canary_on_nonunified - [test_clientid_canary_reset_canary_on_nonunified : 175] "c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0" == "fcdf8e8f-d748-4f70-8703-aebde71a3e2b"

Notice how we see two pref change observations for the cachedClientID.
In the mobile data pod we decided to hold the patch from landing until a meeting on friday.
(While still resolving the test failure, so we can proceed timely from there.)

We'll want to tag the pings with the fact that individual clients were affected by this.
This also affects Fennec code, as this is where the core ping lives.
Frank, you requested tagging outgoing pings of affected clients, so we can tell them apart.
Can you sketch out what kind of information you're looking for?
Flags: needinfo?(fbertsch)
:frank:
In particular would a boolean flag be enough when the client ID is reset from the canary client ID back to a new random client ID?
Should this be persisted and be available in all future pings?
Flags: needinfo?(jrediger)
A boolean flag would be good, and we would want it for the foreseeable future (so same storage and persistence mechanism as client ID).

There are a few analyses we would want to be able to do:
- What is the monthly retention? Leaving out tagged users
- How many new users? Again, leaves out tagged users
- How many users were affected by this bug in total? Check tag
Flags: needinfo?(fbertsch)
FTR We only need it for the core ping. If for some reason we need it for saved-session, we can retrieve it using a JOIN on the core ping.
We erroneously reset client IDs on Fennec to a canary client ID.
This is now detected and a new valid and random client ID is set.

This adds a new boolean attribute "wasCanary" to the `state.json` file
generated by ClientID.jsm.

Depends on D9544
The second patch above adds a "wasCanary" attribute to the persisted states.json (where the client id is stored as well).
Fennec gets the client ID from GeckoProfile[1], which reads this file[2]. It would need to grab the new attribute from the file as well and expose it for the core ping.
The third patch adds this field in the `environment.profile` section of the `saved-session` ping.

[1]: https://searchfox.org/mozilla-central/rev/72b1e834f384a2ffec6eb4ce405fbd4b5e881109/mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryCorePingDelegate.java#157-163
[2]: https://searchfox.org/mozilla-central/rev/72b1e834f384a2ffec6eb4ce405fbd4b5e881109/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#478-490
For completeness:

1. Patch: On Android only: generating a new random client ID when a canary client ID is detected at startup.
(try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=838a2b6b9bfd204768beafe93cbe6d1544569b2b)

2. Patch: On Android only: When regenerating a client ID, detect the previous canary client ID and store a flag. Adds a new "wasCanary" attributed to the persisted states.json when this was detected.

3. Patch: On Android only: Add a field `environment.profile.wasCanary`, used in the `saved-session` ping, using the persisted information from patch 2.
(try run for patch 2+3: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fdb30b6a385ceb8daf35d189c7d86a02d89f6d60)
Clarifiying the data side:

For Fennec clients that had a canary client ID set:
* We generate a new random one
* We set the `wasCanary` flag to true, it is included in the saved-sessiong ping
(It can be included in the core ping as listed above).

For Fennec clients that have a valid client ID set:
* We don't change the client ID
* The `wasCanary` flag is not included in the ping data.
(In reply to Jan-Erik Rediger [:janerik] from comment #21)
> Clarifiying the data side:

Frank, is this good?
Flags: needinfo?(fbertsch)
That does sound good, but we definitely need this to be reported in the core ping before landing.
Flags: needinfo?(fbertsch)
No longer blocks: 1501425
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3ae8da2e4b2b
Set valid client ID on Fennec if canary is detected r=chutten
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/72c4825708d7
Persist information about canary after resetting client ID r=chutten
https://hg.mozilla.org/integration/autoland/rev/db22d59d1d9b
Expose "wasCanary" flag in saved-session pings for Fennec r=chutten
Please nominate this for Beta approval when you get a chance.
Flags: needinfo?(jrediger)
Comment on attachment 9019394 [details]
Bug 1501329 - Set valid client ID on Fennec if canary is detected

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1489520

User impact if declined: Telemetry is sent with a  canary clientId and we are not able to distinguish them. Analysis is impacted.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Bug 1502435

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Patch has been verified manually and by automated test and confirmed as working.
We're monitoring incoming Telemetry data to verify it as well.

String changes made/needed:
Flags: needinfo?(jrediger)
Attachment #9019394 - Flags: approval-mozilla-beta?
Comment on attachment 9020311 [details]
Bug 1501329 - Persist information about canary after resetting client ID

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1489520

User impact if declined: Telemetry is sent with a  canary clientId and we are not able to distinguish them. Analysis is impacted.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Bug 1502435

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Patch has been verified manually and by automated test and confirmed as working.
We're monitoring incoming Telemetry data to verify it as well.

String changes made/needed:
Attachment #9020311 - Flags: approval-mozilla-beta?
Comment on attachment 9020312 [details]
Bug 1501329 - Expose "wasCanary" flag in saved-session pings for Fennec

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1489520

User impact if declined: Telemetry is sent with a  canary clientId and we are not able to distinguish them. Analysis is impacted.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Bug 1502435

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Patch has been verified manually and by automated test and confirmed as working.
We're monitoring incoming Telemetry data to verify it as well.

String changes made/needed:
Attachment #9020312 - Flags: approval-mozilla-beta?
Comment on attachment 9019394 [details]
Bug 1501329 - Set valid client ID on Fennec if canary is detected

[Triage Comment]
Fix for broken Fennec Telemetry reporting needed for the 63.0.2 dot release. Approved for 64.0b6.
Attachment #9019394 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9020311 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9020312 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Flags: in-testsuite+
Duplicate of this bug: 1501425
Verified this along with the work from bug 1502435 on latest Beta build 64.0b6. All works as expected. Keeping the qe-verify flag to check also in Nightly.
Jan-Erik, now that the fix was verified in Beta, please request approval for an uplift to mozilla-release. Thanks!
Flags: needinfo?(jrediger)
Comment on attachment 9019394 [details]
Bug 1501329 - Set valid client ID on Fennec if canary is detected

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1489520

User impact if declined: Telemetry is sent with a  canary clientId and we are not able to distinguish them. Analysis is impacted.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Bug 1502435

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Patch has been verified manually and by automated test and confirmed as working.
We're monitoring incoming Telemetry data to verify it as well.

String changes made/needed:
Flags: needinfo?(jrediger)
Attachment #9019394 - Flags: approval-mozilla-release?
Comment on attachment 9020311 [details]
Bug 1501329 - Persist information about canary after resetting client ID

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1489520

User impact if declined: Telemetry is sent with a  canary clientId and we are not able to distinguish them. Analysis is impacted.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Bug 1502435

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Patch has been verified manually and by automated test and confirmed as working.
We're monitoring incoming Telemetry data to verify it as well.

String changes made/needed:
Attachment #9020311 - Flags: approval-mozilla-release?
Comment on attachment 9020312 [details]
Bug 1501329 - Expose "wasCanary" flag in saved-session pings for Fennec

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1489520

User impact if declined: Telemetry is sent with a  canary clientId and we are not able to distinguish them. Analysis is impacted.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Bug 1502435

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Patch has been verified manually and by automated test and confirmed as working.
We're monitoring incoming Telemetry data to verify it as well.

String changes made/needed:
Attachment #9020312 - Flags: approval-mozilla-release?
Comment on attachment 9019394 [details]
Bug 1501329 - Set valid client ID on Fennec if canary is detected

Fix for broken Fennec Telemetry reporting needed for the 63.0.2 dot release.
Attachment #9019394 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9020311 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9020312 - Flags: approval-mozilla-release? → approval-mozilla-release+
Verified as fixed on RC build 63.0.2 following the same steps from comment 34.
Status: RESOLVED → VERIFIED
Verified as fixed on latest Nightly build (65.0a1 - 11/11).
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.