Closed
Bug 1489520
Opened 7 years ago
Closed 7 years ago
Canary client id not always resetting to new value
Categories
(Toolkit :: Telemetry, defect, P1)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla64
People
(Reporter: mreid, Assigned: janerik)
References
Details
Attachments
(2 files)
I was looking at about:telemetry and noticed that my client id has changed to:
c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0
This seems an unlikely UUID to say the least :)
I double checked main_summary, and there appear to be about 500 main pings per day from this particular client id (I only have 8 so far in my local datareporting archive, so they're not all from me).
| Reporter | ||
Comment 1•7 years ago
|
||
Hm, never mind. Looks like I disabled telemetry locally. False alarm!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
| Assignee | ||
Comment 2•7 years ago
|
||
Pings with that client ID should never be sent. If it does happen that is indeed a bug.
I'll investigate.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
| Assignee | ||
Updated•7 years ago
|
Assignee: nobody → jrediger
| Assignee | ||
Updated•7 years ago
|
Points: --- → 1
Priority: -- → P1
| Assignee | ||
Comment 3•7 years ago
|
||
With the help of :mreid I investigated this a bit.
The main pings with a client ID of c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0 were never send out from his Firefox instance (as expected). We always archive pings, even if not sending them.
I verified this using stmo and the specific ping document IDs.
I identified a small data race that could happen when flipping the flag:
https://searchfox.org/mozilla-central/rev/d4ef4e9747133aa2914aca2a15cf9df1e42a6aa0/toolkit/components/telemetry/app/TelemetryController.jsm#846
If something is submitting a ping right after the flag was flipped, but before the observer is executed,
it will use the cached c0ffee ID and, because the pref is enabled, send a ping. I guess that's a rare thing, but with that many clients out there, it can potentially happen more often.
I don't know (yet) if that is what we are seeing with the 500 pings/day in main_summary.
| Assignee | ||
Comment 4•7 years ago
|
||
I investigated a bit further.
I looked into more details of pings with the canary client ID (c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0) here:
https://sql.telemetry.mozilla.org/queries/58858
This shows that the canary client ID turns up in pings from release/beta versions.
This can be explained like this:
1) A user tries out beta or nightly.
3) In beta/nightly they disable Telemetry upload, at which point the canary client ID is persisted.
4) They switch back to stable
5) They enable Telemetry.
5) A normal ping is send with the canary client ID.
This would explain why we see consistent pings with the canary ID from probably the same clients.
This does not fully explain why we see pings with a canary ID 500-800 times a day.
Possible solution:
At Telemetry init check the persisted client ID and if upload is enabled.
If client ID matches the canary and the upload is enabled, reset the client ID to a valid value.
(Probably in reverse if upload is disabled as well)
This should now be done regardless of other solutions in order to get those clients to valid client IDs again eventually.
It should probably be uplifted to beta to get it into release as soon as possible (not sure of direct stable uplifts).
Additionally:
We can extend Telemetry to avoid ever sending if the payload contains the canary clientID.
Though this would mean we might lose otherwise valid (and allowed-to-upload) data.
We should monitor the rate of pings with the canary client ID on ingestions.
Flags: needinfo?(mreid)
Flags: needinfo?(gfritzsche)
Comment 5•7 years ago
|
||
(In reply to Jan-Erik Rediger [:janerik] from comment #4)
> I looked into more details of pings with the canary client ID
> (c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0) here:
> https://sql.telemetry.mozilla.org/queries/58858
>
> This shows that the canary client ID turns up in pings from release/beta
> versions.
> This can be explained like this:
> 1) A user tries out beta or nightly.
> 3) In beta/nightly they disable Telemetry upload, at which point the canary
> client ID is persisted.
> 4) They switch back to stable
> 5) They enable Telemetry.
> 5) A normal ping is send with the canary client ID.
Why would the client id not set to a new value between 5) & 6)?
Flags: needinfo?(gfritzsche) → needinfo?(jrediger)
Comment 6•7 years ago
|
||
(In reply to Jan-Erik Rediger [:janerik] from comment #4)
> This would explain why we see consistent pings with the canary ID from
> probably the same clients.
> This does not fully explain why we see pings with a canary ID 500-800 times
Is it possible to get numbers on how many of these are linked to channel switching?
Comment 7•7 years ago
|
||
(In reply to Jan-Erik Rediger [:janerik] from comment #4)
> At Telemetry init check the persisted client ID and if upload is enabled.
> If client ID matches the canary and the upload is enabled, reset the client
> ID to a valid value.
> (Probably in reverse if upload is disabled as well)
>
> This should now be done regardless of other solutions in order to get those
> clients to valid client IDs again eventually.
> It should probably be uplifted to beta to get it into release as soon as
> possible (not sure of direct stable uplifts).
+1, to adding checks both ways & the uplifts.
How does that work out when switching back and forth? beta -> release -> beta?
| Assignee | ||
Comment 8•7 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #5)
> Why would the client id not set to a new value between 5) & 6)?
Because stable doesn't have that code yet.
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> Is it possible to get numbers on how many of these are linked to channel switching?
client ID is the only way for us to identify them across channels and in this case that's borked. The other information can give a hint, but won't be definitive.
(In reply to Georg Fritzsche [:gfritzsche] from comment #7)
> +1, to adding checks both ways & the uplifts.
> How does that work out when switching back and forth? beta -> release -> beta?
If the fix is in beta, it will correct the client ID. The release one won't. Prefs are part of the profile, so are also shared.
Flags: needinfo?(jrediger)
Comment 9•7 years ago
|
||
(In reply to Jan-Erik Rediger [:janerik] from comment #8)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #7)
> > +1, to adding checks both ways & the uplifts.
> > How does that work out when switching back and forth? beta -> release -> beta?
>
> If the fix is in beta, it will correct the client ID. The release one won't.
> Prefs are part of the profile, so are also shared.
Can you expand a bit on what would happen from a Telemetry perspective when switching back and forth?
| Assignee | ||
Comment 10•7 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #9)
> Can you expand a bit on what would happen from a Telemetry perspective when
> switching back and forth?
Scenario: current code + fix-up client ID on startup in beta. Release as is today (e.g. no optout ping, no canary client id code).
1. User launches Beta, disables Telemetry -> canary id persisted
2. User launches release, loads canary ID. User enables Telemetry. client ID unmodified.
Telemetry is now send with canary ID
3. User launches Beta again. Startup detects: upload enabled && canary ID -> new client ID generated.
Telemetry is send with new client ID
4. User launches Release again: valid client ID & upload enabled -> valid Telemetry ping
Comment 11•7 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> (In reply to Jan-Erik Rediger [:janerik] from comment #4)
> > This would explain why we see consistent pings with the canary ID from
> > probably the same clients.
> > This does not fully explain why we see pings with a canary ID 500-800 times
>
> Is it possible to get numbers on how many of these are linked to channel
> switching?
Well, every coffee-ping coming from buildids before 63 is definitely due to channel switching :)
...we could count "optout" pings per channel and then coffee-pings per channel to get some handle on the scope of the issue. Maybe all we'll get is "This affects at most X clients.", but it could be handy to know whether X is measured in hundreds or hundreds of thousands.
Updated•7 years ago
|
Summary: My Client ID seems to have changed to an unexpected value on August 22nd → Canary client id not always resetting to new value
| Assignee | ||
Comment 12•7 years ago
|
||
| Assignee | ||
Updated•7 years ago
|
Points: 1 → 2
| Reporter | ||
Comment 13•7 years ago
|
||
> We should monitor the rate of pings with the canary client ID on ingestions.
Agreed. Should we monitor the overall count over time? Or break things down by channel, build, etc? What would be the most actionable thing to do here?
Flags: needinfo?(mreid)
| Assignee | ||
Comment 14•7 years ago
|
||
chutten: Updated patch submitted.
mreid: I think breaking it down by channel would be helpful for now. Unless there's another bug hidden, the above patch should bring down the numbers.
I will try to run some queries on the current data as per the above suggestions to see if we can currently say a bit more about affected clients.
Flags: needinfo?(mreid)
Comment 15•7 years ago
|
||
Comment on attachment 9008753 [details]
Bug 1489520 - Fix canary client ID on startup when Telemetry upload is enabled
Chris H-C :chutten has approved the revision.
Attachment #9008753 -
Flags: review+
Comment 16•7 years ago
|
||
Pushed by jrediger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/961cfd2ee957
Fix canary client ID on startup when Telemetry upload is enabled r=chutten
Comment 17•7 years ago
|
||
| bugherder | ||
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
| Assignee | ||
Comment 18•7 years ago
|
||
Comment 19•7 years ago
|
||
Comment on attachment 9010716 [details]
Bug 1489520 - [beta] Fix canary client ID on startup when Telemetry upload is enabled
Chris H-C :chutten has approved the revision.
Attachment #9010716 -
Flags: review+
| Assignee | ||
Comment 20•7 years ago
|
||
Comment on attachment 9010716 [details]
Bug 1489520 - [beta] Fix canary client ID on startup when Telemetry upload is enabled
Approval Request Comment
[Feature/Bug causing the regression]:
bug 1445921, "optout" ping
[User impact if declined]:
a small percentage of profiles will have an invalid client id
[Is this code covered by automated tests?]:
Yes, this patch contains the tests
[Has the fix been verified in Nightly?]:
Yes, see bug 1491819
[Needs manual test from QE? If yes, steps to reproduce]:
No, we keep monitoring Telemetry data
[List of other uplifts needed for the feature/fix]:
None
[Is the change risky?]:
No
[Why is the change risky/not risky?]:
Not risky because it's covered by tests and analysis on Telemetry nightly data. We will keep monitoring incoming data. The fix only applies if it detects the issue and corrects it.
[String changes made/needed]:
None
Attachment #9010716 -
Flags: approval-mozilla-beta?
Comment 21•7 years ago
|
||
Comment on attachment 9010716 [details]
Bug 1489520 - [beta] Fix canary client ID on startup when Telemetry upload is enabled
P1 telemetry bug fix with tests, , uplift approved for 63 beta 9, thanks.
Attachment #9010716 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 22•7 years ago
|
||
| bugherder uplift | ||
status-firefox63:
--- → fixed
| Reporter | ||
Updated•7 years ago
|
Flags: needinfo?(mreid)
You need to log in
before you can comment on or make changes to this bug.
Description
•