Need better sanity-checking of values computed based on timestamps generated from monotonicNow
Categories
(Toolkit :: Telemetry, enhancement, P1)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox67 | --- | fixed |
People
(Reporter: wlach, Assigned: janerik)
References
Details
Attachments
(2 files, 1 obsolete file)
| Reporter | ||
Comment 1•7 years ago
|
||
Updated•7 years ago
|
| Reporter | ||
Comment 2•7 years ago
|
||
| Reporter | ||
Updated•7 years ago
|
Comment 3•7 years ago
|
||
Comment 4•7 years ago
|
||
Comment 5•7 years ago
|
||
| Reporter | ||
Comment 6•7 years ago
|
||
| Reporter | ||
Comment 7•7 years ago
|
||
Comment 8•7 years ago
|
||
Updated•7 years ago
|
Comment 9•7 years ago
|
||
| Reporter | ||
Comment 10•7 years ago
|
||
| Reporter | ||
Comment 11•7 years ago
|
||
| Reporter | ||
Comment 12•7 years ago
|
||
| Assignee | ||
Updated•7 years ago
|
| Assignee | ||
Comment 13•7 years ago
|
||
ni? myself, we'll have to figure out the way forward.
Updated•7 years ago
|
Comment 14•7 years ago
|
||
I took a look at "event" pings' processStartTimestamp, one of the consumers of msSinceProcessStart.
Here's the notebook: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/75037/command/75126
The tl;dr is that, 3% of sessions described by the event pings sent from Nightly this January (from builds new enough to have "event" pings) had more than 1 processStartTimestamp.
:(
| Assignee | ||
Updated•7 years ago
|
| Assignee | ||
Updated•7 years ago
|
| Assignee | ||
Comment 15•7 years ago
|
||
| Assignee | ||
Comment 16•7 years ago
•
|
||
Long document with notes: https://docs.google.com/document/d/1DAwLvF5F3cT1dG85VA2zC7ANPx2EZxPb9vL0RwaWIV0
I did another query last night on the number of pings with sketchy subsession lengths (between 24 and 48 hours and more than 48 hours):
https://sql.telemetry.mozilla.org/queries/61420/source
It hovers around 1% for most (up to 6% of pings, on release, on a 1% sample of all pings).
I also took :wlach's query and adjusted it for actual client counts:
https://sql.telemetry.mozilla.org/queries/61419/source
It shows similar small numbers.
While we certainly do have clients with invalid session lengths, IMO they are much less significant than initially thought.
We might still need to be careful when basing analysis on session length/subsession hours sum (active ticks/active hours seems to be more accurate anyway)
I went ahead and submitted the most minimal patch fixing the monotonicNow/Date.now/-1 bug.
I left out any instrumentation, as I think if we want that we need to implement it much lower (in TimeStamp directly), as Telemetry is most likely not the first/only caller of this.
:wlach, does this make sense? do you agree with my conclusions?
(Edit: Oh and if you do, you could abandon your patch :))
| Reporter | ||
Updated•7 years ago
|
| Assignee | ||
Comment 17•7 years ago
|
||
| Reporter | ||
Comment 18•7 years ago
|
||
(In reply to Jan-Erik Rediger [:janerik] from comment #16)
While we certainly do have clients with invalid session lengths, IMO they are much less significant than initially thought.
We might still need to be careful when basing analysis on session length/subsession hours sum (active ticks/active hours seems to be more accurate anyway)
Yeah, I never really claimed that the volume of pings with incorrect subsession_hours was high, just that the small number of pings which did have this problem would have a disproportionate impact. :)
Aside from that minor quibble, I agree with the analysis. Let's get this landed and see what the results look like on Nightly!
Comment 19•7 years ago
|
||
Comment 20•7 years ago
|
||
Comment 21•7 years ago
|
||
| bugherder | ||
Comment 22•5 years ago
|
||
The telemetry probe added here proved to trigger quite rarely; https://bugzilla.mozilla.org/show_bug.cgi?id=1592012#c11 has a comment.
Description
•