Closed
Bug 1186713
Opened 9 years ago
Closed 9 years ago
FHR v2/v4 comparison -- totalTime and activeTicks per session must match
Categories
(Firefox Health Report Graveyard :: Client: Desktop, defect)
Firefox Health Report Graveyard
Client: Desktop
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: bcolloran, Unassigned)
References
Details
(Whiteboard: [unifiedTelemetry][data-validation])
For each complete reconstructed session starting within this date range (i.e., not the ongoing session in appSession.current or a session for which not all subsessions have been collected), we should also have these match identically: * totalTime per session * activeTicks per session Among other things, this check is meant to help us verify that measurements are being binned correctly across subsessions.
So there appear to be some discrepancies that actually do roll up to be substantially different between v2 and v4. http://nbviewer.ipython.org/gist/bcolloran/1ca418a0e2ce3a81c343#Comparing-overall-sums-of-totalTime-between-v2-and-v4 Looking into this some, it seems like there may be a problem in some cases with the last subsession of a session having a negative measurement for subsessionLength. This may indicate a problem with shutdown. http://nbviewer.ipython.org/gist/bcolloran/1ca418a0e2ce3a81c343#Are-all-negative-values-%22subsessionLength%22-in-the-final-subsession?
Comment 3•9 years ago
|
||
So, we are using Date() now to calculate subsessionLength and we know some people have weird or jumping clocks. We need to use monotonic clocks for subsessionLength to fix that. Spot-checking the code i don't see other potential issues really with subsessionLength and nothing special about shutdown really. Could there be other reasons to make shutdown affected more proportionally? Like highest average subsession length or most sesssions consisting of one or two fragments?
Comment 4•9 years ago
|
||
See also bug 1106048. With bug 1188416 we should see sane numbers in v4, but v2 would still be off. I don't think we can reconstruct this properly (as v2 & v4 are affected differently), so i think we really have to drop any pairing where either of v2 or v4 shows clock issues (like it was done in the notebook AFAICT).
(In reply to Benjamin Smedberg [:bsmedberg] from comment #2) > Did you file that? Look like Georg filed it as bug 1188416.
Flags: needinfo?(bcolloran)
Comment 6•9 years ago
|
||
I find it hard to believe that clock reset is causing the negative numbers being reported here. So even if we get bug 1188416 landed immediately, I think we should also target some manual QA at the shutdown timing. Brendan, can you look at the "reason" code for the subsessions that have a negative total time? e.g. is -85359 a "shutdown" or "aborted-session" and is there a pattern in the other subsessions with negative values?
Updated•9 years ago
|
Flags: needinfo?(bcolloran)
Comment 7•9 years ago
|
||
Per bug 1106048, the uptime value in v4 might be "similarly" broken as v2. ni?me to check if that makes sense to compare.
Flags: needinfo?(gfritzsche)
Comment 8•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #7) > Per bug 1106048, the uptime value in v4 might be "similarly" broken as v2. > ni?me to check if that makes sense to compare. We actually have ping.payload.simpleMeasurements.totalTime which should match the FHR totalTime pretty directly (slight difference expected as it is stored at different times). The totalTime value is not per subsession, so we should be able to match this pretty directly to the FHR v2 behavior if we use the values from the last fragment of a session (shutdown or aborted-session).
Flags: needinfo?(gfritzsche)
Comment 9•9 years ago
|
||
FWIW, i confirmed the negative values for both locally by setting my system clock back.
Reporter | ||
Comment 10•9 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #6) > I find it hard to believe that clock reset is causing the negative numbers > being reported here. So even if we get bug 1188416 landed immediately, I > think we should also target some manual QA at the shutdown timing. > > Brendan, can you look at the "reason" code for the subsessions that have a > negative total time? e.g. is -85359 a "shutdown" or "aborted-session" and is > there a pattern in the other subsessions with negative values? There does not appear to be any pattern in the reason codes of subsessions with negative subsessionLength values. http://nbviewer.ipython.org/gist/bcolloran/19240eb870da7df5dde1#Reason-codes-for-subsessions-with-negative-subsessionLength
Flags: needinfo?(bcolloran)
Updated•9 years ago
|
Whiteboard: [unifiedTelemetry][data-validation]
Comment 11•9 years ago
|
||
issues with v2 that won't be fixed in v4
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Updated•6 years ago
|
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•