Closed
Bug 1147395
Opened 9 years ago
Closed 9 years ago
Validation: Compare a few telemetry measurements between "saved-session" and "main" pings.
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: mreid, Unassigned)
References
Details
(Whiteboard: [40b9] [unifiedTelemetry][data-validation])
We should ensure that we're receiving the same data via the new "main" pings as we are receiving via the "saved-session" pings.

One simple way is to select a few common measures and calculate the aggregations using both record types. They should come out the same if everything is working as expected.

I propose we use the following commonly submitted values:
SIMPLE_MEASURES_UPTIME
CYCLE_COLLECTOR
GC_MS

And an uncommon measure like:
SHUMWAY_ERROR

We should filter records by
- Application Name: "Firefox"
- Channel: "nightly"

We should aggregate separately by type="saved-session" vs. type="main", excluding duplicate Document IDs. The aggregation should be done by the appBuildId field.

Ideally, the resulting aggregations should be comparable to the data on telemetry.mozilla.org in addition to comparing with each other.
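The proposed comparison could be sketched roughly as follows. This is only an illustration under stated assumptions: the flat record layout (keys `documentId`, `type`, `appName`, `channel`, `appBuildId`, `measures`) is hypothetical, and each measure is reduced to a single number, whereas real histograms are bucketed.

```python
from collections import defaultdict

MEASURES = ("SIMPLE_MEASURES_UPTIME", "CYCLE_COLLECTOR", "GC_MS", "SHUMWAY_ERROR")


def aggregate(records):
    """Sum each measure per (ping type, appBuildId), skipping duplicate documents.

    `records` is an iterable of dicts in a hypothetical flat layout; the key
    names are assumptions made for this sketch, not the pipeline's schema.
    """
    seen = set()
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        # Filter: Firefox on the nightly channel only.
        if rec["appName"] != "Firefox" or rec["channel"] != "nightly":
            continue
        if rec["type"] not in ("saved-session", "main"):
            continue
        # Exclude duplicate Document IDs.
        if rec["documentId"] in seen:
            continue
        seen.add(rec["documentId"])
        key = (rec["type"], rec["appBuildId"])
        for name in MEASURES:
            totals[key][name] += rec["measures"].get(name, 0)
    return {key: dict(values) for key, values in totals.items()}
```

If both ping types carry the same data, the totals for ("main", build) and ("saved-session", build) should agree for every build ID.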
Reporter
Updated•9 years ago
Priority: -- → P3
Reporter
Updated•9 years ago
Priority: P3 → P2
Updated•9 years ago
Assignee: nobody → rvitillo
Reporter
Comment 1•9 years ago
Roberto, is this work ongoing? Or are you waiting for the follow-up bugs? Are you blocked on anything?
Flags: needinfo?(rvitillo)
Comment 2•9 years ago
I was waiting for Brendan to be happy with the data first, as he is doing a great job of validating the v4 dataset.
Flags: needinfo?(rvitillo)
Reporter
Updated•9 years ago
Whiteboard: [unifiedTelemetry][b5]
Updated•9 years ago
Whiteboard: [unifiedTelemetry][b5] → [unifiedTelemetry][b5][data-validation]
Updated•9 years ago
Assignee: rvitillo → nobody
Comment 5•9 years ago
A preliminary analysis is available at [1]. I compared only a few metrics, but there already appears to be a mismatch in about 7% of sessions for one of them (GC_MS). At the end of the notebook I dumped a few mismatching sessions. Note that in the notebook I used a single build-id, but the percentage seems to be stable across recent build-ids as well.

[1] http://nbviewer.ipython.org/gist/vitillo/8aec1f023265c9bf2293
Flags: needinfo?(gfritzsche)
Comment 6•9 years ago
Forgot to mention that 7% applies only to multi-fragment sessions.
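For reference, a mismatch rate restricted to multi-fragment sessions could be computed along these lines. This is a hedged sketch, not the notebook's code: the session layout (`fragments` as a list of per-subsession totals from "main" pings, `saved_session` as the total from the "saved-session" ping) is hypothetical.

```python
def mismatch_rate(sessions):
    """Fraction of multi-fragment sessions whose fragment totals disagree.

    Each session is a dict with hypothetical keys:
      "fragments":     list of per-subsession totals from "main" pings
      "saved_session": total reported by the "saved-session" ping
    Single-fragment sessions are excluded from the denominator.
    """
    multi = [s for s in sessions if len(s["fragments"]) > 1]
    if not multi:
        return 0.0
    mismatching = sum(
        1 for s in multi if sum(s["fragments"]) != s["saved_session"]
    )
    return mismatching / len(multi)
```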
Updated•9 years ago
Flags: needinfo?(alessio.placitelli)
Updated•9 years ago
Whiteboard: [unifiedTelemetry][b5][data-validation] → [40b9] [unifiedTelemetry][data-validation]
Comment 7•9 years ago
After quite some testing, I was not able to consistently reproduce the issue locally. This is my test procedure (I've changed the telemetry server pref to point to a local, nonexistent server, so my pings are kept in the pending pings directory, for simplicity):
- Start Firefox
- Browse to about:telemetry
- Wait for Telemetry to start (1 minute)
- Play a bit with the browser to trigger the GC
- Enable a restartless addon to break the session
- Play a bit more
- Close Firefox

The GC_MS histograms in the environment-changed ping and the shutdown ping sum up nicely, and the result equals the value reported by the saved-session ping. What I've noticed, though, is that there's a mismatch in the GC_MS histogram in the "childPayloads" section of the pings. I'll keep digging further.
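The check described above (subsession histograms summing to the saved-session histogram) can be sketched like this. It assumes, for illustration only, that each histogram has already been reduced to a plain mapping of bucket to count; the function names are made up for this example.

```python
from collections import Counter


def sum_histograms(histograms):
    """Add up bucket counts across several histograms (bucket -> count)."""
    total = Counter()
    for histogram in histograms:
        total.update(histogram)  # Counter.update adds counts per key
    return dict(total)


def fragments_match(fragment_histograms, saved_session_histogram):
    """True if the summed fragments equal the saved-session histogram.

    A bucket missing on one side is treated as a count of zero.
    """
    summed = sum_histograms(fragment_histograms)
    buckets = set(summed) | set(saved_session_histogram)
    return all(
        summed.get(b, 0) == saved_session_histogram.get(b, 0) for b in buckets
    )
```

In the procedure above, `fragment_histograms` would hold the GC_MS histogram from the environment-changed ping and the one from the shutdown ping.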
Flags: needinfo?(alessio.placitelli)
Comment 8•9 years ago
Checking through a few things here, it turns out that the GC_MS data used in the notebook is a sum of both child & parent histograms. Roberto, can you please rerun this looking only at the parent data? The child-data discrepancy needs to be fixed before we switch away from saved-session or e10s ships, but it's not blocking 41 now.
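To make the parent-only versus parent-plus-child distinction concrete, here is a rough sketch. The payload layout is an assumption for illustration: a "histograms" mapping at the top level and inside each "childPayloads" entry, with each histogram simplified to bucket -> count.

```python
def parent_histogram(payload, name):
    """Return only the parent-process histogram for `name` (bucket -> count)."""
    return dict(payload.get("histograms", {}).get(name, {}))


def combined_histogram(payload, name):
    """Return the parent histogram with all child-payload counts added in.

    This mirrors the summing that made the first comparison mismatch.
    """
    total = parent_histogram(payload, name)
    for child in payload.get("childPayloads", []):
        child_hist = child.get("histograms", {}).get(name, {})
        for bucket, count in child_hist.items():
            total[bucket] = total.get(bucket, 0) + count
    return total
```

Comparing `parent_histogram` results across ping types isolates the parent data; any remaining difference in `combined_histogram` then points at the child payloads.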
Flags: needinfo?(gfritzsche) → needinfo?(rvitillo)
Comment 9•9 years ago
Using only parent histograms, almost none of the complete multi-fragment sessions have a mismatch. See http://nbviewer.ipython.org/gist/vitillo/b352ec160ce5c5ee2af6
Flags: needinfo?(rvitillo)
Comment 10•9 years ago
Thanks for rechecking, that is great to hear. That seems OK, and minor mismatches are expected (for bug 1186871). So I think we can close this off as WORKSFORME and file a follow-up to investigate the child payload discrepancies? (I think those mostly come down to not collecting child payloads on each subsession collection.)
Comment 11•9 years ago
Can we just rename the bug?
Comment 12•9 years ago
I'd rather close this one as the original question we tracked here is resolved and the context above isn't all related to the e10s issue.
Comment 13•9 years ago
(In reply to Georg Fritzsche [:gfritzsche] [away until july 22] from comment #12)
> I'd rather close this one as the original question we tracked here is
> resolved and the context above isn't all related to the e10s issue.

Filed bug 1187327.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard