Closed Bug 1191681 Opened 10 years ago Closed 10 years ago

Write a client side validation tool that we can run on our own local data

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kparlante, Assigned: gfritzsche)

References

Details

(Whiteboard: [unifiedTelemetry])

Attachments

(1 file)

IIRC, vladan brought this idea up at today's standup. Alessio & others thought it was a good idea...
I can e.g. write something to copy/paste for everyone to paste in their scratchpad. But i think we should define first what this should evaluate? Session chaining, profileSubsessionCounter breakage, ...?
(In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> I can e.g. write something to copy/paste for everyone to paste in their
> scratchpad.

Why when Python is so much more fun? :) No, but seriously, you could make it a JetPack extension pretty easily I think.

> But i think we should define first what this should evaluate?
> Session chaining, profileSubsessionCounter breakage, ...?

Yes. Didn't Stuart write a script for this already when he manually tested shutdown?
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #2)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> > I can e.g. write something to copy/paste for everyone to paste in their
> > scratchpad.
>
> Why when Python is so much more fun? :) No, but seriously, you could make it
> a JetPack extension pretty easily I think.

JS, because we have a well-defined archiving API to access this info. What do we get from putting work into packaging it up as an extension? Copy-pasting into Scratchpad seems easy enough, and the script is trivial to edit if you want to investigate something yourself.

> > But i think we should define first what this should evaluate?
> > Session chaining, profileSubsessionCounter breakage, ...?
>
> Yes.
>
> Didn't Stuart write a script for this already when he manually tested
> shutdown?
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #2)
> Didn't Stuart write a script for this already when he manually tested
> shutdown?
Flags: needinfo?(sphilp)
I have a script to generate n shutdown/saved-session pings, and a script to combine the "info" sections so they are easier to compare, but the actual validation was manual. Could certainly extend it to validate some parts automatically. Also it's in python, and uses the gzipServer, not on the client :P
Flags: needinfo?(sphilp)
+1 for the JS to be run in Firefox Scratchpad: it's trivial to run and modify, and would probably allow every one of us to run it on our everyday profile without much effort.
I'm sure you guys are much more familiar with this so it may not matter, but I started down this path of scripting it and ran into trouble as the pings aren't always in order, so the logic to filter/sort/validate gets complex fairly quickly. In the end it was faster to manually validate, but given enough time to write this it would be a nice thing to have (could even script it to run periodically on jenkins or something). It looks like the archive dir is by timestamp first, so that might help.
Somewhat related, if subsessionStartDate and sessionStartDate showed the full timestamp instead of midnight on the day of, that would have helped. Was that done for a reason? Currently it's: "subsessionStartDate": "2015-08-04T00:00:00.0-04:00",
Yes, that is an anti-identification privacy measure: if you know when a session starts, it's much easier to identify a particular person from the FHR data, and also to profile usage patterns that we don't need and therefore don't want.
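The day-resolution truncation described above can be sketched as follows. This is only an illustration of the idea, not the Telemetry client's code; the function name and the sample start time are hypothetical, and Python's `isoformat()` omits the `.0` fraction that appears in the ping value quoted earlier.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_day(ts: datetime) -> str:
    """Drop the time of day, keeping only the local date and the UTC offset."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight.isoformat()


# Hypothetical session start: 2015-08-04 14:37:52 at UTC-04:00.
start = datetime(2015, 8, 4, 14, 37, 52, tzinfo=timezone(timedelta(hours=-4)))
truncated = truncate_to_day(start)
print(truncated)
```

Two sessions started at different times on the same local day produce the same string, which is why the value cannot be used to order pings within a day.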
(In reply to Stuart Philp :sphilp from comment #7)
> I'm sure you guys are much more familiar with this so it may not matter, but
> I started down this path of scripting it and ran into trouble as the pings
> aren't always in order, so the logic to filter/sort/validate gets complex
> fairly quickly.

We can e.g. order them by the pings' creation time and check for inconsistencies from there. I don't think we need something perfect here; it's enough to flag possible issues and manually inspect from there.
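A rough sketch of that "sort by creation time, then flag inconsistencies" approach, written in Python for brevity rather than as Scratchpad JS. It assumes each ping has already been reduced to a flat record with `creationDate` (an ISO 8601 UTC string), `subsessionId`, `previousSubsessionId` and `profileSubsessionCounter`; that flat shape and the sample records are assumptions for illustration, not the archive's actual layout.

```python
from typing import Dict, List


def check_chain(pings: List[Dict]) -> List[str]:
    """Sort simplified ping records by creation time and flag chain breaks."""
    # ISO 8601 UTC strings in a uniform format sort chronologically as text.
    ordered = sorted(pings, key=lambda p: p["creationDate"])
    warnings = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur["previousSubsessionId"] != prev["subsessionId"]:
            warnings.append(
                f"{cur['subsessionId']}: previousSubsessionId is "
                f"{cur['previousSubsessionId']}, expected {prev['subsessionId']}"
            )
        if cur["profileSubsessionCounter"] != prev["profileSubsessionCounter"] + 1:
            warnings.append(
                f"{cur['subsessionId']}: profileSubsessionCounter jumped from "
                f"{prev['profileSubsessionCounter']} to {cur['profileSubsessionCounter']}"
            )
    return warnings


# Hypothetical records, deliberately out of order, with subsession "c" missing.
pings = [
    {"creationDate": "2015-08-04T10:00:00Z", "subsessionId": "b",
     "previousSubsessionId": "a", "profileSubsessionCounter": 2},
    {"creationDate": "2015-08-03T09:00:00Z", "subsessionId": "a",
     "previousSubsessionId": None, "profileSubsessionCounter": 1},
    {"creationDate": "2015-08-05T11:00:00Z", "subsessionId": "d",
     "previousSubsessionId": "c", "profileSubsessionCounter": 4},
]
for warning in check_chain(pings):
    print(warning)
```

The check only flags candidates for manual inspection; a missing ping and a genuinely broken chain look the same to it.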
Assignee: nobody → gfritzsche
Status: NEW → ASSIGNED
Whiteboard: [unifiedTelemetry]
Iteration: --- → 43.1 - Aug 24
First cut covering Telemetry v4 consistency: https://gist.github.com/georgf/1b0831a6b81b6c9fe240
Priority: -- → P1
(In reply to Mike Trinkala [:trink] from comment #12)
> Created attachment 8648047 [details]
> client side eval with errors

This is on Fx40 / release, which doesn't have all the fixes. Ideally we check on Nightly 43 / Aurora 42.
Summary: write a client side evaluation tool that we can run on our own local data → Write a client side validation tool that we can run on our own local data
Second cut up: https://gist.github.com/georgf/1b0831a6b81b6c9fe240

This now opens a second tab that compares accumulated v2/v4 data (starting with search counts).

Note that we can't perfectly match v2 & v4 sessions as the historic data is stored differently:
* v2 has it per UTC day
* v4 has it per subsessions (which may be partially in two UTC days)

However, we should be able to use this for "close enough" & "expected N more counts" comparisons. Also, starting out on new profiles this comparison should match 100%.
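The accumulate-and-compare idea can be sketched like this. It is not the code from the gist; it assumes the v2 per-day data and the v4 per-subsession data have each been reduced to a list of `{search key: count}` dicts, and the keys and numbers below are made up.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def accumulate(records: Iterable[Dict[str, int]]) -> Counter:
    """Sum per-day (v2) or per-subsession (v4) counts into one total per key."""
    total: Counter = Counter()
    for record in records:
        total.update(record)  # Counter.update adds counts, it does not replace
    return total


def compare(v2_days, v4_subsessions) -> Dict[str, Tuple[int, int]]:
    """Return {key: (v2_total, v4_total)} for every key whose totals differ."""
    v2, v4 = accumulate(v2_days), accumulate(v4_subsessions)
    return {k: (v2[k], v4[k]) for k in sorted(set(v2) | set(v4)) if v2[k] != v4[k]}


# Hypothetical data: two v2 days and two v4 subsessions.
v2_days = [{"example.searchbar": 3, "example.urlbar": 1}, {"example.searchbar": 2}]
v4_subsessions = [{"example.searchbar": 4}, {"example.searchbar": 2, "example.urlbar": 1}]
mismatches = compare(v2_days, v4_subsessions)
print(mismatches)
```

Because the two sides are bucketed differently, a small non-empty result is a prompt for a closer look rather than proof of a bug.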
Nice! Most of my archive looks fine on nightly and beta (same profiles I ran 90% of the tests on). There's one thing I haven't seen before though, "reason": "gather-subsession-payload". This seems to be starting a new session/subsession, as there are no previous IDs and the profile counter resets to 1. Is that expected?
(In reply to Stuart Philp :sphilp from comment #15)
> Nice! Most of my archive looks fine on nightly and beta (same profiles I ran
> 90% of the tests on).
>
> There's one thing I haven't seen before though, "reason":
> "gather-subsession-payload". This seems to be starting a new
> session/subsession as there is no previous id's and the profile counter
> resets to 1. Is that expected?

Yes, that's expected. The problem is that, if we want to compare all the data, we also need the counts since the last subsession for v4. So i always push one artificial ping at the end of the list that has the "current" measurements.
Work win7 laptop, Nightly-e10s-profile:
==============================================
- Ping chain is fine, going back to July 23 (54 subsessions)
- v2 vs v4 is off by a bit:

what                          v2   v4
search: google.contextmenu    17   17
search: google.searchbar      93   98  <----------------
search: google.urlbar         40   46  <----------------
search: google.newtab          1    1
search: yahoo.contextmenu      2    2
search: yahoo.searchbar       16   16
search: yahoo.urlbar          11   11
search: wikipedia.searchbar    1    1
Work win7 laptop, Aurora profile:
==============================================
- Session chain is fine again, going back to July 24th (51 subsessions)
- v2 vs v4 is slightly off again (expected?):

what                          v2   v4
search: google.urlbar         59   62  <------
search: google.searchbar      55   56  <------
search: google.contextmenu    10   10
search: ddg.searchbar          1    1
search: yahoo.searchbar        0    4  <------
search: yahoo.urlbar           0    3  <------

Note I've done some channel-switching with this profile
Right, slight mismatches are expected unless it is a fresher profile. The problem is that i cut off at the oldest date v2 & v4 have in common, but can't match them 100%. So in the worst case the accumulations are off by up to one day.
I haven't heard about breakage or issues yet, so i'm closing this bug. Let's move any issues that turn up later to follow-up bugs.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
I found a broken session chain on my Nightly e10s profile on my home Windows 7 machine.

https://docs.google.com/a/mozilla.com/spreadsheets/d/13qw14UBQ2VEMk5Pu_OMt92kc_cSOJwmK9LHwpvu-xnk/edit?usp=sharing

It looks like there's a fork inside a single session's chain:
1. at roughly midnight local time, session writes a "daily ping"
2. 45 minutes later, session writes an aborted-session ping
3. 5 minutes after that, session shuts down and writes a shutdown ping with the SAME subsessionID as #2
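A fork of this kind, two pings sharing one subsession ID, could be flagged with a check along these lines. The flat record shape, the function name and the IDs are assumptions for illustration; only the three ping reasons come from the report above.

```python
from collections import defaultdict
from typing import Dict, List


def find_forks(pings: List[Dict]) -> Dict[str, List[str]]:
    """Group ping reasons by subsessionId and keep IDs that occur more than once."""
    reasons_by_id = defaultdict(list)
    for ping in pings:
        reasons_by_id[ping["subsessionId"]].append(ping["reason"])
    return {sid: reasons for sid, reasons in reasons_by_id.items() if len(reasons) > 1}


# Hypothetical IDs mirroring the sequence reported above.
session = [
    {"subsessionId": "s1", "reason": "daily"},
    {"subsessionId": "s2", "reason": "aborted-session"},
    {"subsessionId": "s2", "reason": "shutdown"},
]
forks = find_forks(session)
print(forks)
```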
Status: RESOLVED → REOPENED
Flags: needinfo?(gfritzsche)
Resolution: WORKSFORME → ---
Blocks: 1196796
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #21)
> I found a broken session chain on my Nightly e10s profile on my home Windows
> 7 machine.
>
> https://docs.google.com/a/mozilla.com/spreadsheets/d/13qw14UBQ2VEMk5Pu_OMt92kc_cSOJwmK9LHwpvu-xnk/edit?usp=sharing

Moved this to bug 1196852. Let's keep this bug closed and file follow-up bugs for issues found (or comment here or drop me a mail and i'll file something).
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Flags: needinfo?(gfritzsche)
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard