Closed
Bug 1179376
Opened 10 years ago
Closed 10 years ago
Compare FHR v2 and FHR v4 search, crash, and other fields
Categories
(Firefox Health Report Graveyard :: Data Request, defect, P1)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 1199393
People
(Reporter: spenrose, Assigned: spenrose)
References
Details
(Whiteboard: [unifiedTelemetry] [data-validation])
Description
Within the set of comparable days in v2 and v4, the following should be identical (a comparison sketch follows this list):
- the number of searches (by provider and SAP) on each date
- the number of crashes on each date
- the number of update checks, successes, etc.
- any client version changes should fall on the same date
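A minimal sketch of the per-day comparison this bug calls for, assuming hypothetical pandas DataFrames v2_days and v4_days with one row per (client_id, date); all column names here are illustrative, not the real schema of either dataset:

import pandas as pd

def compare_daily(v2_days, v4_days):
    # Inner join keeps only the (client, day) pairs present in both systems.
    merged = v2_days.merge(v4_days, on=["client_id", "date"],
                           suffixes=("_v2", "_v4"), how="inner")
    # Flag every day on which a per-day count disagrees between v2 and v4.
    for field in ("search_count", "crash_count"):
        merged[field + "_match"] = merged[field + "_v2"] == merged[field + "_v4"]
    return merged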
Updated•10 years ago
Group: mozilla-employee-confidential
Assignee
Comment 1•10 years ago
Using bcolloran's paired v2 and v4 dataset, I find that v2 is counting an order of magnitude more searches. There may be a simple explanation in the way the dataset is constructed. My notebook is only a few cells long:
http://nbviewer.ipython.org/gist/SamPenrose/05e4dd652c6b95fec6bc
Whiteboard: [unifiedTelemetry] → [unifiedTelemetry] [data-validation]
Comment 2•10 years ago
Quick thought about that notebook:
In the paired data, we only grab pings with
* submission_date between "20150525" and "20150609"
* build_id between "20150507000000" and "99990507000000"
(see http://nbviewer.ipython.org/gist/bcolloran/ac508f1d141eacdf7098 )
But the corresponding v2 data for each client with at least one v4 ping satisfying the criteria above will contain up to 180 days of data for that client. Therefore, if we add up *all* of the recorded searches in this data set, we should expect the total to be much lower for v4: up to 180 days of v2 data against a roughly 16-day v4 submission window is consistent with an order-of-magnitude gap.
This is why I would recommend looking only at the set of "comparable days" for each client. As a first pass at a definition of "comparable days", you could try looking at the set of dates present in both the v2 and v4 data for a client and then dropping the first and last date. If you add up the number of searches per client within those dates, the v2 and v4 totals should be in the right ballpark.
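A sketch of that first pass for a single client, assuming plain dicts mapping date -> search count (hypothetical names, not the dataset's actual API):

def comparable_day_totals(v2_by_date, v4_by_date):
    # Dates present in both the v2 and v4 data for this client ...
    shared = sorted(set(v2_by_date) & set(v4_by_date))
    # ... minus the first and last date, which are likely partial days.
    shared = shared[1:-1]
    v2_total = sum(v2_by_date[d] for d in shared)
    v4_total = sum(v4_by_date[d] for d in shared)
    return v2_total, v4_total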
Checking the aggregate is a good initial sanity check; once that looks roughly right, the direction we really want to head is to make sure that each individual client has the same counts on each individual day. The difficulty here will be dealing with the fact that v2 and v4 handle timezones differently, but I think we should check that:
(1) all records located in UTC+0 have the same counts on the same calendar dates
(2) if we do some kind of windowed moving sum, those values are similar for v2 and v4 (a rough sketch follows this list). If you want, I can work out in more detail what that comparison function should be.
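One possible shape for that windowed comparison, as a sketch only (window size and tolerance are arbitrary placeholders, not the worked-out function offered above):

import pandas as pd

def rolling_totals(daily_counts, window=3):
    # daily_counts: a Series of one client's searches per day, indexed by date.
    # A centered rolling sum absorbs counts that slip across a day boundary
    # because v2 and v4 handle timezones differently.
    return daily_counts.sort_index().rolling(window, center=True, min_periods=1).sum()

def roughly_equal(v2_daily, v4_daily, window=3, tol=0.1):
    joined = pd.concat([rolling_totals(v2_daily, window),
                        rolling_totals(v4_daily, window)],
                       axis=1, keys=["v2", "v4"]).dropna()
    # Relative difference of the windowed sums, guarding against zero days.
    denom = joined.max(axis=1).clip(lower=1)
    return ((joined["v2"] - joined["v4"]).abs() / denom) <= tol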
Assignee
Comment 3•10 years ago
This result was bogus: I was mishandling that dataset, as described in the previous comment. In the meantime, I compared aggregate totals for the week of June 22 using spark/get_pings() for v4 and Hadoop/our HDFS deduped samples for v2, and found close agreement:
           v2         v4
Aurora:    2,171,085  2,096,495
Nightly:     903,636    853,671
Notebook to follow.
(In reply to Sam Penrose from comment #1)
> Using bcolloran's paired v2 and v4 dataset, I find that v2 is counting an
> order of magnitude more searches. There may be a simple explanation due to
> the way the dataset is constructed. My notebook is only a few cells long:
>
> http://nbviewer.ipython.org/gist/SamPenrose/05e4dd652c6b95fec6bc
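For reference, roughly the shape the v4 side of such a count takes with spark/get_pings(). The parameter names below are recalled from the moztelemetry API of the time, and the comment above doesn't say exactly what was totaled (pings, clients, or searches), so treat this as an assumption-laden sketch:

from moztelemetry import get_pings

# Hypothetical: count one week of v4 main pings on the Aurora channel.
# sc is the notebook's SparkContext.
pings = get_pings(sc,
                  app="Firefox",
                  channel="aurora",
                  doc_type="main",
                  schema="v4",
                  submission_date=("20150622", "20150628"),
                  fraction=1.0)
total = pings.count()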
Updated•10 years ago
Whiteboard: [unifiedTelemetry] [data-validation] → [unifiedTelemetry] [data-validation][b5]
Updated•10 years ago
Assignee: nobody → spenrose
Priority: -- → P1
Comment 4•10 years ago
Moving to next milestone [40b7].
Whiteboard: [unifiedTelemetry] [data-validation][b5] → [unifiedTelemetry] [data-validation][40b7]
Assignee
Comment 5•10 years ago
Ben, do you have an opinion on how I should compare v2 and v4 crashes?
Flags: needinfo?(benjamin)
Assignee
Comment 6•10 years ago
Work in progress: http://nbviewer.ipython.org/gist/SamPenrose/5583376698171397420e
Comment 7•10 years ago
v2 crashes are documented here:
http://gecko.readthedocs.org/en/latest/services/healthreport/healthreport/dataformat.html#org-mozilla-crashes-crashes
What we care about are these fields (a tallying sketch follows the list):
main-crash
main-crash-submission-succeeded
main-crash-submission-failed
main-hang
main-hang-submission-succeeded
main-hang-submission-failed
plugin-crash
plugin-crash-submission-succeeded
plugin-crash-submission-failed
plugin-hang
plugin-hang-submission-succeeded
plugin-hang-submission-failed
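A sketch of tallying those fields from one FHR v2 payload, assuming the data/days nesting described in the data-format doc above (the exact layout details are an assumption here):

CRASH_FIELDS = [
    "main-crash", "main-crash-submission-succeeded", "main-crash-submission-failed",
    "main-hang", "main-hang-submission-succeeded", "main-hang-submission-failed",
    "plugin-crash", "plugin-crash-submission-succeeded", "plugin-crash-submission-failed",
    "plugin-hang", "plugin-hang-submission-succeeded", "plugin-hang-submission-failed",
]

def v2_crash_counts(payload):
    # payload: one parsed FHR v2 JSON document.
    out = {}
    for date, day in payload.get("data", {}).get("days", {}).items():
        crashes = day.get("org.mozilla.crashes.crashes", {})
        counts = {f: crashes.get(f, 0) for f in CRASH_FIELDS}
        if any(counts.values()):
            out[date] = counts  # date -> {field: count}
    return out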
v4 crashes are documented in two places:
main crashes: https://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/crash-ping.html
histograms for the other metrics (see Histograms.json; a matching sketch follows the list):
SUBPROCESS_ABNORMAL_ABORT key "plugin"
SUBPROCESS_CRASHES_WITH_DUMP key "plugin"
PROCESS_CRASH_SUBMIT_ATTEMPT and PROCESS_CRASH_SUBMIT_SUCCESS keys "main" and "plugin" (I think... there may be plugin-hang in there somewhere also).
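And a matching v4-side sketch over one main ping's keyed histograms. The payload.keyedHistograms nesting, the use of each histogram's "sum" field as the tally, and the key list (per the caveat above) are all assumptions; main crashes arrive as separate crash pings per the doc linked above and would be counted from those instead:

V4_CRASH_HISTOGRAMS = {
    "SUBPROCESS_ABNORMAL_ABORT": ["plugin"],
    "SUBPROCESS_CRASHES_WITH_DUMP": ["plugin"],
    "PROCESS_CRASH_SUBMIT_ATTEMPT": ["main", "plugin"],
    "PROCESS_CRASH_SUBMIT_SUCCESS": ["main", "plugin"],
}

def v4_crash_counts(ping):
    # ping: one parsed v4 main ping JSON document.
    keyed = ping.get("payload", {}).get("keyedHistograms", {})
    out = {}
    for name, keys in V4_CRASH_HISTOGRAMS.items():
        for key in keys:
            hist = keyed.get(name, {}).get(key)
            if hist:
                # For count-style histograms the "sum" field is the tally.
                out[(name, key)] = hist.get("sum", 0)
    return out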
Flags: needinfo?(benjamin)
Updated•10 years ago
Iteration: --- → 42.3 - Aug 10
Whiteboard: [unifiedTelemetry] [data-validation][40b7] → [unifiedTelemetry] [data-validation]
Assignee
Comment 8•10 years ago
This bug has been superseded by the big 41 Beta comparison.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Updated•9 years ago
Product: Firefox Health Report → Firefox Health Report Graveyard