Closed Bug 1214154 Opened 9 years ago Closed 4 years ago

Investigate migration error rates after bug 731025 has had enough time to gather data

Categories

(Firefox :: Migration, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking


RESOLVED WORKSFORME
Tracking Status
firefox44 --- affected

People

(Reporter: Gijs, Unassigned)

References

Details

After bug 731025 has gathered data for a while, we should be able to look at the data to assess what kind of error rates we're seeing in the wild. Things that would be suspicious:

0) 0 errors. We should at least double-check the code in this case.
1) errors for every attempted migration.
2) no migrations occurring for particular browsers (or all migrations being related to one browser)
3) no reports from particular entry points.
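
The four checks above could be sketched as a small function once per-browser and per-entry-point counts have been extracted from the telemetry data. This is only an illustration: the function name, the input dictionaries, and the browser and entry-point labels are hypothetical, and it assumes `failed_by_browser` counts migrations that reported at least one error.

```python
from typing import Dict, List


def suspicious_findings(
    attempts_by_browser: Dict[str, int],
    failed_by_browser: Dict[str, int],
    reports_by_entry_point: Dict[str, int],
    expected_browsers: List[str],
    expected_entry_points: List[str],
) -> List[str]:
    """Return one message per suspicious condition (hypothetical helper)."""
    findings: List[str] = []
    attempts = sum(attempts_by_browser.values())
    failed = sum(failed_by_browser.values())

    # 0) no errors at all, and 1) an error for every attempted migration
    if attempts > 0 and failed == 0:
        findings.append("0) no errors at all: double-check the probe code")
    if attempts > 0 and failed >= attempts:
        findings.append("1) every attempted migration reported an error")

    # 2) browsers with no migrations, or a single browser accounting for all
    missing = [b for b in expected_browsers if attempts_by_browser.get(b, 0) == 0]
    if missing:
        findings.append("2) no migrations from: " + ", ".join(missing))
    active = [b for b, n in attempts_by_browser.items() if n > 0]
    if len(active) == 1:
        findings.append("2) all migrations come from: " + active[0])

    # 3) entry points that never reported
    silent = [e for e in expected_entry_points if reports_by_entry_point.get(e, 0) == 0]
    if silent:
        findings.append("3) no reports from entry points: " + ", ".join(silent))
    return findings
```

An empty result means none of the four conditions was triggered; it does not by itself mean the error rates are acceptable.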


I can take point on this bug, and will put an alert in my calendar so I know when to deal with it. I imagine this will be 2-3 weeks after bug 731025 hits release channels.
I played around with the telemetry UI yesterday, but I couldn't find a good way to compare the _USAGE and other histogram information with the _ERROR information. What I'm after is the proportion of users that hit errors, rather than the overall number of errors (which is less helpful). Bryan and Blake, you seem to be more knowledgeable on this front. Do we need to create some kind of custom visualization? If so, how and where do we do this?
Flags: needinfo?(clarkbw)
Flags: needinfo?(bwinton)
Are you trying to understand the % of users that hit an error during a migration? You can use Spark to collect the clientIDs of pings with a _USAGE histogram and the clientIDs of pings with a _ERROR histogram. Then you can figure out the % of users that had an error during migration.

You can also use get_one_ping_per_client if you're ok with sampling pings.
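
As a rough sketch of that set arithmetic, here is the same idea in plain Python over an in-memory list of pings. It does not use the Spark or Telemetry APIs, and the ping shape (a dict with `clientId` and `histograms` keys) is an assumption made for illustration; the histogram names are the ones mentioned in this bug.

```python
from typing import Any, Dict, Iterable, Optional, Set

Ping = Dict[str, Any]  # assumed shape: {"clientId": str, "histograms": {name: ...}}


def clients_with(pings: Iterable[Ping], histogram: str) -> Set[str]:
    """Client IDs of pings that carry the named histogram."""
    return {p["clientId"] for p in pings if histogram in p.get("histograms", {})}


def percent_clients_with_errors(
    pings: Iterable[Ping],
    usage_histogram: str = "FX_MIGRATION_USAGE",
    error_histogram: str = "FX_MIGRATION_ERRORS",
) -> Optional[float]:
    """Percentage of migrating clients that also reported an error."""
    pings = list(pings)  # allow two passes even if a generator was passed in
    migrated = clients_with(pings, usage_histogram)
    errored = clients_with(pings, error_histogram)
    if not migrated:
        return None  # no denominator, so no meaningful percentage
    # Only count errors from clients that also appear in the denominator,
    # so the result stays between 0 and 100.
    return 100.0 * len(errored & migrated) / len(migrated)
```

In a Spark job the two client-ID sets would presumably come from filtering and de-duplicating the ping dataset rather than from a Python list, but the final division is the same.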

Everyone at Mozilla has access to Spark. Telemetry Spark tutorial docs:

* This tutorial is pretty user-friendly https://wiki.mozilla.org/Platform/GFX/Telemetry
* Another tutorial http://robertovitillo.com/2015/01/16/next-gen-data-analysis-framework-for-telemetry/
* A simple sample analysis: http://nbviewer.ipython.org/gist/vitillo/e1813025e7d26d640c80
* Another sample analysis: https://github.com/vitillo/e10s_analyses/blob/master/aurora/experiment_branches.ipynb
* Ask questions on #telemetry

Knowing how to do custom Telemetry analyses in Spark is definitely a useful skill to have. If this is time-sensitive, you can try asking one of the people in #telemetry to write this analysis for you.
(I think this probably answers your questions, Gijs, so I'm clearing the needinfo…  ;)
Flags: needinfo?(clarkbw)
Flags: needinfo?(bwinton)
You can also just create your own dashboard and use telemetry.js to get the numbers for the 2 histograms. You'd have to do the arithmetic yourself to get the ratio.
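
The arithmetic itself is small. A minimal sketch in Python, assuming each histogram has already been fetched as a mapping from bucket to count (the function name and input shape are hypothetical, and this is not the telemetry.js API):

```python
from typing import Dict, Optional


def error_ratio(
    error_buckets: Dict[int, int], usage_buckets: Dict[int, int]
) -> Optional[float]:
    """Total error submissions divided by total usage submissions."""
    usage_total = sum(usage_buckets.values())
    if usage_total == 0:
        return None  # avoid dividing by zero when nothing was recorded
    return sum(error_buckets.values()) / usage_total
```

Note that this compares submission counts, not distinct users, so it answers a slightly different question from the per-client percentage.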
Not working on this.
Assignee: gijskruitbosch+bugs → nobody
I added queries for FX_MIGRATION_ERRORS and FX_MIGRATION_USAGE to https://sql.telemetry.mozilla.org/dashboard/fx_migration which can be used as a starting point for this bug.

I noticed that FX_MIGRATION_USAGE is only recorded for non-startup migrations, so I think the denominator actually needs to be FX_MIGRATION_SOURCE_BROWSER, which I believe is recorded for both startup and non-startup migrations.
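
A sketch of what that denominator change might look like, assuming the error counts can be broken down per source browser and that the FX_MIGRATION_SOURCE_BROWSER data has been reduced to a per-browser migration count. Both input dictionaries and the function name are hypothetical:

```python
from typing import Dict


def error_rate_by_browser(
    errors_by_browser: Dict[str, int],
    source_browser_counts: Dict[str, int],
) -> Dict[str, float]:
    """Errors per migration for each browser that has at least one migration."""
    rates: Dict[str, float] = {}
    for browser, migrations in source_browser_counts.items():
        if migrations > 0:  # skip browsers with no recorded migrations
            rates[browser] = errors_by_browser.get(browser, 0) / migrations
    return rates
```

Because the denominator covers both startup and non-startup migrations, the numerator would need to cover both as well for the rate to be meaningful.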

I still look at this dashboard.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME