Closed Bug 1333233 Opened 4 years ago Closed 4 years ago

Add telemetry for the time it takes to undo initial automatic migration / import of data from other browsers, as well as any errors in that process

Categories

(Firefox :: Migration, defect)

53 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 54
Tracking Status
firefox52 --- fixed
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: Gijs, Assigned: Gijs)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

In bug 1331800 we noticed that sometimes 'undo' of the initial automatic import can take a substantial amount of time. To figure out how common that is and how highly we should prioritize making it faster, we would need some added telemetry.
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
(And Gijs mentioned we already have FX_STARTUP_MIGRATION_AUTOMATED_IMPORT_UNDO, for monitoring success/failures during the undo process)
(In reply to Justin Dolske [:Dolske] from comment #1)
> (And Gijs mentioned we already have
> FX_STARTUP_MIGRATION_AUTOMATED_IMPORT_UNDO, for monitoring success/failures
> during the undo process)

And as you then realized, now that we're try...catch-ing some of those, we won't necessarily know how frequent those are, so we should probably add telemetry for this as well.
Summary: Add telemetry for the time it takes to undo initial automatic migration / import of data from other browsers → Add telemetry for the time it takes to undo initial automatic migration / import of data from other browsers, as well as any errors in that process
Yeah. :/

I feel like I just mentioned this somewhere, but for counting the errors this feels similar to a problem we ran into with FX_MIGRATION_ERRORS... Sometimes you want to know "how many users encountered any error during the process", and other times you want to know "how many items failed".  Sort of a difference between "how widespread" and "how severe". FX_MIGRATION_ERRORS only tells us the former.

So maybe we want a similar FX_MIGRATION_UNDO_ERRORS, as well as FX_MIGRATION_*_UNDO_NUMERRORS probes for bookmarks/history/logins?

Actually, now that I think about it, another problem with FX_MIGRATION_ERRORS was that you have to compare it with something else to get error percentages (since it doesn't record successes). So maybe we should just unpack it too, into FX_MIGRATION_*_UNDO_HADERROR booleans?
Blocks: 1332318
(In reply to Justin Dolske [:Dolske] from comment #3)
> Yeah. :/
> 
> I feel like I just mentioned this somewhere, but for counting the errors
> this feels similar to a problem we ran into with FX_MIGRATION_ERRORS...
> Sometimes you want to know "how many users encountered any error during the
> process", and other times you want to know "how many items failed".  Sort of
> a difference between "how widespread" and "how severe". FX_MIGRATION_ERRORS
> only tells us the former.
> 
> So maybe we want a similar FX_MIGRATION_UNDO_ERRORS, as well as
> FX_MIGRATION_*_UNDO_NUMERRORS probes for bookmarks/history/logins?
> 
> Actually, now that I think about it, another problem with
> FX_MIGRATION_ERRORS was that you have to compare it with something else to
> get error percentages (since it doesn't record successes). So maybe we
> should just unpack it too, into FX_MIGRATION_*_UNDO_HADERROR booleans?

I ended up writing 1 histogram per resource type, and using the number of errors, which means '0' will also show up, which should make it possible to easily see how frequent any / some number of errors is (unlike for FX_MIGRATION_ERRORS, which I think uses a count histogram that only counts 1 for every time a migration produces any error(s), and keys a single histogram on the resource type instead of having 1 histogram per resource type keyed on the imported browser).
Comment on attachment 8830777 [details]
Bug 1333233 - part 2: add telemetry for the time it takes to do individual undo operations,

https://reviewboard.mozilla.org/r/107498/#review108950

data-r=me (review Histograms.json only)
Attachment #8830777 - Flags: review?(benjamin) → review+
Comment on attachment 8830776 [details]
Bug 1333233 - part 1: add telemetry for error counts from undo operations,

https://reviewboard.mozilla.org/r/107496/#review108952

data-r=me I suspect that in the future scalars will be cheaper/better for this, but given the newness and uplift to 52 the histograms are fine (I reviewed histograms.json only)
Attachment #8830776 - Flags: review?(benjamin) → review+
Comment on attachment 8830776 [details]
Bug 1333233 - part 1: add telemetry for error counts from undo operations,

https://reviewboard.mozilla.org/r/107496/#review109500
Attachment #8830776 - Flags: review?(dolske) → review+
Comment on attachment 8830777 [details]
Bug 1333233 - part 2: add telemetry for the time it takes to do individual undo operations,

https://reviewboard.mozilla.org/r/107498/#review109502
Attachment #8830777 - Flags: review?(dolske) → review+
(In reply to :Gijs from comment #6)

> I ended up writing 1 histogram per resource type, and using the number of
> errors, which means '0' will also show up, which should make it possible to
> easily see how frequent any / some number of errors is

I suppose a small downside of this is that the histogram chart on TMO isn't going to be useful (because the giant bar for "0" will dwarf everything else", but the table view should still be sufficient.
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/72a2e6016719
part 1: add telemetry for error counts from undo operations, r=bsmedberg,Dolske
https://hg.mozilla.org/integration/autoland/rev/1f8f9b5e74b0
part 2: add telemetry for the time it takes to do individual undo operations, r=bsmedberg,Dolske
https://hg.mozilla.org/mozilla-central/rev/72a2e6016719
https://hg.mozilla.org/mozilla-central/rev/1f8f9b5e74b0
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
Comment on attachment 8830776 [details]
Bug 1333233 - part 1: add telemetry for error counts from undo operations,

Approval Request Comment
[Feature/Bug causing the regression]: automigration that we use in funnelcakes
[User impact if declined]: we won't have good data about how fast/slow the undo process is, or how often there are issues with it, in the funnelcakes
[Is this code covered by automated tests?]: yes!
[Has the fix been verified in Nightly?]: automatically, yes, not manually
[Needs manual test from QE? If yes, steps to reproduce]: don't think it needs it given the automatic tests
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: no
[Why is the change risky/not risky?]: because this code is turned off for release (so this only affects the funnelcake) and there are automated tests for it
[String changes made/needed]: nope
Attachment #8830776 - Flags: approval-mozilla-beta?
Attachment #8830776 - Flags: approval-mozilla-aurora?
Blocks: 1335709
Just to make sure, the uplift request is for both patches, right?
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Julien Cristau [:jcristau] from comment #15)
> Just to make sure, the uplift request is for both patches, right?

Yes!
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #16)
> (In reply to Julien Cristau [:jcristau] from comment #15)
> > Just to make sure, the uplift request is for both patches, right?
> 
> Yes!

Bouncing ni back. :-)
Flags: needinfo?(jcristau)
Comment on attachment 8830776 [details]
Bug 1333233 - part 1: add telemetry for error counts from undo operations,

telemetry probe to count import undo errors, aurora53+, beta52+
Flags: needinfo?(jcristau)
Attachment #8830776 - Flags: approval-mozilla-beta?
Attachment #8830776 - Flags: approval-mozilla-beta+
Attachment #8830776 - Flags: approval-mozilla-aurora?
Attachment #8830776 - Flags: approval-mozilla-aurora+
Comment on attachment 8830777 [details]
Bug 1333233 - part 2: add telemetry for the time it takes to do individual undo operations,

telemetry probes for migration undo timing, aurora53+ and beta52+

Thanks Gijs!
Attachment #8830777 - Flags: approval-mozilla-beta+
Attachment #8830777 - Flags: approval-mozilla-aurora+
Comment on attachment 8830776 [details]
Bug 1333233 - part 1: add telemetry for error counts from undo operations,

Adds more telemetry data for automigration scenarios. Aurora53+
Depends on: 1389991
You need to log in before you can comment on or make changes to this bug.