Closed Bug 1340115 Opened 3 years ago Closed 3 years ago

Cap the amount of history we import from Chrome

Categories

(Firefox :: Migration, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 54
Tracking Status
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: dao, Assigned: dao)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

This bug is about setting default values for the prefs introduced in bug 1339413 (browser.migrate.chrome.history.maxAgeInDays, browser.migrate.chrome.history.limit).
Blocks: 1249008
For the maximum age I'm going to propose half a year. I think this is generous in terms of user expectations. However I don't know how it will impact the amount history we'll import -- I have no idea when Chrome starts evicting history entries. Would be nice if we had telemetry for the age of the oldest entry.
Ideally I'd like to know more about the correlation between the time to import and the quantity we're importing and at what point we reach problematic timing (more than say 10s, assuming we don't hang the UI in which case the limit should be lower, I guess). But getting that information requires fiddling with sql.tmo. I wrote a start: 

SELECT FX_MIGRATION_HISTORY_IMPORT_MS, FX_MIGRATION_HISTORY_QUANTITY, client_id, LENGTH(REGEXP_EXTRACT(ARRAY_JOIN(
         -- Get the only non-0 migration source identifier:
         ARRAY_REMOVE(ARRAY_DISTINCT(fx_migration_source_browser), array[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])[1], ''),
       '^0*[1-9]')) AS browserid from longitudinal
   WHERE settings IS NOT NULL
     AND fx_migration_source_browser IS NOT NULL
     AND FX_MIGRATION_HISTORY_IMPORT_MS IS NOT NULL
     AND FX_MIGRATION_HISTORY_QUANTITY IS NOT NULL
     AND CARDINALITY(ARRAY_REMOVE(ARRAY_DISTINCT(fx_migration_source_browser), array[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])) = 1
     AND normalized_channel = 'release'
     AND LENGTH(REGEXP_EXTRACT(ARRAY_JOIN(
         -- Get the only non-0 migration source identifier:
         ARRAY_REMOVE(ARRAY_DISTINCT(fx_migration_source_browser), array[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])[1], ''),
       '^0*[1-9]')) = 6
     LIMIT 100


but keyed histograms are a pain to extract data from. You can look at the data like this, but you'd have to export it and then use a spreadsheet program to actually get something meaningful out in terms of the correlation.

I'm asking around in #telemetry to get some help to finish this query, but it might not happen until later today.
(In reply to :Gijs from comment #4)
> Ideally I'd like to know more about the correlation between the time to
> import and the quantity we're importing and at what point we reach
> problematic timing

Ideally yes... we could also land this, wait for Telemetry results, and keep tweaking the prefs as needed, but there will still be some handwaving and best guesses involved since it likely won't be easy to tell apart the effects of both prefs.
(In reply to Dão Gottwald [:dao] from comment #5)
> (In reply to :Gijs from comment #4)
> > Ideally I'd like to know more about the correlation between the time to
> > import and the quantity we're importing and at what point we reach
> > problematic timing
> 
> Ideally yes... we could also land this, wait for Telemetry results, and keep
> tweaking the prefs as needed, but there will still be some handwaving and
> best guesses involved since it likely won't be easy to tell apart the
> effects of both prefs.

https://sql.telemetry.mozilla.org/queries/2979

If you sort by time (in ms), you can see that even for as few as 3500 items we take 38 seconds to import. Sure, we might not be hung all that time, but it seems to me that this likely means that 10,000 items is still too large a cut-off. I'd prefer being more conservative with something like 3000. Combined with the sorting and time cutoff, I would expect this to still provide the user with all the sites they can recall using (frequently), without spending ages actually doing the import.
(In reply to :Gijs from comment #6)
> (In reply to Dão Gottwald [:dao] from comment #5)
> > (In reply to :Gijs from comment #4)
> > > Ideally I'd like to know more about the correlation between the time to
> > > import and the quantity we're importing and at what point we reach
> > > problematic timing
> > 
> > Ideally yes... we could also land this, wait for Telemetry results, and keep
> > tweaking the prefs as needed, but there will still be some handwaving and
> > best guesses involved since it likely won't be easy to tell apart the
> > effects of both prefs.
> 
> https://sql.telemetry.mozilla.org/queries/2979
> 
> If you sort by time (in ms), you can see that even for as few as 3500 items
> we take 38 seconds to import. Sure, we might not be hung all that time, but
> it seems to me that this likely means that 10,000 items is still too large a
> cut-off. I'd prefer being more conservative with something like 3000.
> Combined with the sorting and time cutoff, I would expect this to still
> provide the user with all the sites they can recall using (frequently),
> without spending ages actually doing the import.

Uh, and I totally misremembered what the patch did, clearly - you're using a 1300 item limit. Did the earlier patch on the other bug have a higher limit and is that why I'm confused?

Regardless, what's odd is that the 1300 doesn't seem to correspond to 80% of the userbase per that query...
(In reply to :Gijs from comment #7)
> Uh, and I totally misremembered what the patch did, clearly - you're using a
> 1300 item limit. Did the earlier patch on the other bug have a higher limit
> and is that why I'm confused?

It wasn't part of any patch but I initially suggested using 10,000 as the limit in bug 1339413.
So in my queries I'm seeing about 30-40% of people having more than 1300 items. Obviously we don't know how that number would be influenced if we also restricted the maximum age. It feels like 2000 items might be a reasonable compromise. What do you think, Dão?
Flags: needinfo?(dao+bmo)
Fine by me. We can reduce the number later if still needed.
Flags: needinfo?(dao+bmo)
Comment on attachment 8838020 [details]
Bug 1340115 - Cap the amount of history we import from Chrome.

https://reviewboard.mozilla.org/r/113022/#review115482
Attachment #8838020 - Flags: review?(gijskruitbosch+bugs) → review+
Pushed by dgottwald@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2262b2682f73
Cap the amount of history we import from Chrome. r=Gijs
https://hg.mozilla.org/mozilla-central/rev/2262b2682f73
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
Comment on attachment 8838020 [details]
Bug 1340115 - Cap the amount of history we import from Chrome.

Approval Request Comment
[Feature/Bug causing the regression]: /

[User impact if declined]: performance issues with history import

[Is this code covered by automated tests?]: looking at browser/components/migration/tests/unit/, it seems that there are no unit tests for history imports :/

[Has the fix been verified in Nightly?]: was tested manually before landing

[Needs manual test from QE? If yes, steps to reproduce]: /

[List of other uplifts needed for the feature/fix]: /

[Is the change risky?]: low risk

[Why is the change risky/not risky?]: the code behind these prefs was brand new, but now that it has baked for three weeks I think this is fairly safe

[String changes made/needed]: /
Attachment #8838020 - Flags: approval-mozilla-beta?
(In reply to Dão Gottwald [:dao] from comment #15)
> [Why is the change risky/not risky?]: the code behind these prefs was brand
> new, but now that it has baked for three weeks I think this is fairly safe

Closer to two weeks actually, but still.
Comment on attachment 8838020 [details]
Bug 1340115 - Cap the amount of history we import from Chrome.

Fix a performance issue when importing history. Beta53+.
Attachment #8838020 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Dão, should manual QA be looking at this? We're checking migration once per beta cycle, we could include this as well, if you think it would be of value.
Flags: qe-verify?
Flags: needinfo?(dao+bmo)
(In reply to Andrei Vaida, QA [:avaida] – please ni? me from comment #19)
> Dão, should manual QA be looking at this? We're checking migration once per
> beta cycle, we could include this as well, if you think it would be of value.

Yeah, probably a good idea.
Flags: needinfo?(dao+bmo)
Blocks: 1357448
You need to log in before you can comment on or make changes to this bug.