Closed Bug 1274029 Opened 3 years ago Closed 3 years ago

Regression in Top Sites loading performance

Categories

(Firefox for Android :: Data Providers, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 49
Tracking Status
firefox48 --- wontfix
firefox49 --- fixed
fennec 49+ ---

People

(Reporter: Grisha, Assigned: Grisha)

References

(Blocks 1 open bug)

Details

Attachments

(8 files)

Bug 1265525 landed changes which improved quality/relevance of data shown in Top Sites. A more expensive query was required, which has slowed down loading of the Top Sites panel. Looking at the telemetry data that was gathered since those changes landed (on May 5) we see that, roughly speaking, the new query is twice as slow 95% of the time [0].

FFiOS's approach to address similar performance problems was to cache query results, display cached results immediately, and invalidate/recalculate cache in the background at certain points. While not perfect - see discussion on Bug 1211984 - this seems like a sensible solution to address performance concerns without too much work.

My main concern is figuring out all of the trigger points for cache invalidation. Again, Bug 1211984 has bunch of thinking on the subject.

[0] https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2016-05-16&keys=__none__!__none__!__none__&max_channel_version=nightly%252F49&measure=FENNEC_TOPSITES_LOADER_TIME_MS&min_channel_version=null&product=Fennec&sanitize=1&sort_keys=submissions&start_date=2016-05-06&table=0&trim=1&use_submission_date=0
tracking-fennec: --- → ?
Alternative to caching of query results: simplifying the combined view itself.

We can eliminate the need for the JOIN on the Visits table if add the following aggregates to the History table:
- local_visits
- remote_visits
- remote_last_visited
- (already have local_last_visited - it's date_last_visited)

We already keep "visits" aggregate column around, so this is inline with that pattern.
When we need to update these aggregates:
- local_visits: local browsing, android import.
- remote_visits, remote_last_visited: sync
- all: clearing of data

This will benefit both Top Sites as well as Search queries, and should bring their performance to pre-Bug 1265525 levels.

To me this seems like a more elegant solution than caching - it's also simple and won't open up a potential can of worms that is keeping our cache up-to-date, invalidating it, etc.
Grisha mentioned to me that we may be able to fix this with a change to the query, instead of implementing caching. The main thing here is that we need to fix this performance regression in 48.
tracking-fennec: ? → 48+
Summary: Cache combined view to improve Top Sites loading performance → Regression in Top Sites loading performance
tracking-fennec: 48+ → 49+
Review commit: https://reviewboard.mozilla.org/r/56462/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/56462/
Attachment #8758081 - Flags: review?(s.kaspari)
Attachment #8758082 - Flags: review?(s.kaspari)
Attachment #8758083 - Flags: review?(s.kaspari)
Attachment #8758084 - Flags: review?(s.kaspari)
Attachment #8758085 - Flags: review?(s.kaspari)
Attachment #8758086 - Flags: review?(s.kaspari)
Comment on attachment 8758081 [details]
MozReview Request: Bug 1274029 - Pre: Fix v32.db test db, set combined view to what it should be at this time r=sebastian

https://reviewboard.mozilla.org/r/56462/#review53050

What was the problem with the database?
Attachment #8758081 - Flags: review?(s.kaspari) → review+
Attachment #8758082 - Flags: review?(s.kaspari) → review+
Comment on attachment 8758082 [details]
MozReview Request: Bug 1274029 - Part 1: Database migration; initial aggregates calculation r=sebastian

https://reviewboard.mozilla.org/r/56464/#review53052

Do you know how expensive this migration is and if it will affect app startup?

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserDatabaseHelper.java:1889
(Diff revision 1)
> +        db.execSQL("UPDATE " + TABLE_HISTORY + " SET " +
> +                History.LOCAL_VISITS + " = (" +
> +                    "SELECT COALESCE(SUM(" + qualifyColumn(TABLE_VISITS, Visits.IS_LOCAL) + "), 0)" +
> +                        " FROM " + TABLE_VISITS +
> +                        " WHERE " + qualifyColumn(TABLE_VISITS, Visits.HISTORY_GUID) + " = " + qualifyColumn(TABLE_HISTORY, History.GUID) +
> +                ") " +
> +                History.REMOTE_VISITS + " = (" +
> +                    "SELECT COALESCE(SUM(CASE " + Visits.IS_LOCAL + " WHEN 0 THEN 1 ELSE 0 END), 0)" +
> +                        " FROM " + TABLE_VISITS +
> +                        " WHERE " + qualifyColumn(TABLE_VISITS, Visits.HISTORY_GUID) + " = " + qualifyColumn(TABLE_HISTORY, History.GUID) +
> +                ") " +

Does this work without a comma after every expression? -> http://sqlite.org/lang_update.html
Comment on attachment 8758083 [details]
MozReview Request: Bug 1274029 - Part 2: Update aggregates during local browsing r=sebastian

https://reviewboard.mozilla.org/r/56466/#review53054
Attachment #8758083 - Flags: review?(s.kaspari) → review+
https://reviewboard.mozilla.org/r/56462/#review53050

Ah, I forgot to add details and there's no diff here :). v32.db's version of the combined view was one that would only appear in v33. When I made v32.db, I took v31 as a basis and ran necessary sql migrations manually - so I must have screwed that up. The fixed version attached here has a correct version of the combined view.

Not that this currently makes a difference for migration testing... But good to fix up regardless.
https://reviewboard.mozilla.org/r/56464/#review53052

Query should be pretty fast. It's doing a bunch of selects, but they're all indexed. I've ran it on a few data sets, my own browsing history, an old browsing profile with 70k+ visits, and it was almost instant. I'll do a few more tests, but I think it shouldn't affect that initial post-update startup time too much.

> Does this work without a comma after every expression? -> http://sqlite.org/lang_update.html

Indeed, missed the commas.
Comment on attachment 8758082 [details]
MozReview Request: Bug 1274029 - Part 1: Database migration; initial aggregates calculation r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56464/diff/1-2/
Comment on attachment 8758083 [details]
MozReview Request: Bug 1274029 - Part 2: Update aggregates during local browsing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56466/diff/1-2/
Comment on attachment 8758084 [details]
MozReview Request: Bug 1274029 - Part 3: Update aggregates during syncing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56468/diff/1-2/
Comment on attachment 8758085 [details]
MozReview Request: Bug 1274029 - Part 4: Update aggregates during history import from android browser r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56470/diff/1-2/
Comment on attachment 8758086 [details]
MozReview Request: Bug 1274029 - Part 5: Unit tests for aggregate updates r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56472/diff/1-2/
Comment on attachment 8758082 [details]
MozReview Request: Bug 1274029 - Part 1: Database migration; initial aggregates calculation r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56464/diff/2-3/
Comment on attachment 8758083 [details]
MozReview Request: Bug 1274029 - Part 2: Update aggregates during local browsing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56466/diff/2-3/
Comment on attachment 8758084 [details]
MozReview Request: Bug 1274029 - Part 3: Update aggregates during syncing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56468/diff/2-3/
Comment on attachment 8758085 [details]
MozReview Request: Bug 1274029 - Part 4: Update aggregates during history import from android browser r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56470/diff/2-3/
Comment on attachment 8758086 [details]
MozReview Request: Bug 1274029 - Part 5: Unit tests for aggregate updates r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56472/diff/2-3/
Comment on attachment 8758084 [details]
MozReview Request: Bug 1274029 - Part 3: Update aggregates during syncing r=sebastian

https://reviewboard.mozilla.org/r/56468/#review53122

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/android/AndroidBrowserHistoryDataAccessor.java:139
(Diff revision 3)
> +    final ContentValues remoteVisitAggregateValues = new ContentValues();
> +    final Uri historyIncrementRemoteAggregateUri = getUri().buildUpon()
> +            .appendQueryParameter(BrowserContract.PARAM_INCREMENT_REMOTE_AGGREGATES, "true")
> +            .build();
>      for (Record record : records) {
>        HistoryRecord rec = (HistoryRecord) record;
>        if (rec.visits != null && rec.visits.size() != 0) {
> -        context.getContentResolver().bulkInsert(
> +        int remoteVisitsInserted = context.getContentResolver().bulkInsert(
>                  BrowserContract.Visits.CONTENT_URI,
>                  VisitsHelper.getVisitsContentValues(rec.guid, rec.visits)
>          );
> +
> +        // If we just inserted any visits, update remote visit aggregate values.
> +        // While inserting visits, we might not insert all of rec.visits - if we already have a local
> +        // visit record with matching (guid,date), we will skip that visit.
> +        // Remote visits aggregate value will be incremented by number of visits inserted.
> +        // Note that we don't need to set REMOTE_DATE_LAST_VISITED, because it already gets set above.
> +        if (remoteVisitsInserted > 0) {
> +          remoteVisitAggregateValues.put(BrowserContract.History.REMOTE_VISITS, remoteVisitsInserted);
> +          context.getContentResolver().update(
> +                  historyIncrementRemoteAggregateUri,
> +                  remoteVisitAggregateValues,
> +                  BrowserContract.History.GUID + " = ?", new String[] {rec.guid}
> +          );
> +        }

Above you throw if PARAM_INCREMENT_REMOTE_AGGREGATES is set but BrowserContract.History.REMOTE_VISITS is empty. However here you technically always add PARAM_INCREMENT_REMOTE_AGGREGATES but only conditonally add BrowserContract.History.REMOTE_VISITS.
Attachment #8758084 - Flags: review?(s.kaspari) → review+
Attachment #8758085 - Flags: review?(s.kaspari) → review+
Comment on attachment 8758085 [details]
MozReview Request: Bug 1274029 - Part 4: Update aggregates during history import from android browser r=sebastian

https://reviewboard.mozilla.org/r/56470/#review53584
Attachment #8758086 - Flags: review?(s.kaspari) → review+
Comment on attachment 8758086 [details]
MozReview Request: Bug 1274029 - Part 5: Unit tests for aggregate updates r=sebastian

https://reviewboard.mozilla.org/r/56472/#review53056

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/db/BrowserProviderHistoryTest.java:267
(Diff revision 3)
> +            final int vCol = c.getColumnIndexOrThrow(BrowserContract.History.VISITS);
> +            final int lvCol = c.getColumnIndexOrThrow(BrowserContract.History.LOCAL_VISITS);
> +            final int rvCol = c.getColumnIndexOrThrow(BrowserContract.History.REMOTE_VISITS);
> +            final int ldlv = c.getColumnIndexOrThrow(BrowserContract.History.LOCAL_DATE_LAST_VISITED);
> +            final int rdlv = c.getColumnIndexOrThrow(BrowserContract.History.REMOTE_DATE_LAST_VISITED);

nit: This looks like obfuscated code to me. Is there a particular reason for this brevity? :)
Nice!

Before landing this I'd reconsider the risk as this will not be maintained for a while; especially as this migration is riding the trains. It's also an option to leave the patches to be picked up by someone who can watch out for issues and fix them as they come up.
(In reply to Sebastian Kaspari (:sebastian) from comment #29)
> Nice!
> 
> Before landing this I'd reconsider the risk as this will not be maintained
> for a while; especially as this migration is riding the trains. It's also an
> option to leave the patches to be picked up by someone who can watch out for
> issues and fix them as they come up.

I agree we need to think about risk, but I'm not sure we can afford to not land this, given the performance regression that's currently in the tree.

I think we do need to land this, and we need to also commit to dealing with fallout bugs that might appear. Fennec stability and performance is still super important.
https://reviewboard.mozilla.org/r/56468/#review53122

> Above you throw if PARAM_INCREMENT_REMOTE_AGGREGATES is set but BrowserContract.History.REMOTE_VISITS is empty. However here you technically always add PARAM_INCREMENT_REMOTE_AGGREGATES but only conditonally add BrowserContract.History.REMOTE_VISITS.

.update is also called conditionally (on the same condition as setting REMOTE_VISITS), so I feel that this should be fine. I'll add a clarifying comment.
https://reviewboard.mozilla.org/r/56472/#review53056

> nit: This looks like obfuscated code to me. Is there a particular reason for this brevity? :)

No good reason :)
(In reply to :Margaret Leibovic from comment #30)
> (In reply to Sebastian Kaspari (:sebastian) from comment #29)
> > Nice!
> > 
> > Before landing this I'd reconsider the risk as this will not be maintained
> > for a while; especially as this migration is riding the trains. It's also an
> > option to leave the patches to be picked up by someone who can watch out for
> > issues and fix them as they come up.
> 
> I agree we need to think about risk, but I'm not sure we can afford to not
> land this, given the performance regression that's currently in the tree.
> 
> I think we do need to land this, and we need to also commit to dealing with
> fallout bugs that might appear. Fennec stability and performance is still
> super important.

Happy to keep tabs on any issues that might arise from these patches.

De-risking is specifically why I went with this approach (vs caching). The bugs might be subtler if aggregates get out of sync for some reason (e.g. strange ordering of items in the Top Sites), but at least nothing should be crashing. Worst case, it's very cheap to regenerate these aggregates from the visits table, if we have to - it's a simple SQL query that we can run whenever.

Another concern is performance of the migration - how meta! My tests so far show that this shouldn't be a problem. If it demonstrates itself as "too slow" in the wild, we can always move the slow part (calculating aggregates) out of the migration itself - see previous paragraph.
Filed some semi-relevant follow up issues:

- Bug 1277330
- Bug 1277329
Comment on attachment 8758081 [details]
MozReview Request: Bug 1274029 - Pre: Fix v32.db test db, set combined view to what it should be at this time r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56462/diff/1-2/
Comment on attachment 8758082 [details]
MozReview Request: Bug 1274029 - Part 1: Database migration; initial aggregates calculation r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56464/diff/3-4/
Comment on attachment 8758083 [details]
MozReview Request: Bug 1274029 - Part 2: Update aggregates during local browsing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56466/diff/3-4/
Comment on attachment 8758084 [details]
MozReview Request: Bug 1274029 - Part 3: Update aggregates during syncing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56468/diff/3-4/
Comment on attachment 8758085 [details]
MozReview Request: Bug 1274029 - Part 4: Update aggregates during history import from android browser r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56470/diff/3-4/
Comment on attachment 8758086 [details]
MozReview Request: Bug 1274029 - Part 5: Unit tests for aggregate updates r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56472/diff/3-4/
Comment on attachment 8758082 [details]
MozReview Request: Bug 1274029 - Part 1: Database migration; initial aggregates calculation r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56464/diff/4-5/
Comment on attachment 8758083 [details]
MozReview Request: Bug 1274029 - Part 2: Update aggregates during local browsing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56466/diff/4-5/
Comment on attachment 8758084 [details]
MozReview Request: Bug 1274029 - Part 3: Update aggregates during syncing r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56468/diff/4-5/
Comment on attachment 8758085 [details]
MozReview Request: Bug 1274029 - Part 4: Update aggregates during history import from android browser r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56470/diff/4-5/
Comment on attachment 8758086 [details]
MozReview Request: Bug 1274029 - Part 5: Unit tests for aggregate updates r=sebastian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56472/diff/4-5/
Comment on attachment 8758841 [details]
MozReview Request: Bug 1274029 - Pre 2: Correct old comments about combined view r=grisha

https://reviewboard.mozilla.org/r/56964/#review53714
Attachment #8758841 - Flags: review?(gkruglov) → review+
https://hg.mozilla.org/integration/fx-team/rev/d0288a0ca5ce65b439a69be3cce5b01dc1d2b42f
Bug 1274029 - Pre: Fix v32.db test db, set combined view to what it should be at this time r=sebastian

https://hg.mozilla.org/integration/fx-team/rev/348b9ccd0585aa0ad77a05009490a16c7f7e65a0
Bug 1274029 - Pre 2: Correct old comments about combined view r=grisha

https://hg.mozilla.org/integration/fx-team/rev/28b948e9cad9f93cab45cdb8947784352af5b25b
Bug 1274029 - Part 1: Database migration; initial aggregates calculation r=sebastian

https://hg.mozilla.org/integration/fx-team/rev/ce62fcd4df4d3c7dd31c92dd7d73d736a043cb0e
Bug 1274029 - Part 2: Update aggregates during local browsing r=sebastian

https://hg.mozilla.org/integration/fx-team/rev/4b6a54c20fff6cc1205a7098e403b2b27bf82c2a
Bug 1274029 - Part 3: Update aggregates during syncing r=sebastian

https://hg.mozilla.org/integration/fx-team/rev/ea45ad32acf03ea698f60f9c9346e9ce5b43e183
Bug 1274029 - Part 4: Update aggregates during history import from android browser r=sebastian

https://hg.mozilla.org/integration/fx-team/rev/4ecfc31fd9bddd906532482338273b752a37cf12
Bug 1274029 - Part 5: Unit tests for aggregate updates r=sebastian
Status: NEW → ASSIGNED
This needs to be uplifted together with Bug 1265525, it largely fixes the performance regression introduced by those changed.

Performance improvements should be enough to bring us to before-Bug 1265525 levels. Telemetry data concurs, 95th percentile for top sites query is down from ~1000ms to ~500ms. (see data from June 2 and on):

- top sites query performance: https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median!95th-percentile&cumulative=0&end_date=2015-09-11&keys=&max_channel_version=nightly%252F50&measure=FENNEC_TOPSITES_LOADER_TIME_MS&min_channel_version=nightly%252F46&product=Fennec&sanitize=1&sort_keys=submissions&start_date=2015-08-10&trim=1&use_submission_date=0

- search result query performance: https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median!95th-percentile&cumulative=0&end_date=2015-09-11&keys=&max_channel_version=nightly%252F50&measure=FENNEC_SEARCH_LOADER_TIME_MS&min_channel_version=nightly%252F46&product=Fennec&sanitize=1&sort_keys=submissions&start_date=2015-08-10&trim=1&use_submission_date=0

[Feature/regressing bug #]: Bug 1265525

[User impact if declined]: Bug 1265525 introduced a performance regression. Top Sites are slower to load, awesomebar search results are slower to generate, etc. If Bug 1265525 is uplifted without these changes, user impact is noticeable performance regression.

[Describe test coverage new/current, TreeHerder]: Unit tests cover how we manage newly added aggregate counts, and test for regressions to BrowserProvider history/visits layers.

[Risks and why]: medium; changes were made to how we track visit counts locally, during import from the Android browser and while syncing. Unit tests are in place to help manage the risk, and manual testing + close to a week on nightly shows that things are going well so far.

[String/UUID change made/needed]: none
Attachment #8761368 - Flags: approval-mozilla-beta?
I don't think we should take this patch. With bug 1265525, it is a lot of changes for beta.
Could we instead backout the feature in beta and let it ride the train from 49?
Flags: needinfo?(gkruglov)
This needs uplifting only if Bug 1265525 is uplifted. We either let both of these ride the train for 49, or uplift both of these. If we choose to have this work in 49, no backing out is necessary. Data layer changes currently in 48 (which Bug 1265525 builds on top of) are fine to leave by themselves - they won't provide visible value, but user's profile history/visits data will be improving in the meantime.

Ideally, we want to wrap up this work as quickly as possible and resolve any issues now while these patches are fresh, vs in a month from now (49 hits beta in august). My vote is still to uplift, if possible, and commit to addressing any fallout swiftly. Telemetry data is showing good performance numbers, and (so far) no regressions regarding top sites, syncing, etc have been reported (although granted, possible issues will be subtle).
Flags: needinfo?(gkruglov)
Thinking of this another way, if we uplift we won't be able to easily back these changes out due to DB migrations. It certainly feels safer letting this ride the train to 49, vs. running a risk of creating a db migration mess on beta.
OK, let it ride the train from 49 then. Thanks!
Attachment #8761368 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in before you can comment on or make changes to this bug.