Closed Bug 1211984 Opened 9 years ago Closed 9 years ago

Cache frecency query for Top Sites to prevent 2 second delay in display top site items

Categories

(Firefox for iOS :: Home screen, defect)

All
iOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
fxios 1.2+ ---

People

(Reporter: sleroux, Assigned: sleroux)

References

Details

(Keywords: perf)

Attachments

(1 file)

Every time we display the Top Sites Panel, we wait for a fresh copy of the data from the frecency query before displaying anything on the screen. We need to figure out a better way of display this content instantly without waiting for a long DB query every time. Some solutions could include:

1. Speeding up the frecency query. From what I know it's a pretty hefty query so this doesn't seem likely.

2. Display a cached version of the query. The results from the frecency query will change less the more the user visits websites which means that after extended use of the app we are essentially receiving the same results back every time we make the query.

In going with query 2, we would need to come up with some cache heuristics as to how and when we invalidate it.
Depends on: 1211991
No longer depends on: 1211991
Keywords: perf
Whiteboard: [perf]
Assignee: nobody → sleroux
Status: NEW → ASSIGNED
My only advice is to make sure we add robust, not fragile, mechanisms. Using duct tape and band-aids to fix performance issues will cause pain in the future. Speaking from experience.

If you're going to cache, think about any other code paths that might use the frecency query now or later. Making a caching mechanism that improves performance lower in the layers might benefit more consumers.
More discussion is here:

https://gist.github.com/sleroux/0340f37154ec915a563b

My perspective here is about balance.

The right solution, as I see it, is to compute and store fine-grained frecency for each URL, which makes both the top sites query *and* awesomebar search faster.

That involves schema changes, a migration, and figuring out when and how to compute and update these. Desktop has some inspiration to offer. It's not rocket surgery.

That said, I think it's worth also investigating caching if we are confident that (a) it won't be much work, and (b) it'll buy us big wins with a simple approach, not whack-a-mole.

If we can come up with a simple heuristic for when to snapshot, when to invalidate that snapshot, and we're done without significant regressions… great! Let's do that to make Top Sites snappy for now.
Would it make sense to borrow the heuristic that desktop is using? One idea that Richard and I were throwing around was calculating the frecency at a fixed future timestamp which would allow us to materialize the frecency values into a table for Top Sites. We would then to purge the table at that future timestmap and recalculate. From what I know, desktop does a similar thing where they calculate an approximate value for frecency on a daily(?) interval.
(In reply to Stephan Leroux [:sleroux] from comment #3)
> Would it make sense to borrow the heuristic that desktop is using?

In that I think we should move to bucketed persisted per-URL frecency, yeah :D


> that Richard and I were throwing around was calculating the frecency at a
> fixed future timestamp which would allow us to materialize the frecency
> values into a table for Top Sites. We would then to purge the table at that
> future timestmap and recalculate. From what I know, desktop does a similar
> thing where they calculate an approximate value for frecency on a daily(?)
> interval.

That's part of persisting frecency, yeah -- on desktop we use idle-daily to update the stored value for each URL in the DB, using buckets for scoring.

I think we were talking about doing something strictly simpler, as a pure cache -- a separate table that we fill with the results of the unchanged current query.

So rather than

  func getTopSites {
    return SELECT ... FROM ... ORDER BY $complicated
  }

we'd do

  func getTopSites {
    if (not recent enough) or (no cached top sites) {
      DELETE FROM cached_top_sites
      INSERT INTO cached_top_sites (SELECT ... FROM ... ORDER BY $complicated)
    }
    return SELECT * FROM cached_top_sites
  }
      

That's *roughly* equivalent to storing precomputed frecency in a separate table, but it's simpler, in that neither the consuming code nor the frecency query need to change at all -- we just add a layer of indirection.

I think if we spend the time to persist individual-URL frecency, then we're close enough to just doing the right thing and making it part of the main data model.
Thanks Richard - that clarifies it more. I don't see the harm in adding that patch for now to increase performance of top sites while we work on the database schema changes as more of a long term project.
In-progress work on caching frecency results for a faster Top Sites Panel.
So far,

1. Added cached_top_sites table to store frecency values
2. Added getTopSitesWithLimit method to SQLiteHistory to return all top sites from cache if it's valid and dump/recalculate cache when cache is stale (> 1 day old).

To do:

1. Need to hook into the various history events to update our cache (updateVisit, insertVisit, remove, etc). Not sure if hooking into Swift methods is the best way or adding triggers to the visits/history table would be best.
2. Instead of using 'now' for the frecency calculations, it needs to be set to a time in the future for the approximation otherwise the calculations will be wrong. Just need to refactor the frecency query methods to pass in a timestamp (which will be a day ahead of our last cached time).

Richard, do you have any thoughts/concerns/suggestions about the current approach and where to go to next?
Attachment #8680048 - Flags: feedback?(rnewman)
(In reply to Stephan Leroux [:sleroux] from comment #6)

> 1. Need to hook into the various history events to update our cache
> (updateVisit, insertVisit, remove, etc). Not sure if hooking into Swift
> methods is the best way or adding triggers to the visits/history table would
> be best.

Our writes are high-volume, so I'd be inclined to not do this with DB triggers just yet. Perhaps after a history sync to start with, and then we can focus on solving invalidation for the new-user case.

> 2. Instead of using 'now' for the frecency calculations, it needs to be set
> to a time in the future for the approximation otherwise the calculations
> will be wrong. Just need to refactor the frecency query methods to pass in a
> timestamp (which will be a day ahead of our last cached time).

The calculations will be wrong, but not in a way that matters. (We should really graph the current calculation!)

> Richard, do you have any thoughts/concerns/suggestions about the current
> approach and where to go to next?

Comments on the PR. Test and measure. Make sure we don't set the timestamp at the wrong moment -- we should blank it on purge and set it on update, always.
Attachment #8680048 - Flags: feedback?(rnewman) → feedback+
Thanks for the feedback. I've updated the PR with the suggested fixes and added some tests as well for the top sites stuff.

> Our writes are high-volume, so I'd be inclined to not do this with DB triggers just yet. Perhaps after a > history sync to start with, and then we can focus on solving invalidation for the new-user case.

Invalidation after syncing history works but I think we still need a way of invalidating in the new-user/non-FxA cases for an MVP of this feature since not having Top Sites update for a day isn't acceptable. Even something like invalidating on app launches would work for now.
Actually - I can probably hook into the applicationDidBecomeActive callback in BrowserProfile
Right now, we show top sites and block until we get results.

I think this is strictly better and also simple:

When loading top sites, show cached results if any exist. If not, show only suggested sites. This matches our first run behavior: yay!

If:
  - There is no cache, or
  - Any pages have been visited since last cache update, or
  - This is the first launch of the browser, or
  - History has synced since last cache update
Then:
  - In the background, invalidate the cache and trigger a redisplay.


In the first run case, then, we'll show suggested sites. In the regular use case, we'll show your old top sites and almost never will they change… but they might change a little half a second after we show them. That seems fine to me.

Additionally: when clearing history, or on any other significant event that causes us to purge the cache, also immediately recompute in the background. That'll make sure our cache is up to date when large changes occur.

Thoughts?
That sounds more robust and makes sense. To clarify, we would be moving away from a 'cache this once a day' to:

1. If any of the events listed occur, set a flag indicating the cache is stale
2. When visiting TopSitesPanel, show cached TopSites immediately
3. If cache is stale, invalidate cache and redisplay with new data set

In the cases of significant events such as clear history/app launches, the invalidation is immediate.

I like this idea of splitting up smaller events into delayed updates and only using immediate invalidation in places where it makes sense like clear history.
The downside I see is the overhead of adding in these invalidation/stale state changes across the app vs something automatic like a DB trigger.
Comment on attachment 8680048 [details] [review]
https://github.com/mozilla/firefox-ios/pull/1196

I've updated the PR to use explicit invalidation as mentioned in the previous comments. Works pretty well! The only slowness I'm seeing now is the layout code for the cells - the cache is fast.
Attachment #8680048 - Flags: feedback+ → feedback?(rnewman)
Just curious: Is the new cache table for TopSites only, or can it be used for the filter-as-you-type frecency too?
(In reply to Mark Finkle (:mfinkle) from comment #14)
> Just curious: Is the new cache table for TopSites only, or can it be used
> for the filter-as-you-type frecency too?

Just top sites. We don't see such significant performance issues with the awesomebar, because the text filter can dramatically cut down the solution space before joining and ordering. That also doesn't kill our time-to-display for the home page, so it's less urgent.

The more thorough solution discussed earlier in this bug will improve both.
This looks great. The very first time it runs on an existing profile, we get a flicker from just suggested tiles, then we show your real top sites.

A possible fix for that is to check to see if we have a cache -- look at the timestamp pref? -- and decide to wait instead of showing the empty cached values.

Over to Robin to see what she thinks of that flicker.
Attachment #8680048 - Flags: ui-review?(randersen)
Attachment #8680048 - Flags: review+
Attachment #8680048 - Flags: feedback?(rnewman)
Not sure what the best way to fix this would be. We don't have a timestamp pref key anymore so we can't rely on that. It's also difficult to differentiate between a the cache not existing and it returning no results. Does this flicker only happen once whenever the user first loads the app with the caching patch and has a profile? If that's the case then it should only occur once in the history of the app while we populate the cache for the first time. I wonder if it's best to keep this instead of finding an exception to our 'show then refresh' logic in Top Sites panel?
Comment on attachment 8680048 [details] [review]
https://github.com/mozilla/firefox-ios/pull/1196

I don't get a flicker at all from first run on top sites, in fact, nothing happens at all until I switch panels and come back. In this example, http://c.tecgirl.com/dhSk, I waited 20+s, switched to Bookmarks, back to Top Sites, back to Bookmarks, then back to Top Sites in order to see them load. No flicker, just a slight delay in loading the favicons, which doesn't bother me. 

Is this expected behavior? I hope not. So I'll + for no flicker, but not happy about having to hop panels for Top Sites to load.
Attachment #8680048 - Flags: ui-review?(randersen) → ui-review+
That's no different to what we have now; you're just wandering through a flow we don't normally see.

What's happening there:

* You signed in to Sync. You have a big account. Ten seconds from now we'll have downloaded your passwords and 2,000 bookmarks. Thirty seconds from now we'll have downloaded your 5,000+ history items, which we do in one go (Bug 1168856).
* At t=20 you switch to bookmarks, and can see Desktop Bookmarks.
* At t=25 you switch back to Top Sites. We haven't finished downloading history yet, so you see only suggested sites.
* At t=30 you switch back and forth again. You see nothing (because the cache is empty). We kicked off a query. When it's done (500msec later) we finish populating the cache and show your top sites. You see them appear.

There are two immediate rough edges:

1. We don't redisplay home panels if a sync finishes while you're on them. We probably should do so for Top Sites (it would reset scroll position for history and bookmarks, which would suck).

2. We don't display any indication that your useless top sites will eventually be filled with wonder. That's a problem with or without Sync, to be honest.


Our first sync experience is pretty painful. We should consider showing some kind of progress somehow.
Merged

c5c7d7162edf2ba7013bc1e023138b8f8d6f9935
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Let's file separate bugs for any UI paper cuts we want to address for sync feedback
Follow-up for bustage:

73bc812
Hardware: Other → All
Depends on: 1222459
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: