Closed Bug 1393486 Opened 7 years ago Closed 7 years ago

Avoid joining with moz_openpages_temp when we don't need to

Categories

(Toolkit :: Places, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: simon.lindholm10, Assigned: simon.lindholm10)

Details

Attachments

(1 file, 7 obsolete files)

Noticed this opportunity for optimization when doing performance-testing for bug 1320301.

This reduces awesomebar search times by ~9% in my tests (i5-4200U, Linux, gcc, -O2, with an SSD), with the same methodology as in bug 1287277.
Attachment #8900769 - Flags: review?(mak77)
This is a bit surprising, since the only difference is passing t.open_count to AUTOCOMPLETE_MATCH... The SQL function itself doesn't save any time, since it still has to extract the argument and does a simple check on the value.

The query is still joining. I suppose the win comes from the fact sqlite doesn't need to join before the AUTOCOMPLETE_MATCH call, but only at the end, when results have been fetched already.
Comment on attachment 8900769 [details] [diff] [review]
Avoid joining with moz_openpages_temp when we don't need to

Review of attachment 8900769 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/UnifiedComplete.js
@@ +167,5 @@
>  // TODO bug 412736: in case of a frecency tie, we might break it with h.typed
>  // and h.visit_count.  That is slower though, so not doing it yet...
>  // NB: as a slight performance optimization, we only evaluate the "btitle"
> +// and "tags" queries for bookmarked entries.  Similarly, we only join with
> +// moz_openpages_temp if we need to.

The comment should be changed, since we always join moz_openpages_temp it looks confusing. The difference, I think, is that the query can do the join after filtering results, rather than before.

@@ +1993,5 @@
>     * @return a string consisting of the search query to be used based on the
>     * previously set urlbar suggestion preferences.
>     */
>    get _suggestionPrefQuery() {
> +    let restrictOpen = this.hasBehavior("openpage");

I have a doubt, the default settings should always have "openpage", so the cases where we get a win look limited? Which kind of search did you use to measure the win? Which settings?

I can see there's a win for the "restrict"+"openpage" case, and when the settings are not default (switch to tab is disabled), but I don't see a general win off.hand.
> The comment should be changed, since we always join moz_openpages_temp it looks confusing. The difference, I think, is that
> the query can do the join after filtering results, rather than before.

Ah, yeah, fair. Indeed sqlite does the join lazily, only on rows that match.

> I have a doubt, the default settings should always have "openpage", so the cases where we get a win look limited?

... ah, they do, I had totally missed that. :(  Good catch.  My benchmarks just compared the previous query to one with t.open_count replaced by 0, without considering hasBehavior("openpage") -- I added that check later...  Indeed, optimizing for the non-default case wouldn't be very useful...

So the patch *almost* works with the default settings, if we replace "restrictOpen = this.hasBehavior("openpage");" by essentially "restrictOpen = this.hasBehavior("restrict") || (non-default settings);".  In that case, the condition we check in AUTOCOMPLETE_MATCH:

      matches = (HAS_BEHAVIOR(HISTORY) && visitCount > 0) ||
                (HAS_BEHAVIOR(TYPED) && typed) ||
                (HAS_BEHAVIOR(BOOKMARK) && bookmark) ||
                (HAS_BEHAVIOR(TAG) && !tags.IsVoid()) ||
                (HAS_BEHAVIOR(OPENPAGE) && openPageCount > 0);

is basically always true when h.frecency <> 0, which the SQL query checks.  But making it exactly behavior-preserving is a bit tricky...  One option, although it's ugly, is to set

  restrictOpen = this.hasBehavior("restrict") || !this.hasBehavior("history") || !this.hasBehavior("bookmark");

and then in the happy case of !restrictOpen let openCountFragment equal

  CASE WHEN h.visit_count > 0 OR bookmarked THEN 0 ELSE t.open_count END

I.e., we only pass a correct open_count if we need to, because the item was neither bookmarked nor visited.

A nicer but more invasive option is to move all the match criteria into SQL, and make sure that the openpage one comes last.  That's probably a clarity win, too, and might even improve performance if we're lucky, by needing less sqlite-to-C++ marshalling.  (Or make it worse.  You never know with these things.)
Comment on attachment 8900769 [details] [diff] [review]
Avoid joining with moz_openpages_temp when we don't need to

Cancelling review for now.
Attachment #8900769 - Flags: review?(mak77)
This moves the matching logic from C++ into SQL, and should preserve existing behavior if I didn't mess anything up.  The perf improvement even seems to have gone from ~9% to ~20%.  (I really hope that holds up in the real world...  I find it hard to spot previous optimization attempts in telemetry data.)
Attachment #8900769 - Attachment is obsolete: true
Attachment #8901444 - Flags: review?(mak77)
Comment on attachment 8901444 [details] [diff] [review]
Move awesomebar behavior filtering from C++ to SQL

Review of attachment 8901444 [details] [diff] [review]:
-----------------------------------------------------------------

The only risk I oversee here, is that Sqlite decides to evaluate the custom function first, and then it will do expensive string matching before simple integers checks. The previous setup ensured that simple checks always bailed out before we'd even start string matching.
Hopefully it's not the case, since it doesn't know anything about the custom function, while it should know something about the other conditions. I didn't find much documentation about this kind of choices by the query optimizer.

How much of the gain we'd retain if we'd only factored out openpages and leave the rest as-is? Asking mostly because overall the patch looks a little bit behavioral-change risky and limiting it a bit while retaining most of the win would be the best. It depends on where most of the benefit is.

::: toolkit/components/places/UnifiedComplete.js
@@ +2038,5 @@
> +      }
> +
> +      // Further, if just one of "history" and "bookmark" is set, it is also
> +      // a hard requirement.
> +      // XXX do we need this?

What's the doubt exactly?

@@ +2091,5 @@
> +        (this.hasBehavior("history") || this.hasBehavior("bookmark") ||
> +         this.hasBehavior("typed") || this.hasBehavior("tag"))) {
> +      // We are trying to restrict based on something we have no data about.
> +      // That can't match anything, so bail.
> +      return ["SELECT 0 WHERE 0", {}];

This looks a bit hackish, the code using _switchToTabQuery can easily check the return value, so looks like this could just return null.
Attachment #8901444 - Flags: review?(mak77)
(In reply to Marco Bonardo [::mak] from comment #7)

> The only risk I oversee here, is that Sqlite decides to evaluate the custom
> function first, and then it will do expensive string matching before simple
> integers checks. The previous setup ensured that simple checks always bailed
> out before we'd even start string matching.
> Hopefully it's not the case, since it doesn't know anything about the custom
> function, while it should know something about the other conditions. I
> didn't find much documentation about this kind of choices by the query
> optimizer.

I don't see this as a big problem, since in practice with default settings the simple checks almost always pass.  I could move the conditions in front of the AUTOCOMPLETE_MATCH check, though, to help with non-default settings or restrict mode.  Doing so is a slight performance decrease (hinting that checks are always performed in left-to-right order).

> 
> How much of the gain we'd retain if we'd only factored out openpages and
> leave the rest as-is? Asking mostly because overall the patch looks a little
> bit behavioral-change risky and limiting it a bit while retaining most of
> the win would be the best. It depends on where most of the benefit is.

When I test with four different queries:

1) the one in this patch
2) same, but also passes four unnecessary parameters* to AUTOCOMPLETE_MATCH: h.visit_count, h.typed, bookmarked, t.open_count
3) same, but also passes three unnecessary parameters to AUTOCOMPLETE_MATCH: h.visit_count, h.typed, bookmarked
4) same, but with simple conditions before AUTOCOMPLETE_MATCH instead of after

I get the following measurements (in us, for a short query with just three matches):

1) 215724.25925925927 +/- 15351.957909340026
2) 274652.7011494253 +/- 16830.248680455185
3) 252748.57142857142 +/- 19056.42216961872
4) 216592.3563218391 +/- 18326.3744821577

I.e., most of the time seems to be spent by sqlite just trying to pass parameters to AUTOCOMPLETE_MATCH.  Which is a bit confusing, honestly, and I might try to look at its source code and see if I can make sense of it...

(* I modified AUTOCOMPLETE_MATCH to take, extract AsInt32 and ignore up to 4 extra parameters.  When not passing unnecessary parameters I pass a 0 instead.)

> 
> ::: toolkit/components/places/UnifiedComplete.js
> @@ +2038,5 @@
> > +      }
> > +
> > +      // Further, if just one of "history" and "bookmark" is set, it is also
> > +      // a hard requirement.
> > +      // XXX do we need this?
> 
> What's the doubt exactly?

Just seemed like a weird and that useful piece of logic.  Not that I really mind keeping it.
Attached patch [DO NOT LAND] Benchmarking patch (obsolete) — Splinter Review
> (In reply to Simon Lindholm from comment #8)
> I.e., most of the time seems to be spent by sqlite just trying to pass parameters to AUTOCOMPLETE_MATCH.  Which is a bit
> confusing, honestly, and I might try to look at its source code and see if I can make sense of it...

Hm, no, it isn't parameter passing which is slow, it really is the joins (verified by dumping sqlite bytecode with ".eqp full").  Both "t.open_count" and "bookmarked" are slow to evaluate and pass to AUTOCOMPLETE_MATCH.  "bookmarked" doesn't get memoized as I had initially imagined -- rather, each time it's referred to sqlite does a new lookup for it, which takes some time.  This also suggests another optimization: passing tags and title as a single parameter, getting rid of a slow "CASE WHEN bookmarked THEN ... ELSE '' END".  With that I have the following numbers:

1) 266570.8 +/- 20386.110577911746
2) 239870.9090909091 +/- 9489.987891373714
3) 211100.16666666666 +/- 5078.600736580444
4) 204969.375 +/- 7342.3221498891135
5) 175506.69491525425 +/- 8207.641169913584

where

1) passes h.visit_count, h.typed, bookmarked, t.open_count to AUTOCOMPLETE_MATCH (roughly matching current behavior)
2) passes h.visit_count, h.typed, bookmarked
3) passes h.visit_count, h.typed
4) passes nothing
5) passes nothing, and replaces

CASE WHEN bookmarked THEN
  IFNULL(btitle, h.title)
ELSE h.title END,
CASE WHEN bookmarked THEN
  tags
ELSE '' END,

by

CASE WHEN bookmarked THEN
  tags || ' ' || IFNULL(btitle, h.title)
ELSE h.title END,
'',

(which is behavior-preserving except for the edge cases where tags is invalid UTF-8, or when the string length exceeds 255.  We could bump the latter limit.)
Attached patch Fix review comments (interdiff) (obsolete) — Splinter Review
(for what it's worth. with a small optimization to sqlAdaptiveQuery added.)
Attached patch Alternative minimal patch (obsolete) — Splinter Review
This patch spot-fixes the query, instead of refactoring the world.  It doesn't win quite as much; I get the following timings:

1) 259250.52631578947 +/- 10857.55311345958
2) 231151.54761904763 +/- 14119.546712961768
3) 207262.9702970297 +/- 15939.349040291769

where

1) is before
2) is with the change to pass bookmarks and tags together
3) is like 2) but also avoids passing 'bookmarked' and 't.open_count' in the common case where hasBehavior("history"), !hasBehavior("restrict"), and the entry has h.visit_count > 0

So a ~20% win, to compare with the ~35% one with the full patch set (not entirely sure why the difference is so large, but I imagine it might come from doing more expensive SQL things?).  I've also somewhat come to like how minimal AUTOCOMPLETE_MATCH becomes with the refactoring.
ni? mak for opinions.
Flags: needinfo?(mak77)
Attached patch Another minimal patch (obsolete) — Splinter Review
This variation is probably better.  Only avoids repeatedly evaluating "bookmarked", but that seems to have been the main cost anyway (funny how bugs morph).  Timings are about the same as in comment 13, i.e., a ~20% win in the case I'm testing.
Attachment #8903364 - Attachment is obsolete: true
I'm sorry, I'm a little bit overloaded with reviews for 57, thus I couldn't look at everything here yet. As soon as the "emergency" is gone I'll be back to you. If I understood correctly the latest minimal patch covers most of the wins from the more complex one?
Totally understood, and this isn't exactly urgent.  The latest patch wins ~20%, while the more complex one won ~35% when I measured it.  One option would be to put the complex one into another bug and look at it after 57.
Hm yes, there are a bit too many patches, and we're unlikely to have time to address eventual regressions at this point.
I'd say to build the smaller scoped patch you can, and move on with that, leaving other investigations to a separate bug we could look into at the beginning of the 58 cycle.
The latest patch using an external CASE WHEN looks safe and if the benefit is the measured one we can surely take it.
Flags: needinfo?(mak77)
> The latest patch using an external CASE WHEN looks safe and if the benefit is the measured one we can surely take it.

I'll flag the patch for review with a reasonable commit message, then.  I'm rather sure the benefit is real in the case I'm measuring, so I do believe it's worth taking.  On the other hand, it's hard to see any effect of the similar bug 1287277 and bug 1387780 on the TELEMETRY_1ST_RESULT and TELEMETRY_6_FIRST_RESULTS telemetry probes, so I suspect this one might not show up either.  I have two hypotheses as to why:

- optimizing search times means that more searches finish in time, and thus don't cancel their stopwatches. This could cause timing measurements to increase as well as decrease.

- a majority of searches finish really quickly, so the majority of slowdowns are due to external load (e.g. slow disk) rather than slow CPU stuff.
Attachment #8901444 - Attachment is obsolete: true
Attachment #8903115 - Attachment is obsolete: true
Attachment #8903359 - Attachment is obsolete: true
Attachment #8903362 - Attachment is obsolete: true
Attachment #8904367 - Attachment is obsolete: true
Attachment #8906255 - Flags: review?(mak77)
Comment on attachment 8906255 [details] [diff] [review]
Reduce subquery evaluations in awesomebar search SQL

Review of attachment 8906255 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM, thank you!
Sorry, I don't recall if you have Try access, if so please push to try running xpcshell, firefox-ui-functional and mochitest-browser-e10s, and set checkin-needed once ready. Otherwise, just needinfo me.
Attachment #8906255 - Flags: review?(mak77) → review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5dd42a492295
Reduce subquery evaluations in awesomebar search SQL. r=mak
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/5dd42a492295
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: