Closed Bug 1387780 Opened 7 years ago Closed 7 years ago

Awesomebar autocomplete perf optimizations, take 2

Categories

(Toolkit :: Places, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: simon.lindholm10, Assigned: simon.lindholm10)

References

Details

(Keywords: perf)

Attachments

(2 files, 4 obsolete files)

Follow-up to bug 1287277. Felt reasonable to work a bit more on it, what with Firefox 57 and all.

When searching for a random three-letter string in the awesomebar, the upcoming patches improve performance of the SQL:y parts by ~8% and ~18% respectively on my machine (i5-4200U, Linux, gcc, -O2).

perf profiles before/after (a bit slow to load): http://bit.ly/2vCYoo8 http://bit.ly/2fik4iw
Do weird XPCOM things to avoid memory allocations. Not certain it's all kosher. (Feel free to forward review, I just picked the first name after mak in the git-log for SQLFunctions.cpp.)
Attachment #8894133 - Flags: review?(adw)
Do terrifying things with UTF-8 to speed up string searching. Not sure the "#ifdef DEBUG" assert is the right way to go, perhaps it should be a gtest in components/places/? or somewhere nearer to the rest of the Unicode stuff?

(I'm not sure the dotted-i case can even occur -- sqlite seems to normalize it to a plain I and a U+0307 COMBINING DOT ABOVE before we get there. And again, feel free to forward reviewer.)
Attachment #8894134 - Flags: review?(adw)
At the cost of some sanity, this patch cuts the runtime of the same testcase by another 40-50%. OTOH, it doesn't help at all for cases where non-boundary matches are not required. I'll ponder a bit more on whether this is worth proposing before flagging it for review or feedback...

(The patch works, but is a bit incomplete: it needs tests, some comments, and a proper place to put a global variable. Also, benchmarking to make sure for the case of boundary matches doesn't regress -- my intuition is that effects should be marginal, but that might be wrong.)
Disregard the last patch -- performance is okay, but it's broken due to SQL asynchronicity and "LIMIT 10". It's salvageable in theory, but in ugly enough ways that I won't go there...

Another idea is to decrease browser.urlbar.delay slightly, now that searches are faster.
Attachment #8894377 - Attachment is obsolete: true
Comment on attachment 8894133 [details] [diff] [review]
Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction

Review of attachment 8894133 [details] [diff] [review]:
-----------------------------------------------------------------

The only thing I'm unsure about with this patch is the XPCOM/refcount hack, and my knowledge here is limited.  What exactly are we avoiding here?  vtable lookups due to XPCOM incrementing and decrementing the ArgValueArray's refcount, ...?
Comment on attachment 8894134 [details] [diff] [review]
Part 2: Optimize string searching in MatchAutoCompleteFunction

Review of attachment 8894134 [details] [diff] [review]:
-----------------------------------------------------------------

Cool, this looks good to me.  r+ on the Places parts with the comments below.  I can't r+ the intl changes though (even though they're basically costmetic and LGTM), so you'll need to get an intl person to r+ that part.  I think that includes :masayuki, but I'm not sure.

::: toolkit/components/places/SQLFunctions.cpp
@@ +35,5 @@
>  
>    /**
> +   * Scan forward through UTF-8 text until the next potential character that
> +   * could match a given codepoint when lower-cased (false positives are okay).
> +   * This avoids having to actually parse the UTF-8 text, which is slow.

Please add @param's and a @return.

@@ +69,5 @@
> +        // and search for it in the interval [aStart, cur).
> +        unsigned char special = (aSearchFor == 'i' ? 0xc4 : 0xe2);
> +        const_char_iterator lim = cur;
> +        cur = aStart;
> +        while (cur != lim && (unsigned char)(*cur) != special) {

Couldn't you do this as part of the first loop and avoid two passes?

@@ +84,5 @@
>      return cur;
>    }
>  
> +#ifdef DEBUG
> +  // The above method assumes that the only non-ASCII characters that

Yes, this definitely doesn't belong here and should be in a test.  toolkit/components/places is where it should go I think, since your code here depends on it.  Please include a code comment in it explaining what it's testing and why.  Is an xpcshell test possible for this?  (You could just add a test task to the test file you're already modifying.)  gtest would be fine if not, it's just that we don't write compiled tests often.

@@ +104,5 @@
> +  /**
> +   * Check whether a character position (which must *not* be the first byte) is
> +   * on a word boundary of a UTF-8 string.  We define "within word" to be any
> +   * position between [a-zA-Z] and [a-z] -- this lets us match CamelCase words.
> +   * TODO: support non-latin alphabets.

Please add a @param and @return.

@@ +118,5 @@
> +  }
> +
> +  static
> +  MOZ_ALWAYS_INLINE bool
> +  stringMatch(const_char_iterator aTokenStart, const_char_iterator aTokenEnd,

Please add a doc comment with @param's and a @return.  Also, I think our style is still to put each param on its own new line.

@@ +131,5 @@
> +
> +      bool error;
> +      if (!CaseInsensitiveUTF8CharsEqual(sourceCur, tokenCur,
> +                                          aSourceEnd, aTokenEnd,
> +                                          &sourceCur, &tokenCur, &error)) {

Nit: Looks like these lines are indented an extra space.
Attachment #8894134 - Flags: review?(adw) → review+
(In reply to Simon Lindholm from comment #1)
> Created attachment 8894133 [details] [diff] [review]
> Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction
> 
> Do weird XPCOM things to avoid memory allocations.

Is it just not possible to create the ArgValueArray on the stack without making the changes you made?
> Is it just not possible to create the ArgValueArray on the stack without making the changes you made?

Indeed it's not: we get an error message from NS_IMPL_ADDREF -> MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING, requiring that the destructor is private. It also seems pretty sketchy in general to have a Release that does "delete this;" that we must manipulate refcounts to avoid calling.

(I'm also a bit unsure about whether this is way you're supposed to do it.)


> Couldn't you do this as part of the first loop and avoid two passes?

I suppose. I'll do some experiments to see how perf varies.

> Is an xpcshell test possible for this?

I suspect not; from what I can tell mozilla::unicode::GetLowercase is not JS-exposed, and .toLowerCase() uses another algorithm (although their results match in this case). I'll add a new gtest then.
(In reply to Simon Lindholm from comment #8)
> (I'm also a bit unsure about whether this is way you're supposed to do it.)

Me too.  Could you ask someone who knows XPCOM to review that part?  Maybe bsmedberg, I'm not sure who.  But he could at least redirect better than I can.  I can r+ the rest of the patch.  I'll clear the r? for now until the next version with the test.

> I suspect not; from what I can tell mozilla::unicode::GetLowercase is not
> JS-exposed, and .toLowerCase() uses another algorithm (although their
> results match in this case). I'll add a new gtest then.

OK.
Attachment #8894133 - Flags: review?(adw)
(In reply to Drew Willcoxon :adw from comment #9)
> (In reply to Simon Lindholm from comment #8)
> > (I'm also a bit unsure about whether this is way you're supposed to do it.)

The alternative of course is to deXPCOMify, but I know that's like pulling a loose thread and may not even be practical in this case.
Updated with comments, a gtest, and some tiny changes to nextSearchCandidate and findInString.

:masayuki - are you able to review the intl parts?
Attachment #8894134 - Attachment is obsolete: true
Attachment #8897138 - Flags: review?(masayuki)
Attachment #8897138 - Flags: review?(adw)
Comment on attachment 8894133 [details] [diff] [review]
Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction

Forwarding review to :bsmedberg; main question is whether the XPCOM stuff here is sane.
Attachment #8894133 - Flags: review?(benjamin)
Comment on attachment 8897138 [details] [diff] [review]
Part 2: Optimize string searching in MatchAutoCompleteFunction

I'm not familiar with intl module. Kimura-san, how about you? Or do you know a good reviewer for them?  Otherwise, I'll try to review this.
Attachment #8897138 - Flags: review?(masayuki) → review?(VYV03354)
Comment on attachment 8897138 [details] [diff] [review]
Part 2: Optimize string searching in MatchAutoCompleteFunction

Review of attachment 8897138 [details] [diff] [review]:
-----------------------------------------------------------------

r=me about the intl/ part.
Attachment #8897138 - Flags: review?(VYV03354) → review+
Could someone please add "perf" key word?

Thank you.
Keywords: perf
Comment on attachment 8897138 [details] [diff] [review]
Part 2: Optimize string searching in MatchAutoCompleteFunction

Review of attachment 8897138 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.  Please do push this to tryserver before landing.
Attachment #8897138 - Flags: review?(adw) → review+
Comment on attachment 8894133 [details] [diff] [review]
Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction

Off-hand manual addref is something we should be moving away from, so I don't think we should be doing it.
I also think (just read by chance on twitter since I'm on PTO atm) that Benjamin is leaving Mozilla, so forwarding to Nathan.
Attachment #8894133 - Flags: review?(benjamin) → review?(nfroyd)
Comment on attachment 8894133 [details] [diff] [review]
Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction

Review of attachment 8894133 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with with the comment change below.  It looks like Drew wanted to re-review the patch after the XPCOM changes were looked at, so please have Drew or another storage peer give this a once-over too.

::: storage/mozStorageArgValueArray.h
@@ +19,5 @@
>  {
>  public:
>    ArgValueArray(int32_t aArgc, sqlite3_value **aArgv);
>  
> +  NS_DECL_ISUPPORTS_INHERITED  // no mRefCnt

This is a little gross, but being about to stack allocate this class without giving up the XPCOM-ness is nice.  The only worrisome thing is if somebody were to keep a reference to the mozIStorageValueArray...and then the stack frame containing ArgValueArray was destroyed.  I think this is unlikely, and there are probably pre-existing issues if you held on to one of these objects--I think the sqlite3_value objects would be stale at some point, even if there were still outstanding references to ArgValueArray?

So all in all, this is probably no worse than the situation we already have.

::: toolkit/components/places/SQLFunctions.cpp
@@ +339,5 @@
> +    : mCachedZero(new IntegerVariant(0))
> +    , mCachedOne(new IntegerVariant(1))
> +  {
> +    static_assert(IntegerVariant::HasThreadSafeRefCnt::value,
> +        "Caching assumes that variants have thread-safe refcounting");

Nice, thank you.

::: toolkit/components/places/SQLFunctions.h
@@ +76,5 @@
>    ~MatchAutoCompleteFunction() {}
>  
>    /**
> +   * Cached IntegerVariant values for 0 and 1, respectively.  This function
> +   * is called a lot when using the awesomebar, so this is a nice perf win.

This comment refers to "this function", but it's commenting on member variables.  Perhaps we should say something like "IntegerVariants for 0 and 1 are frequently used in awesomebar queries, so we cache them to avoid allocating memory repeatedly."

@@ +79,5 @@
> +   * Cached IntegerVariant values for 0 and 1, respectively.  This function
> +   * is called a lot when using the awesomebar, so this is a nice perf win.
> +   */
> +  nsCOMPtr<nsIVariant> mCachedZero;
> +  nsCOMPtr<nsIVariant> mCachedOne;

Also, for my own edification, the logic here is that we're going to create a MatchAutoCompleteFunction once, hold onto it in JS, and then call it many times while requests are being made?  And then a single request (or several requests re-using the same MatchAutoCompleteFunction) are going to potentially create many of these integers?
Attachment #8894133 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] from comment #18)
> This comment refers to "this function", but it's commenting on member
> variables.  Perhaps we should say something like "IntegerVariants for 0 and
> 1 are frequently used in awesomebar queries, so we cache them to avoid
> allocating memory repeatedly."

That sounds much better, thanks.  ("this function" was meant to refer to MatchAutoCompleteFunction, but I agree it's very confusing.)

> 
> @@ +79,5 @@
> > +   * Cached IntegerVariant values for 0 and 1, respectively.  This function
> > +   * is called a lot when using the awesomebar, so this is a nice perf win.
> > +   */
> > +  nsCOMPtr<nsIVariant> mCachedZero;
> > +  nsCOMPtr<nsIVariant> mCachedOne;
> 
> Also, for my own edification, the logic here is that we're going to create a
> MatchAutoCompleteFunction once, hold onto it in JS, and then call it many
> times while requests are being made?  And then a single request (or several
> requests re-using the same MatchAutoCompleteFunction) are going to
> potentially create many of these integers?

Right.  We create just a single MatchAutoCompleteFunction per database connection (http://searchfox.org/mozilla-central/rev/13148faaa91a1c823a7d68563d9995480e714979/toolkit/components/places/Database.cpp#1304 -> http://searchfox.org/mozilla-central/rev/13148faaa91a1c823a7d68563d9995480e714979/toolkit/components/places/SQLFunctions.cpp#200 ), and then call it in the worst case ~2e5 times per awesomebar keystroke (http://searchfox.org/mozilla-central/rev/13148faaa91a1c823a7d68563d9995480e714979/toolkit/components/places/UnifiedComplete.js#171 ).  So this eliminates ~2e5 mallocs/frees per search.  (Every call to MatchAutoCompleteFunction::OnFunctionCall returns either 0 or 1, wrapped as a variant.)
Comment on attachment 8894133 [details] [diff] [review]
Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction

Re-flagging :adw for review.
Attachment #8894133 - Flags: review?(adw)
Hm, try shows crashes in storage/test/unit/test_storage_{aggregates,connection,function}.js... Will have to debug.

( https://treeherder.mozilla.org/#/jobs?repo=try&revision=d419710fab2ce6cf339de2eebaffe1b0ccd0f9e3 )
Comment on attachment 8894133 [details] [diff] [review]
Part 1: Avoid memory allocations when calling  MatchAutoCompleteFunction

Cancelling review in the meantime. (I'm getting the feeling that XPConnect might just not like stack allocation.)
Attachment #8894133 - Flags: review?(adw)
The problem is that those tests implement mozIStorageFunction from JS, and exposing stack-allocated objects to JS breaks when XPC wrappers get asynchronously garbage collected and try to destroy their wrappees.

I think the right thing to do is probably to make mozIStorageFunction non-[scriptable]? JS-implemented functions should be avoided anyway, since they cannot be called from off-thread, and only those specific tests use them (and they could be converted to C++). A few addons also made use of JS-implemented functions, but that's irrelevant now. After rewriting the tests deCOMtaminating mozIStorageFunction would also be viable.
I ran out of steam with the converting-tests-to-C++ thing, so let's do without the funky xpcom stack allocation for now. It's only a couple % win anyway.

Try looks greener without that change (and with a small eslint fix): https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c8032f67f32623ad80a231c8a6adac72aa7e6e4
Attachment #8894133 - Attachment is obsolete: true
Attachment #8899299 - Flags: review?(adw)
(with a small eslint test fix)
Attachment #8897138 - Attachment is obsolete: true
Attachment #8899300 - Flags: review+
(In reply to Simon Lindholm from comment #23)
> The problem is that those tests implement mozIStorageFunction from JS, and
> exposing stack-allocated objects to JS breaks when XPC wrappers get
> asynchronously garbage collected and try to destroy their wrappees.

Oof.

> I think the right thing to do is probably to make mozIStorageFunction
> non-[scriptable]? JS-implemented functions should be avoided anyway, since
> they cannot be called from off-thread, and only those specific tests use
> them (and they could be converted to C++). A few addons also made use of
> JS-implemented functions, but that's irrelevant now. After rewriting the
> tests deCOMtaminating mozIStorageFunction would also be viable.

All of this would make an excellent followup bug, especially because once we have deCOMtamination, we can pass C++ references and dispense with the refcounting nonsense entirely.  Would you please file that?
Sure, bug 1392238. (Not sure I'll have time to work on it.)
Comment on attachment 8899299 [details] [diff] [review]
Part 1: Avoid memory allocations when calling MatchAutoCompleteFunction

Review of attachment 8899299 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8899299 - Flags: review?(adw) → review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/04bd1ff1c8dd
Avoid memory allocations when calling MatchAutoCompleteFunction. r=adw, r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/d956bbef1b93
Optimize string searching in MatchAutoCompleteFunction. r=adw, r=emk
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/04bd1ff1c8dd
https://hg.mozilla.org/mozilla-central/rev/d956bbef1b93
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Depends on: 1500053
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: