Last Comment Bug 755480 - Optimize WordSplitState::FindSpecialWord some more
: Optimize WordSplitState::FindSpecialWord some more
Status: RESOLVED FIXED
: perf
Product: Core
Classification: Components
Component: Spelling checker (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla15
Assigned To: :Ehsan Akhgari (busy, don't ask for review please)
:
Mentors:
: 782359 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-05-15 13:26 PDT by :Ehsan Akhgari (busy, don't ask for review please)
Modified: 2012-08-13 14:24 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Part 1 (1.42 KB, patch)
2012-05-15 13:35 PDT, :Ehsan Akhgari (busy, don't ask for review please)
roc: review+
Details | Diff | Review
Part 2 (1.32 KB, patch)
2012-05-15 13:36 PDT, :Ehsan Akhgari (busy, don't ask for review please)
roc: review+
Details | Diff | Review
Part 3 (2.03 KB, patch)
2012-05-15 13:40 PDT, :Ehsan Akhgari (busy, don't ask for review please)
no flags Details | Diff | Review
Part 3 (4.87 KB, patch)
2012-05-16 11:13 PDT, :Ehsan Akhgari (busy, don't ask for review please)
no flags Details | Diff | Review
Part 3 (4.69 KB, patch)
2012-05-17 07:49 PDT, :Ehsan Akhgari (busy, don't ask for review please)
roc: review+
Details | Diff | Review

Description :Ehsan Akhgari (busy, don't ask for review please) 2012-05-15 13:26:48 PDT
Bug 753233 fixed the suckiness of this function for data: URIs, but this function still stinks perf-wise.

1. It goes through some effort to compute foundDot, and then doesn't use it!
2. When it finds a colon, it keeps on looking at every other character, when it can just look at the next character, and bail out if it detects a URL-like thingy.
3. Worst of all, for strings without @ (and : if number 2 above is fixed), this is O(nm) where n is the number of characters in the string, and m is the number of words in the string.  This is because the caller hopelessly calls this over and over *even* if it returns -1, which means, "hey caller, there's nothing interesting anywhere in this string!".

Point #3 just caused me to have to kill Firefox after waiting a few minutes for it to hit the event loop again.  This is unacceptable!
Comment 1 :Ehsan Akhgari (busy, don't ask for review please) 2012-05-15 13:35:01 PDT
Created attachment 624170 [details] [diff] [review]
Part 1
Comment 2 :Ehsan Akhgari (busy, don't ask for review please) 2012-05-15 13:36:43 PDT
Created attachment 624172 [details] [diff] [review]
Part 2
Comment 3 :Ehsan Akhgari (busy, don't ask for review please) 2012-05-15 13:40:35 PDT
Created attachment 624173 [details] [diff] [review]
Part 3

This is the most valuable optimization here: don't call FindSpecialWord repeatedly if it returns -1 in an iteration.

Note that this is mostly reindenting existing code.
Comment 4 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-15 14:12:24 PDT
Comment on attachment 624173 [details] [diff] [review]
Part 3

Review of attachment 624173 [details] [diff] [review]:
-----------------------------------------------------------------

I think this would be clearer if FindSpecialWord simply became IsSpecialWord, returning bool, and SplitDOMWord just did AdvanceThroughSeparators() first, then called IsSpecialWord, and if not a special word then did the loop.
Comment 5 :Ehsan Akhgari (busy, don't ask for review please) 2012-05-16 11:13:00 PDT
Created attachment 624453 [details] [diff] [review]
Part 3
Comment 6 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-16 15:03:50 PDT
Comment on attachment 624453 [details] [diff] [review]
Part 3

Review of attachment 624453 [details] [diff] [review]:
-----------------------------------------------------------------

::: extensions/spellcheck/src/mozInlineSpellWordUtil.cpp
@@ +1009,5 @@
> +
> +    // skip the special word
> +    state.mDOMWordOffset += specialWordLength;
> +    if (state.mDOMWordOffset + aStart >= aEnd)
> +      state.mCurCharClass = CHAR_CLASS_END_OF_INPUT;

I think state.mDOMWordOffset is always equal to aEnd here. That means we can simplify the code. In fact, it's very confusing to have dead code here.
Comment 7 :Ehsan Akhgari (busy, don't ask for review please) 2012-05-17 07:49:25 PDT
Created attachment 624742 [details] [diff] [review]
Part 3
Comment 8 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-17 15:16:21 PDT
Comment on attachment 624742 [details] [diff] [review]
Part 3

Review of attachment 624742 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with that fixed.

::: extensions/spellcheck/src/mozInlineSpellWordUtil.cpp
@@ +1007,5 @@
> +    mRealWords.AppendElement(
> +        RealWord(aStart + state.mDOMWordOffset, specialWordLength, false));
> +
> +    // skip the special word
> +    state.mDOMWordOffset += specialWordLength;

No point in updating 'state' since it's local to this function and we're about to return!
Comment 11 :Ehsan Akhgari (busy, don't ask for review please) 2012-08-13 14:24:31 PDT
*** Bug 782359 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.