Closed Bug 1303749 Opened 8 years ago Closed 5 years ago

Spell checker might be using sync IPC messages for every word.

Categories

(Core :: Spelling checker, defect, P1)

50 Branch
defect

Tracking

()

RESOLVED DUPLICATE of bug 1502661
Iteration:
54.2 - Feb 20
Performance Impact medium
Tracking Status
e10s + ---

People

(Reporter: mconley, Unassigned)

References

(Blocks 4 open bugs)

Details

(Keywords: perf, perf:responsiveness, Whiteboard: [e10s-multi:-])

Attachments

(4 files)

jrmuizel just noticed this today:

https://dxr.mozilla.org/mozilla-central/rev/f0f15b7c6aa77a0c5750918aa0a1cb3dc82185bc/extensions/spellcheck/hunspell/glue/PRemoteSpellcheckEngine.ipdl#15

which is called here:

http://searchfox.org/mozilla-central/rev/f6c298b36db67a7109079c0dd7755f329c1d58e2/extensions/spellcheck/src/mozSpellChecker.cpp#139

I'm not too familiar with our spell checking infrastructure, but the implication seems to be that we're sending a sync IPC message per word within an editor.

I don't actually know if this is the case, but I think it's probably something we want to investigate and understand.

If we are actually sending sync IPC messages per word, we should probably try finding a way to not do that for performance reasons.
ally, I remember you working on this stuff, and ehsan, I remember you reviewing it - is my read on this correct? Is it sync IPC message per word? Or something else?
Flags: needinfo?(ehsan)
Flags: needinfo?(a.m.naaktgeboren)
This could be the cause of bug 1297725.
See Also: → 1297725
Blocks: e10s-perf
Yes, this is one sync IPC message per word.

This is dictated by this method being synchronous: <http://searchfox.org/mozilla-central/source/editor/txtsvc/nsISpellChecker.h#54>

In general, our entire spell checking interface is composed of synchronous APIs, so doing anything else in the e10s case will require rewriting our spell checker to make it async friendly.  Ally wasn't in a rush to do that (understandably so), and since this component is essentially unowned, I'm not aware of anybody else who would be up to do this task.
Flags: needinfo?(ehsan)
Note that the other thing that hurts us here is our spell checker regularly over-checking larger areas of the DOM for spelling mistakes in response to smaller changes (such as changing one character.)  It's not uncommon for us to double check paragraphs or more in that case.  This is easily visible by watching the squiggly lines disappear and reappear in that etherpad test case, for example.
Why don't we spell check in the content process?
Thanks for the context, ehsan.

Just spitballing here (mostly re-saying stuff bkelly suggested), but is it possible to store our dictionaries (which I suspect do not change much) in shared memory such that content processes can synchronously read as often as they want?

blassey - I remember you did some of the shared memory stuff for tab switching way back in the day (at least for OS X). Do you know if that stuff is set up for something like what bkelly suggests?
ni for comment 6
Flags: needinfo?(a.m.naaktgeboren) → needinfo?(blassey.bugs)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #5)
> Why don't we spell check in the content process?

Bug 693555 comment 6 suggests one reason.
(In reply to Nathan Froyd [:froydnj] from comment #8)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #5)
> > Why don't we spell check in the content process?
> 
> Bug 693555 comment 6 suggests one reason.

Good context there.

It seems to me that a shared memory region that's writable by the parent process (but read-only by all content processes) might be super useful for this - and maybe other things. Like, I can see such a thing helping out with bug 1303096, for example.
(In reply to Mike Conley (:mconley) - (needinfo me!) from comment #6)
> Just spitballing here (mostly re-saying stuff bkelly suggested)

For those who are wondering where this stuff bkelly suggested is, he actually suggested it in IRC, and it's this:

"mconley: that sounds bad from child process memory footprint... can we keep the dictionary data in a shared memory page or something?"?

>
Just capturing some chatter with billm in IRC:

4:03 PM <bkelly> seems like typing is one of those things where people are highly sensitive to responsiveness... so any small gains to remove sync IPC, etc would probably be worth it
4:03 PM <&billm> yeah, the easiest fix here is some sort of batch checking
4:04 PM but I don't know if that fits into the spell check code well or not
4:04 PM <mconley> ehsan_ probably knows for sure
4:04 PM I get the impression that it's kinda crufty stuff
4:04 PM not well loved.
4:05 PM <&billm> well, whatever we do is going to involve messing with it I think
As I said on IRC, doing things such as "batch checking" while it sounds trivial requires massive changes to the spell checking code.

The shared memory idea is also easier said than done, since the data structures in question are created and managed by hunspell, and I'm assuming that we don't want to change the guts of that library and maintain it.

Before multi-e10s becomes a thing, one quick option would be to actually move all of the spell checking to happen in the *child* process.  In a sane world, we should not have anything that requires spell checking in the parent.  Grepping around seems to suggest that we don't have any <xul:textbox spellcheck="true"> except in tests.  This will still require changes to hunspell since it normally just wants to fopen the dictionary files, but we've already done hacks around that (see hunspell_fopen_readahead, for example.)  Something that opens the files in the parent and sends fds down to the child and makes hunspell_fopen_readahead return FILE*s on top of the said fds should alleviate most of the e10s specific regressions here.

I can't think of any good and easy options that doesn't involve rewriting spellchecker code for e10s-multi though.  :(
(In reply to :Ehsan Akhgari (intermittently available until 9/21) from comment #12)
> As I said on IRC, doing things such as "batch checking" while it sounds
> trivial requires massive changes to the spell checking code.
> 
> The shared memory idea is also easier said than done, since the data
> structures in question are created and managed by hunspell, and I'm assuming
> that we don't want to change the guts of that library and maintain it.

Well, if we could arrange for hunspell to load this data in particular memory pages (instead of mixed in with the rest of the heap) then wouldn't the OS share the page automatically when we fork the child process?

Of course, if words are added to the dictionary, then the pages would diverge.  But that raises the question of how to handle such changes in multiple child processes regardless of memory sharing.
The spellchecker code base is some of the oldest, and according to Ehsan, most gnarly and least maintained in firefox. I'm sorry to see that it's still unowned, though no one sane wants to touch or maintain the guts of hunspell.

As it was, billm had to rework IPIDL for the original e10s fix happen. I strongly suspect this is similar. Some significant technical capability, such as mconley suggests in comment 9, will need to made available, and a significant chunk of crufty spellchecker code will still need to be rewritten.

At one point Ehsan estimated that rewriting the spellchecker would take longer than (single)e10s itself, so that idea was deemed not worth the investment to perf improvement ratio. I did raise the issue of multiple child processes at the time, but was told by blassey not to worry about it; multi-e10s might not even be a thing, and if did it would be years in the future.

No one wants to hear it, but it may be time to throw out the old spellchecker. For an added dose of heresy, it may be wise to consider ditching the suggestions in the context menu UX in favor of a display method that is within the child process.
I have some ideas that I would like to try out here.
Assignee: nobody → mrbkap
(In reply to Ben Kelly [:bkelly] from comment #13)
> (In reply to :Ehsan Akhgari (intermittently available until 9/21) from
> comment #12)
> > As I said on IRC, doing things such as "batch checking" while it sounds
> > trivial requires massive changes to the spell checking code.
> > 
> > The shared memory idea is also easier said than done, since the data
> > structures in question are created and managed by hunspell, and I'm assuming
> > that we don't want to change the guts of that library and maintain it.
> 
> Well, if we could arrange for hunspell to load this data in particular
> memory pages (instead of mixed in with the rest of the heap) then wouldn't
> the OS share the page automatically when we fork the child process?

Only on Linux, I think.  On Windows you get a fresh process (there's no fork.)  Not sure what happens on OS X.

> Of course, if words are added to the dictionary, then the pages would
> diverge.  But that raises the question of how to handle such changes in
> multiple child processes regardless of memory sharing.

The words added to your personal dictionary are dealt with separately, they won't be added to the hunspell dictionaries (thankfully!)
Flags: needinfo?(blassey.bugs)
Priority: -- → P2
Have you had time to investigate here, Blake?
Flags: needinfo?(mrbkap)
Yes. I have I have a work in progress patch but I got pulled off it for a bit. I'll try to get back to this in the next couple of days.
Flags: needinfo?(mrbkap)
Ping?
Sorry, I got distracted again. I'm back on this.
Attached patch wipSplinter Review
This is an extremely rough WIP. There's one huge problem with it: if the text is changed between when we ask the parent for the spellchecking results and when we get the responses, we could get confused. One possible solution would be to keep track of changes in the spell-checked range and deal with them when we get the responses. Another would be to return a map of word -> correct/incorrect.

There's also a bunch of cleanup to do: I just wanted to get this to work at all.
Attachment #8816345 - Flags: feedback?(ehsan)
Comment on attachment 8816345 [details] [diff] [review]
wip

Review of attachment 8816345 [details] [diff] [review]:
-----------------------------------------------------------------

I only skimmed this as a careful review will take a significant amount of time.  I saw a few red flags below, but overall this is on the right track.

One thing which I think we may want to do now is to better keep track of which async spell checking jobs we have scheduled.  Right now, mozInlineSpellStatus is sort of tracking a "job" but it's a very loose notion.  We should probably start to change the number of pending spell checker jobs with a queue of mozInlineSpellStatus to be processed.  We can give each job a unique ID and send that across the IPC channel to be able to correlate the async results with what we had asked to be spell checked before.  Once we keep track of this stuff we can also analyze the ranges to be spelled to detect overlapping ranges, and either cancel or ignore everything but the latest result for a range.  What do you think?

::: extensions/spellcheck/src/mozInlineSpellChecker.cpp
@@ -1439,5 @@
>                                               Selection *aSpellCheckSelection,
>                                               mozInlineSpellStatus* aStatus,
>                                               bool* aDoneChecking)
>  {
> -  *aDoneChecking = true;

Hmm, so now you're never setting this?

@@ -1586,5 @@
> -          return NS_OK;
> -        }
> -        *aDoneChecking = false;
> -        return NS_OK;
> -      }

I'm not quite sure removing this is a good idea...

::: extensions/spellcheck/src/mozInlineSpellWordUtil.cpp
@@ +327,5 @@
> +mozInlineSpellWordUtil::CheckSoftTextUsing(nsISpellChecker* aSpellChecker,
> +                                           SoftTextCallback* aCallback)
> +{
> +  if (XRE_IsParentProcess()) {
> +    return NS_ERROR_NOT_IMPLEMENTED;

So given this change, how is the spell checker supposed to work in the parent process?

::: extensions/spellcheck/src/mozSpellChecker.cpp
@@ +454,5 @@
> +                                     const nsTArray<RealWord> &aRealWords)
> +{
> +  MOZ_ASSERT(!mEngine);
> +  nsTArray<bool> results(aRealWords.Length());
> +  for (auto &realWord : aRealWords) {

We should probably have some kind of an upper bound of the number of words we check here, since this loop can take an arbitrary long amount of time on the main thread of the parent process.  I think we should do this incrementally.
Attachment #8816345 - Flags: feedback?(ehsan) → feedback+
Blocks: SyncIPC
Blake, are you still working on this?  Have you made any more progress?

This is so far the 7th worst sync IPC message that we have in Gecko per https://docs.google.com/spreadsheets/d/1x_BWVlnQPg0DHbsrvPFX7g89lnFGa3lAIHWD_pLa_dE/edit#gid=738048355.
Flags: needinfo?(mrbkap)
(In reply to :Ehsan Akhgari from comment #23)
> Blake, are you still working on this?  Have you made any more progress?

Yes I'm still working on this. My current strategy is to leave the sync message but only send one per turn of the spellchecking machinery. It'll be more work to make the spell checking fully async. I got a bit distracted by the performance of the spell checker on large textareas. I'll speed myself up.
Flags: needinfo?(mrbkap)
Thanks!  I think we ultimately want to get rid of all sync messages here, but I guess we can reduce the number of them as a first step.  I still think my suggestion in comment 22 should work so I hope that we can give it a try after this bug.
Status: NEW → ASSIGNED
Iteration: --- → 54.2 - Feb 20
Whiteboard: [e10s-multi:?]
Depends on: 1340573
Blake, this page seems to be a great test case for this issue.  To reproduce, just load <https://public.etherpad-mozilla.org/p/wr-plan>.

Here is a profile on the current Nightly: <https://perf-html.io/public/cb2ce87d9c4bf0af4fda2a8391c31652831080ad/calltree/?range=3.8410_28.9421>
Whiteboard: [e10s-multi:?] → [e10s-multi:-]
Whiteboard: [e10s-multi:-] → [e10s-multi:-][qf:p1]
Thanks for the wip patch, Blake! I'll take care the rest of the work.
Assignee: mrbkap → kchen
Keywords: perf
https://perfht.ml/2mZ5Bet

Moving the spellchecker parent to its own thread seems to work well. Spellchecking in the content process still takes a long time, uninterruptible, but at least it won't jank with main process main thread.
Ehsan, do you think the profile in comment 29 looks good to you? If so I'll go ahead and polish the patch.
Flags: needinfo?(ehsan)
I get the profiles before and after my patch

before: https://perfht.ml/2oA5d62

after: https://perfht.ml/2nIhIZo

There is some improvement but not as huge as I thought. We still need to split the work into smaller chunks. Note in the second profile there is still sync IPC calls. It's just harder too see in the default view. If you zoom in then you will find them on the timeline.
Better profile gathered from a MacBook Pro

Before: https://perfht.ml/2oJwkw7
After: https://perfht.ml/2oJtmrj
For the record, this is the patch to move spellchecker to a spellcheck thread on main process.
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #26)
> Blake, this page seems to be a great test case for this issue.  To
> reproduce, just load <https://public.etherpad-mozilla.org/p/wr-plan>.
> 

FWIW, the test case in Nightly 55.0al on my Mac Pro works fine on page loading. My test scenario is: with one content process preference set, having a Facebook live video in a background tab and open this etherpad in a foreground tab. On Firefox 52, you will hear the audio of the video is paused because content main thread is busy on spell checking, but on Nightly 55, the audio isn't affected. 

Not sure by which part of performance improvement we've made that fixed the problem, but sounds great!
See Also: → 1343551
(In reply to Evelyn Hung [:evelyn] from comment #35) 
> Not sure by which part of performance improvement we've made that fixed the
> problem, but sounds great!

Credit goes to bug 1330912! \o/
I have a simpler WIP patch to turn the spell checker async. I'll post it shortly after I sort out some lifetime issue.
Whiteboard: [e10s-multi:-][qf:p1] → [e10s-multi:-][qf:p2]
(In reply to (Catching up emails) Kan-Ru Chen [:kanru] (UTC+8) from comment #30)
> Ehsan, do you think the profile in comment 29 looks good to you? If so I'll
> go ahead and polish the patch.

It's probably worth asking others involved here (OTTOMH: Evelyn, mconley, Masayuki, mrbkap, etc.)
Flags: needinfo?(ehsan) → needinfo?(kchen)
Whiteboard: [e10s-multi:-][qf:p2] → [e10s-multi:-][qf:p1]
Whiteboard: [e10s-multi:-][qf:p1] → [e10s-multi:-][qf:i60][qf:p1]
Whiteboard: [e10s-multi:-][qf:i60][qf:p1] → [e10s-multi:-][qf:f60][qf:p1]
Assignee: kanru → bugs
Whiteboard: [e10s-multi:-][qf:f60][qf:p1] → [e10s-multi:-][qf:f61][qf:p1]
Whiteboard: [e10s-multi:-][qf:f61][qf:p1] → [e10s-multi:-][qf:f64][qf:p1]
Whiteboard: [e10s-multi:-][qf:f64][qf:p1] → [e10s-multi:-][qf:p1:f64]
Flags: needinfo?(kanru)
Priority: P2 → P1
Whiteboard: [e10s-multi:-][qf:p1:f64] → [e10s-multi:-][qf:p2:responsiveness]

Spellchecker won't use sync IPC by bug 1502661.

(I will remove PRemoteSpellcheckEngine.Check by bug 1503491)

oh, superb. Makoto, want to make this a dup of one of those bugs then.

Assignee: bugs → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(m_kato)

Let's dup of bug 1502661.

Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(m_kato)
Resolution: --- → DUPLICATE
Performance Impact: --- → P2
Whiteboard: [e10s-multi:-][qf:p2:responsiveness] → [e10s-multi:-]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: