Avoid main-thread IO for {xre}\dictionaries

NEW
Unassigned

Status

defect
6 years ago
5 years ago

People

(Reporter: rvitillo, Unassigned)

Tracking

(Blocks 1 bug, {main-thread-io})

Firefox Tracking Flags

(Not tracked)

Details

It seems that the {xre}\dictionaries directory and the files it contains are among the top main-thread IO offenders, measured by the number of Telemetry submissions in which they appear after filtering out IO operations during startup and shutdown.
I'm sure there is an existing bug on this.  The difficulty with fixing it is that the dictionary and affix files are read by the hunspell code, and I'd prefer not to take local patches to that code if possible.  Ideas on how to proceed are welcome.
Bug 468779 and bug 535623 may be relevant. They're about reducing IO, not moving it off the main thread.
It seems that dictionary files are read when a new object of type Hunspell is instantiated, which happens in mozHunspell::SetDictionary. SetDictionary is in turn called by mozSpellChecker::SetCurrentDictionary. We could call mozSpellChecker::SetCurrentDictionary from another thread and "asyncify" the rest of the code. I'm not sure how much effort that would require, though.
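To make the idea concrete, here is a minimal standalone sketch in standard C++, not Gecko code. Dictionary is a hypothetical stand-in for the Hunspell object (its constructor does the blocking file read), and LoadDictionaryAsync is a hypothetical async counterpart of SetCurrentDictionary. The real code would use Gecko's own threading primitives rather than std::async.

```cpp
#include <cassert>
#include <fstream>
#include <future>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a Hunspell-like object. The constructor performs
// blocking file IO, which is the part we want to keep off the main thread.
class Dictionary {
 public:
  explicit Dictionary(const std::string& path) {
    std::ifstream in(path);
    std::string word;
    while (in >> word) words_.insert(word);  // one word per whitespace token
  }

  bool Contains(const std::string& word) const {
    return words_.count(word) > 0;
  }

 private:
  std::set<std::string> words_;
};

// Hypothetical async loader: constructs the dictionary on a separate thread
// and hands the finished, read-only object back through a future.
std::future<std::shared_ptr<const Dictionary>> LoadDictionaryAsync(
    std::string path) {
  assert(!path.empty() && "dictionary path must not be empty");
  return std::async(std::launch::async, [path = std::move(path)] {
    return std::shared_ptr<const Dictionary>(
        std::make_shared<Dictionary>(path));
  });
}
```

The caller keeps the future and only touches the dictionary once it is ready, so the file read never happens on the calling thread. Everything that currently assumes the dictionary exists right after SetCurrentDictionary returns would have to wait on that result instead.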
Ehsan, do you have any thoughts on my previous comment?
Flags: needinfo?(ehsan)
So the first thing we do these days before loading a dictionary is fetch a content pref for the page, so that we can load the dictionary corresponding to the user's preference for the website.  The content prefs service was switched to be asynchronous a while ago, so we had to rewrite a bunch of machinery for loading dictionaries based on that.  See nsEditorSpellCheck::DictionaryFetched, which is the method that actually determines the language of the dictionary (and which I think is what ends up calling mozHunspell::SetDictionary indirectly).

What is not currently async compatible is checking words in dictionaries.  Our code expects the dictionary to be available right when we need it, so that part will require some rewriting to become async compatible.
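As a rough illustration of what "async compatible" word checking could look like, here is a single-threaded sketch in standard C++. All names (DeferredSpellChecker, CheckWord, OnDictionaryLoaded) are hypothetical and not part of the existing interfaces. The assumption is that check requests made before the dictionary is ready are queued and answered through a callback once it arrives.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical checker that tolerates a dictionary that is still loading.
class DeferredSpellChecker {
 public:
  using Callback = std::function<void(const std::string& word, bool correct)>;

  // Answers immediately if the dictionary is loaded; otherwise queues the
  // request until OnDictionaryLoaded is called.
  void CheckWord(std::string word, Callback cb) {
    assert(cb && "callback must be callable");
    if (loaded_) {
      cb(word, words_.count(word) > 0);
    } else {
      pending_.emplace_back(std::move(word), std::move(cb));
    }
  }

  // Meant to be called once the (background) load has finished. Flushes all
  // queued requests in the order they were made.
  void OnDictionaryLoaded(std::set<std::string> words) {
    words_ = std::move(words);
    loaded_ = true;
    std::vector<std::pair<std::string, Callback>> pending;
    pending.swap(pending_);
    for (auto& request : pending) {
      request.second(request.first, words_.count(request.first) > 0);
    }
  }

  std::size_t PendingCount() const { return pending_.size(); }

 private:
  bool loaded_ = false;
  std::set<std::string> words_;
  std::vector<std::pair<std::string, Callback>> pending_;
};
```

The cost of this shape is that every caller has to accept its answer through a callback, which is exactly the rewriting that the synchronous code paths would need.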

Honestly, I have not spent enough time on this to tell you exactly how much effort it is going to be; all I can tell you is that it will be "some effort".  This code is unfortunately very complicated and a bit over-engineered.

The good news is that our spell checking code itself, which runs for example as you're typing into a text area, is sort of asynchronous (see mozInlineSpellChecker::ScheduleSpellCheck for example), but there are also other code paths that spell check a word synchronously and so on.  The sad part is that this stuff is also scriptable through a bunch of XPCOM interfaces (such as nsIEditorSpellCheck), and there might be some extensions which use these interfaces.

If you're interested in working on this, please spend some time understanding the code involved (and feel free to ask questions) and then share your proposal on how to fix this.  Unfortunately I don't expect to be able to spend much time coming up with a proposal myself, but I would be happy to review one!

Please let me know if you have any questions!
Flags: needinfo?(ehsan)