Language detection takes 2 seconds to initialize
Categories
(Firefox :: Translations, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox116 | --- | fixed |
People
(Reporter: gregtatum, Assigned: nordzilla)
References
Details
Attachments
(5 files)
Language detection takes a long time to initialize in a DOM Worker.
Steps to reproduce:
- Open browser
- Start profiler
- Go to https://speedometer-preview.netlify.app/#home
- Capture profile
Here is an example profile taking 2 seconds: https://share.firefox.dev/42jZikd
Reporter | ||
Comment 1•1 year ago
|
||
It would be worth figuring out how long CLD2 takes to initialize to compare. The wasm binary doesn't include symbols, so it's tough to see what's taking the time, but it's probably just how long the library takes to run the inference.
Reporter | ||
Comment 2•1 year ago
|
||
This might be an easier problem to solve with a single-process model, we would only need to initialize once. We'd need to investigate if the cost is in initialization, or in the language identification step. Then we'd need to determine the retained memory hit.
Reporter | ||
Comment 3•1 year ago
|
||
Another option is to remove language detection for the release, and follow-up with another solution.
Comment 4•1 year ago
|
||
While profiling something else, I noticed that language detection initialization takes about 900ms of CPU time in a DOM worker: https://share.firefox.dev/3qsMBGC
Would there be a way to move this work to make it happen once instead of doing it in every content process?
Comment 5•1 year ago
|
||
What's the reason for not using CLD2 compiled natively (or RLBox) vs FastText in Wasm?
Comment 6•1 year ago
|
||
We can download the FastText Wasm blob on demand only when necessary, and thus avoid increasing the Firefox binary size for all users.
We could use an RLBox binary library that we download on demand and link dynamically, but the implementation is much more complex than a Wasm blob.
Reporter | ||
Updated•1 year ago
|
Comment 8•1 year ago
•
|
||
(In reply to Greg Tatum [:gregtatum] from comment #0)
Here is an example profile taking 2 seconds: https://share.firefox.dev/42jZikd
I took a stab at fixing fastText but couldn't make the basic example work because it hasn't been updated to work with modern versions of emscripten, and because I couldn't install the version which was recent at the time fastText was written (emscripten SDK 1.39.12, April 2020). But if anyone wants to try, here's what I think the problem is: During fastText module initialization, there are too many calls which go from C++ through the emscripten binding layer back into C++. If those calls can be made to bypass the emscripten binding layer and stay within wasm, the initialization time could be cut by 50% or more.
The emscripten bindings are defined in webassembly/fasttext_wasm.cc.
Reporter | ||
Comment 9•1 year ago
|
||
I think Erik is going to take this one. I am not sure why I assigned myself to this when I filed it.
Comment 10•1 year ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #7)
When is it necessary?
I guess this is actually very often necessary, as soon as there is a page that doesn't have a lang tag.
The same reasoning applies though. If we ship it as part of Firefox, we increase the installer size. This way, we can download it later.
(In reply to Markus Stange [:mstange] from comment #8)
(In reply to Greg Tatum [:gregtatum] from comment #0)
Here is an example profile taking 2 seconds: https://share.firefox.dev/42jZikd
I took a stab at fixing fastText but couldn't make the basic example work because it hasn't been updated to work with modern versions of emscripten, and because I couldn't install the version which was recent at the time fastText was written (emscripten SDK 1.39.12, April 2020). But if anyone wants to try, here's what I think the problem is: During fastText module initialization, there are too many calls which go from C++ through the emscripten binding layer back into C++. If those calls can be made to bypass the emscripten binding layer and stay within wasm, the initialization time could be cut by 50% or more.
The emscripten bindings are defined in webassembly/fasttext_wasm.cc.
There are some docs on how to build it in https://firefox-source-docs.mozilla.org/toolkit/components/translations/resources/02_contributing.html#building-the-wasm-binary in case you want to take another stab at it.
Assignee | ||
Comment 11•1 year ago
|
||
Yes, I'm going to go ahead and take this bug, since I added the fastText functionality for translations.
Markus, as Marco said, I wrote documentation on all the steps I had to take to get fastText working with wasm in tree:
https://firefox-source-docs.mozilla.org/toolkit/components/translations/resources/02_contributing.html#building-the-wasm-binary
If you still want to give that a try, please reach out to me if anything is unclear. I'm very interested in your feedback on how to improve the performance here.
For the moment, I think that we should revert to using CLD2 and put the fastText code behind a pref while we investigate improving the performance of fastText. I will submit a patch to handle this, attached to this bug.
I've documented the reasoning behind the decision here:
https://docs.google.com/document/d/17YIDLquOnydkuZcmr2UEu4WD65HHMc2JYsWJqD6XzmA
Comment 12•1 year ago
•
|
||
Thanks, with these instructions I was able to build it. It turns out the overhead wasn't coming from the bindings, it was coming from C++ exception support. The EMCXXFLAGS_BASE
flags contain -s "DISABLE_EXCEPTION_CATCHING=0"
which engages a very slow mode in which every C++ call is wrapped in a JS call.
Replacing -s "DISABLE_EXCEPTION_CATCHING=0"
with -fwasm-exceptions
(and adding -O3
for good measure) appears to result in a 40x speedup. Whether that's fast enough to ship, I don't know.
Here's a profile showing both the fast and the slow version running at the same time (fast on the main thread): https://share.firefox.dev/3oVBnde
Here's how I tested my changes: https://gist.github.com/mstange/826733694546bd60d0e368feb2d40646
Assignee | ||
Comment 13•1 year ago
|
||
Thank you for taking another look Markus!
For the moment, I am still going to go ahead and put fastText behind a pref and default to CLD2, as outlined in the doc I linked above.
Then I am going to apply your optimization, and we will see if that is sufficient and/or if more can be done to improve the fastText performance.
Updated•1 year ago
|
Assignee | ||
Comment 15•1 year ago
|
||
Renames function to getOrCreateLanguageIdEngine to
more accurately reflect what it does.
Assignee | ||
Comment 16•1 year ago
|
||
Adds a pref that will be used to determine whether Translations
language identification should use fastText or not.
Depends on D180743
Assignee | ||
Comment 17•1 year ago
|
||
Uses the fastText pref to determine which method of language
identification to use in about:translations
Depends on D180744
Assignee | ||
Comment 18•1 year ago
|
||
Uses the fastText pref to determine which method of language
identification to use in full-page translation
Depends on D180745
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Comment 19•1 year ago
|
||
Comment 20•1 year ago
|
||
Backed out for causing mochitest failures in dom/workers/test/browser_privilegedmozilla_remoteworker.js
Backout link: https://hg.mozilla.org/integration/autoland/rev/4eb17b12d0f95a461ede957a634561a8a169af3a
Reporter | ||
Updated•1 year ago
|
Comment 21•1 year ago
|
||
Comment 22•1 year ago
|
||
Backed out 4 changesets (Bug 1836974) for causing failures in browser_full_page.js CLOSED TREE
Log: https://treeherder.mozilla.org/logviewer?job_id=420753654&repo=autoland&lineNumber=15190
Backout: https://hg.mozilla.org/integration/autoland/rev/0b2ed348e59cd519fd45eb9dda2ed2e4aa2af42a
Assignee | ||
Comment 23•1 year ago
|
||
Renames some translations test files that did not previously
include the word "translations" so that they will now all
be run when running ./mach test browser_translations
Depends on D180746
Assignee | ||
Comment 24•1 year ago
|
||
I am re-landing this during soft freeze because at its core this is only rearranging existing functionality behind a pref.
The previous failures were due to some issues in the configuration options of our test suite that were overwriting some prefs accidentally, and were not exposed until I had tried to modify those prefs.
I have one clean try of our translations test suite:
https://treeherder.mozilla.org/jobs?repo=try&revision=f1072fc502ff6edb374d61e0ac1accb153fcccb2
And a clean try of a full Linux debug build:
https://treeherder.mozilla.org/jobs?repo=try&revision=dedaf78b8d3cb55ec50362a8ee71eba5503c65c1&selectedTaskRun=PjwY9epqRBuZgm3aQpOmhg.0
Comment 25•1 year ago
|
||
Comment 26•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5a2d031e38b1
https://hg.mozilla.org/mozilla-central/rev/a73704f84eef
https://hg.mozilla.org/mozilla-central/rev/6480498853fc
https://hg.mozilla.org/mozilla-central/rev/7abb5d825e63
https://hg.mozilla.org/mozilla-central/rev/905995a5de00
Description
•