Allow JS to start headless content processes (e.g. for Inference)
Categories
(Core :: IPC, enhancement)
Tracking
 | Tracking | Status
---|---|---
firefox137 | --- | fixed
People
(Reporter: nika, Assigned: nika)
Details
(Keywords: perf-alert)
Attachments
(2 files)
Currently, the TranslationsEngine and MLEngine run in an "inference" content process, which is created by opening a headless browser and inserting a <browser remote=true remoteType=inference> frame into that browser.
This bug tracks replacing that approach with an API on ChromeUtils that starts a potentially-empty content process with a given remote type and holds a "keepAlive" for that process, so it can be used for sandboxing code in this way.
Comment 1•1 month ago (Assignee)
This adds a new XPCOM type which will hold the UniqueContentParentKeepAlive
internally, and can be returned into Chrome JS using a new method on
ChromeUtils.sys.mjs.
The KeepAlive will be cleaned up when the cycle-collected JS object is
destroyed, or when the invalidateKeepAlive() method is called on the
object (for times when JS wants to take more direct control over the
lifecycle of the KeepAlive).
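The KeepAlive lifecycle described above can be illustrated with a small standalone model. This is only a sketch and not Gecko code: ModelProcess, ModelKeepAlive, addKeepAlive and releaseKeepAlive are hypothetical names invented for illustration, and the only name taken from the patch description is invalidateKeepAlive(). The model assumes a process becomes eligible for shutdown once its last KeepAlive is released, and it leaves out the garbage-collection path entirely.

```javascript
// Hypothetical model of KeepAlive semantics. Nothing here is a real Gecko API.
class ModelKeepAlive {
  #process;

  constructor(process) {
    this.#process = process;
  }

  // Explicit release, mirroring the invalidateKeepAlive() method named in
  // the patch description. Calling it a second time does nothing.
  invalidateKeepAlive() {
    if (this.#process === null) {
      return;
    }
    const process = this.#process;
    this.#process = null;
    process.releaseKeepAlive();
  }
}

class ModelProcess {
  constructor(remoteType) {
    this.remoteType = remoteType;
    this.keepAliveCount = 0;
    this.alive = true;
  }

  // Hand out a new KeepAlive and count it.
  addKeepAlive() {
    if (!this.alive) {
      throw new Error("process has already shut down");
    }
    this.keepAliveCount++;
    return new ModelKeepAlive(this);
  }

  // Assumption of this model: the process shuts down as soon as the last
  // KeepAlive is released.
  releaseKeepAlive() {
    this.keepAliveCount--;
    if (this.keepAliveCount === 0) {
      this.alive = false;
    }
  }
}

// Usage: two holders keep the same "inference" process alive.
const proc = new ModelProcess("inference");
const first = proc.addKeepAlive();
const second = proc.addKeepAlive();
first.invalidateKeepAlive();
first.invalidateKeepAlive(); // no effect, already invalidated
const aliveWhileHeld = proc.alive; // still held by `second`
second.invalidateKeepAlive();
const aliveAfterRelease = proc.alive; // last holder released
```

Making invalidation idempotent in the model means an explicit call followed by a later automatic cleanup cannot release the same KeepAlive twice.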
In the future it would be possible to add methods to, e.g., get a
KeepAlive for a specific existing process or clone a KeepAlive object;
however, that is not required for the initial use-case.
This will be used in part 2 to replace the use of hidden iframes with
dummy pages to create the "inference" content process.
Comment 2•1 month ago (Assignee)
This patch uses the changes from part 1, which introduced the ability
for Chrome JS to start and manage headless content processes, to switch
the MLEngine and TranslationsEngine actors away from using hidden
<browser> elements.
Under the new model, these actors are now JSProcessActors, bound to the
process scope, and do not require loading dummy documents to function.
The most significant functionality change is that these engine actors
can now be re-used after they have been "destroyed" if the process is
being kept alive (e.g. by the other actor type). Some changes were
required to the actors to make them handle being re-used, but they were
relatively minor.
This patch should also improve inference process crash recovery for the
translations and MLEngine code, as after an inference process crash, the
next call to EngineProcess will automatically start a new process,
rather than returning an actor still bound to a dead process.
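The crash-recovery behaviour described above amounts to checking whether the cached process is still alive before handing it out. A minimal sketch of that pattern follows, under stated assumptions: ModelEngineProcess, getProcess and the injected startProcess callback are hypothetical names and not the real EngineProcess code, and the model is kept synchronous for brevity.

```javascript
// Hypothetical sketch of "restart on next call". Not the real EngineProcess.
class ModelEngineProcess {
  #startProcess;
  #current = null;

  // startProcess is an injected function that launches a process and returns
  // an object with an `alive` flag (an assumption of this model).
  constructor(startProcess) {
    this.#startProcess = startProcess;
  }

  // Return the cached process if it is still alive; otherwise start a fresh
  // one, so callers never receive a handle bound to a dead process.
  getProcess() {
    if (this.#current === null || !this.#current.alive) {
      this.#current = this.#startProcess();
    }
    return this.#current;
  }
}

// Usage with a fake launcher that counts how many processes were started.
let launches = 0;
const engineProcess = new ModelEngineProcess(() => {
  launches++;
  return { id: launches, alive: true };
});

const before = engineProcess.getProcess();
const again = engineProcess.getProcess(); // reuses the live process
before.alive = false; // simulate an inference process crash
const after = engineProcess.getProcess(); // a new process is started
```

Because liveness is checked on each call, the caller does not need its own recovery path after a crash.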
Comment 3•1 month ago
It looks like this may also fix Bug 1942174, which we happened to file on the same day.
Updated•21 days ago
Comment 5•14 days ago
Backed out for causing ESLint failures.
- Backout link
- Push with failures
- Failure Log
- Failure line: TEST-UNEXPECTED-ERROR | /builds/worker/checkouts/gecko/toolkit/components/translations/actors/TranslationsEngineChild.sys.mjs:37:15 | 'transferables' is assigned a value but never used. (no-unused-vars)
Updated•13 days ago (Assignee)
https://hg.mozilla.org/mozilla-central/rev/132021ffd756
https://hg.mozilla.org/mozilla-central/rev/b95de61de763
Comment 8•4 days ago
(In reply to Pulsebot from comment #6)
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/132021ffd756
Part 1: Add a mechanism to allow Chrome JS to manage headless content
processes, r=smaug
https://hg.mozilla.org/integration/autoland/rev/b95de61de763
Part 2: Use a headless content process for the inference content process,
r=firefox-ai-ml-reviewers,translations-reviewers,gregtatum,tarek
Perfherder has detected a mozperftest performance change from push b95de61de7638546897604c446045ce42782b388.
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new)
---|---|---|---|---
14% | browser_translations_perf_es_en.js engine-init-time | windows11-64-shippable-qr | | 216.60 -> 185.82
12% | browser_translations_perf_es_en.js engine-init-time | linux1804-64-shippable | | 341.94 -> 300.27
7% | browser_ml_autofill_perf.js AUTOFILL-pipeline-ready-latency | windows11-64-2009-hw-ref-shippable | | 291.08 -> 271.46
6% | browser_ml_autofill_perf.js AUTOFILL-initialization-latency | windows11-64-2009-hw-ref-shippable | | 302.15 -> 283.21
4% | browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-initialization-latency | linux1804-64-shippable | | 3,799.73 -> 3,665.67
... | ... | ... | ... | ...
3% | browser_ml_summarizer_perf.js SUM-XENOVA-DISTILBART-CNN-12-6_MEDIUM-pipeline-ready-latency | linux1804-64-shippable | | 2,753.92 -> 2,682.04
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 43800
For more information on performance sheriffing please see our FAQ.