Closed Bug 1458339 Opened 5 years ago Closed 1 year ago

Investigate sharing self-hosted compartment across content processes

Categories

(Core :: JavaScript Engine, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
90 Branch
Tracking Status
firefox90 --- fixed

People

(Reporter: tcampbell, Assigned: nbp)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [overhead:250k])

Attachments

(3 files)

As part of Fission efforts, we should consider sharing the self-hosted compartment across processes. We already share within the process so this is a good target to look at.

We need to figure out what APIs Gecko has and which we should use.
Whiteboard: [overhead:>100k]
According to my analyzer, the memory associated with self-hosted scripts is something like 150kB (including some scripts in the message manager), and the self-hosting global is about another 300kB. I'm not sure how much of that might be shareable.
Whiteboard: [overhead:>100k] → [overhead:500k]

This has been stalled long enough. I'm taking this bug for the ff69 train.

The most minimal approach here now would be to prime the SharedScriptData de-duplicated cache with shared data for the bytecode. This involves splitting SharedScriptData into SharedScriptData and RuntimeScriptData (and taking a word-per-script regression in the meantime).
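As a rough illustration of what priming a de-duplicated cache means, here is a minimal sketch. It is not the SpiderMonkey implementation: `SharedData`, `RuntimeData`, and `DedupCache` are hypothetical stand-ins, and keying the table on the raw bytecode bytes is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the immutable, shareable part of a script.
struct SharedData {
  std::vector<uint8_t> bytecode;
};

// Hypothetical stand-in for the per-runtime part of a script. It only holds
// a reference to the shared part; mutable per-runtime fields are omitted.
struct RuntimeData {
  std::shared_ptr<const SharedData> shared;
};

// De-duplicating cache: identical bytecode is stored once and handed out
// to every script that asks for it.
class DedupCache {
 public:
  std::shared_ptr<const SharedData> intern(std::vector<uint8_t> bytecode) {
    auto it = table_.find(bytecode);
    if (it != table_.end()) {
      return it->second;  // Already cached: reuse the existing entry.
    }
    auto data = std::make_shared<const SharedData>(SharedData{bytecode});
    table_.emplace(std::move(bytecode), data);
    return data;
  }

  std::size_t size() const { return table_.size(); }

 private:
  std::map<std::vector<uint8_t>, std::shared_ptr<const SharedData>> table_;
};
```

Priming would then amount to calling `intern` up front with the self-hosted bytecode, so that later lookups for the same bytes return the existing entry instead of allocating a new copy.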

The bytecode and related data for the self-hosted code total 220kB. (A 1-word-per-script regression usually costs about 20kB in our overhead numbers.)

I'll also dig into the memory used by the global itself that mccr8 identifies and see if making it lazier makes sense. I know we've hit a few regressions in the last few months due to adding language/library features via self-hosting.

Assignee: nobody → tcampbell
Priority: P2 → P1
Duplicate of this bug: 1357882
Duplicate of this bug: 1358606
Depends on: 1559275
Depends on: 1523749
Whiteboard: [overhead:500k] → [overhead:250k]
See Also: → 1618391
Blocks: 1662149
Assignee: tcampbell → nicolas.b.pierron
Depends on: 1690570

An early, conservative estimate shows that we can save ~16ms from the startup of each process on Android (Android 8.0, Pixel 2, AArch64) by using a cached selfhosted.xdr instead of parsing the self-hosted code.

This estimate was made using the Bug 1690570 patches, by comparing a JS shell run that uses an external file as a cache for the selfhosted.xdr content against a run that parses the self-hosted code, and dividing the difference by the number of tests executed.
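The arithmetic behind such an estimate can be sketched as follows. This assumes each test starts the engine exactly once and that both runs execute the same tests; the function name and parameters are hypothetical and not taken from the patches.

```cpp
#include <cassert>
#include <cstddef>

// Estimated saving per startup, in milliseconds.
// totalParseMs:  wall-clock time of the whole test run when parsing.
// totalCachedMs: wall-clock time of the same run when using the cache.
// numTests:      number of tests executed (assumed: one startup per test).
double perStartupSavingMs(double totalParseMs, double totalCachedMs,
                          std::size_t numTests) {
  assert(numTests > 0);
  return (totalParseMs - totalCachedMs) / static_cast<double>(numTests);
}
```

The estimate is conservative in the sense that any fixed cost shared by both runs cancels out in the difference, leaving only the cost that the cache removes.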

Blocks: 1618391

There's a v8 blog post from 2018 on their approach to reducing memory usage for builtins: https://v8.dev/blog/embedded-builtins

(In reply to Andrew McCreight [:mccr8] from comment #6)

> There's a v8 blog post from 2018 on their approach to reducing memory usage for builtins: https://v8.dev/blog/embedded-builtins

So far, our approach follows their previous approach of sharing memory across processes, where each instance instantiates the stencil.
This is not perfect, but it is easier than figuring out cross-compilation issues in our build system, and it keeps the current binary size.

Depends on: 1698045

Early results show that during Firefox startup, the Parent process and each Content process parse the self-hosted code 4 times, each parse producing its own Stencil.

With the patches that I am polishing now, the Parent process parses the self-hosted code once and creates a shared memory region into which the content is copied. The shared memory is then used by the Worker thread within the Parent process. The shared memory is also successfully transmitted to the Content process, which then uses it to decode the self-hosted content twice, once for the main thread and once for a worker thread.
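The flow described above can be modelled in a single process as a sketch. Everything here is hypothetical: `Stencil`, `encode`, `decode`, and the `shared_ptr` buffer merely stand in for the real stencil, XDR encoding, and cross-process shared memory, which the patches implement differently.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

using Buffer = std::vector<uint8_t>;

// Hypothetical stand-in for the result of parsing self-hosted code.
struct Stencil {
  std::string source;
};

int gParseCount = 0;  // Counts how many times the expensive parse runs.

Stencil parseSelfHosted(const std::string& src) {
  ++gParseCount;
  return Stencil{src};
}

// Toy encoding: the serialized form is just the bytes of the source.
Buffer encode(const Stencil& s) {
  return Buffer(s.source.begin(), s.source.end());
}

Stencil decode(const Buffer& b) {
  return Stencil{std::string(b.begin(), b.end())};
}

// Parent process: parse once, then copy the encoded content into a buffer
// that stands in for the shared memory region.
std::shared_ptr<const Buffer> makeSharedCache(const std::string& src) {
  return std::make_shared<const Buffer>(encode(parseSelfHosted(src)));
}

// Any other runtime (parent worker, content main thread, content worker)
// decodes from the shared buffer instead of parsing again.
Stencil initRuntime(const std::shared_ptr<const Buffer>& cache) {
  assert(cache);
  return decode(*cache);
}
```

In this model, initializing three more runtimes from the same cache leaves `gParseCount` at 1, which is the property the patches aim for across processes.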

Thus, combined with the work from Arai to borrow content while decoding, this new patch should already save memory with a single Content process, despite the overhead added by the shared memory. It should also save time, but I have not measured that yet.

Patches are coming…

The JSRuntime already has an API to set the self-hosted content before
initialization. This modification adds a proper API so that the rest of Gecko,
which does not have access to the JSRuntime implementation, can use this
function as well.

This modification relies on the shared memory implemented in Bug 1698045 and on
the ability to encode and decode self-hosted content from Bug 1668361 to
optimize the JS engine initialization by making the parent process encode the
self-hosted stencil, such that all other runtime initialization would only have
to decode it, including content processes.
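The contract such an API implies (content must be provided before initialization, with parsing as the fallback) might look like the sketch below. `Engine`, `XdrSpan`, and `setSelfHostedXdr` are hypothetical names for illustration only; they are not the functions added by these patches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Non-owning view of encoded self-hosted content (for example, bytes that
// live in shared memory). The caller must keep the bytes alive.
struct XdrSpan {
  const uint8_t* data;
  std::size_t length;
};

class Engine {
 public:
  enum class InitPath { Parsed, Decoded };

  // Must be called before init(); returns false if it is too late.
  bool setSelfHostedXdr(XdrSpan xdr) {
    if (initialized_) {
      return false;
    }
    xdr_ = xdr;
    return true;
  }

  // Decodes when encoded content was provided, otherwise falls back to
  // parsing. The actual decode/parse work is omitted from this sketch.
  InitPath init() {
    initialized_ = true;
    return (xdr_.data != nullptr && xdr_.length > 0) ? InitPath::Decoded
                                                     : InitPath::Parsed;
  }

 private:
  XdrSpan xdr_{nullptr, 0};
  bool initialized_ = false;
};
```

Keeping the view non-owning matches the idea of borrowing from a shared buffer rather than copying it into each runtime.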

Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d7bf37fd9355
part 0 - Simplify mozilla::GetBuildId to be safely called from any thread. r=tcampbell,mccr8
https://hg.mozilla.org/integration/autoland/rev/51ff7ca947e9
part 1 - Add an API to set self-hosted XDR content. r=tcampbell
https://hg.mozilla.org/integration/autoland/rev/20bc1a4de242
part 2 - Use shared memory to initialize the JS engine. r=smaug,tcampbell,necko-reviewers
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch