Bug 1613705 Comment 2 Edit History


The results are *very* preliminary. There are still many potential improvements at all levels, but here are talos numbers from a fairly complete build that replaces L10nRegistry.jsm and Localization.jsm with Rust/C++: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=9bdda7f8c5ace35705e0961b55fba99f99fbe77c&newProject=try&newRevision=9b4e419e79518947b597079390e3cd3adfba760a&framework=1

sessionrestore, startup_about_home, ts_paint, twinopen wins!

This is just based on a very raw POC, with a number of limitations:
 - arguments are not yet passed
 - I/O is synchronous even for preferences (in m-c, sync I/O is only used for startup)
 - No fallback
 - L10nRegistry with its hot-plugging is yet to be added - https://github.com/zbraniecki/l10nregistry-rs

Despite that, I believe the results for all startup-path tests should remain similar once the missing features are added, while for the non-startup path, sync vs. async I/O is the only factor likely to be significant. The remaining missing bits affect either a small number of strings (arguments) or fallback scenarios.

Further optimization avenues are:
 - fluent-syntax (Parser)
   - Use of unsafe slicing shows up to 20% win on parsing https://github.com/projectfluent/fluent-rs/issues/82
   - Lexer branch shows up to 50% win on parsing https://github.com/zbraniecki/fluent-rs/tree/lexer3
 - fluent-bundle (resolver)
   - Writer pattern can give up to 10% perf win on resolution https://github.com/projectfluent/fluent-rs/pull/127
 - Gecko bindings
   - A number of suboptimal allocations and UTF8<-->UTF16 conversions that we can still optimize
   - Some of the API calls are unnecessarily slow because we needed them for the JSMs
   - IntlMemoizer is currently created per-bundle, but could be shared between all bundles in a single process/locale
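As a rough illustration of the "unsafe slicing" item under fluent-syntax above: the sketch below contrasts a bounds-checked string slice with an unchecked one. It is not code from fluent-rs; the function names (`slice_checked`, `slice_unchecked`, `take_identifier`) and the identifier rule are hypothetical, and only `str::get_unchecked` and `str::is_char_boundary` are real standard-library calls. The idea is that a parser which has already scanned the bytes can skip the range and UTF-8 boundary checks that `&src[a..b]` performs.

```rust
/// Checked slicing: `&src[start..end]` verifies that the range is in bounds
/// and falls on UTF-8 character boundaries, and panics otherwise.
fn slice_checked(src: &str, start: usize, end: usize) -> &str {
    &src[start..end]
}

/// Unchecked slicing: skips those checks in release builds.
/// The caller must guarantee `start <= end <= src.len()` and that both
/// offsets are UTF-8 character boundaries.
fn slice_unchecked(src: &str, start: usize, end: usize) -> &str {
    debug_assert!(start <= end && end <= src.len());
    debug_assert!(src.is_char_boundary(start) && src.is_char_boundary(end));
    // SAFETY: upheld by the caller, and verified in debug builds above.
    unsafe { src.get_unchecked(start..end) }
}

/// Hypothetical parser helper: returns the leading run of ASCII identifier
/// bytes (letters, digits, `-`, `_`). This is an assumed rule for the
/// sketch, not the Fluent grammar.
fn take_identifier(src: &str) -> &str {
    let end = src
        .bytes()
        .position(|b| !(b.is_ascii_alphanumeric() || b == b'-' || b == b'_'))
        .unwrap_or(src.len());
    // Every byte before `end` is ASCII, so `end` is in bounds and sits on a
    // character boundary; the unchecked slice is therefore sound here.
    slice_unchecked(src, 0, end)
}
```

Whether the saved checks are worth the `unsafe` block depends on how hot the slicing path is; the 20% figure above comes from the linked issue, not from this sketch.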

It's hard to evaluate how those improvements would translate to actual talos numbers, but the numbers we already get are often fairly close to the upper bounds listed in the previous comment. My hope is that they would mostly help in areas where the upper bounds showed potential wins but this talos comparison doesn't yet show significant ones (quite often it shows wins, but below the significance level).

What's nice is that basically all of the things I listed here can already reasonably be assumed achievable: not just potential research, but a clear path forward.
The results are *very* preliminary. There are still many potential improvements at all levels, but here are talos numbers from a fairly complete build that replaces L10nRegistry.jsm and Localization.jsm with Rust/C++: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=9bdda7f8c5ace35705e0961b55fba99f99fbe77c&newProject=try&newRevision=9b4e419e79518947b597079390e3cd3adfba760a&framework=1

sessionrestore, startup_about_home, ts_paint, twinopen wins!

This is just based on a very raw POC, with a number of limitations:
 - arguments are not yet passed
 - I/O is synchronous even for preferences (in m-c, sync I/O is only used for startup)
 - No fallback
 - L10nRegistry with its hot-plugging is yet to be added - https://github.com/zbraniecki/l10nregistry-rs

Despite that, I believe the results for all startup-path tests should remain similar once the missing features are added, while for the non-startup path, sync vs. async I/O is the only factor likely to be significant. The remaining missing bits affect either a small number of strings (arguments) or fallback scenarios.

Further optimization avenues are:
 - fluent-syntax (Parser)
   - Use of unsafe slicing shows up to 20% win on parsing https://github.com/projectfluent/fluent-rs/issues/82
   - Lexer branch shows up to 66% win on parsing https://github.com/projectfluent/fluent-rs/pull/161
   - Pushing parsing off the main thread may also help
 - fluent-bundle (resolver)
   - Writer pattern can give up to 10% perf win on resolution https://github.com/projectfluent/fluent-rs/pull/127
 - Gecko bindings
   - A number of suboptimal allocations and UTF8<-->UTF16 conversions that we can still optimize
   - Some of the API calls are unnecessarily slow because we needed them for the JSMs
   - IntlMemoizer is currently created per-bundle, but could be shared between all bundles in a single process/locale
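To make the "writer pattern" item under fluent-bundle above more concrete, here is a minimal sketch, assuming a heavily simplified pattern model. It is not the API from the linked fluent-rs pull request: `Element`, `resolve_to_string`, and `write_pattern` are hypothetical names invented for illustration. The point is only that the resolver writes into a caller-supplied `std::fmt::Write` sink instead of building and returning a fresh `String` for every message.

```rust
use std::collections::HashMap;
use std::fmt::{self, Write};

/// Hypothetical, simplified pattern element: literal text or a variable
/// reference. Real Fluent patterns are considerably richer.
enum Element<'a> {
    Text(&'a str),
    Variable(&'a str),
}

/// Allocating style: every call builds and returns a new `String`.
fn resolve_to_string(pattern: &[Element], args: &HashMap<&str, &str>) -> String {
    let mut out = String::new();
    for el in pattern {
        match el {
            Element::Text(t) => out.push_str(t),
            // "???" is an arbitrary placeholder for a missing argument.
            Element::Variable(name) => {
                out.push_str(args.get(*name).copied().unwrap_or("???"))
            }
        }
    }
    out
}

/// Writer style: the caller owns the sink, so one buffer can be reused
/// across many messages and no intermediate `String` is returned.
fn write_pattern<W: Write>(
    w: &mut W,
    pattern: &[Element],
    args: &HashMap<&str, &str>,
) -> fmt::Result {
    for el in pattern {
        match el {
            Element::Text(t) => w.write_str(t)?,
            Element::Variable(name) => {
                w.write_str(args.get(*name).copied().unwrap_or("???"))?
            }
        }
    }
    Ok(())
}
```

A caller would typically keep one `String` buffer, call `buf.clear()` between messages, and pass `&mut buf` to `write_pattern`, which keeps the buffer's capacity alive across calls. How much of the quoted 10% this accounts for in the real resolver is not something the sketch can show.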

It's hard to evaluate how those improvements would translate to actual talos numbers, but the numbers we already get are often fairly close to the upper bounds listed in the previous comment. My hope is that they would mostly help in areas where the upper bounds showed potential wins but this talos comparison doesn't yet show significant ones (quite often it shows wins, but below the significance level).

What's nice is that basically all of the things I listed here can already reasonably be assumed achievable: not just potential research, but a clear path forward.
