Improve Fluent performance in incomplete localization scenario

NEW
Unassigned

Status

()

defect
P3
normal
Last year
Last year

People

(Reporter: zbraniecki, Unassigned)

Tracking

(Blocks 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Fluent in Gecko handled incomplete localization by falling back on the next locale in the fallback chain.

I ran `./mach talos-test -a about_preferences_basic` for three locales:
 - en-US (complete)
 - it (complete)
 - pl (several missing strings)

Results (Linux, Nightly, opt build, 25 runs):

+--------+------+--------+---------+------+------+
| Locale | main | search | privacy | sync | home |
+--------+------+--------+---------+------+------+
| en-US  |  526 |    361 |     421 |  393 |  351 |
| it     |  539 |    367 |     431 |  394 |  348 |
| pl     |  605 |    431 |     488 |  459 |  416 |
+--------+------+--------+---------+------+------+

As you can see there's no difference between en-US and it, but there's a sizable regression in an incomplete locale of around 40-60ms.

Here are profiles between `it` and `pl`:

it: https://perfht.ml/2FbgD4v
pl: https://perfht.ml/2Ff09II

In both stacks you can see similarly small impact of Fluent on the loading stack, but there's one difference. If you filter by `l10n` keyword and look only at JS stack, you will notice that in the `it` case everything happens before DOMContentLoaded (right at the MozBeforeInitialXULLayout`). In the `pl` case there's additional blip of blue code running around 4.4s, much after DOMContentLoaded.
There are actually two issues disgused as one:

1) talos performance is slower for incomplete translations
2) there's a risk of FOUC for incomplete translations

Reasons for (1)
---------------

It seems that all of the performance regression disappears from talos if we remove a single line from Localization.jsm[0]:

```
console.warn(`Missing translations in ${locale}: ${ids}`);
```

This, means that the talos performance impact is Nightly only (because we currently display console.warn here only in Nightly pending bug 1453765).

There are two potential ways to fix it:

a) We should consider turning `console.warn` into a no-op if the Console is not open. Maybe it even is and I only see the regression because the `console.warn` is reported to the terminal from which I'm launching talos?

b) We can delay `console.warn` to post `document.l10n.ready` to make it not impact startup perf.

But unfortunately this does not fix the FOUC - here's a profile with console switched off - https://perfht.ml/2FbmMh5 - where you can see the same blip of l10n code in after `pageshow`.

Thus, we have to focus on (2).

[0] https://searchfox.org/mozilla-central/rev/8f06c1b9a080b84435a2906e420fe102e1ed780b/intl/l10n/Localization.jsm#158

Reasons for (2):
---------------

Due to limitations of JS DOM API which impacted performance of DOM operations we got a C++ API `Node.localize` (bug 1363862) which facilitates most of the heavy DOM operations like finding localizable nodes and applying translations.

The API takes a callback, collect localizable nodes and send their info to JS. JS then formats all the necessary localizations and returns them to the API which applies translations onto nodes.

So, in JS it looks somewhat like this:

```
function onMozBeforeInitialXULLayout() {
  document.localize(async function formatL10n(l10nAttrs) {
    return await Localization.formatTranslations(l10nAttrs);
  });
}
```

In complete translation scenario this operation is very fast and the async is optimized away because we loaded the first locale eagerly much earlier.

In the incomplete scenario, the `formatTranslations` code iterates over `l10nAttrs` and formats their translations for the first language.

For all strings it cannot format out of `pl` it then triggers loading of the fallback locale, and formats them from that locale (at the moment its always `en-US`).

This I/O is not very costly, but pushes us outside of the microtask pushing the final translation of the DOM post `pageshow`.
Fixing (2) will require changes to Fluent API.

As explained in Comment 1, currently the function that formats translations does more or less this:

```
async function formatTranslations(l10nAttrs) {
  let result = [];
  for async (l10nContext of generateContextsForLocales(locales)) {
    let missingTranslations = false;

    for (idx in L10nAttrs) {
      // Skip already translated entries
      if (result[idx]) {
        continue;
      }

      let translation = l10nContext.format(l10nAttrs[idx]);
      if (translation) {
        result[idx] = translation;
      } else {
        missingTranslations = true;
      }
    }

    if (!missingTranslations) {
      return result; 
    }
  }
}
```

This means that the function returns only ones all entries are ready including the fallback ones. And the `generateContextsForLocales` will return cached l10nContext for the primary locale, but for the fallback locale it'll trigger the I/O only when it encounters the first missing string.

In that sens the current model is optimistic - we pay for the fallback only when we need it.

One solution might be to turn `Node.localize` to act more like a iterator asking for more strings until all fallbacks are done.

A pseudo example might look like this:

```
function onMozBeforeInitialXULLayout() {
  let ctxFallback = generateContextsForLocales(locales);
  document.localize(function callback(l10nAttrs) {
    let [ctx, done] = await ctxFallack.next();
    if (done) {
      return [[], true];
    }
    let [translations, completed] = ctx.formatTranslations(l10nAttrs);  
    return [translations, completed];
  });
}
```

which would totally blur the separation between DOMLocalization and Localization.

There are some dirty hacks we could try - eagerly fetching fallback, or somehow marking incomplete locales at build time to fetch fallback only for them, but it all feels not clean.

Stas - can I get your take on this?
Flags: needinfo?(stas)
Blocks: 1441035
Priority: -- → P2
Depends on: 1456565
Fetching the second locale eagerly seems to remove the (2) so I filed bug 1456565 to fetch the fallback locale eagerly as well.

This is not an ultimate solution but an easy fix with no performance cost that will be easy to land in 61. For long term as we will get more flexible with our fallback chains (bug 1410923) we'll want to resolve it by improving our bindings.

I'll keep this bug open to investigate the `console.warn` impact.
Flags: needinfo?(stas)
See Also: → 1410923
Priority: P2 → P3
With bug 1456565 landing this becomes lower priority.

Pike has additional suggestion that I started toying with and I really like:

We could make L10nRegistry learn about the number of fallbacks used per context and fetch them eagerly on the next re-run.

That would require some simple storage, possibly using rkv[0] for L10nRegistry and some limited logic, but move the `touchNext` from l10n.js into L10nRegistry which would then return a CachedIterable with one or more items already cached.

I won't work on it yet, but it's nice to write it down for when we'll realize we need it.

[0] https://github.com/mozilla-prototypes/rkv
You need to log in before you can comment on or make changes to this bug.