Add a pref to control the use of lexical shortlists in Translations
Categories
(Firefox :: Translations, task)
Tracking
()
People
(Reporter: nordzilla, Assigned: nordzilla)
References
(Regressed 1 open bug)
Details
Attachments
(22 files)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
Description
We recently discovered that the usage of a lexical shortlist is causing issues with models inventing new words for languages with higher morphological complexity (e.g. heavy usage of declensions).
For now, we are going to remove the usage of lexical shortlists from our translations until we find a way to improve the strategy around translating. However, we will put it behind a pref to ensure that the functionality can be controlled.
This change comes with tradeoffs (PerfCompare):
Pros:
- Faster engine initialization time.
- Fewer translation artifacts to download.
- Lower Memory usage.
- Higher-quality translations (especially due to fewer invented words).
Cons:
- Slower translations.
In the future, we may re-enable lexical shortlists, but we will have to change our strategy around how they are used, for example, providing a batch of the entire document, or perhaps dynamically deciding whether or not to use the shortlist based on the language pair and/or how much content there is to be translated. Bugs for these are yet to be filed.
Steps to implement
- Add a pref to control the usage of lexical shortlist in translations, default to off.
- Ensure that the lex files are not downloaded if this pref is off.
- Ensure that if the pref is flipped on, that the files are downloaded and used.
Tests to implement
- Add tests that ensure translations work flipping the pref value in both directions within the same translations session.
Assignee | ||
Updated•8 months ago
|
Updated•8 months ago
|
Assignee | ||
Comment 1•7 months ago
|
||
Assignee | ||
Comment 2•7 months ago
|
||
Depends on D234184
Assignee | ||
Comment 3•7 months ago
|
||
Depends on D234185
Assignee | ||
Comment 4•7 months ago
|
||
Depends on D234186
Assignee | ||
Comment 5•7 months ago
|
||
Depends on D234187
Assignee | ||
Comment 6•7 months ago
|
||
Depends on D234188
Assignee | ||
Comment 7•7 months ago
|
||
Depends on D234189
Assignee | ||
Comment 8•7 months ago
|
||
Depends on D234190
Assignee | ||
Comment 9•7 months ago
|
||
Depends on D234188
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Comment 10•7 months ago
|
||
This will also resolve Bug 1889753.
Assignee | ||
Comment 11•7 months ago
|
||
This patch adds fetch tasks to test-verify
for fetching the Translations
artifacts required for end-to-end testing.
Depends on D234191
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Assignee | ||
Comment 12•7 months ago
|
||
This patch ensures that the about:translations
page will retrigger
a translation if the browser.translations.useLexicalShortlist pref
changes.
Depends on D234412
Comment 13•7 months ago
|
||
Comment 14•7 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3c95589a3c9e
https://hg.mozilla.org/mozilla-central/rev/74f3ef5a6463
https://hg.mozilla.org/mozilla-central/rev/02caee1d8953
https://hg.mozilla.org/mozilla-central/rev/f56d4ec79677
https://hg.mozilla.org/mozilla-central/rev/6b4807dbd9f0
https://hg.mozilla.org/mozilla-central/rev/e31f741e108a
https://hg.mozilla.org/mozilla-central/rev/1479c82b454c
https://hg.mozilla.org/mozilla-central/rev/35cc64a343c1
https://hg.mozilla.org/mozilla-central/rev/d361bf6f08cf
https://hg.mozilla.org/mozilla-central/rev/28978600583c
https://hg.mozilla.org/mozilla-central/rev/1aa0f6d15f0f
Assignee | ||
Comment 15•7 months ago
|
||
This patch fixes several types through the Translations test suite that
included an extra "i" in Spanish.
Original Revision: https://phabricator.services.mozilla.com/D234184
Updated•7 months ago
|
Assignee | ||
Comment 16•7 months ago
|
||
This patch refactors several hard-coded lists of language model names in the Translations
test suite to now be generated by a function that takes language-tag pairs as an argument.
Original Revision: https://phabricator.services.mozilla.com/D234185
Updated•7 months ago
|
Assignee | ||
Comment 17•7 months ago
|
||
This patch refactors the FILES_PER_LANGUAGE_PAIR
constant to become RECORDS_PER_LANGUAGE_PAIR
to help prepare the platform for the fact that the lex
file will always be present
in Remote Settings, but will only be downloaded based on the useLexicalShortlist
pref
introduced later in this patch stack.
Original Revision: https://phabricator.services.mozilla.com/D234186
Updated•7 months ago
|
Assignee | ||
Comment 18•7 months ago
|
||
This patch contains various miscellaneous refactors within the TranslationsParent.
No logic is changed in this patch.
Original Revision: https://phabricator.services.mozilla.com/D234187
Updated•7 months ago
|
Assignee | ||
Comment 19•7 months ago
|
||
This patch removes optional withQualityEsitmation
arguments from various
functions throughout Translations, because we effectively never use these
files and do not ever plan to use them in the future. We will eventually
remove the files from Remote Settings entirely.
Original Revision: https://phabricator.services.mozilla.com/D234188
Updated•7 months ago
|
Assignee | ||
Comment 20•7 months ago
|
||
This patch reworkes how the TranslationsParent fires notifications to
observers, removing several one-off topics and replacing them with a
single translations:pref-changed
topic that includes the pref name
within the notification's data
parameter.
Original Revision: https://phabricator.services.mozilla.com/D234193
Updated•7 months ago
|
Assignee | ||
Comment 21•7 months ago
|
||
This patch unifies the observers that are used in the Translations
settings to listen to the same topic: translations:pref-changed
.
Original Revision: https://phabricator.services.mozilla.com/D234189
Updated•7 months ago
|
Assignee | ||
Comment 22•7 months ago
|
||
This patch introduces a pref that determines whether or not we will
utilize lexical shortlisting for Translations. Utilizing lexical
shortlisting has a positive impact on translation speed, but may
negatively affect the translation quality, particularly by causing
the models to invent new and incorrect words.
Original Revision: https://phabricator.services.mozilla.com/D234190
Updated•7 months ago
|
Assignee | ||
Comment 23•7 months ago
|
||
This patch ensures that we are testing our Translations features
with various combinations of the useLexicalShortlist pref, and
especially tests the behavior of flipping the pref within the
same browsing session, ensuring that all translation requests
still succeed.
Original Revision: https://phabricator.services.mozilla.com/D234191
Updated•7 months ago
|
Assignee | ||
Comment 24•7 months ago
|
||
This patch ensures that the about:translations
page will retrigger
a translation if the browser.translations.useLexicalShortlist pref
changes.
Original Revision: https://phabricator.services.mozilla.com/D234574
Updated•7 months ago
|
Assignee | ||
Comment 25•7 months ago
|
||
This patch adds fetch tasks to test-verify
for fetching the Translations
artifacts required for end-to-end testing.
Original Revision: https://phabricator.services.mozilla.com/D234412
Updated•7 months ago
|
Comment 26•7 months ago
|
||
beta Uplift Approval Request
- User impact if declined: Disabling the lexical shortlist pref has a significant improvement on the language translation models that are newly supported in Firefox 135. While the 135 Firefox Platform fully supports CJK translations as-is, these quality improvements are critical to providing the best-possible user experience. Not uplifting this patch stack may delay shipping CJK languages until Firefox 136.
- Code covered by automated testing: yes
- Fix verified in Nightly: no
- Needs manual QE test: yes
- Steps to reproduce for manual QE testing: Test Full Page Translations as well as Selected Text Translation with and without the browser.translations.useLexicalShortlist pref turned on.
- Risk associated with taking this patch: Low/Medium
- Explanation of risk level: I would classify this as low risk overall, since this is quite thoroughly tested in automation. However, it is a large stack and is close to the end of the beta uplift period.
- String changes made/needed: None
- Is Android affected?: yes
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Updated•7 months ago
|
Comment 27•7 months ago
|
||
uplift |
Updated•7 months ago
|
Comment 28•7 months ago
|
||
I have attempted to verify this report by taking loading times (in seconds) of the translation process on 3 operating systems, 4 languages, both situations of having respective language downloaded previously and not having it, both fullpage translation and select translations and I have compared them between having pref browser.translations.useLexicalShortlist true vs false. The results can be seen HERE. The document can be viewed using Mozilla accounts.
As a conclusion, I can confirm that:
- No translation has failed in any of the situations.
- Unfortunately, the loading times when the respective language is not downloaded before the translation process increases when pref is FALSE.
- Fortunately, the loading times when the language has been downloaded before the translation process decreases when the pref is TRUE.
- Results were more chaotic on a lower-end device.
The results were collected while taking into consideration as clear as possible phases of the translation process as possible; they were performed on a high-end Ubuntu 24, mid-end Windows 10 and a low-end MacOS 11; it was performed in the latest Nightly v136.0a1 from 2025-01-23.
Erik, can you answer some questions?
- Can you take a look at the results and share your opinions? Are they expected?
- How do you want to address this going forward?
- Is the verification process valid in this case? Is it more than we need or should I have only verified that the translation process does not fail?
- Should I repeat the same process for Fx135 when Beta build is available? In which Beta135 should it land in?
Thanks!
Assignee | ||
Comment 29•7 months ago
•
|
||
Daniel,
Thank you for your thoroughness in the verification process here. It's above and beyond what I expected, though much appreciated! I'm excited to have this kind of data on various platforms.
The net effect of this change is that it will improve the translated output of our models, however it does introduce a performance regression to all platforms (Bug 1942769) in terms of the speed of translations.
These impacts were known and tested before landing the patch into Nightly (PerfCompare).
Fortunately, it does also slightly reduce memory consumption in some cases, as well as reduces the download sizes, as you've noticed.
For me, the most important thing to verify here is, "No translation has failed in any of the situations."
For the purpose of verification, as long as you don't notice any newly introduced failures, then we are good to go!
In terms of testing beta, given that this was uplifted, it may be helpful to do a quick sanity check to make sure that no unexpected failures occur. The fact that it didn't happen in Nightly gives me high confidence.
Thank you so much, again!
Comment 30•7 months ago
|
||
Thank you for your quick reply! Considering the last two comments, I will set firefox136 tracking status as verified.
Moreover, I have also verified that the translation process does not fail in Beta v135.0b9. Unfortunately, I had to use treeherder builds from HERE since beta9 is not yet oficcially available. Some of the loading times were longer than expected, but no failures were observed. Tested in Windows 10, MacOS 11 and Ubuntu 22.
As a conclusion, uplift train firefox135 is also verified. Thank you.
Description
•