Closed Bug 1911890 Opened 3 months ago Closed 3 months ago

Translations Remote Settings sync does not properly handle publishing models with a higher minor verison

Categories

(Firefox :: Translations, defect)

defect

Tracking

()

RESOLVED FIXED
131 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- fixed
firefox129 --- fixed
firefox130 --- fixed
firefox131 --- fixed

People

(Reporter: nordzilla, Assigned: nordzilla)

References

(Regression)

Details

(Keywords: regression)

Attachments

(16 files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

Description

The [current code] for handling the Remote Settings "sync" event directly adds newly created models into our list of cached models.

This logic was designed very early in the project, when our conceptual idea of updating a model would be to modify the Remote Settings record in place.

We later adopted a model in which we could have multiple versions of the same model within Remote Settings, and we would retrieve only the maximum compatible version for use in Firefox. For example, if two version of the same model exist in Remote Settings with versions 1.0 and 1.1, then only the 1.1 model will be used when Firefox's caches populated.

The issue is that the above logic that handles the "sync" event directly adds newly created models into our cache. This means that if we publish a 1.1 version of a model, the "sync" event will include it in the cache along with the already-present 1.0 version of the model.

The overall effect is that, for users who receive a sync event with an upgraded model version, they will get inconsistent translation results for that language pair until they restart their browser and the cache repopulates correctly filtering and using only the 1.1 version. Since there are three records per language pair, in the best case, they get all three 1.1 records or all three 1.0 records when requesting a translation, leading to a valid translation. Unfortunately these are only 2 cases out of 20 (6 nCr 3), meaning that there is a 90% chance for a user to encounter a garbage translation for that language pair until they restart their browser.

There are two valid ways to fix this:

  1. We could comb through our cache to search for records of the same type but a lower minor version, and remove them from the cache in favor of the new records.

  2. We could just invalidate the cache and rebuild our list of records, which already does the correct thing by choosing the maximum-compatible-version record.

My preference is to solve this using option 2) given that Remote Settings sync events occur at most once every 24 hours, unless manually triggered by publishing a model, which we do at most once a week. The cache currently takes between 0.1 and 0.2 seconds to fully populate.


Steps to reproduce

This is incredibly hard to reproduce because it involves publishing higher-minor-verison models to Remote Settings during the middle of an active session.

At the time of writing, I do have the Dev and Dev (Preview) Remote Settings channels set up to reproduce this case for Spanish to English, but that will not be true forever.

  • Download the Remote Settings Devtools extension.
  • Switch to the Dev channel, clear local data, and sync.
  • Restart the browser.
  • Ensure browser.translations.logLevel is set to All.
  • Trigger a translation from Spanish to English.
  • Verify in the console logs that 1.0 version models are being used.
  • Switch to the Dev channel, clear local data, and sync.
  • DO NOT restart the browser.
  • Trigger a translation from Spanish to English.

Expected Behavior
The console logs should report that version 1.1 models are used.
The translation is coherent.

Actual Behavior
The console logs report that version 1.0 and version 1.1 models are used.
90% of the time the translation is incoherent.


Steps to implement

  • Repopulate our language models cache on Remote Settings sync events.

Tests to implement

  • Ensure that our test cases properly cover creating, updating, and deleting models, then retrieving the expected versions.

Set release status flags based on info from the regressing bug 1837078

Severity: -- → S3

Ensures that Translations actor test properly use our
global major-version constants instead of assuming 1.*.

This ensures that if we ever bump the major version,
then these tests will adapt as necessary.

Refactors the Translations Remote Settings sync tests to
properly update the test Remote Settings databases before
emitting a "sync" event to the client, rather than only
emitting a "sync" event with no actual database changes.

Depends on D218676

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings Models collection to rebuild
the cache instead of attempting to modify the cached data in place.

Depends on D218677

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings WASM collection to work and
be tested the same way that the Models collection is handled.

Depends on D218678

Attachment #9418013 - Attachment description: WIP: Bug 1911890 - Future proof Translations actor test versioning r=#translations-reviewers! → Bug 1911890 - Future proof Translations actor test versioning r=#translations-reviewers!
Attachment #9418014 - Attachment description: WIP: Bug 1911890 - Refactor Translations Remote Settings sync tests r=#translations-reviewers! → Bug 1911890 - Refactor Translations Remote Settings sync tests r=#translations-reviewers!
Attachment #9418015 - Attachment description: WIP: Bug 1911890 - Rework Translations Models Remote Settings sync r=#translations-reviewers! → Bug 1911890 - Rework Translations Models Remote Settings sync r=#translations-reviewers!
Attachment #9418016 - Attachment description: WIP: Bug 1911890 - Rework Translations WASM Remote Settings sync r=#translations-reviewers! → Bug 1911890 - Rework Translations WASM Remote Settings sync r=#translations-reviewers!

Note: Release Managers

This patch stack consists of 3 core correctness fixes for Translations:

Once landed, I intend to request uplift for this stack into the following channels:

These changes are all fully tested in automation.

I believe these changes match the criteria for ESR uplift, and given that that is the version prior to the current release, it would be nice to have these fixes in all consecutive Firefox versions after that as well.

:nordzilla the planned 129 dot release builds on Monday 2024-08-19. The uplift deadline is the Friday before.
If you wanted to land this and add uplift requests before then? Otherwise it won't make the 129 dot release

Flags: needinfo?(enordin)

Hey Donal, I appreciate the ping.

I wrote the above comment while I was still implementing these fixes, but ran into some unexpected test failures in CI.

I'm happy to say I finally believe I've resolved everything, and I just queued this stack for landing about an hour ago.

I'll be sure to request uplifts before Friday.

Flags: needinfo?(enordin)
Pushed by enordin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d7dddb021fe2 Future proof Translations actor test versioning r=translations-reviewers,gregtatum https://hg.mozilla.org/integration/autoland/rev/710fde3f7237 Refactor Translations Remote Settings sync tests r=translations-reviewers,gregtatum https://hg.mozilla.org/integration/autoland/rev/fe0f5a630b51 Rework Translations Models Remote Settings sync r=translations-reviewers,gregtatum https://hg.mozilla.org/integration/autoland/rev/38b3113d9b4f Rework Translations WASM Remote Settings sync r=translations-reviewers,gregtatum

The patch landed in nightly and beta is affected.
:nordzilla, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox130 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(enordin)

Ensures that Translations actor test properly use our
global major-version constants instead of assuming 1.*.

This ensures that if we ever bump the major version,
then these tests will adapt as necessary.

Original Revision: https://phabricator.services.mozilla.com/D218676

Attachment #9419118 - Flags: approval-mozilla-beta?

Refactors the Translations Remote Settings sync tests to
properly update the test Remote Settings databases before
emitting a "sync" event to the client, rather than only
emitting a "sync" event with no actual database changes.

Original Revision: https://phabricator.services.mozilla.com/D218677

Attachment #9419119 - Flags: approval-mozilla-beta?

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings Models collection to rebuild
the cache instead of attempting to modify the cached data in place.

Original Revision: https://phabricator.services.mozilla.com/D218678

Attachment #9419120 - Flags: approval-mozilla-beta?

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings WASM collection to work and
be tested the same way that the Models collection is handled.

Original Revision: https://phabricator.services.mozilla.com/D218679

Attachment #9419121 - Flags: approval-mozilla-beta?

Ensures that Translations actor test properly use our
global major-version constants instead of assuming 1.*.

This ensures that if we ever bump the major version,
then these tests will adapt as necessary.

Original Revision: https://phabricator.services.mozilla.com/D218676

Attachment #9419129 - Flags: approval-mozilla-release?

Refactors the Translations Remote Settings sync tests to
properly update the test Remote Settings databases before
emitting a "sync" event to the client, rather than only
emitting a "sync" event with no actual database changes.

Original Revision: https://phabricator.services.mozilla.com/D218677

Attachment #9419130 - Flags: approval-mozilla-release?

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings Models collection to rebuild
the cache instead of attempting to modify the cached data in place.

Original Revision: https://phabricator.services.mozilla.com/D218678

Attachment #9419131 - Flags: approval-mozilla-release?

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings WASM collection to work and
be tested the same way that the Models collection is handled.

Original Revision: https://phabricator.services.mozilla.com/D218679

Attachment #9419132 - Flags: approval-mozilla-release?
Attachment #9419121 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9419120 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9419119 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9419118 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9419129 - Attachment description: Bug 1911890 - Future proof Translations actor test versioning → WIP: Bug 1911890 - Future proof Translations actor test versioning
Attachment #9419130 - Attachment description: Bug 1911890 - Refactor Translations Remote Settings sync tests → WIP: Bug 1911890 - Refactor Translations Remote Settings sync tests
Attachment #9419131 - Attachment description: Bug 1911890 - Rework Translations Models Remote Settings sync → WIP: Bug 1911890 - Rework Translations Models Remote Settings sync
Attachment #9419132 - Attachment description: Bug 1911890 - Rework Translations WASM Remote Settings sync → WIP: Bug 1911890 - Rework Translations WASM Remote Settings sync
Attachment #9419129 - Attachment description: WIP: Bug 1911890 - Future proof Translations actor test versioning → Bug 1911890 - Future proof Translations actor test versioning
Attachment #9419129 - Flags: approval-mozilla-esr128?
Attachment #9419130 - Attachment description: WIP: Bug 1911890 - Refactor Translations Remote Settings sync tests → Bug 1911890 - Refactor Translations Remote Settings sync tests
Attachment #9419130 - Flags: approval-mozilla-esr128?

Whoops, I accidentally overrode the release uplifts to be esr uplifts, instead of doing new esr uplifts. Fixing it now.

Flags: needinfo?(enordin)
Attachment #9419131 - Attachment description: WIP: Bug 1911890 - Rework Translations Models Remote Settings sync → Bug 1911890 - Rework Translations Models Remote Settings sync
Attachment #9419132 - Attachment description: WIP: Bug 1911890 - Rework Translations WASM Remote Settings sync → Bug 1911890 - Rework Translations WASM Remote Settings sync
Attachment #9419129 - Flags: approval-mozilla-esr128?
Attachment #9419130 - Flags: approval-mozilla-esr128?

Ensures that Translations actor test properly use our
global major-version constants instead of assuming 1.*.

This ensures that if we ever bump the major version,
then these tests will adapt as necessary.

Original Revision: https://phabricator.services.mozilla.com/D218676

Attachment #9419303 - Flags: approval-mozilla-esr128?

Refactors the Translations Remote Settings sync tests to
properly update the test Remote Settings databases before
emitting a "sync" event to the client, rather than only
emitting a "sync" event with no actual database changes.

Original Revision: https://phabricator.services.mozilla.com/D218677

Attachment #9419304 - Flags: approval-mozilla-esr128?

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings Models collection to rebuild
the cache instead of attempting to modify the cached data in place.

Original Revision: https://phabricator.services.mozilla.com/D218678

Attachment #9419305 - Flags: approval-mozilla-esr128?

Reworks the way that the TranslationsParent handles "sync"
events from the Remote Settings WASM collection to work and
be tested the same way that the Models collection is handled.

Original Revision: https://phabricator.services.mozilla.com/D218679

Attachment #9419306 - Flags: approval-mozilla-esr128?
Attachment #9419132 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9419131 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9419130 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9419129 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9419306 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Attachment #9419305 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Attachment #9419304 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Attachment #9419304 - Flags: approval-mozilla-esr128+ → approval-mozilla-esr128-
Attachment #9419303 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Attachment #9419304 - Flags: approval-mozilla-esr128- → approval-mozilla-esr128+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: