Closed Bug 1917851 Opened 10 months ago Closed 7 months ago

Utilize Intl.Segmenter to Segment Translations Source Text

Categories

(Firefox :: Translations, enhancement, P2)

Tracking

RESOLVED FIXED
135 Branch
Tracking Status
firefox135 --- fixed

People

(Reporter: nordzilla, Assigned: nordzilla)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

Attachments

(5 files)

Description

Bug 1917849 will provide an API through which Bergamot can accept pre-segmented text as input.

Once that API is available, we can use Intl.Segmenter to segment the text in a CJK-compatible way before sending it to Bergamot.
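For illustration, sentence segmentation with Intl.Segmenter might look roughly like the sketch below. The helper name `segmentSentences` and the sample strings are assumptions made for this example and are not taken from the patches; only `Intl.Segmenter` and its `granularity: "sentence"` option are standard.

```javascript
// Hypothetical helper (name is an assumption): split text into sentences
// using the standard Intl.Segmenter API.
function segmentSentences(text, language) {
  const segmenter = new Intl.Segmenter(language, { granularity: "sentence" });
  // segment() yields objects whose `segment` property holds the substring.
  return Array.from(segmenter.segment(text), part => part.segment);
}

// Space-delimited text and CJK text go through the same call.
const english = segmentSentences("Hello world. How are you?", "en");
const japanese = segmentSentences("こんにちは。元気ですか。", "ja");

console.log(english);
console.log(japanese);
```

The segments always concatenate back to the original string, so whitespace and punctuation are preserved for the engine.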


Steps to implement

  • Modify how translation requests are sent so that the text is segmented into sentences before it is sent to Bergamot.
  • Ensure this code is tested.
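The two steps above could be sketched as follows. This is only an outline under stated assumptions: `translateWithSegmentation`, `fakeEngine`, and the `translateSegments` method are hypothetical names invented for this example, and they do not reflect the actual Bergamot bindings or the TranslationsEngine code.

```javascript
// Hypothetical request helper: segment the source text first, then hand the
// list of sentences to the engine. `engine.translateSegments` is an assumed
// method name, not the real Bergamot API.
async function translateWithSegmentation(engine, sourceText, sourceLanguage) {
  const segmenter = new Intl.Segmenter(sourceLanguage, { granularity: "sentence" });
  const sentences = Array.from(segmenter.segment(sourceText), part => part.segment);
  const translated = await engine.translateSegments(sentences);
  return translated.join("");
}

// A fake engine keeps the request path testable without loading any WASM.
// It records what it receives and "translates" by upper-casing.
const fakeEngine = {
  received: [],
  async translateSegments(sentences) {
    this.received.push(sentences);
    return sentences.map(sentence => sentence.toUpperCase());
  },
};
```

A test can then call `translateWithSegmentation(fakeEngine, text, "en")` and check both the returned string and the segments recorded in `fakeEngine.received`.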
Blocks: 1917853
Assignee: nobody → enordin

This patch updates the Translations TypeScript definitions
to conform to the updated WASM bindings in Bergamot version 2.0.

This patch updates the TranslationsEngine code to utilize
the updated WASM bindings in Bergamot version 2.0.

Depends on D230443

This patch adds logic to our text-cleaning algorithm to improve
segmentation in a specific edge-case scenario where the Intl.Segmenter
algorithm produces a sentence break that is not ideal for translation.

Depends on D230444
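The patch's exact logic is not reproduced in this bug. As a rough illustration only, one way to handle a sentence break that strands a left double quotation mark at the end of a segment is to merge that segment into the one that follows it. The function name and the merge rule below are assumptions for this sketch; the actual patch works in the text-cleaning step and may differ.

```javascript
// Hypothetical post-processing step; NOT the logic from the actual patch.
const LEFT_DOUBLE_QUOTE = "\u201C"; // the character “

// Merge any segment that ends in an opening double quote (ignoring trailing
// whitespace) into the segment that follows it.
function mergeDanglingOpeningQuotes(segments) {
  const merged = [];
  let carry = "";
  for (const segment of segments) {
    const candidate = carry + segment;
    if (candidate.trimEnd().endsWith(LEFT_DOUBLE_QUOTE)) {
      // The opening quote belongs with the text that comes next.
      carry = candidate;
    } else {
      merged.push(candidate);
      carry = "";
    }
  }
  // Keep a trailing dangling quote rather than dropping text.
  if (carry) merged.push(carry);
  return merged;
}
```

Because the function only concatenates neighbours, the merged segments still rejoin to the original text.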

This patch pulls in the latest generated JavaScript WASM
glue code for Bergamot version 2.0 and bumps the major-version
constant to ensure that only 2.x versions are used from here on.

Depends on D230445
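As a minimal sketch of how a major-version constant can gate which Bergamot builds are accepted: the constant name `BERGAMOT_MAJOR_VERSION` comes from the patch title, while the helper `isCompatibleBergamotVersion` and the version-string format are assumptions made for this example.

```javascript
// Constant name taken from the patch title; value after the bump.
const BERGAMOT_MAJOR_VERSION = 2;

// Hypothetical check: accept only versions whose major component matches.
// Assumes versions are strings such as "2.0" or "2.1".
function isCompatibleBergamotVersion(version) {
  const major = Number.parseInt(String(version).split(".")[0], 10);
  return major === BERGAMOT_MAJOR_VERSION;
}

console.log(isCompatibleBergamotVersion("2.1")); // true
console.log(isCompatibleBergamotVersion("1.1")); // false
```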

This patch updates the URL of the Translations end-to-end test
artifact fetches to utilize the latest version of Bergamot,
compiled from https://github.com/mozilla/translations

Depends on D230446

Attachment #9440299 - Attachment description: WIP: Bug 1917851 - Update WASM Bindings for Bergamot 2.0 r=#translations-reviewers! → Bug 1917851 - Update WASM Bindings for Bergamot 2.0 r=#translations-reviewers!
Attachment #9440300 - Attachment description: WIP: Bug 1917851 - Update TranslationsEngine to utilize Bergamot 2.0 WASM bindings r=#translations-reviewers! → Bug 1917851 - Update TranslationsEngine to utilize Bergamot 2.0 WASM bindings r=#translations-reviewers!
Attachment #9440301 - Attachment description: WIP: Bug 1917851 - Improve Left Double Quote Segmentation for CJK Translations r=#translations-reviewers! → Bug 1917851 - Improve Left Double Quote Segmentation for CJK Translations r=#translations-reviewers!
Attachment #9440302 - Attachment description: WIP: Bug 1917851 - Bump BERGAMOT_MAJOR_VERSION from 1 to 2 r=#translations-reviewers! → Bug 1917851 - Bump BERGAMOT_MAJOR_VERSION from 1 to 2 r=#translations-reviewers!
Attachment #9440303 - Attachment description: WIP: Bug 1917851 - Update Translations E2E Tests to Use Bergamot 2.0 r=#translations-reviewers! → Bug 1917851 - Update Translations E2E Tests to Use Bergamot 2.0 r=#translations-reviewers!
Pushed by enordin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f5f727dd3676
Update WASM Bindings for Bergamot 2.0 r=translations-reviewers,gregtatum
https://hg.mozilla.org/integration/autoland/rev/441819130698
Update TranslationsEngine to utilize Bergamot 2.0 WASM bindings r=translations-reviewers,gregtatum
https://hg.mozilla.org/integration/autoland/rev/9fe8b6c56c4c
Improve Left Double Quote Segmentation for CJK Translations r=translations-reviewers,gregtatum
https://hg.mozilla.org/integration/autoland/rev/c1117994ec4b
Bump BERGAMOT_MAJOR_VERSION from 1 to 2 r=translations-reviewers,gregtatum
https://hg.mozilla.org/integration/autoland/rev/e4dbc6234960
Update Translations E2E Tests to Use Bergamot 2.0 r=translations-reviewers,gregtatum