Closed Bug 1853300 Opened 1 year ago Closed 4 months ago

Tokens containing non-word characters are translated incorrectly

Categories

(Firefox :: Translations, defect, P3)

Firefox 118
defect


RESOLVED MOVED

People

(Reporter: beavel, Unassigned)

References

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0

Steps to reproduce:

Click the Firefox Translations add-on to bring up a translation window.
Translate German to English.
Type in something like:

info@gmail.com
Email:info@akkus-de.com
info@akkus-de.com
ff@mozmail.com

Actual results:

The @ sign is sometimes translated to a dash "-",
sometimes to "(at)",
and in my specific case, inside an email address, to "Ã3".

"ff@mozmail.com" translates to "ffà mozmail.com"

Expected results:

Tokens containing special characters such as @ , : ; (and possibly others) should not be broken up.

The "Ã3" probably indicates an encoding problem in the training dataset.
However, tokens that contain several non-word characters (emails, URLs, etc.) probably shouldn't be translated at all.
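
To illustrate what "shouldn't be translated" could mean in practice, here is a minimal sketch of masking such tokens before translation and restoring them afterwards. This is not how Firefox Translations works internally; the regex, the placeholder format, and the names `protect`, `restore`, and `translate_protected` are all assumptions made for illustration, and `translate` stands in for whatever translation function is available.

```python
import re

# Rough pattern for email addresses and http(s) URLs.
# This is an illustrative assumption, not a complete or standards-based matcher.
PROTECTED = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+|https?://\S+")


def protect(text):
    """Replace each email/URL-like token with a numbered placeholder."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"__{len(saved) - 1}__"

    return PROTECTED.sub(stash, text), saved


def restore(text, saved):
    """Put the original tokens back in place of their placeholders."""
    return re.sub(r"__(\d+)__", lambda m: saved[int(m.group(1))], text)


def translate_protected(text, translate):
    """Translate text while leaving email/URL-like tokens untouched.

    `translate` is a hypothetical callable (str -> str) supplied by the caller.
    """
    masked, saved = protect(text)
    return restore(translate(masked), saved)
```

This only helps if the translation step passes the placeholders through unchanged, which a real model may not do, so it is a workaround sketch and not a substitute for more robust models.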

The Bugbug bot thinks this bug should belong to the 'Firefox::Translation' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Translation
Blocks: 1842762
Status: UNCONFIRMED → NEW
Ever confirmed: true

The severity field is not set for this bug.
:gregtatum, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(gtatum)

I can confirm this reproduces in about:translations. I believe the OpusTrainer work will address this: https://github.com/mozilla/firefox-translations-training/issues/161

Flags: needinfo?(gtatum)
Severity: -- → S3
Priority: -- → P3

We need to retrain the models that have poor robustness. Moved to: https://github.com/mozilla/firefox-translations-training/issues/757

Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → MOVED