Closed Bug 1584178 Opened 2 years ago Closed 1 month ago

Google Translate modifies whitespace within placeables, causing translation errors

Categories

(Webtools Graveyard :: Pontoon, enhancement, P2)

Tracking

(Not tracked)

RESOLVED MOVED

People

(Reporter: mathjazz, Assigned: jotes)

Attachments

(1 file)

We should prevent certain pieces of text from being changed in the machine translation output in order to prevent translation errors.

For example: Google Translate translates
fileSizeProgress = ({ $partialSize } of { $totalSize })
as
fileSizeProgress = ({$ partialSize} od {$ totalSize})
(mind the spacing).

Such translations often result in errors because they break the variables (placeables) that developers put in the string. The changes are also hard to spot.

We should check if there's a way to annotate substrings that should not be modified in the MT output. Looking at the comments in https://issuetracker.google.com/issues/119256504, we might need to post-process the output instead.
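If post-processing turns out to be the way to go, a minimal sketch could look like the following. It assumes the source strings use the `{ $name }` spacing shown above and only handles simple variable references; the helper name `normalize_placeables` and the regular expression are hypothetical, not existing Pontoon code.

```python
import re

# Hypothetical pattern: a simple Fluent variable placeable whose inner
# whitespace may have been altered by MT, e.g. "{$ partialSize}".
# Select expressions and function calls are deliberately not covered.
_PLACEABLE = re.compile(r"\{\s*\$\s*([A-Za-z][A-Za-z0-9_-]*)\s*\}")


def normalize_placeables(text: str) -> str:
    """Rewrite every simple variable placeable to the '{ $name }' form."""
    return _PLACEABLE.sub(r"{ $\1 }", text)


if __name__ == "__main__":
    print(normalize_placeables("({$ partialSize} od {$ totalSize})"))
```

This only repairs spacing. It would not catch a placeable whose name was translated or dropped, so comparing the set of placeables in source and output would still be needed.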

Adam talked about that in our conversation around optimizing our MT use, too.

Whiteboard: pontoon-pretranslation
Summary: [pre-translation] Investigate annotating parts of strings for pretranslation → Google Translate modifies whitespace within placeables, causing translation errors
Duplicate of this bug: 1630407

(In reply to Axel Hecht from comment #3)

https://modelfront.com/machine-readable-text/ has some tips

After playing with the API a little, I can confirm this is indeed looking promising:

payload = {
    "q": 'Testing %S Placeholder',
    "source": "en",
    "target": locale_code,
    "format": "text",
    "key": api_key,
}

# Output: 'Testiranje% S rezerviranega mesta'

payload = {
    "q": 'Testing <span translate="no">%S</span> Placeholder',
    "source": "en",
    "target": locale_code,
    "format": "html",
    "key": api_key,
}

# Output: 'Testiranje <span translate="no">%S</span> rezerviranega mesta'

We have logic for marking up placeables on the frontend. We'll need to use it to wrap placeables in such elements before sending the request, and to unwrap them in the results.
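As a rough illustration of the wrap/unwrap step, here is a sketch in Python. The placeable pattern is an assumption covering only printf-style and simple Fluent variables (the real frontend rules cover more), and the function names are hypothetical. It also assumes the API returns the `<span translate="no">` wrapper unchanged, as in the single output above.

```python
import html
import re

# Assumed placeable patterns: printf-style (%S, %1$s, %d) and simple
# Fluent variables ({ $name }). Not the full set Pontoon recognizes.
PLACEABLE = re.compile(r"%(?:\d+\$)?[sSd]|\{\s*\$[A-Za-z][\w-]*\s*\}")
WRAPPER = re.compile(r'<span translate="no">(.*?)</span>', re.DOTALL)


def wrap_placeables(text: str) -> str:
    """Escape text for format=html and wrap each placeable in a no-translate span."""
    parts = []
    last = 0
    for match in PLACEABLE.finditer(text):
        parts.append(html.escape(text[last:match.start()], quote=False))
        inner = html.escape(match.group(0), quote=False)
        parts.append('<span translate="no">' + inner + "</span>")
        last = match.end()
    parts.append(html.escape(text[last:], quote=False))
    return "".join(parts)


def unwrap_placeables(translated: str) -> str:
    """Remove the wrapper spans and undo the HTML escaping."""
    return html.unescape(WRAPPER.sub(r"\1", translated))
```

The value returned by `wrap_placeables` would go into `"q"` with `"format": "html"`, and `unwrap_placeables` would be applied to the translated text in the response.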

Now that we know what to do, let's bump the priority of this.

(Note: The fact that the logic lives in the frontend will bite us when we try to apply it during pre-translation.)

Priority: P3 → P2
Assignee: nobody → poke
Attached file GitHub Pull Request
Status: NEW → ASSIGNED
*This bug has been moved to GitHub.*

*Please check it out on https://github.com/mozilla/pontoon/issues.*
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → MOVED
Product: Webtools → Webtools Graveyard