Closed Bug 1500333 Opened 7 years ago Closed 4 years ago

Mixed RTL and LTR content is hard to read in text inputs

Categories

(Webtools Graveyard :: Pontoon, defect, P4)

Tracking

(Not tracked)

RESOLVED MOVED

People

(Reporter: alamM, Unassigned)

Attachments

(4 files)

Attached image Screenshot (61).png
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0

Steps to reproduce:
To use the XML tag and the external argument as they are, I clicked on them, but they got mixed with one another.

Actual results:
Both got mixed with one another.

Expected results:
They should have remained separate.
Sorry but I don't understand what the bug is about. Is it an issue with Pontoon? Is it an error with translation? If it's Pontoon, we need to move this bug, and you should provide a bit more information on what you did, and what the expected behavior was.
(In reply to Francesco Lodolo [:flod] from comment #1)
> Sorry but I don't understand what the bug is about.
>
> Is it an issue with Pontoon? Is it an error with translation? If it's
> Pontoon, we need to move this bug, and you should provide a bit more
> information on what you did, and what the expected behavior was.

Yes! This is with Pontoon. It's not a translation error. As you can see in the attached screenshot, the XML tag and the external argument are separate in the actual string, but in the translated string and in the translation panel they got mixed.
I'm looking at the string in a text editor, and it seems correct to me?
https://pontoon.mozilla.org/ur/firefox/browser/browser/preferences/preferences.ftl/?search=extension-controlled-privacy-containers&string=178407

ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔
Component: ur / Urdu → Pontoon
Product: Mozilla Localizations → Webtools
Version: unspecified → other
(In reply to Francesco Lodolo [:flod] from comment #3)
> I'm looking at the string in a text editor, and it seems correct to me?
> https://pontoon.mozilla.org/ur/firefox/browser/browser/preferences/preferences.ftl/?search=extension-controlled-privacy-containers&string=178407
>
> ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔

Putting it in a text editor corrects only the XML tag and the external argument, but the Urdu translation gets mismatched.
Steps to reproduce:
1. Type "a" in the textarea.
2. Click on the XML placeable: "<img data-l10n-name="icon"/>".

You get this in the textarea:
</"a<img data-l10n-name="icon

Mahtab, thanks for the report! What would be the expected value in the textarea after you insert the placeable?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: XML Tag and External Argument got mismatched → [RTL] XML placeables get modified when inserted into textarea
The expected value should be <img data-l10n-name="icon"/> a or a <img data-l10n-name="icon"/> depending upon the context.
Thanks!

I'm looking at how this works for Hebrew (another RTL locale), which has an approved translation in Pontoon:
https://pontoon.mozilla.org/he/firefox/browser/browser/preferences/preferences.ftl/?string=178407

Which means it's also in the file:
https://hg.mozilla.org/l10n-central/he/file/2c05277ec42e/browser/browser/preferences/preferences.ftl#l91

According to Comment 6, the XML tag in the file output (LTR) seems correct. So I suspect the problem is that the string contains both RTL and LTR content and we force <textarea> to use RTL.

I wonder what we can even do about this. Flagging Amir with a NI, who's been helping us with RTL issues in the past (see bug 1190566 for example).
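
To make the suspicion above concrete, here is a minimal sketch (not Pontoon code) that uses Python's standard `unicodedata` module to list the strong-direction runs in the Urdu string from comment 3. The helper names `strong_direction` and `direction_runs` are made up for this illustration. Characters such as `<`, `"`, `/` and `{` are directionally neutral or weak, so they are skipped here; in a field forced to RTL they are the ones that end up displayed in unexpected places.

```python
import unicodedata

# The translated string quoted in comment 3.
SAMPLE = 'ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔'


def strong_direction(ch):
    """Return "LTR" or "RTL" for strongly directional characters, else None."""
    bidi = unicodedata.bidirectional(ch)
    if bidi == "L":
        return "LTR"
    if bidi in ("R", "AL"):  # Hebrew-type and Arabic-type letters
        return "RTL"
    return None  # neutral or weak: punctuation, digits, spaces, ...


def direction_runs(text):
    """List the sequence of strong directions, collapsing repeats."""
    runs = []
    for ch in text:
        direction = strong_direction(ch)
        if direction is not None and (not runs or runs[-1] != direction):
            runs.append(direction)
    return runs


print(direction_runs(SAMPLE))
```

A string with more than one entry in that list mixes directions, which is the situation discussed in this bug.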
Flags: needinfo?(amir.aharoni)
Thanks Mahtab for raising the issue. Yes Matjaz, your hunch is correct. The issue occurs because the string contains both RTL and LTR characters. The same issue existed in Pootle, and I suspect it affects other RTL locales as well.
Sorry, noticed it only now.

Unfortunately, I cannot think of any way to fix this easily. It's a major inherent problem with how RTL languages work. Mixing RTL text with any kind of left-to-right code, including XML, is always a disaster. This is why translating into RTL languages in text files is so awful: in translation files every single line has some LTR text in it, so *everything* is jumbled.

Using any web-based translation solution such as Pontoon makes it much better, because it separates the translation from the source string and from the LTR string key. However, it doesn't fix this problem completely, because some code or markup is quite often embedded in the string itself, as it is in this example.

The ways to fix such things are:

* Make Pontoon have super-smart input boxes that are not just plain text, but that are able to truly separate code from text. It would be super-cool, but probably very complicated to make.
* Create aliases in RTL languages for XML element and attribute names. If it's done, then in Hebrew it would look like this:
הרחבה" בשם <תמונה נתונים-תרגום-שם="סמל"/> {$שם} דורשת שימוש במגירת לשוניות."
In theory, it would solve the problem, but it may introduce other problems, and it's a bit of a bottomless pit.
* The most realistic solution is to have a policy that strongly encourages developers to avoid any kind of code or markup in translatable strings, unless it's really, really needed. It would be good for translators into all languages and not only RTL ones, because it will make it easier for non-developers to translate. (For many people who grew up with the 1990s web, HTML and similar things are natural, but it's not true for everyone. There are people who could be great translators, but who have a hard time with markup languages, and reducing this problem may increase volunteers' participation.)
Flags: needinfo?(amir.aharoni)
Thanks for a very valuable input, Amir! I'm lowering the priority until we find a meaningful way forward.
Priority: P3 → P4
Summary: [RTL] XML placeables get modified when inserted into textarea → Mixed RTL and LTR content is hard to read in text inputs
(In reply to Matjaz Horvat [:mathjazz] from comment #10)
> Thanks for a very valuable input, Amir!

Sure, happy to help any time. Sorry it took so long.

> I'm lowering the priority until we find a meaningful way forward.

The most realistic way, as I mention in the end of my comment is not so much in the area of feature development, but in the area of policies and practices for writing, reviewing, and maintaining code: strongly encourage developers to move as much code and markup out of translatable strings as possible.

It might help to add a "Raw mode" like Pootle did. There, everything is broken up and forced to LTR, so you can check tags and other stuff easily.
Link: https://github.com/translate/pootle/issues/3941
There is no other way of fixing it as I see it.
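
As a rough illustration of what such a raw view could do, the sketch below splits a string at its placeables and gives every piece its own line with a forced LTR base direction. This is only a guess at the idea, not how Pootle implements it: the `PLACEABLE` pattern and the `raw_view` helper are hypothetical, and the forcing is done here with the Unicode isolate characters U+2066 (LRI) and U+2069 (PDI) rather than with an LTR text field.

```python
import re

# Hypothetical pattern for the two placeable kinds seen in this bug:
# XML-like tags and Fluent-style {$variables}.
PLACEABLE = re.compile(r"(<[^<>]+>|\{\s*\$[^{}]+\})")

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def raw_view(text):
    """Return one line per text or placeable segment, each isolated as LTR.

    Whitespace around segments is dropped, so this is for checking tags
    only and is not meant to be written back to the translation.
    """
    # The capturing group makes re.split keep the placeables in the result.
    parts = [part.strip() for part in PLACEABLE.split(text)]
    return [LRI + part + PDI for part in parts if part]


if __name__ == "__main__":
    sample = 'ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔'
    for line in raw_view(sample):
        print(line)
```

Because every placeable sits alone on an LTR line, its brackets and quotes are no longer reordered by the surrounding RTL text.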

See Also: → 1602426

This is how I see it. Yes it has a problem, but not a big one.

Tbh, eliminating markup from text strings is a bad idea, absolutely bad. It splits strings into separate parts, making translation much harder. Developers then need to provide proper context, and mistakes occur.
I believe that translators who translate applications must have at least a bit of knowledge of technical aspects like variables, placeholders, plurals, and so on.

(In reply to Amir Aharoni from comment #11)
> (In reply to Matjaz Horvat [:mathjazz] from comment #10)
> > Thanks for a very valuable input, Amir!
>
> Sure, happy to help any time. Sorry it took so long.
>
> > I'm lowering the priority until we find a meaningful way forward.
>
> The most realistic way, as I mention in the end of my comment is not so much
> in the area of feature development, but in the area of policies and
> practices for writing, reviewing, and maintaining code: strongly encourage
> developers to move as much code and markup out of translatable strings as
> possible.

Thanks Amir, but in our experience it's often more effort/cost to change developer behavior. We already ask developers to be aware of how much code they're including in strings, but we don't have the resources for any strict enforcement.

Your first suggestion is consistent with how the majority of other computer-assisted translation tools handle code/tagged elements. They're condensed in the string automatically and the user has to expand them manually if they want to see or manipulate their content. Here's a good example: https://docs.sdl.com/LiveContent/content/en-US/SDL%20Trados%20Studio%20Help-v4/GUID-C6676C93-2EEF-4945-9438-905F05EF268E
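
The condensing behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not how any particular tool does it: the `PLACEABLE` pattern, the `[1]`-style tokens, and the `condense`/`expand` helpers are all invented for this example, and a real editor would more likely render each tag as a non-text widget than as bracketed digits.

```python
import re

# Hypothetical pattern for XML-like tags and Fluent-style {$variables}.
PLACEABLE = re.compile(r"<[^<>]+>|\{\s*\$[^{}]+\}")
TOKEN = re.compile(r"\[(\d+)\]")


def condense(text):
    """Replace each placeable with a short numbered token.

    Returns the condensed string and the list of original placeables.
    Limitation: a source string that already contains "[1]" would collide.
    """
    originals = []

    def to_token(match):
        originals.append(match.group(0))
        return "[%d]" % len(originals)

    return PLACEABLE.sub(to_token, text), originals


def expand(condensed, originals):
    """Put the original placeables back in place of their tokens."""
    return TOKEN.sub(lambda m: originals[int(m.group(1)) - 1], condensed)


if __name__ == "__main__":
    source = 'a <img data-l10n-name="icon"/> {$name} b'
    short, saved = condense(source)
    print(short)
    print(expand(short, saved) == source)
```

The translator would only ever move the short tokens around, and the tool would restore the full markup when saving.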

Hello Everyone,

After much thought, we are thinking of using the following approach in Urdu.

Pontoon: https://pontoon.mozilla.org/ur

Writing a hamza (ء) at the end of an LTR run converts it into RTL (for the editor). Adding a hamza won't change the meaning of the sentence, as it is a symbol that doesn't add meaning to a word.

What we would need is a tool that looks for the hamza (ء) and makes it a hidden element. So, in theory, it would be in the DOM but not shown to the end user.

Why not use an RLM? Assign it to a keyboard shortcut and you're good to go!
This is not the proper way to "solve" this issue. RLM/LRM and all the other marks are there for this exact thing.

Hello Safa,

Thanks for the reference.

I agree this is not a proper approach.

But using either
left-to-right mark: &lrm; or &#8206; (U+200E)
right-to-left mark: &rlm; or &#8207; (U+200F)

in the Pontoon editor doesn't work unless the editor is designed to handle it.

Hi!

I'm not sure what you mean. Pontoon doesn't need to do it itself, one can just insert it. We did that many times in Arabic translations. In Windows there are shortcuts for ZWJ/ZWNJ/LRM/RLM in the Arabic layout (Ctrl+Shift+[1-4]). In Linux you can just customize the keyboard layout, or use helper tools (Character map, etc).

Ideally, Pontoon would implement this: https://bugzilla.mozilla.org/show_bug.cgi?id=1372861
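
For anyone who wants to try the mark-based approach without typing the marks by hand, here is a minimal sketch that puts a left-to-right mark (U+200E) on both sides of every placeable. The `PLACEABLE` pattern and the `wrap_placeables` and `strip_marks` helpers are assumptions made for this example and are not part of Pontoon. The idea is that the neutral characters at the edges of a tag (`<`, `/`, `>`, `{`, `}`) then sit between two strong LTR characters and should resolve to LTR instead of following the RTL paragraph.

```python
import re

# Hypothetical pattern for XML-like tags and Fluent-style {$variables}.
PLACEABLE = re.compile(r"<[^<>]+>|\{\s*\$[^{}]+\}")

LRM = "\u200e"  # LEFT-TO-RIGHT MARK


def wrap_placeables(text):
    """Surround every placeable with LRM so its edge punctuation stays LTR."""
    return PLACEABLE.sub(lambda m: LRM + m.group(0) + LRM, text)


def strip_marks(text):
    """Remove all LRMs again, e.g. before comparing with the source string."""
    return text.replace(LRM, "")


if __name__ == "__main__":
    sample = 'ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔'
    marked = wrap_placeables(sample)
    # Two placeables, two marks each.
    print(len(marked) - len(sample))
```

Note that the marks become part of the stored translation, so whether this is acceptable depends on how the target file format and the product treat invisible characters.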

*This bug has been moved to GitHub.* *Please check it out on https://github.com/mozilla/pontoon/issues.*
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → MOVED
Product: Webtools → Webtools Graveyard