Closed Bug 1500333 Opened 7 years ago Closed 4 years ago

Mixed RTL and LTR content is hard to read in text inputs

Tracking

(Not tracked)

Status:

RESOLVED MOVED

People

(Reporter: alamM, Unassigned)

References

Details

Attachments

(4 files)

Screenshot (61).png 7 years ago Mahtab Alam [:alamM] 50.55 KB, image/png		Details
Urdu (ur) · Firefox Updated bidi algorithm.png 6 years ago Safa Alfulaij 52.18 KB, image/png		Details
Without Hamza Approach, a sentence would look like this in textarea. 5 years ago Mohammad Shahbaz Alam [:shahbaz17] 43.47 KB, image/jpeg		Details
Hamza Approach, this will be displayed correctly in textarea. 5 years ago Mohammad Shahbaz Alam [:shahbaz17] 45.42 KB, image/jpeg		Details

Mahtab Alam [:alamM]

Reporter

Description

•

7 years ago

Attached image Screenshot (61).png — Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0

Steps to reproduce:
I clicked on the XML tag and the external argument to insert them as they are, but they got mixed with one another.

Actual results:
Both got mixed with one another.

Expected results:
They should have remained separate.

Francesco Lodolo [:flod]

Comment 1

•

7 years ago

Sorry but I don't understand what the bug is about. Is it an issue with Pontoon? Is it an error with translation? If it's Pontoon, we need to move this bug, and you should provide a bit more information on what you did, and what the expected behavior was.

Mahtab Alam [:alamM]

Reporter

Comment 2

•

7 years ago

(In reply to Francesco Lodolo [:flod] from comment #1)
> Sorry but I don't understand what the bug is about.
>
> Is it an issue with Pontoon? Is it an error with translation? If it's
> Pontoon, we need to move this bug, and you should provide a bit more
> information on what you did, and what the expected behavior was.

Yes, this is with Pontoon. It's not a translation error. In the screenshot I attached, you can see that the XML tag and the external argument are separate in the original string, but in the translated string and in the translation panel they got mixed.

Francesco Lodolo [:flod]

Comment 3

•

7 years ago

I'm looking at the string in a text editor, and it seems correct to me?

https://pontoon.mozilla.org/ur/firefox/browser/browser/preferences/preferences.ftl/?search=extension-controlled-privacy-containers&string=178407

ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔

Component: ur / Urdu → Pontoon

Product: Mozilla Localizations → Webtools

Version: unspecified → other

Mahtab Alam [:alamM]

Reporter

Comment 4

•

7 years ago

(In reply to Francesco Lodolo [:flod] from comment #3)
> I'm looking at the string in a text editor, and it seems correct to me?
> https://pontoon.mozilla.org/ur/firefox/browser/browser/preferences/preferences.ftl/?search=extension-controlled-privacy-containers&string=178407
>
> ایک ایکسٹینشن , <img data-l10n-name="icon"/> {$name}, کو کنٹینر ٹیب کی ضرورت ہے۔

Putting it in a text editor corrects only the XML tag and the external argument, but the Urdu translation gets mismatched.

Matjaz Horvat [:mathjazz]

Comment 5

•

7 years ago

Steps to reproduce:
1. Type "a" in the textarea.
2. Click on the XML placeable: "<img data-l10n-name="icon"/>".

You get this in the textarea:
</"a<img data-l10n-name="icon

Mahtab, thanks for the report! What would be the expected value in the textarea after you insert the placeable?
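For reference, the garbled value in the steps above is consistent with how the Unicode bidirectional algorithm classifies the characters involved: the letters are strongly left-to-right, while the trailing `"`, `/` and `>` have no strong direction of their own, so at the end of a right-to-left paragraph they follow the paragraph direction and appear (mirrored) on the left. A minimal sketch in Python, using only the standard `unicodedata` module, lists the bidi class of each character. The helper name `bidi_classes` is hypothetical and this is not Pontoon code.

```python
import unicodedata


def bidi_classes(text: str) -> list[tuple[str, str]]:
    """Return (character, Unicode bidi class) pairs for every character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# The value from the steps to reproduce, in logical (typed) order.
sample = 'a<img data-l10n-name="icon"/>'

for ch, cls in bidi_classes(sample):
    # "L" is strong left-to-right; "ON" (other neutral) and "CS" (common
    # separator) have no strong direction and take it from their context.
    print(repr(ch), cls)
```

The stored string is unchanged in this scenario; only the visual order in the right-to-left textarea differs.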

Status: UNCONFIRMED → NEW

Ever confirmed: true

Priority: -- → P3

Summary: XML Tag and External Argument got mismatched → [RTL] XML placeables get modified when inserted into textarea

Mahtab Alam [:alamM]

Reporter

Comment 6

•

7 years ago

The expected value should be <img data-l10n-name="icon"/> a or a <img data-l10n-name="icon"/> depending upon the context.

Matjaz Horvat [:mathjazz]

Comment 7

•

7 years ago

Thanks! I'm looking at how this works for Hebrew (another RTL locale), which has an approved translation in Pontoon:
https://pontoon.mozilla.org/he/firefox/browser/browser/preferences/preferences.ftl/?string=178407

Which means it's also in the file:
https://hg.mozilla.org/l10n-central/he/file/2c05277ec42e/browser/browser/preferences/preferences.ftl#l91

According to Comment 6, the XML tag in the file output (LTR) seems correct. So I suspect the problem is that the string contains both RTL and LTR content, and we force <textarea> to use RTL.

I wonder what we can even do about this. Flagging Amir with a NI, who's been helping us with RTL issues in the past (see bug 1190566 for example).

Flags: needinfo?(amir.aharoni)

Mohammed Yaseen Khan [:foxt7ot]

Comment 8

•

7 years ago

Thanks Mahtab for raising the issue. Yes Matjaz, your hunch is correct. The issue occurs because the string contains both RTL and LTR characters. This issue was there in Pootle as well, and I suspect this is the case with other RTL locales as well.

Amir Aharoni

Comment 9

•

7 years ago

Sorry, noticed it only now.

Unfortunately, I cannot think of any way to fix this easily. It's a major inherent problem with how RTL languages work. Mixing RTL text with any kind of left-to-right code, including XML, is always a disaster. This is why translating into RTL languages in text files is so awful: in translation files every single line has some LTR text in it, so *everything* is jumbled.

Using any web-based translation solution such as Pontoon makes it much better, because it separates the translation from the source string and from the LTR string key. However, it doesn't fix this problem completely because some code or markup is quite often embedded in the string itself, as it is in this example.

The ways to fix such things are:

* Make Pontoon have super-smart input boxes that are not just plain text, but that are able to truly separate code from text. It would be super-cool, but probably very complicated to make.

* Create aliases in RTL languages for XML element and attribute names. If it's done, then in Hebrew it would look like this:
הרחבה" בשם <תמונה נתונים-תרגום-שם="סמל"/> {$שם} דורשת שימוש במגירת לשוניות."
In theory, it would solve the problem, but it may introduce other problems, and it's a bit of a bottomless pit.

* The most realistic solution is to have a policy that strongly encourages developers to avoid any kind of code or markup in translatable strings, unless it's really, really needed. It would be good for translators to all languages and not only to RTL ones, because it will make it easier for non-developers to translate. (For many people who grew up with the 1990s web, HTML and similar things are natural, but it's not true for everyone. There are people who could be great translators, but who have a hard time with markup languages, and reducing this problem may increase volunteers' participation.)

Flags: needinfo?(amir.aharoni)

Matjaz Horvat [:mathjazz]

Comment 10

•

7 years ago

Thanks for a very valuable input, Amir! I'm lowering the priority until we find a meaningful way forward.

Priority: P3 → P4

Summary: [RTL] XML placeables get modified when inserted into textarea → Mixed RTL and LTR content is hard to read in text inputs

Amir Aharoni

Comment 11

•

7 years ago

(In reply to Matjaz Horvat [:mathjazz] from comment #10)
> Thanks for a very valuable input, Amir!

Sure, happy to help any time. Sorry it took so long.

> I'm lowering the priority until we find a meaningful way forward.

The most realistic way, as I mention at the end of my comment, is not so much in the area of feature development, but in the area of policies and practices for writing, reviewing, and maintaining code: strongly encourage developers to move as much code and markup out of translatable strings as possible.

Safa Alfulaij

Comment 12

•

6 years ago

It might help to add a "Raw mode" like Pootle did. Here everything is broken up and forced LTR so you can check tags and other stuff easily.
Link: https://github.com/translate/pootle/issues/3941
There is no other way of fixing it as I see it.

Axel Hecht [:Pike]

Updated

•

6 years ago

Safa Alfulaij

Comment 13

•

6 years ago

Attached image Urdu (ur) · Firefox Updated bidi algorithm.png — Details

This is how I see it. Yes it has a problem, but not a big one.

Tbh, eliminating markup from text strings is a bad idea, absolutely bad. You create different parts, making translation much harder. Developers need to provide proper context, and mistakes occur.
I believe that translators who translate applications must have at least a bit of knowledge of technical aspects like variables, placeholders, plurals, and so on.

Jeff Beatty [:gueroJeff]

Comment 14

•

6 years ago

(In reply to Amir Aharoni from comment #11)
> (In reply to Matjaz Horvat [:mathjazz] from comment #10)
> > Thanks for a very valuable input, Amir!
>
> Sure, happy to help any time. Sorry it took so long.
>
> > I'm lowering the priority until we find a meaningful way forward.
>
> The most realistic way, as I mention at the end of my comment, is not so much
> in the area of feature development, but in the area of policies and
> practices for writing, reviewing, and maintaining code: strongly encourage
> developers to move as much code and markup out of translatable strings as
> possible.

Thanks Amir, but in our experience it's often more effort/cost to change developer behavior. We already ask developers to be aware of how much code they're including in strings, but we don't have the resources for any strict enforcement.

Your first suggestion is consistent with how the majority of other computer-assisted translation tools handle code/tagged elements. They're condensed in the string automatically and the user has to expand them manually if they want to see or manipulate their content. Here's a good example: https://docs.sdl.com/LiveContent/content/en-US/SDL%20Trados%20Studio%20Help-v4/GUID-C6676C93-2EEF-4945-9438-905F05EF268E
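The condensing behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `PLACEABLE` pattern, the `[n]` token format, and the function names `collapse_placeables` and `expand_placeables` are hypothetical and are not taken from Pontoon or any CAT tool.

```python
import re

# Assumed pattern: XML-like tags and Fluent-style {$variable} placeables.
PLACEABLE = re.compile(r'<[^<>]+>|\{\s*\$[^{}]+\}')


def collapse_placeables(text: str) -> tuple[str, list[str]]:
    """Replace each placeable with a numbered token such as [1]."""
    originals: list[str] = []

    def _swap(match: re.Match) -> str:
        originals.append(match.group(0))
        return f"[{len(originals)}]"

    return PLACEABLE.sub(_swap, text), originals


def expand_placeables(text: str, originals: list[str]) -> str:
    """Put the original placeables back in place of their tokens."""
    return re.sub(r'\[(\d+)\]', lambda m: originals[int(m.group(1)) - 1], text)


source = 'x <img data-l10n-name="icon"/> {$name} y'
collapsed, found = collapse_placeables(source)
print(collapsed)
print(expand_placeables(collapsed, found) == source)
```

A real editor would presumably render the tokens as non-editable widgets instead of plain text, and would have to cope with translations that already contain text shaped like a token; neither is handled here.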

Mohammad Shahbaz Alam [:shahbaz17]

Comment 15

•

5 years ago

Hello Everyone,

After much thought, we are thinking of using the following approach in Urdu.

Pontoon: https://pontoon.mozilla.org/ur

Writing a hamza (ء) at the end of an LTR run converts it to RTL (for the editor). Adding a hamza won't change the meaning of the sentence, as it is a symbol that doesn't add meaning to a word.

What we expect is a tool that looks for the hamza (ء) and makes it a hidden element. So, in theory, it will be in the DOM but not shown to the end user.

Mohammad Shahbaz Alam [:shahbaz17]

Comment 16

•

5 years ago

Attached image Without Hamza Approach, a sentence would look like this in textarea. — Details

Mohammad Shahbaz Alam [:shahbaz17]

Comment 17

•

5 years ago

Attached image Hamza Approach, this will be displayed correctly in textarea. — Details

Safa Alfulaij

Comment 18

•

5 years ago

Why not use an RLM? Assign it to a shortcut in the keyboard and you're good to go!
This is not the proper way to "solve" this issue. RLM/LRM and all the other marks are there for this exact thing.
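As a rough illustration of the directional-mark idea, the sketch below surrounds each placeable with a LEFT-TO-RIGHT MARK (U+200E) so that the punctuation at the edges of a tag sits between two strongly LTR characters. The choice of LRM on both sides, the `PLACEABLE` pattern, and the function name `guard_placeables` are assumptions made for this example; nothing here reflects how Pontoon behaves.

```python
import re

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, strongly left-to-right

# Assumed pattern: XML-like tags and Fluent-style {$variable} placeables.
PLACEABLE = re.compile(r'<[^<>]+>|\{\s*\$[^{}]+\}')


def guard_placeables(text: str) -> str:
    """Surround every placeable with LRM so its edge punctuation resolves as LTR."""
    return PLACEABLE.sub(lambda m: f"{LRM}{m.group(0)}{LRM}", text)


guarded = guard_placeables('a<img data-l10n-name="icon"/>')
print([hex(ord(ch)) for ch in guarded if ch == LRM])
```

The marks are invisible but remain part of the saved string, so whether they are acceptable in shipped translations is a separate decision.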

Mohammad Shahbaz Alam [:shahbaz17]

Comment 19

•

5 years ago

Hello Safa,

Thanks for the reference.

I agree this is not a proper approach.

But using either
left-to-right mark: LRM (U+200E)
right-to-left mark: RLM (U+200F)

in the Pontoon editor doesn't work unless the editor is designed to handle it this way.

Safa Alfulaij

Comment 20

•

5 years ago

Hi!

I'm not sure what you mean. Pontoon doesn't need to do it itself; one can just insert it. We did that many times in Arabic translations. On Windows there are shortcuts for ZWJ/ZWNJ/LRM/RLM in the Arabic layout (Ctrl+Shift+[1-4]). On Linux you can just customize the keyboard layout, or use helper tools (Character Map, etc.).

Ideally, Pontoon would implement this: https://bugzilla.mozilla.org/show_bug.cgi?id=1372861

BMO Automation

Comment 21

•

4 years ago

*This bug has been moved to GitHub.* *Please check it out on https://github.com/mozilla/pontoon/issues.*

Status: NEW → RESOLVED

Closed: 4 years ago

Resolution: --- → MOVED

BMO Automation

Updated

•

4 years ago

Product: Webtools → Webtools Graveyard
