Closed Bug 1190566 Opened 9 years ago Closed 7 years ago

[translate] Set lang and dir attributes on translation textarea

Categories

(Webtools Graveyard :: Pontoon, defect, P2)

Tracking

(firefox42 affected)

RESOLVED FIXED

People

(Reporter: amir.aharoni, Assigned: mathjazz)

The textarea element for writing the translation (id="translation") must have appropriate lang and dir attributes.

dir is vital because when translating to Hebrew I have to switch the direction manually. This must be automatic and it should be very easy.

The lang attribute is important as well, because for some languages it is used for setting the correct font.
Amir, I have a few questions WRT to this bug and also bugs 1190796 and 1190953.

1. When should we set dir="rtl"? I guess only when translating to RTL languages, not based on browser language? I think only Hebrew and Arabic from this list are RTL:
https://pontoon.mozilla.org/projects/pontoon-intro/

2. Where should we set lang and dir attributes? In all HTML elements that display translations - translation textarea, translation list, helpers (e.g. translation memory) and also contributors pages?
https://pontoon.mozilla.org/contributors/amir.aharoni@mail.huji.ac.il/

3. Is setting dir and lang enough, or do we also need to modify CSS (to set text alignment or something)?

Thanks for clarifying this and also for being an early adopter. :-)
(In reply to Matjaz Horvat [:mathjazz] from comment #1)
> Amir, I have a few questions WRT to this bug and also bugs 1190796 and
> 1190953.
> 
> 1. When should we set dir="rtl"? I guess only when translating to RTL
> languages, not based on browser language?

Of course, according to the language into which the translation is done.

> I think only Hebrew and Arabic
> from this list are RTL:
> https://pontoon.mozilla.org/projects/pontoon-intro/

Definitely also Persian (fa) and Urdu (ur).

Possibly also Kurdish (ku), but please check what it actually is by looking at the current translations in the files. If it's the Arabic script, then it's RTL, if it's Latin, then LTR.

> 2. Where should we set lang and dir attributes? In all HTML elements that
> display translations - translation textarea, translation list, helpers (e.g.
> translation memory) and also contributors pages?
> https://pontoon.mozilla.org/contributors/amir.aharoni@mail.huji.ac.il/

Yes, on every element that displays a translated string.

Thanks for the link to https://pontoon.mozilla.org/contributors/amir.aharoni@mail.huji.ac.il/ - it's very nice and useful. The <p> with translation there does need lang="XX" dir="XXX".

> 3. Is setting dir and lang enough, or do we also need to modify CSS (to set
> text alignment or something)?

As far as I can see so far, HTML dir and lang are enough. If I see that more is needed, I'll open new bugs ;)
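To make the answers above concrete, here is a minimal sketch of deriving both attributes from the target locale. The `RTL_LOCALES` set and the `text_direction` and `lang_dir_attrs` helpers are hypothetical names used for illustration, not Pontoon's actual code. The set lists only the locales named as RTL in this comment; Kurdish is left out until its script has been checked.

```python
# Hypothetical helpers for illustration; not Pontoon's actual code.
# Locales named as RTL in this thread. Kurdish (ku) is omitted
# because its script still needs to be checked.
RTL_LOCALES = {"ar", "fa", "he", "ur"}


def text_direction(locale_code: str) -> str:
    """Return "rtl" or "ltr" for a locale code such as "he" or "pt-BR"."""
    language = locale_code.split("-")[0].lower()
    return "rtl" if language in RTL_LOCALES else "ltr"


def lang_dir_attrs(locale_code: str) -> str:
    """Build the attribute string for any element that shows a translation."""
    return f'lang="{locale_code}" dir="{text_direction(locale_code)}"'


print(f'<textarea id="translation" {lang_dir_attrs("he")}></textarea>')
print(f'<p {lang_dir_attrs("sl")}>...</p>')
```

The same helper could be reused for the translation list, the helpers, and the contributor pages, so that every element showing a translated string gets consistent attributes.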
Commit pushed to master at https://github.com/mozilla/pontoon

https://github.com/mozilla/pontoon/commit/cf41ff52aaae151228ec9254de75db5ee551a57b
Bug 1190566: Set dir and lang attributes on #translation

- dir is needed to set writing direction properly for RTL languages
- lang is important for some languages to set the correct font
While dir=rtl will make the application more usable for RTL languages, dir=auto has the benefit of automatically setting the direction based on the textarea content. That is, if a string is LTR in an RTL locale, it should not be set to dir=rtl.
Tomer, does that mean we should basically just set dir=auto everywhere? Does that work reliably across browsers?
(In reply to Matjaz Horvat [:mathjazz] from comment #5)
> Tomer, does that mean we should basically just set dir=auto everywhere? Does
> that work reliably across browsers?

It would be supported on every modern browser, and older browsers would ignore this keyword.

https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/dir
OK, I'll make a patch to just use dir=auto everywhere; that way we don't even need the information about which locales are RTL.

In the meantime, if you're able to log in, you can try out the current solution (with dir=rtl) on our staging server:
https://mozilla-pontoon-staging.herokuapp.com/he/pontoon-intro/
Deployed to production.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
dir=auto is incorrect. dir=auto is for cases when the language is *not* known. In this case the language is known, and the direction must be set according to it.

dir=auto will show strings that happen to begin with a Latin letter incorrectly.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
... Also, dir=auto is *totally* wrong for textarea. For an empty textarea the result will always be that the automatic direction is ltr. It must have the correct direction before the user starts typing.
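Both objections follow from how `dir=auto` picks a direction: it looks for the first character with a strong direction and falls back to ltr when there is none. The sketch below approximates that heuristic in Python with the standard `unicodedata` module. `auto_direction` is a hypothetical name, and the real browser algorithm has additional rules (for example around isolates) that are ignored here.

```python
import unicodedata


def auto_direction(text: str) -> str:
    """Approximate dir=auto: the first strong character decides."""
    for char in text:
        bidi_class = unicodedata.bidirectional(char)
        if bidi_class == "L":          # strong left-to-right (Latin, ...)
            return "ltr"
        if bidi_class in ("R", "AL"):  # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    # No strong character at all, e.g. an empty textarea.
    return "ltr"


print(auto_direction(""))           # empty textarea
print(auto_direction("שלום עולם"))  # starts with a Hebrew letter
print(auto_direction("%s קבצים"))   # starts with a Latin placeholder
```

Under this approximation the empty string and the `%s` string both resolve to ltr, and only the string that starts with a Hebrew letter resolves to rtl.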
All other web-based translation sites that I can see use explicit dir="rtl" lang="he" and that's the correct thing to do. This includes Pootle, Transifex, OneSky and Facebook's translation interface. I cannot find anybody who uses dir=auto for this. It must be an explicit direction.
dir=auto is pretty new, and this is why most applications don't use it. Having dir=rtl will force the direction, making Latin strings appear wrong; thus, dir=auto is the better trade-off.
dir=auto has been around since 2011, so it's not very new. It was meant for elements the language of which *is not known*.

Strings that are supposed to remain untranslated are not supposed to appear in the translation interface at all.

Strings that are supposed to be translated must have _appropriate_ lang and dir. ltr for French and Slovenian, rtl for Arabic and Hebrew. Not "auto". "auto" gives the textarea the *wrong* direction.
As I told you, some strings are preferences and properties. For example ui.direction is a real string, and its "Hebrew translation" is rtl. Without dir=auto such strings will get the wrong direction, and that is worse than having less semantic markup.
(In reply to Tomer Cohen :tomer from comment #17)
> For example ui.direction is a real string, and its "Hebrew translation" is rtl.

This is an excellent example of a string that shouldn't appear in the translation interface at all.
(In reply to Amir Aharoni from comment #18)
> This is an excellent example of a string that shouldn't appear in the
> translation interface at all.
Well, it is, and it is not the scope of Pontoon to make the world better. We need these strings, and if they'd be hidden, some locales may miss them.
(In reply to Tomer Cohen :tomer from comment #19)
> (In reply to Amir Aharoni from comment #18)
> > This is an excellent example of a string that shouldn't appear in the
> > translation interface at all.
> Well, it is, and it is not the scope of Pontoon to make the world better. We
> need these strings, and if they'd be hidden, some locales may miss them.

It is. Pontoon is a translation tool, and a translation tool is not supposed to show strings that are not supposed to be translated.

Defining a language as "rtl" is not a matter to be handled by translators. It's a matter to be handled by developers.
It is not just the direction strings. The locale files are used for setting default encoding, fonts, and other preferences as well. These strings are loaded into Pontoon the same way as other strings, and the translator's attention is required, for example, in cases where it is necessary to change the window proportions etc. If you think this is wrong, you are welcome to suggest something better on another product/component. Pontoon isn't responsible for the way Mozilla translation works; it just uses the format we have been using for about 15 years.
There is no automated way of detecting which strings should not be localizable. Doing it manually for tens of thousands of strings available for translation in Pontoon is also not an option. It has to be handled on a per-project basis by developers.

Until this is fixed, we have to live with what we have. And it sounds like auto is a better compromise than a hardcoded dir value.

Also, correct me if I'm wrong, but textarea is only set to LTR until you start typing, then it switches to RTL immediately after it detects the RTL language character. Isn't this good enough?
(In reply to Matjaz Horvat [:mathjazz] from comment #22)
> There is no automated way of detecting which strings should not be
> localizable. Doing it manually for tens of thousands of strings available
> for translation in Pontoon is also not an option. It has to be handled on a
> per-project basis by developers.

Well, it is done in the configuration of translatewiki.net , so it's definitely not impossible.

> Until this is fixed, we have to live with what we have. And it sounds like
> auto is a better compromise than a hardcoded dir value.

I have to disagree.

I don't mind seeing strings that aren't supposed to be translated as "broken", given that they aren't supposed to be there at all. Contrariwise, it will draw my attention to them.

> Also, correct me if I'm wrong, but textarea is only set to LTR until you
> start typing, then it switches to RTL immediately after it detects the RTL
> language character. Isn't this good enough?

It's true, but no, it's quite ugly. It should be rtl immediately.

That's the kind of thing that is done in Twitter, Google+ and YouTube comments, where the language in which the user is going to type is unknown, and it's OK there. But in Pontoon we know in which language the user is going to type.
(In reply to Amir Aharoni from comment #23)
> I don't mind seeing strings that aren't supposed to be translated as
> "broken", given that they aren't supposed to be there at all. Contrariwise,
> it will draw my attention to them.

That's a valid argument. That being said, these strings will mostly stay in, because at least one locale has to translate them. If you have a look at Firefox Accounts, you will notice that even year numbers are localizable and they need to be (for e.g. Bengali): https://pontoon.mozilla.org/he/firefox-accounts/

The best thing we can do is detect them as placeables, which is why they are marked (with red color, which is probably not the best choice). So you can copy them to the textarea by simply clicking on them, or you can press Alt + C to copy the entire string.

> It's true, but no, it's quite ugly. It should be rtl immediately.
> 
> That's the kind of thing that is done in Twitter, Google+ and YouTube
> comments, where the language in which the user is going to type is unknown,
> and it's OK there. But in Pontoon we know in which language the user is
> going to type.

I think we can solve this by setting dir=rtl in empty textarea and switch it to dir=auto when it's not empty.
> > It's true, but no, it's quite ugly. It should be rtl immediately.
> > 
> > That's the kind of thing that is done in Twitter, Google+ and YouTube
> > comments, where the language in which the user is going to type is unknown,
> > and it's OK there. But in Pontoon we know in which language the user is
> > going to type.
> 
> I think we can solve this by setting dir=rtl in empty textarea and switch it
> to dir=auto when it's not empty.

Does that sound like a reasonable resolution?
Guys, let me know if the compromise suggested above (dir=rtl by default, switched to dir=auto if not empty) sounds good to you.
Flags: needinfo?(tomer.moz.bugs)
Flags: needinfo?(amir.aharoni)
No. This won't work correctly with textareas that happen to start with English strings, the simplest example being "%s". They will be shown incorrectly initially and the translator will have to switch the direction. The whole point of this bug is that I want the translator not to have to switch.

It's basically a choice of which strings will be shown incorrectly initially:
1. Strings that are supposed to be translated, but happen to start with English strings.
2. Strings that are not supposed to be translated.

Since strings from #2, ideally, are not supposed to exist at all, I clearly prefer to have them show up incorrectly, and to have strings from #1 show correctly.
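The trade-off can be sketched by running the three policies discussed so far over a few sample strings for a Hebrew locale. Everything here is illustrative: the policy names, the helper functions and the sample strings are assumptions, and `first_strong` only approximates what browsers do for `dir=auto`.

```python
import unicodedata


def first_strong(text: str, default: str) -> str:
    """Direction of the first strong character, or `default` if none."""
    for char in text:
        bidi_class = unicodedata.bidirectional(char)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default


def resolve(policy: str, text: str, locale_dir: str = "rtl") -> str:
    """Direction a textarea would get under each (hypothetical) policy."""
    if policy == "explicit":  # always the locale direction
        return locale_dir
    if policy == "auto":      # dir=auto everywhere
        return first_strong(text, "ltr")
    if policy == "hybrid":    # locale direction when empty, else auto
        return locale_dir if text == "" else first_strong(text, "ltr")
    raise ValueError(f"unknown policy: {policy}")


# Empty textarea, a category 1 string, and a category 2 string.
samples = ["", "%s קבצים", "rtl"]
for policy in ("explicit", "auto", "hybrid"):
    print(policy, [resolve(policy, text) for text in samples])
```

Under these assumptions only the explicit policy gives the category 1 string (`%s` followed by Hebrew) the locale direction. The cost is that the category 2 value `rtl` also gets a right-to-left base direction.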
Flags: needinfo?(amir.aharoni)
(In reply to Amir Aharoni from comment #27)
> It's basically a choice of which strings will be shown incorrectly initially:
> 1. Strings that are supposed to be translated, but happen to start with
> English strings.

Unless the strings files will have some context metadata, such as what it is expected to contain, this idea can't be made possible, and anyway Pontoon has nothing to do with requests for changing the way Mozilla translation currently works.
Flags: needinfo?(tomer.moz.bugs)
Summary: The textarea for writing the translation must have appropriate lang and dir attributes → [translate] Set lang and dir attributes on translation textarea
Priority: -- → P4
Coming back to this...

It's still dir=auto, and it's wrong.

It causes strings to be displayed incorrectly. Just as one example out of many, the Hebrew translation of the string "“%1$S” has arrived from %2$S." in Firefox Aurora account.properties begins with "%1$S", which forces it to be left-to-right.

Hebrew strings must be explicitly right-to-left and not auto.
Priority: P4 → P3
I completely agree with Amir here: strings starting with strong LTR characters are currently displayed with the wrong direction, and it gets really hairy when the string has more than one such placeable, e.g.:

    %1$S one %2$S two %3$S three

Will be shown (when translated) as:

    %1$S 1% ONE$S 1% TWO$S THREE

(like the string below which I didn’t mangle in any way):

    %1$S واحد %1$S اثنين %1$S ثلاثة

Which is completely confusing and almost all translators will get it wrong.

Cases like ui.direction or year numbers are a red herring IMO, as they will look just fine whether the base dir is LTR or RTL (the first is a single word, and the other has no characters with strong direction). Even if it matters, they are so rare and exceptional that we should side with the common, normal cases, not with them.
Of course if we want to get really fancy, we can make placeables “unicode-bidi: isolate”, but then we won’t be able to use a textarea with all the complication of that.
Shifting priority to resolve this by the end of the quarter.
Priority: P3 → P2
(In reply to Khaled Hosny from comment #32)
> Of course if we want to get really fancy, we can make placeables
> “unicode-bidi: isolate”, but then we won’t be able to use a textarea with
> all the complication of that.

This is not very difficult since we can use the plain Unicode equivalents for isolation LRI/RLI/FSI..PDI for each placeholder. What really make it difficult is how to deal with words you wish to keep untranslated in strings such as "Firefox is your default browser" (The word Firefox is the brand name, and has to be kept untranslated according to the guidelines) - The solution here is to make sure such cases never occur ('Firefox' should be declared only once in the whole codebase), but sadly this is not the case.
(In reply to Tomer Cohen :tomer from comment #34)
> (In reply to Khaled Hosny from comment #32)
> > Of course if we want to get really fancy, we can make placeables
> > “unicode-bidi: isolate”, but then we won’t be able to use a textarea with
> > all the complication of that.
> 
> This is not very difficult since we can use the plain Unicode equivalents
> for isolation LRI/RLI/FSI..PDI for each placeholder.

You will need to filter them when saving the text into the database, but such control characters can also be inserted by the user (I use bidi control characters all the time) and you will need to tell which characters were inserted by the translator and which were inserted by Pontoon and it gets ugly very quickly. You also will need to make the inserted control characters uneditable, or else they can be accidentally removed by the user causing havoc in the string ordering.

> What really make it
> difficult is how to deal with words you wish to keep untranslated in strings
> such as "Firefox is your default browser" (The word Firefox is the brand
> name, and has to be kept untranslated according to the guidelines) - The
> solution here is to make sure such cases never occur ('Firefox' should be
> declared only once in the whole codebase), but sadly this is not the case.

That is not limited to display during translation; you will have the same problem when the string is actually used in the UI, and translators need to deal with that (usually by manually inserting bidi control characters). Making it easy to insert and view control characters would be great (Pootle goes a long way in nicely handling this, though not 100% perfect yet). I think this is really a separate issue.

Also in Arabic we transliterate Firefox so it is not an issue for us (I know Mozilla says not to, but I maintain that not transliterating it is unreasonable and anti-localization).
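The isolation idea and the objection to it can be sketched as follows. This assumes printf-style placeholders such as `%s` and `%1$S`; the regular expression and the function names are hypothetical. The naive `strip_isolates` deliberately shows the problem raised above: it cannot tell isolates added by the tool from ones the translator typed.

```python
import re

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# Assumed placeholder syntax: %s, %S, %1$S, %2$s, ...
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sS]")


def isolate_placeholders(text: str) -> str:
    """Wrap each placeholder in FSI...PDI so it is laid out on its own."""
    return PLACEHOLDER.sub(lambda m: f"{FSI}{m.group(0)}{PDI}", text)


def strip_isolates(text: str) -> str:
    """Naive cleanup before saving: removes ALL FSI/PDI characters,
    including any the translator inserted on purpose."""
    return text.replace(FSI, "").replace(PDI, "")


source = "%1$S one %2$S two %3$S three"
wrapped = isolate_placeholders(source)
print(wrapped.count(FSI), wrapped.count(PDI))  # one pair per placeholder
print(strip_isolates(wrapped) == source)       # round-trips for this input
```

The round trip only works here because the source contains no control characters of its own. A translation that already uses FSI or PDI would lose them on save, which is the ambiguity described in the comment above.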
Assignee: nobody → m
Commit pushed to master at https://github.com/mozilla/pontoon

https://github.com/mozilla/pontoon/commit/dad1a870b9abb4b01196e2955a34d2248c2dd3e9
RTL fixes (+ Locales tab sorting) (#585)

* Sort other locales alphabetically

* Fix bug 1190566: Use actual direction instead of auto

Wherever the `dir` attribute is set, we now set its value to the content
locale direction instead of relying on `auto`.

We also update `dir` and `lang` attributes of the translation textarea
every time translation interface is reloaded (which could include
replacing the target locale).

* Fix bug 1357945: Use italic only for Open Sans scripts

* Fix bug 1357919: Right-align RTL text

* Bug 1351813: Avoid using !important in CSS

Explicitly set Arabic and Persian-specific font-size.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 7 years ago
Resolution: --- → FIXED
Commit pushed to master at https://github.com/mozilla/pontoon

https://github.com/mozilla/pontoon/commit/8a72fd18f8856077417d8ce176a974b7e01df71e
Fix bug 1190566: Use actual direction instead of auto

Wherever the `dir` attribute is set, we now set its value to the content
locale direction instead of relying on `auto`.

We also update `dir` and `lang` attributes of the translation textarea
every time translation interface is reloaded (which could include
replacing the target locale).
\o/ Thank you!

When is this supposed to be updated on pontoon.mozilla.org ?
Deployed!
My goodness, thank you so much! I owe you a 
(Bugzilla ate my beautiful emoji. It was a hug, a beer, and a falafel.)
You don't owe me anything (although a falafel sounds tempting!) - it took almost 2 years! :-)
Product: Webtools → Webtools Graveyard