Closed Bug 136026 Opened 24 years ago Closed 24 years ago

Unicode Arabic digits rendered right-to-left, should be left-to-right

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Linux
defect
Not set
minor

RESOLVED INVALID

People

(Reporter: jpatokal, Assigned: mkaply)

Attachments

(2 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020310
BuildID: 20020311008

While Arabic is written right-to-left, Arabic numerals are written left-to-right. However, Mozilla incorrectly treats the Unicode ARABIC-INDIC DIGIT characters U+0660 through U+0669, written as character entities, as "normal" Arabic and combines them right-to-left. See Unicode sections 3.12 (bidirectionality) and 8.2 (Arabic) for a more extensive discussion. The Arabic-Indic digit type is defined as Weak/AN; while the standard does not explicitly spell out their default direction, I do not know of any language where they are *not* used left-to-right.

Reproducible: Always

Steps to Reproduce:
View any page with Arabic numerals written as Unicode character entities and observe. (One such page is included as the URL above, although its author mistakenly thinks that Mozilla's RTL order is correct!)

Actual Results: The numbers are rendered from right to left.

Expected Results: The numbers are rendered from left to right.

For example, here are the Arabic numbers 0 to 3 as Unicode character entities: ٠١٢٣
They should be rendered as "0123" with zero (which looks like ".") first, but Mozilla incorrectly reverses them into "3210".
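For anyone who wants to check the Weak/AN classification mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. It reflects whatever Unicode version ships with the Python build in use, not necessarily the version current when this bug was filed, and the helper name `bidi_class` is only illustrative.

```python
import unicodedata

def bidi_class(ch):
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)

# ARABIC-INDIC DIGIT ZERO through NINE, U+0660..U+0669
arabic_indic_digits = [chr(cp) for cp in range(0x0660, 0x066A)]

for ch in arabic_indic_digits:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {bidi_class(ch)}")

# For comparison: a European digit, an Arabic letter, and a space
for ch in ("0", "\u0627", " "):
    print(f"U+{ord(ch):04X}: {bidi_class(ch)}")
```

The Arabic-Indic digits report `AN` (Arabic number, a weak type), the European digit reports `EN`, the Arabic letter ALEF reports `AL`, and the space reports `WS`.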
As the HTML above was not executed, let's try a live demonstration:</pre><P> 0123<BR> &#x660;&#x661;&#x662;&#x663;<P> <pre>
And of course the HTML was escaped, grumble. Well, I'll shut up now, go ahead and try it...
Attached file Testcase with 0123
Uhm, I don't get it. It's rendered left to right (like numbers should be). Using Linux build 2002040510. I don't speak Arabic, so I'm probably missing the point.
A-ha, it looks like the order is only reversed if there are spaces between the entities (!). Here's a new testcase that shows the results both with and without spaces.
...and, on further reflection, when the numbers are separated by whitespace they are treated as individual LTR elements in a row of RTL, and are thus rendered "backwards". Whether this behavior is correct or not depends on whether a row of pure ARABIC-INDIC digits should be considered a row of RTL.
The rendering is correct. Rule N1 of the Unicode Bidi Algorithm (http://www.unicode.org/unicode/reports/tr9/#N1) states "A sequence of neutrals takes the direction of the surrounding strong text if the text on both sides has the same direction. European and Arabic numbers are treated as though they were R." So the whitespace between the Arabic-Indic digits resolves to right-to-left, and the digits are laid out as separate numbers in a right-to-left run. This doesn't happen in a line of only European numbers (unless the base direction is right-to-left, of course), because rule W7 (http://www.unicode.org/unicode/reports/tr9/#W7) has already changed their types to left-to-right. Marking INVALID.
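The explanation above can be traced with a toy reordering routine. This is only a sketch under heavy assumptions, not Mozilla's implementation and not a conforming Bidi Algorithm: it handles a single line of letters, digits and spaces, ignores explicit embeddings, separators and terminators, and treats every other character as neutral. The function name `visual_order` and its `base_level` parameter are made up for illustration.

```python
import unicodedata

def visual_order(text, base_level=0):
    """Toy subset of the Bidi Algorithm (W2, W3, W7, N1, N2, I1, I2, L2)."""
    types = [unicodedata.bidirectional(ch) for ch in text]
    embedding = "L" if base_level % 2 == 0 else "R"
    n = len(types)

    # Weak types: W2 (EN after AL becomes AN), W3 (AL becomes R),
    # W7 (EN after L becomes L). Anything unhandled is treated as neutral "N".
    last_strong = embedding  # start-of-sequence direction
    for i, t in enumerate(types):
        if t in ("L", "R", "AL"):
            last_strong = t
            if t == "AL":
                types[i] = "R"
        elif t == "EN":
            if last_strong == "AL":
                types[i] = "AN"
            elif last_strong == "L":
                types[i] = "L"
        elif t != "AN":
            types[i] = "N"

    def side(t):
        # N1: European and Arabic numbers are treated as though they were R
        return "L" if t == "L" else "R"

    # Neutrals: N1 (take the surrounding direction if both sides agree),
    # N2 (otherwise take the embedding direction).
    i = 0
    while i < n:
        if types[i] != "N":
            i += 1
            continue
        j = i
        while j < n and types[j] == "N":
            j += 1
        before = side(types[i - 1]) if i > 0 else embedding
        after = side(types[j]) if j < n else embedding
        resolved = before if before == after else embedding
        for k in range(i, j):
            types[k] = resolved
        i = j

    # Implicit levels: I1 for an even base level, I2 for an odd one.
    levels = []
    for t in types:
        if base_level % 2 == 0:
            levels.append(base_level + (0 if t == "L" else 1 if t == "R" else 2))
        else:
            levels.append(base_level + (0 if t == "R" else 1))

    # L2: from the highest level down to the lowest odd level, reverse each
    # contiguous run of characters at that level or higher.
    chars = list(text)
    odd_levels = [lv for lv in levels if lv % 2]
    if odd_levels:
        for level in range(max(levels), min(odd_levels) - 1, -1):
            i = 0
            while i < n:
                if levels[i] < level:
                    i += 1
                    continue
                j = i
                while j < n and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
    return "".join(chars)

# Arabic-Indic 0123, without and with spaces, in a left-to-right paragraph
print(visual_order("\u0660\u0661\u0662\u0663"))
print(visual_order("\u0660 \u0661 \u0662 \u0663"))
# European digits with spaces, LTR base and then RTL base
print(visual_order("0 1 2 3"))
print(visual_order("0 1 2 3", base_level=1))
```

Under these assumptions the unspaced Arabic-Indic digits keep their order, the spaced ones come out reversed, and the spaced European digits are reversed only when the base direction is right-to-left, which matches the behavior described in the comments above.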
Status: UNCONFIRMED → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text