Unicode Arabic digits rendered right-to-left, should be left-to-right




17 years ago
11 years ago


(Reporter: jpatokal, Assigned: mkaply)







(2 attachments)



17 years ago
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020310
BuildID:    20020311008

While Arabic is written right-to-left, Arabic numerals are written
left-to-right.  However, Mozilla incorrectly treats the Unicode ARABIC-INDIC DIGIT
characters U+0660 through U+0669, given as character entities, as "normal" Arabic and combines them

See sections 3.12 (bidirectionality) and 8.2 (Arabic) of the Unicode Standard
for a more extensive discussion.  The bidirectional type of the Arabic-Indic
digits is defined as Weak/AN; while the standard does not explicitly spell out
their default direction, I do not know of any language where they are *not*
used left-to-right.
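
The Weak/AN classification mentioned above can be checked against the Unicode Character Database. The following is a minimal sketch, assuming Python's standard `unicodedata` module; it is not part of the original report, and the variable names are illustrative only.

```python
import unicodedata

# Bidirectional classes of ARABIC-INDIC DIGIT ZERO..NINE (U+0660..U+0669).
arabic_indic_classes = {
    unicodedata.bidirectional(chr(cp)) for cp in range(0x0660, 0x066A)
}

# Bidirectional classes of the European (ASCII) digits, for comparison.
european_classes = {unicodedata.bidirectional(ch) for ch in "0123456789"}

print(arabic_indic_classes)             # {'AN'}  (Arabic Number, a weak type)
print(european_classes)                 # {'EN'}  (European Number, a weak type)
print(unicodedata.bidirectional(" "))   # WS     (whitespace, a neutral type)
```

Neither digit class is a strong type, so the direction in which a run of digits is laid out depends on the surrounding text and on the resolution rules of the bidirectional algorithm.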

Reproducible: Always
Steps to Reproduce:
View any page with Arabic numerals written as Unicode character entities and
observe.  (One such page is included as the URL above, although its author
mistakenly thinks that Mozilla's RTL order is correct!)

Actual Results:  The numbers are rendered from right to left.

Expected Results:  The numbers are rendered from left to right.

For example, here are the Arabic numbers 0 to 3 as Unicode
character entities: ٠١٢٣

They should be rendered as "0123" with zero (".") first, but Mozilla incorrectly
reverses them into "3210".

Comment 1

17 years ago
As the HTML above was not executed, let's try a live demonstration:</pre><P>


Comment 2

17 years ago
And of course the HTML was escaped, grumble.  Well, I'll shut up now, go ahead
and try it...

Comment 3

17 years ago
Created attachment 78093 [details]
Testcase with 0123

Uhm, I don't get it. It's rendered left to right (like numbers should be).
Using linux 2002040510. I don't speak Arabic, so I probably don't get the

Comment 4

17 years ago
Created attachment 78097 [details]
Testcase for bug 136026, version 2

A-ha, it looks like the order is only reversed if there are spaces
between the entities (!).  Here's a new testcase that shows the results
both with and without spaces.

Comment 5

17 years ago
...and, on further reflection, when the numbers are separated by whitespace
they are treated as individual LTR elements in a row of RTL, and are
thus rendered "backwards".  Whether this behavior is correct or not
depends on whether a row of pure ARABIC-INDIC should be considered a row
of RTL.

Comment 6

17 years ago
The rendering is correct. Rule N1 of the Unicode Bidi Algorithm
(http://www.unicode.org/unicode/reports/tr9/#N1) states "A sequence of neutrals
takes the direction of the surrounding strong text if the text on both sides has
the same direction. European and Arabic numbers are treated as though they were R."

This doesn't happen in a line of only European numbers (unless the base
direction is right-to-left, of course), because they have already had their
types changed to left-to-right by rule W7.
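
The effect of rules W7 and N1 on the two testcases can be illustrated with a toy model. This is only a sketch under strong assumptions, not the full Unicode Bidi Algorithm and not Mozilla's implementation: it accepts nothing but digits and spaces, assumes a left-to-right base direction (paragraph level 0), and does not model rule L1. The function name `visual_order` is hypothetical.

```python
import unicodedata


def visual_order(text: str) -> str:
    """Toy display order for a line of digits and spaces at base level 0."""
    types = [unicodedata.bidirectional(ch) for ch in text]
    if any(t not in ("EN", "AN", "WS") for t in types):
        raise ValueError("this sketch only handles digits and spaces")

    # W7: searching backwards from any EN finds no strong character, only
    # the start of the run (L at level 0), so every EN becomes L.
    types = ["L" if t == "EN" else t for t in types]

    def direction(t: str) -> str:
        # N1: European and Arabic numbers are treated as though they were R.
        return "R" if t in ("R", "AN", "EN") else "L"

    # N1/N2: resolve each run of whitespace from its two neighbours.
    n = len(types)
    i = 0
    while i < n:
        if types[i] != "WS":
            i += 1
            continue
        j = i
        while j < n and types[j] == "WS":
            j += 1
        before = direction(types[i - 1]) if i > 0 else "L"
        after = direction(types[j]) if j < n else "L"
        # Same direction on both sides: take it (N1); otherwise use the
        # embedding direction, which is L at level 0 (N2).
        types[i:j] = [before if before == after else "L"] * (j - i)
        i = j

    # I1: implicit levels for an even (level 0) embedding.
    levels = [{"L": 0, "R": 1, "AN": 2}[t] for t in types]

    # L2: from the highest level down to 1, reverse every contiguous run
    # of characters at that level or higher.
    chars = list(text)
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            levels[i:j] = levels[i:j][::-1]
            i = j
    return "".join(chars)


spaced = "\u0660 \u0661 \u0662 \u0663"   # Arabic-Indic 0 1 2 3 with spaces
unspaced = "\u0660\u0661\u0662\u0663"    # Arabic-Indic 0123 without spaces

print(visual_order(spaced) == spaced[::-1])   # True: display order is reversed
print(visual_order(unspaced) == unspaced)     # True: display order is unchanged
print(visual_order("0 1 2 3") == "0 1 2 3")   # True: European digits stay put
```

In this model the spaces between Arabic-Indic digits resolve to R and land at level 1, while the digits themselves sit at level 2, so the whole line is reversed once. Without spaces the single level-2 run is reversed twice and ends up in its original order.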

Marking INVALID.
Last Resolved: 17 years ago
Resolution: --- → INVALID


11 years ago
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text