Closed Bug 136026 Opened 22 years ago Closed 22 years ago

Unicode Arabic digits rendered right-to-left, should be left-to-right

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Linux
defect
Not set
minor

RESOLVED INVALID

People

(Reporter: jpatokal, Assigned: mkaply)

References

()

Details

Attachments

(2 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020310
BuildID:    20020311008

While Arabic is written right-to-left, Arabic numerals are written
left-to-right.  However, Mozilla incorrectly treats the Unicode ARABIC-INDIC DIGIT
characters U+0660 through U+0669, written as character entities, as "normal"
Arabic and lays them out right-to-left.

See Unicode section 3.12 (bidirectionality) and 8.2 (Arabic) for a more
extensive discussion.  The Arabic-Indic digits have the bidirectional type
Weak/AN (Arabic Number); while the standard does not explicitly spell out their
default direction, I do not know of any language where they are *not* used
left-to-right.
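
For anyone who wants to check the bidirectional types mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `bidi_classes` is made up for illustration, and the values reported come from whatever Unicode database ships with the interpreter, not from Mozilla.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# ARABIC-INDIC DIGIT ZERO..THREE, written with escapes to avoid display issues.
arabic_indic = "\u0660\u0661\u0662\u0663"
european = "0123"

print(bidi_classes(arabic_indic))         # each digit is a weak "AN" (Arabic Number)
print(bidi_classes(european))             # each digit is a weak "EN" (European Number)
print(bidi_classes("\u0660 \u0661"))      # the space in between is neutral "WS"
```

Neither kind of digit is a strong type (L or R), so the final layout depends on how the weak and neutral types around them are resolved.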


Reproducible: Always
Steps to Reproduce:
View any page with Arabic numerals written as Unicode character entities and
observe.  (One such page is included as the URL above, although its author
mistakenly thinks that Mozilla's RTL order is correct!)


Actual Results:  The numbers are rendered from right to left.

Expected Results:  The numbers are rendered from left to right.


For example, here are the Arabic numbers 0 to 3 as Unicode
character entities: ٠١٢٣

They should be rendered as "0123" with zero (".") first, but Mozilla incorrectly
reverses them into "3210".
As the HTML above was not executed, let's try a live demonstration:</pre><P>

0123<BR>
&#x660;&#x661;&#x662;&#x663;<P>
<pre>
And of course the HTML was escaped, grumble.  Well, I'll shut up now, go ahead
and try it...
Attached file Testcase with 0123
Uhm, I don't get it. It's rendered left to right (like numbers should be).
Using linux 2002040510. I don't speak Arabic, so I probably don't get the
point.
A-ha, it looks like the order is only reversed if there are spaces
between the entities (!).  Here's a new testcase that shows the results
both with and without spaces.
...and, on further reflection, when the numbers are separated by whitespace
they are treated as individual LTR elements in a row of RTL, and are
thus rendered "backwards".  Whether this behavior is correct or not
depends on whether a row of pure ARABIC-INDIC should be considered a row
of RTL.
The rendering is correct. Rule N1 of the Unicode Bidi Algorithm
(http://www.unicode.org/unicode/reports/tr9/#N1) states "A sequence of neutrals
takes the direction of the surrounding strong text if the text on both sides has
the same direction. European and Arabic numbers are treated as though they were R."

This doesn't happen in a line of only European numbers (unless the base
direction is right-to-left, of course), because their types have already been
changed to left-to-right by rule W7
(http://www.unicode.org/unicode/reports/tr9/#W7).

Marking INVALID.
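
The resolution above can be traced with a rough Python sketch. It is only a simplified subset of the Unicode Bidi Algorithm and not Mozilla's implementation: it assumes a left-to-right paragraph with no explicit embeddings, handles only the classes L, R, AN, EN and WS, and applies just W7, N1/N2 as quoted above, the implicit levels, and the reordering step. The function name `visual_order` is hypothetical.

```python
import unicodedata

SUPPORTED = {"L", "R", "AN", "EN", "WS"}


def visual_order(text):
    """Visual order of text in an LTR paragraph (simplified bidi subset)."""
    types = [unicodedata.bidirectional(ch) for ch in text]
    unsupported = set(types) - SUPPORTED
    if unsupported:
        raise ValueError(f"unsupported bidi classes: {sorted(unsupported)}")
    n = len(types)

    # W7: an EN whose nearest preceding strong type (or start of run) is L becomes L.
    last_strong = "L"  # start-of-run type for a left-to-right paragraph
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"

    # N1/N2: whitespace takes the direction of its neighbours if they agree
    # (remaining EN and AN count as R), otherwise the embedding direction (L).
    def direction(t):
        return "R" if t in ("R", "AN", "EN") else "L"

    i = 0
    while i < n:
        if types[i] != "WS":
            i += 1
            continue
        j = i
        while j < n and types[j] == "WS":
            j += 1
        before = direction(types[i - 1]) if i > 0 else "L"
        after = direction(types[j]) if j < n else "L"
        types[i:j] = [before if before == after else "L"] * (j - i)
        i = j

    # Implicit levels at base level 0: L stays 0, R goes to 1, numbers go to 2.
    levels = [{"L": 0, "R": 1, "AN": 2, "EN": 2}[t] for t in types]

    # Reordering: from the highest level down to the lowest odd level,
    # reverse every contiguous run of characters at that level or higher.
    chars = list(text)
    if chars:
        lowest_odd = min(levels) | 1
        for level in range(max(levels), lowest_odd - 1, -1):
            i = 0
            while i < n:
                if levels[i] < level:
                    i += 1
                    continue
                j = i
                while j < n and levels[j] >= level:
                    j += 1
                chars[i:j] = chars[i:j][::-1]
                levels[i:j] = levels[i:j][::-1]
                i = j
    return "".join(chars)


spaced = "\u0660 \u0661 \u0662 \u0663"
print(visual_order(spaced) == "\u0663 \u0662 \u0661 \u0660")   # spaced AN digits are reversed
print(visual_order("\u0660\u0661\u0662\u0663") == "\u0660\u0661\u0662\u0663")  # unspaced run keeps its order
print(visual_order("0 1 2 3") == "0 1 2 3")                    # European digits keep their order
```

In the spaced Arabic-Indic case the spaces resolve to R and sit at level 1 between digits at level 2, so the whole line is reversed as one run. In the European case W7 turns every digit into L first, so everything stays at level 0 and nothing is reordered.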
Status: UNCONFIRMED → RESOLVED
Closed: 22 years ago
Resolution: --- → INVALID
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text