Closed Bug 204272 Opened 21 years ago Closed 19 years ago

Why do we classify shaped Arabic characters as left-to-right?

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: smontagu, Assigned: smontagu)

References

(Blocks 1 open bug, )

Details

Attachments

(5 files, 5 obsolete files)

I just noticed a problem in ordering in the testcase for bug 192008 (attachment
113746 [details]), which I think is caused by the following code in nsBidiPresUtils.cpp
which sets |charType = eCharType_LeftToRight| for shaped Arabic characters in
the U+FExx range.

As far as I remember we originally did that for compatibility with IE on
Windows, but I don't see the same problem in IE6 on Windows 2000. Is it an
outdated hack which we can now remove?
The recent testing of GetStringTypeW(CT_CTYPE2, ...) and GetCharacterPlacementW
(..., GCP_REORDER) on W2K and WinNT shows that shaped Arabic characters are 
properly classifies as RTL and get reordered by the mentioned platforms.

However, IIRC we assigned |eCharType_LeftToRight| to shaped characters for 
compatibility with the bidi engine of [some version of] Windows, which treated 
those as LTR.

So is it possible to ask the platform itself about character type?

*****
BTW, I don't see a problem in attachment 113746 [details]. What platform is experiencing 
it?
Does the platform (in which the problem is encountered) have native bidi 
support?
Attached file Testcase (obsolete) —
I see the bug on Windows 2000 with support installed for Arabic, Hebrew, Indic,
and Japanese (default language is Hebrew and user locale is English United
States).

I don't see it on current trunk, but I do see it on 1.3, and also with
attachment 122359 [details] [diff] [review] from bug 192088 applied.
Attached image Screenshot
Display of the testcase on the system described above.

Debugging, I see that both lines are passed to ExtTextOutW in the same (visual)
ordering, but as the screenshot shows, the first line apparently gets reordered
again by Windows (which is theoretically correct) while the second is displayed
as-is (which is the expected bug which we try to work around, as Lina has
said).
Attached file Extended testcase.
Curiouser and curiouser: in this version of the testcase, cases 2-4 are
displayed correctly in Mozilla 1.3, and cases 1-3 are displayed correctly in IE
6. My head is aching.
Attachment #122497 - Attachment is obsolete: true
Attached patch Patch (obsolete) — Splinter Review
I think this is the correct fix, subject to testing on various flavours of
Windows. (ResolveBackwards is only ever called when mRightToLeftText is TRUE,
i.e. only for right to left text and only for Windows with right-to-left
support, so this won't cause any slowdown for non-Bidi pages)
A big advantage of the patch is that it moves the workaround for a bug in
Windows to a Windows-only code path. The current code will cause problems on
non-Windows platforms with their own Bidi support.
Comment on attachment 122521 [details] [diff] [review]
Patch


Simon:

1. Why don't we provide a similar fix in nsFontMetricsWin.cpp when 
mRightToLeftText is FALSE?

2. Can it be acceptable to normalize (convert to 06xx range) Arabic text before 
rendering? (Though I think that once we rejected it, probably because of 
possible loss of composed characters.)

3. The removed code in CalculateCharType in nsBidiPresUtils.cpp seems to be 
affecting only visual text. It shouldn't impact the attached testcases. I'm 
saying that because it may help us in understanding the bizarre behaviour on 
windows (and it must be removed if buggy, of course :).)
What I said in comment 7.3, applies to bidi platform. I doesn't seem to affect 
text of any type on non-bidi one.
Oops, I'm sorry. Mozilla performs its own ordering when character type is 
inconsistent with the embedding level. So the code in nsBidiPresUtils involves 
both types of text.
>Why don't we provide a similar fix in nsFontMetricsWin.cpp when 
>mRightToLeftText is FALSE?

mRightToLeftText is set according to the value of isRightToLeftOnBidiPlatform in
nsTextFrame::PaintUnicodeText(), where
  isRightToLeftOnBidiPlatform = (isBidiSystem &&
                                 (eCharType_RightToLeft == charType ||
                                  eCharType_RightToLeftArabic == charType));

After the change to nsBidiPresUtils.cpp, charType will be
eCharType_RightToLeftArabic, and isBidiSystem will be TRUE on Bidi versions of
Windows (which is where the bug exists), so mRightToLeftText will be TRUE in all
cases where we need the workaround, no?
>Can it be acceptable to normalize (convert to 06xx range) Arabic text before 
>rendering? (Though I think that once we rejected it, probably because of 
>possible loss of composed characters.)

Not only the possible loss of composed characters. Windows has at least one bug
with rendering 06xx range characters (Farsi yeh), and if the page author is
deliberately using preshaped characters to avoid that bug, they will get a nasty
shock if we normalize :-)
Comment on attachment 122521 [details] [diff] [review]
Patch


Being agree with Simon's comment #10 and comment #11, I think that 
the suggested patch is the best workaround for now.
However, though rendering character by character will work 100%, ideally, in my 
opinion, we should try to investigate the problem more thoroughly and seek fine 
tuned solution.

Parentheses seem to be missing in
+        IS_FE_CHAR(*currChar+1)) {

It probably should be
+        IS_FE_CHAR(*(currChar+1))) {
Attached image Screenshot (obsolete) —
Illustrates how ExtTextOutW handles shaped characters on Windows 2000:
1. Homogeneous string of shaped characters, without output flags, is not
ordered.
2. Same string, with ETO_IGNORELANGUAGE flag (disabling international support),
looks identically.
3. Same string with attached 06xx range character, without output flags, is
reordered.
4. String as above, but with ETO_IGNORELANGUAGE flag, is not ordered.
Attached image Screenshot
Correct image (with *Arabic* Alef)
Attachment #123030 - Attachment is obsolete: true
Summarizing behavior of ExtTextOutW (and maybe other Windows GDI functions), it 
looks like ordering of shaped Arabic characters doesn't take place unless there 
is a strong RTL character or BIDI control (which supposes implicit or explicit 
text ordering).

This explains described in comment 3, comment 4, and comment 5 (in Mozilla, but 
maybe also in IE).

On one hand, such Windows behavior apparently contradicts the Unicode Bidi 
Algorithm, on the other hand, it's pretty reasonable that pre-shaped data would 
also be pre-ordered (visual); AFAIK shaped characters are commonly used to 
represent visually formatted data.

I wonder how other bidi platforms handle shaped Arabic characters?
As far as I can tell from looking at the testcases from here and bug 154399 in
Konqueror, qt treats shaped Arabic characters as right-to-left. It would be
interesting to know what OSX does.
At least in some circumstances this patch also prevents the regression of bug
192088 even with the original fix backed out. Time to head for reviews.
Windows seems to have a similar bug with Hebrew presentation forms as well
judging by this testcase, which is reduced from the "Exemplar characters" in
http://oss.software.ibm.com/cgi-bin/icu/lx/en/?_=he

IE exhibits the bug in the testcase, but not in the original page, which is
consistent with what Lina said in comment 15. Whether Mozilla exhibits it
depends on font prefs. If the presentation characters are rendered in a
different font, and so by a separate call to ExtTextOut, they are not ordered
correctly.
Attachment #122521 - Attachment is obsolete: true
Pango needs this patch too
Behdad, would you take a look at this please?
Blocks: Persian
Attached patch Patch synched to trunk (obsolete) — Splinter Review
Attachment #132169 - Attachment is obsolete: true
Attachment #178243 - Attachment is obsolete: true
Attachment #178363 - Flags: superreview?(rbs)
Attachment #178363 - Flags: review?(rbs)
Comment on attachment 178363 [details] [diff] [review]
Fixed a bug in the previous patch

r+sr=rbs
Attachment #178363 - Flags: superreview?(rbs)
Attachment #178363 - Flags: superreview+
Attachment #178363 - Flags: review?(rbs)
Attachment #178363 - Flags: review+
Any reason this was never checked in?
This is low risk, but I didn't think it was major enough to be worth asking for
drivers' approval during beta freeze. If you can find real-world sites that are
broken without the patch, feel free to request a1.8b3.
Oh, sorry. I didn't realize that the trunk was already frozen in April.
Attachment #178363 - Flags: approval1.8b3?
Comment on attachment 178363 [details] [diff] [review]
Fixed a bug in the previous patch

a=chofmann
Attachment #178363 - Flags: approval1.8b3? → approval1.8b3+
As you wish...

Checked in.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: