Closed Bug 81664 Opened 23 years ago Closed 23 years ago

need to clean up bidi reorder code

Categories

(Core :: Layout: Text and Fonts, defect, P1)

x86
Windows 98
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: ftang, Assigned: smontagu)

References

Details

Attachments

(3 files)

I move some email discussion to here
Subject: 
        Re: cleaning up the code
   Date: 
        Tue, 15 May 2001 22:05:18 +0300
   From: 
        lkemmel@il.ibm.com
     To: 
        Maha_Abou_El-Rous/Egypt/Contr/IBM@il.ibm.com
    CC: 
        smontagu@il.ibm.com, ftang@netscape.com







Maha,

With your new changes visual Hebrew looks good, but logical Hebrew - not
good at all. Please see attached (See attached file: testreorder.zip),
including test case and 2 screen snapshots  (mozilla and Bidi NS 4.61 --
please compare and see the difference).

Thanks,
Lina

Maha Abou El-Rous@IBMEG
14/05/2001 21:48

To:   Lina Kemmel/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB
cc:   Simon Montagu/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB,
      ftang@netscape.com@IBMIL@IBMDE@IBMGB
From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG
Subject:  Re: cleaning up the code  (Document link: Lina Kemmel)

Lina,
     Please find my comments below in your text prefixed mrous>>
Attached you'll find the changes I described in my previous mail,
aIsBidiSystem, trully indicates isBidiSystem. You'll also find that I have
moved the setting of RTL Reading into FormatUnicodeText(). I think it is
much "cleaner" that way.
Take a look at it, and if you're able to give a test that would be great.
Then, if you still feel that we should keep setting RTL Reading in
nsTextFrame.cpp, then we can move it back to it.

Best Regards,
Maha Abou El-Rous
IBM Egypt
56 Gameat El-Dowal Al Arabia st., 12311
Operator: (202) 7492 533  Ext.: (733)
Direct: (202) 3310 733
Fax : (202) 7601 227
e-mail : Mahar@eg.ibm.com


Lina Kemmel@IBMIL
14/05/2001 06,47 PM

To:   Maha Abou El-Rous/Egypt/Contr/IBM
cc:   Simon Montagu/Israel/Contr/IBM, ftang@netscape.com
From: Lina Kemmel/Israel/Contr/IBM@IBMIL
Subject:  Re: cleaning up the code  (Document link: Database 'Maha Abou
      El-Rous', View '($Inbox)')

Maha,

First of all, sorry about the typo. I meant we wouldn't reach doReverse =
PR_TRUE in the case of eCharType_LeftToRight char type.

With the old XOR condition and the old significance of aIsBidiSystem we
could reach doReverse when we had a latin character with odd EL (in visual
RTL or within RLO). Do you agree?
mrous>> I'm not really sure because its been like ages now with
aIsBidiSystem meaning isRightToLeftOnBidiPlatform

I agree with you that it would be good if aIsBidiSystem denotes only
whether the system is Bidi.
mrous>> I guess this is what erik wanted for some time now.

I'm currently unable to build mozilla code for technical reasons, but
what's your opinion, if we do the following:

a. In the case of Visual Arabic pass eCharType_LeftToRight to the
FormatUnicodeText (from nsTextFrame), instead of the real char type. (Then
the argument aCharType must be probably renamed into something like
aTreatCharAsRTL.)
mrous>> I'm not really sure it's a "clean" way to do it. I remember erik
commenting on such approach, when we tried to fix the visual problem by
passing isRightToLeftOnBidiPlatform false when Arabic visual document.
b. Leave FormatUnicodeText in its current (CVS) version (except for your
changes to numeric and literal shaping, of course).
mrous>> regarding the numeric shaping change, where you suggesting changing
the switch condition to if () statement. We had as if() before, and were
advised to change it to a case statement later on by frank.
c. The reason, why SetRightToLeftText was outside of FormatUnicodeText was
that it might do harm to literal RTL bullets (since there we've already
repositioned a period by hand). So, cannot we leave SetRightToLeftText in
nsTextFrame?
mrous>> Currently (without our suggested fix) we do have a problem with
bullets on RTL display, they loose indentation. There is a bug open for it.
so I believe we can test with the new fix and see what comes up.
Best regards,
Lina

Maha Abou El-Rous@IBMEG
14/05/2001 18:28

To:   Lina Kemmel/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB
cc:   Simon Montagu/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB,
      ftang@netscape.com@IBMIL@IBMDE@IBMGB
From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG
Subject:  Re: cleaning up the code  (Document link: Lina Kemmel)

Lina,
     Good catch.
3. However, since aIsBidiSystem changed semantically, and currently also
means that eCharType_RightToLeft == aCharType, we would never get to
doReverse = PR_TRUE, in the case eCharType_RightToLeft == aCharType, while
reverse may be needed, for example, for Latin chars in visual RTL.
(See:
if (isVisual){
  if (aIsBidiSystem){
    if ( (eCharType_RightToLeft == aCharType) ^ (aIsOddLevel) )
      doReverse = PR_TRUE;
.....
)
This means that even with the old XOR condition which was also conditioned
with aIsBidiSystem, we could never reach doReverse when we have a latin
character? Do you have a suggestion?

What do you think of the following:
we move the following lines of code from nsTextFrame.cpp to
FormatUnicodeText() (with the necessary changes)
   if (isBidiSystem && (eCharType_RightToLeft == charType) ) {
        isRightToLeftOnBidiPlatform = PR_TRUE;
      }
      else if (eCharType_RightToLeftArabic == charType) {
        isRightToLeftOnBidiPlatform =
          (hints & NS_RENDERING_HINT_ARABIC_SHAPING);
      }

and we then change the parameter list when calling FormatUnicodeText() from
nsTextFrame.cpp and replace "isRightToLeftOnBidiPlatform" with
"isBidiSystem", so it would become:
bidiUtils->FormatUnicodeText(aPresContext, text, textLength, charType,
level & 1, isBidiSystem,&aRenderingContext);

That way aIsBidiSystem denotes whether or not the system is Bidi, with out
the extra information if Hebrew or Arabic. What do you think?

For the numbers, I guess we can filter out characters using the condition
you suggested:
if ((eCharType_EuropeanNumber == aCharType) || (eCharType_ArabicNumber ==
aCharType))

Best Regards,
Maha Abou El-Rous
IBM Egypt
56 Gameat El-Dowal Al Arabia st., 12311
Operator: (202) 7492 533  Ext.: (733)
Direct: (202) 3310 733
Fax : (202) 7601 227
e-mail : Mahar@eg.ibm.com


Lina Kemmel@IBMIL
14/05/2001 01,45 PM

To:   Maha Abou El-Rous/Egypt/Contr/IBM
cc:   Simon Montagu/Israel/Contr/IBM, ftang@netscape.com
From: Lina Kemmel/Israel/Contr/IBM@IBMIL
Subject:  Re: cleaning up the code  (Document link: Database 'Maha Abou
      El-Rous', View '($Inbox)')

Hi Maha,

1. As for expanding old XOR condition, I have no questions, understanding
your need for proper handling of visual FE-ranged Arabic.

2. As for RTL reading, the name aIsBidiSystem is a little confusing, but
now I see it would work properly.

3. However, since aIsBidiSystem changed semantically, and currently also
means that eCharType_RightToLeft == aCharType, we would never get to
doReverse = PR_TRUE, in the case eCharType_RightToLeft == aCharType, while
reverse may be needed, for example, for Latin chars in visual RTL.
(See:
if (isVisual){
  if (aIsBidiSystem){
    if ( (eCharType_RightToLeft == aCharType) ^ (aIsOddLevel) )
      doReverse = PR_TRUE;
.....
)

Also (again, due to the new significance of aIsBidiSystem) for logical text
type the condition ( (aIsBidiSystem!=0) ^ (aIsOddLevel) ) looks good:
  else { // logical
    if ( (aIsBidiSystem!=0) ^ (aIsOddLevel) )
      doReverse = PR_TRUE;
...
So, no changes seem to be needed in this part (logical text type).

5. Yes, I meant numeric shaping (I thought "numeric swapping" was also in
use). The intent was to skip the numeric shaping for non-numeric char
types.


Thanks,
Lina

Maha Abou El-Rous@IBMEG
14/05/2001 13:09

To:   Lina Kemmel/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB
cc:   Simon Montagu/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB,
      ftang@netscape.com@IBMIL@IBMDE@IBMGB
From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG
Subject:  Re: cleaning up the code  (Document link: Lina Kemmel)

Lina,
     What we mainly did with FormatUnicodeText() was expanding the
previously existing XOR condition that sets
doReverse boolean
if (aIsBidiSystem)
  if (CHARTYPE_IS_RTL(aCharType) ^ aIsOddLevel)
    doReverse = PR_TRUE;
This is to differentiate between Arabic RTL and Hebrew RTL, as this was
causing a problem for Visual Arabic (FE characters).

As for setting RTL Reading, this is how it was set from nsTextFrame.cpp
right before the call to FormatUnicodeText()

if (isRightToLeftOnBidiPlatform)
   aRenderingContext.SetRightToLeftText(PR_TRUE);

isRightToLeftOnBidiPlatform is being passed as aIsBidiSystem to
FormatUnicodeText(), so what we did is only move the
condition inside FormatUnicodeText() as:

if ( aIsBidiSystem )
    aRenderingContext->SetRightToLeftText(PR_TRUE);

We also needed to set that to false when an Arabic visual document.

For the examples you listed, this is how they are handled with the new
code:
if (isVisual){
  if (aIsBidiSystem){
    if ( (eCharType_RightToLeft == aCharType) ^ (aIsOddLevel) )
      doReverse = PR_TRUE;

1. bidi system; odd EL; char type is LTR    ----> the condition is
evaluated to true(0^1), doReverse is True
2. bidi system; even EL; char type is LTR -----> the condition is evaluated
 to false(0^0), and doReverse remains False

However this is applicable only when a visual document, I think we'll need
to modify the logical condition to accomodate it.
how does this change look ?
if(isVisual){
}
else if (aIsBidiSystem)
  if (CHARTYPE_IS_RTL(aCharType) ^ aIsOddLevel)
    doReverse = PR_TRUE;

For the numeric swapping question, I didn't quite get it, do you mean
numeric shaping? could you please elaborate?

Best Regards,
Maha Abou El-Rous
IBM Egypt
56 Gameat El-Dowal Al Arabia st., 12311
Operator: (202) 7492 533  Ext.: (733)
Direct: (202) 3310 733
Fax : (202) 7601 227
e-mail : Mahar@eg.ibm.com


Lina Kemmel@IBMIL
13/05/2001 08,13 PM

To:   Maha Abou El-Rous/Egypt/Contr/IBM
cc:   Simon Montagu/Israel/Contr/IBM, ftang@netscape.com
From: Lina Kemmel/Israel/Contr/IBM@IBMIL
Subject:  Re: cleaning up the code  (Document link: Database 'Maha Abou
      El-Rous', View '($Inbox)')

Hi Maha,

Can you please clarify for me the following questions:

1. Would it make sense to condition numeric swapping by
  if ((eCharType_EuropeanNumber == aCharType) || (eCharType_ArabicNumber ==
 aCharType))
?

2. It seems that now you always set RTL reading
(aRenderingContext->SetRightToLeftText(PR_TRUE) ), when platform has bidi ,
except for the case of visual Arabic. I think this may cause some extra
reverses: for example, pure neutral strings would always get reversed by a
bidi platform.

Examples:
a. bidi system; odd EL; char type is LTR (result of RLO); -- must be
reversed, but isn't covered with the above condition.
b. bidi system; even EL; char type is LTR ("normal" LTR text) -- would get
reversed, while should remain as is.

Or have I missed anything?

Thanks,
Lina

Maha Abou El-Rous@IBMEG
13/05/2001 18:46

To:   Lina Kemmel/Israel/Contr/IBM@IBMIL
cc:   Simon Montagu/Israel/Contr/IBM@IBMIL, ftang@netscape.com
From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG
Subject:  cleaning up the code


Lina,
     Please find attached bidi1305.zip, in which we tried to solve the
Arabic Visual problem.
Please verify that our changes does not break the code for Hebrew, and
check if it is clean according to erik's directions.
Let me know what you think.



Best Regards,
Maha Abou El-Rous
IBM Egypt
56 Gameat El-Dowal Al Arabia st., 12311
Operator: (202) 7492 533  Ext.: (733)
Direct: (202) 3310 733
Fax : (202) 7601 227
e-mail : Mahar@eg.ibm.com
















   testreorder.zip

                    Name: 
                          testreorder.zip
                    Type: 
                          Zip Compressed Data (application/x-zip-compressed)
                 Encoding: 
                          base64
               Description: 
                          .ZIP File
Got updates from Maha, which allowed to see logical Hebrew properly.
(Tested logical and visual Hebrew, LTR and RTL, on bidi and non-bidi windows.)

Lina
Attached patch Fix for this bugSplinter Review
Changing QA contact to mahar@eg.ibm.com.
QA Contact: andreasb → mahar
my goodnees. It is very hard to review this patch. Is that possible you can 
break this patch into stages ? This kind of all-in-one patch will never pass 
review and suprereview process. File individual bugs for individual issues. 
Sorry, the diffs include some other things which happened to be in my local 
tree which are not related to this bug. Lina and I will get rid of the 
unnecessary stuff and post new diffs.
Attached patch new diffsSplinter Review
Bug 75005 depends on this fix
Blocks: 75005
Severity: normal → blocker
Blocks: 75034
Blocks: 82088
When is this code going to be submitted? we have 3 bugs dependent on this 
75005, 75034, 82088 two of which are related to Visual Arabic.
Priority: -- → P1
I Sent a note to Marc Attinasi to review this
sr=attinasi for the nsTextFrame changes. 

I noticed that there was a change unlike the rest of the simple cleanup:

-        bidiUtils->FormatUnicodeText(aPresContext, text, textLength, charType,
level & 1, isRightToLeftOnBidiPlatform);
+        bidiUtils->FormatUnicodeText(aPresContext, text, textLength,
+                                     charType, level & 1, isBidiSystem);

changed param from isRightToLeftOnBidi to isBidiSystem - just pointing it out in
case it was unintentional (I'm not sure exactly what the difference is, or if it
is important).
attinasi: The change from 'isRightToLeftOnBidi' to 'isBidiSystem' was 
intentional. Do you need more details?
Component: Internationalization → BiDi Hebrew & Arabic
I've tested the patch on the recent build. it is working fine. What are we 
waiting for to get it in?
attinasi, is the sr complete? who's to set the target milestone for this?
OS: Windows 95 → Windows 98
Mass-move all BiDi Hebrew and Arabic qa to me, zach@zachlipton.com. 
Thank you Gilad for your service to this component, and best of luck to you 
in the future.

Sholom. 
QA Contact: mahar → zach
Blocks: 75046
Blocks: 75009
Simon, could you push to get this fix in please.
Blocks: 94151
I will review this and look for a sr for the changes in nsBidiPresUtils.*

Maha and Lina, can you please state explicitly which of the following cases were
broken before and are working with this patch.

Logical Arabic (06 range) on bidi system
Visual Arabic (06 range) on bidi system
Logical Arabic (06 range) on non-bidi system
Visual Arabic (06 range) on non-bidi system
Logical Arabic (FE range) on bidi system
Visual Arabic (FE range) on bidi system
Logical Arabic (FE range) on non-bidi system
Visual Arabic (FE range) on non-bidi system
Logical Hebrew on bidi system
Visual Hebrew on bidi system
Logical Hebrew on non-bidi system
Visual Hebrew on non-bidi system
anything else
This bug was opened in the first place to clean up the reorder code. The patch attached 
fixes bug#75005: broken Visual Arabic (FE Range) on Bidi system, it is currently displayed 
reversed on NT giving a different behaviour than when running on Win2K.(inconsistent 
behaviour on different Windows platforms).
as a result, the following will be solved:
1.bug#75034: Selection highlight on the Arabic visual page is in the opposite direction.
2.bug#75046: 06 characters (eg. Kashida, Arabic colon) inside an FE stream (Arabic Visual) 
causes the Win2k (with Arabic locale as default) to reorder the stream wrongly.
Keywords: patch, review
I still see some issues with selection in Visual Hebrew, but I don't want to
hold up this patch because of them (and anyway I think they are my own messes
which I should clean up - I will work on them in bug 75034).

So r=simon@softel.co.il. We have sr=attinasi on the nsTextFrame changes and I
will ask for sr on the nsBidiPresUtils changes
Comment on attachment 40880 [details] [diff] [review]
Patch against the current code base

sr=blizzard
Attachment #40880 - Flags: superreview+
Fix FINALLY checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: