Closed Bug 81664 Opened 24 years ago Closed 23 years ago

need to clean up bidi reorder code

Categories

(Core :: Layout: Text and Fonts, defect, P1)

x86
Windows 98
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: ftang, Assigned: smontagu)

References

Details

Attachments

(3 files)

I move some email discussion to here Subject: Re: cleaning up the code Date: Tue, 15 May 2001 22:05:18 +0300 From: lkemmel@il.ibm.com To: Maha_Abou_El-Rous/Egypt/Contr/IBM@il.ibm.com CC: smontagu@il.ibm.com, ftang@netscape.com Maha, With your new changes visual Hebrew looks good, but logical Hebrew - not good at all. Please see attached (See attached file: testreorder.zip), including test case and 2 screen snapshots (mozilla and Bidi NS 4.61 -- please compare and see the difference). Thanks, Lina Maha Abou El-Rous@IBMEG 14/05/2001 21:48 To: Lina Kemmel/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB cc: Simon Montagu/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB, ftang@netscape.com@IBMIL@IBMDE@IBMGB From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG Subject: Re: cleaning up the code (Document link: Lina Kemmel) Lina, Please find my comments below in your text prefixed mrous>> Attached you'll find the changes I described in my previous mail, aIsBidiSystem, trully indicates isBidiSystem. You'll also find that I have moved the setting of RTL Reading into FormatUnicodeText(). I think it is much "cleaner" that way. Take a look at it, and if you're able to give a test that would be great. Then, if you still feel that we should keep setting RTL Reading in nsTextFrame.cpp, then we can move it back to it. Best Regards, Maha Abou El-Rous IBM Egypt 56 Gameat El-Dowal Al Arabia st., 12311 Operator: (202) 7492 533 Ext.: (733) Direct: (202) 3310 733 Fax : (202) 7601 227 e-mail : Mahar@eg.ibm.com Lina Kemmel@IBMIL 14/05/2001 06,47 PM To: Maha Abou El-Rous/Egypt/Contr/IBM cc: Simon Montagu/Israel/Contr/IBM, ftang@netscape.com From: Lina Kemmel/Israel/Contr/IBM@IBMIL Subject: Re: cleaning up the code (Document link: Database 'Maha Abou El-Rous', View '($Inbox)') Maha, First of all, sorry about the typo. I meant we wouldn't reach doReverse = PR_TRUE in the case of eCharType_LeftToRight char type. With the old XOR condition and the old significance of aIsBidiSystem we could reach doReverse when we had a latin character with odd EL (in visual RTL or within RLO). Do you agree? mrous>> I'm not really sure because its been like ages now with aIsBidiSystem meaning isRightToLeftOnBidiPlatform I agree with you that it would be good if aIsBidiSystem denotes only whether the system is Bidi. mrous>> I guess this is what erik wanted for some time now. I'm currently unable to build mozilla code for technical reasons, but what's your opinion, if we do the following: a. In the case of Visual Arabic pass eCharType_LeftToRight to the FormatUnicodeText (from nsTextFrame), instead of the real char type. (Then the argument aCharType must be probably renamed into something like aTreatCharAsRTL.) mrous>> I'm not really sure it's a "clean" way to do it. I remember erik commenting on such approach, when we tried to fix the visual problem by passing isRightToLeftOnBidiPlatform false when Arabic visual document. b. Leave FormatUnicodeText in its current (CVS) version (except for your changes to numeric and literal shaping, of course). mrous>> regarding the numeric shaping change, where you suggesting changing the switch condition to if () statement. We had as if() before, and were advised to change it to a case statement later on by frank. c. The reason, why SetRightToLeftText was outside of FormatUnicodeText was that it might do harm to literal RTL bullets (since there we've already repositioned a period by hand). So, cannot we leave SetRightToLeftText in nsTextFrame? mrous>> Currently (without our suggested fix) we do have a problem with bullets on RTL display, they loose indentation. There is a bug open for it. so I believe we can test with the new fix and see what comes up. Best regards, Lina Maha Abou El-Rous@IBMEG 14/05/2001 18:28 To: Lina Kemmel/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB cc: Simon Montagu/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB, ftang@netscape.com@IBMIL@IBMDE@IBMGB From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG Subject: Re: cleaning up the code (Document link: Lina Kemmel) Lina, Good catch. 3. However, since aIsBidiSystem changed semantically, and currently also means that eCharType_RightToLeft == aCharType, we would never get to doReverse = PR_TRUE, in the case eCharType_RightToLeft == aCharType, while reverse may be needed, for example, for Latin chars in visual RTL. (See: if (isVisual){ if (aIsBidiSystem){ if ( (eCharType_RightToLeft == aCharType) ^ (aIsOddLevel) ) doReverse = PR_TRUE; ..... ) This means that even with the old XOR condition which was also conditioned with aIsBidiSystem, we could never reach doReverse when we have a latin character? Do you have a suggestion? What do you think of the following: we move the following lines of code from nsTextFrame.cpp to FormatUnicodeText() (with the necessary changes) if (isBidiSystem && (eCharType_RightToLeft == charType) ) { isRightToLeftOnBidiPlatform = PR_TRUE; } else if (eCharType_RightToLeftArabic == charType) { isRightToLeftOnBidiPlatform = (hints & NS_RENDERING_HINT_ARABIC_SHAPING); } and we then change the parameter list when calling FormatUnicodeText() from nsTextFrame.cpp and replace "isRightToLeftOnBidiPlatform" with "isBidiSystem", so it would become: bidiUtils->FormatUnicodeText(aPresContext, text, textLength, charType, level & 1, isBidiSystem,&aRenderingContext); That way aIsBidiSystem denotes whether or not the system is Bidi, with out the extra information if Hebrew or Arabic. What do you think? For the numbers, I guess we can filter out characters using the condition you suggested: if ((eCharType_EuropeanNumber == aCharType) || (eCharType_ArabicNumber == aCharType)) Best Regards, Maha Abou El-Rous IBM Egypt 56 Gameat El-Dowal Al Arabia st., 12311 Operator: (202) 7492 533 Ext.: (733) Direct: (202) 3310 733 Fax : (202) 7601 227 e-mail : Mahar@eg.ibm.com Lina Kemmel@IBMIL 14/05/2001 01,45 PM To: Maha Abou El-Rous/Egypt/Contr/IBM cc: Simon Montagu/Israel/Contr/IBM, ftang@netscape.com From: Lina Kemmel/Israel/Contr/IBM@IBMIL Subject: Re: cleaning up the code (Document link: Database 'Maha Abou El-Rous', View '($Inbox)') Hi Maha, 1. As for expanding old XOR condition, I have no questions, understanding your need for proper handling of visual FE-ranged Arabic. 2. As for RTL reading, the name aIsBidiSystem is a little confusing, but now I see it would work properly. 3. However, since aIsBidiSystem changed semantically, and currently also means that eCharType_RightToLeft == aCharType, we would never get to doReverse = PR_TRUE, in the case eCharType_RightToLeft == aCharType, while reverse may be needed, for example, for Latin chars in visual RTL. (See: if (isVisual){ if (aIsBidiSystem){ if ( (eCharType_RightToLeft == aCharType) ^ (aIsOddLevel) ) doReverse = PR_TRUE; ..... ) Also (again, due to the new significance of aIsBidiSystem) for logical text type the condition ( (aIsBidiSystem!=0) ^ (aIsOddLevel) ) looks good: else { // logical if ( (aIsBidiSystem!=0) ^ (aIsOddLevel) ) doReverse = PR_TRUE; ... So, no changes seem to be needed in this part (logical text type). 5. Yes, I meant numeric shaping (I thought "numeric swapping" was also in use). The intent was to skip the numeric shaping for non-numeric char types. Thanks, Lina Maha Abou El-Rous@IBMEG 14/05/2001 13:09 To: Lina Kemmel/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB cc: Simon Montagu/Israel/Contr/IBM@IBMIL@IBMDE@IBMGB, ftang@netscape.com@IBMIL@IBMDE@IBMGB From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG Subject: Re: cleaning up the code (Document link: Lina Kemmel) Lina, What we mainly did with FormatUnicodeText() was expanding the previously existing XOR condition that sets doReverse boolean if (aIsBidiSystem) if (CHARTYPE_IS_RTL(aCharType) ^ aIsOddLevel) doReverse = PR_TRUE; This is to differentiate between Arabic RTL and Hebrew RTL, as this was causing a problem for Visual Arabic (FE characters). As for setting RTL Reading, this is how it was set from nsTextFrame.cpp right before the call to FormatUnicodeText() if (isRightToLeftOnBidiPlatform) aRenderingContext.SetRightToLeftText(PR_TRUE); isRightToLeftOnBidiPlatform is being passed as aIsBidiSystem to FormatUnicodeText(), so what we did is only move the condition inside FormatUnicodeText() as: if ( aIsBidiSystem ) aRenderingContext->SetRightToLeftText(PR_TRUE); We also needed to set that to false when an Arabic visual document. For the examples you listed, this is how they are handled with the new code: if (isVisual){ if (aIsBidiSystem){ if ( (eCharType_RightToLeft == aCharType) ^ (aIsOddLevel) ) doReverse = PR_TRUE; 1. bidi system; odd EL; char type is LTR ----> the condition is evaluated to true(0^1), doReverse is True 2. bidi system; even EL; char type is LTR -----> the condition is evaluated to false(0^0), and doReverse remains False However this is applicable only when a visual document, I think we'll need to modify the logical condition to accomodate it. how does this change look ? if(isVisual){ } else if (aIsBidiSystem) if (CHARTYPE_IS_RTL(aCharType) ^ aIsOddLevel) doReverse = PR_TRUE; For the numeric swapping question, I didn't quite get it, do you mean numeric shaping? could you please elaborate? Best Regards, Maha Abou El-Rous IBM Egypt 56 Gameat El-Dowal Al Arabia st., 12311 Operator: (202) 7492 533 Ext.: (733) Direct: (202) 3310 733 Fax : (202) 7601 227 e-mail : Mahar@eg.ibm.com Lina Kemmel@IBMIL 13/05/2001 08,13 PM To: Maha Abou El-Rous/Egypt/Contr/IBM cc: Simon Montagu/Israel/Contr/IBM, ftang@netscape.com From: Lina Kemmel/Israel/Contr/IBM@IBMIL Subject: Re: cleaning up the code (Document link: Database 'Maha Abou El-Rous', View '($Inbox)') Hi Maha, Can you please clarify for me the following questions: 1. Would it make sense to condition numeric swapping by if ((eCharType_EuropeanNumber == aCharType) || (eCharType_ArabicNumber == aCharType)) ? 2. It seems that now you always set RTL reading (aRenderingContext->SetRightToLeftText(PR_TRUE) ), when platform has bidi , except for the case of visual Arabic. I think this may cause some extra reverses: for example, pure neutral strings would always get reversed by a bidi platform. Examples: a. bidi system; odd EL; char type is LTR (result of RLO); -- must be reversed, but isn't covered with the above condition. b. bidi system; even EL; char type is LTR ("normal" LTR text) -- would get reversed, while should remain as is. Or have I missed anything? Thanks, Lina Maha Abou El-Rous@IBMEG 13/05/2001 18:46 To: Lina Kemmel/Israel/Contr/IBM@IBMIL cc: Simon Montagu/Israel/Contr/IBM@IBMIL, ftang@netscape.com From: Maha Abou El-Rous/Egypt/Contr/IBM@IBMEG Subject: cleaning up the code Lina, Please find attached bidi1305.zip, in which we tried to solve the Arabic Visual problem. Please verify that our changes does not break the code for Hebrew, and check if it is clean according to erik's directions. Let me know what you think. Best Regards, Maha Abou El-Rous IBM Egypt 56 Gameat El-Dowal Al Arabia st., 12311 Operator: (202) 7492 533 Ext.: (733) Direct: (202) 3310 733 Fax : (202) 7601 227 e-mail : Mahar@eg.ibm.com testreorder.zip Name: testreorder.zip Type: Zip Compressed Data (application/x-zip-compressed) Encoding: base64 Description: .ZIP File
Got updates from Maha, which allowed to see logical Hebrew properly. (Tested logical and visual Hebrew, LTR and RTL, on bidi and non-bidi windows.) Lina
Attached patch Fix for this bugSplinter Review
Changing QA contact to mahar@eg.ibm.com.
QA Contact: andreasb → mahar
my goodnees. It is very hard to review this patch. Is that possible you can break this patch into stages ? This kind of all-in-one patch will never pass review and suprereview process. File individual bugs for individual issues.
Sorry, the diffs include some other things which happened to be in my local tree which are not related to this bug. Lina and I will get rid of the unnecessary stuff and post new diffs.
Attached patch new diffsSplinter Review
Bug 75005 depends on this fix
Blocks: 75005
Severity: normal → blocker
Blocks: 75034
Blocks: 82088
When is this code going to be submitted? we have 3 bugs dependent on this 75005, 75034, 82088 two of which are related to Visual Arabic.
Priority: -- → P1
I Sent a note to Marc Attinasi to review this
sr=attinasi for the nsTextFrame changes. I noticed that there was a change unlike the rest of the simple cleanup: - bidiUtils->FormatUnicodeText(aPresContext, text, textLength, charType, level & 1, isRightToLeftOnBidiPlatform); + bidiUtils->FormatUnicodeText(aPresContext, text, textLength, + charType, level & 1, isBidiSystem); changed param from isRightToLeftOnBidi to isBidiSystem - just pointing it out in case it was unintentional (I'm not sure exactly what the difference is, or if it is important).
attinasi: The change from 'isRightToLeftOnBidi' to 'isBidiSystem' was intentional. Do you need more details?
Component: Internationalization → BiDi Hebrew & Arabic
I've tested the patch on the recent build. it is working fine. What are we waiting for to get it in?
attinasi, is the sr complete? who's to set the target milestone for this?
OS: Windows 95 → Windows 98
Mass-move all BiDi Hebrew and Arabic qa to me, zach@zachlipton.com. Thank you Gilad for your service to this component, and best of luck to you in the future. Sholom.
QA Contact: mahar → zach
Blocks: 75046
Blocks: 75009
Simon, could you push to get this fix in please.
Blocks: 94151
I will review this and look for a sr for the changes in nsBidiPresUtils.* Maha and Lina, can you please state explicitly which of the following cases were broken before and are working with this patch. Logical Arabic (06 range) on bidi system Visual Arabic (06 range) on bidi system Logical Arabic (06 range) on non-bidi system Visual Arabic (06 range) on non-bidi system Logical Arabic (FE range) on bidi system Visual Arabic (FE range) on bidi system Logical Arabic (FE range) on non-bidi system Visual Arabic (FE range) on non-bidi system Logical Hebrew on bidi system Visual Hebrew on bidi system Logical Hebrew on non-bidi system Visual Hebrew on non-bidi system anything else
This bug was opened in the first place to clean up the reorder code. The patch attached fixes bug#75005: broken Visual Arabic (FE Range) on Bidi system, it is currently displayed reversed on NT giving a different behaviour than when running on Win2K.(inconsistent behaviour on different Windows platforms). as a result, the following will be solved: 1.bug#75034: Selection highlight on the Arabic visual page is in the opposite direction. 2.bug#75046: 06 characters (eg. Kashida, Arabic colon) inside an FE stream (Arabic Visual) causes the Win2k (with Arabic locale as default) to reorder the stream wrongly.
Keywords: patch, review
I still see some issues with selection in Visual Hebrew, but I don't want to hold up this patch because of them (and anyway I think they are my own messes which I should clean up - I will work on them in bug 75034). So r=simon@softel.co.il. We have sr=attinasi on the nsTextFrame changes and I will ask for sr on the nsBidiPresUtils changes
Comment on attachment 40880 [details] [diff] [review] Patch against the current code base sr=blizzard
Attachment #40880 - Flags: superreview+
Fix FINALLY checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: