Closed Bug 197375 Opened 19 years ago Closed 18 years ago

Arabic string being copy/pasted, reverses

Categories

(Core :: Layout: Text and Fonts, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: tonylaszlo, Assigned: smontagu)

References

()

Details

(Keywords: fixed1.7, Whiteboard: See comment #7 for a workaround, fixed-aviary1.0)

Attachments

(5 files, 3 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312

In the case of many sites with Arabic or Hebrew, a string of data being 
copied and then pasted, from one form to another, gets pasted in ltr 
order. 


Reproducible: Always

Steps to Reproduce:
1.go to a site with Arabic or Hebrew examples: http://www.haaretz.co.il and
http://aljazeera.net 
2.copy a string
3.paste into a form on that site, or in another tab/window of mozilla (example:
the search form at http://www.issho.org or http://www.google.com )

Actual Results:  
rtl string gets pasted in reverse order, i.e., ltr. 



Expected Results:  
This doesn't happen on other sites, probably due to the way the html is 
written/generated. This is an example of where the phenomenon does not 
occur: 
http://www.issho.org/modules.php?op=modload&name=News&file=article&sid=510

you might try copy/pasting a string of Arabic into the search form at the top
right, for example. it will go in rtl. 

This is the expected behavior.
I see it to copied content from http://www.haaretz.co.il to www.google.com. 
Linux 2003031308 and a Linux two day old CVS with GTK2/xft build
Confirming for http://www.haaretz.co.il, but I can't reproduce on
http://www.aljazeera.net/
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking1.4?
I'm quite sure that this is a dup of Bug 122183, which was wrongly marked as a
dup of Bug 120247, which was correctly marked as a dup of Bug 95400. This is
where the chain ends :-)

Prog.
Thank you for this information. So, what does one do to keep the pasted text
from reversing, in actual terms? On the bug 95400 page I see (paraphrasing)
"typing in a character afterwards to make the string reverse (to the appropriate
direction)" and "it seems that attachment 81953 [details] [diff] [review] in bug 95228 fixes this bug as
well." This one is FIXED, too?



NO, the bug is still here. You can't copy text from a Visual Hebrew page and
paste it in a Logical Hebrew page without seeing this bug. The only workaround
is to use a third party utility that reverses the charcter order.

Prog.
Thank you for the confirmation. 
Two suggestions, then. 

1) The addition of either this bug or the duplicate you mentioned (Bug 95400) to
a list of bugs to fix, _or_ the release of a notice to the effect that Mozilla
will not be able to/does not plan to achieve this, noting that workarounds via
third party utilities are necessary for Mozilla users. 

2) Creation of a document with a list of such third party utilities, for each OS
platform. Alternatively, this could be done outside the framework of Mozilla; I
would be happy to prepare such a document, if that would be preferred. In either
case, if you would initially mention a few of those third party utilities, that
would be very helpful (personally, I am especially interested in those for
Linux). Thanks. 
For an effective, cross-platform solution for reversing the order of characters,
use the following on-line tool: http://www.guides.co.il/php/hebrew.php

Simply paste the text in the textarea and press the "Send" button (the one on
the right). If you wish to translate this tool to Arabic, perhaps you could
contact the authors. Chances are they'll be happy to help.

Prog.
Many thanks. I take the liberty of including two other links here which may be
useful to the developer or user who is not so acquainted with the subtleties of
this issue:  
http://www.tau.ac.il/~danon/Hebrew/HTML_and_Hebrew.html
http://www.langbox.com/bidimozilla/
I'd like to confirm that this problem with Mozilla exist for Arabic as well, it
had been bugging me for a long time .. I can paste Arabic in Mozilla and it will
be very fine .. that is if it is copied from a program other than mozilla (ie.
gedit, kate, konq, etc.. ). but when you copy from Mozilla, you can't paste what
you copied either in mozilla, or any other place. So the reversing happen at the
copy not paste.
mozilla1.4 shipped. unsetting blocking1.4 request.
Flags: blocking1.4?
I got this bug when I try to copy and paste any Arabic text in Linux (not a bug
in windows)

Note: 
1) This bug happen if the website is not UTF-8
2) The only workaround I found it is :
	A) Select the Text and click the Right button.
	B) Select "View Selection Source" from the menu.
	C) Copy the text from the source code.
I am using Mozilla 1.5 / Gecko/20031210 .. And the problem is still there.. I
want to add that it is major as Mozilla can't be used for copying text from when
it is in Arabic.
This bug was opened before 9 months and the status is still NEW

this is a majr problem in Mozilla  (just think you can't copy or paste any thin
in english :)  )
please can anything be done ? I'm trying to deploy mozilla in our companey and build services arround it, 
unfortunately, we can't do this because the arabic support is not perfect yet.
Developers, you are really doing a great job. Please be more kind ;-)
Changing Hardware and OS to All.

Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.6) Gecko/20040113

Prog.
OS: Linux → All
Hardware: PC → All
This bug is still there and it was reported almost a year ago.  Can we
get a status on it and see if anyone has looked seriously into a
potential solution ?
WFM for haaretz.co.il and aljazeera.net. Both sites now use Logical encodings
(Windows-1255 and Windows-1256 respectively) which are not effected by this bug.
The same applies to pages that use UTF-8 and ISO-8859-8-i

The bug is visible in pages with Visual encodings such as
http://forums.ort.org.il/scripts/addform.asp?which_forum=30

Tim Freedom, it doesn't seem that anyone is currently working on this bug. If
you want to find the best ways to advance this issue, I suggest that you first
read http://bugzilla.mozilla.org/page.cgi?id=etiquette.html

Prog.
Summary: hebrew or arabic string being copy/pasted, reverses → Hebrew or Aabic string being copy/pasted, reverses (Visual encodings only)
Whiteboard: See comment #7 for a workaround
The bug still exist in Mozilla 1.6, FireFox and Thunderbird. I face it daily.
The good thing is that the Bug do not exist when dealing with UTF-8.. I only
face it when having to use the Windows-1256 encoding (CP-1256). It does not
happen only with copying from the Browser, but also with copying text from the
Text Areas in Thunderbird (while writing an email in Windows-1256), and same in
Mozilla-Mail in 1.6 .
Summary: Hebrew or Aabic string being copy/pasted, reverses (Visual encodings only) → Hebrew or Arabic string being copy/pasted, reverses (Visual encodings only)
If you go to the (previously mentioned)
http://forums.ort.org.il/scripts/addform.asp?which_forum=30 ,   select a couple
of words on either side of "Ctrl+shift", then view selected source (my default
format is english, by the way -- no actual hebrew knowledge),  You'll see how
this is actually encoded... very snarky.
Taking this bug.
Assignee: mkaply → smontagu
Has anyone seeing this bug made any changes to the pref bidi.clipboardtextmode?
Assuming the default value of the pref, there is a very simple fix to the Arabic
non-UTF-8 case on Linux.

That is actually a different bug from the originally reported case of visual
Hebrew, which is much harder to fix, in fact probably impossible to produce a
perfect fix for. 
Actually, we can remove IsArabicEncoding() altogether since this is the only
caller.
Attachment #148584 - Attachment is obsolete: true
I can test the patch on Windows XP and linux. If someone can help with testing
on Windows 98 (Arabic version and non-Arabic version) and other OSs it would
help a lot.
I can help testing the patch, is it against CVS ? which release ??
and if i'm able to compile mozilla ;)
Debian testing here.
The patch is against current trunk CVS.
Status: NEW → ASSIGNED
Great, works fine!
however I noticed that opening www.amrkhaled.net, edit>select all, edit>copy
pasting in openoffice reverse the whole page though arabic is pasted fine. Do u 
think it's a mozilla bug or openoffice bug ?
(In reply to comment #28)
> however I noticed that opening www.amrkhaled.net, edit>select all, edit>copy
> pasting in openoffice reverse the whole page though arabic is pasted fine. Do u 
> think it's a mozilla bug or openoffice bug ?

As far as I can tell what happens in that scenario is that if there is a table
with specified direction, e.g. <table dir="rtl>, it has right-to-left
directionality in OpenOffice, but if it inherits its direction from the <body>
element it will have default directionality in OpenOffice. I would expect that
the same thing would happen with any other attribute on the <body> element, but
I need to test more. In any case it's not related to this bug, and you can work
around the problem by setting right-to-left direction in the OpenOffice document
(from Format | Page) before pasting.
Version of attachment 148587 [details] [diff] [review], removing more dead code.

I believe that this is the way to go, if only to clear the ground for a
non-buggy implementation of converting visual source to logical clipboard. 

I don't myself intend to fix the bug as originally reported, since it would
require a lot of code to work properly (something on the lines of
nsBidiPresUtils::Resolve) and there will still be edge cases of visual text
with no logical equivalent which could only be fixed by programmatically
inserting control codes.
Attachment #148587 - Attachment is obsolete: true
Attachment #148881 - Flags: superreview?(bzbarsky)
Attachment #148881 - Flags: review?(bzbarsky)
Requesting blocking1.7. The problem with copying Arabic text which my patch
fixes was described in a recent Slashdot interview as "a show stopper and
slowing the adoption of GNU/Linux [in Egypt]"
http://interviews.slashdot.org/interviews/04/05/13/1346237.shtml
Flags: blocking1.7?
Comment on attachment 148881 [details] [diff] [review]
The radical patch with more cleanup

Index: layout/base/src/nsPresContext.cpp
===================================================================

@@ -1025,35 +1020,17 @@ nsPresContext::GetContainer()

 #ifdef IBMBIDI
-//ahmed
 NS_IMETHODIMP

that should read
-NS_IMETHODIMP
Simon, how about providing a binary so that more people can test the patch?
positive experiences will help back up the forthcoming approval1.7 request, and
if these tests turn out to be less successful, well, then this will only help
improve the patch even further.

I can provide some webspace for the binary, if you need it.

Prog.
I'll be testing the new patch, I'll also post a compiled binary :Compiled under 
Debian sarge" as soon as I finish compiling.
Doesn't compile:
c++ -o nsPresContext.o -c -DOSTYPE=\"Linux2.6\" -DOSARCH=\"Linux\" 
-D_IMPL_NS_LAYOUT -I../../../content/events/src -I./../../html/base/src -I./../.
./html/style/src -I./../../xul/base/src -I./../../xul/content/src  -I../../..
/dist/include/xpcom -I../../../dist/include/string -I../../../dist/include/dom 
-I../../../dist/include/content -I../../../dist/include/gfx -I../../..
/dist/include/widget -I../../../dist/include/view -I../../../dist/include/locale 
-I../../../dist/include/webshell -I../../../dist/include/necko -I../../..
/dist/include/uconv -I../../../dist/include/pref -I../../..
/dist/include/uriloader -I../../../dist/include/docshell -I../../..
/dist/include/imglib2 -I../../../dist/include/js -I../../..
/dist/include/xpconnect -I../../../dist/include/layout -I../../../dist/include 
-I/usr/src/mozilla/dist/include/nspr     -I/usr/X11R6/include   -fPIC  
-I/usr/X11R6/include -fno-rtti -fno-exceptions -Wall -Wconversion 
-Wpointer-arith -Wcast-align -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy 
-Wno-non-virtual-dtor -Wno-long-long -fshort-wchar -pthread -pipe  -DNDEBUG 
-DTRIMMED -O2  -I/usr/X11R6/include -DMOZILLA_CLIENT -include ../../..
/mozilla-config.h -Wp,-MD,.deps/nsPresContext.pp nsPresContext.cpp
nsPresContext.cpp: In member function `void nsPresContext::GetFontPreferences()
   ':
nsPresContext.cpp:328: warning: `nsFont*font' might be used uninitialized in 
   this function
nsPresContext.cpp: At global scope:
nsPresContext.cpp:1030: error: syntax error before `::' token
nsPresContext.cpp:1036: error: syntax error before `->' token
../../../dist/include/content/nsContentPolicyUtils.h:198: warning: `
   nsIDocShell* NS_CP_GetDocShellFromDOMNode(nsIDOMNode*)' defined but not used
make[4]: *** [nsPresContext.o] Error 1
make[4]: Leaving directory `/usr/src/mozilla/layout/base/src'
make[3]: *** [libs] Error 2
make[3]: Leaving directory `/usr/src/mozilla/layout/base'
make[2]: *** [libs] Error 2
make[2]: Leaving directory `/usr/src/mozilla/layout'
make[1]: *** [tier_9] Error 2
make[1]: Leaving directory `/usr/src/mozilla'
make: *** [default] Error 2

c++ 3.3.3
> Doesn't compile:

See comment 32, please.
Comment on attachment 148881 [details] [diff] [review]
The radical patch with more cleanup

r+sr=bzbarsky if you make this compile... ;)
Attachment #148881 - Flags: superreview?(bzbarsky)
Attachment #148881 - Flags: superreview+
Attachment #148881 - Flags: review?(bzbarsky)
Attachment #148881 - Flags: review+
This version should compile out of the box. Mohammed, please let me know if
there are problems.
Attachment #148881 - Attachment is obsolete: true
Thanks.
Whatever I did, the previous patch always complains about missing symbols when I 
try to run mozilla.
I'll try this one and keep you updated.
I've uploaded the tarball I've built, Debian testing.
http://www.foolab.org/files/mozilla-i686-pc-linux-gnu.tar.gz
tested the patch here for arabic text.
works perfectly as far as I can tell.

any hints about these edge cases so I can do more tests?

and yes this should realy block 1.7
tested it here also on Linux using Mohamad Sameer's binary and it is working
fine as far as i can tell

will it be included in teh next release of firefox also ?
Comment on attachment 148925 [details] [diff] [review]
last patch with corrections

Transferring review per comment 37 and comment 40
Attachment #148925 - Flags: superreview+
Attachment #148925 - Flags: review+
(In reply to comment #40)
> I've uploaded the tarball I've built, Debian testing.
> http://www.foolab.org/files/mozilla-i686-pc-linux-gnu.tar.gz

I tested this build under SuSE 8.2 with a Visual Hebrew page and the text is
still pasted in reverse order. As paste-targets I used Mozilla's address bar, a
textarea, and KWrite.

All the Arabic pages linked in this bug now use Logical encodings. Are there any
alternative pages that I can test?

Prog.
If people think this should block the release then why hasn't someone requested
approval to land the reviewed patch?
Asa, I'm not sure the reviewed patch fully solves this problem. See comment #45

Prog.
(In reply to comment #45)
> If people think this should block the release then why hasn't someone requested
> approval to land the reviewed patch?

Two reasons: I was waiting for some more testing feedback, and I don't usually
have time to get to the computer on Fridays.

I see from comment 44 that people are still confused about what this bug is
about and which cases the patch is intended to fix, and this is partly because
the summary is just wrong. Hebrew only reverses in visual encodings, but on
Linux (and probably on any OS without native Bidi support, e.g US versions of
Windows 9x) Arabic always reverses unless it is in UTF-8.

The latter is the problem which I consider a blocker, and which the patch fixes.
The patch does not fix visual encodings, but I don't think that issue is a
blocker for anything.
Summary: Hebrew or Arabic string being copy/pasted, reverses (Visual encodings only) → Hebrew or Arabic string being copy/pasted, reverses
(In reply to comment #41)
> tested the patch here for arabic text.
> works perfectly as far as I can tell.
> 
> any hints about these edge cases so I can do more tests?

The edge cases are in visual encodings, and not relevant to the current patch.
Comment on attachment 148925 [details] [diff] [review]
last patch with corrections

Requesting approval 1.7. As already mentioned, this is a very serious bug for
Arabic users.

For non-Arabic documents, risk is zero, because the patch just removes a code
path which documents not in Arabic encodings never use. (This is why UTF-8
documents didn't exhibit the bug: they never went through the broken code)
Attachment #148925 - Flags: approval1.7?
Is there any reason why Hebrew is included in the summary at all? do you plan to
fix copy/paste of Visual Hebrew?

Prog.
(In reply to comment #51)
> do you plan to fix copy/paste of Visual Hebrew?

Do you think we need to?

This is not a rhetorical question, but maybe we should have the discussion in a
newsgroup or forum.
Patch checked in to trunk.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Comment on attachment 148925 [details] [diff] [review]
last patch with corrections

a=asa (on behalf of drivers) for checkin to 1.7
Attachment #148925 - Flags: approval1.7? → approval1.7+
Patch checked into 1.7 branch.
Keywords: fixed1.7
Flags: blocking1.7?
Hello, sorry for not being able to submit my message earlier, but this patch is
a disaster!
We need clipboard BIDI modes (Visual, Logical, Source) in so many cases. For
example on Visual systems when copy from a logical page the text will get
reversed when pasting. and vise versa, suppose we're going to copy from a visual
page and paste on a logical system. 
This patch doesn't provide a solution for the problem, it turns around it. 
(In reply to comment #56)
> We need clipboard BIDI modes (Visual, Logical, Source) in so many cases. For
> example on Visual systems when copy from a logical page the text will get
> reversed when pasting. and vise versa, suppose we're going to copy from a visual
> page and paste on a logical system. 
> This patch doesn't provide a solution for the problem, it turns around it. 

I think the idea behind this patch is to always support logical to logical
copy/paste. It doesn't fix visual to logical or logical to visual cases.

Has this patch caused a regression that you're seeing? If so, you need to state
so very soon-ish, as Mozilla 1.7 is very close to release.
(In reply to comment #56)
> Hello, sorry for not being able to submit my message earlier, but this patch is
> a disaster!
> We need clipboard BIDI modes (Visual, Logical, Source) in so many cases. For
> example on Visual systems when copy from a logical page the text will get
> reversed when pasting. and vise versa, suppose we're going to copy from a visual
> page and paste on a logical system. 
> This patch doesn't provide a solution for the problem, it turns around it. 

Was the existing code working for you? It seemed to me just too broken to be
useful, especially since it depended on a hidden pref. Could a third-party tool
like the one mentioned in comment 7 solve the issue?
There is nothing called visual or logical for Arabic, It's for hebrew only and 
we know that this patch doesn't fix it (correct me please)
I'm not against hebrew, But we've got a thing done, Thanks Mozilla 
developers/contributers/testers for this.
We are very glad and I wish I can help more but I can't.
Now IBM ppl, If you have a fix, please present it, but please don't try to b0rk 
the good things happened.
Whiteboard: See comment #7 for a workaround → See comment #7 for a workaround, needed-aviary1.0
We are working on a new patch to solve both Arabic and Hebrew problems. We're expecting to 
submit the new patch soon. 
As for Arabic, yes we have visual for Arabic as well as Hebrew, so we need the visual copy 
support for pages in visual format such as pages encoded in IBM-864. 
If we can't catch 1.7 release, perhaps we could port it to 1.8a or something. 
i don't think that no one is using the visual for Arabic, 

and if u IBM see that ur patch will be more helpful so go on and compelete ur
work but till u finish i  see that the preset patch which fix the bug with copy
and paste for Arabic should be included in 1.7

cause this bug was not active for a long time and it is a must to be fixed in
next release atleast with the current patch 
I have created a patch that solves the Arabic copy/paste bug. It also supports
visual Arabic copy/paste in addition to visual Hebrew. However I'm experiencing
some problems submitting the patch, I select to create new attachement and
choose the patch file and fill in the relevant data, then click on submit, but a
red box appears saying that I haven't chosen a file!!!.
I can submit the patch as an additional comment if this suits you.
I sent you an email, you can attach the patch to a reply and I will submit it.

Thanks,

Prog.
Here is the patch, I failed to attach it so I'm submitting it in a comment:

Index: layout/base/public/nsIPresContext.h
===================================================================
RCS file: /cvsroot/mozilla/layout/base/public/nsIPresContext.h,v
retrieving revision 3.127
diff -u -6 -r3.127 nsIPresContext.h
--- layout/base/public/nsIPresContext.h	29 Apr 2004 23:34:09 -0000	3.127
+++ layout/base/public/nsIPresContext.h	6 Jun 2004 13:47:29 -0000
@@ -467,12 +467,18 @@
   /**
    * Check for Arabic encoding
    * @return aResult == TRUE if the document encoding is an Arabic codepage
    */
   NS_IMETHOD IsArabicEncoding(PRBool &aResult) const = 0;
 
+  /** Y2004 Bug #197375
+   * Check for Hebrew encoding
+   * @return aResult == TRUE if the document encoding is an Hebrew codepage
+   */
+  NS_IMETHOD IsHebrewEncoding(PRBool &aResult) const = 0;
+
   /**
    * Set the Bidi capabilities of the system
    * @param aIsBidi == TRUE if the system has the capability of reordering Bidi
text
    */
   void SetIsBidiSystem(PRBool aIsBidi)
   {
Index: layout/base/src/nsPresContext.h
===================================================================
RCS file: /cvsroot/mozilla/layout/base/src/nsPresContext.h,v
retrieving revision 3.129
diff -u -6 -r3.129 nsPresContext.h
--- layout/base/src/nsPresContext.h	30 Apr 2004 04:06:30 -0000	3.129
+++ layout/base/src/nsPresContext.h	6 Jun 2004 13:47:44 -0000
@@ -103,12 +103,14 @@
   NS_IMETHOD GetBidiUtils(nsBidiPresUtils** aBidiUtils);
   NS_IMETHOD SetBidi(PRUint32 aSource, PRBool aForceReflow = PR_FALSE);
   NS_IMETHOD GetBidi(PRUint32* aDest) const;
  //ahmed
   NS_IMETHOD IsVisRTL(PRBool &aResult) const;
   NS_IMETHOD IsArabicEncoding(PRBool &aResult) const;
+  //Y2004 Bug #197375
+  NS_IMETHOD IsHebrewEncoding(PRBool &aResult) const; //Y2004
 
 //Mohamed  17-1-01
   NS_IMETHOD GetBidiCharset(nsACString &aCharSet) const;
 //Mohamed End
 #endif // IBMBIDI
 
Index: layout/base/src/nsPresContext.cpp
===================================================================
RCS file: /cvsroot/mozilla/layout/base/src/nsPresContext.cpp,v
retrieving revision 3.261
diff -u -6 -r3.261 nsPresContext.cpp
--- layout/base/src/nsPresContext.cpp	30 Apr 2004 04:11:02 -0000	3.261
+++ layout/base/src/nsPresContext.cpp	6 Jun 2004 13:47:59 -0000
@@ -1037,12 +1037,22 @@
   aResult=PR_FALSE;
   if ( (mCharset.EqualsIgnoreCase("ibm864")
)||(mCharset.EqualsIgnoreCase("ibm864i")
)||(mCharset.EqualsIgnoreCase("windows-1256")
)||(mCharset.EqualsIgnoreCase("iso-8859-6") ))
     aResult=PR_TRUE;
   return NS_OK;
 }
 
+NS_IMETHODIMP //Y2004 Bug #197375
+nsPresContext::IsHebrewEncoding(PRBool& aResult) const
+{
+  aResult=PR_FALSE;
+  if ( (mCharset.EqualsIgnoreCase("ibm862")
)||(mCharset.EqualsIgnoreCase("windows-1255") ) ||
+       (mCharset.EqualsIgnoreCase("iso-8859-8") ) ||
(mCharset.EqualsIgnoreCase("iso-8859-8i") ) )
+    aResult=PR_TRUE;
+  return NS_OK;
+}
+
 NS_IMETHODIMP
 nsPresContext::IsVisRTL(PRBool& aResult) const
 {
   aResult=PR_FALSE;
   if ( (mIsVisual)&&(GET_BIDI_OPTION_DIRECTION(mBidi) ==
IBMBIDI_TEXTDIRECTION_RTL) )
     aResult=PR_TRUE;
Index: content/shared/public/nsBidiUtils.h
===================================================================
RCS file: /cvsroot/mozilla/content/shared/public/nsBidiUtils.h,v
retrieving revision 1.6
diff -u -6 -r1.6 nsBidiUtils.h
--- content/shared/public/nsBidiUtils.h	17 Apr 2004 21:52:19 -0000	1.6
+++ content/shared/public/nsBidiUtils.h	6 Jun 2004 13:48:35 -0000
@@ -99,12 +99,13 @@
   /**
    * Scan an nsString, converting numerals to Arabic or Hindi forms
    * @param aSrc is the input string
    * @param aDst is the output string
    */
   nsresult HandleNumbers(const nsString& aSrc, nsString& aDst);
+  nsresult Reverse_Hebrew(const nsString& aSrc, nsString& aDst); // Y2004 Bug
#197375
 
 // --------------------------------------------------
 // IBMBIDI 
 // --------------------------------------------------
 //
 // These values are shared with Preferences dialog
Index: content/shared/src/nsBidiUtils.cpp
===================================================================
RCS file: /cvsroot/mozilla/content/shared/src/nsBidiUtils.cpp,v
retrieving revision 3.12
diff -u -6 -r3.12 nsBidiUtils.cpp
--- content/shared/src/nsBidiUtils.cpp	17 Apr 2004 21:52:20 -0000	3.12
+++ content/shared/src/nsBidiUtils.cpp	6 Jun 2004 13:48:46 -0000
@@ -390,13 +390,13 @@
 }
 
 nsresult Conv_FE_06_WithReverse(const nsString& aSrc, nsString& aDst)
 {
   PRUnichar *aSrcUnichars = (PRUnichar *)aSrc.get();
   PRBool foundArabic = PR_FALSE;
-  PRUint32 i, endArabic, beginArabic, size;
+  PRInt64 i, endArabic, beginArabic, size;
   beginArabic = 0;
   size = aSrc.Length();
   aDst.Truncate();
   for (endArabic=0;endArabic<size;endArabic++) {
     if (aSrcUnichars[endArabic] == 0x0000) 
       break; // no need to convert char after the NULL
@@ -550,6 +550,42 @@
 nsresult HandleNumbers(const nsString& aSrc, nsString& aDst)
 {
   aDst = aSrc;
   return HandleNumbers((PRUnichar *)aDst.get(),aDst.Length(),
IBMBIDI_NUMERAL_REGULAR);
 }
 
+// Y2004 Bug #197375
+nsresult Reverse_Hebrew(const nsString& aSrc, nsString& aDst)
+{
+  PRUnichar *aSrcUnichars = (PRUnichar *)aSrc.get();
+  PRBool foundHebrew = PR_FALSE;
+  PRInt64 i, endHebrew, beginHebrew, size;
+  beginHebrew = 0;
+  size = aSrc.Length();
+  aDst.Truncate();
+  for (endHebrew=0;endHebrew<size;endHebrew++) {
+      if (aSrcUnichars[endHebrew] == 0x0000) 
+          break; // no need to convert char after the NULL
+
+      while( (IS_HEBREW_CHAR(aSrcUnichars[endHebrew])) ||
(aSrcUnichars[endHebrew]==0x0020) )
+      {
+          if(! foundHebrew ) {
+              beginHebrew=endHebrew;
+              foundHebrew= PR_TRUE;
+          }
+          endHebrew++;
+      }
+      if(foundHebrew) {
+          endHebrew--;
+          for (i=endHebrew; i>=beginHebrew; i--) {
+              /*if((IS_ARABIC_CHAR(aSrcUnichars[i]))||
+                   (IS_ARABIC_DIGIT(aSrcUnichars[i]))||
+                   (aSrcUnichars[i]==0x0020))*/
+              aDst += aSrcUnichars[i];
+          }     
+      } else {
+          aDst += aSrcUnichars[endHebrew]; 
+      }
+      foundHebrew=PR_FALSE;
+  }// for : loop the buffer
+  return NS_OK;
+}
Index: layout/base/src/nsCopySupport.cpp
===================================================================
RCS file: /cvsroot/mozilla/layout/base/src/Attic/nsCopySupport.cpp,v
retrieving revision 1.33
diff -u -6 -r1.33 nsCopySupport.cpp
--- layout/base/src/nsCopySupport.cpp	18 Apr 2004 14:30:22 -0000	1.33
+++ layout/base/src/nsCopySupport.cpp	10 Jun 2004 13:54:03 -0000
@@ -145,12 +145,98 @@
   nsCOMPtr<nsIDocument> doc = do_QueryInterface(aDoc);
   if (doc) {
     nsIPresShell *shell = doc->GetShellAt(0);
     if (shell) {
       nsCOMPtr<nsIPresContext> context;
       shell->GetPresContext(getter_AddRefs(context) );
+      /************************************************************************** 
+        * Y2004 (Ahmed -- Bug #197375)
+        *
+        * The following simplified chart describes the code conditions
+        *
+        *                    __ RTL text --> convert the FE characters to 06
characters
+        *                   |
+        *             __ Bidi System
+        *            |   (i.e. Bidi Locale) 
+        *            |      |__ LTR text --> Convert the FE characters to 06
characters
+        *            |                       and reverse the Arabic text in the
buffer.
+        *            |
+        *      _(Visual Page)
+        *     |      |__ Not Bidi System --> only handle numerics to change the
Hindi numbers to Arabic Numbers
+        *     |          (i.e. Not Bidi Locale)
+        *     |
+        * Clipboard 
+        * (Logical)
+        *     |
+        *     |__ (Logical Page) --> No extra processing is needed
+        *
+        *
+        *
+        *
+        *      __ (Visual Page) --> Convert the 06 characters to FE characters
according
+        *     |                     to the text direction. (Which is broken also)
+        *     |                                
+        *     |
+        * Clipboard
+        * (Visual)
+        *     |
+        *     |__ (Logical Page) --> Convert the 06 to FE and reverse the
buffer only
+        *                            if the cliboard mode is visual.
+        *
+        * Clipboard (Source) --> Accroding to the source of the page the copy
mode will be set,
+        *                        if the page th visual then the copy mode is
visual, if the
+        *                        page the logical the copy mode will logical.
Then the code
+        *                        will pass through one of the mentioned passes
above.
+        *       *******
+        * The following algorithm describe the new solution in code-like flow.
+        * 
+        * .
+        * .
+        * .
+        * if ( the code page is Arabic code page )
+        * {
+        *     In this case we have two major options, the first one if the
+        *     clipboard is logical with visual pages, the second option is
+        *     everything else. So the flow of the code would be as following:
+        *     if ( The clipboard mode is Logical && The page is in visual format )
+        *     {
+        *         if ( The opearting system is running bidi locale (Arabic Locale)
+        *         {
+        *             if ( The text direction is RTL )
+        *                 convert the Arabic string from FE range to 06 range
because
+        *                 the copy mode is logical, that's why we need to
convert the FE to 06. 
+        *             else // ( The text direction is LTR )
+        *                 We need to convert the FE character encoding to 06
character encoding,
+        *                 also we have to reverse the buffer to properly paste
Arabic.
+        *         }
+        *         else // ( The system is not running an Arabic locale )
+        *             All what we have to do is to handle the numerics if they
will be Hindi or Arabic.
+        *     }
+        *     else
+        *         if ( The code page is UTF-8 )
+        *             Special handling for UTF-8 case
+        *         else 
+        *         {
+        *             if ( The code page is visual OR
+        *                  the code page is logical while the clipboard mode is
visual)
+        *                 Convert the 06 characters to FE characters and
reverse the Arabic
+        *                 text according to the direction.
+        *         }
+        * }
+        * else // ( New code to handle visual Hebrew )
+        * {
+        *     if ( the code page is Hebrew code page in visual format )
+        *         Change the Hebrew presentation from visual to logical by
reversing Hebrew
+        *         string only, so the text appears readable after pasting it
+        * }
+        * .
+        * .
+        * .
+        *
+        */
+
       if (context) {
         context->IsArabicEncoding(arabicCharset);
         if (arabicCharset) {
           PRUint32 bidiOptions;
           PRBool isBidiSystem = context->IsBidiSystem();
           PRBool isVisual = context->IsVisualMode();
@@ -173,21 +259,44 @@
             buffer = newBuffer;
           }
           //Mohamed
           else {
             nsCAutoString bidiCharset;
             context->GetBidiCharset(bidiCharset);
-            if (bidiCharset.EqualsIgnoreCase("UTF-8") || (!isVisual)) {
+
+            if (bidiCharset.EqualsIgnoreCase("UTF-8")) {
               if ( (GET_BIDI_OPTION_CLIPBOARDTEXTMODE(bidiOptions) ==
IBMBIDI_CLIPBOARDTEXTMODE_VISUAL) || (!isBidiSystem) ) {
                 nsAutoString newBuffer;
                 Conv_06_FE_WithReverse(buffer, newBuffer,
GET_BIDI_OPTION_DIRECTION(bidiOptions));
                 HandleNumbers(newBuffer, buffer);
               }
+            } //Y2004 Bug #197375
+            else if(isVisual || (!isVisual &&
(GET_BIDI_OPTION_CLIPBOARDTEXTMODE(bidiOptions) ==
IBMBIDI_CLIPBOARDTEXTMODE_VISUAL)) )
+            {
+                nsAutoString newBuffer;
+                Conv_06_FE_WithReverse(buffer, newBuffer,
GET_BIDI_OPTION_DIRECTION(bidiOptions));
+                HandleNumbers(newBuffer, buffer);
             }
           }
         }
+        else { // Y2004 -- Ahmad Hebrew visual Support Bug #197375
+            PRBool hebrewCharset; //Y2004
+          
+            context->IsHebrewEncoding(hebrewCharset);
+            if(hebrewCharset) {
+                PRBool isVisual = context->IsVisualMode();
+
+                if (isVisual) {
+                    nsAutoString newBuffer;
+
+                    Reverse_Hebrew(buffer, newBuffer);
+                    buffer = newBuffer;
+                }
+
+            } 
+        }
       }
     }
   }
 #endif // IBMBIDI
 
   // Get the Clipboard
This the same patch which is submitted by prognathous@hotmail.com on behalf of
me, However I'm submitting it again in order to have the authority to edit it.
Attachment #150790 - Flags: review?(smontagu)
Whiteboard: See comment #7 for a workaround, needed-aviary1.0 → See comment #7 for a workaround, fixed-aviary1.0
Comment on attachment 150790 [details] [diff] [review]
Patch to fix both Arabic & Hebrew problems

This patch is made against the trunk. More contribution to test and verify this
patch will be more than welcome
Attachment #150790 - Flags: superreview?(dbaron)
Comment on attachment 150790 [details] [diff] [review]
Patch to fix both Arabic & Hebrew problems

Review is required in order to approve this patch
Attachment #150790 - Flags: superreview?(dbaron)
Attachment #150790 - Flags: review?(smontagu)
Attachment #150790 - Flags: review?(dbaron)
Attachment #150790 - Flags: review?(dbaron) → review?(smontagu)
Hi All, I have submitted the new patch to solve both Arabic and Hebrew 
problems long time ago, and issued review Flag.
Still the patch is not reviewed. Could you review it, Simon? For others, 
specially Hebrew speakers could you test the patch against visual Hebrew?
Comment on attachment 150790 [details] [diff] [review]
Patch to fix both Arabic & Hebrew problems

Apologies for the delay in reviewing this.

I'm not happy with the way the original code uses the page encoding to
determine the code path. What will happen in a UTF-8 page containing both
Hebrew and Arabic, or a page in some other encoding with Hebrew and/or Arabic
represented by numeric character references?

Adding a new IsHebrewEncoding() doesn't solve this fundamental issue. It would
be better to use nsPresContext::BidiEnabled(), which will return PR_TRUE if
there are either Hebrew or Arabic (or other RTL) characters anywhere in the
page.

I need more thought about ReverseHebrew() and Conv...WithReverse(). I think
they are oversimplified and will not handle all cases correctly.
Attachment #150790 - Flags: review?(smontagu) → review-
> if (bidiCharset.EqualsIgnoreCase("UTF-8")) {
>                if ( (GET_BIDI_OPTION_CLIPBOARDTEXTMODE(bidiOptions) ==
> IBMBIDI_CLIPBOARDTEXTMODE_VISUAL) || (!isBidiSystem) ) {
>                  nsAutoString newBuffer;
>                  Conv_06_FE_WithReverse(buffer, newBuffer,
> GET_BIDI_OPTION_DIRECTION(bidiOptions));
>                  HandleNumbers(newBuffer, buffer);
>                }

Why do we need reversing on a non-bidi system? It seems to cause Arabic text be 
pasted in reversed order, if the destination application supports bidi. To 
control compatibility with external applications, we have an appropriate UI 
option (clipboard mode).

So, can we just call |Conv_06_FE| on a |!isBidiSystem|? 
> So, can we just call |Conv_06_FE| on a |!isBidiSystem|?

This applies to the logic, so if the patch is rewritten - to a possible 
|Conv_06_FE| substitution.
I guess this This part was a special case for UTF-8 handling. I think we will
need to redisgn the code path not to rely on the code set to determine it as
objected by Semion.
I think we will have to redesign the old code as whole.
Summary: Hebrew or Arabic string being copy/pasted, reverses → Arabic string being copy/pasted, reverses
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in before you can comment on or make changes to this bug.