Closed
Bug 236707
Opened 21 years ago
Closed 21 years ago
ARABIC COMMA's Joining Class is wrong
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: zwnj, Assigned: zwnj)
References
()
Details
Attachments
(1 file, 2 obsolete files)
1.33 KB,
patch
|
smontagu
:
review+
rbs
:
superreview+
mkaply
:
approval1.7+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114
ARABIC COMMA <U+060C> is a Non-Joining character, but behaves like
Joining-Transparent. In Arabic/Perisna texts, shaping of characters
after/before of ARABIC COMMA will be wrong.
Reproducible: Always
Steps to Reproduce:
1. Open <http://bamdad.org/~behnam/test/comma.fa.html>.
Actual Results:
The 3rd ARABIC LETTER HEH is _middle_ and the 1st ARABIC LETTER WAW is a _final_.
Expected Results:
In this text, the 3rd ARABIC LETTER HEH must be a _final_ and the 1st ARABIC
LETTER WAW must be a _isolated_.
As I looked at mozilla/source/content/shared/src/nsBidiUtils.cpp>, non of U+060x
and U+061x characters are in the shaping algorithm. Also many characters added
to these codes in Unicode 4.
Assignee | ||
Comment 1•21 years ago
|
||
U+0600..U+0620 Added to Arabic Joining Class Table (gJoiningClass);
#define GetJoiningClass(c) Fixed (Optimized);
Assignee | ||
Comment 2•21 years ago
|
||
It works; I've tested.
I know that this joining implementation is not good enough and is going to
replaced with some other libs, but please check in this patch till replacing the
code.
This bug is very annoying in Persian (and perhaps Arabic) texts.
Behdad: Could you test it please?
Summary: ARABIC COMMA Joining Class in wronge → ARABIC COMMA's Joining Class is wrong
Comment 3•21 years ago
|
||
Thanks Behnam for the patch. Unfortunately my laptop is not in a situation to
test anything. Roozbeh, would you please?
Behnam, please attach a patch with diff -u format.
Assignee | ||
Comment 4•21 years ago
|
||
Attachment #143178 -
Attachment is obsolete: true
Updated•21 years ago
|
Assignee: roland.mainz → mkaply
Status: UNCONFIRMED → NEW
Component: GFX: Xlib → Layout: BiDi Hebrew & Arabic
Ever confirmed: true
QA Contact: timeless → zach
Comment 5•21 years ago
|
||
I'd like to understand better the logic behind the bug, but the only computers I
have access to at the moment are on XP and don't go through this code path. What
happens if you use a regular ASCII comma (or any other codepoint outside the
U+06xx range) instead of an Arabic comma?
Assignee | ||
Comment 6•21 years ago
|
||
Everything is correct for ASCII characters between of Arabic letters. So the
main problem was not here.
I think the problem was that IS_ARABIC_CHAR was defined as:
#define IS_ARABIC_CHAR(c) ((0x0600 <= (c)) && ((c)<= 0x06FF))
but GetJoiningClass was defined as:
#define GetJoiningClass(c) \
(((0x0621 <= (c)) && ((c) <= 0x06FF)) ? \
(gJoiningClass[(c) - 0x0621]) : \
((0x200D == (c)) ? eJC : eTr))
and as the comment said:
// All letters without an equivalen in the FB50 block are 'eNJ' here. This
// should be fixed after finding some better mechanism for handling Arabic.
it was for 0621~06FF, and works right for outside of 06xx, but works wrong for
0600~0620.
Comment 7•21 years ago
|
||
(In reply to comment #6)
> I think the problem was that IS_ARABIC_CHAR was defined as:
>
> #define IS_ARABIC_CHAR(c) ((0x0600 <= (c)) && ((c)<= 0x06FF))
>
> but GetJoiningClass was defined as:
>
> #define GetJoiningClass(c) \
> (((0x0621 <= (c)) && ((c) <= 0x06FF)) ? \
> (gJoiningClass[(c) - 0x0621]) : \
> ((0x200D == (c)) ? eJC : eTr))
Yes, that was my theory too. :-)
I think the patch would be cleaner if you change
(((0x0621 <= (c)) && ((c) <= 0x06FF)) ? \
to
(IS_ARABIC_CHAR(c)) \
in GetJoiningClass and move the |((c) >> 8) == 0x06)| optimization to the
definition of |IS_ARABIC_CHAR|.
Assignee | ||
Comment 8•21 years ago
|
||
Table "gJoiningClass" is fixed like the previous one.
Using "IS_ARABIC_CHAR" for arabic character detection.
Remove the optimization, cause of human readability of header file.
Attachment #143181 -
Attachment is obsolete: true
Comment 9•21 years ago
|
||
Comment on attachment 144290 [details] [diff] [review]
use IS_ARABIC_CHAR, fixed gJoiningClass
r=smontagu.
Attachment #144290 -
Flags: review+
Comment 10•21 years ago
|
||
Sorry for the noise, but can this patch be applied please?
Comment 11•21 years ago
|
||
Comment on attachment 144290 [details] [diff] [review]
use IS_ARABIC_CHAR, fixed gJoiningClass
The patch still needs superreview before it can be applied: I'm requesting from
rbs.
Attachment #144290 -
Flags: superreview?(rbs)
Comment 12•21 years ago
|
||
Comment on attachment 144290 [details] [diff] [review]
use IS_ARABIC_CHAR, fixed gJoiningClass
sr=rbs
Attachment #144290 -
Flags: superreview?(rbs) → superreview+
Comment 13•21 years ago
|
||
Comment on attachment 144290 [details] [diff] [review]
use IS_ARABIC_CHAR, fixed gJoiningClass
a=mkaply
Attachment #144290 -
Flags: approval1.7+
Updated•21 years ago
|
Assignee: mkaply → behnam
Comment 14•21 years ago
|
||
checked in for 1.7final
Checking in nsBidiUtils.cpp;
/cvsroot/mozilla/content/shared/src/nsBidiUtils.cpp,v <-- nsBidiUtils.cpp
new revision: 3.11; previous revision: 3.10
done
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in
before you can comment on or make changes to this bug.
Description
•