Closed
Bug 41425
Opened 25 years ago
Closed 22 years ago
UTF-8 converter gets surrogates wrong
Categories
(Core :: Internationalization, defect, P3)
Core
Internationalization
Tracking
()
RESOLVED
FIXED
People
(Reporter: erik, Assigned: shanjian)
References
Details
(Keywords: intl, topembed)
Attachments
(1 file)
708 bytes,
patch
|
ftang
:
review+
blizzard
:
superreview+
|
Details | Diff | Splinter Review |
UTF-8 codes for UCS-4 characters outside UTF-16 are mistakenly converted to
surrogates in UTF-16. The test is currently:
if(mUcs4 >= 0x001F0000) {
It should be:
if(mUcs4 >= 0x00110000) {
Also, the surrogates are computed wrongly. Instead of ORing (|), it should be
ADDing (+). The current code is:
*out++ = 0xD800 | (0x000003FF & (mUcs4 >> 10));
*out++ = 0xDC00 | (0x000003FF & mUcs4);
It should be:
*out++ = 0xD800 + (mUcs4 >> 10);
*out++ = 0xDC00 + (0x000003FF & mUcs4);
The "out" pointer must not be incremented twice (as above) without checking
"outend". It should return NS_OK_UDEC_MOREOUTPUT and save state if we don't have
enough space in the output buffer.
For 4-byte UTF-8 chars, the line should be:
mUcs4 = (mUcs4 << 18) & 0x001C0000L;
Near the end of the method, res is set to NS_OK_UDEC_MOREINPUT even if it had
been set to NS_ERROR_UNEXPECTED earlier in the code. It should not overwrite
res in this way.
"res" is initialized to "nsnull". This should be NS_OK.
Reporter | ||
Comment 1•25 years ago
|
||
Please ignore the part about ORing and ADDing. ORing is fine (and faster).
Reporter | ||
Comment 2•25 years ago
|
||
The "if(mUcs4 >= 0x001F0000)" bug occurs in several files in Mozilla. Maybe all
of these should be fixed as part of this bug.
Comment 3•25 years ago
|
||
This probably mean 3 places
/intl/uconv/ucvlatin/nsUTF8ToUnicode.cpp
/xpcom/ds/nsString2.cpp
/xpcom/ds/nsTextFormatter.cpp
The comment of changing 0x001F0000 to 0x00110000 is correct.
The comment of out++ twice should only apply to nsUTF8ToUnicode.cpp since we
guarantee the buffer size in nsString2.cpp and nsTextFormatter.cpp
Comment 4•24 years ago
|
||
erik- can you help to fix it? Thanks Mark it as M23
Assignee: cata → erik
Status: ASSIGNED → NEW
Target Milestone: M17 → M23
Comment 5•24 years ago
|
||
Take this bug back. Mark it as future
Assignee: erik → ftang
Target Milestone: --- → Future
Comment 7•23 years ago
|
||
shanjian- can you take a look at this old bug. Is this still valid? if not ,
please close it. if yes, please file individual bugs for each cases. Thanks
Assignee: ftang → shanjian
Status: ASSIGNED → NEW
Target Milestone: Future → ---
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•23 years ago
|
||
/intl/uconv/ucvlatin/nsUTF8ToUnicode.cpp has been taken care of.
/xpcom/ds/nsString2.cpp (It should be nsString2.h), will be taken care of in bug
134035.
/xpcom/ds/nsTextFormatter.cpp will be taken care of by this bug.
Assignee | ||
Comment 9•23 years ago
|
||
Comment 10•23 years ago
|
||
Comment on attachment 82258 [details] [diff] [review]
patch
r=ftang
Attachment #82258 -
Flags: review+
Assignee | ||
Comment 11•23 years ago
|
||
blizzard, could you sr?
Comment 12•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review]
patch
sr=blizzard
Attachment #82258 -
Flags: superreview+
Assignee | ||
Comment 13•22 years ago
|
||
fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 14•22 years ago
|
||
I suspect the checkin for this bug to have caused the regression in bug 164771.
Could you verify this?
Comment 15•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review]
patch
I just verified that his fix is causing bug 164771, as arthur suggested, which
is a serious copy/paste regression.
Attachment #82258 -
Flags: needs-work+
Comment 16•22 years ago
|
||
Reopening this bug. The patch needs to be backed out, or augmented to fix the
copy/paste problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 17•22 years ago
|
||
This is hard for me to understand why this simple change cause the problem. I
could not reproduce it on my box. When kind of text did you try to copy?
Status: REOPENED → ASSIGNED
Assignee | ||
Comment 18•22 years ago
|
||
Just verified with kin that the bug 164771 is not caused by this patch. Mark as
fixed.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Comment 19•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review]
patch
Ok, I screwed up on my backout of shanjian's changes. I tested the backout with
the wrong build. Backing out shanjian's patch did *not* fix the problem.
Appologies to shanjian.
Attachment #82258 -
Flags: needs-work+
Comment 20•22 years ago
|
||
batch: adding topembed per Gecko2 document
http://rocknroll.mcom.com/users/marek/publish/Gecko/Gecko2Tasks.html
Keywords: topembed
Comment 21•22 years ago
|
||
Shanjian, how can I verify this?
Assignee | ||
Comment 22•22 years ago
|
||
Profile name resolution use this code. Try it and see if it works.
You need to log in
before you can comment on or make changes to this bug.
Description
•