Closed
Bug 41425
Opened 24 years ago
Closed 22 years ago
UTF-8 converter gets surrogates wrong
Categories
(Core :: Internationalization, defect, P3)
Core
Internationalization
Tracking
()
RESOLVED
FIXED
People
(Reporter: erik, Assigned: shanjian)
References
Details
(Keywords: intl, topembed)
Attachments
(1 file)
708 bytes,
patch
|
ftang
:
review+
blizzard
:
superreview+
|
Details | Diff | Splinter Review |
UTF-8 codes for UCS-4 characters outside UTF-16 are mistakenly converted to surrogates in UTF-16. The test is currently: if(mUcs4 >= 0x001F0000) { It should be: if(mUcs4 >= 0x00110000) { Also, the surrogates are computed wrongly. Instead of ORing (|), it should be ADDing (+). The current code is: *out++ = 0xD800 | (0x000003FF & (mUcs4 >> 10)); *out++ = 0xDC00 | (0x000003FF & mUcs4); It should be: *out++ = 0xD800 + (mUcs4 >> 10); *out++ = 0xDC00 + (0x000003FF & mUcs4); The "out" pointer must not be incremented twice (as above) without checking "outend". It should return NS_OK_UDEC_MOREOUTPUT and save state if we don't have enough space in the output buffer. For 4-byte UTF-8 chars, the line should be: mUcs4 = (mUcs4 << 18) & 0x001C0000L; Near the end of the method, res is set to NS_OK_UDEC_MOREINPUT even if it had been set to NS_ERROR_UNEXPECTED earlier in the code. It should not overwrite res in this way. "res" is initialized to "nsnull". This should be NS_OK.
Reporter | ||
Comment 1•24 years ago
|
||
Please ignore the part about ORing and ADDing. ORing is fine (and faster).
Reporter | ||
Comment 2•24 years ago
|
||
The "if(mUcs4 >= 0x001F0000)" bug occurs in several files in Mozilla. Maybe all of these should be fixed as part of this bug.
Comment 3•24 years ago
|
||
This probably mean 3 places /intl/uconv/ucvlatin/nsUTF8ToUnicode.cpp /xpcom/ds/nsString2.cpp /xpcom/ds/nsTextFormatter.cpp The comment of changing 0x001F0000 to 0x00110000 is correct. The comment of out++ twice should only apply to nsUTF8ToUnicode.cpp since we guarantee the buffer size in nsString2.cpp and nsTextFormatter.cpp
Comment 4•24 years ago
|
||
erik- can you help to fix it? Thanks Mark it as M23
Assignee: cata → erik
Status: ASSIGNED → NEW
Target Milestone: M17 → M23
Comment 5•24 years ago
|
||
Take this bug back. Mark it as future
Assignee: erik → ftang
Target Milestone: --- → Future
Comment 7•23 years ago
|
||
shanjian- can you take a look at this old bug. Is this still valid? if not , please close it. if yes, please file individual bugs for each cases. Thanks
Assignee: ftang → shanjian
Status: ASSIGNED → NEW
Target Milestone: Future → ---
Assignee | ||
Updated•22 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•22 years ago
|
||
/intl/uconv/ucvlatin/nsUTF8ToUnicode.cpp has been taken care of. /xpcom/ds/nsString2.cpp (It should be nsString2.h), will be taken care of in bug 134035. /xpcom/ds/nsTextFormatter.cpp will be taken care of by this bug.
Assignee | ||
Comment 9•22 years ago
|
||
Comment 10•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review] patch r=ftang
Attachment #82258 -
Flags: review+
Assignee | ||
Comment 11•22 years ago
|
||
blizzard, could you sr?
Comment 12•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review] patch sr=blizzard
Attachment #82258 -
Flags: superreview+
Assignee | ||
Comment 13•22 years ago
|
||
fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 14•22 years ago
|
||
I suspect the checkin for this bug to have caused the regression in bug 164771. Could you verify this?
Comment 15•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review] patch I just verified that his fix is causing bug 164771, as arthur suggested, which is a serious copy/paste regression.
Attachment #82258 -
Flags: needs-work+
Comment 16•22 years ago
|
||
Reopening this bug. The patch needs to be backed out, or augmented to fix the copy/paste problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 17•22 years ago
|
||
This is hard for me to understand why this simple change cause the problem. I could not reproduce it on my box. When kind of text did you try to copy?
Status: REOPENED → ASSIGNED
Assignee | ||
Comment 18•22 years ago
|
||
Just verified with kin that the bug 164771 is not caused by this patch. Mark as fixed.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Comment 19•22 years ago
|
||
Comment on attachment 82258 [details] [diff] [review] patch Ok, I screwed up on my backout of shanjian's changes. I tested the backout with the wrong build. Backing out shanjian's patch did *not* fix the problem. Appologies to shanjian.
Attachment #82258 -
Flags: needs-work+
Comment 20•22 years ago
|
||
batch: adding topembed per Gecko2 document http://rocknroll.mcom.com/users/marek/publish/Gecko/Gecko2Tasks.html
Keywords: topembed
Comment 21•22 years ago
|
||
Shanjian, how can I verify this?
Assignee | ||
Comment 22•22 years ago
|
||
Profile name resolution use this code. Try it and see if it works.
You need to log in
before you can comment on or make changes to this bug.
Description
•