Closed
Bug 137657
Opened 22 years ago
Closed 22 years ago
html content serializer and nsISaveAsCharset does not handle surrogate correctly
Categories
(Core :: Internationalization, defect, P3)
VERIFIED
FIXED
mozilla1.2alpha
People
(Reporter: ftang, Assigned: shanjian)
Details
(Keywords: intl, topembed, Whiteboard: [eta:8/5/2002])
Attachments
(1 file)
patch, 5.56 KB (ftang: review+, blizzard: superreview+)
I think our nsISaveAsCharset and/or our HTML content sink code does not handle surrogate pairs correctly.

Steps to reproduce:
1. Open http://people.netscape.com/ftang/testscript/gb18030/gbtext.cgi?page=596 . This page has some surrogate characters. (You will probably see them all displayed as question marks. That is OK.)
2. Select the fourth line and copy it.
3. Open "File:New:Blank page to edit".
4. Paste the surrogate characters in.
5. "File:Save" and save into a file.
6. Browse that file.
7. View the source of that file.

Actual behavior: the saved source uses two NCRs to encode one Unicode character. For example, if you copy the fourth line you will get
<pre>��������������������</pre>

Expected behavior: I should see
<pre>𠀄𠀅𠀆𠀇𠀈𠀉𠀊𠀋𠀌𠀍</pre>
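To make the report concrete, here is a minimal sketch of the difference between the two outputs. It assumes the first character on the copied line is U+20004 (the first character in the expected output) and uses standard C++ types instead of the Mozilla ones. The function names are hypothetical, and for simplicity every character is written as a decimal NCR, whereas the real code only falls back to an NCR for characters the target charset cannot represent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

bool IsHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
bool IsLowSurrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Combine a UTF-16 high/low surrogate pair into one UCS-4 code point.
std::uint32_t CombineSurrogates(char16_t high, char16_t low) {
  assert(IsHighSurrogate(high) && IsLowSurrogate(low));
  return 0x10000u + ((std::uint32_t(high) - 0xD800u) << 10) +
         (std::uint32_t(low) - 0xDC00u);
}

// Behavior described in the report: each 16-bit unit gets its own NCR,
// so a surrogate pair turns into two NCRs.
std::string NcrPerCodeUnit(const std::u16string& s) {
  std::string out;
  for (char16_t c : s) {
    out += "&#" + std::to_string(static_cast<unsigned>(c)) + ";";
  }
  return out;
}

// Expected behavior: a surrogate pair is combined first, giving one NCR.
std::string NcrPerCodePoint(const std::u16string& s) {
  std::string out;
  for (std::size_t i = 0; i < s.size(); ++i) {
    std::uint32_t cp = s[i];
    if (IsHighSurrogate(s[i]) && i + 1 < s.size() && IsLowSurrogate(s[i + 1])) {
      cp = CombineSurrogates(s[i], s[i + 1]);
      ++i;  // the low surrogate has been consumed as well
    }
    out += "&#" + std::to_string(cp) + ";";
  }
  return out;
}
```

For U+20004, the first function yields `&#55360;&#56324;` (the two surrogates 0xD840 and 0xDC04) and the second yields `&#131076;`.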
Comment 1 (Reporter) • 22 years ago
I think this will cause some problems with copy and paste through HTML. It could be the cause of the bug ji filed about copying and pasting surrogate characters from Mozilla to Microsoft Word XP.
Comment 2 (Reporter) • 22 years ago
Probably caused by intl/unicharutil/src/nsSaveAsCharset.cpp (lines 270-315):

  NS_IMETHODIMP
  nsSaveAsCharset::DoConversionFallBack(PRUnichar inCharacter, char *outString, PRInt32 bufferLength)
  {
  ..
    case attr_FallbackDecimalNCR:
      rv = (PR_snprintf(outString, bufferLength, "&#%u;", inCharacter) > 0) ? NS_OK : NS_ERROR_FAILURE;
      break;
    case attr_FallbackHexNCR:
      rv = (PR_snprintf(outString, bufferLength, "&#x%x;", inCharacter) > 0) ? NS_OK : NS_ERROR_FAILURE;
      break;

inCharacter is a single 16-bit PRUnichar, so each half of a surrogate pair is formatted as its own NCR.
Comment 3 (Reporter) • 22 years ago
We probably need to change the implementation of

  nsSaveAsCharset::DoCharsetConversion(const PRUnichar *inString, char **outString)

and change the interface of

  nsSaveAsCharset::HandleFallBack(PRUnichar character, char **outString, PRInt32 *bufferLength,
                                  PRInt32 *currentPos, PRInt32 estimatedLength)

to

  nsSaveAsCharset::HandleFallBack(PRUint32 inUCS4, char **outString, PRInt32 *bufferLength,
                                  PRInt32 *currentPos, PRInt32 estimatedLength)

and

  nsSaveAsCharset::DoConversionFallBack(PRUnichar inCharacter, char *outString, PRInt32 bufferLength)

to

  nsSaveAsCharset::DoConversionFallBack(PRUint32 inUCS4, char *outString, PRInt32 bufferLength)
Comment 5 (Reporter) • 22 years ago
I think this problem causes the copy-and-paste problem with OfficeXP when we paste it in as HTML. It will also cause problems when we save the page. shanjian, please look at this when you have time. We may need this for RTM (30% chance).
Target Milestone: --- → mozilla1.1alpha
Comment 6 • 22 years ago
nsISaveAsCharset is also used for message send, gfx seems to be using it too (for transliteration?)
Comment 7 (Assignee) • 22 years ago
Updated (Assignee) • 22 years ago
Status: NEW → ASSIGNED
Comment 9 (Reporter) • 22 years ago
Comment on attachment 90894: patch
r=ftang. The buffer size is guaranteed to be 256 by the caller, and the code also uses the right function to ensure there is no buffer overrun.
Attachment #90894 - Flags: review+
Comment 10 (Reporter) • 22 years ago
Let's get this into the trunk ASAP. nsbeta1+ for m1.2final
Comment 11 (Assignee) • 22 years ago
blizzard, could you sr?
Updated • 22 years ago
Attachment #90894 - Flags: superreview+
Comment 12 • 22 years ago
Comment on attachment 90894: patch
sr=blizzard on these bits. Are there any callers that need to be fixed because of the API changes?
Comment 13 (Assignee) • 22 years ago
fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 14 • 22 years ago
This commit has added a "may be used uninitialized" warning on the brad TBox:
+intl/unicharutil/src/nsSaveAsCharset.cpp:254
+ `PRUint32 unMappedChar' might be used uninitialized in this function
See also bug 59652 for more on these warnings.
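For readers unfamiliar with this class of warning: the code at line 254 is not quoted in this bug, so the following is only a hypothetical sketch of the usual pattern. A variable is assigned through an out-parameter on some paths only, and the compiler cannot prove it is set before it is read. Initializing it at its declaration is one common remedy. All names other than unMappedChar are made up, and this bug does not say how the warning was actually fixed.

```cpp
#include <cstdint>

// Hypothetical converter: writes *unmapped only when it returns false.
bool TryConvert(char16_t c, std::uint32_t* unmapped) {
  if (c < 0x80) {
    return true;  // "convertible"; *unmapped is left untouched
  }
  *unmapped = c;
  return false;
}

// Returns the first character that could not be converted, or 0 if none.
std::uint32_t FirstUnmapped(const char16_t* s) {
  // Initialized at declaration, so every read below sees a defined value
  // even on paths where TryConvert never wrote to it.
  std::uint32_t unMappedChar = 0;
  for (; *s != 0; ++s) {
    if (!TryConvert(*s, &unMappedChar)) {
      return unMappedChar;
    }
  }
  return unMappedChar;
}
```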
Comment 15 • 22 years ago
P.S. The warning was fixed in bug 165908.
Comment 16 • 22 years ago
batch: adding topembed per Gecko2 document http://rocknroll.mcom.com/users/marek/publish/Gecko/Gecko2Tasks.html
Keywords: topembed
Comment 17 • 22 years ago
Verified fixed in 09-12 trunk build / WinXP-SC.
Status: RESOLVED → VERIFIED