Bug 396436
Opened 17 years ago
Closed 3 years ago
HTTP parser fails to recognize an utf-8 broken at the edge of given conversion buffer
Categories
(Core :: DOM: HTML Parser, defect, P5)
RESOLVED
WORKSFORME
People
(Reporter: buniofh, Unassigned)
Details
Attachments
(2 files)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Custom build)

While converting a UTF-8 encoded HTML page with embedded JavaScript to UTF-16 (see attachment), an error is returned when nsScanner encounters a multibyte character split across the end of one conversion buffer and the start of the next (this can be seen in nsNativeUConvService). A few characters may be misinterpreted at that point, which causes errors in script parsing. The bug was seen on PC/Linux, although code inspection shows that it may affect all platforms and operating systems. Build config attached.

Reproducible: Always

Steps to Reproduce:
1. Run the attached test on Mozilla compiled with the attached flags.

Actual Results:
Script parsing crashed. Hitting the 'zonk' key produces a small report stating exactly which script commands were ignored or mis-parsed.

Expected Results:
The failure is visible in the browser window as broken text. A good result shows a button and a frame filled with a single character repeated 1000 times. A bad result shows parts of the code within the frame and breaks script execution.
Two files:
test.html - the test
test2.html - the contents of the frame
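To make the failure condition in the description concrete, here is a minimal sketch of how a caller could detect that a chunk of UTF-8 input ends in the middle of a multibyte character. This is not Mozilla code: `expectedLength` and `incompleteTailLength` are hypothetical names chosen for illustration, and the lead-byte ranges are the standard UTF-8 ones (C2..DF, E0..EF, F0..F4).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes a UTF-8 sequence should have, judged from its lead byte.
// Returns 0 for continuation bytes and for bytes that can never start a sequence.
static std::size_t expectedLength(unsigned char lead) {
    if (lead < 0x80) return 1;                   // ASCII
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

// How many bytes at the end of `chunk` belong to a multibyte sequence whose
// remaining bytes have not arrived yet. 0 means the chunk ends on a character
// boundary, or ends in bytes that are simply invalid (a different error).
std::size_t incompleteTailLength(const std::string& chunk) {
    const std::size_t n = chunk.size();
    // A lead byte can sit at most 3 positions from the end and still be incomplete.
    for (std::size_t back = 1; back <= 3 && back <= n; ++back) {
        const unsigned char b = static_cast<unsigned char>(chunk[n - back]);
        if ((b & 0xC0) == 0x80) continue;        // continuation byte, keep looking
        const std::size_t need = expectedLength(b);
        const std::size_t tail = (need > back) ? back : 0;
        assert(tail <= 3);
        return tail;
    }
    return 0;
}
```

With a small network buffer, a non-zero result on some chunk is close to inevitable for a page containing multibyte characters, which is presumably why the test page triggers the problem so reliably.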
please file bugs like this against the appropriate Core component. people on irc.mozilla.org can help, but so can bonsai.mozilla.org/cvsblame.cgi?file=... (just look through bugs against the relevant file to see where they live). and please file bugs based on trunk code, not based on a branch.
Component: General → HTML: Parser
Product: Firefox → Core
QA Contact: general → parser
Version: unspecified → 1.8 Branch
point taken. even though the mozconfig was in fact used to build minimo, the code has been checked against mozilla trunk and the bug still shows (the flag --enable-necko-small-buffers is to be suspected). that's why i reported it as general.

fix that takes care of the problem (credit to E. Mironov):

--- mozilla/intl/uconv/native/nsNativeUConvService.cpp 14 Mar 2006 08:35:00 -0000 1.1.1.1
+++ mozilla/intl/uconv/native/nsNativeUConvService.cpp 18 Sep 2007 14:02:17 -0000 1.2
@@ -307,7 +307,13 @@
                 res = 0;
                 break;
             }
-
+
+            if(errno == EINVAL)
+            {
+                res = 0;
+                break;
+            }
+
             if (errno == EILSEQ) {
                 if (mReplaceOnError) {
--- mozilla/parser/htmlparser/src/nsScanner.cpp 14 Mar 2006 08:38:29 -0000 1.1.1.1
+++ mozilla/parser/htmlparser/src/nsScanner.cpp 18 Sep 2007 14:03:08 -0000 1.2
@@ -345,6 +345,14 @@
   nsresult res=NS_OK;
   PRUnichar *unichars, *start;
   if(mUnicodeDecoder) {
+    int spareBufferLen = spareBuffer.Length();
+    if(spareBufferLen > 0)
+    {
+      spareBuffer.Append(aBuffer, aLen);
+      aLen += spareBufferLen;
+      aBuffer = spareBuffer.get();
+    }
+
     PRInt32 unicharBufLen = 0;
     mUnicodeDecoder->GetMaxLength(aBuffer, aLen, &unicharBufLen);
     nsScannerString::Buffer* buffer = nsScannerString::AllocBuffer(unicharBufLen + 1);
@@ -358,6 +366,20 @@
       res = mUnicodeDecoder->Convert(aBuffer, &srcLength, unichars, &unicharLength);
       totalChars += unicharLength;
+
+      if((NS_OK == res) && (srcLength < aLen))
+      {
+        nsCString tmp;
+        tmp.Assign(aBuffer + srcLength, aLen - srcLength);
+        spareBuffer.Assign(tmp);
+        break;
+      }
+      else
+      if((srcLength == aLen) && (spareBuffer.Length() > 0))
+      {
+        spareBuffer.Cut(0, spareBuffer.Length());
+      }
+
       // Continuation of failure case
       if(NS_FAILED(res)) {
         // if we failed, we consume one byte, replace it with U+FFFD
@@ -370,7 +392,7 @@
           NS_ERROR("Unexpected end of destination buffer");
           break;
         }
-
+
         unichars[unicharLength++] = (PRUnichar)0xFFFD;
         unichars = unichars + unicharLength;
         unicharLength = unicharBufLen - (++totalChars);
--- mozilla/parser/htmlparser/src/nsScanner.h 14 Mar 2006 08:38:29 -0000 1.1.1.1
+++ mozilla/parser/htmlparser/src/nsScanner.h 18 Sep 2007 14:03:08 -0000 1.2
@@ -403,6 +403,8 @@
   nsCString mCharset;
   nsIUnicodeDecoder *mUnicodeDecoder;
   nsParser *mParser;
+  private:
+    nsCString spareBuffer;
 };
 #endif
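The idea behind the patch (keep the bytes the decoder could not consume and prepend them to the next buffer) can be sketched outside the Mozilla tree in standard C++. The class below is an illustration only, not the patch itself: `Utf8ChunkJoiner`, `feed`, `hasPending` and `spare_` are hypothetical names, and it cuts at UTF-8 character boundaries by inspecting lead bytes instead of relying on a decoder's reported `srcLength`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not Mozilla code: holds back an incomplete trailing
// UTF-8 sequence so that every returned string ends on a character boundary.
class Utf8ChunkJoiner {
public:
    // Prepends bytes held back by the previous call, then returns the longest
    // prefix that does not stop in the middle of a multibyte sequence.
    std::string feed(const std::string& chunk) {
        std::string data = spare_ + chunk;
        const std::size_t tail = incompleteTail(data);
        assert(tail <= 3 && tail <= data.size());
        spare_.assign(data, data.size() - tail, tail);  // keep the unfinished bytes
        data.resize(data.size() - tail);
        return data;
    }

    // True if bytes are still waiting for the rest of their character.
    bool hasPending() const { return !spare_.empty(); }

private:
    // Length of an unfinished multibyte sequence at the end of `s`, or 0.
    static std::size_t incompleteTail(const std::string& s) {
        const std::size_t n = s.size();
        for (std::size_t back = 1; back <= 3 && back <= n; ++back) {
            const unsigned char b = static_cast<unsigned char>(s[n - back]);
            if ((b & 0xC0) == 0x80) continue;           // continuation byte
            std::size_t need = 1;                       // ASCII or invalid lead
            if (b >= 0xC2 && b <= 0xDF) need = 2;
            else if (b >= 0xE0 && b <= 0xEF) need = 3;
            else if (b >= 0xF0 && b <= 0xF4) need = 4;
            return need > back ? back : 0;
        }
        return 0;
    }

    std::string spare_;  // plays the role of spareBuffer in the patch
};
```

For example, feeding `"a\xE2"` returns `"a"` and holds back one byte; feeding the remaining two bytes of the euro sign afterwards returns the complete three-byte character. A caller would presumably want to treat `hasPending()` being true at end of input as an error, since no further buffer will arrive to complete the character.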
Comment 5•3 years ago
Bulk-downgrade of unassigned, >=3 years untouched DOM/Storage bug's priority.
If you have reason to believe this is wrong, please write a comment and ni :jstutte.
Severity: major → S4
Priority: -- → P5
Comment 6•3 years ago
This bug has been traded for a bug that fails to produce an error if an XML file ends with an incomplete UTF-8 byte sequence.
Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME