Closed Bug 142625 Opened 24 years ago Closed 24 years ago

Changing character coding menu in XML file causes NS crash

Categories

(Core :: XML, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: teruko, Assigned: harishd)

References

()

Details

(Keywords: crash, intl, Whiteboard: [adt2 RTM] [Fixed on the trunk 05/21 and branch 05/22],custrtm-)

Attachments

(1 file, 3 obsolete files)

After I changed the character coding menu in XML file, Netscape will crash. Steps of reproduce 1. Go to above URL 2. Select menu View|Character coding Japanese (ISO-2022-JP is marked 3. Select menu View|Character coding -> Auto-Detect (Universal) 4. Select menu View|Character coding -> Auto-Detect (off) Netscape crashed. Tested 5-06-2002 1.00 Win32 build. Talkback incident number 5999133 MSVCRT.DLL + 0x1240c (0x7801240c) nsWritingIterator::write [..\..\dist\include\string\nsStringIterator.h, line 360] copy_string [..\..\dist\include\string\nsAlgorithm.h, line 92] nsAString::do_AppendFromReadable [d:\builds\seamonkey\mozilla\string\src\nsAString.cpp, line 363] nsAString::do_AppendFromElementPtrLength [d:\builds\seamonkey\mozilla\string\src\nsAString.cpp, line 375] nsAutoString::nsAutoString [d:\builds\seamonkey\mozilla\string\obsolete\nsString2.cpp, line 1199] nsXMLContentSink::HandleDoctypeDecl [d:\builds\seamonkey\mozilla\content\xml\document\src\nsXMLContentSink.cpp, line 1874] nsExpatDriver::HandleEndDoctypeDecl [d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatDriver.cpp, line 462] doProlog [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 2387] prologProcessor [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 2249] XML_ParseBuffer [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 965] XML_Parse [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 955] nsExpatDriver::ParseBuffer [d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatDriver.cpp, line 666] nsExpatDriver::ConsumeToken [d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatDriver.cpp, line 776] nsParser::Tokenize [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2507] nsParser::ResumeParse [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 1731] nsParser::OnDataAvailable [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2369] nsDocumentOpenInfo::OnDataAvailable [d:\builds\seamonkey\mozilla\uriloader\base\nsURILoader.cpp, line 244] nsStreamListenerTee::OnDataAvailable [d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerTee.cpp, line 57] nsHttpChannel::OnDataAvailable [d:\builds\seamonkey\mozilla\netwerk\protocol\http\src\nsHttpChannel.cpp, line 2959] nsOnDataAvailableEvent::HandleEvent [d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerProxy.cpp, line 203] PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 597] PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 530] _md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 1078] nsAppShellService::Run [d:\builds\seamonkey\mozilla\xpfe\appshell\src\nsAppShellService.cpp, line 451] main1 [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1473] main [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1808] WinMain [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1826] WinMainCRTStartup() KERNEL32.DLL + 0x7903 (0x77e57903)
Keywords: crash
QA Contact: ruixu → teruko
Keywords: nsbeta1+
Whiteboard: [adt2]
Keywords: intl
I haven't got this crash on 05-08 1.0.0 build on WinME-JA and WinXP-SC yet, the crash didn't happen to me when tried some other XML pages neither.
It doesn't crash on my EN W2k with default locale set to Ja. And it also doesn't crash on my Simplified Chinese XP.
If you go to above url or other XML files under http://babel/tests/browser/xml/xml-charset/ Then, after you change the Character coding menu several times, NS will crash. The original reproduciable steps is not 100% reproduciable. I tested this with 5-09-1.00 Win32 build on Win98J. After I went to http://babel/tests/browser/xml/xml-charset/pr-xml-big-endian.xml 1. Select menu View|Character coding->Auto-Detect (Universal) 2. Select menu View|Character coding->Auto-Detect (off) 3. Select menu View|Character coding->UTF-8 NS crashed. This is not 100% reproduciable steps. However, I got the same talkback report. Talkback incident #6111155 MSVCRT.DLL + 0x1020b (0x7801020b) nsWritingIterator<unsigned short>::write [..\..\dist\include\string\nsStringIterator.h, line 360] copy_string [..\..\dist\include\string\nsAlgorithm.h, line 92] nsAString::do_AppendFromReadable [d:\builds\seamonkey\mozilla\string\src\nsAString.cpp, line 363] nsAString::do_AppendFromElementPtrLength [d:\builds\seamonkey\mozilla\string\src\nsAString.cpp, line 375] nsAutoString::nsAutoString [d:\builds\seamonkey\mozilla\string\obsolete\nsString2.cpp, line 1199] nsXMLContentSink::HandleDoctypeDecl [d:\builds\seamonkey\mozilla\content\xml\document\src\nsXMLContentSink.cpp, line 1874] nsExpatDriver::HandleEndDoctypeDecl [d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatDriver.cpp, line 462] doProlog [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 2387] prologProcessor [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 2249] XML_ParseBuffer [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 965] XML_Parse [d:\builds\seamonkey\mozilla\expat\xmlparse\xmlparse.c, line 955] nsExpatDriver::ParseBuffer [d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatDriver.cpp, line 666] nsExpatDriver::ConsumeToken [d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatDriver.cpp, line 776] nsParser::Tokenize [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2507] nsParser::ResumeParse [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 1731] nsParser::OnDataAvailable [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2369] nsDocumentOpenInfo::OnDataAvailable [d:\builds\seamonkey\mozilla\uriloader\base\nsURILoader.cpp, line 244] nsHttpChannel::OnDataAvailable [d:\builds\seamonkey\mozilla\netwerk\protocol\http\src\nsHttpChannel.cpp, line 2965] nsOnDataAvailableEvent::HandleEvent [d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerProxy.cpp, line 203] PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 597] PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 530] _md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 1078] KERNEL32.DLL + 0x24497 (0xbff94497) 0x00688bfa
setting milestone to moz1.0 to keep this under my radar
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.0
I GUESS the problem is here 446 nsExpatDriver::HandleEndDoctypeDecl() 447 { 448 NS_ASSERTION(mSink, "content sink not found!"); 449 450 harishd 3.10 const PRUnichar* doctypeStart = mBuffer + ( mDoctypePos - mBytesParsed ) / 2; 451 const PRUnichar* doctypeEnd = mBuffer + ( XML_GetCurrentByteIndex(mExpatParser) - mBytesParsed ) / 2; 452 453 harishd 3.1 if(mSink) { 454 rbs 3.18 // let the sink know any additional knowledge that we have about the document 455 // (currently, from bug 124570, we only expect to pass additional agent sheets 456 // needed to layout the XML vocabulary of the document) 457 nsIURI* data = nsnull; 458 if (mCatalogData && mCatalogData->mAgentSheet) { 459 NS_NewURI(&data, mCatalogData->mAgentSheet); 460 } 461 mInternalState = mSink->HandleDoctypeDecl(doctypeStart, doctypeEnd - doctypeStart, data); 462 NS_IF_RELEASE(data); 463 harishd 3.1 } 46 not 100% sure. but does "doctypeEnd - doctypeStart" return the number of PRUnichar or the size of that ? (which is twice the number of PRUnichar. I know the mSink->HandleDoctypeDecl is expecting the number of PRUnichar, but I am not sure "doctypeEnd - doctypeStart" did that work. should that be "(doctypeEnd - doctypeStart)/sizof(PRUnichar)" instead ? over harishd.
Assignee: yokoyama → harishd
Status: ASSIGNED → NEW
Component: Internationalization → XML
Debugging revealed that the problem is due to a large DOCTYPE, handed over to expat in chunks, that causes buffer offset to go haywire. This problem may be solved with a rewind mechanism in nsExpatDriver. Will have to investigate this option further. FYI: This is a very bad bug and should be fixed for rtm.
Status: NEW → ASSIGNED
Whiteboard: [adt2] → [adt2 RTM]
Blocks: 143200
Attached patch patch v1.0 (obsolete) — Splinter Review
This patch would ensure the complete consumption of a DOCTYPE ( even if the DOCTYPE provided to the expat driver in chunks ). The fix: mDoctypeStart holds on to the DOCTYPE start position in the scanner, thereby ensuring the exisitance of the underlying sliding string, and when HandleEndDoctype()gets called everthing between mDoctypeStart and the current position, i.e., the doctype content, is copied into an autostring. This copy is |not| an additional overhead because the same copy used to happen in the content sink and now it is moved down to nsExpatDriver. FYI: I tried replacing nsString with nsAString, however, there are a few helper methods in the xml content sink that still rely on nsString methods. Will open up a new bug on that. Btw, I'm not sure how/when the iterator ( mDoctypeStart ) would give up the ownership.
Comment on attachment 84368 [details] [diff] [review] patch v1.0 timeless suggested adding a level of indirection: a pointer member would hold the iterator (allocated on the heap). Then, after copying the doctype string we just delete the pointer, which will also free the iterator. Seems like a hack but if we can't figure out anything else that should do it. I also don't like the fact that the sink's HandleDoctypeDecl() parameter is not |const|, although I see you use that to avoid an extra copy. I guess I could live with that until we change the APIs, but even then there should be a gigantic comment describing what is going to happen to the parameter. It looks like nsHTMLContentSink.cpp changes belong to another bug.
Attached patch patch v1.0 (obsolete) — Splinter Review
A slight modification ( replaced nsString parameter with nsAString iterators ) to my previous patch.
Attachment #84368 - Attachment is obsolete: true
Keywords: review
Whiteboard: [adt2 RTM] → [adt2 RTM] [Needs r/sr/a=]
Attached patch patch v1.2 (obsolete) — Splinter Review
Have added a memeber, mDoctypeText ( nsString ), in nsExpatDriver to collect the doctype content. Even though it is not very efficient it is safe ( unlike iterators relying on sliding strings ). Also, in an effort to keep the API's, in nsIExpatSink, alike I've moved some code from the content sink into nsExpatDriver.
Attachment #84400 - Attachment is obsolete: true
Comment on attachment 84522 [details] [diff] [review] patch v1.2 >Index: htmlparser/public/nsIExpatSink.idl >=================================================================== >- void HandleDoctypeDecl(in wstring aDoctype, >- in unsigned long aLength, >+ void HandleDoctypeDecl(in wstring aSubset, >+ in wstring aName, >+ in wstring aSystemId, >+ in wstring aPublicId, Shouldn't these be ns*String parameters? We have the strings in expat driver, and getting the underlying buffer just to do a dependent string from it again seems a waste of effort... >Index: htmlparser/src/nsExpatDriver.cpp >=================================================================== >+static const char* kWhitespace = " \r\n\t\b"; // Optimized for typical cases Actually you should remove '\b' character, it is not legal whitespace in XML 1.0. >+ static const char* kDelimiter = " >[\r\n\t\b"; // Optimized for typical cases '\b' here can be removed. >+ mInternalState = mSink->HandleDoctypeDecl(mDoctypeText.get(), >+ name.get(), >+ systemId.get(), >+ publicId.get(), >+ data); >- mDoctypePos = kNotInDoctype; >+ mDoctypeText.Truncate(); I don't remember if Truncate() actually releases the underlying buffer, maybe it was SetLength(0). But I guess it does not matter, this will be released once the driver goes away (after document loaded). So I don't think you even need to Truncate(), it is just wasted effort. (Expat will make sure we'll get an error if we see another DOCTYPE, for example.) >Index: htmlparser/src/nsExpatDriver.h >=================================================================== >Index: content/xml/document/src/nsXMLContentSink.cpp >=================================================================== >Index: content/xul/document/src/nsXULContentSink.cpp >=================================================================== >Index: rdf/base/src/nsRDFContentSink.cpp >===================================================================
Attachment #84522 - Flags: needs-work+
Attachment #84522 - Attachment is obsolete: true
I literally JUST tweaked Left/Right/Mid on the branch about an hour ago - just a heads up to make sure you update mozilla/string before landing this (the changes were already on the trunk, so if this patch is against the trunk you're fine)
Comment on attachment 84550 [details] [diff] [review] patch v1.3 [ addresses heikki's comments ] r=heikki
Attachment #84550 - Flags: review+
alecf: The patch is based on the trunk. Thanks for the heads up anyway.
Comment on attachment 84550 [details] [diff] [review] patch v1.3 [ addresses heikki's comments ] Call SetCapacity(0) on the doctype string once you're done with it. sr=jst
Attachment #84550 - Flags: superreview+
Keywords: adt1.0.0, approval
Whiteboard: [adt2 RTM] [Needs r/sr/a=] → [adt2 RTM] [Needs a=]
Fix landed ( 05/21 ) on the trunk.
Whiteboard: [adt2 RTM] [Needs a=] → [adt2 RTM Fixed on the trunk 05/21] [Needs a=]
Teruko - Can you verify this on the trunk?
Jaime, I will test this on trunk build today.
I tested this on 5-22 Win32 trunk build. This workes fine on trunk.
I tested this on 5-22 Linux trunk build. This workes fine on trunk also.
I tested this on 5-22-09 MacOS 8.x trunk build. This works fine.
Comment on attachment 84550 [details] [diff] [review] patch v1.3 [ addresses heikki's comments ] a=chofmann,brendan,shaver
Attachment #84550 - Flags: approval+
Fix landed ( 05/22 ) on the branch.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Whiteboard: [adt2 RTM Fixed on the trunk 05/21] [Needs a=] → [adt2 RTM Fixed on the trunk 05/21 and branch 05/22] [Needs a=]
Teruko verified on trunk. Also adding fixed1.0.0 keyword.
Status: RESOLVED → VERIFIED
Keywords: fixed1.0.0
Whiteboard: [adt2 RTM Fixed on the trunk 05/21 and branch 05/22] [Needs a=] → [adt2 RTM] [Fixed on the trunk 05/21 and branch 05/22]
Keywords: adt1.0.0adt1.0.0+
Whiteboard: [adt2 RTM] [Fixed on the trunk 05/21 and branch 05/22] → [adt2 RTM] [Fixed on the trunk 05/21 and branch 05/22],custrtm-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: