Closed Bug 86665 Opened 23 years ago Closed 23 years ago

N620 crash [@ nsExpatTokenizer::GetLine]

Categories

(Core :: XML, defect, P1)

x86
Windows NT
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: greer, Assigned: hjtoi-bugzilla)

Details

(Keywords: crash, topcrash, Whiteboard: [X-Files][can't reproduce])

Crash Data

This one just showed up in the last day or so in the talkback reports. It shows 
a lot of users crashing at startup. We don't have a lot of info yet. (cf. bug 
82332.)

nsExpatTokenizer::GetLine   21 
     First BBID :31823111
     Last BBID  :31900266
     Min Runtime :0
     Max Runtime :6731
     First Appearance Date : 2001-06-17
     Last Appearance Date : 2001-06-19
     First BuildID : 2001060713
     Last BuildID : 2001060713

Stack Trace: 

         nsExpatTokenizer::GetLine
[d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatTokenizer.cpp  line 258]
         nsExpatTokenizer::PushXMLErrorTokens
[d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatTokenizer.cpp  line 413]
         nsExpatTokenizer::ParseXMLBuffer
[d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatTokenizer.cpp  line 441]
         nsExpatTokenizer::ConsumeToken
[d:\builds\seamonkey\mozilla\htmlparser\src\nsExpatTokenizer.cpp  line 491]
         nsParser::Tokenize
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp  line 2435]
         nsParser::ResumeParse
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp  line 1871]
         nsParser::OnDataAvailable
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp  line 2329]
         nsJARChannel::OnDataAvailable
[d:\builds\seamonkey\mozilla\netwerk\protocol\jar\src\nsJARChannel.cpp  line 608]
         nsOnDataAvailableEvent::HandleEvent
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerProxy.cpp  line 183]
         PL_HandleEvent
[d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c  line 591]
         PL_ProcessPendingEvents
[d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c  line 524]
         _md_EventReceiverProc
[d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c  line 1072]
 
        Source File : http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/htmlparser/src/nsExpatTokenizer.cpp line : 258
     (31877784) Comments: first start of the browser after 3th installation (full)
     (31875159) URL: www.itv-f1.com
     (31875159) Comments: Pictures do not appear. Text is huge.
     (31868695) Comments: It did not even launch!!!!!!!!!!!!!!1
     (31860586) Comments: Failed during the program startup.
     (31859769) Comments: The first start of Netscape6.10B1
     (31826015) Comments: When starting program on Windows2000
Adding the crash and topcrash keywords, plus qawanted for help reproducing this.
Keywords: crash, qawanted, topcrash
Looking at the stack it looks like the users might have a corrupted/old/wrong
chrome package (bad XML file), which takes us into the XML error processing
code. Looking at the code I would guess we are reading beyond a buffer
start/end, and crash. This will probably not crash for everyone, even with the
same corrupted chrome (if that is the reason). If my analysis is correct, I
still don't know why we would end up reading beyond buffer boundaries...
Status: NEW → ASSIGNED
Priority: -- → P1
I have one idea that might hide the real problem, and prevent the crash. It is
not the correct fix, since I do not know what is causing the condition or if the
condition is null buffer or out of bounds read.

Anyway, if the buffer we get from Expat happens to be empty, we do not bail out.
We have assertions that you would hit in debug builds, but that is all. So, if
the buffer happened to be null it would be easy to bail out. The buffer
shouldn't be null, though.

The change is in nsExpatTokenizer.cpp, line 435:

-  if (mExpatParser) {
+  if (mExpatParser &&
+      ( (aBuffer && aLength) || (aBuffer == nsnull && aLength == 0) )) {
Well, the patch in my previous comment would not fix it even if we had a null
string; simply bailing out of GetLine() when !aSourceBuffer would do it.
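
To make the bail-out idea concrete, here is a minimal standalone sketch of a guarded line lookup. It is not the actual nsExpatTokenizer::GetLine() code: the function name GetLineAt, its signature, and the use of std::string are assumptions made for illustration only. It returns an empty line for a null or empty buffer and never scans outside [0, len).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for GetLine(); the name and signature are assumptions,
// not the real Mozilla API. Returns the line of `buf` that contains the byte
// at `offset`, or an empty string when the input cannot be used safely.
std::string GetLineAt(const char* buf, std::size_t len, std::size_t offset) {
    // Bail out in release builds too, instead of relying on assertions alone.
    if (buf == nullptr || len == 0 || offset >= len) {
        return std::string();
    }

    // Walk backwards to the start of the line, never below index 0.
    std::size_t start = offset;
    while (start > 0 && buf[start - 1] != '\n' && buf[start - 1] != '\r') {
        --start;
    }

    // Walk forwards to the end of the line, never past len.
    std::size_t end = offset;
    while (end < len && buf[end] != '\n' && buf[end] != '\r') {
        ++end;
    }

    // Debug-only sanity check on the computed range.
    assert(start <= end && end <= len);
    return std::string(buf + start, end - start);
}
```

Whether the real crash comes from a null buffer or from an offset outside the buffer is still unknown; the sketch guards against both, which would hide the symptom but not explain the cause.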

After digging through the talkback reports I found a few more clues:

1) Creating a new profile 2) delete this profile 3) Selecting an old one 4)
Launching N6 with this old profile. That's all ! => standard mail not working
wirh N6 !! Message sent : XML Parsing Error : undefined entity Location:
chrome://messenger/content/messenger.xul line number 128,column 24: <menuitem
label="&copyEmailAddress.label;" 

Notice the lower case first letter in copyEmailAddress.label. In my installation
it is uppercase, and in the source tree it is uppercase. Still, it should not
crash, of course.
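
As a rough illustration of why that one-letter difference matters: XML names are case-sensitive, so an entity reference only resolves if it matches the declared name exactly. The sketch below models the entity table as a plain std::map; the helper names and the label text "Copy Email Address" are hypothetical, since the real DTD contents are not shown in this report.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a DTD entity table: entity name -> replacement text.
using EntityTable = std::map<std::string, std::string>;

// Returns true only on an exact, case-sensitive match of the entity name.
bool IsEntityDefined(const EntityTable& entities, const std::string& name) {
    return entities.find(name) != entities.end();
}

// Usage example: the upper-case name resolves, the lower-case typo does not,
// which is the condition a parser reports as "undefined entity".
void DemoEntityTypo() {
    EntityTable entities;
    entities["CopyEmailAddress.label"] = "Copy Email Address";  // assumed text

    assert(IsEntityDefined(entities, "CopyEmailAddress.label"));
    assert(!IsEntityDefined(entities, "copyEmailAddress.label"));
}
```

A mismatch like this should only lead to the XML error path, which is why the crash itself still needs a separate explanation.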

When I made the typo in my source tree and tried to open the mailnews window, I
get an error window with the same text as above, but no crash. I'll check with
Purify if it is reporting something weird going on.
Target Milestone: --- → mozilla0.9.3
Nope, Purify thinks everything is OK even with that typo.
I'll still try for 0.9.2...
Target Milestone: mozilla0.9.3 → mozilla0.9.2
Whiteboard: can't reproduce
Looking at the Talkback data, it seems only Netscape 6.1b1 has crashed with this
signature. There are actually two different stacks in the crash reports, the
other one shows a deeper stack, with the crash reported on line 265 or so.

Also, some talkback comments suggest they saw an XML parser error in
messenger.xul, namely with entity "copyEmailAddress.label". There is no such
entity in the current version (the current version starts with upper case C). It
is of course possible that the user made a typo when s/he wrote the comment. The
last change in that file was May 19 (and that specific line was changed in
April, though the entity name did not change). There have been no changes in
nsExpatTokenizer.cpp that seem relevant (May 19 and June 2 are the latest).

I have made the typo in messenger.xul both in NS6.1b1 and current Mozilla and I
still can't reproduce, and Purify still thinks everything is ok (no out of
bounds reads or writes or stuff like that).

I have a few theories: the users are upgrading from NS 6 or some Mozilla
version, and the chrome has gotten whacked; the users have installed different
language packs that messed up their chrome; the users have installed some third
party packages that messed up their chrome; or the users have downloaded
corrupted bits from us. Even then, it would still seem unlikely that we would crash...
Filing in the X-Files folder. Send Mulder to investigate, and Scully to kill
this Bug.
Whiteboard: can't reproduce → [X-Files][can't reproduce]
And still a bit more info dug out from Talkback, mainly confirming that the
talkback reports have only come from Windows (NT 4 and Win2k mostly, one Win98).
About 60 reports so far.
No progress, moving to next milestone.
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Markus, since this seems to be important to you, can you reproduce this bug? I
am really stumped here...
Well, what I can tell so far:
I have been using mozilla build on a regular basis ... everything was fine 
since I installed the Netscape 6.1PR1
I always delete the whole directory and install the mozilla (talkback) version 
into it ... you must have received several talkback reports in the last minute.

What is strange is that I did the same on another workstation, and there mozilla
starts up without any problems. 8-)

Both workstations are running on win2k.
Sorry, that I cannot provide you with more details ... but if you need any info 
(registry, a file somewhere on the disk, etc.) just let me know.

This might be somehow related to bug 63472 or vice versa?!
any relation to bug 63472 or bug 59116 ?
See this too.
I previously had mozilla installed ... build from around end of June.
Then installed Netscape PR1. 
I could use both parallel, everything was fine.

Today I just downloaded build 2001070904 and on start-up it crashes every time!
I'm on Win2k Pro SP2.
I think the summary should be changed to
M092 startup crash [@ nsExpatTokenizer::GetLine]

Any comments?
Just started build 2001071408 and it worked!!! :)

Does anybody know why?
Alexander, did you install a Talkback enabled build when you installed build
2001070904?  Looking at the Talkback data, I only see this crash occurring with
Mozilla 0.9.1/Netscape6.10B1 build 2001060713.  There aren't any crashes with
recent builds in the Talkback database (not Mozilla 0.9.2, MozillaTrunk or
Netscape6.10 branch).

I'm guessing this was an M091-specific crash that is no longer an issue with
newer builds, so unless someone can reproduce this with a build from either the
MozillaTrunk or the Netscape6.10 branch, perhaps we should mark this one worksforme?
Ok, I am gonna mark this worksforme. If it raises its head in newer builds
(>NS6.1b1), feel free to reopen.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
reopening bug...since I see quite a few of these crashes with Netscape 6.20. 
Changing summary as well.  Here are a few recent crashes with N620:

  38170875     2001102218   Netscape6.20   Windows 98 4.10 build 67766222  
2001-11-17 18:36:26   nsExpatTokenizer::GetLine f009dba2   506   15019   
URL: http://www.liveupdate.com/   
Comments: Went to site to check if Crescendo would work with Netscape 6.2 I
always get an invalid character in configuration when 6.2 starts. This isn't
looking good.

  38170555     2001102218   Netscape6.20   Windows 98 4.10 build 67766222  
2001-11-17 18:26:59   nsExpatTokenizer::GetLine f009dba2   20   14513     
Comments: Browser was launching.

  38057773     2001102218   Netscape6.20   Windows 95 4.0 build 67109814  
2001-11-15 10:31:48   nsExpatTokenizer::GetLine 1523b19e   45   12374   
URL: canadiangeographic.com   
Comments: waiting for page to load

  37693021     2001102218   Netscape6.20   Windows NT 4.0 build 1381  
2001-11-07 07:43:25   nsExpatTokenizer::GetLine c4e7cd8d   39   39     
Comments: netscape crashed when opening the first time. I had a message with an
error with memory.

  37666825     2001102218   Netscape6.20   Windows 98 4.90 build 73010104  
2001-11-06 17:38:57   nsExpatTokenizer::GetLine 140c9bdb   100   2216     
Comments: Netscp6 has caused an error in GKPARSER.DLL

Those are just a few incidents with helpful comments...there are many others.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Summary: M091 startup crash [@ nsExpatTokenizer::GetLine] → N620 crash [@ nsExpatTokenizer::GetLine]
Target Milestone: mozilla0.9.3 → ---
If those users are seeing this when NS6.2 launches, they have broken
installations or something. Obviously we shouldn't crash, but the crash location
is in a code path that only gets called when there is an XML parsing error, i.e.
non-wellformed XML. Since it happens at launch, the error is in their XUL;
we are certainly not shipping broken XUL. They could have played with Mozilla,
or installed some nonstandard packages or something weird like that.

Anyway, a totally new XML parsing architecture landed on the trunk last
week. Please see if this comes up again after 01/12 Mozilla trunk builds and
reopen if it does. Closing as worksforme.
Status: REOPENED → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
QA Contact: petersen → rakeshmishra
Crash Signature: [@ nsExpatTokenizer::GetLine]
Keywords: qawanted