Closed Bug 52397 Opened 24 years ago Closed 24 years ago

N601 Crash #8 [@ nsHTTPServerListener::OnDataAvailable]

Categories

(Core :: Networking: HTTP, defect, P1)

x86
All
defect

VERIFIED FIXED
mozilla0.8

People

(Reporter: jwbaker, Assigned: darin.moz)

Details

(Keywords: crash, top100, topcrash, Whiteboard: [dogfood+])

Attachments

(1 file)

Linux debug build pulled 2000-09-12-19.  After clicking on a link, Moz crashes. 
To reproduce:

1) Start Moz
2) http://www.mozilla.org/quality/help/bugzilla-helper.html
3) click "Create a Bugzilla account"
4) click "Yes" for the cookie prompts, if any

Mozilla either crashes with this stack:

#0  0x409b671f in nsHTTPServerListener::OnDataAvailable (this=0x8578ac0, 
    channel=0x85c5464, context=0x869fc90, i_pStream=0x8578b78, 
    i_SourceOffset=0, i_Length=58) at nsHTTPResponseListener.cpp:467
#1  0x4094d574 in nsOnDataAvailableEvent::HandleEvent (this=0x86be8f8)
    at nsAsyncStreamListener.cpp:400
#2  0x4094c7e7 in nsStreamListenerEvent::HandlePLEvent (aEvent=0x86be920)
    at nsAsyncStreamListener.cpp:97
#3  0x401228bf in PL_HandleEvent (self=0x86be920) at plevent.c:575
#4  0x401226d1 in PL_ProcessPendingEvents (self=0x809dad0) at plevent.c:508
#5  0x40124511 in nsEventQueueImpl::ProcessPendingEvents (this=0x809daa8)
    at nsEventQueue.cpp:356
#6  0x40c1b290 in event_processor_callback (data=0x809daa8, source=6, 
    condition=GDK_INPUT_READ) at nsAppShell.cpp:158
#7  0x40c1aecf in our_gdk_io_invoke (source=0x8180390, condition=G_IO_IN, 
    data=0x8164bd8) at nsAppShell.cpp:58
#8  0x40dd720e in g_io_unix_dispatch (source_data=0x81803a8, 
    current_time=0xbffff6a0, user_data=0x8164bd8) at giounix.c:135
#9  0x40dd8717 in g_main_dispatch (dispatch_time=0xbffff6a0) at gmain.c:656
#10 0x40dd8cdb in g_main_iterate (block=1, dispatch=1) at gmain.c:877
#11 0x40dd8e59 in g_main_run (loop=0x8167438) at gmain.c:935
#12 0x40d07069 in gtk_main () at gtkmain.c:476
#13 0x40c1b979 in nsAppShell::Run (this=0x809ba30) at nsAppShell.cpp:335
#14 0x4069a50c in nsAppShellService::Run (this=0x809e6c8)
    at nsAppShellService.cpp:378
#15 0x80553e0 in main1 (argc=2, argv=0xbffff984, nativeApp=0x0)
    at nsAppRunner.cpp:958
#16 0x8055ab4 in main (argc=2, argv=0xbffff984) at nsAppRunner.cpp:1139
#17 0x403712e7 in __libc_start_main () from /lib/libc.so.6


Or asserts this assertion:


###!!! ASSERTION: NS_ENSURE_TRUE(aListener) failed: 'aListener', file
nsHTTPChannel.cpp, line 1231
###!!! Break: at file nsHTTPChannel.cpp, line 1231
Making visible.
Severity: normal → critical
Keywords: crash, nsbeta3
->ruslan
Assignee: gagan → ruslan
Whiteboard: [nsbeta3+]
Gagan? What's this new ApplyConversion stuff in the listener (it crashes there)? 
Did somebody check in something big recently?
I have a fix. 
Assignee: ruslan → gagan
*** Bug 52502 has been marked as a duplicate of this bug. ***
gagan, what is your fix?  I would like to apply your fix to bug 52667 which I 
suspect might be a dup of this bug.

Also, what is the hold-up with checking in the fix?
This might be a dup of 52667.  Would like to try out gagan's fix for that bug to 
see if it fixes this one.
Oops, ignore the second of my two comments above.  That one was meant to be put 
in the other bug report.
Another place that seems to duplicate this bug is http://www.maximumlinux.com;
just go to the main page, then go to any other page.  The "onUnload" event
handler opens a new window, and this causes the crash; loading the page
by entering it into the URL bar works just fine.  Oddly, if Mozilla can't
load the new page, it just freezes, instead of crashing.
I saved the MaximumLinux testcase to my local filesystem, and tested it off
of there.  With this setup, Mozilla crashes regardless of whether or not
the page for the popup-window can be loaded.  It gives the following
assertions and stack trace:

###!!! ASSERTION: don't do that: 'Not Reached', file
nsChromeProtocolHandler.cpp, line 434
###!!! Break: at file nsChromeProtocolHandler.cpp, line 434
we don't handle eBorderStyle_close yet... please fix me
WEBSHELL+ = 5
###!!! ASSERTION: no script global object: 'mScriptGlobalObject != nsnull', file
nsXULDocument.cpp, line 5635
###!!! Break: at file nsXULDocument.cpp, line 5635
###!!! ASSERTION: don't do that: 'Not Reached', file
nsChromeProtocolHandler.cpp, line 434
###!!! Break: at file nsChromeProtocolHandler.cpp, line 434
Document file:///usr/home/matt/develop/mozilla/notes/ loaded successfully
WEBSHELL+ = 6
Enabling Quirk StyleSheet
Setting content window
*** Pulling out the charset
JavaScript strict warning:
chrome://navigator/content/navigator.js line 433: reference to undefined
property window.arguments

JavaScript strict warning:
chrome://navigator/content/navigator.js line 456: reference to undefined
property window.arguments

in SetSecurityButton

Program received signal SIGSEGV, Segmentation fault.
0x2b6e2b4b in nsCOMPtr<nsIStreamListener>::assign_assuming_AddRef (
    this=0x8cee370, newPtr=0x8ba7e00) at ../../dist/include/nsCOMPtr.h:471
471                 NSCAP_RELEASE(this, oldPtr);
(gdb) where
#0  0x2b6e2b4b in nsCOMPtr<nsIStreamListener>::assign_assuming_AddRef (
    this=0x8cee370, newPtr=0x8ba7e00) at ../../dist/include/nsCOMPtr.h:471
#1  0x2b6e2bd5 in nsCOMPtr<nsIStreamListener>::assign_with_AddRef (
    this=0x8cee370, rawPtr=0x8ba7e00) at ../../dist/include/nsCOMPtr.h:848
#2  0x2b6e4709 in nsCOMPtr<nsIStreamListener>::operator= (this=0x8cee370,
    rhs=0x8ba7e00) at ../../dist/include/nsCOMPtr.h:583
#3  0x2b6d5170 in nsDocumentOpenInfo::RetargetOutput (this=0x8cee360,
    aChannel=0x8cac710,
    aSrcContentType=0x8927420 "application/http-index-format",
    aOutContentType=0x0, aStreamListener=0x8ba7e00) at nsURILoader.cpp:453
#4  0x2b6d4bd6 in nsDocumentOpenInfo::DispatchContent (this=0x8cee360,
    aChannel=0x8cac710, aCtxt=0x0) at nsURILoader.cpp:391
#5  0x2b6d3e45 in nsDocumentOpenInfo::OnStartRequest (this=0x8cee360,
    aChannel=0x8cac710, aCtxt=0x0) at nsURILoader.cpp:233
#6  0x2b4f90a0 in nsFileChannel::OnStartRequest (this=0x8cac710,
    transportChannel=0x8804050, context=0x0) at nsFileChannel.cpp:633
#7  0x2b47d964 in nsOnStartRequestEvent::HandleEvent (this=0x892b220)
    at nsAsyncStreamListener.cpp:210
#8  0x2b47d1c5 in nsStreamListenerEvent::HandlePLEvent (aEvent=0x8b8de90)
    at nsAsyncStreamListener.cpp:97
#9  0x2abd7b11 in PL_HandleEvent (self=0x8b8de90) at plevent.c:575
#10 0x2abd7900 in PL_ProcessPendingEvents (self=0x80ab730) at plevent.c:508
#11 0x2abd997e in nsEventQueueImpl::ProcessPendingEvents (this=0x80ab708)
    at nsEventQueue.cpp:356
#12 0x2b761a7b in event_processor_callback (data=0x80ab708, source=8,
    condition=GDK_INPUT_READ) at nsAppShell.cpp:158
#13 0x2b761665 in our_gdk_io_invoke (source=0x821f988, condition=G_IO_IN,
    data=0x83824d0) at nsAppShell.cpp:58
#14 0x2b93f8da in g_io_unix_dispatch ()
    at ../../../dist/include/nsIPageSequenceFrame.h:112
#15 0x2b940f96 in g_main_dispatch ()
    at ../../../dist/include/nsIPageSequenceFrame.h:112
#16 0x2b941561 in g_main_iterate ()
    at ../../../dist/include/nsIPageSequenceFrame.h:112
#17 0x2b941701 in g_main_run ()
    at ../../../dist/include/nsIPageSequenceFrame.h:112
#18 0x2b862fdc in gtk_main ()
    at ../../../dist/include/nsIPageSequenceFrame.h:112
#19 0x2b7621fa in nsAppShell::Run (this=0x810fa30) at nsAppShell.cpp:335
#20 0x2b2141be in nsAppShellService::Run (this=0x8108eb0)
    at nsAppShellService.cpp:407
#21 0x8055b1e in main1 (argc=1, argv=0x7ffff494, nativeApp=0x0)
    at nsAppRunner.cpp:958
#22 0x8056291 in main (argc=1, argv=0x7ffff494) at nsAppRunner.cpp:1139
This crash made it to the talkback topcrash list today, adding topcrash and [@ 
nsHTTPServerListener::OnDataAvailable] for topcrash tracking.
Keywords: topcrash
Summary: Crashing in nsHTTPServerListener::OnDataAvailable → Crashing in [@ nsHTTPServerListener::OnDataAvailable]
Raising to P2 
Priority: P3 → P2
Not holding PR3 for this change. If there's a super-safe patch you have to fix 
the top crasher for PR3, please bring to PDT, but otherwise, we can wait for 
RTM. Marking nsbeta3- and adding rtm nomination.
Keywords: rtm
Whiteboard: [nsbeta3+] → [nsbeta3-]
Not holding for topcrash?  This crash makes mozilla essentially unusable for my
day-to-day needs.  I have stopped using Mozilla because of this crash.
It looks like this might have been fixed with the fix that was checked
in for bug 52818.  The bug persisted for a while after I downloaded and
compiled the fix, so you might have to clear your disk cache after getting
the fix (the bug is a cache related problem).  Bug 52818 is also a topcrash.
Ooops, the bug is still there; sorry.  I'd found this new testcase:

1) Go to http://www.druglibrary.org/frames2/drugsense.htm
2) Click on the link there

And it was consistently crashing with the stacktrace for this bug.  The
crash went away for a while, then came back.

Interestingly, if you go directly to the page linked to in the testcase,
http://www.mapinc.org/, Mozilla is just fine.  But clicking on the testcase
link opens up the URL in a new window, and the crash happens.

Also, I've found that while the crash is caused because mChannel is NULL
at this point:

        (void) (mChannel->GetApplyConversion(&bApplyConversion));

it isn't NULL when OnDataAvailable() is entered, and the places where
mChannel can be set never assign a NULL value to it.  This looks like it might
be a case of memory trampling, except that the crash is so consistent.
I've found that for my latest testcase, the crash is probably
happening because nsHTTPServerListener::OnDataAvailable() is being
called *recursively*; if you'll note the stack trace I'm going to
attach, you'll see that each time that nsHTTPServerListener::OnDataAvailable()
is invoked, it's invoked with the exact same value for "this".

As far as I can tell, what's happening is:

1) nsHTTPServerListener::OnDataAvailable() is invoked, which then
   invokes FinishedResponseHeaders().
2) Once the thread gets into FinishedResponseHeaders(), it tries to
   open up a new window, since the target for the link isn't the
   current window.
3) This causes any current native events to be dispatched.
4) This leads to nsHTTPServerListener::OnDataAvailable() being called
   again, for the exact same listener.
5) All the data from that connection is slurped up.
6) OnStopRequest() is called, setting the mChannel member of
   the nsHTTPServerListener instance to NULL via NS_IF_RELEASE().
7) Things finish up, and FinishedResponseHeaders() returns control to
   nsHTTPServerListener::OnDataAvailable().
8) nsHTTPServerListener::OnDataAvailable() merrily proceeds along its
   way, but mChannel is now NULL, so it's screwed.

NOTE: The line numbers for the stack trace aren't entirely correct,
      because of all the debugging statements I've added.
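
The sequence above can be reproduced in miniature. The following is only a sketch under stated assumptions: `DemoChannel`, `DemoEventQueue`, `DemoListener`, and `RunReentrancyDemo` are hypothetical stand-ins, not the real Necko or XPCOM types, and a `std::shared_ptr` plays the role of the reference-counted mChannel. It shows how pumping pending events from inside FinishedResponseHeaders() lets a queued OnStopRequest() clear the channel before the outer OnDataAvailable() resumes.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the HTTP channel.
struct DemoChannel {
    bool applyConversion = true;
};

// Single-threaded event queue. ProcessPending() may be called from inside
// an event handler, which is what "nested" dispatch means in this sketch.
struct DemoEventQueue {
    std::deque<std::function<void()>> events;

    void Post(std::function<void()> ev) { events.push_back(std::move(ev)); }

    void ProcessPending() {
        while (!events.empty()) {
            // Move the event out first so re-entrant calls see a consistent queue.
            std::function<void()> ev = std::move(events.front());
            events.pop_front();
            ev();
        }
    }
};

struct DemoListener {
    DemoEventQueue& queue;
    std::shared_ptr<DemoChannel> channel = std::make_shared<DemoChannel>();
    bool headersDone = false;
    std::vector<std::string> log;

    explicit DemoListener(DemoEventQueue& q) : queue(q) {}

    // Steps 2-3: opening the new window dispatches pending events.
    void FinishedResponseHeaders() {
        headersDone = true;
        log.push_back("headers: spinning nested event loop");
        queue.ProcessPending();
    }

    // Step 6: the channel reference is dropped.
    void OnStopRequest() {
        log.push_back("stop: releasing channel");
        channel.reset();
    }

    void OnDataAvailable(int length) {
        log.push_back("data: enter, length=" + std::to_string(length));
        if (!headersDone) {
            FinishedResponseHeaders();  // steps 4-7 happen inside this call
        }
        // Step 8: the crashing code dereferenced the channel here unconditionally.
        if (!channel) {
            log.push_back("data: channel is null after re-entry");
            return;
        }
        log.push_back("data: consumed with channel alive");
    }
};

// Drives the scenario: more data and the stop notification are already
// queued when the first OnDataAvailable() call begins.
std::vector<std::string> RunReentrancyDemo() {
    DemoEventQueue queue;
    DemoListener listener(queue);
    queue.Post([&listener] { listener.OnDataAvailable(100); });
    queue.Post([&listener] { listener.OnStopRequest(); });
    listener.OnDataAvailable(58);
    return listener.log;
}
```

Calling `RunReentrancyDemo()` returns a log whose last entry records that the outer call found the channel gone, even though it was valid on entry and nothing ever assigned NULL to it directly from OnDataAvailable().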
OnDataAvailable is expected to be called multiple times and the calls aren't
really identical-- the offsets/lengths are changing. So the behaviour you are
seeing is ok. Vidur recently checked in some changes today (evening) that relate
to the async reflow for layout and have a possible effect on this bug. I'd
suggest we recheck this bug with those changes in.
I know OnDataAvailable() is supposed to be called multiple times, but
it still seems to me that it should never be able to indirectly invoke itself
on the same instance.  As far as I can tell, it's this indirect recursion
that's causing the crash.
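
One generic way to keep a callback from indirectly re-entering itself on the same instance is a per-object depth counter. This is a sketch only, not something proposed in this bug: `ReentryGuard`, `GuardedListener`, the `onFirstChunk` hook, and `RunGuardDemo` are all hypothetical names, and parking the nested data until the outer call finishes is just one possible policy.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// RAII marker: bumps a per-object depth counter on entry and restores it
// on exit, so a nested call on the same instance can be detected.
class ReentryGuard {
public:
    explicit ReentryGuard(int& depth) : depth_(depth) { ++depth_; }
    ~ReentryGuard() { --depth_; }
    ReentryGuard(const ReentryGuard&) = delete;
    ReentryGuard& operator=(const ReentryGuard&) = delete;

    bool IsNested() const { return depth_ > 1; }

private:
    int& depth_;
};

struct GuardedListener {
    int depth = 0;
    int delivered = 0;                   // total bytes handled so far
    std::vector<int> deferred;           // chunks parked by nested calls
    std::function<void()> onFirstChunk;  // hypothetical hook that may re-enter

    void OnDataAvailable(int length) {
        ReentryGuard guard(depth);
        if (guard.IsNested()) {
            // Indirect recursion on the same instance: park the chunk and
            // let the outer call deliver it in order.
            deferred.push_back(length);
            return;
        }
        if (delivered == 0 && onFirstChunk) {
            onFirstChunk();  // stands in for FinishedResponseHeaders()
        }
        delivered += length;
        // Drain whatever nested calls parked while we were busy.
        while (!deferred.empty()) {
            int next = deferred.front();
            deferred.erase(deferred.begin());
            delivered += next;
        }
    }
};

// The hook re-enters OnDataAvailable() with a second chunk.
int RunGuardDemo() {
    GuardedListener listener;
    listener.onFirstChunk = [&listener] { listener.OnDataAvailable(100); };
    listener.OnDataAvailable(58);
    return listener.delivered;
}
```

With this policy the nested chunk is handled only after the outer call has finished its own work, so the outer call never sees state changed underneath it by its own recursion.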
I crash here often when opening a new window.  For example, from mail/news
clicking on a link or other times when using "Open link in new window..."
Here's the patch that I've been using without any problems that I know of.  I
have a sneaking suspicion it's not quite right:


Index: nsHTTPResponseListener.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/protocol/http/src/nsHTTPResponseListener.cpp,v
retrieving revision 1.129
diff -u -r1.129 nsHTTPResponseListener.cpp
--- nsHTTPResponseListener.cpp  2000/09/05 21:22:33     1.129
+++ nsHTTPResponseListener.cpp  2000/10/01 18:12:25
@@ -462,7 +462,7 @@
         rv = NS_BINDING_ABORTED;
     }

-    if (NS_SUCCEEDED(rv) && i_Length) {
+    if (NS_SUCCEEDED(rv) && i_Length && mChannel) {
         PRBool bApplyConversion = PR_TRUE;
         (void) (mChannel->GetApplyConversion(&bApplyConversion));
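
To illustrate why a bare null check might be "not quite right", here is a hedged sketch contrasting it with holding a local strong reference across the call that can re-enter. Everything in it is an assumption for illustration: `PatchChannel`, `PatchListener`, the `finishedResponseHeaders` hook, and the two demo functions are hypothetical, and `std::shared_ptr` stands in for the real reference-counted pointer. It is not the fix that eventually landed.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for the HTTP channel.
struct PatchChannel {
    bool applyConversion = true;
};

struct PatchListener {
    std::shared_ptr<PatchChannel> mChannel = std::make_shared<PatchChannel>();
    // Hypothetical hook standing in for FinishedResponseHeaders(); it may
    // indirectly clear mChannel.
    std::function<void()> finishedResponseHeaders;

    // Mirrors the patch: skip the block when mChannel has gone away.
    // Returns true only if the data was processed.
    bool OnDataAvailableGuarded(unsigned length) {
        if (finishedResponseHeaders) finishedResponseHeaders();
        if (length && mChannel) {
            bool applyConversion = mChannel->applyConversion;
            (void)applyConversion;
            return true;
        }
        return false;  // no crash, but the data is silently dropped
    }

    // Alternative: pin the channel with a local strong reference before
    // calling anything that might clear the member.
    bool OnDataAvailablePinned(unsigned length) {
        std::shared_ptr<PatchChannel> pinned = mChannel;
        if (finishedResponseHeaders) finishedResponseHeaders();
        if (length && pinned) {
            bool applyConversion = pinned->applyConversion;
            (void)applyConversion;
            return true;
        }
        return false;
    }
};

bool RunGuardedDemo() {
    PatchListener listener;
    listener.finishedResponseHeaders = [&listener] { listener.mChannel.reset(); };
    return listener.OnDataAvailableGuarded(58);
}

bool RunPinnedDemo() {
    PatchListener listener;
    listener.finishedResponseHeaders = [&listener] { listener.mChannel.reset(); };
    return listener.OnDataAvailablePinned(58);
}
```

In this sketch the guarded version avoids the crash but drops the data, while the pinned version still processes it. Whether data should be processed at all after the stop notification is a separate question that neither variant answers.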

*** Bug 54859 has been marked as a duplicate of this bug. ***
adding myself to the cc list.
this is beta3-? it's a topcrash! cc'ing others from a dupe bug
amen to that, pink.

I'm on the mtbf (mean time between failure) team.  this bug qualifies.

gagan, I'd be happy to start debugging and working on a fix for rtm.
*** Bug 55102 has been marked as a duplicate of this bug. ***
dogfood+ from dup bug 55102
Keywords: dogfood
Whiteboard: [nsbeta3-] → [nsbeta3-][dogfood+]
Our bug results from nested event queues being broken in XPCOM.
Depends on: 54371
No longer depends on: 54371
*** Bug 54597 has been marked as a duplicate of this bug. ***
*** Bug 52734 has been marked as a duplicate of this bug. ***
*** Bug 52061 has been marked as a duplicate of this bug. ***
*** Bug 52949 has been marked as a duplicate of this bug. ***
*** Bug 42624 has been marked as a duplicate of this bug. ***
I hit this bug within 1/2 of browsing cnet and abcnews. Mostly links that cause
a window to pop up and content gets targeted to that cause this for me.

This has got to be rtm+ P1
*** Bug 52913 has been marked as a duplicate of this bug. ***
Depends on: 54371
*** Bug 52628 has been marked as a duplicate of this bug. ***
*** Bug 53163 has been marked as a duplicate of this bug. ***
PDT: The dup bug 52628 was [PDTP1][rtm need info].
Keywords: top100
Alright I am setting this P1 and need info. After we verify that the changes for 
the dependent bug indeed fix this, then we can close this. For most part it 
appears that we wouldn't have to change much in HTTP so there may not be a need 
for a rtm+/rtm++ to mark this fixed.  
Priority: P2 → P1
Whiteboard: [nsbeta3-][dogfood+] → [nsbeta3-][dogfood+][rtm need info]
*** Bug 53555 has been marked as a duplicate of this bug. ***
*** Bug 53355 has been marked as a duplicate of this bug. ***
*** Bug 55023 has been marked as a duplicate of this bug. ***
*** Bug 55045 has been marked as a duplicate of this bug. ***
Keywords: mostfreq
If you need more test cases, this was my test page for spotting the bug. 
http://www.zoned.net/~xkahn/crash.html
*** Bug 54987 has been marked as a duplicate of this bug. ***
Note additional test case in duplicate Bug #52949.

Note the testcase in bug 52628 (which has been marked as a duplicate of this
bug). It is seven lines of HTML and crashes mozilla upon load each time. There
is also a description (by me) in the bug describing why this happens.

*** Bug 55029 has been marked as a duplicate of this bug. ***
*** Bug 52277 has been marked as a duplicate of this bug. ***
I have applied the recent checkin to plevent.c, but I do NOT have the other 
three patches. With just the plevent fix, I am now locking up where I never 
locked up before. When I go to a secure site I get a window appear about the 
certificate that's being presented by the remote server. When I click on the 
NEXT button at the bottom of this window, the buttons all disappear but the 
window remains there and the browser is locked. This is reproducible. If I 
revert back to the previous libxpcom then this window is dismissed just fine and 
there is no lockup.
Just so you know, I don't completely have security working in the OpenVMS 
version, so this potentially could be some other problem, but I really don't 
think so, since I am getting a lot further than the first dialog (with the 
previous xpcom).

Also I only have a debug version at this point, so I haven't/can't test with 
non-debug.
*** Bug 54783 has been marked as a duplicate of this bug. ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
It seems to me that the problems fixed in bug 54371 were the only reason this was
crashing. And since dougt landed those fixes (yesterday), all these crashers have
disappeared. I am hence marking this fixed. Thanks to dougt, darin, danm -- bye
bye crasher-bug!
*** Bug 52314 has been marked as a duplicate of this bug. ***
in what builds should we expect to see this fix?
sorry forgot to add that- this should be fixed on both the trunk and the branch
(per dougt's comments on his checkin for bug 54371)
*** Bug 55177 has been marked as a duplicate of this bug. ***
*** Bug 52314 has been marked as a duplicate of this bug. ***
verified:
Linux rh6 2000100509
Status: RESOLVED → VERIFIED
*** Bug 56005 has been marked as a duplicate of this bug. ***
Reopening bug, this crash is still happening with the official RTM build on
Linux and is the #8 topcrasher with RTM on Windows...so changing OS to All.  The
stack trace is about the same as before:

Incident ID 21984337
nsHTTPServerListener::OnDataAvailable
[d:\builds\seamonkey\mozilla\netwerk\protocol\http\src\nsHTTPResponseListener.cpp,
line 469]
nsOnDataAvailableEvent::HandleEvent
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsAsyncStreamListener.cpp, line 406]
nsStreamListenerEvent::HandlePLEvent
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsAsyncStreamListener.cpp, line 106]
PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 581]
PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c,
line 517]
_md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line
1051]
KERNEL32.DLL + 0x2222f (0xbff9222f)
0x00658b54

Although the original steps to reproduce no longer seem to cause a crash, a lot
of talkback reports have been submitted for this crash.  Most of the entries
show the following two locations in the code at the top of the stack trace:

Source File :
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/netwerk/protocol/http/src/nsHTTPResponseListener.cpp
line : 454
and
Source File :
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/netwerk/protocol/http/src/nsHTTPResponseListener.cpp
line : 469

I have not been able to reproduce, but here are a few URLs and comments from the
talkback entries:

URL: About.com
URL: I
URL: jackpot.com
URL: real.com
URL: www.apple.com
URL: www.bellsouth.net
URL: www.email.com
URL: www.juston.com
URL: www.yahoo.com
Comment:  i was browsing the adult website
Comment:  entered a nokia site to develope my own cell phone cover design.   This
thing is really unstable.  when may I expect a reply?
Comment:  why are you having so much trouble with this new netscape i like it but
it keeps locking up / and i keep getting an error message. i dont like that ..
Comment:  Closing out from the web.
Comment:  Closed a pop up window
Comment:  i tried to change the skin


Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Summary: Crashing in [@ nsHTTPServerListener::OnDataAvailable] → RTM Crash #9 [@ nsHTTPServerListener::OnDataAvailable]
http bugs to "Networking::HTTP"
Assignee: gagan → darin
Status: REOPENED → NEW
Component: Networking → Networking: HTTP
Target Milestone: --- → M19
Target Milestone: M19 → mozilla0.8
This is also the #8 topcrash for RTM release on Linux.  Changing OS to All.
Here are a couple of URLs and Comments from Linux users:

URL:(23133463) www.sfgate.com
Comment: (23133463) seems to crash the browser every single time I open it!

URL:(23155254) register.com
	
Comment: (23140452) location bar won't accept <enter> so tried CTL+L / open location
OS: Linux → All
Keywords: nsbeta3, rtmnsbeta1
Whiteboard: [nsbeta3-][dogfood+][rtm need info] → [dogfood+]
*** Bug 63585 has been marked as a duplicate of this bug. ***
Looks to me like 54371 fixed this... marking as dupe.

*** This bug has been marked as a duplicate of 54371 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
missed the later comments on this bug... reopening.  sounds like there is
something extremely subtle going on here... investigating possible causes.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
The bug 62643 seems to be a dupe of this one.
Jpatel: this seems like a different bug. I recently added an additional check for a 
null pointer which might have fixed that. Could you verify with a more recent 
build. It definitely is not this bug since this bug deals with the problems 
relating to events going out of sync. I am closing this as fixed again- pls. 
open a new one if you see problems with the sfgate.com site(s) which work fine 
for me. 
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
changing summary to N601 for tracking. gagan, this crash is still happening with
N601...but doesn't show up for any recent trunk builds.  so i'm assuming the
problem has been fixed since the N601 release...if it pops up again in any form,
i will open a new bug.  sorry about the confusion.
Summary: RTM Crash #9 [@ nsHTTPServerListener::OnDataAvailable] → N601 Crash #8 [@ nsHTTPServerListener::OnDataAvailable]
Reopening to track landing this patch on the N6 branch.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Marking FIXED (again)... seems the 6.0 branch is dead.
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Verifying to get off the most frequent bug radar. After an HTTP rearch, this bug 
is well and truly dead.

Gerv
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsHTTPServerListener::OnDataAvailable]