Closed
Bug 52628
Opened 24 years ago
Closed 24 years ago
Crash in necko while loading home.netscape.com. [@ nsHTTPResponse::GetContentLength]
Categories
(Core :: Networking, defect, P1)
People
(Reporter: alla, Assigned: gagan)
Details
(Keywords: crash, top100, topcrash, Whiteboard: [PDTP1][rtm need info])
Attachments
(1 file: 124 bytes, text/html)
Using a debug build built from CVS a few hours ago, I got a crash while loading
home.netscape.com. This was on an SMP machine which was quite heavily loaded
(running 10 concurrent Mozilla instances).
Here is the backtrace:
#0 nsHTTPResponse::GetContentLength (this=0x0, o_ContentLength=0xbffff678) at
/export/alex/mozilla/netwerk/protocol/http/src/nsHTTPResponse.cpp:115
#1 0x408ac92e in nsHTTPServerListener::OnDataAvailable (this=0x868d1f0,
channel=0x87a789c, context=0x877fff8, i_pStream=0x877e7d8,
i_SourceOffset=40117, i_Length=19) at
/export/alex/mozilla/netwerk/protocol/http/src/nsHTTPResponseListener.cpp:557
#2 0x40837e16 in nsOnDataAvailableEvent::HandleEvent (this=0x41b25b48) at
/export/alex/mozilla/netwerk/base/src/nsAsyncStreamListener.cpp:400
#3 0x40836ff6 in nsStreamListenerEvent::HandlePLEvent (aEvent=0x41bebed0) at
/export/alex/mozilla/netwerk/base/src/nsAsyncStreamListener.cpp:97
#4 0x400eb06e in PL_HandleEvent (self=0x41bebed0) at
/export/alex/mozilla/xpcom/threads/plevent.c:575
#5 0x400eaf09 in PL_ProcessPendingEvents (self=0x80a2308) at
/export/alex/mozilla/xpcom/threads/plevent.c:508
#6 0x400ecb10 in nsEventQueueImpl::ProcessPendingEvents (this=0x80a22e0) at
/export/alex/mozilla/xpcom/threads/nsEventQueue.cpp:356
#7 0x40a8d79f in event_processor_callback (data=0x80a22e0, source=7,
condition=GDK_INPUT_READ)
at /export/alex/mozilla/widget/src/gtk/nsAppShell.cpp:158
#8 0x40a8d45d in our_gdk_io_invoke (source=0x8212de0, condition=G_IO_IN,
data=0x8212dd0) at /export/alex/mozilla/widget/src/gtk/nsAppShell.cpp:58
#9 0x40c4aaca in g_io_unix_dispatch () from /usr/lib/libglib-1.2.so.0
#10 0x40c4c186 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#11 0x40c4c751 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#12 0x40c4c8f1 in g_main_run () from /usr/lib/libglib-1.2.so.0
#13 0x40b745b9 in gtk_main () from /usr/lib/libgtk-1.2.so.0
#14 0x40a8e374 in nsAppShell::Run (this=0x80a6918) at
/export/alex/mozilla/widget/src/gtk/nsAppShell.cpp:335
#15 0x405c5b35 in ?? () from
/export/alex/mozilla/obj-debug/dist/bin/components/libnsappshell.so
#16 0x8051b9b in main1 (argc=1, argv=0xbffffa74, nativeApp=0x0) at
/export/alex/mozilla/xpfe/bootstrap/nsAppRunner.cpp:958
#17 0x805215b in main (argc=1, argv=0xbffffa74) at
/export/alex/mozilla/xpfe/bootstrap/nsAppRunner.cpp:1139
#18 0x402fa9cb in __libc_start_main (main=0x8051ff8 <main>, argc=1,
argv=0xbffffa74, init=0x804bcf0 <_init>, fini=0x805b634 <_fini>,
rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbffffa6c) at
../sysdeps/generic/libc-start.c:92
The this=0x0 looks bad, and is due to the nsHTTPServerListener::mChannel being
NULL in the call in frame 1. Looking at the place of the call this looks a bit
strange:
rv = mResponseDataListener->OnDataAvailable(mChannel,
mChannel->mResponseContext, i_pStream, 0, i_Length) ;
PRInt32 cl = -1;
mResponse->GetContentLength(&cl) ; <-- Crash here.
The OnDataAvailable call returns NS_OK (rv==0), and the
mChannel->mResponseContext doesn't crash. This means that this looks like some
kind of SMP race where some other thread sets mChannel to NULL.
I'm able to reproduce this by continuously reloading home.netscape.com until it
crashes. The machine doesn't have to be heavily loaded either, it seems.
I'm not sure this is a race: the mResponseDataListener->OnDataAvailable call
goes via an InterceptStreamListener to an nsPipeInputStream, which calls
nsHTTPFinalListener::OnDataAvailable(). Somewhere in these calls it is possible
that the channel is somehow closed.
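The possibility described above (a downstream callback tearing down state that the caller uses right afterwards) can be sketched as follows. This is a minimal illustration, not the necko code: `Listener`, `Response`, and `onData` are hypothetical stand-ins, and the re-check after the callback is one assumed way to avoid the null dereference, not the fix that was actually applied.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-ins; these are not the real necko classes.
struct Response {
    int contentLength = 19;
};

struct Listener {
    std::unique_ptr<Response> response = std::make_unique<Response>();
    // Downstream callback; it may tear down this listener's state re-entrantly.
    std::function<void(Listener&)> onData;

    // Returns the content length, or -1 if the response went away
    // while the downstream callback was running.
    int OnDataAvailable() {
        if (onData) onData(*this);   // may reset `response`
        if (!response) return -1;    // re-check instead of dereferencing blindly
        return response->contentLength;
    }
};
```

With a callback that calls `response.reset()`, `OnDataAvailable()` returns -1 instead of calling a method through a null pointer, which is the situation frame 0 of the backtrace shows (this=0x0).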
I've been unable to find someone with a non-SMP machine to test this yet.
Summary: Crash in necko while loading home.netscape.com. SMP race? → Crash in necko while loading home.netscape.com.
OK. This doesn't look like an SMP race any more. I've attached a minimal testcase
that kills my Mozilla dead on loading.
It doesn't give exactly the same backtrace, but it looks like it has the same
cause.
Updated•24 years ago
Severity: normal → critical
*** This bug has been marked as a duplicate of 52257 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
oops wrong one.
Status: RESOLVED → REOPENED
Priority: P3 → P4
Resolution: DUPLICATE → ---
Comment 8•24 years ago
I can pretty easily reproduce this bug by turning on the image and cookie
prompting in the prefs, then browsing around to sites that have lots of images
and cookies. Porn for example. Anyway this bug is still present in a build
from 2000-09-21-10 on Linux.
Nominating for beta3
approVing for beta3+ and raising priority to P2. (that V is intentional- see bug
53750)
Priority: P4 → P2
Whiteboard: [nsbeta3+]
Comment 10•24 years ago
Netscape.com is a Critical site to us at Netscape, so crashing on it is not
acceptable.
Upping priority to P1, and adding PDTP1 marking.
Adding nomination for RTM
Reporter
Comment 11•24 years ago
I've been looking into this. Here is a short description of what happens when
loading the attached example.
While parsing the page we get to a SCRIPT tag, which is passed to the
content sink (HTMLContentSink::ProcessSCRIPTTag()). The content sink does some
stuff and then evaluates the script. After the evaluation returns it calls
PostEvaluateScript(), but by this time the HTMLContentSink has already been
destructed and freed, leading to a crash.
The reason the HTMLContentSink is destroyed is that the script evaluates a
call to window.open(), which after some time enters
nsXULWindow::CreateNewContentWindow(). CreateNewContentWindow() creates a
top-level window and starts loading navigator.xul into it. It then loops, waiting
for the XUL to finish loading.
The problem is that this loop also drives loading of the original window,
which happens to load faster than the XUL, leading to the original HTML content
sink being destroyed before CreateNewContentWindow() returns.
I'll try to figure out if there is a way to get only the xul loading events, but
I'm in deep water here, and might need help.
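The sequence described in this comment can be modelled with a small simulation. This is only a sketch under stated assumptions: `Sim`, `OpenWindow`, `ProcessScript`, and `RunScenario` are hypothetical names, a `std::unique_ptr<int>` stands in for the content sink, and a single `std::deque` stands in for the shared event queue. It never touches freed memory; it only records that the sink is gone by the time post-evaluation would run.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// All names are hypothetical; this models the ordering of events only.
struct Sim {
    std::deque<std::function<void()>> queue;  // one shared event queue
    std::vector<std::string> log;
    std::unique_ptr<int> sink;                // stands in for the content sink
    bool xulLoaded = false;

    // Models the wait loop: spin the shared queue until the XUL has loaded.
    void OpenWindow() {
        while (!xulLoaded && !queue.empty()) {
            auto ev = std::move(queue.front());
            queue.pop_front();
            ev();  // dispatches any pending event, not just XUL ones
        }
    }

    // Models script processing: evaluate the script, then post-process.
    bool ProcessScript() {
        log.push_back("evaluate script");
        OpenWindow();  // the script called window.open()
        bool sinkAlive = (sink != nullptr);
        log.push_back(sinkAlive ? "post-evaluate ok"
                                : "sink gone before post-evaluate");
        return sinkAlive;
    }
};

// Returns whether the sink was still alive when post-evaluation ran.
bool RunScenario() {
    Sim s;
    s.sink = std::make_unique<int>(0);
    // The original page finishes first and tears down its sink...
    s.queue.push_back([&s] { s.log.push_back("html done"); s.sink.reset(); });
    // ...then the XUL for the new window finishes.
    s.queue.push_back([&s] { s.log.push_back("xul done"); s.xulLoaded = true; });
    return s.ProcessScript();
}
```

In this model `RunScenario()` returns false: the nested loop dispatches the "html done" event before the "xul done" event, so the sink is already destroyed when control returns to the code that wanted to post-process the script.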
Reporter
Comment 12•24 years ago
The appshell loop gets all events, this includes io events from the original
html page. It seems this is correct AppShell behaviour though, so how was this
supposed to work?
Maybe one could add some kind of reentrancy protection to nsHTMLContentSink?
I need some help here.
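One possible shape for the protection suggested above is for the sink to hold a strong reference to itself across the script evaluation, so that re-entrant teardown cannot free it mid-call. The sketch below is an assumption, not the fix that went into the tree: `Sink`, `ProcessScript`, and `RunGuarded` are hypothetical, and `std::shared_ptr` stands in for XPCOM reference counting.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical sink; std::shared_ptr stands in for XPCOM reference counting.
struct Sink : std::enable_shared_from_this<Sink> {
    bool postEvaluated = false;

    void ProcessScript(const std::function<void()>& evaluate) {
        // Hold a strong reference for the duration of the call so that
        // re-entrant code releasing the owner's reference cannot free us.
        std::shared_ptr<Sink> keepAlive = shared_from_this();
        evaluate();            // may drop the owner's reference
        postEvaluated = true;  // safe: keepAlive still owns the object
    }
};

// Returns true if the sink survived the re-entrant teardown until the call
// finished, and was freed only afterwards.
bool RunGuarded() {
    auto owner = std::make_shared<Sink>();
    std::weak_ptr<Sink> watch = owner;
    bool aliveAfterTeardown = false;
    Sink* raw = owner.get();
    raw->ProcessScript([&] {
        owner.reset();                          // re-entrant teardown
        aliveAfterTeardown = !watch.expired();  // kept alive by keepAlive
    });
    return aliveAfterTeardown && watch.expired();
}
```

The guard only keeps the object's memory valid. Any state the re-entrant code changed (for example a closed channel or a torn-down document) would still have to be re-checked after the evaluation returns.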
Comment 13•24 years ago
Marking nsbeta3- as it appears that this happens very infrequently. The bug is
already nominated for rtm.
Whiteboard: [nsbeta3+][PDTP1] → [nsbeta3-][PDTP1]
Reporter
Comment 14•24 years ago
Infrequently? The attached page crashes every time it's loaded. The only reason
home.netscape.com doesn't is that it uses the time to randomly select whether to
pop up a page, and which one. On average it should crash on home.netscape.com
with a 1/3 chance.
Comment 15•24 years ago
To make it clear: You don't need an SMP machine to crash.
Reproducible: 100%
Reproduce:
1. Click on the first attachment to this bug.
Actual result:
Crash
Assignee
Comment 16•24 years ago
I agree this is a serious bug. Removing - to reassess.
Whiteboard: [nsbeta3-][PDTP1] → [PDTP1]
Comment 17•24 years ago
This crash has made it to the Talkback topcrash list. Adding topcrash
keyword and [@ nsHTTPResponse::GetContentLength] for tracking.
Keywords: topcrash
Summary: Crash in necko while loading home.netscape.com. → Crash in necko while loading home.netscape.com. [@ nsHTTPResponse::GetContentLength]
Comment 18•24 years ago
*** Bug 54848 has been marked as a duplicate of this bug. ***
Assignee
Comment 20•24 years ago
*** Bug 53102 has been marked as a duplicate of this bug. ***
Comment 21•24 years ago
PDT marking [rtm need info] until patch and code reviews are available.
Whiteboard: [PDTP1][rtm+] → [PDTP1][rtm need info]
Comment 22•24 years ago
This looks like a dup of bug 52397 to me.
Assignee
Comment 23•24 years ago
Yup this is a dup now.
*** This bug has been marked as a duplicate of 52397 ***
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → DUPLICATE
Updated•13 years ago
Crash Signature: [@ nsHTTPResponse::GetContentLength]