Closed Bug 52628 Opened 24 years ago Closed 24 years ago

Crash in necko while loading home.netscape.com. [@ nsHTTPResponse::GetContentLength]

Categories

(Core :: Networking, defect, P1)

x86
Linux
defect

VERIFIED DUPLICATE of bug 52397

People

(Reporter: alla, Assigned: gagan)

Details

(Keywords: crash, top100, topcrash, Whiteboard: [PDTP1][rtm need info])

Attachments

(1 file)

Using a debug build built from CVS a few hours ago, I got a crash while loading home.netscape.com. This was on an SMP machine that was quite heavily loaded (running 10 concurrent Mozilla instances). Here is the backtrace:

#0 nsHTTPResponse::GetContentLength (this=0x0, o_ContentLength=0xbffff678) at /export/alex/mozilla/netwerk/protocol/http/src/nsHTTPResponse.cpp:115
#1 0x408ac92e in nsHTTPServerListener::OnDataAvailable (this=0x868d1f0, channel=0x87a789c, context=0x877fff8, i_pStream=0x877e7d8, i_SourceOffset=40117, i_Length=19) at /export/alex/mozilla/netwerk/protocol/http/src/nsHTTPResponseListener.cpp:557
#2 0x40837e16 in nsOnDataAvailableEvent::HandleEvent (this=0x41b25b48) at /export/alex/mozilla/netwerk/base/src/nsAsyncStreamListener.cpp:400
#3 0x40836ff6 in nsStreamListenerEvent::HandlePLEvent (aEvent=0x41bebed0) at /export/alex/mozilla/netwerk/base/src/nsAsyncStreamListener.cpp:97
#4 0x400eb06e in PL_HandleEvent (self=0x41bebed0) at /export/alex/mozilla/xpcom/threads/plevent.c:575
#5 0x400eaf09 in PL_ProcessPendingEvents (self=0x80a2308) at /export/alex/mozilla/xpcom/threads/plevent.c:508
#6 0x400ecb10 in nsEventQueueImpl::ProcessPendingEvents (this=0x80a22e0) at /export/alex/mozilla/xpcom/threads/nsEventQueue.cpp:356
#7 0x40a8d79f in event_processor_callback (data=0x80a22e0, source=7, condition=GDK_INPUT_READ) at /export/alex/mozilla/widget/src/gtk/nsAppShell.cpp:158
#8 0x40a8d45d in our_gdk_io_invoke (source=0x8212de0, condition=G_IO_IN, data=0x8212dd0) at /export/alex/mozilla/widget/src/gtk/nsAppShell.cpp:58
#9 0x40c4aaca in g_io_unix_dispatch () from /usr/lib/libglib-1.2.so.0
#10 0x40c4c186 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#11 0x40c4c751 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#12 0x40c4c8f1 in g_main_run () from /usr/lib/libglib-1.2.so.0
#13 0x40b745b9 in gtk_main () from /usr/lib/libgtk-1.2.so.0
#14 0x40a8e374 in nsAppShell::Run (this=0x80a6918) at /export/alex/mozilla/widget/src/gtk/nsAppShell.cpp:335
#15 0x405c5b35 in ?? () from /export/alex/mozilla/obj-debug/dist/bin/components/libnsappshell.so
#16 0x8051b9b in main1 (argc=1, argv=0xbffffa74, nativeApp=0x0) at /export/alex/mozilla/xpfe/bootstrap/nsAppRunner.cpp:958
#17 0x805215b in main (argc=1, argv=0xbffffa74) at /export/alex/mozilla/xpfe/bootstrap/nsAppRunner.cpp:1139
#18 0x402fa9cb in __libc_start_main (main=0x8051ff8 <main>, argc=1, argv=0xbffffa74, init=0x804bcf0 <_init>, fini=0x805b634 <_fini>, rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbffffa6c) at ../sysdeps/generic/libc-start.c:92

The this=0x0 looks bad, and is due to nsHTTPServerListener::mChannel being NULL in the call in frame 1. The call site looks a bit strange:

rv = mResponseDataListener->OnDataAvailable(mChannel, mChannel->mResponseContext, i_pStream, 0, i_Length) ;
PRInt32 cl = -1;
mResponse->GetContentLength(&cl) ; <-- Crash here.

The OnDataAvailable call returns NS_OK (rv==0), and the mChannel->mResponseContext dereference doesn't crash. So this looks like some kind of SMP race in which another thread sets mChannel to NULL.
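As an illustration of the pattern described above (a member pointer that is valid before a downstream callback and null after it), here is a minimal C++ sketch. It is not the necko code: Response, Listener and the downstream callback are hypothetical stand-ins, and std::shared_ptr is used in place of an XPCOM reference-counted pointer. It only shows one defensive shape: take a local strong reference before the callback and re-check the member afterwards.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for nsHTTPResponse.
struct Response {
    int contentLength = 19;
    int GetContentLength() const { return contentLength; }
};

// Hypothetical stand-in for nsHTTPServerListener.
struct Listener {
    std::shared_ptr<Response> mResponse = std::make_shared<Response>();

    // Downstream consumer. It is allowed to re-enter the listener and
    // clear its members, which is the situation suspected in this bug.
    std::function<void(Listener&)> downstream;

    // Returns the content length, or -1 if the response was torn down
    // while the downstream callback was running.
    int OnDataAvailable() {
        // Local strong reference: the object stays alive for this call
        // even if the member is reset during the callback.
        std::shared_ptr<Response> response = mResponse;

        if (downstream) {
            downstream(*this);
        }

        // Re-check the member instead of assuming it is still set.
        if (!mResponse) {
            return -1;
        }
        return response->GetContentLength();
    }
};
```

With no downstream callback the call returns the content length; with a callback that resets mResponse it returns -1 instead of dereferencing a null pointer.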
I'm able to reproduce this by continuously reloading home.netscape.com until it crashes. It seems the machine doesn't have to be heavily loaded either.
I'm not sure this is a race. The mResponseDataListener->OnDataAvailable call goes via an InterceptStreamListener to an nsPipeInputStream, which calls nsHTTPFinalListener::OnDataAvailable(). Somewhere in these calls it is possible that the channel is somehow closed. I've been unable to find someone with a non-SMP machine to test this yet.
Summary: Crash in necko while loading home.netscape.com. SMP race? → Crash in necko while loading home.netscape.com.
OK. This doesn't look like an SMP race any more. I've attached a minimal testcase that kills my Mozilla dead on loading. It doesn't give exactly the same backtrace, but it looks like the same cause.
Keywords: crash
Severity: normal → critical
*** Bug 52776 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of 52257 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
oops wrong one.
Status: RESOLVED → REOPENED
Priority: P3 → P4
Resolution: DUPLICATE → ---
I can pretty easily reproduce this bug by turning on the image and cookie prompting in the prefs, then browsing around to sites that have lots of images and cookies. Porn, for example. Anyway, this bug is still present in a build from 2000-09-21-10 on Linux. Nominating for beta3.
Keywords: nsbeta3, top100
approVing for beta3+ and raising priority to P2. (that V is intentional- see bug 53750)
Priority: P4 → P2
Whiteboard: [nsbeta3+]
Netscape.com is a Critical site to us at Netscape, so crashing on it is not acceptable. Upping priority to P1, and adding PDTP1 marking. Adding nomination for RTM
Keywords: rtm
Priority: P2 → P1
Whiteboard: [nsbeta3+] → [nsbeta3+][PDTP1]
I've been looking into this. Here is a short description of what happens when loading the attached example. While parsing the page we get to a SCRIPT tag, which is handled in the content sink (HTMLContentSink::ProcessSCRIPTTag()). The content sink does some stuff and then evaluates the script. After the evaluation returns it calls PostEvaluateScript(), but by this time the HTMLContentSink has already been destructed and freed, leading to a crash. The reason the HTMLContentSink is destroyed is that the script evaluates a call to window.open(), which after some time enters nsXULWindow::CreateNewContentWindow(). CreateNewContentWindow() creates a top-level window and starts loading navigator.xul into it. It then runs a loop waiting for the XUL to finish loading. The problem is that this loop also drives the loading of the original window, which happens to load faster than the XUL, so the original HTML content sink is destroyed before CreateNewContentWindow() returns. I'll try to figure out if there is a way to get only the XUL loading events, but I'm in deep water here, and might need help.
The appshell loop gets all events, including I/O events from the original HTML page. This seems to be correct AppShell behaviour, though, so how was this supposed to work? Maybe one could add some kind of reentrancy protection to nsHTMLContentSink? I need some help here.
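To make the sequence above concrete, here is a small C++ sketch of one possible form of reentrancy protection: the object holds a strong reference to itself for the duration of the method that spins the nested event loop. Everything in it is a hypothetical stand-in (EventQueue for the appshell loop, Sink for the content sink, std::shared_ptr for XPCOM reference counting), and it is not presented as the fix that was actually applied for this bug.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>

// Hypothetical single-threaded event queue standing in for the appshell loop.
struct EventQueue {
    std::deque<std::function<void()>> events;

    // Runs every queued event, whoever posted it.
    void ProcessPending() {
        while (!events.empty()) {
            std::function<void()> ev = std::move(events.front());
            events.pop_front();
            ev();
        }
    }
};

// Hypothetical stand-in for the HTML content sink.
struct Sink : std::enable_shared_from_this<Sink> {
    bool* destroyedFlag;  // set to true by the destructor
    bool postEvaluateRan = false;

    explicit Sink(bool* flag) : destroyedFlag(flag) {}
    ~Sink() { *destroyedFlag = true; }

    // Evaluates a "script" that spins a nested event loop, as the
    // window.open() path does in the analysis above.
    // Returns true if the sink was still alive after the nested loop.
    bool ProcessScript(EventQueue& queue) {
        // Self-reference: keeps this object alive until the method returns,
        // even if the nested loop drops every other reference.
        std::shared_ptr<Sink> keepAlive = shared_from_this();

        queue.ProcessPending();          // nested loop, may release the owner
        bool alive = !*destroyedFlag;    // still alive thanks to keepAlive
        PostEvaluateScript();            // safe: "this" has not been freed
        return alive;
    }

    void PostEvaluateScript() { postEvaluateRan = true; }
};

struct ScenarioResult {
    bool aliveDuringPostEvaluate;
    bool destroyedAfterwards;
};

// Drives the scenario: the original page "finishes loading" inside the
// nested loop and its owner lets go of the sink.
ScenarioResult RunScenario() {
    bool destroyed = false;
    EventQueue queue;
    std::shared_ptr<Sink> owner = std::make_shared<Sink>(&destroyed);
    queue.events.push_back([&owner] { owner.reset(); });

    Sink* sink = owner.get();
    bool alive = sink->ProcessScript(queue);
    return {alive, destroyed};
}
```

In this sketch the sink survives until ProcessScript() returns and is destroyed immediately afterwards. Without the keepAlive line, the owner.reset() event would destroy the sink in the middle of ProcessScript(), which mirrors the crash described above.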
Marking nsbeta3- as it appears that this happens very infrequently. bug already nominated for rtm.
Whiteboard: [nsbeta3+][PDTP1] → [nsbeta3-][PDTP1]
Infrequently? The attached page crashes every time it's loaded. The only reason home.netscape.com doesn't crash every time is that it uses the time to randomly select whether, and which, page it should pop up. On average it should crash on home.netscape.com about one time in three.
To make it clear: You don't need an SMP machine to crash.
Reproducible: 100%
Reproduce:
1. Click on the first attachment to this bug.
Actual result: Crash
I agree this is a serious bug. Removing - to reassess.
Whiteboard: [nsbeta3-][PDTP1] → [PDTP1]
This crash has made it to the talkback topcrash list. Adding topcrash keyword and [@ nsHTTPResponse::GetContentLength] for tracking.
Keywords: topcrash
Summary: Crash in necko while loading home.netscape.com. → Crash in necko while loading home.netscape.com. [@ nsHTTPResponse::GetContentLength]
*** Bug 54848 has been marked as a duplicate of this bug. ***
plus for rtm
Whiteboard: [PDTP1] → [PDTP1][rtm+]
*** Bug 53102 has been marked as a duplicate of this bug. ***
PDT marking [rtm need info] until patch and code reviews are available.
Whiteboard: [PDTP1][rtm+] → [PDTP1][rtm need info]
This looks like a dup of bug 52397 to me.
Yup this is a dup now. *** This bug has been marked as a duplicate of 52397 ***
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
verif. dup
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsHTTPResponse::GetContentLength]