Closed Bug 24008 Opened 25 years ago Closed 25 years ago

[DOGFOOD] Crash loading any web pages on particular configurations

Categories

(Core :: Networking, defect, P3)

x86
Windows 95
defect

Tracking

()

VERIFIED DUPLICATE of bug 23709

People

(Reporter: lchiang, Assigned: gagan)

References

Details

(Whiteboard: [PDT+])

Attachments

(1 file)

Win32, 2000-01-13-15-m13 build.
133 MHz system with 64 MB RAM

1) Install seamonkey.exe
2) Start the browser
3) Crash loading the page home.netscape.com.  Same problem starting mail when
the mail start page loads.

Note:  We got around this crash for mail by turning the mail start page off.
Also, local pages loaded via File | Open work fine.

We tried this on two different systems of similar configuration and had the
problem.  Also, asj@ipa.net has reported a similar problem on his configuration
over the last few days.

Not sure how to narrow this down further.  Talkback doesn't work so we can't
send in a report with a stack trace.

Per Jan, assigning to networking component.
If you need to see this, contact suresh@netscape.com
Whiteboard: [PDT+]
PDT+
It sounds like this bug and bug 23709 are dupes or at least very close.  I have
a Windows 95 P120 with 64MB RAM.

I have some other details in my post on 23709.
Target Milestone: M14
Not enough information to tell anything about this one.

Lisa: Can you try this with a debug build?
I'll ask Suresh to try.  He has VC++ installed on that machine (which is our
performance testing machine), but needs to get CVS installed to get a debug
build.  He'll try sharing to a machine with a debug build first.
I tried to load some web pages with a shared debug build. Some pages, such as
http://www.mozilla.org and http://www.yahoo.com, load fine.

Other pages, such as http://home.netscape.com and http://www.cnn.com, result in
a crash with the following stack trace.

nsSocketTransport::OnFound(nsSocketTransport * const 0x02043178, nsISupports *
0x00000000, const char * 0x02042e90, nsHostEnt * 0x016afda8) line 1451 + 5 bytes
nsDNSLookup::CallOnFound(nsDNSLookup * const 0x020437b0) line 293
nsDNSEventProc(HWND__ * 0x01ae96d0, unsigned int 57272, unsigned int 1, long 3)
line 391
KERNEL32! bff73663()
KERNEL32! bff928e0()
Lisa, are you seeing this on the build 2000011908?  I can reproduce this
on the 01-13-15 build except I see the crash after hitting home.netscape.com
then www.cnn.com and back.  It takes typing in a few urls.  Today's build
didn't crash for me with that test and on the systems I have.
If this is related to bug 23709, I am still seeing the crash in build
2000011908.  It does, however, take longer to crash.  Just before it crashes,
my console screen fills with
nsLayoutHistoryState::GetState, ERROR getting History state for the key
that line over and over
then
nsLayoutHistoryState::GetState, ERROR getting History state for the key
FindShortcut: in='www.news.com'  out='null'
Error loading URL http://www.netscape.com/
Document: Done (102.55 secs)
I'm still seeing the crash while loading home.netscape.com and cnn.com using
2000-01-19-13-M13 windows commercial build.
try moving your history.dat aside (don't remove it, it may be worth saving for
debugging purposes.)

it will be in Users50/<profile name>/history.dat

try exiting 5.0
move that file to the side (rename it to history.dat-old or something)
restart

do you still crash?
Seth, I still see the crash after renaming history.dat file.
I am running Windows 95 4.00.950 B on a Pentium with 48 MB RAM and I also see
this crash using build 2000011908 (from the zip not the installer) with the
following:
MOZILLA caused an invalid page fault in
module MSVCRT.DLL at 014f:78001648.
Registers:
EAX=0127f950 CS=014f EIP=78001648 EFLGS=00010212
EBX=00002000 SS=0157 ESP=0063f934 EBP=0063f93c
ECX=00000800 DS=0157 ESI=0127d950 FS=4967
EDX=00000000 ES=0157 EDI=00000000 GS=0000
Bytes at CS:EIP:
f3 a5 ff 24 95 28 17 00 78 8b c7 ba 03 00 00 00
Stack dump:
0155b1b4 00002000 0063f960 60c26fed 00000000 0127d950 00002000 015ebd60 0063f9d0
0063f9d0 00000000
0063f9c4 605778b0 0155b1b4 0127d950 00002000

I can reproduce this crash very simply.  Go to bugzilla.  Search existing bugs.
Submit the default query (No milestones selected).  It starts to render the
query results, the scrollbar appears, and 3 seconds later it crashes.  The
console screen fills with the error getting history state when the query page is
reached, not when the query results page is being rendered and crashing.

Is this bug really M14 and not M13?  A serious crasher like this, PDT+, and it
is not M13 - so much for me using Mozilla as Dogfood (which I have been doing
with the January 9 nightly build).
I just tried it and got this crash:

memcpy(unsigned char * 0x00000000, unsigned char * 0x03923f80, unsigned long
0x00002000) line 171
nsStorageStream::Write(nsStorageStream * const 0x031f7204, const char *
0x03923f80, unsigned int 0x00002000, unsigned int * 0x0012faa8) line 167 + 20
bytes
MemCacheWriteStreamWrapper::Write(MemCacheWriteStreamWrapper * const 0x032a0a60,
const char * 0x03923f80, unsigned int 0x00002000, unsigned int * 0x0012faa8)
line 285 + 38 bytes
CacheOutputStream::Write(CacheOutputStream * const 0x032a1b00, const char *
0x03923f80, unsigned int 0x00002000, unsigned int * 0x0012faa8) line 75 + 38
bytes
InterceptStreamListener::write(char * 0x03923f80, unsigned int 0x00002000) line
1141
InterceptStreamListener::Read(InterceptStreamListener * const 0x03300694, char *
0x03923f80, unsigned int 0x00002000, unsigned int * 0x0012fcac) line 1128 + 21
bytes
nsMultiMixedConv::OnDataAvailable(nsMultiMixedConv * const 0x033104d0,
nsIChannel * 0x031f12e0, nsISupports * 0x00000000, nsIInputStream * 0x03300694,
unsigned int 0x00000000, unsigned int 0x00002000) line 125 + 24 bytes
nsDocumentOpenInfo::OnDataAvailable(nsDocumentOpenInfo * const 0x031f6c20,
nsIChannel * 0x031f12e0, nsISupports * 0x00000000, nsIInputStream * 0x03300694,
unsigned int 0x00000000, unsigned int 0x00002000) line 192 + 46 bytes
InterceptStreamListener::OnDataAvailable(InterceptStreamListener * const
0x03300690, nsIChannel * 0x031f12e0, nsISupports * 0x00000000, nsIInputStream *
0x032be758, unsigned int 0x00000000, unsigned int 0x00002000) line 1113
nsHTTPResponseListener::OnDataAvailable(nsHTTPResponseListener * const
0x032a3880, nsIChannel * 0x031f64a4, nsISupports * 0x031f12e0, nsIInputStream *
0x032be758, unsigned int 0x00102000, unsigned int 0x00002000) line 195 + 58
bytes
nsOnDataAvailableEvent::HandleEvent(nsOnDataAvailableEvent * const 0x045c21c0)
line 370
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x045c3fd0) line 93 + 12 bytes
PL_HandleEvent(PLEvent * 0x045c3fd0) line 522 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00c2dac0) line 483 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x01f60ca8, unsigned int 0x0000c080, unsigned int
0x00000000, long 0x00c2dac0) line 951 + 9 bytes
USER32! 77e71820()

Still investigating...
I don't have a current build, so I'm just looking at the nsStorageStream.cpp
source code...

The target address of the memcpy() is the value of instance variable
mWriteCursor, which is apparently NULL at that time.  mWriteCursor is
initialized to NULL in the nsStorageStream constructor and there are five
assignment statements that write to mWriteCursor.

There's a check for a NULL mWriteCursor within the conditional block that begins
at line 156, so we know that that block (which contains one of the assignments
to mWriteCursor) was not executed.  There are two explicit assignments of NULL
to mWriteCursor, which would seem to make them very suspicious.  However, these
assignments also cause mSegmentEnd to be set to NULL which, in turn, would cause
the conditional block at line 156 to be executed, but we know that was not
executed.

By process of elimination, the two assignments to mWriteCursor within
nsStorageStream::Seek() must be the bad ones.  Beyond that, I'm at a loss.
I have a fix. Basically in the out-of-memory case in Write, we need to reset 
mSegmentEnd, or else the next Write gets out of sync and dies.

After applying this fix, I was able to successfully bring up 5183 bugs in one 
page, although it takes several minutes to do this and the cpu is pegged the 
entire time. I suspect that there's some out-of-memory thrashing condition that 
we need to investigate. But at least this fix fixes the crash. 
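
For readers following along, here is a minimal sketch of the failure mode
described above. It is not the actual nsStorageStream code: SegmentedWriter,
WriteResult, RunScenario and the injected allocator are hypothetical names made
up for illustration, and only mWriteCursor and mSegmentEnd are borrowed from the
discussion. It assumes the write loop decides whether to allocate a new segment
by comparing the cursor with the segment end, and it reports a status where the
real code would call memcpy() with a NULL destination.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical status codes for this sketch only.
enum class WriteResult { Ok, OutOfMemory, WouldWriteToNull };

// Simplified, illustrative stand-in for a segmented write stream.
class SegmentedWriter {
public:
    // allocOk is consulted before each segment allocation; returning false
    // simulates an out-of-memory condition.
    SegmentedWriter(std::size_t segmentSize, std::function<bool()> allocOk,
                    bool resetEndOnOom)
        : mSegmentSize(segmentSize), mAllocOk(std::move(allocOk)),
          mResetEndOnOom(resetEndOnOom) {}

    WriteResult Write(const char* data, std::size_t count) {
        while (count > 0) {
            // Compare addresses as integers so a null cursor is well defined.
            std::size_t available =
                reinterpret_cast<std::uintptr_t>(mSegmentEnd) -
                reinterpret_cast<std::uintptr_t>(mWriteCursor);
            if (available == 0) {
                mWriteCursor = AppendNewSegment();
                if (!mWriteCursor) {
                    // The fix: keep mSegmentEnd in sync with the null cursor.
                    if (mResetEndOnOom) mSegmentEnd = nullptr;
                    return WriteResult::OutOfMemory;
                }
                mSegmentEnd = mWriteCursor + mSegmentSize;
                available = mSegmentSize;
            }
            if (!mWriteCursor) {
                // Real code would memcpy() to a null destination here.
                return WriteResult::WouldWriteToNull;
            }
            std::size_t n = std::min(available, count);
            std::memcpy(mWriteCursor, data, n);
            mWriteCursor += n;
            data += n;
            count -= n;
        }
        return WriteResult::Ok;
    }

private:
    char* AppendNewSegment() {
        if (!mAllocOk()) return nullptr;
        mSegments.emplace_back(new char[mSegmentSize]);
        return mSegments.back().get();
    }

    std::size_t mSegmentSize;
    std::function<bool()> mAllocOk;
    bool mResetEndOnOom;
    std::vector<std::unique_ptr<char[]>> mSegments;
    char* mWriteCursor = nullptr;
    char* mSegmentEnd = nullptr;
};

// Three 4-byte writes into 4-byte segments; the second allocation fails.
std::vector<WriteResult> RunScenario(bool resetEndOnOom) {
    int calls = 0;
    SegmentedWriter w(4, [&calls] { return ++calls != 2; }, resetEndOnOom);
    return { w.Write("abcd", 4), w.Write("efgh", 4), w.Write("ijkl", 4) };
}
```

In this sketch, the second Write fails with OutOfMemory either way. Without the
reset, the third Write sees a non-zero gap between the stale mSegmentEnd and the
null cursor, skips allocation, and reaches the null-destination copy. With the
reset, the third Write allocates a fresh segment and succeeds.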
Warren, can you post the patch?  Get a reviewer, and if it still looks like 
a fix (if not the last fix), I'll a=brendan@mozilla.org it for M13.

/be
I've submitted the patch. It was reviewed by Gordon -- he favored keeping the 
logging code since it might come in useful again someday. I could take it out 
if people want something more minimal. It would be great if Fur could review 
this too.
Nice job, warren.  r=fur
*** Bug 23849 has been marked as a duplicate of this bug. ***
Fix checked into tip... awaiting branch tag for M13.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Checked into the M13 branch too.
Status: RESOLVED → REOPENED
re-opening to change status to dupe

*** This bug has been marked as a duplicate of 23709 ***
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: FIXED → DUPLICATE
Discussed with engineering - this is actually a duplicate of 23709
The bug repaired was actually another problem.   
Status: RESOLVED → VERIFIED
I thought we'd keep this bug to represent the problem that was fixed, and the 
other one for the nsSocketTransport::OnFound problem, but whatever.