Closed Bug 76125 Opened 25 years ago Closed 24 years ago

browser busting for 4 hours hangs system and then crashes browser - Trunk [@ nsCacheService::DeactivateAndClearEntry] [@ nsHTMLInputElement::SetFocus]

Categories

(Core :: Networking: Cache, defect)

x86
Windows 98
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: chofmann, Assigned: chofmann)

References

()

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(3 files)

After about 4 hours of running browser buster I get a blue screen with this message from Windows:

"Terminating the thread due to a stack overflow problem. A VxD, possibly recently installed, has consumed too much stack space. Increase the setting of MinSPs in System.ini or remove the recently installed VxDs. There are currently 7 SPs allocated. Press any key to continue."

If I hit the "any" key ;-) my machine comes back to life, but then if I try to kill off any browser windows I immediately crash with this stack trace. I've been able to repeat this twice.

Incident ID 29148088
Trigger Time 2001-04-15 19:36:34
Email Address chofmann
User Comments browser buster after 4 hours.
Build ID 2001041309
Product ID Netscape6.50
Platform ID Win32

Stack Trace
nsCacheService::DeactivateAndClearEntry [d:\builds\seamonkey\mozilla\netwerk\cache\src\nsCacheService.cpp, line 1063]
PL_DHashTableEnumerate [d:\builds\seamonkey\mozilla\xpcom\ds\pldhash.c, line 460]
nsCacheService::ClearActiveEntries [d:\builds\seamonkey\mozilla\netwerk\cache\src\nsCacheService.cpp, line 1047]
nsCacheService::Shutdown [d:\builds\seamonkey\mozilla\netwerk\cache\src\nsCacheService.cpp, line 288]
nsCacheService::Observe [d:\builds\seamonkey\mozilla\netwerk\cache\src\nsCacheService.cpp, line 1071]
nsObserverService::Notify [d:\builds\seamonkey\mozilla\xpcom\ds\nsObserverService.cpp, line 238]
NS_ShutdownXPCOM [d:\builds\seamonkey\mozilla\xpcom\build\nsXPComInit.cpp, line 449]
NETSCP6.EXE + 0x11d1 (0x004011d1)
NETSCP6.EXE + 0x2c07 (0x00402c07)
KERNEL32.DLL + 0x1b560 (0xbff8b560)
KERNEL32.DLL + 0x1b412 (0xbff8b412)
KERNEL32.DLL + 0x19dd5 (0xbff89dd5)
Here are some additional stats from my system at the time of the crash:

Operating System: Windows 98 4.10 build 67766446
Service Pack: A
Physical Memory: 128.0 MB

Memory Status: Available / Total
Physical Memory: 3.4 MB / 128.0 MB
Page File: 1737.9 MB / 1920.5 MB
Virtual Memory: 1734.3 MB / 2044.0 MB

We could think about bumping up the SPs in System.ini in the installer, but I think we should figure out a way to get by with the system default settings. 4.x runs for many more hours than the 4 I'm seeing in the current trunk builds.
Any idea what thread got terminated? What else got clobbered when the thread was terminated? Could you continue browsing, or did you only try to shut down? I can add some simple checks (which "should" be unnecessary) in nsCacheService::DeactivateAndClearEntry, but when a thread gets killed it could leave quite a bit of the browser in a bad state that could be difficult to protect against. The real question is what caused the stack overflow. This stack trace may be the twisted pile of hot metal from a train wreck, but what we really want to know is: who put the penny on the tracks?
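For illustration, the kind of "simple checks" mentioned above might look roughly like the sketch below. This is not the nsCacheService code: Entry, ActiveEntries, DeactivateAndClear and ClearActiveEntries here are hypothetical stand-ins that only model a shutdown pass over a table of active entries, skipping slots that are null or already inactive instead of dereferencing them blindly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a cache entry (not the real cache entry class).
struct Entry {
    bool active = true;
    std::string data;
};

// Hypothetical stand-in for the table of active entries.
using ActiveEntries = std::unordered_map<std::string, Entry*>;

// Defensive per-entry step: returns true if the entry was deactivated,
// false if it was skipped because it was null or already inactive.
bool DeactivateAndClear(Entry* entry) {
    if (entry == nullptr) return false;  // empty or damaged slot
    if (!entry->active) return false;    // nothing left to do
    entry->active = false;
    entry->data.clear();
    return true;
}

// Shutdown pass: visit every slot, then empty the table.
// Returns how many entries were actually deactivated.
std::size_t ClearActiveEntries(ActiveEntries& table) {
    std::size_t cleared = 0;
    for (auto& slot : table) {
        if (DeactivateAndClear(slot.second)) ++cleared;
    }
    table.clear();
    assert(table.empty());  // postcondition: no active entries remain
    return cleared;
}
```

Checks like these only catch null or already-cleared slots. They cannot detect a pointer into memory that a killed thread left half-updated, which is why the cause of the stack overflow still matters more than the guard.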
I went looking for other instances of crashes in nsCacheService::DeactivateAndClearEntry on the Trunk. Here is what I came up with. Looks like stephend@netscape.com might be able to reproduce a similar kind of crash. stephend, is that news/foo.com (crash on startup) reproducible?

2001041015 2001-04-11 Windows 98 4.90 build 73010104 5321 nsCacheService::DeactivateAndClearEntry 5f451f1a
2001041109 2001-04-12 Windows 98 4.10 build 67766446 34639 nsCacheService::DeactivateAndClearEntry 4ee1a9bd
2001041306 2001-04-13 Windows NT 5.0 build 2195 57 stephend@netscape.com nsCacheService::DeactivateAndClearEntry 3b4098d1 news/foo.com (crash on startup)
2001041306 2001-04-14 Windows NT 5.0 build 2195 1596 stephend@netscape.com nsCacheService::DeactivateAndClearEntry 3b4098d1 news/foo.com (crash on startup)
2001041322 2001-04-14 Windows 98 4.10 build 67766446 2747 nsCacheService::DeactivateAndClearEntry e1ea8076
2001041322 2001-04-14 Windows 98 4.10 build 67766446 32143 nsCacheService::DeactivateAndClearEntry 46645697
2001041306 2001-04-14 Windows NT 5.0 build 2195 241 stephend@netscape.com nsCacheService::DeactivateAndClearEntry 3b4098d1 news/foo.com (crash on startup)
Yes, 100%, here's how to reproduce it. Enter a news server like news/foo.com into the account wizard for an NNTP account. You will crash when you hit "Finish" on the account. Re-launch Mozilla, and you will crash every time, until you delete that profile...
My first attempt at doing this resulted in the exact same stack trace... Incident ID: 29127931... We will need some folks to run browser buster and try this again to figure out whether they can continue browsing after exhausting the default number of SPs. My guess is that they won't be able to do too much... The system seemed very unstable at that point, and even after I shut down Mozilla I had to hit the power button to get the system back on its feet to do other work.
adding some nspr and xpcom folks to the cc list.
The interesting thing about the crash on startup is that nsCacheService::DeactivateAndClearEntry() is only called when nsCacheService shuts down. Why is it asked to shutdown at startup?
Talkback was down for a good portion of the day Friday, so I think I might have accidentally left that comment "news/foo.com" in the Agent's field. I've already filed the crasher that I mention, it's bug 75976. Sorry for the noise.
After I reproduced the crash stephend mentioned, I was able to relaunch the browser, but it wouldn't browse. After deleting the *old* Cache directory, I was able to relaunch and successfully browse. This is similar to bug 75259, in which sspitzer reports loss of browsing, regained after deleting both Cache and NewCache directories. I already have a bug for removing the old cache from the build (bug 72528). I have not been able to reproduce the crash in nsCacheService::DeactivateAndClearEntry(). Should this bug be about fixing DeactivateAndClearEntry() or about whatever is causing Mozilla to crash after 4 hours?
Let's keep this as the master tracking bug for all problems related to breaking through the 4 hour barrier, and set up other dependency bugs off this one. I'm hosed now too after that last attempt at an endurance session. When I try to start up, my disk just grinds away. If I try to see what is going on with Task Manager, it shows Netscp6.exe not responding. If I kill it off, the disk grinds for about 10 seconds longer and then it goes away. I haven't been able to start up in several attempts. Is there some place I can read about how to remove the 'old' cache? Should this bump up the priority and the need to get the old cache removed before we go out with a milestone release or any big betas? Maybe my plea to get more folks running browser buster on idle machines overnight will help us gather some data that might tell us whether anyone who tries to run a single session for several hours will encounter this problem.
OK, I'm back on my feet and can start up the Friday build again. It turns out that removing the old cache seemingly had no effect:

.\Users50\chofmann>mv cache oldcache

Then I ran

.\Users50\chofmann>mv newcache oldnewcache

and the browser started up fine. I noticed that there were a huge number of files in the newcache directory.
Depends on: 76293
Started browser buster at 4pm on 4/16; it was still running at 7am on 4/17. There were many alert messages on the desktop, repeats of three: www.filez.com and www.stocksite.com could not be found, and connection refused at www.the-park.com. There were also many popup ads. The browser was still moving through sites until I closed it. It crashed on close; relaunch was OK.
I started browser buster last night around 9pm. It's now nearly noon and it's still going. It's on page 611. Nothing really to note except that a couple of sites no longer exist; there were lots of dialogs saying "Couldn't find www.filez.com" or something like that. I used win2000 build 2001041604.
Update: I have the same findings as jelwell. Win32 (NT) build 2001-04-16-12-trunk commercial build. 400mhz system with 256K RAM. Ran overnight and cycled through 504 pages. Still going, but I ended it since I had to get some work done :-) Very good.
Target Milestone: --- → mozilla0.9
Could this be a Win9x thing? Everyone who says they are ok seems to be on Windows NT/2000. It could be some sort of resource leak which will affect win9x more since they are quite limited on some resources.
Using builds 2001-04-18 I ran browser buster overnight on my WinMe system, a Win98 lab system (64mb ram 133mhz), my Mac 8.6 system and my Linux system. When I came in this is what I found:

WinMe and Win98 crashed. The last page for WinMe was url93 and the last page for Win98 was url9. The Talkback stack for both was the same, but not like what was reported in this bug; my Talkback report is listed below.

Linux was not running the app anymore. I'm not sure if it crashed, but the last successfully loaded page was www.barnesandnoble.com/promo/coupon/popups/ftcfreeship_popup.asp and the last line in the console was:
Gdk-ERROR **: BadDrawable (invalid Pixmap or Window parameter) serial 15 error_code 9 request_code 55 minor_code 0

Mac was stuck on www.stocksite.com and frozen. I had to Force Quit Netscape 6 to regain use.

On all systems I was able to relaunch and browse.

Stack for Windows crash:
Call Stack: (Signature = nsHTMLInputElement::SetFocus b65f9f42)
nsHTMLInputElement::SetFocus [d:\builds\seamonkey\mozilla\content\html\content\src\nsHTMLInputElement.cpp, line 689]
nsEventStateManager::PreHandleEvent [d:\builds\seamonkey\mozilla\content\events\src\nsEventStateManager.cpp, line 487]
PresShell::HandleEventInternal [d:\builds\seamonkey\mozilla\layout\html\base\src\nsPresShell.cpp, line 5402]
PresShell::HandleEvent [d:\builds\seamonkey\mozilla\layout\html\base\src\nsPresShell.cpp, line 5334]
nsView::HandleEvent [d:\builds\seamonkey\mozilla\view\src\nsView.cpp, line 377]
nsViewManager::DispatchEvent [d:\builds\seamonkey\mozilla\view\src\nsViewManager.cpp, line 2039]
GlobalWindowImpl::Activate [d:\builds\seamonkey\mozilla\dom\src\base\nsGlobalWindow.cpp, line 2790]
nsWebShellWindow::HandleEvent [d:\builds\seamonkey\mozilla\xpfe\appshell\src\nsWebShellWindow.cpp, line 461]
nsWindow::DispatchEvent [d:\builds\seamonkey\mozilla\widget\src\windows\nsWindow.cpp, line 708]
nsWindow::DispatchWindowEvent [d:\builds\seamonkey\mozilla\widget\src\windows\nsWindow.cpp, line 725]
nsWindow::DispatchFocus [d:\builds\seamonkey\mozilla\widget\src\windows\nsWindow.cpp, line 4213]
nsWindow::ProcessMessage [d:\builds\seamonkey\mozilla\widget\src\windows\nsWindow.cpp, line 3146]
nsWindow::WindowProc [d:\builds\seamonkey\mozilla\widget\src\windows\nsWindow.cpp, line 960]
KERNEL32.DLL + 0x363b (0xbff7363b)
KERNEL32.DLL + 0x242e7 (0xbff942e7)
CC'ing Hyatt, the Windows crash is in the middle of the most recent patch to nsHTMLInputElement.cpp.
Waterson checked in an equivalent patch as bug 76715. We should not be crashing in nsHTMLInputElement::SetFocus anymore.
Adding topcrash keyword and Trunk [@ nsCacheService::DeactivateAndClearEntry] [@ nsHTMLInputElement::SetFocus] to summary for tracking. The nsCacheService::DeactivateAndClearEntry crash last occurred with build 2001041709, so it is no longer happening, but the latest stack trace posted by Esther shows a new stack signature which is currently a topcrash. According to Talkback, the last crash in nsHTMLInputElement::SetFocus happened with build 2001041908, so if it's fixed, someone needs to verify this crash with a newer build. That crash is near the top of the current Talkback topcrash list.
Keywords: topcrash
Summary: browser busting for 4 hours hangs system and then crashes browser → browser busting for 4 hours hangs system and then crashes browser - Trunk [@ nsCacheService::DeactivateAndClearEntry] [@ nsHTMLInputElement::SetFocus]
*** Bug 76554 has been marked as a duplicate of this bug. ***
*** Bug 76577 has been marked as a duplicate of this bug. ***
sr=brendan@mozilla.org, get an r= and drivers will a=. /be
Whiteboard: important to mozilla0.9
r=gagan
Whiteboard: important to mozilla0.9 → important to mozilla0.9 - have patch
a= asa@mozilla.org for checkin to 0.9
Whiteboard: important to mozilla0.9 - have patch → important to mozilla0.9
patch to nsCacheService::DeactivateAndClear() checked in.
I left a win2000 comm build (2001041604) running, and it got to page 1534 before it apparently just went blank. The entire app is still usable, though (I'm posting with the same process now, and checking email).
Reassigning to chofmann. gordon has checked in a fix and is tracking his remaining work in bug 76293 --new cache needs to be removed if browser busting or mozilla is killed... This bug should become a tracking bug for problems we encounter with browser buster. Unsetting Target Milestone since there are no further fixes to be checked in under this particular bug.
Assignee: gordon → chofmann
Target Milestone: mozilla0.9 → ---
Whiteboard: important to mozilla0.9
Blocks: 82033
Haven't seen this for a while. I'll reopen if I do.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Crash Signature: [@ nsCacheService::DeactivateAndClearEntry] [@ nsHTMLInputElement::SetFocus]