Closed Bug 56086 Opened 24 years ago Closed 24 years ago

Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns. M09 topcrash [@ 0x00000000 - nsAppShell::Run()]

Categories

(SeaMonkey :: General, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED DUPLICATE of bug 80345
mozilla0.9.1

People

(Reporter: brodms, Assigned: dr)

References

Details

(Keywords: crash, helpwanted, topcrash)

Crash Data

Attachments

(1 file)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.2.14-SMP i686; en-US; m18) Gecko/20001007
BuildID: 2000100721
Rapid clicking of the Forward and/or Back buttons can cause a crash. It appears that hitting one of these buttons while the last page requested is still being pulled out of cache is the direct cause of the crash. I also tried using the keyboard controls (Alt+Left and Alt+Right) and was also able to cause this crash.
Reproducible: Sometimes
Steps to Reproduce:
1. Start the browser and load a series of large pages.
2. Now rapidly navigate through them using the Forward and Back buttons, or using the keyboard controls.
Actual Results: The browser crashed.
Expected Results: Merrily loaded the pages requested.
Here are several Talkbacks:
TB18903607H: This crash using keyboard navigation.
TB18867835K, TB18717846H: This crash using the Forward and Back buttons.
I tried and failed to reproduce this on Win98, but I only have 32MB of RAM on that machine, so I really can't do anything "rapidly".
Shortening lengthy summary.
Summary: Rapidly, repeatedly hitting Forward and Back buttons crashes browser. → Rapidly hitting Forward & Back buttons crashes browser.
Adding crash keyword.
Keywords: crash
May be a Linux-specific bug. WFM Win2k.
I'm confirming, but there must be a duplicate somewhere, because I've seen this behavior consistently for over a year with moz/linux. I never bothered to file it because I've seen it so consistently and for so long that I figured it must be in the system somewhere. I'll attempt to do another search for it, but in the meantime this is important. If I can't find a dup, I'll add more observations here later.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I've been looking for something that this duplicates, but I can't find anything, and all the IRC channels are useless at this hour of the morning. Anyway, I'll add my observations:
* This has occurred for at least 14 months in the Linux builds.
* The crash occurs after you stop hitting the buttons, not while you are still hitting the buttons (somehow, this keeps moz alive).
* I believe that this is SMP-only; I don't have an SMP machine to test it on to be sure, but at least one friend (via IRC) reports stability under the same conditions.
Anyway, those are my two cents.
I can reproduce it on a non-SMP machine (linux 2000100908)
I'm on a non-SMP Linux machine, and I've been observing this behavior for at least a month (probably longer). I held off on reporting it for the same reason: I figured it was already in Bugzilla somewhere. But I can't imagine it being in any component other than History or possibly Keyboard Nav, and I've rifled through all of those.
History and Keybd Nav are relatively new components. A truly old bug of this nature would be in XP Apps or Browser-General. What's SMP?
SMP == Symmetric Multi-Processing. Two chips in one machine :) Tends to magnify the effect of any threading bugs (which made moz unusable on SMP for quite some time.) However, if this is verified from elsewhere it seems that isn't the problem.
nav triage team: beta stopper. we have to fix crashers.
Keywords: nsbeta1
Priority: P3 → P1
Bug 59831 might be a duplicate of this.
I've just had a very similar-seeming crash for build 2000123121 on Solaris (on a single-processor machine). I pressed back once, got the start of rendering of the page it was backing to, pressed back again immediately and got an immediate SEGV. Time between the two presses of back was ~1s. Sorry, no stack trace available, because the Sun Workshop debugger crashes when loading the core file.
*** Bug 59831 has been marked as a duplicate of this bug. ***
nav triage team: Marking nsbeta1+
Whiteboard: nsbeta1+
Target Milestone: --- → mozilla0.9
I couldn't reproduce this on my RH 6.2 (no SMP) 266 MHZ 96MB machine. It doesn't crash on NT either. Has anyone encountered this recently? A stack trace would be more useful.
I can still reproduce this without much difficulty on my SuSE 6.4 machine, a P3-733 w/ 256MB of RAM. I'll supply a Talkback ID once I can get Talkback to start working again.
Ditto on the reproducibility of this with the latest builds. Is there a talkback build for Linux right now? I haven't seen one marked like that in the ftp for a while; am I missing something? I've been out of the QA loop here for quite some time :| If someone will point me to it, I'll take a stab at it...
I would love to look into this, if I had a reproducible case or a stack trace. Claudius, can you try to reproduce this? The crash may be occurring upon hitting back and forward. But unless I see a stack trace, I wouldn't know exactly where it is crashing.
Status: NEW → ASSIGNED
OK, I got a Talkback build and reproduced the crash on Linux. Talkback ID is: TB26074452Z
cc'ing pavlov, akkana and mcafee, in case they are aware of something similar. The crash seems to happen as below on linux/solaris when rapidly clicking the back and forward buttons or using keyboard shortcuts. From the talkback report ID TB26074452Z, here's what the stack trace looks like. I couldn't reproduce it, but will definitely try again.
0x08dc44d1
libglib-1.2.so.0 + 0x11cbf (0x409cecbf)
libglib-1.2.so.0 + 0x10bd6 (0x409cdbd6)
libglib-1.2.so.0 + 0x11203 (0x409ce203)
libglib-1.2.so.0 + 0x113cc (0x409ce3cc)
libgtk-1.2.so.0 + 0x9300c (0x408ed00c)
nsAppShell::Run()
nsAppShellService::Run()
main1()
main()
libc.so.6 + 0x18a5e (0x40247a5e)
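A note on reading the Talkback frames above: each library frame is printed as module + offset (absolute address), so subtracting the offset from the absolute address gives the base at which the library was loaded. All four libglib frames in this trace give the same base, 0x409bd000, which is a cheap sanity check that they belong to one mapping of the library. The sketch below shows only that arithmetic; the Frame struct and the function names are made up for illustration and are not part of Talkback.

```cpp
#include <cassert>
#include <cstdint>

// One Talkback frame of the form "module + offset (absolute)".
// Hypothetical helper type, used only for this illustration.
struct Frame {
    std::uint32_t offset;    // offset inside the module
    std::uint32_t absolute;  // address recorded at crash time
};

// Load base implied by a frame: absolute address minus module offset.
constexpr std::uint32_t load_base(Frame f) {
    return f.absolute - f.offset;
}

// True when two frames imply the same load base, i.e. they are consistent
// with coming from the same mapping of the same library.
constexpr bool same_mapping(Frame a, Frame b) {
    return load_base(a) == load_base(b);
}
```

Comparing offsets instead of absolute addresses is also what makes two reports from different machines comparable, since the load base can differ from run to run.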
What version of gtk is on the machine that crashes?
The machine that crashes is running GTK 1.2.
my stock rh62 machine has gtk 1.2.6, presumably that is the version radha has.
To be more precise, this machine is running GTK 1.2.7.
Giving to the toolkit team. I could not reproduce it in my machine. Stack has no SH code in it.
Assignee: radha → trudelle
Status: ASSIGNED → NEW
Component: History: Session → XP Toolkit/Widgets
this bug needs to go see the Dr!
Assignee: trudelle → dr
I can't reproduce this on my machine (running RH7, gtk+-1.2.8). I have a feeling this may have been a bug in 1.2.7, or may have been fixed by some other checkin (or I just can't click fast enough). Can anybody still repro this with the latest mozilla builds? If so, what versions of gtk+, etc. do you have?
Keywords: qawanted
updating qa to john the toolkit master morrison, accepting...
Status: NEW → ASSIGNED
QA Contact: claudius → jrgm
Yeah, I can crash rh6.1 with
> glib-config --version
1.2.5
> gtk-config --version
1.2.5
>
I clicked about 10 links in a row, then proceeded to cycle up and down the list getting mozilla churning pretty hard, and eventually it crashed with a stack similar to the one above.
I can't, for the life of me, reproduce this bug, either on my machine (Redhat 7) or kandrot's (SuSE). We both use gtk+ 1.2.8, so I'm tempted to say it's a gtk bug, but there's still the possibility that it's a timing bug. Both my machine and kandrot's are fast (700MHz)... jrgm, or brodms, or anybody who sees this bug: Would it be possible for you to temporarily upgrade your gtk+ version to 1.2.8, since you're experiencing the bug currently, and see if upgrading fixes it?
I will upgrade to GTK 1.2.8 this weekend...
I can't duplicate the bug with gtk 1.2.9 + 2001022808, and I've seen it (with prior versions of gtk) since literally the dawn of time. Probably is a gtk bug, then.
Okay, sounds good to me. Thanks for all your help! I'll resolve this as INVALID, since it's not our fault, and see if I can get our "official gtk version," if there is such a thing, bumped up to 1.2.8.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
*** Bug 71754 has been marked as a duplicate of this bug. ***
guys, guys,... what are we saying here? are we saying the equivalent of, "oh, we crash only on win95 so let's not support that platform????" we need a bit of digging deeper. this is a listing of all the 'new' user comments in the m0.8 talkback data centered around hitting the back button... it's about the 4th ranked crasher on the 0.8 milestone. looks like it's happening on NT 4 and 5 as well but at a lower frequency, and there could be multiple things going on to cause the crashes..
Crash Analysis M08 Build - Related to back back related user comments ====== 180
- 0x00000000 40413ba6 BBID: 27437231 sec. http://abc.net.au/news/ Hit back button running on Windows NT 5.0 build 2195 src_file line
- 0x00000000 b582554e BBID: 27100526 sec. http://pages.ebay.com/search/items/search.html started browser. went to www.ebay.com. clicked search link. it took me to http://pages.ebay.com/search/items/search.html. page loaded. text disappeared. hit back. access violation. running on Windows NT 5.0 build 2195 src_file line
- 0x000a000e bbf969aa BBID: 27399893 sec. http://www.dealtime.de/ bookmark window: dragged last (or 2nd from end) bookmark at window top (window scrolled authomatically backwards) and dropped bookmark: boom running on Windows NT 5.0 build 2195 src_file line
- 0x285ef9b1 f24deba2 BBID: 27457418 sec. http://www.linuxtoday.com Looking at linuxtoday.com. Clicked a new. Then back and once the page started to show drag & drop the URL to the bookmarks sidebar into the personal toolbar folder. and ... that's all. running on Windows 95 4.0 build 67306684 src_file line
- 0x85fc488d 68911668 BBID: 26800842 sec. www.jabber.com Opened up the page. The page has a flash animation on it. Went back one to the About screen. Then tried to go forward one. Boom. running on Windows 95 4.0 build 67109975 src_file line
- 0xffffffff e20458a5 BBID: 26990693 sec. www.email.cz PI pressed back I was asked to confirm form data resubmission but before I preswsed back for three items (using an arrow) and than I canaceled the resubmission. It crashed Mozilla. running on Windows NT 4.0 build 1381 src_file line
- JS_GetPrivate 3d26a039 BBID: 27095897 1859 sec. hit the back button running on Windows NT 5.0 build 2195 src_file d:/builds/0.8/mozilla/js/src/jsapi.c line 1859
- KERNEL32.DLL + 0xb9a6 (0xbff7b9a6) 41ee399f BBID: 27485256 sec. I pushed "back" button when there was no connection to Internet running on Windows 98 4.10 build 67766446 src_file line
- MSVCRT.dll + 0xd157 (0x7800d157) 438e00dc BBID: 27350397 sec. www.slashdot.org Pressing the back button running on Windows NT 4.0 build 1381 src_file line
- gklayout.dll + 0x1d311 (0x602fd311) cdf45b3c BBID: 27475804 sec. crash hitting back button. running on Windows NT 4.0 build 1381 src_file line
- gklayout.dll + 0x1d311 (0x602fd311) dc56f1c9 BBID: 27471174 sec. Clicking the "back" button. running on Windows NT 5.0 build 2195 src_file line
- gklayout.dll + 0x1d311 (0x602fd311) fd79750e BBID: 27451767 sec. http://freshmeat.net going back on real networks website... running on Windows NT 4.0 build 1381 src_file line
- js_MarkGCThing 9303e52f BBID: 27570241 788 sec. www.kmeleon.org Went back from the results of the poll running on Windows 98 4.10 build 67766222 src_file d:/builds/seamonkey/mozilla/js/src/jsgc.c line 788
- js_MarkGCThing 9d684a24 BBID: 27087093 787 sec. http://www.tivo.com Hit the back arrow to return to another page from the TiVo page with flash animation. running on Windows NT 5.0 build 2195 src_file d:/builds/0.8/mozilla/js/src/jsgc.c line 787
- 0x00000000 106cc52d BBID: 26854241 sec. www.soureforge.net pressing the back button from quanta.sourceforge.net running on Linux 2.4.2 src_file line
- 0x00000000 1acbbeaa BBID: 27046596 sec. back button from a .txt page to an html one in Navigator running on Linux 2.4.1 src_file line
- 0x00000000 20e4ef56 BBID: 27194517 sec. google.com Hitting the back button of a search. running on Linux 2.2.5-15 src_file line
- 0x00000000 23760159 BBID: 27478561 sec. www.uswestdex.com hit the back browser button several times in rapid succession the browser just disappeared. running on Linux 2.2.14 src_file line
- 0x00000000 37a7322b BBID: 26868721 sec. Crashed going backwards in session history. running on Linux 2.2.17-14 src_file line
- 0x00000000 450e352b BBID: 26927337 sec. http://www.videolan.org clicking the back button running on Linux 2.2.19pre14ext3 src_file line
- 0x00000000 450e352b BBID: 26932965 sec. http://www.debian.org clicking the back button running on Linux 2.2.19pre14ext3 src_file line
- 0x00000000 450e352b BBID: 27271231 sec. http://www.theregister.co.uk hitting the back button running on Linux 2.2.19pre14ext3 src_file line
- 0x00000000 46e7139b BBID: 27479277 sec. internal bugzilla URL Looking at our internal bugzilla site... hit the back button after an updating bug. running on Linux 2.2.14-5.0 src_file line
- 0x00000000 57131f8d BBID: 26961269 sec. using the back button running on Linux 2.2.14-5.0 src_file line
- 0x00000000 5842f7b4 BBID: 27125813 sec. http://sourceforge.net/ backing out of project documentation page running on Linux 2.4.2 src_file line
- 0x00000000 6e0e5358 BBID: 26905967 sec. http://www.autoweb.com/buy.htm Pressed the back button while the page was loading and it puked. running on Linux 2.2.17-21mdk src_file line
- 0x00000000 7376c723 BBID: 27067826 sec. rapidly hitting the back button running on Linux 2.2.17 src_file line
- 0x00000000 83b498b0 BBID: 27286119 sec. www.heise.de clicking back-Button twice in a short period of time running on Linux 2.2.14 src_file line
- 0x00000000 9748fd05 BBID: 26925015 sec. http://www.ashford.com Hit the back button. running on Linux 2.2.17-21mdk src_file line
- 0x00000000 aa6dd355 BBID: 26798488 sec. http://www.suzukicycles.com looking at the GZ250 and clicking the back button running on Linux 2.2.19pre9ext3 src_file line
- 0x00000000 ba4f8b23 BBID: 27643556 sec. back on suse portal running on Linux 2.2.16 src_file line
- 0x00000000 c7254653 BBID: 27517524 sec. Mozilla sometimes(NOT ALWAYS) crashing when pressing back button.... This bug has been here very long infact this bug exist both in netscape 4 and mozilla....If you going back in the history cache by clicking the Back button icon the browser will sometimes crash :-/ running on Linux 2.4.2 src_file line
- 0x00000000 d7d8a14a BBID: 27605557 sec. Just using the back button... running on Linux 2.2.16-22 src_file line
- 0x00000000 e703a581 BBID: 26985481 sec. Pressing the back button multiple times very quickly. running on Linux 2.2.18 src_file line
- 0x00000010 d56be25e BBID: 27538994 sec. www.twoview.co.uk Hit the back buton running on Linux 2.2.18 src_file line
- 0x00000011 83df530c BBID: 26991582 sec. www.ebay.com I had hit the back button several times and then it crashed. running on Linux 2.2.16-22 src_file line
- 0x00000020 5a1823e5 BBID: 27468991 sec. 7 I pressed the back button while it was rendering a page after I pressed the back button again. OK that's a bad explanation... I pressed back twice in a row and it crashed after the second one. running on Linux 2.2.13 src_file line
- 0x000000e8 9c53460a BBID: 26879294 sec. I closed mozilla 0.8 (installed from binaries for linux i386) when the talkback window appeared. running on Linux 2.2.18 src_file line
- 0x00420005 71b022e2 BBID: 26825828 sec. Running for a long time (3 days?). Played with some forms on http://www.dabs.com/ a bit and hit "back" several times quickly. Boom :) running on Linux 2.2.17-21mdk src_file line
- 0x0066006f dbe5c47b BBID: 26914748 sec. http://www.avsforum.com (unknown subpage URL) Hit back button. running on Linux 2.4.2 src_file line
- 0x006d006f ceeac9d4 BBID: 26789210 sec. netscape.com multiple clicking on back button without waiting for the page to load and display. running on Linux 2.4.1 src_file line
- 0x087d5ad1 8de7e14d BBID: 27045613 sec. I was starting a download and at the same time I pressed the back button. running on Linux 2.4.1 src_file line
- 0x08820556 9e8a28d2 BBID: 27290290 sec. http://www.unibank.dk I don't know what was the cause of the crash. I was pressing back when it happened. running on Linux 2.2.16-3 src_file line
- 0x088f6d5c 957aba2e BBID: 27091229 sec. www.openmotif.org Pressing the back button while two downloads were active running on Linux 2.2.17-8 src_file line
- 0x0918526e d28d54ae BBID: 27060275 sec. unknown Going back to a previously viewed page. running on Linux 2.4.2 src_file line
- 0x0921d09a 29034a10 BBID: 27016232 sec. http://www.slashdot.org/ Setting the background colour to white. running on Linux 2.2.5-15 src_file line
- 0x11d1ad6a 56e7d1b0 BBID: 27135286 sec. www.mozillazine.org I was pressing the back button (going back to www.mozilla.org) running on Linux 2.2.16-storm src_file line
- 0x40367470 eec7e192 BBID: 27000381 sec. http://bugzilla.mozilla.org Pressed the 'back' button twice... running on Linux 2.2.17-0.24mdk src_file line
- 0x40b00061 cffe47e4 BBID: 26813289 sec. clicking on the "back" button several times in a row running on Linux 2.2.18 src_file line
- 0x40ea48c5 caea6f0a BBID: 27069090 sec. monster.com hit back button expected it to ask if I wanted to repost form data. running on Linux 2.4.1 src_file line
- 0x40ed72c2 8f47de5b BBID: 26775771 sec. (internal web site) hit back button going through some pages generated from cgi-bin scripts running on Linux 2.2.16-3 src_file line
- 0x74163a83 fc0bc409 BBID: 26944028 sec. changing the default background colors in Preferences. running on Linux 2.2.18 src_file line
- libc.so.6 + 0x52502 (0x4028d502) 6ad9d819 BBID: 27297719 sec. http://www.compaq.it/presario/prodotti/portatili/14_xl453/caratteristiche.html I pressed the "back" button running on Linux 2.2.15-4mdk src_file line
- libc.so.6 + 0x76917 (0x402aa917) 5ef14161 BBID: 26743599 sec. www.zive.cz I pressed the 'back' button to get out of the list of comments under the article. running on Linux 2.4.1-ac12 src_file line
- libc.so.6 + 0x79618 (0x402ad618) 1590429a BBID: 27413715 sec. http://www.vivo.com/ I clicked on the back button 4 or 5 times rapidly. running on Linux 2.2.17 src_file line
- libc.so.6 + 0xa6af4 (0x402d2af4) 414b13fe BBID: 26998647 sec. seems like a feedback agent failure - the browser seems to have started up ok but i'm getting the feedback agent anyway. running on Linux 2.2.16-3 src_file line
- libc.so.6 + 0xb2c82 (0x402dcc82) d5f1d816 BBID: 27124458 sec. http://www.cam.ac.uk/societies/boatclub/ I think I clicked "back". (from inside some frames thing for which the URL seems to be hidden) running on Linux 2.2.16-4.cl.ext3 src_file line
- libc.so.6 + 0xb2c82 (0x402dfc82) eb44f290 BBID: 26832204 sec. http://www.gmu.edu/departments/economics/bcaplan/anarfaq.htm Hit the 'back' button running on Linux 2.2.14-5.0smp src_file line
- libc.so.6 + 0xdea32 (0x40312a32) 3e1af77d BBID: 26936545 sec. www.worldonline.dk preesed the back button running on Linux 2.2.16-22 src_file line
- libjavaplugin_oji.so + 0x1b416 (0x40f3d416) fed3aa4a BBID: 27701830 sec. http://www.echl.com this time I was going _back_ to that page... this is really annoying. running on Linux 2.2.17-21mdk src_file line
- nsCachedNetData::Release 3a830548 BBID: 27660458 280 sec. hit back button. running on Windows NT 4.0 build 1381 src_file d:/builds/seamonkey/mozilla/netwerk/cache/mgr/nsCachedNetData.cpp line 280
- nsCachedNetData::Release 72ce5225 BBID: 27046673 279 sec. www.dagbladet.no pressed `back' stack trace:NECKO! 6065ce8a()XPCOM! 60ca7630()NECKO! 606723e2()NECKO! 6064cc30()NECKO! 6064cb76()778b0c24() running on Windows NT 5.0 build 2195 src_file d:/builds/0.8/mozilla/netwerk/cache/mgr/nsCachedNetData.cpp line 279
- nsCachedNetData::Release 942e6766 BBID: 27156536 280 sec. http://sinfest.net/d/20010228.html used back context menu to go from here to /20010301.html and kaboom! this was first session with todays trunk build don't know if this is common..... running on Windows NT 5.0 build 2195 src_file d:/builds/seamonkey/mozilla/netwerk/cache/mgr/nsCachedNetData.cpp line 280
- nsCachedNetData::Release() 65c9d6c3 BBID: 27614351 sec. Hit the back button running on Linux 2.2.17-21mdk src_file line
- nsCachedNetData::Release() 72cb148b BBID: 27144294 sec. clicked the "back" button running on Linux 2.2.17-21mdk src_file line
- nsCachedNetData::Release() 9a4d1ef3 BBID: 27325423 sec. hitting back button running on Linux 2.2.16-22 src_file line
- nsEventStateManager::UpdateCursor f07682c2 BBID: 27293569 1427 sec. Slashdot going back several pages in history running on Windows 98 4.10 build 67766446 src_file d:/builds/0.8/mozilla/layout/events/src/nsEventStateManager.cpp line 1427
- nsGenericDOMDataNode::IsOnlyWhitespace() 7f799368 BBID: 27478253 sec. http://www.ibm.com/ using the back button to get back to google. A bunch of javascript windows had just popped up from ibm's site. running on Linux 2.4.1 src_file line
- nsHTTPPipelinedRequest::OnStopRequest() 719aafca BBID: 27660320 sec. http://www.avweb.com back one page running on Linux 2.4.1 src_file line
- nsHTTPPipelinedRequest::WriteRequest() 48ae0d42 BBID: 27587200 sec. http://freshmeat.net/projects/lyx i go back from the lyx homepage to the freshmeat lyx project page and crash ! running on Linux 2.4.2 src_file line
- nsSupportsArray::Clear 26cd3ae5 BBID: 26803835 320 sec. Hit back button. (can't remember URL) running on Windows NT 4.0 build 1381 src_file d:/builds/0.8/mozilla/xpcom/ds/nsSupportsArray.cpp line 320
- nsWindow::UpdateIdle() 593da47a BBID: 27310397 sec. Dilbert cartoon page (I don't know the URL; my browser has just crashed :) I was hitting the back-button repeatedly. I've noticed that the Dilbert cartoon page doesn't want to go to the page that I was viewing before it (don't know why yet) and I was just being obnoxious about it...Despite this little thing running on Linux 2.2.12-32 src_file line
- nsWindow::UpdateIdle() d367fe61 BBID: 26830914 sec. hitting the back button in my browser running on Linux 2.2.17 src_file line
Status: RESOLVED → REOPENED
Keywords: topcrash
Resolution: INVALID → ---
jpatel, can you dig out some of the NT stack traces when the talkback server comes back up?
Summary: Rapidly hitting Forward & Back buttons crashes browser. → M08 Rapidly Forward&Back crashes browser.[@ 0x00000000 - nsAppShell::Run]
I figured out what was going on with Linux, but I haven't got any clue for the Windows problems. I think they're probably unrelated, given that upgrading GTK solved the problem on Linux. You might try hunting around for talkback crashes relating to forward or back buttons located in /widget/src/windows... That'd be my best guess. Reassigning to choffman.
Assignee: dr → chofmann
Status: REOPENED → NEW
OS: Linux → Windows NT
I agree with Dan; from the talkback data, it looks like this particular crash (0x00000000) might be a little different from the windows crashes. Talkback is having problems, so I haven't had a chance to look for a windows stack trace, but I will post it here as soon as I can so that we can look at it before deciding whether to log a new bug for the windows crash or not.
Okay, I just reproduced this three times on an 800MHz Redhat7.0 machine (glib/gtk 1.2.8). This is a linux-only bug (it's deep into gtk/glib), and a separate bug should be filed for windows crashing on this user action. There may be nothing that can be done, or perhaps there is a way to defend against it, or gee, find the bug in glib and fix it :-] Dan, if you need help to reproduce the crash, I'll come show you how to get medieval on the browser buttons.
Assignee: chofmann → dr
Keywords: mozilla1.0
OS: Windows NT → Linux
Whiteboard: nsbeta1+ → linux crash in glib/gtk
Status: NEW → ASSIGNED
Keywords: helpwanted
Summary: M08 Rapidly Forward&Back crashes browser.[@ 0x00000000 - nsAppShell::Run] → Rapidly hitting forward & back crashes in gtk/glib code
Whiteboard: linux crash in glib/gtk
Man! Ok, accepting... If we're crashing in gtk/glib code and it's not their fault, then there are a few possibilities. We're very likely passing null to something, but what could be uninitialized? Could be the content area maybe? Hmm... Reopening 71754 for chofmann's windows bug also...
Keywords: topcrash
*** Bug 71754 has been marked as a duplicate of this bug. ***
If you don't want to own this bug, then find the right owner.
Keywords: topcrash
Summary: Rapidly hitting forward & back crashes in gtk/glib code → Rapidly hitting forward & back crashes in gtk/glib code [@ 0x00000000 - nsAppShell::Run]
Crashes in gtk code will happen. Yeah it's not our code, but the reality of software is we need to do our best to avoid crashing in other libraries as well as our own.
Yeah, apologies for all the confusion. If I had taken a second to read the other bug, I would have noticed that it was also linux-only all along. Sorry about that. This isn't a toolkit bug, so setting to Browser-General. John Morrison gave some good advice, and I'm still on this.
Component: XP Toolkit/Widgets → Browser-General
Keywords: qawanted
I know little to nothing about the priorities of the folks who are responsible for this bug, but given that much of the linux world will be (most likely) upgrading to newer versions of GTK in the reasonably near future (debian unstable already has 'safe' gtk packages, and RH 7.1 will cover most corporate installations) this probably shouldn't be "critical" anymore. It would still be ideal to fix the problem eventually but I'm sure there must be other problems that moz's time can be spent on.
liv: we realized the problem unfortunately still exists with the newer versions of gtk. it seems like fixing this bug will be a matter of compiling moz against a debug build of gtk/glib :(
I'm going to guess that this is probably not a gtk bug but is actually something that we are doing that's bad. I need a better stack trace here since I've never seen this. Brendan, weren't you saying that you saw something similar to this in a recent build?
Adding M08 to summary for tracking, since this crash was reported after looking at M08 talkback data.
Summary: Rapidly hitting forward & back crashes in gtk/glib code [@ 0x00000000 - nsAppShell::Run] → Rapidly hitting forward & back crashes in gtk/glib code M08 crash [@ 0x00000000 - nsAppShell::Run]
Yes, I recently saw a crash jumping through 0 from nsWindow::UpdateIdle. Of course, due to crappy gnome-terminal vs. gdb vs. ??? clipboard behavior, I couldn't copy the stack. It was shallow. /be
Adding nscatfood nomination from dup bug 71754 and moving over the crash car.
Keywords: nsCatFood
Dan: how recent a version of gtk was this seen on? I've been able to reliably reproduce this bug for nearly 22 months now (since June-ish 1999) and am now completely unable to reproduce the behavior with a Gnome pre-1.4 gtk (1.2.9). Are you sure the new manifestation has the same roots as the old?
This was reproduced on glib/gtk 1.2.8, Redhat7.0 on a 800MHz/128MB pc, as noted earlier. I apologize for not having upgraded. ;-]
Adding qawanted.
Keywords: qawanted
Keywords: qawanted
->moz0.9.1, per triage
Target Milestone: mozilla0.9 → mozilla0.9.1
re: email from choffman about topcrashers I'm slightly doomed with embedding 0.9 work right now, plus I've been wrestling with gdb in a feeble attempt to get it to tell me anything. If you want me to bother pavlov to look at this in purify on solaris, I have to bounce that off of trudelle -- please ask him. Otherwise, this is P1 for 0.9.1.
cc'ing arik, who may be able to get to this sooner.
Blocks: 77421
Finally got a workable gdb arrangement, no thanks to 57051. Anyway, with a debug gtk and glib build, I get the following stack trace:
#0 0x00000000 in ?? ()
#1 0x4086eb25 in nsWindow::UpdateIdle (data=0x0) at nsWindow.cpp:617
#2 0x40a25214 in g_idle_dispatch (source_data=0x4086eabc, dispatch_time=0xbffff510, user_data=0x0) at gmain.c:1367
#3 0x40a24271 in g_main_dispatch (dispatch_time=0xbffff510) at gmain.c:656
#4 0x40a2487d in g_main_iterate (block=1, dispatch=1) at gmain.c:877
#5 0x40a24a0c in g_main_run (loop=0x81f8298) at gmain.c:935
#6 0x4093e83f in gtk_main () at gtkmain.c:524
#7 0x40854af5 in nsAppShell::Run (this=0x80a9818) at nsAppShell.cpp:360
#8 0x407eba09 in nsAppShellService::Run (this=0x80b3520) at nsAppShellService.cpp:407
#9 0x08054007 in main1 (argc=1, argv=0xbffff784, nativeApp=0x0) at nsAppRunner.cpp:1005
#10 0x08054ccf in main (argc=1, argv=0xbffff784) at nsAppRunner.cpp:1306
#11 0x402f4f31 in __libc_start_main (main=0x8054acc <main>, argc=1, ubp_av=0xbffff784, init=0x804f060 <_init>, fini=0x805e444 <_fini>, rtld_fini=0x4000e274 <_dl_fini>, stack_end=0xbffff77c) at ../sysdeps/generic/libc-start.c:129
The source at nsWindow.cpp#617 is window->Update():
gboolean
nsWindow::UpdateIdle (gpointer data)
{
  GSList *old_queue = update_queue;
  GSList *tmp_list = old_queue;
  update_idle = 0;
  update_queue = nsnull;
  while (tmp_list)
    {
      nsWindow *window = (nsWindow *)tmp_list->data;
      window->mIsUpdating = PR_FALSE;
      window->Update();
      tmp_list = tmp_list->next;
    }
  g_slist_free (old_queue);
  return PR_FALSE;
}
The fact that we set mIsUpdating to false immediately before we go and update seems strange to me. The data argument passed here is null, but I think that's a red herring (iirc, that's just the way all gtk signals/slots with function pointers work). Hmm...
brendan: So I can confirm your experience jumping through 0 in UpdateIdle. |this| is non-null, |window| is non-null and |data| is null but uninteresting, but gdb gives me a whole bunch of member variables of |window| that are null and interesting. The ones that look awfully suspicious are:
(nsBaseWidget)
- mClientData
- mContext
- mAppShell
- mToolkit
- mChildren
(nsWidget)
- mWidget (!!!)
- mParent
(nsWindow)
- mShell
- mSuperWin
There are others, but these strike more fear into my heart, especially mWidget. Thing is, the obvious assumption would be "gee, I ought to be croaking on a line where I'm trying to dereference a null pointer," but the line in question, |window->Update()|, and the entire function body around it, is clean of such problems as far as I can tell. The |Update| function's first lines are |if (!mSuperWin) return NS_OK;|, which seem a little odd, but should return happily. The only weirdness, as far as I can see, is that we set our updating flag to false right as we go update. blizzard, brendan, any idea?
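One reading of the gdb output above, offered as an assumption and not something this thread confirms: a window whose members are all null looks like an object that was already torn down while its pointer was still sitting in the update queue. The sketch below is a minimal, self-contained model of the defensive pattern for that situation, where a window removes itself from the pending queue when it is destroyed. Window, pending() and flush_updates() are hypothetical stand-ins, not the nsWindow code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class Window;  // hypothetical stand-in for a widget, not nsWindow

// Pending-update queue shared by all windows (plays the role of update_queue).
std::vector<Window*>& pending() {
    static std::vector<Window*> q;
    return q;
}

class Window {
public:
    // Never leave a dangling pointer behind in the shared queue.
    ~Window() { unqueue(); }

    void queue_update() {
        if (!queued_) {
            pending().push_back(this);
            queued_ = true;
        }
    }

    void unqueue() {
        auto& q = pending();
        q.erase(std::remove(q.begin(), q.end(), this), q.end());
        queued_ = false;
    }

    void update() { ++updates_; }
    int updates() const { return updates_; }

private:
    friend int flush_updates();
    bool queued_ = false;
    int updates_ = 0;
};

// Idle-time flush: detach the queue first, then update each window in it.
// Returns how many windows were updated.
int flush_updates() {
    std::vector<Window*> batch;
    batch.swap(pending());
    for (Window* w : batch) {
        w->queued_ = false;
        w->update();
    }
    return static_cast<int>(batch.size());
}
```

The sketch does not cover a window being destroyed from inside another window's update() during the flush, since the detached batch would still hold its pointer; a real fix would have to handle that case as well.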
Fixing summary, keywords (this isn't a topcrash, btw, the talkback info chofmann posted is mostly win32).
Severity: critical → major
Keywords: topcrash
Summary: Rapidly hitting forward & back crashes in gtk/glib code M08 crash [@ 0x00000000 - nsAppShell::Run] → Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns.
It's probably entirely uninteresting, but with this patch, my stack trace decides to leave off UpdateIdle and instead claims to die in g_idle_dispatch. This all makes very minimal sense to me.
cvs server: Diffing .
Index: nsWindow.cpp
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsWindow.cpp,v
retrieving revision 1.328
diff -u -r1.328 nsWindow.cpp
--- nsWindow.cpp 2001/04/17 23:41:32 1.328
+++ nsWindow.cpp 2001/05/01 01:44:35
@@ -613,8 +613,9 @@
     {
       nsWindow *window = (nsWindow *)tmp_list->data;
 
-      window->mIsUpdating = PR_FALSE;
+      window->mIsUpdating = PR_TRUE;
       window->Update();
+      window->mIsUpdating = PR_FALSE;
 
       tmp_list = tmp_list->next;
     }
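The patch above treats mIsUpdating as an "update in progress" flag, set before Update() and cleared after it. A minimal sketch of that idea written as a scope guard follows, so that the flag is cleared on every exit path and a re-entrant call can be detected and skipped. Widget, UpdatingGuard and the reenter_depth test hook are hypothetical; whether mIsUpdating is meant to carry this meaning in nsWindow is an assumption.

```cpp
#include <cassert>

// Hypothetical widget with an "update in progress" flag (cf. mIsUpdating).
struct Widget {
    bool is_updating = false;
    int paints = 0;         // completed update passes
    int skipped = 0;        // re-entrant calls that were ignored
    int reenter_depth = 0;  // test hook: >0 makes update() call itself again

    void update();
};

// Sets the flag for the lifetime of the guard and always clears it on exit.
class UpdatingGuard {
public:
    explicit UpdatingGuard(bool& flag) : flag_(flag) { flag_ = true; }
    ~UpdatingGuard() { flag_ = false; }
    UpdatingGuard(const UpdatingGuard&) = delete;
    UpdatingGuard& operator=(const UpdatingGuard&) = delete;

private:
    bool& flag_;
};

void Widget::update() {
    if (is_updating) {  // re-entrant call while an update is running
        ++skipped;
        return;
    }
    UpdatingGuard guard(is_updating);
    ++paints;
    if (reenter_depth > 0) {  // simulate an update that triggers another one
        --reenter_depth;
        update();
    }
}
```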
*** Bug 73418 has been marked as a duplicate of this bug. ***
Here's another patch (suggested by pavlov) which also *doesn't* work:
Index: nsWindow.cpp
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsWindow.cpp,v
retrieving revision 1.328
diff -u -r1.328 nsWindow.cpp
--- nsWindow.cpp 2001/04/17 23:41:32 1.328
+++ nsWindow.cpp 2001/05/03 01:18:15
@@ -216,6 +216,9 @@
   UnqueueDraw();
 }
 
+static GSList *update_queue = NULL;
+static guint update_idle = 0;
+
 NS_IMETHODIMP nsWindow::Destroy(void)
 {
   // remove our pointer from the object so that we event handlers don't send us events
@@ -228,6 +231,11 @@
   if (mMozArea)
     gtk_object_remove_data(GTK_OBJECT(mMozArea), "nsWindow");
 
+  // remove our idle function
+  gboolean rv(g_idle_remove_by_data(NULL));
+  if (TRUE == rv)
+    update_idle = 0;
+
   return nsWidget::Destroy();
 }
 
@@ -596,9 +604,6 @@
 // all the updates in one shot. Actually, this queue should
 // be at most per-toplevel. FIXME.
 //
-
-static GSList *update_queue = NULL;
-static guint update_idle = 0;
 
 gboolean
 nsWindow::UpdateIdle (gpointer data)
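For reference, g_idle_remove_by_data() removes an idle function by the data pointer it was registered with and reports whether it found one. The gdb stack trace earlier in this bug shows UpdateIdle being dispatched with data=0x0, so the NULL in this patch matches the one shared callback and not any particular window. The toy table below models only that lookup behaviour; IdleTable and its methods are hypothetical and are not how GLib is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of an idle-callback table; not GLib's implementation.
// A callback returns false to be dropped after it has run once.
using IdleFn = bool (*)(void* data);

struct IdleEntry {
    IdleFn fn;
    void* data;
};

class IdleTable {
public:
    void add(IdleFn fn, void* data) { entries_.push_back({fn, data}); }

    // Remove the first entry registered with `data`; report whether one was found.
    bool remove_by_data(void* data) {
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            if (entries_[i].data == data) {
                entries_.erase(entries_.begin() + static_cast<std::ptrdiff_t>(i));
                return true;
            }
        }
        return false;
    }

    // Run every entry once; keep only those whose callback returns true.
    void run_once() {
        std::vector<IdleEntry> current;
        current.swap(entries_);
        for (const IdleEntry& e : current) {
            if (e.fn(e.data)) entries_.push_back(e);
        }
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<IdleEntry> entries_;
};

static int g_runs = 0;
static bool count_run(void*) {
    ++g_runs;
    return false;  // one-shot, like an idle function that returns FALSE
}
```

In this model, removing by a null key cancels the shared flush for every window that is still queued, not just the one being destroyed; whether that matters in nsWindow depends on how the idle function gets re-registered, which this thread does not show.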
arik: my eyes, they are bleary. can you see anything funky in this code, besides that the code is funky? :)
Blocks: 75642
This is a topcrash for M09, added topcrash keyword. Added M09 topcrash [@ 0x00000000 - nsAppShell::Run()] to summary for tracking.
Here are some URLs & Comments that might help repro this crash:
(30146226) URL: http://freshmeat.net/daily/2001/05/08/
(30146226) Comments: Just hit back button.
(30141334) URL: http://shrimpwars.be
(30141334) Comments: Quickly clicking the "Back" button.
(30128515) URL: thespark.com
(30128515) Comments: I hit the "back" button several times while taking the IQ test; it went back a few times and then it crashed.
Here is a recent stack trace:
Incident ID 30146226
0x00000000
libglib-1.2.so.0 + 0x10f49 (0x407e6f49)
libglib-1.2.so.0 + 0xff96 (0x407e5f96)
libglib-1.2.so.0 + 0x10561 (0x407e6561)
libglib-1.2.so.0 + 0x10701 (0x407e6701)
libgtk-1.2.so.0 + 0x8c569 (0x4070b569)
nsAppShell::Run()
nsAppShellService::Run()
main1()
main()
libc.so.6 + 0x189cb (0x401ea9cb)
Keywords: topcrash
Summary: Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns. → Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns. M09 topcrash [@ 0x00000000 - nsAppShell::Run()]
Jan: Yes, nothing's different about the stack trace. We die in nsWindow::UpdateIdle for no reason I can make out... I'm starting to wonder if it isn't a glib threadsafety issue... The only thing I can think to do, my previous patches having failed, is to bandaid the |window->mIsUpdating = PR_FALSE; window->Update();| with an |if (window)|, but my debugger tells me that |window| is non-null. Something else I might try out of desperation is replacing the (nsWindow *) cast with an NS cast macro, or to see if some non-nsWindow is being inserted somewhere as data to that queue, but I'm sure we'd have already run into that if it were the problem.
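The idea of checking whether something other than an nsWindow ends up in the queue can be sketched as follows. A gpointer (void *) carries no type information, so the sketch assumes the queue is changed to hold pointers to a polymorphic base class, which makes dynamic_cast usable. Widget, TopWindow, Button and flush_checked() are hypothetical names, not Mozilla classes. Note that this only catches live objects of the wrong type (and null entries); it cannot detect a pointer to an object that has already been destroyed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical polymorphic base so that dynamic_cast can be used.
struct Widget {
    virtual ~Widget() = default;
};

struct TopWindow : Widget {
    int updates = 0;
    void update() { ++updates; }
};

struct Button : Widget {};  // a widget that must not be treated as a TopWindow

// Update only the entries that really are TopWindow objects.
// Returns the number updated; `rejected` receives the count of other entries.
int flush_checked(const std::vector<Widget*>& queue, int& rejected) {
    int updated = 0;
    rejected = 0;
    for (Widget* w : queue) {
        if (auto* win = dynamic_cast<TopWindow*>(w)) {
            win->update();
            ++updated;
        } else {
            ++rejected;  // wrong type or null pointer
        }
    }
    return updated;
}
```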
i just filed bug 79937, which is a crash in ::UpdateIdle due to a pure virtual method invocation, although i considered it an xlib bug.
Severity: major → critical
i think this is a dupe of bug 80345 which contains a simple fix. *** This bug has been marked as a duplicate of 80345 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → DUPLICATE
Sweet jesus, thank you so much... timeless: you should apply the same technique to xlib to fix your bug.
Status: RESOLVED → VERIFIED
sorry i didn't end up being helpful ;-) will try better next time...
Mass removing self from CC list.
Now I feel dumb because I have to add myself back. Sorry for the spam.
Product: Browser → Seamonkey
it sounds like this one has been fixed, or the code has morphed to the point that this bug is no longer valuable. It might be worthwhile to set up reporting, analysis, and tracking of bugs related to forward and back button activity, since that has been a source of exposure to crash regressions in the past. closing this one and opening https://bugzilla.mozilla.org/show_bug.cgi?id=330085
Crash Signature: [@ 0x00000000 - nsAppShell::Run()]