Closed Bug 56086 Opened 24 years ago Closed 23 years ago

Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns. M09 topcrash [@ 0x00000000 - nsAppShell::Run()]

Categories

(SeaMonkey :: General, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED DUPLICATE of bug 80345
mozilla0.9.1

People

(Reporter: brodms, Assigned: dr)

Details

(Keywords: crash, helpwanted, topcrash)

Attachments

(1 file)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.2.14-SMP i686; en-US; m18) Gecko/20001007
BuildID:    2000100721

Rapid clicking of the Forward and/or Back buttons can cause a crash. It appears
that hitting one of these buttons while the last page requested is still being
pulled out of cache is the direct cause of the crash.
I also tried using the keyboard controls (Alt+Left and Alt+Right) and was
able to cause this crash that way too.

Reproducible: Sometimes
Steps to Reproduce:
1. Start the browser and load a series of large pages.
2. Now rapidly navigate through them using the Forward and Back buttons, or using
the keyboard controls.

Actual Results:  The browser crashed.

Expected Results:  Merrily loaded the pages requested.

Here are several Talkbacks:

TB18903607H: This crash using keyboard navigation.

TB18867835K, TB18717846H: This crash using the Forward and Back buttons.

I tried and failed to reproduce this on Win98, but I only have 32MB of RAM on
that machine, so I really can't do anything "rapidly".
Shortening lengthy summary.
Summary: Rapidly, repeatedly hitting Forward and Back buttons crashes browser. → Rapidly hitting Forward & Back buttons crashes browser.
Adding crash keyword.
Keywords: crash
May be a Linux-specific bug. WFM Win2k.
I'm confirming, but there must be a duplicate somewhere, because I've seen this
behavior consistently for over a year with moz/linux. I never bothered to file
it because I've seen it so consistently and for so long that I figured it must
be in the system, somewhere. I'll attempt to do another search for it, but in
the meantime this is important. If I can't find a dup, I'll add more observations
here later.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I've been looking for something that this duplicates, but I can't find anything, and all the IRC channels are useless at this hour of the morning. Anyway, I'll add my observations:
* This has occurred for at least 14 months in the Linux builds.
* The crash occurs after you stop hitting the buttons, not while you are still hitting the buttons (somehow, this keeps moz alive).
* I believe that this is SMP-only. I don't have an SMP machine to test it on to be sure, but at least one friend (via IRC) reports stability under the same conditions.
Anyway, those are my two cents.
I can reproduce it on a non-SMP machine (linux 2000100908)
I'm on a non-SMP Linux machine, and I've been observing this behavior for at
least a month (probably longer).

I held off on reporting it for the same reason: I figured it was already in
Bugzilla somewhere. But I can't imagine it being in any component other than
History or possibly Keyboard Nav, and I've rifled through all of those.
History and Keybd Nav are relatively new components. A truly old bug of this nature would
be in XP Apps or Browser-General. What's SMP?
SMP == Symmetric Multi-Processing. Two chips in one machine :) Tends to magnify
the effect of any threading bugs (which made moz unusable on SMP for quite some
time.) However, if this is verified from elsewhere it seems that isn't the problem.
nav triage team: beta stopper. we have to fix crashers.
Keywords: nsbeta1
Priority: P3 → P1
Bug 59831 might be a duplicate of this.
I've just had a very similar-seeming crash for build 2000123121 on Solaris (on a
single-processor machine). I pressed back once, got the start of rendering of
the page it was backing to, pressed back again immediately and got an immediate
SEGV. Time between the two presses of back was ~1s.

Sorry, no stack trace available, because Sun Workshop debugger crashes when
loading the core-file.
*** Bug 59831 has been marked as a duplicate of this bug. ***
nav triage team:

Marking nsbeta1+
Whiteboard: nsbeta1+
Target Milestone: --- → mozilla0.9
I couldn't reproduce this on my RH 6.2 (no SMP) 266 MHZ 96MB machine. It doesn't
crash on NT either. Has anyone encountered this recently? A stack trace would be
more useful.
I can still reproduce this without much difficulty on my SuSE 6.4 machine,
a P3-733 w/ 256MB of RAM. I'll supply a Talkback ID once I can get
Talkback to start working again.
Ditto on the reproducibility of this with the latest builds. Is there a
talkback build for Linux right now? I haven't seen one marked like that in the
ftp for a while- am I missing something? I've been out of the QA loop here for
quite some time :| If someone will point me to it, I'll take a stab at it...
I would love to look into this, if I had a reproducible case or a stack trace.
Claudius, can you try to reproduce this? The crash may be occurring upon hitting
back and forward. But unless I see a stack trace, I wouldn't know exactly where
it is crashing.
Status: NEW → ASSIGNED
OK, I got a Talkback build and reproduced the crash on Linux. 
Talkback ID is: TB26074452Z
cc'ing pavlov, akkana and mcafee, in case they are aware of something similar. The
crash seems to happen as below on linux/solaris when rapidly clicking the
back/forward buttons or using keyboard shortcuts. From talkback report ID
TB26074452Z, here's what the stack trace looks like. I couldn't reproduce it, but
will definitely try again.

0x08dc44d1 
libglib-1.2.so.0 + 0x11cbf (0x409cecbf) 
libglib-1.2.so.0 + 0x10bd6 (0x409cdbd6) 
libglib-1.2.so.0 + 0x11203 (0x409ce203) 
libglib-1.2.so.0 + 0x113cc (0x409ce3cc) 
libgtk-1.2.so.0 + 0x9300c (0x408ed00c) 
nsAppShell::Run() 
nsAppShellService::Run() 
main1() 
main() 
libc.so.6 + 0x18a5e (0x40247a5e) 
                                            

What version of gtk is on the machine that crashes?
The machine that crashes is running GTK 1.2.
my stock rh62 machine has gtk 1.2.6, presumably that is the version
radha has.
To be more precise, this machine is running GTK 1.2.7.
Giving to the toolkit team. I could not reproduce it in my machine. Stack has no
SH code in it.
Assignee: radha → trudelle
Status: ASSIGNED → NEW
Component: History: Session → XP Toolkit/Widgets
this bug needs to go see the Dr!
Assignee: trudelle → dr
I can't reproduce this on my machine (running RH7, gtk+-1.2.8). I have a feeling
this may have been a bug in 1.2.7, or may have been fixed by some other checkin
(or I just can't click fast enough). Can anybody still repro this with the
latest mozilla builds? If so, what versions of gtk+, etc. do you have?
Keywords: qawanted
updating qa to john the toolkit master morrison, accepting...
Status: NEW → ASSIGNED
QA Contact: claudius → jrgm
Yeah, I can crash rh6.1 with 

> glib-config --version
1.2.5
> gtk-config --version
1.2.5
>

I clicked about 10 links in a row, then proceeded to cycle up and down the list
getting mozilla churning pretty hard, and eventually it crashed with a stack 
similar to the one above.
I can't, for the life of me, reproduce this bug, either on my machine (Redhat 7)
or kandrot's (SuSE). We both use gtk+ 1.2.8, so I'm tempted to say it's a gtk
bug, but there's still the possibility that it's a timing bug. Both my machine
and kandrot's are fast (700MHz)...

jrgm, or brodms, or anybody who sees this bug: Would it be possible for you to
temporarily upgrade your gtk+ version to 1.2.8, since you're experiencing the
bug currently, and see if upgrading fixes it?
I will upgrade to GTK 1.2.8 this weekend...
I can't duplicate the bug with gtk 1.2.9 + 2001022808, and I've seen it (with
prior versions of gtk) since literally the dawn of time. Probably is a gtk bug,
then. 
Okay, sounds good to me. Thanks for all your help! I'll resolve this as INVALID,
since it's not our fault, and see if I can get our "official gtk version," if
there is such a thing, bumped up to 1.2.8.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
*** Bug 71754 has been marked as a duplicate of this bug. ***
guys, guys,...  what are we saying here?  
are we saying the equivalent of, "oh, we crash only
on win95 so let's not support that platform????"

we need a bit of digging deeper.  this is a listing of all the
'new' user comments of m0.8 talkback data centered around
hitting the back button...  it's about the 4th ranked crasher
on the 0.8 milestone.

looks like it's happening on NT 4 and 5 as well but at a lower frequency
and there could be multiple things going on to cause the crashes..


Crash Analysis M08 Build - Related to back

back related user comments ====== 180 

- 0x00000000 40413ba6 BBID: 27437231 
sec. http://abc.net.au/news/ Hit back button running on Windows NT 5.0 build 
2195 src_file line 

- 0x00000000 b582554e BBID: 27100526 
sec. http://pages.ebay.com/search/items/search.html started browser. went to 
www.ebay.com. clicked search link. it took me to
http://pages.ebay.com/search/items/search.html. page loaded. text disappeared. 
hit back. access violation. running on Windows NT 5.0 build
2195 src_file line 

- 0x000a000e bbf969aa BBID: 27399893 
sec. http://www.dealtime.de/ bookmark window: dragged last (or 2nd from end) 
bookmark at window top (window scrolled authomatically
backwards) and dropped bookmark: boom running on Windows NT 5.0 build 2195 
src_file line 


- 0x285ef9b1 f24deba2 BBID: 27457418 
sec. http://www.linuxtoday.com Looking at linuxtoday.com.Clicked a new.Then back 
and once the page started to show drag & drop the
URL to the bookmarks sidebar into the personal toolbar folder.and ... that's 
all. running on Windows 95 4.0 build 67306684 src_file line 

- 0x85fc488d 68911668 BBID: 26800842 
sec. www.jabber.com Opened up the page. The page has a flash animation on it. 
Went back one to the About screen. Then tried to go
forward one. Boom. running on Windows 95 4.0 build 67109975 src_file line 


- 0xffffffff e20458a5 BBID: 26990693 
sec. www.email.cz PI pressed back I was asked to confirm form data resubmission 
but before I preswsed back for three items (using an
arrow) and than I canaceled the resubmission. It crashed Mozilla. running on 
Windows NT 4.0 build 1381 src_file line 


- JS_GetPrivate 3d26a039 BBID: 27095897 
1859 sec. hit the back button running on Windows NT 5.0 build 2195 src_file 
d:/builds/0.8/mozilla/js/src/jsapi.c line 1859 

- KERNEL32.DLL + 0xb9a6 (0xbff7b9a6) 41ee399f BBID: 27485256 
sec. I pushed "back" button when there was no connection to Internet running on 
Windows 98 4.10 build 67766446 src_file line 


- MSVCRT.dll + 0xd157 (0x7800d157) 438e00dc BBID: 27350397 
sec. www.slashdot.org Pressing the back button running on Windows NT 4.0 build 
1381 src_file line 

- gklayout.dll + 0x1d311 (0x602fd311) cdf45b3c BBID: 27475804 
sec. crash hitting back button. running on Windows NT 4.0 build 1381 src_file 
line 

- gklayout.dll + 0x1d311 (0x602fd311) dc56f1c9 BBID: 27471174 
sec. Clicking the "back" button. running on Windows NT 5.0 build 2195 src_file 
line 

- gklayout.dll + 0x1d311 (0x602fd311) fd79750e BBID: 27451767 
sec. http://freshmeat.net going back on real networks website... running on 
Windows NT 4.0 build 1381 src_file line 


- js_MarkGCThing 9303e52f BBID: 27570241 
788 sec. www.kmeleon.org Went back from the results of the poll running on 
Windows 98 4.10 build 67766222 src_file
d:/builds/seamonkey/mozilla/js/src/jsgc.c line 788 

- js_MarkGCThing 9d684a24 BBID: 27087093 
787 sec. http://www.tivo.com Hit the back arrow to return to another page from 
the TiVo page with flash animation. running on Windows NT
5.0 build 2195 src_file d:/builds/0.8/mozilla/js/src/jsgc.c line 787 


- 0x00000000 106cc52d BBID: 26854241 
sec. www.soureforge.net pressing the back button from quanta.sourceforge.net 
running on Linux 2.4.2 src_file line 

- 0x00000000 1acbbeaa BBID: 27046596 
sec. back button from a .txt page to an html one in Navigator running on Linux 
2.4.1 src_file line 

- 0x00000000 20e4ef56 BBID: 27194517 
sec. google.com Hitting the back button of a search. running on Linux 2.2.5-15 
src_file line 

- 0x00000000 23760159 BBID: 27478561 
sec. www.uswestdex.com hit the back browser button several times in rapid 
succession the browser just disappeared. running on Linux 2.2.14
src_file line 

- 0x00000000 37a7322b BBID: 26868721 
sec. Crashed going backwards in session history. running on Linux 2.2.17-14 
src_file line 



- 0x00000000 450e352b BBID: 26927337 
sec. http://www.videolan.org clicking the back button running on Linux 
2.2.19pre14ext3 src_file line 

- 0x00000000 450e352b BBID: 26932965 
sec. http://www.debian.org clicking the back button running on Linux 
2.2.19pre14ext3 src_file line 

- 0x00000000 450e352b BBID: 27271231 
sec. http://www.theregister.co.uk hitting the back button running on Linux 
2.2.19pre14ext3 src_file line 

- 0x00000000 46e7139b BBID: 27479277 
sec. internal bugzilla URL Looking at our internal bugzilla site... hit the back 
button after an updating bug. running on Linux 2.2.14-5.0 src_file
line 

- 0x00000000 57131f8d BBID: 26961269 
sec. using the back button running on Linux 2.2.14-5.0 src_file line 

- 0x00000000 5842f7b4 BBID: 27125813 
sec. http://sourceforge.net/ backing out of project documentation page running 
on Linux 2.4.2 src_file line 

- 0x00000000 6e0e5358 BBID: 26905967 
sec. http://www.autoweb.com/buy.htm Pressed the back button while the page was 
loading and it puked. running on Linux 2.2.17-21mdk
src_file line 

- 0x00000000 7376c723 BBID: 27067826 
sec. rapidly hitting the back button running on Linux 2.2.17 src_file line 

- 0x00000000 83b498b0 BBID: 27286119 
sec. www.heise.de clicking back-Button twice in a short period of time running 
on Linux 2.2.14 src_file line 

- 0x00000000 9748fd05 BBID: 26925015 
sec. http://www.ashford.com Hit the back button. running on Linux 2.2.17-21mdk 
src_file line 

- 0x00000000 aa6dd355 BBID: 26798488 
sec. http://www.suzukicycles.com looking at the GZ250 and clicking the back 
button running on Linux 2.2.19pre9ext3 src_file line 


- 0x00000000 ba4f8b23 BBID: 27643556 
sec. back on suse portal running on Linux 2.2.16 src_file line 

- 0x00000000 c7254653 BBID: 27517524 
sec. Mozilla sometimes(NOT ALWAYS) crashing when pressing back button.... This 
bug has been here very long infact this bug exist both in
netscape 4 and mozilla....If you going back in the history cache by clicking the 
Back button iconthe browser will sometimes crash :-/ running on
Linux 2.4.2 src_file line 

- 0x00000000 d7d8a14a BBID: 27605557 
sec. Just using the back button... running on Linux 2.2.16-22 src_file line 

- 0x00000000 e703a581 BBID: 26985481 
sec. Pressing the back button multiple times very quickly. running on Linux 
2.2.18 src_file line 

- 0x00000010 d56be25e BBID: 27538994 
sec. www.twoview.co.uk Hit the back buton running on Linux 2.2.18 src_file line 

- 0x00000011 83df530c BBID: 26991582 
sec. www.ebay.com I had hit the back button several times and then it crashed. 
running on Linux 2.2.16-22 src_file line 

- 0x00000020 5a1823e5 BBID: 27468991 
sec. 7 I pressed the back button while it was rendering a page after I pressed 
the back button again. OK that's a bad explanation... I pressed
back twice in a row and it crashed after the second one. running on Linux 2.2.13 
src_file line 

- 0x000000e8 9c53460a BBID: 26879294 
sec. I closed mozilla 0.8 (installed from binaries for linux i386) when the 
talkback window appeared. running on Linux 2.2.18 src_file line 



- 0x00420005 71b022e2 BBID: 26825828 
sec. Running for a long time (3 days?). Played with some forms 
onhttp://www.dabs.com/ a bit and hit "back" several times quickly. Boom :)
running on Linux 2.2.17-21mdk src_file line 

- 0x0066006f dbe5c47b BBID: 26914748 
sec. http://www.avsforum.com (unknown subpage URL) Hit back button. running on 
Linux 2.4.2 src_file line 

- 0x006d006f ceeac9d4 BBID: 26789210 
sec. netscape.com multiple clicking on back button without waiting for the page 
to load and display. running on Linux 2.4.1 src_file line 

- 0x087d5ad1 8de7e14d BBID: 27045613 
sec. I was starting a download and at the same time I pressed the back button. 
running on Linux 2.4.1 src_file line 

- 0x08820556 9e8a28d2 BBID: 27290290 
sec. http://www.unibank.dk I don't know what was the cause of the crash. I was 
pressingback when it happened. running on Linux 2.2.16-3
src_file line 

- 0x088f6d5c 957aba2e BBID: 27091229 
sec. www.openmotif.org Pressing the back button while two downloads were active 
running on Linux 2.2.17-8 src_file line 

- 0x0918526e d28d54ae BBID: 27060275 
sec. unknown Going back to a previously viewed page. running on Linux 2.4.2 
src_file line 

- 0x0921d09a 29034a10 BBID: 27016232 
sec. http://www.slashdot.org/ Setting the background colour to white. running on 
Linux 2.2.5-15 src_file line 

- 0x11d1ad6a 56e7d1b0 BBID: 27135286 
sec. www.mozillazine.org I was pressing the back button (going back to 
www.mozilla.org) running on Linux 2.2.16-storm src_file line 



- 0x40367470 eec7e192 BBID: 27000381 
sec. http://bugzilla.mozilla.org Pressed the 'back' button twice... running on 
Linux 2.2.17-0.24mdk src_file line 

- 0x40b00061 cffe47e4 BBID: 26813289 
sec. clicking on the "back" button several times in a row running on Linux 
2.2.18 src_file line 

- 0x40ea48c5 caea6f0a BBID: 27069090 
sec. monster.com hit back button expected it to ask if I wanted to repost form 
data. running on Linux 2.4.1 src_file line 

- 0x40ed72c2 8f47de5b BBID: 26775771 
sec. (internal web site) hit back button going through some pages generated from 
cgi-binscripts running on Linux 2.2.16-3 src_file line 

- 0x74163a83 fc0bc409 BBID: 26944028 
sec. changing the default background colors in Preferences. running on Linux 
2.2.18 src_file line 



- libc.so.6 + 0x52502 (0x4028d502) 6ad9d819 BBID: 27297719 
sec. 
http://www.compaq.it/presario/prodotti/portatili/14_xl453/caratteristiche.html I 
pressed the "back" button running on Linux 2.2.15-4mdk
src_file line 

- libc.so.6 + 0x76917 (0x402aa917) 5ef14161 BBID: 26743599 
sec. www.zive.cz I pressed the 'back' button to get out of the list of comments 
under the article. running on Linux 2.4.1-ac12 src_file line 

- libc.so.6 + 0x79618 (0x402ad618) 1590429a BBID: 27413715 
sec. http://www.vivo.com/ I clicked on the back button 4 or 5 times rapidly. 
running on Linux 2.2.17 src_file line 

- libc.so.6 + 0xa6af4 (0x402d2af4) 414b13fe BBID: 26998647 
sec. seems like a feedback agent failure - the browser seems to have started up 
okbut i'm getting the feedback agent anyway. running on Linux
2.2.16-3 src_file line 

- libc.so.6 + 0xb2c82 (0x402dcc82) d5f1d816 BBID: 27124458 
sec. http://www.cam.ac.uk/societies/boatclub/ I think I clicked "back".(from 
inside some frames thing for which the URL seems to be hidden)
running on Linux 2.2.16-4.cl.ext3 src_file line 

- libc.so.6 + 0xb2c82 (0x402dfc82) eb44f290 BBID: 26832204 
sec. http://www.gmu.edu/departments/economics/bcaplan/anarfaq.htm Hit the 'back' 
button running on Linux 2.2.14-5.0smp src_file line 

- libc.so.6 + 0xdea32 (0x40312a32) 3e1af77d BBID: 26936545 
sec. www.worldonline.dk preesed the back button running on Linux 2.2.16-22 
src_file line 

- libjavaplugin_oji.so + 0x1b416 (0x40f3d416) fed3aa4a BBID: 27701830 
sec. http://www.echl.com this time I was going _back_ to that page... this is 
really annoying. running on Linux 2.2.17-21mdk src_file line 

- nsCachedNetData::Release 3a830548 BBID: 27660458 
280 sec. hit back button. running on Windows NT 4.0 build 1381 src_file
d:/builds/seamonkey/mozilla/netwerk/cache/mgr/nsCachedNetData.cpp line 280 

- nsCachedNetData::Release 72ce5225 BBID: 27046673 
279 sec. www.dagbladet.no pressed `back' stack trace:NECKO! 6065ce8a()XPCOM! 
60ca7630()NECKO! 606723e2()NECKO!
6064cc30()NECKO! 6064cb76()778b0c24() running on Windows NT 5.0 build 2195 
src_file
d:/builds/0.8/mozilla/netwerk/cache/mgr/nsCachedNetData.cpp line 279 

- nsCachedNetData::Release 942e6766 BBID: 27156536 
280 sec. http://sinfest.net/d/20010228.html used back context menu to go from 
here to /20010301.html and kaboom! this was first session
with todays trunk build don't know if this is common..... running on Windows NT 
5.0 build 2195 src_file
d:/builds/seamonkey/mozilla/netwerk/cache/mgr/nsCachedNetData.cpp line 280 

- nsCachedNetData::Release() 65c9d6c3 BBID: 27614351 
sec. Hit the back button running on Linux 2.2.17-21mdk src_file line 

- nsCachedNetData::Release() 72cb148b BBID: 27144294 
sec. clicked the "back" button running on Linux 2.2.17-21mdk src_file line 

- nsCachedNetData::Release() 9a4d1ef3 BBID: 27325423 
sec. hitting back button running on Linux 2.2.16-22 src_file line 

- nsEventStateManager::UpdateCursor f07682c2 BBID: 27293569 
1427 sec. Slashdot going back several pages in history running on Windows 98 
4.10 build 67766446 src_file
d:/builds/0.8/mozilla/layout/events/src/nsEventStateManager.cpp line 1427 

- nsGenericDOMDataNode::IsOnlyWhitespace() 7f799368 BBID: 27478253 
sec. http://www.ibm.com/ using the back button to get back to google. A bunch of 
javascriptwindows had just popped up from ibm's site.
running on Linux 2.4.1 src_file line 

- nsHTTPPipelinedRequest::OnStopRequest() 719aafca BBID: 27660320 
sec. http://www.avweb.com back one page running on Linux 2.4.1 src_file line 

- nsHTTPPipelinedRequest::WriteRequest() 48ae0d42 BBID: 27587200 
sec. http://freshmeat.net/projects/lyx i go back from the lyx homepage to the 
freshmeat lyx project page and crash ! running on Linux 2.4.2
src_file line 


- nsSupportsArray::Clear 26cd3ae5 BBID: 26803835 
320 sec. Hit back button. (can't remember URL) running on Windows NT 4.0 build 
1381 src_file
d:/builds/0.8/mozilla/xpcom/ds/nsSupportsArray.cpp line 320 


- nsWindow::UpdateIdle() 593da47a BBID: 27310397 
sec. Dilbert cartoon page (I don't know the URL; my browser has just crashed :) 
I was hitting the back-button repeatedly. I've noticed that the
Dilbertcartoon page doesn't want to go to the page that I was viewing beforeit 
(don't know why yet) and I was just being obnoxious about
it...Despite this little thing running on Linux 2.2.12-32 src_file line 

- nsWindow::UpdateIdle() d367fe61 BBID: 26830914 
sec. hitting the back button in my browser running on Linux 2.2.17 src_file line 
Status: RESOLVED → REOPENED
Keywords: topcrash
Resolution: INVALID → ---
jpatel, can you dig out some of the NT stack traces when
the talkback server comes back up?
Summary: Rapidly hitting Forward & Back buttons crashes browser. → M08 Rapidly Forward&Back crashes browser.[@ 0x00000000 - nsAppShell::Run]
I figured out what was going on with Linux, but I haven't got any clue for the
Windows problems. I think they're probably unrelated, given that upgrading GTK
solved the problem on Linux. You might try hunting around for talkback crashes
relating to forward or back buttons located in /widget/src/windows... That'd be
my best guess. Reassigning to chofmann.
Assignee: dr → chofmann
Status: REOPENED → NEW
OS: Linux → Windows NT
I agree with Dan, from the talkback data, it looks like this particular crash 
(0x00000000) might be a little different than the windows crashes.  Talkback is 
having problems, so I haven't had a chance to look for a windows stack trace, 
but I will post it here as soon as I can so that we can look at it before 
deciding whether to log a new bug for the windows crash or not.
Okay, I just reproduced this three times on an 800MHz Redhat7.0 machine with glib/gtk 1.2.8.

This is a linux-only bug (it's deep into gtk/glib), and a separate bug should 
be filed for windows crashing on this user action. 

There may be nothing that can be done, or perhaps there is a way to defend
against it, or gee, find the bug in glib and fix it :-]

Dan, if you need help to reproduce the crash, I'll come show you how to get
medieval on the browser buttons.
Assignee: chofmann → dr
Keywords: mozilla1.0
OS: Windows NT → Linux
Whiteboard: nsbeta1+ → linux crash in glib/gtk
Status: NEW → ASSIGNED
Keywords: helpwanted
Summary: M08 Rapidly Forward&Back crashes browser.[@ 0x00000000 - nsAppShell::Run] → Rapidly hitting forward & back crashes in gtk/glib code
Whiteboard: linux crash in glib/gtk
Man! Ok, accepting... If we're crashing in gtk/glib code and it's not their
fault, then there are a few possibilities. We're very likely passing null to
something, but what could be uninitialized? Could be the content area maybe? Hmm...

Reopening 71754 for chofmann's windows bug also...
Keywords: topcrash
*** Bug 71754 has been marked as a duplicate of this bug. ***
If you don't want to own this bug, then find the right owner.
Keywords: topcrash
Summary: Rapidly hitting forward & back crashes in gtk/glib code → Rapidly hitting forward & back crashes in gtk/glib code [@ 0x00000000 - nsAppShell::Run]
Crashes in gtk code will happen.  Yeah it's not our code,
but the reality of software is we need to do our best to
avoid crashing in other libraries as well as our own.
Yeah, apologies for all the confusion. If I had taken a second to read the other
bug, I would have noticed that it was also linux-only all along. Sorry about that.

This isn't a toolkit bug, so setting to Browser-General. John Morrison gave some
good advice, and I'm still on this.
Component: XP Toolkit/Widgets → Browser-General
Keywords: qawanted
I know little to nothing about the priorities of the folks who are responsible
for this bug, but given that much of the linux world will be (most likely)
upgrading to newer versions of GTK in the reasonably near future (debian
unstable already has 'safe' gtk packages, and RH 7.1 will cover most corporate
installations) this probably shouldn't be "critical" anymore. It would still be
ideal to fix the problem eventually but I'm sure there must be other problems
that moz's time can be spent on. 
liv: we realized the problem unfortunately still exists with the newer versions
of gtk.

it seems like fixing this bug will be a matter of compiling moz against a debug
build of gtk/glib :(
I'm going to guess that this is probably not a gtk bug but is actually something
that we are doing that's bad.  I need a better stack trace here since I've never
seen this.

Brendan, weren't you saying that you saw something similar to this in a recent
build?
Adding M08 to summary for tracking, since this crash was reported after looking 
at M08 talkback data.
Summary: Rapidly hitting forward & back crashes in gtk/glib code [@ 0x00000000 - nsAppShell::Run] → Rapidly hitting forward & back crashes in gtk/glib code M08 crash [@ 0x00000000 - nsAppShell::Run]
Yes, I recently saw a crash jumping through 0 from nsWindow::UpdateIdle.  Of
course, due to crappy gnome-terminal vs. gdb vs. ??? clipboard behavior, I
couldn't copy the stack.  It was shallow.

/be
Adding nscatfood nomination from dup bug 71754 and moving over the crash car.
Keywords: nsCatFood
Dan: how recent a version of gtk was this seen on? I've been able to reliably
reproduce this bug for nearly 22 months now (since June-ish 1999) and am now
completely unable to reproduce the behavior with a Gnome pre-1.4 gtk (1.2.9).
Are you sure the new manifestation has the same roots as the old?
This was reproduced on glib/gtk 1.2.8, Redhat7.0 on an 800MHz/128MB PC, as 
noted earlier. I apologize for not having upgraded. ;-]
Adding qawanted. 
Keywords: qawanted
Keywords: qawanted
->moz0.9.1, per triage
Target Milestone: mozilla0.9 → mozilla0.9.1
re: email from chofmann about topcrashers

I'm slightly doomed with embedding 0.9 work right now, plus I've been wrestling
with gdb in a feeble attempt to get it to tell me anything. If you want me to
bother pavlov to look at this in purify on solaris, I have to bounce that off of
trudelle -- please ask him. Otherwise, this is P1 for 0.9.1.
cc'ing arik, who may be able to get to this sooner.
Blocks: 77421
Finally got a workable gdb arrangement, no thanks to 57051. Anyway, with a debug
gtk and glib build, I get the following stack trace:

#0  0x00000000 in ?? ()
#1  0x4086eb25 in nsWindow::UpdateIdle (data=0x0) at nsWindow.cpp:617
#2  0x40a25214 in g_idle_dispatch (source_data=0x4086eabc,
dispatch_time=0xbffff510, user_data=0x0) at gmain.c:1367
#3  0x40a24271 in g_main_dispatch (dispatch_time=0xbffff510) at gmain.c:656
#4  0x40a2487d in g_main_iterate (block=1, dispatch=1) at gmain.c:877
#5  0x40a24a0c in g_main_run (loop=0x81f8298) at gmain.c:935
#6  0x4093e83f in gtk_main () at gtkmain.c:524
#7  0x40854af5 in nsAppShell::Run (this=0x80a9818) at nsAppShell.cpp:360
#8  0x407eba09 in nsAppShellService::Run (this=0x80b3520) at
nsAppShellService.cpp:407
#9  0x08054007 in main1 (argc=1, argv=0xbffff784, nativeApp=0x0) at
nsAppRunner.cpp:1005
#10 0x08054ccf in main (argc=1, argv=0xbffff784) at nsAppRunner.cpp:1306
#11 0x402f4f31 in __libc_start_main (main=0x8054acc <main>, argc=1,
ubp_av=0xbffff784, init=0x804f060 <_init>, fini=0x805e444 <_fini>,
rtld_fini=0x4000e274 <_dl_fini>, stack_end=0xbffff77c) at
../sysdeps/generic/libc-start.c:129

The source at nsWindow.cpp#617 is window->Update():

gboolean
nsWindow::UpdateIdle (gpointer data)
{
  GSList *old_queue = update_queue;
  GSList *tmp_list = old_queue;

  update_idle = 0;
  update_queue = nsnull;

  while (tmp_list)
  {
    nsWindow *window = (nsWindow *)tmp_list->data;

    window->mIsUpdating = PR_FALSE;
    window->Update();

    tmp_list = tmp_list->next;
  }

  g_slist_free (old_queue);

  return PR_FALSE;
}

The fact that we set mIsUpdating to false immediately before we go and update
seems strange to me. The data argument passed here is null, but I think that's a
red herring (iirc, that's just the way all gtk signals/slots with function
pointers work). Hmm...
brendan: So I can confirm your experience jumping through 0 in UpdateIdle.
|this| is non-null, |window| is non-null and |data| is null but uninteresting,
but gdb gives me a whole bunch of member variables of |window| that are null and
interesting. The ones that look awfully suspicious are:

(nsBaseWidget)
  - mClientData
  - mContext
  - mAppShell
  - mToolkit
  - mChildren
(nsWidget)
  - mWidget (!!!)
  - mParent
(nsWindow)
  - mShell
  - mSuperWin

There are others, but these strike more fear into my heart, especially mWidget.
Thing is, the obvious assumption would be "gee, I ought to be croaking on a line
where I'm trying to dereference a null pointer," but the line in question,
|window->Update()|, and the entire function body around it, is clean of such
problems as far as I can tell. The |Update| function's first lines are |if
(!mSuperWin); return NS_OK;|, which seem a little odd, but should return happily.
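
An aside on the |if (!mSuperWin); return NS_OK;| as quoted above: if that
semicolon after the condition is really in the source and not just a slip in
this comment (the thread does not show the actual lines), the |if| guards an
empty statement and the early return runs on every call. A minimal sketch of
the difference, using hypothetical function names that are not from
nsWindow.cpp:

```cpp
#include <string>

// Hypothetical miniature of the check as quoted; not the real nsWindow::Update.
// The stray semicolon ends the `if`, so the condition guards an empty statement
// and the return that follows is unconditional.
std::string UpdateWithStraySemicolon(bool has_super_win) {
    if (!has_super_win);   // empty statement: nothing is guarded
    return "skipped";      // reached on every call
}

// The presumably intended form: return early only when there is no window.
std::string UpdateAsIntended(bool has_super_win) {
    if (!has_super_win)
        return "skipped";
    return "painted";
}
```

Either form returns without dereferencing anything, which fits the trace:
frame #0 is at 0x00000000 and is entered directly from UpdateIdle, so the jump
through 0 happens on the call itself, not inside the body of Update.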

The only weirdness, as far as I can see, is that we set our updating flag to
false right as we go update.

blizzard, brendan, any idea?
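
One reading of the above, offered as an assumption and not something this
thread confirms: a jump through 0 on a virtual call, together with a pile of
null members, is what you would expect if |window| points at an object that
has already been torn down while its pointer is still sitting in the update
queue. The sketch below uses made-up names (Window, g_update_queue, RunIdle),
not the real nsWindow code, to show one defensive pattern: a window removes
itself from the queue when it dies, and the idle pass pops entries off the
shared queue one at a time so that removal still works in the middle of a pass.

```cpp
#include <cstddef>
#include <list>

class Window;

// Stand-in for update_queue: windows waiting for a repaint.
static std::list<Window*> g_update_queue;

// Hypothetical widget, not the real nsWindow.
class Window {
public:
    explicit Window(int* update_count) : update_count_(update_count) {}

    // Leaving the queue on destruction means the idle pass never sees
    // a pointer to a dead window.
    ~Window() { Unqueue(); }

    void QueueUpdate() {
        if (!queued_) {
            g_update_queue.push_back(this);
            queued_ = true;
        }
    }

    void Unqueue() {
        if (queued_) {
            g_update_queue.remove(this);
            queued_ = false;
        }
    }

    // Called by the idle pass once this window has been popped.
    void RunQueuedUpdate() {
        queued_ = false;
        ++*update_count_;  // the real code would repaint here
    }

private:
    int* update_count_;
    bool queued_ = false;
};

// Stand-in for UpdateIdle. Handles at most the entries present when the
// pass starts, popping each from the shared queue before touching it.
std::size_t RunIdle() {
    std::size_t budget = g_update_queue.size();
    std::size_t handled = 0;
    while (budget-- > 0 && !g_update_queue.empty()) {
        Window* window = g_update_queue.front();
        g_update_queue.pop_front();
        window->RunQueuedUpdate();
        ++handled;
    }
    return handled;
}
```

For contrast, UpdateIdle above detaches the whole list into old_queue before
walking it, so a window destroyed part-way through the walk would need some
other way to get out of that detached list; the thread does not show whether
the real code has one.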
Fixing summary, keywords (this isn't a topcrash, btw, the talkback info chofmann
posted is mostly win32).
Severity: critical → major
Keywords: topcrash
Summary: Rapidly hitting forward & back crashes in gtk/glib code M08 crash [@ 0x00000000 - nsAppShell::Run] → Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns.
It's probably entirely uninteresting, but with this patch, my stack trace
decides to leave off UpdateIdle and instead claims to die in g_idle_dispatch.
This all makes very minimal sense to me.

cvs server: Diffing .
Index: nsWindow.cpp
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsWindow.cpp,v
retrieving revision 1.328
diff -u -r1.328 nsWindow.cpp
--- nsWindow.cpp	2001/04/17 23:41:32	1.328
+++ nsWindow.cpp	2001/05/01 01:44:35
@@ -613,8 +613,9 @@
   {
     nsWindow *window = (nsWindow *)tmp_list->data;
     
-    window->mIsUpdating = PR_FALSE;
+    window->mIsUpdating = PR_TRUE;
     window->Update();
+    window->mIsUpdating = PR_FALSE;
     
     tmp_list = tmp_list->next;
   }
*** Bug 73418 has been marked as a duplicate of this bug. ***
Here's another patch (suggested by pavlov) which also *doesn't* work:

Index: nsWindow.cpp
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsWindow.cpp,v
retrieving revision 1.328
diff -u -r1.328 nsWindow.cpp
--- nsWindow.cpp	2001/04/17 23:41:32	1.328
+++ nsWindow.cpp	2001/05/03 01:18:15
@@ -216,6 +216,9 @@
     UnqueueDraw();
 }
 
+static GSList *update_queue = NULL;
+static guint update_idle = 0;
+
 NS_IMETHODIMP nsWindow::Destroy(void)
 {
   // remove our pointer from the object so that we event handlers don't send us events
@@ -228,6 +231,11 @@
   if (mMozArea)
     gtk_object_remove_data(GTK_OBJECT(mMozArea), "nsWindow");
 
+  // remove our idle function
+  gboolean rv(g_idle_remove_by_data(NULL));
+  if (TRUE == rv)
+    update_idle = 0;
+
   return nsWidget::Destroy();
 }
 
@@ -596,9 +604,6 @@
 // all the updates in one shot. Actually, this queue should
 // be at most per-toplevel. FIXME.
 //
-
-static GSList *update_queue = NULL;
-static guint update_idle = 0;
 
 gboolean 
 nsWindow::UpdateIdle (gpointer data)
arik: my eyes, they are bleary. can you see anything funky in this code, besides
that the code is funky? :)
Blocks: 75642
This is a topcrash for M09, added topcrash keyword.
Added M09 topcrash [@ 0x00000000 - nsAppShell::Run()] to summary for tracking

Here are some URLs & Comments that might help repro this crash:

     (30146226) URL: http://freshmeat.net/daily/2001/05/08/
     (30146226) Comments: Just hit back button.
     (30141334) URL: http://shrimpwars.be
     (30141334) Comments: Quickly clicking the "Back" button.
     (30128515) URL: thespark.com
     (30128515) Comments: I hit the "back" button several times while taking the 
IQ test; it went back a few times and then it crashed.


Here is a recent stack trace:

Incident ID 30146226 
0x00000000 
libglib-1.2.so.0 + 0x10f49 (0x407e6f49) 
libglib-1.2.so.0 + 0xff96 (0x407e5f96) 
libglib-1.2.so.0 + 0x10561 (0x407e6561) 
libglib-1.2.so.0 + 0x10701 (0x407e6701) 
libgtk-1.2.so.0 + 0x8c569 (0x4070b569) 
nsAppShell::Run() 
nsAppShellService::Run() 
main1() 
main() 
libc.so.6 + 0x189cb (0x401ea9cb) 
Keywords: topcrash
Summary: Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns. → Crash in gtk/nsWindow::UpdateIdle, mashing back & fwd btns. M09 topcrash [@ 0x00000000 - nsAppShell::Run()]
Jan: Yes, nothing's different about the stack trace. We die in
nsWindow::UpdateIdle for no reason I can make out... I'm starting to wonder if
it isn't a glib threadsafety issue...

The only thing I can think to do, my previous patches having failed, is to
bandaid the |window->mIsUpdating = PR_FALSE; window->Update();| with an |if
(window)|, but my debugger tells me that |window| is non-null. Something else I
might try out of desperation is replacing the (nsWindow *) cast with an NS cast
macro, or to see if some non-nsWindow is being inserted somewhere as data to
that queue, but I'm sure we'd have already run into that if it were the problem.
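For what it's worth, a non-null |window| does not prove the object is still alive: a pointer left in the queue after its object was destroyed is just as non-null, so an |if (window)| check could not catch it. This is a small standalone sketch of the usual guard (unqueue in the destructor), not Mozilla code. |Widget| and |pending| are hypothetical names; the |UnqueueDraw()| call visible in the context of the second patch suggests something along these lines may already exist, so this is an illustration of the idea rather than a proposed fix.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class Widget;

// Hypothetical stand-in for the queue of windows awaiting an update.
static std::vector<Widget*> pending;

class Widget {
public:
    // Queue this widget for a later update, at most once.
    void QueueDraw() {
        if (std::find(pending.begin(), pending.end(), this) == pending.end())
            pending.push_back(this);
    }

    // Remove this widget from the queue if it is there.
    void UnqueueDraw() {
        pending.erase(std::remove(pending.begin(), pending.end(), this),
                      pending.end());
    }

    // Unqueue on destruction so the queue never holds a stale,
    // non-null pointer to a dead object.
    ~Widget() { UnqueueDraw(); }
};
```

With this in place, anything still in the queue when the idle handler runs refers to a live object, provided every destruction path goes through the destructor.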
i just filed bug 79937 which is a crash in ::UpdateIdle due to a pure virtual 
method invocation.  Although i considered it an xlib bug.
Severity: major → critical
i think this is a dupe of bug 80345 which contains a simple fix.

*** This bug has been marked as a duplicate of 80345 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 23 years ago
Resolution: --- → DUPLICATE
Sweet jesus, thank you so much...

timeless: you should apply the same technique to xlib to fix your bug.
Status: RESOLVED → VERIFIED
sorry i didn't end up being helpful ;-)

will try better next time...
Mass removing self from CC list.
Now I feel dumb because I have to add back. Sorry for the spam.
Product: Browser → Seamonkey
it sounds like this one has been fixed, or the code has morphed to the point that this bug is no longer valuable. It might be worthwhile to set up reporting, analysis, and tracking of bugs related to forward and back button activity since that has been a source of exposure to crash regressions in the past. closing this one and opening https://bugzilla.mozilla.org/show_bug.cgi?id=330085
Crash Signature: [@ 0x00000000 - nsAppShell::Run()]