(timers) Crash killing a popup window (in pt_PostNotifies?) [nsWebShellWindow::FirePersistenceTimer, PR_Unlock, pthread_cond_signal, __pthread_lock]

RESOLVED WORKSFORME

Status

()

P2
critical
RESOLVED WORKSFORME
17 years ago
16 years ago

People

(Reporter: mozilla-bugs, Assigned: danm.moz)

Tracking

({crash})

Trunk
mozilla1.5alpha
x86
Linux
crash
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment)

(Reporter)

Description

17 years ago
Reproducible: not sure

I've had Mozilla (BuildID 2002050205 trunk) crash on RedHat Linux 7.2 when I
went to http://weather.yahoo.com/forecast/USNY0717_c.html (using a Personal
Toolbar bookmark) and then killed (via a WM's "X") the pop-up ad window when it
have just started loading.

gdb says:

#0  __pthread_lock (lock=0x5, self=0x0) at spinlock.c:84
#1  0x40221060 in pthread_cond_signal (cond=0x5) at condvar.c:241
#2  0x40209e8c in pt_PostNotifies () from /usr/lib/libnspr4.so
#3  0x4020a04b in PR_Unlock () from /usr/lib/libnspr4.so
#4  0x407e23df in nsWebShellWindow::FirePersistenceTimer () from
/usr/lib/mozilla/components/libnsappshell.so
#5  0x4018ef3b in nsTimerImpl::Fire () from /usr/lib/libxpcom.so
#6  0x4018f06f in handleTimerEvent () from /usr/lib/libxpcom.so
#7  0x4018aa82 in PL_HandleEvent () from /usr/lib/libxpcom.so
#8  0x4018af26 in PL_ProcessEventsBeforeID () from /usr/lib/libxpcom.so
#9  0x407fee1f in processQueue () from /usr/lib/mozilla/components/libwidget_gtk.so
#10 0x40156111 in nsVoidArray::EnumerateForwards () from /usr/lib/libxpcom.so
#11 0x407fee66 in nsAppShell::ProcessBeforeID () from
/usr/lib/mozilla/components/libwidget_gtk.so
#12 0x408062fa in handle_gdk_event () from
/usr/lib/mozilla/components/libwidget_gtk.so
#13 0x40379d7f in gdk_event_dispatch () from /usr/lib/libgdk-1.2.so.0
#14 0x403ad773 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#15 0x403add39 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#16 0x403adeec in g_main_run () from /usr/lib/libglib-1.2.so.0
#17 0x402c8333 in gtk_main () from /usr/lib/libgtk-1.2.so.0
#18 0x407feb69 in nsAppShell::Run () from
/usr/lib/mozilla/components/libwidget_gtk.so
#19 0x407ddba2 in nsAppShellService::Run () from
/usr/lib/mozilla/components/libnsappshell.so
#20 0x08053475 in getCountry ()
#21 0x08053dbb in main ()
#22 0x4050b647 in __libc_start_main (main=0x8053c70 <main>, argc=1,
ubp_av=0xbffff7f4, init=0x804cf8c <_init>, fini=0x8055660 <_fini>,
    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff7ec) at
../sysdeps/generic/libc-start.c:129
(Reporter)

Updated

17 years ago
Keywords: crash

Comment 1

17 years ago
Changed the product to Browser for investigation.
Assignee: wtc → Matti
Component: NSPR → Browser-General
Product: NSPR → Browser
QA Contact: wtc → imajes-qa
Version: 4.2 → other
-> timer (?) = pavlov
Assignee: Matti → pavlov
Component: Browser-General → ImageLib
QA Contact: imajes-qa → tpreston

Updated

17 years ago
Component: ImageLib → XPCOM

Updated

17 years ago
Priority: -- → P2
Target Milestone: --- → mozilla1.1alpha
(Reporter)

Comment 3

17 years ago
Had a similar crash again. Again, killed a pop-up wildow that have just popped
up. This time it was on a different web site, with BuildID 2002051312 on RedHat
Linux 7.2

gdb said:

(gdb) bt
#0  0x006f006f in ?? ()
#1  0x4018f95b in nsTimerImpl::Fire () from /usr/lib/libxpcom.so
#2  0x4018fa8f in handleTimerEvent () from /usr/lib/libxpcom.so
#3  0x4018b462 in PL_HandleEvent () from /usr/lib/libxpcom.so
#4  0x4018b906 in PL_ProcessEventsBeforeID () from /usr/lib/libxpcom.so
#5  0x40841e2f in processQueue () from /usr/lib/mozilla/components/libwidget_gtk.so
#6  0x40156d61 in nsVoidArray::EnumerateForwards () from /usr/lib/libxpcom.so
#7  0x40841e76 in nsAppShell::ProcessBeforeID () from
/usr/lib/mozilla/components/libwidget_gtk.so
#8  0x4084931a in handle_gdk_event () from
/usr/lib/mozilla/components/libwidget_gtk.so
#9  0x4037ad7f in gdk_event_dispatch () from /usr/lib/libgdk-1.2.so.0
#10 0x403ae773 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#11 0x403aed39 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#12 0x403aeeec in g_main_run () from /usr/lib/libglib-1.2.so.0
#13 0x402c9333 in gtk_main () from /usr/lib/libgtk-1.2.so.0
#14 0x40841b79 in nsAppShell::Run () from
/usr/lib/mozilla/components/libwidget_gtk.so
#15 0x40820a02 in nsAppShellService::Run () from
/usr/lib/mozilla/components/libnsappshell.so
#16 0x08053225 in getCountry ()
#17 0x08053b6b in main ()
#18 0x4050c647 in __libc_start_main (main=0x8053a20 <main>, argc=1,
ubp_av=0xbffed9c4,
    init=0x804cdcc <_init>, fini=0x8055430 <_fini>, rtld_fini=0x4000dcd4
<_dl_fini>, stack_end=0xbffed9bc)
    at ../sysdeps/generic/libc-start.c:129
(Reporter)

Comment 4

17 years ago
*** Bug 151633 has been marked as a duplicate of this bug. ***
(Reporter)

Updated

17 years ago
Summary: Crash in pt_PostNotifies [nsWebShellWindow::FirePersistenceTimer, PR_Unlock, pthread_cond_signal, __pthread_lock] → Crash killing a popup window (in pt_PostNotifies?) [nsWebShellWindow::FirePersistenceTimer, PR_Unlock, pthread_cond_signal, __pthread_lock]

Updated

17 years ago
Summary: Crash killing a popup window (in pt_PostNotifies?) [nsWebShellWindow::FirePersistenceTimer, PR_Unlock, pthread_cond_signal, __pthread_lock] → (timers) Crash killing a popup window (in pt_PostNotifies?) [nsWebShellWindow::FirePersistenceTimer, PR_Unlock, pthread_cond_signal, __pthread_lock]

Comment 5

16 years ago
pavlov -> dougt
Assignee: pavlov → dougt

Comment 6

16 years ago
i can't reproduce this.  I have tried it on a few different sites (espn, money,
aol).  I couldn't get a popup from yahoo (yeah yahoo!).  

However, there is clearly a life time problem in the nsWebShellWindow.  The code
inializes a timer with a pointer to |this|.  However, there is no reference
incremented for this -- so the timer does not own the nsWebShellWindow.  During
the ~() of nsWebShellWindow, it does call timer cancel, however this may be too
late.

Of course, it begs the question, why isn't cancel() cancelling the timer.  
Could it be that the timer is being fired at the same time the nsWebShellWindow
is being destroyed?  

The simple fix here is to simply addref |this| before init'ing the timer and
release the nsWebShellWindow in the callback.

Dan, this blames to you.  thoughts on it?

Comment 7

16 years ago
*** Bug 155846 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 8

16 years ago
  I agree; since the timer object isn't solely owned by the window, it should
(effectively) hold a reference to the window. It's probably the Right Thing. You
should also release not only in the callback but also in the window's Destroy
method, where it cancels the timer. It might be cleaner to use InitWithCallback
instead of InitWithFuncCallback so the reference is handled automatically.
  Also unable to reproduce this bug, I've been eyeballing timer code trying to
figure out how this could result in a callback to a dead window object. Haven't
figured that out yet. Seems like the timer object may be outliving its
usefulness somehow. I wonder if this new reference will cause a circular
reference and leak. That's better than a crash of course but it'd still be nice
to know what's happening.
  I plan to poke around a little more. But lacking any breakthroughs, yes, a
nice addref seems appropriate.

Comment 9

16 years ago
over to danm for the nsWebShellWindow bandaide and more poking.
Assignee: dougt → danm

Comment 10

16 years ago
Re-setting target milestone for triage.
Target Milestone: mozilla1.1alpha → ---
(Assignee)

Comment 11

16 years ago
Created attachment 122800 [details] [diff] [review]
timer bumps window refcount

I've never been able to reproduce this bug but as we've already discussed, the
timer object, which refers to its owner window and isn't solely owned by the
window, should hold a reference to the window. Maybe that's the cause of the
crash.

This is the cleanest implementation (see comment 8), it doesn't cause
nsWebShellWindow to leak, and it seems like a good thing to do.
(Assignee)

Updated

16 years ago
Status: NEW → ASSIGNED
Flags: blocking1.4?
Target Milestone: --- → mozilla1.4final
(Assignee)

Updated

16 years ago
Attachment #122800 - Flags: superreview?(jaggernaut)
Attachment #122800 - Flags: review?(dougt)

Comment 12

16 years ago
Comment on attachment 122800 [details] [diff] [review]
timer bumps window refcount

x
Attachment #122800 - Flags: review?(dougt) → review+

Comment 13

16 years ago
that comment should have been: It can't hurt.

Comment 14

16 years ago
Comment on attachment 122800 [details] [diff] [review]
timer bumps window refcount

sr=jag if you remove the call to cancel from the destructor.
Attachment #122800 - Flags: superreview?(jaggernaut) → superreview+
(Assignee)

Updated

16 years ago
Attachment #122800 - Flags: approval1.4?
(Assignee)

Updated

16 years ago
Attachment #122800 - Flags: approval1.4?
(Assignee)

Comment 15

16 years ago
darin says the new(ish) timer architecture, which proxies back to the
originating thread to execute the callback, will prevent a race condition from
firing the timer callback after it's been cancelled in
nsWebShellWindow::Destroy. The additional reference in this patch shouldn't
really help anything. And indeed talkback reports, while they do show a
smattering of this crash in the 1.0 branch, show nothing on the trunk.

So this crash seems already fixed on the trunk, probably by the new timer code.
Still it seems the COM Thing to Do, letting the timer (effectively) hold a
reference to its callback object. We should probably do this if only for
appearance' sake. I'm retargetting past the current stability (1.4) milestone.
When the time comes we should probably also revisit whether the timerlock is
necessary.
Flags: blocking1.4?
Target Milestone: mozilla1.4final → mozilla1.5alpha
(Assignee)

Comment 16

16 years ago
addref/release part of patch is checked in. Again, the crash has already been
fixed so this is no longer especially important, but it is more proper COM,
bumping the window's refcount while another object holds a reference to it.
Closing the bug WFM.
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.