Closed Bug 154730 Opened 22 years ago Closed 19 years ago

mozilla1.1a core dump at PR_AtomicDecrement()

Categories

(MailNews Core :: Composition, defect, P2)

x86
All
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: antonio.xu, Assigned: bugzilla)

Details

(Keywords: fixedOEM)

Attachments

(2 files, 1 obsolete file)

Mozilla 1.1a builds on Solaris 8 and Windows 2000 dump core in
PR_AtomicDecrement() after the following steps:

1. Start Mozilla1.1a
2. Start Mail & News
3. Click Compose button multiple times
4. Close each compose window immediately after it opens

Repeat steps 3-4 several times.

I'm seeing this problem even in the C locale.

current thread: t@1
=>[1] PR_AtomicDecrement(0x18, 0x0, 0xffbedfc4, 0xfd8fcd41, 0x6, 0xffbee22c),
at 0xfea202fc
  [2] pt_PostNotifies(0xa9cce8, 0x1, 0xffbedfc4, 0xa9cce8, 0xfd746850, 0x1), at 
0xff052a9c
  [3] PR_Unlock(0xa9cce8, 0x0, 0x14, 0x2710, 0x128d2b8, 0x0), at 0xff052c30
  [4] nsWebShellWindow::FirePersistenceTimer(0x126dfb0, 0x11aaa80, 0xd2ce0ba9, 
0xd2cebd0b, 0x21, 0x13fd7c0), at 0xfdfc9138
  [5] nsTimerImpl::Fire(0x126dfb0, 0x1, 0x136f1f0, 0xfeb3801c, 0x62696e, 
0x66696c), at 0xff146c90
  [6] handleTimerEvent(0x133ea38, 0x0, 0x136f220, 0xfebd4840, 0xff140f78, 0x0), 
at 0xff146e08
  [7] PL_HandleEvent(0x133ea38, 0x208264, 0x0, 0xac970, 0xfee4cb30, 0x0), at 
0xff140f10
  [8] PL_ProcessEventsBeforeID(0x97d78, 0xb720, 0x13fcca8, 0xfeffcbdc, 
0xfee4cb30, 0x0), at 0xff1413f0
  [9] processQueue(0x97d78, 0xb720, 0xf, 0x203760, 0xfee4cb30, 0x80), at 
0xfdb3f8a8
  [10] nsVoidArray::EnumerateForwards(0xd3580, 0xfdb3f8a0, 0xb720, 0xd3580, 
0xfdb48b20, 0xfdb48688), at 0xff0fec98
  [11] handle_gdk_event(0x1329b80, 0x0, 0xfdb485d0, 0xfee4cb30, 0xfee4ef94, 
0x0), at 0xfdb48b20
  [12] gdk_event_dispatch(0x0, 0xffbeeb48, 0x0, 0x0, 0x179d40, 0x0), at 
0xfefd6bc4
  [13] g_main_dispatch(0x0, 0xfee4efc0, 0xfee4efbc, 0xfee4efc4, 0xff3e2660, 
0xfee93b57), at 0xfee26e20
  [14] g_main_iterate(0x1, 0x1, 0x1, 0xfee4ef30, 0x1, 0xfee28138), at 0xfee27634
  [15] g_main_run(0x1df798, 0x0, 0xff14312c, 0xfdb3f3d0, 0x0, 0x0), at 
0xfee27870
  [16] gtk_main(0x0, 0x6a480, 0xffbeec6c, 0x80000000, 0xff1cce44, 0xff1cd3e4), 
at 0xfef0ff30
  [17] nsAppShell::Run(0x1810f0, 0x1810f0, 0xfdb3f584, 0x1a3f8, 0xfdfe4373, 
0x0), at 0xfdb3f5c4
  [18] main1(0x80000000, 0x21f65, 0xff1cda1c, 0xff1cd85c, 0x0, 0x800), at 
0x1a43c
  [19] main(0x6, 0xffbeeef4, 0xffbeef10, 0x6a000, 0x0, 0x0), at 0x1afc0
(dbx) quit
Priority: -- → P2
Slightly different stack trace from a Linux debug build:

(gdb) frame 1
#1  0x4026eca7 in PR_Lock (lock=0xdadadada) at ptsynch.c:190
190	    rv = pthread_mutex_lock(&lock->mutex);
Current language:  auto; currently c
(gdb) p lock
$1 = (PRLock *) 0xdadadada
(gdb) p lock->mutex
Cannot access memory at address 0xdadadada


Another debug run (not under gdb) gave the following assertion before crashing:

Assertion failure: 0 == rv, at ptsynch.c:191
OS=>All
OS: Windows 2000 → All
My patch fixes this bug. The problem is that after a compose window has been
destroyed, the timer can still fire and run
nsWebShellWindow::FirePersistenceTimer. That function then uses members of the
destroyed window object, so Mozilla crashes. ~nsWebShellWindow checks whether
mSPTimer is null and cancels the timer only if it is non-null. But the window
object sometimes sets the timer many times. When the first fire runs, it sets
mSPTimer to null. If we then close the compose window and destroy the window
object, the destructor does not cancel the timer because mSPTimer is already
null, so the next fire crashes Mozilla. I think that if mSPTimer is set to null
every time nsWebShellWindow::FirePersistenceTimer runs, the window object will
create a new timer object for itself the next time
nsWebShellWindow::SetPersistenceTimer runs. So I think my fix is good for this
problem.
Please r=? & sr=?
I have looked into the timer code, and I think the timer itself may have a
problem. When a timer is initialized, it is added to TimerThread::mTimers. In
TimerThread::Run() the timer is removed from TimerThread::mTimers and released,
and it then uses timer::PostTimerEvent() to pass itself to nsTimerManager,
which tries to fire it. So the question is how to decide that a timer has run.
If a timer counts as having run once it has been removed from
TimerThread::mTimers in TimerThread::Run(), then the timer code has no problem.
But if a timer counts as having run only once its fire function has executed,
then before nsTimerImpl::SetDelay runs we should suppress the pending fire of a
timer that TimerThread::Run() has already removed from TimerThread::mTimers.
The most important question is still how to decide that a timer has run. The
two possible answers are "when the timer is removed from TimerThread::mTimers
in TimerThread::Run()" and "when the timer's fire function has run". I think
the first answer is good; if we choose the second answer we will miss some
events.
If we think the second answer is good, we should add some code like this:

void
nsWebShellWindow::SetPersistenceTimer(PRBool aSize, PRBool aPosition, PRBool 
aMode)
{
  PR_Lock(mSPTimerLock);
  if (mSPTimer) {
+   mSPTimer->Cancel();    
    mSPTimer->SetDelay(SIZE_PERSISTENCE_TIMEOUT);
please r=? my patch
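For anyone following the analysis above, here is a minimal self-contained sketch of the sequence, not the actual Mozilla code. FakeTimer, FakeQueue, FakeWindow and CountStaleFires are hypothetical names invented for illustration, and the queue only models the behaviour discussed in this bug: re-arming a timer that has not yet fired leaves two pending fires. A shared flag stands in for the freed window so the sketch can count stale fires instead of crashing.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical one-shot timer; a stand-in, not the real nsITimer interface.
struct FakeTimer {
  std::function<void()> callback;
};

// Hypothetical queue of timer fires that are waiting to be delivered.
struct FakeQueue {
  std::vector<std::shared_ptr<FakeTimer>> pending;

  void Arm(const std::shared_ptr<FakeTimer>& t) { pending.push_back(t); }

  // Drop every pending fire of this timer.
  void Cancel(const std::shared_ptr<FakeTimer>& t) {
    pending.erase(std::remove(pending.begin(), pending.end(), t),
                  pending.end());
  }

  // Deliver the oldest pending fire, if there is one.
  void FireNext() {
    if (pending.empty()) return;
    std::shared_ptr<FakeTimer> t = pending.front();
    pending.erase(pending.begin());
    t->callback();
  }
};

// Hypothetical window that owns a persistence timer.
class FakeWindow {
 public:
  FakeWindow(FakeQueue& queue, bool cancelBeforeRearm, int& staleFires)
      : mQueue(queue), mCancelBeforeRearm(cancelBeforeRearm),
        mAlive(std::make_shared<bool>(true)), mStaleFires(staleFires) {}

  ~FakeWindow() {
    *mAlive = false;
    // Same shape as the destructor check described above: nothing is
    // cancelled once the first fire has already cleared mSPTimer.
    if (mSPTimer) mQueue.Cancel(mSPTimer);
  }

  void SetPersistenceTimer() {
    if (mSPTimer) {
      // Proposed change: cancel the pending fire before re-arming.
      if (mCancelBeforeRearm) mQueue.Cancel(mSPTimer);
      mQueue.Arm(mSPTimer);  // models SetDelay queueing another fire
      return;
    }
    mSPTimer = std::make_shared<FakeTimer>();
    std::shared_ptr<bool> alive = mAlive;
    int* stale = &mStaleFires;
    mSPTimer->callback = [this, alive, stale] {
      if (!*alive) {  // the real code would touch freed memory here
        ++*stale;
        return;
      }
      mSPTimer = nullptr;  // the fire handler drops its timer reference
    };
    mQueue.Arm(mSPTimer);
  }

 private:
  FakeQueue& mQueue;
  bool mCancelBeforeRearm;
  std::shared_ptr<bool> mAlive;
  int& mStaleFires;
  std::shared_ptr<FakeTimer> mSPTimer;
};

// Returns how many fires reached a window that was already destroyed.
int CountStaleFires(bool cancelBeforeRearm) {
  FakeQueue queue;
  int staleFires = 0;
  {
    FakeWindow window(queue, cancelBeforeRearm, staleFires);
    window.SetPersistenceTimer();  // creates and arms the timer
    window.SetPersistenceTimer();  // re-arms it before the first fire
    queue.FireNext();              // first fire clears mSPTimer
  }                                // destructor finds null, cancels nothing
  queue.FireNext();                // delivers the leftover fire, if any
  return staleFires;
}
```

In this model CountStaleFires(false) counts one fire that arrives after the window is gone, while CountStaleFires(true) counts none, because cancelling before the re-arm leaves only a single pending fire.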
I'd rather that Pavlov r= this
Changed some code according to bryner's advice. Please r=? & sr=?
Thank you
Attachment #90176 - Attachment is obsolete: true
Comment on attachment 90470 [details] [diff] [review]
patch version 1.01,please r=? & sr=?

Ok, looks good to me, but I think the underlying timer issue should be
investigated as well (that being that calling SetDelay on a timer that has not
yet fired can make it fire twice).

r=bryner
Attachment #90470 - Flags: review+
Comment on attachment 90470 [details] [diff] [review]
patch version 1.01,please r=? & sr=?

sr=jst
Attachment #90470 - Flags: superreview+
Comment on attachment 90470 [details] [diff] [review]
patch version 1.01,please r=? & sr=?

a=asa (on behalf of drivers) for checkin to the 1.1 trunk.
Attachment #90470 - Flags: approval+
checked in
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Whiteboard: branchOEM
Whiteboard: branchOEM → branchOEM+
checked in NETSCAPE_7_0_OEM_BRANCH (a=jdunn)
Whiteboard: branchOEM+ → fixedOEM
Whiteboard: fixedOEM → branchOEM+ fixedOEM
Keywords: fixedOEM
Whiteboard: branchOEM+ fixedOEM
Using trunk builds 200211-25 on winxp and linux and macosx, I crash trying this
scenario.  Not sure if it's the same crash but here is the winxp talkback.
nsComposerCommandsUpdater::SelectionIsCollapsed
[c:/builds/seamonkey/mozilla/editor/composer/src/nsComposerCommandsUpdater.cpp,
line 367]
nsComposerCommandsUpdater::TimerCallback
[c:/builds/seamonkey/mozilla/editor/composer/src/nsComposerCommandsUpdater.cpp,
line 265]
nsComposerCommandsUpdater::Notify
[c:/builds/seamonkey/mozilla/editor/composer/src/nsComposerCommandsUpdater.cpp,
line 383]
nsTimerImpl::Fire [c:/builds/seamonkey/mozilla/xpcom/threads/nsTimerImpl.cpp,
line 380]
nsAppShell::Run [c:/builds/seamonkey/mozilla/widget/src/windows/nsAppShell.cpp,
line 177]
nsAppShellService::Run
[c:/builds/seamonkey/mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 472]
main1 [c:/builds/seamonkey/mozilla/xpfe/bootstrap/nsAppRunner.cpp, line 1557]
main [c:/builds/seamonkey/mozilla/xpfe/bootstrap/nsAppRunner.cpp, line 1905]
WinMain [c:/builds/seamonkey/mozilla/xpfe/bootstrap/nsAppRunner.cpp, line 1925]
WinMainCRTStartup()
kernel32.dll + 0x217c7 (0x77e817c7) 

and here is all the linux talkback has in the report:
SIGSEGV: Segmentation Fault: (signal 11)

Is this a different bug or the same as this one?

To reproduce, follow the steps in the original scenario. Reopening until someone
can tell me if this is the same bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Product: MailNews → Core
I can't reproduce this and lack of comments suggests nobody else can either.

reresolving FIXED since a patch went in.
Status: REOPENED → RESOLVED
Closed: 22 years ago → 19 years ago
Resolution: --- → FIXED
Product: Core → MailNews Core