Closed Bug 392873 Opened 14 years ago Closed 13 years ago

crash in nsExpirationTracker<nsSHEntry,3>::RemoveObject

Categories

(Core :: General, defect)

x86
Windows 2000
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: techtonik, Assigned: roc)

Details

(Keywords: crash)

Attachments

(4 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a8pre) Gecko/2007081905 Minefield/3.0a8pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a8pre) Gecko/2007081905 Minefield/3.0a8pre

The crash occurred while I was working with GMail; I'm not sure whether that was the cause. Stack trace attached. This build has no Talkback interface.

Reproducible: Sometimes
Attached file c0000005 crash report
Flags: blocking-firefox3?
Version: unspecified → Trunk
Flags: blocking-firefox3?
Product: Firefox → Core
QA Contact: general → general
I experienced a similar crash today with a SeaMonkey trunk build (I think I was waiting for a new page to load). It might have the same cause, but that is hard to tell since the original bug report does not contain a good stack or a Breakpad ID. Anyway, here is the stack:
0:000> kp
ChildEBP RetAddr  
0012fc1c 02467698 docshell!nsExpirationTracker<nsSHEntry,3>::RemoveObject(class nsSHEntry * aObj = 0x02467521)+0x24 [f:\mozilla\tree-cvsmo\mozilla\objsuite\dist\include\xpcom\nsexpirationtracker.h @ 157]
0012fc24 0246748d docshell!HistoryTracker::NotifyExpired(class nsSHEntry * aObj = 0x02467521)+0x9 [f:\mozilla\tree-cvsmo\mozilla\docshell\shistory\src\nsshentry.cpp @ 70]
0012fc40 02467521 docshell!nsExpirationTracker<nsSHEntry,3>::AgeOneGeneration(void)+0x4b [f:\mozilla\tree-cvsmo\mozilla\objsuite\dist\include\xpcom\nsexpirationtracker.h @ 211]
0012fc48 002b88df docshell!nsExpirationTracker<nsSHEntry,3>::TimerCallback(class nsITimer * aTimer = 0x0028b3f7, void * aThis = 0x00000001)+0xc [f:\mozilla\tree-cvsmo\mozilla\objsuite\dist\include\xpcom\nsexpirationtracker.h @ 278]
0012fc58 002b8a26 xpcom_core!nsTimerImpl::Fire(void)+0x6d [f:\mozilla\tree-cvsmo\mozilla\xpcom\threads\nstimerimpl.cpp @ 384]
0012fc60 002b9312 xpcom_core!nsTimerEvent::Run(void)+0x1b [f:\mozilla\tree-cvsmo\mozilla\xpcom\threads\nstimerimpl.cpp @ 459]
0012fc80 0028b3f7 xpcom_core!nsThread::ProcessNextEvent(int mayWait = 1, int * result = 0x0012fc9c)+0xc3 [f:\mozilla\tree-cvsmo\mozilla\xpcom\threads\nsthread.cpp @ 491]
0012fc94 0160b60b xpcom_core!NS_ProcessNextEvent_P(class nsIThread * thread = 0x00000001, int mayWait = 1)+0x27 [f:\mozilla\tree-cvsmo\mozilla\objsuite\xpcom\build\nsthreadutils.cpp @ 227]
0012fca8 017818ef gkwidget!nsBaseAppShell::Run(void)+0x2a [f:\mozilla\tree-cvsmo\mozilla\widget\src\xpwidgets\nsbaseappshell.cpp @ 154]
0012fcb4 10007624 tkitcmps!nsAppStartup::Run(void)+0x1e [f:\mozilla\tree-cvsmo\mozilla\toolkit\components\startup\src\nsappstartup.cpp @ 171]
0012fef4 004012ce xul!XRE_main(int argc = 1, char ** argv = 0x003f6b88, struct nsXREAppData * aAppData = 0x003f6fa0)+0x127e [f:\mozilla\tree-cvsmo\mozilla\toolkit\xre\nsapprunner.cpp @ 3116]
0012ff24 00401325 seamonkey!main(int argc = 1, char ** argv = 0x003f6b88)+0xc1 [f:\mozilla\tree-cvsmo\mozilla\suite\app\nssuiteapp.cpp @ 100]
0012ff30 00401513 seamonkey!WinMain(struct HINSTANCE__ * __formal = 0x7c816fd7, struct HINSTANCE__ * __formal = 0x003f0000, char * args = 0x7c920732 "???", int __formal = 2147344384)+0x13 [f:\mozilla\tree-cvsmo\mozilla\suite\app\nssuiteapp.cpp @ 110]
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\system32\kernel32.dll - 
0012ffc0 7c816fd7 seamonkey!__tmainCRTStartup(void)+0x140 [f:\rtm\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 578]
WARNING: Stack unwind information not available. Following frames may be wrong.
0012fff0 00000000 kernel32!RegisterWaitForInputIdle+0x49
Summary: crash in nsExpirationTracker<gfxFont,3>::CheckStartTimer → crash in nsExpirationTracker<gfxFont,3>::CheckStartTimer [@ nsExpirationTracker<nsSHEntry,3>::RemoveObject]
Breakpad also has some crash reports in that direction (limited search to one week):
Rank  Signature                                                      #  Win  Mac  Lin
1     nsExpirationTracker<gfxFont, 3>::AddObject(gfxFont*)          61   61    0    0
2     nsExpirationTracker<nsSHEntry, 3>::RemoveObject(nsSHEntry*)   24   24    0    0
nsSHEntry using nsExpirationTracker is new. gfxFont is different (unless this is a bug in nsExpirationTracker itself). Most of the gfxFont crashes are from SeaMonkey and require use of suite's old typeaheadfind (I can provide a testcase that will hit that consistently if it would be useful).
Keywords: crash
Maybe I should then split my crash into a new bug... :) (will do that later).
I don't suppose anyone has any idea how to reproduce this (nsSHEntry) crash?
Flags: blocking1.9?
Not really, no. I think one time I was waiting for a page to load and one time I was just reading a web page when it crashed.
Some additional information from a debug build (I don't know if it helps): I crashed a few lines earlier, at line 152. Many members of aObj are either 0xdddddddd, 0xdd, or -572662307 (for the mScrollPositionX/Y members). The exceptions are mParent (though the [2] member under __vfptr is 0xdddddddd), mCacheKey (same), mContentType, mDocument, and mContentViewer (though Windbg displays "<Memory access error>" for the values under __vfptr). There were no assertions or warnings in the console before the crash.
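A note on the values reported above: 0xdd is, as far as I know, the byte that the MSVC debug C runtime writes over freed heap blocks, and -572662307 is simply the 32-bit pattern 0xdddddddd read as a signed integer. That is consistent with aObj pointing at memory that had already been freed. The following is a minimal sketch of that arithmetic in standard C++; the function names are hypothetical and nothing here comes from the Mozilla tree.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the 32-bit pattern 0xdddddddd as a signed 32-bit integer.
// memcpy copies the raw bits, so no narrowing conversion is involved.
inline std::int32_t DeadFillAsSigned() {
  const std::uint32_t pattern = 0xddddddddu;
  std::int32_t value = 0;
  std::memcpy(&value, &pattern, sizeof(value));
  return value;
}

// Hypothetical helper: checks the value seen in mScrollPositionX/Y.
inline void CheckDeadFill() {
  assert(DeadFillAsSigned() == -572662307);
}
```

Equivalently, 0xdddddddd is 3722304989 unsigned, and 3722304989 - 4294967296 = -572662307.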
thanks.

Looks like a deleted object is still in the tracker. I'll see if I can add some assertions to help track this down.
Attached patch debugging code
This should help track down the crash. Very simple brute-force checking.
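The attached patch itself is not reproduced in this report. As an illustration only, brute-force checking of this kind could look like the sketch below: a linear scan over every generation to ask whether an object is still tracked. ToyEntry, ToyTracker, Contains and AssertNotTracked are hypothetical names for a much-simplified stand-in, not the real nsExpirationTracker API.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tracked object; not the real nsSHEntry.
struct ToyEntry { int id; };

// Hypothetical, much-simplified tracker with K generations.
template <std::size_t K>
class ToyTracker {
 public:
  void AddObject(ToyEntry* aObj) { mGenerations[0].push_back(aObj); }

  void RemoveObject(ToyEntry* aObj) {
    for (auto& gen : mGenerations) {
      gen.erase(std::remove(gen.begin(), gen.end(), aObj), gen.end());
    }
  }

  // Brute-force check: linear scan of every generation. Slow, but simple
  // enough to trust while hunting a stale pointer.
  bool Contains(const ToyEntry* aObj) const {
    for (const auto& gen : mGenerations) {
      if (std::find(gen.begin(), gen.end(), aObj) != gen.end()) return true;
    }
    return false;
  }

 private:
  std::array<std::vector<ToyEntry*>, K> mGenerations;
};

// Debug-only assertion intended to be called just before an object is freed.
template <std::size_t K>
void AssertNotTracked(const ToyTracker<K>& aTracker, const ToyEntry* aObj) {
  assert(!aTracker.Contains(aObj) && "object destroyed while still tracked");
}
```

The scan costs time proportional to the number of tracked objects, which is acceptable for debugging code but is why such checks are usually limited to debug builds.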
Assignee: nobody → roc
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #282343 - Flags: superreview?(bzbarsky)
Attachment #282343 - Flags: review?(bzbarsky)
Comment on attachment 282343 [details] [diff] [review]
debugging code

Let's try it
Attachment #282343 - Flags: superreview?(bzbarsky)
Attachment #282343 - Flags: superreview+
Attachment #282343 - Flags: review?(bzbarsky)
Attachment #282343 - Flags: review+
Attachment #282343 - Flags: approval1.9? → approval1.9+
Summary: crash in nsExpirationTracker<gfxFont,3>::CheckStartTimer [@ nsExpirationTracker<nsSHEntry,3>::RemoveObject] → crash in nsExpirationTracker<nsSHEntry,3>::RemoveObject
I checked in that patch ages ago.

Is anyone still seeing this? I haven't seen it at all.
Seems to be stable now. I guess it can be closed if you checked in that patch after 2007-08-20
That patch was just debugging code. It shouldn't have fixed any bugs :-(

I'll mark worksforme, I guess. At least the debugging code is there in case this flares up again.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
I also do not see my (this) crash anymore. crash-reports.m.c shows a peak for the nsExpirationTracker<nsSHEntry,3> signature around 9-30. It also shows that there are still a few crashes on trunk with that stack trace, though.
I see 72 crashers in the last week.
http://crash-stats.mozilla.com/report/list?range_unit=weeks&query_search=signature&query_type=contains&product=Firefox&version=Firefox%3A3.0a9pre&branch=1.9&signature=nsExpirationTracker%3CnsSHEntry%2C+3%3E%3A%3ARemoveObject(nsSHEntry*)&query=nsExpirationTracker&range_value=1

They're all over the place, seemingly at random.

It looks like we have a deleted nsSHEntry in our expiration tracker. I just don't see how that can happen. Maybe I should change this debug code so that when we destruct an nsSHEntry, if it's still in the tracker then we cause a crash right then. That would give us a more useful stack. Let me do that.
Attached patch patch (obsolete)
Crash early in release builds if a destroyed nsSHEntry is still in the tracker.

I'm also moving gHistoryTracker->AddObject(this) to an earlier position just in case one of the intervening calls is somehow destroying the nsSHEntry --- which shouldn't be possible, and would cause crashes elsewhere, I would think, but just in case...
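The idea of crashing at destruction time, rather than later when the timer touches the stale pointer, can be sketched as follows. This is an assumption-laden illustration, not the attached patch: ToyRegistry and ToyEntry are hypothetical stand-ins for the tracker and for nsSHEntry.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <unordered_set>

// Hypothetical registry standing in for the expiration tracker.
class ToyRegistry {
 public:
  void Add(const void* aObj) { mLive.insert(aObj); }
  void Remove(const void* aObj) { mLive.erase(aObj); }
  bool Contains(const void* aObj) const { return mLive.count(aObj) != 0; }

 private:
  std::unordered_set<const void*> mLive;
};

// Hypothetical entry that refuses to die quietly while still tracked.
class ToyEntry {
 public:
  explicit ToyEntry(ToyRegistry& aRegistry) : mRegistry(aRegistry) {}

  ~ToyEntry() {
    const bool stillTracked = mRegistry.Contains(this);
    // Debug builds: stop here with a readable message.
    assert(!stillTracked && "entry destroyed while still tracked");
    // Release builds: assert() is compiled out under NDEBUG, so abort
    // explicitly. The resulting stack names whoever destroyed the entry.
    if (stillTracked) {
      std::fprintf(stderr, "fatal: entry destroyed while still tracked\n");
      std::abort();
    }
  }

  void StartTracking() { mRegistry.Add(this); }
  void StopTracking() { mRegistry.Remove(this); }

 private:
  ToyRegistry& mRegistry;
};
```

The design trade-off is that users crash earlier and more often in exchange for a stack trace that points at the faulty destruction instead of at an unrelated timer callback.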
Attachment #285527 - Flags: superreview?(bzbarsky)
Attachment #285527 - Flags: review?(bzbarsky)
Attached patch updated patch
More crash-early instrumentation to detect if AddObject being called when the entry is already in the tracker is the cause of our problems.
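Detecting a double AddObject can be sketched in the same spirit. The code below assumes a hypothetical ToyTracker whose AddObject reports whether the entry was newly inserted; the real nsExpirationTracker interface may differ, and AddObjectOrCrash is an invented helper name.

```cpp
#include <cassert>
#include <cstdlib>
#include <unordered_set>

// Hypothetical stand-in for a tracked object; not the real nsSHEntry.
struct ToyEntry { int id; };

// Hypothetical tracker that reports duplicate additions.
class ToyTracker {
 public:
  // Returns false if aObj was already tracked.
  bool AddObject(ToyEntry* aObj) { return mTracked.insert(aObj).second; }

  // Returns false if aObj was not tracked.
  bool RemoveObject(ToyEntry* aObj) { return mTracked.erase(aObj) == 1; }

 private:
  std::unordered_set<ToyEntry*> mTracked;
};

// Crash-early wrapper: a duplicate add stops the program at the call site.
inline void AddObjectOrCrash(ToyTracker& aTracker, ToyEntry* aObj) {
  const bool added = aTracker.AddObject(aObj);
  assert(added && "AddObject called for an already-tracked entry");
  if (!added) std::abort();  // release builds, where assert() compiles out
}
```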
Attachment #285527 - Attachment is obsolete: true
Attachment #285547 - Flags: superreview?(bzbarsky)
Attachment #285547 - Flags: review?(bzbarsky)
Attachment #285527 - Flags: superreview?(bzbarsky)
Attachment #285527 - Flags: review?(bzbarsky)
Comment on attachment 285547 [details] [diff] [review]
updated patch

Let's do it.
Attachment #285547 - Flags: superreview?(bzbarsky)
Attachment #285547 - Flags: superreview+
Attachment #285547 - Flags: review?(bzbarsky)
Attachment #285547 - Flags: review+
Comment on attachment 285547 [details] [diff] [review]
updated patch

need approval for this mostly-debug code to help us track down a mysterious crasher
Attachment #285547 - Flags: approval1.9?
Reopening since it's clearly still happening to some people
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment on attachment 282343 [details] [diff] [review]
debugging code

Resetting all approval1.9+ flags on bugs that have not been checked in by Oct 22 11:59 PM PDT.  Please re-request approval if needed.
Attachment #282343 - Flags: approval1.9+
Comment on attachment 282343 [details] [diff] [review]
debugging code

This was checked-in as rev 1.5 on 2007-09-30 14:34.
Attachment #282343 - Flags: approval1.9?
Comment on attachment 282343 [details] [diff] [review]
debugging code

Reset approval flag to + as it was already checked in.
Attachment #282343 - Flags: approval1.9? → approval1.9+
Related to bug 398084?
I don't think so, unless there's a bug in nsExpirationTracker itself, in which case I think we'd see more crashes and some crashes in gfxTextRunCache which uses it heavily.
Attachment #285547 - Flags: approval1.9? → approval1.9+
Whiteboard: [needs landing]
"updated patch" checked-in.

Checking in docshell/shistory/src/nsSHEntry.cpp;
/cvsroot/mozilla/docshell/shistory/src/nsSHEntry.cpp,v  <--  nsSHEntry.cpp
new revision: 1.61; previous revision: 1.60
done
Whiteboard: [needs landing]
The first few crashers due to that check-in are showing up on Socorro: See for example http://crash-stats.mozilla.com/report/index/6ea8c700-8fbd-11dc-be7b-001a4bd43e5c or http://crash-stats.mozilla.com/report/index/21871ab3-8fb6-11dc-b3ee-001a4bd43ef6 (or in general search for nsSHEntry::SetContentViewer).
Attached patch band-aid
I think we're violating this:

nsSHEntry::SetContentViewer(nsIContentViewer *aViewer)
{
  NS_PRECONDITION(!aViewer || !mContentViewer, "SHEntry already contains viewer");

Not sure how, and with no steps to reproduce, debugging is hard.

This patch mitigates this situation somewhat by calling DropPresentationState in that situation. This might save people from crashing.
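The band-aid approach described here, recovering when the precondition is violated instead of continuing in an inconsistent state, might look roughly like the sketch below. ToyViewer and ToyHistoryEntry are hypothetical; only the names SetContentViewer and DropPresentationState come from the comment above, and their real behaviour in nsSHEntry is more involved.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for nsIContentViewer.
struct ToyViewer { int id; };

// Hypothetical stand-in for nsSHEntry.
class ToyHistoryEntry {
 public:
  void SetContentViewer(std::shared_ptr<ToyViewer> aViewer) {
    if (aViewer && mViewer) {
      // Precondition violated: a viewer is already cached. Drop the old
      // presentation first so the entry is in a known state again.
      DropPresentationState();
      assert(!mViewer);
      ++mRecoveries;
    }
    mViewer = std::move(aViewer);
  }

  void DropPresentationState() { mViewer.reset(); }

  const ToyViewer* Viewer() const { return mViewer.get(); }
  int Recoveries() const { return mRecoveries; }

 private:
  std::shared_ptr<ToyViewer> mViewer;
  int mRecoveries = 0;  // how often the precondition was violated
};
```

Counting recoveries is an addition made for this sketch; it keeps the violation observable even though the program no longer crashes on it.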
Attachment #288252 - Flags: superreview?(bzbarsky)
Attachment #288252 - Flags: review?(bzbarsky)
Comment on attachment 288252 [details] [diff] [review]
band-aid

Let's hope Jesse hits those asserts...
Attachment #288252 - Flags: superreview?(bzbarsky)
Attachment #288252 - Flags: superreview+
Attachment #288252 - Flags: review?(bzbarsky)
Attachment #288252 - Flags: review+
Attachment #288252 - Flags: approval1.9? → approval1.9+
checked that patch in.
If I searched the crash reports database correctly, there were lots of crashes with 3.0b1 (2007110904), only one more crash with 3.0b2 (build id 2007121100), and none at all afterwards. So this is fixed?
Per the last patch, the crash-on-purpose lines have been taken out, so I suspect this bug is being kept open in case someone hits the assert.
I don't think it makes sense to leave bugs open forever "just in case". Let's close this and mark it WFM. We can always open a new bug.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 13 years ago
Resolution: --- → WORKSFORME