Closed Bug 426275 Opened 16 years ago Closed 14 years ago

crash on clear private data (history) [@ nsNavHistoryExpire::ClearHistory]

Categories

(Firefox :: Bookmarks & History, defect, P1)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: jbecerra, Unassigned)

References

Details

(Keywords: crash)

Crash Data

Attachments

(4 files, 1 obsolete file)

While running Litmus test case #4141 on Fx3b5rc2 on: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9b5) Gecko/2008032619 Firefox/3.0b5

http://crash-stats.mozilla.com/report/index/617175f8-ff7f-11dc-8227-001a4bd43e5c

I opened the history sidebar, deleted an entry, searched for such entry, got some partial results in the sidebar, left the sidebar open, then went into Clear Private Data and selected Browing History (only). Firefox crashed after clicking on the Clear Private Data Now button.
Severity: normal → critical
Keywords: crash
Confirmed on Mac Nightly and Win XP Beta5rc2
Status: UNCONFIRMED → NEW
Ever confirmed: true
I cannot reproduce it right now. Clear STR would be recommended. Could you try with a fresh profile? And if it is reproducible even after a restart the broken places.sqlite would be interesting.

First 10 frames of the stack:

0  	@0xadfb6800  	
1 	nsNavHistoryExpire::ClearHistory() 	mozilla/toolkit/components/places/src/nsNavHistoryExpire.cpp:289
2 	nsNavHistory::RemoveAllPages() 	mozilla/toolkit/components/places/src/nsNavHistory.cpp:3815
3 	NS_InvokeByIndex_P 	mozilla/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_unixish_x86.cpp:179
4 	XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) 	mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2369
5 	XPC_WN_CallMethod(JSContext*, JSObject*, unsigned int, long*, long*) 	mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp:1470
6 	js_Invoke 	mozilla/js/src/jsinvoke.c:1287
7 	js_Interpret 	mozilla/js/src/jsinterp.c:4841
8 	js_Invoke 	mozilla/js/src/jsinvoke.c:1303
9 	nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) 	mozilla/js/src/xpconnect/src/xpcwrappedjsclass.cpp:1475
10 	nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) 	mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:559

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/toolkit/components/places/src/nsNavHistoryExpire.cpp&rev=1.43&mark=289#289
Status: NEW → ASSIGNED
OS: Mac OS X → All
Hardware: PC → All
Summary: crash on clear private data → crash on clear private data (history) [@ nsNavHistoryExpire::ClearHistory]
I repro'd the crash in the nightly on XP: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9pre) Gecko/2008033105 Minefield/3.0pre.

My STR:
1) opened a dirty profile (over 4 days of history), in the history sidebar
2) selected a random history item and deleted it
3) with history sidebar still opened, Cleared Private data with only "browsing history" checked
4) Crashed, with breakpad dialog appearing.

http://crash-stats.mozilla.com/report/index/76d0d766-ff83-11dc-a936-001a4bd43ef6
Flags: blocking-firefox3?
(In reply to comment #3)
> 2) selected a random history item and deleted it

Which view you have selected within the history sidebar when doing the deletion?
Status: ASSIGNED → NEW
the default one..  By Date.
please relnote for beta 5 and fix by final
Keywords: relnote
Flags: blocking-firefox3? → blocking-firefox3+
Priority: -- → P1
Target Milestone: --- → Firefox 3
Tony, if you're able to reliably reproduce, can you attach a places.sqlite that causes the problem?
no, i cant' reliable reproduce this now.   once i do, i'll add my places.sqlite file here.
Perhaps it's a cycle collection bug?  Looking at the breakpad report in comment 3, it seems to me that one of the observers we are trying to notify doesn't exist anymore.

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/toolkit/components/places/src/nsNavHistoryResult.cpp&rev=1.138&mark=4276#4276

Could also be how they are stored, since they are just stored as pointers.  I don't see us addreffing or removing them ever...
Hm, not sure about that.

It looks like nsNavHistory holds an nsRefPtr to a container, which holds an nsCOMArray to nodes, so I think somehow the node is forgetting to call RemoveHistoryObserver in some case when it should.

Cycle collection doesn't seem to be implicated since I can't see how we could unlink a node without also unlinking the result that owns it...

Note that nsNavHistoryQueryResultNode doesn't call RemoveHistoryObserver in a destructor like nsNavHistoryFolderResultNode does with RemoveBookmarkFolderObserver.
-> dietrich
Assignee: nobody → dietrich
(In reply to comment #11)
> -> dietrich
actually, -> me.  I'm working on this right now.
Assignee: dietrich → sdwilsh
Attached patch v0.1 testcaseSplinter Review
Here's a testcase, but I can't get it to crash.  This reproduces the situation in comment 3 as best I can, but doesn't crash.  So, if someone can get the test to crash, I'll be happy to take another look at it.  However, I can't get a test to crash, nor can I get a build to crash, so it's pretty hard for me to debug.
I'll have a look at this. But lets add qawanted keyword meanwhile...
Keywords: qawanted
Shawn, i think that the point for having crash is the opened history sidebar, maybe due to dynamically created containers
try a query with RESULTS_AS_DATE, leaving the container open while you clear history
Shawn, how can I run this unit test? Is this a xpshell one? So i have to do the steps mentioned here: http://developer.mozilla.org/en/docs/Writing_xpcshell-based_unit_tests?
(In reply to comment #15)
> Shawn, i think that the point for having crash is the opened history sidebar,
> maybe due to dynamically created containers
I though that was what this test was using :/


(In reply to comment #17)
> Shawn, how can I run this unit test? Is this a xpshell one? So i have to do the
> steps mentioned here:
> http://developer.mozilla.org/en/docs/Writing_xpcshell-based_unit_tests?
In your object director you'll want to run the following:
make -C toolkit/components/places/tests/ (just the first time to install the test)
make -C toolkit/components/places/tests/ check-one SOLO_FILE="test_426275.js"
I think if you are on windows, you have to the the first step each time you make a change.
(In reply to comment #18)
> In your object director you'll want to run the following:
> make -C toolkit/components/places/tests/ (just the first time to install the
> test)
> make -C toolkit/components/places/tests/ check-one SOLO_FILE="test_426275.js"
> I think if you are on windows, you have to the the first step each time you
> make a change.

I tried to do that but the subfolder "unit" doesn't exist. Seems that is something wrong with my build. The folders "browser" and "chrome" still exist. But running tests there it even wont run:

###!!! ASSERTION: Fault in cycle collector: multiple registrations of language runtime (ptr: 25c10780)
: 'Not Reached', file /Users/Shared/Projects/mozilla/source/mozilla/xpcom/base/nsCycleCollector.cpp, line 1053
nsCycleCollector_suspectedCount()+0x000003C5 [../../../../dist/bin/libxpcom_core.dylib +0x0007A28D]
nsCycleCollector_forgetRuntime(unsigned int)+0x0000007E [../../../../dist/bin/libxpcom_core.dylib +0x0007A3D6]
nsCycleCollector_registerRuntime(unsigned int, nsCycleCollectionLanguageRuntime*)+0x00000036 [../../../../dist/bin/libxpcom_core.dylib +0x0007A420]
UNKNOWN [/Users/Shared/Projects/mozilla/source/obj/browser-i386-apple-darwin8.11.1/toolkit/components/places/tests/../../../../dist/bin/components/libxpconnect.dylib +0x00004E4B]

So sorry, I can't run your unit test for now.
maybe related: while looking into a different issue, i just found that bookmark observers added via nsINavHistoryResult are never removed. i haven't looked further into it yet, so don't know why.
Attached patch fix (obsolete) — Splinter Review
sdwilsh said he couldn't get to this for a few days, so taking.

query results were never removing themselves as either history or bookmarks observers, and result nodes were never removed as result observers. this patch removes all observers at each level when the result is destroyed.
Assignee: sdwilsh → dietrich
Status: NEW → ASSIGNED
Attachment #313866 - Flags: review?(mano)
well, that is a *potential* fix, since i haven't yet repro'd the crash.
(In reply to comment #18)
> make -C toolkit/components/places/tests/ check-one SOLO_FILE="test_426275.js"
> I think if you are on windows, you have to the the first step each time you
> make a change.

I was able to run your testcase on my OS X 10.5 box. But there was no crash:

-n ../../../../_tests/xpcshell-simple/test_places/unit/test_426275.js: 
PASS
Whiteboard: [has patch][needs review mano]
(In reply to comment #23)
> I was able to run your testcase on my OS X 10.5 box. But there was no crash:
Yup - I couldn't reproduce either.
I don't understand this, aren't all elements from aData removed as you delete... aData?
Attached patch fix v2Splinter Review
yep, you're correct. i retested, and confirmed that bit is not necessary.
Attachment #313866 - Attachment is obsolete: true
Attachment #314125 - Flags: review?(mano)
Attachment #313866 - Flags: review?(mano)
Comment on attachment 314125 [details] [diff] [review]
fix v2

Drive by review comments...

>Index: toolkit/components/places/src/nsNavHistoryResult.cpp
>+    // remove history observer
>+    nsNavHistory* history = nsNavHistory::GetHistoryService();
>+    if (history) {
>+      history->RemoveObserver(this);
>+    }
I don't think it's even possible for us to have an observer without a history service, right? (is this check really neccessary)

General comment:  I thought it was the style in places to not have braces for single line ifs and fors.
(In reply to comment #27)
> (From update of attachment 314125 [details] [diff] [review])
> Drive by review comments...
> 
> >Index: toolkit/components/places/src/nsNavHistoryResult.cpp
> >+    // remove history observer
> >+    nsNavHistory* history = nsNavHistory::GetHistoryService();
> >+    if (history) {
> >+      history->RemoveObserver(this);
> >+    }
> I don't think it's even possible for us to have an observer without a history
> service, right? (is this check really neccessary)

theoretically, you are correct.

> General comment:  I thought it was the style in places to not have braces for
> single line ifs and fors.
> 

will fix.

also, Mano brought up that this dtor should never be called while any nodes are alive, as they hold a strong reference to the result. i did some debugging and it looks like results *are* sometimes being destroyed before the result nodes they contain, so seems nodes are not holding a strong reference to the result correctly, as Mano suggested.
Whiteboard: [has patch][needs review mano] → [has patch]
Ah? I said that the *observer* holds a strong reference to the result, not the nodes (I don't think they do, actually, and if they're, there's a cycle collector rule for that, iirc).
and, the code style in *toolkit*'s places is to have braces around single-line blocks.
Whiteboard: [has patch] → [has patch][needs review mano]
Comment on attachment 314125 [details] [diff] [review]
fix v2

Comment 29
Attachment #314125 - Flags: review?(mano)
Whiteboard: [has patch][needs review mano] → [has patch]
Whiteboard: [has patch] → [needs new patch]
(In reply to comment #29)
> Ah? I said that the *observer* holds a strong reference to the result, not the
> nodes (I don't think they do, actually, and if they're, there's a cycle
> collector rule for that, iirc).
> 

Er, did you mean the bookmark/history service holds a strong ref to the result? Because the result *is* the observer, and nodes are observers of the result.

Regardless, the result is supposed to be a weak observer of the history and bookmarks services, but doesn't seem to ever be let go of. Maybe should just remove nsMaybeWeakPtr and weak observer support altogether and have results always manually remove themselves (as in this patch), since this is the only weak bookmarks/history observer case in the tree afaict.
Hrm, I was wrong - these are being cleaned up (hence the dtor being called in the first place). Possibly the invalid weak refs to the GC'd result objects are not being removed from the observer list ever.
remove null entries in bookmarks and history service observer arrays.

these entries *shouldn't* be a problem, but best to stay clean.
Attachment #318324 - Flags: review?(mano)
Tony (or anyone else) have you seen this crash again since reporting? I've still not been able to reproduce it.
Comment on attachment 318324 [details] [diff] [review]
fix for comment #33

There's nothing safe in removing elements from an array, while looping over it using an index...

I don't understand why should we remove anything either.
Attachment #318324 - Flags: review?(mano) → review-
(In reply to comment #36)
> (From update of attachment 318324 [details] [diff] [review])
> There's nothing safe in removing elements from an array, while looping over it
> using an index...

*sigh*

> 
> I don't understand why should we remove anything either.
> 

just looping over null array elements, so i guess not that big of a problem.
tomcat is unable to reproduce. i'm unable to reproduce, tried against a 6 mo. history file, as well as shorter history files. tchung said he'd try again later today.
I was able to reproduce again this morning using the steps described in this crash report: http://crash-stats.mozilla.com/report/index/d4a62919-1612-11dd-8eb4-001cc45a2c28?date=2008-04-29-17

Since then I have tried many more times, and I have not been able to reproduce the problem. I will keep trying.
(In reply to comment #10)
> Note that nsNavHistoryQueryResultNode doesn't call RemoveHistoryObserver in a
> destructor like nsNavHistoryFolderResultNode does with
> RemoveBookmarkFolderObserver.
> 

so, mResult is never valid in the dtor when the node is GC'd, so that observer-removal code is not executed (which i found out after giving the same treatment to query result nodes).

looks like maybe the cycle collection rules create a scenario where mResult is null in the node dtors, and mRootNode is null in the result dtor, so can't really anything useful in either.
Seems to be happening rarely, so not blocking anymore. Renom if that changes.
Flags: blocking-firefox3+ → blocking-firefox3-
http://crash-stats.mozilla.com/?do_query=1&product=Firefox&branch=1.9&version=Firefox%3A3.0b5&query_search=stack&query_type=contains&query=nsNavHistoryExpire%3A%3AClearHistory()&date=&range_value=1&range_unit=weeks

shows that a ~million beta5 users encountered crashes that contained "nsNavHistoryExpire::ClearHistory() in the stack about 100 times.

doesn't look like this is anywhere near top crash, but we should keep an eye on it.

I've a similar problem with the release candidate. I close Firefox with the History sidebar open. I then open firefox and the first thing I do is to right click on the first History entry and select delete, Firefox then immediately crashes. The bug is repeatable, I'm on Windows XP SP2.
Target Milestone: Firefox 3 → Firefox 3.1
Flags: blocking1.9.0.3?
Flags: blocking1.9.0.4? → wanted1.9.0.x+
Dietrich, are you still working on this?
Whiteboard: [needs new patch] → [needs status update dietrich][needs new patch]
untargetting since cause is still not known and we need to look at crash reports since datas are based on 3.0.x betas still.
Target Milestone: Firefox 3.1 → ---
I was able to hit this crash twice now:
http://crash-stats.mozilla.com/report/index/8eea562c-082b-481e-8195-610552090121?p=1

I did a backup of my profile and will hopefully have a testcase and steps to reproduce the crash. Please stay tuned.
Minefield gives us a bit more information. So the crash happens in nsNavHistoryFolderResultNode::GetViewIndex. I remember that I got this crash each time after I've tried to reproduce bug 421246. I think both bugs are related to each other.

http://crash-stats.mozilla.com/report/index/772c5b53-e3bb-4068-966d-175572090121?p=1

0  	xul.dll  	nsNavHistoryFolderResultNode::GetViewIndex  	 toolkit/components/places/src/nsNavHistoryResult.h:284
1 	xul.dll 	nsNavHistoryResult::OnClearHistory 	toolkit/components/places/src/nsNavHistoryResult.cpp:4466
2 	xul.dll 	nsNavHistoryExpire::ClearHistory 	toolkit/components/places/src/nsNavHistoryExpire.cpp:291
3 	xul.dll 	nsNavHistory::RemoveAllPages 	toolkit/components/places/src/nsNavHistory.cpp:4612

I can't offer real STR right now, but browse some sites and try to reproduce bug 421246 (sidebar history entries cannot be deleted). Doing a clear private data action with all entries crashes most of the time for me.
removing qawanted... due diligence has been done by whimboo, others.
Keywords: qawanted
this could be another case related to bad cycle collecting (bug 487040)
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Assignee: dietrich → nobody
Severity: critical → major
Whiteboard: [needs status update dietrich][needs new patch]
Severity: major → critical
Whimboo: where is this on crash reports in 3.6, 3.7? is this a topcrash?
This specific crash doesn't appear anymore on crash-stats. No version of Firefox show crashes in this signature for the last 4 weeks. I or someone else should check my comment 48 and try to reproduce the crash. If we are not able to we should close this bug as WFM.
Keywords: relnoteqawanted
btw, GetViewIndex is gone on 3.6, and ClearHistory will disappear in 3.7.
there are only 4 reports of this crash over the last 2 weeks.  The crash only appeared on Mac, and for Firefox 3.0.1 and 3.0.15

http://crash-stats.mozilla.com/query/query?version=ALL%3AALL&date=&range_value=2&range_unit=weeks&query_search=signature&query_type=contains&query=nsNavHistoryExpire%3A%3AClearHistory&do_query=1
Strange that i wasn't able to find those crashes. Thanks chris. The interesting part is that those also only appear on 10.4. But that could be the cause of the limited amount of reports. So two of our users could have been crashed here.
Signature	@0x0 | nsNavHistoryExpire::ClearHistory()
UUID	146aa5e1-a662-486c-bcee-f3e702091209
Time 	2009-12-09 07:14:54.344121
Uptime	21391
Last Crash	88354 seconds before submission
Product	Firefox
Version	3.0.15
Build ID	2009101600
Branch	1.9.0
OS	Mac OS X
OS Version	10.4.11 8S165
CPU	ppc
CPU Info	
Crash Reason	EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash Address	0x0
User Comments	
Processor Notes 	
Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 		@0x0 	
1 	XUL 	nsNavHistoryExpire::ClearHistory 	mozilla/toolkit/components/places/src/nsNavHistoryExpire.cpp:302
2 	XUL 	nsNavHistory::RemoveAllPages 	mozilla/toolkit/components/places/src/nsNavHistory.cpp:4001
3 	XUL 	XUL@0x8c690c 	
4 	XUL 	XPCWrappedNative::CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2395

298 // XXX todo
299 // forcibly call the "on idle" timer here to do a little work
300 // but the rest will happen on idle.
301
302 ENUMERATE_WEAKARRAY(mHistory->mObservers, nsINavHistoryObserver,
303 OnClearHistory()) 

Just because it seems worth noting that this crash was a happy null pointer deref.
This is a mass change. Every comment has "assigned-to-new" in it.

I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW
this is a 3.0.x only crash and it's hard to fix it there without large changes since most of the code has changed. I feel like it has been fixed in 3.5 by the patch in bug 487040 that fixed cycle collector glitches.

In the last 2 weeks there is only 1 report in 3.0.11.
So this is FIXED on 3.5+, WONTFIX on 3.0.11 (since fixing it would cause more regressions risks than what the benefit is)
Status: NEW → RESOLVED
Closed: 14 years ago
Depends on: 487040
Resolution: --- → FIXED
Crash Signature: [@ nsNavHistoryExpire::ClearHistory]
You need to log in before you can comment on or make changes to this bug.