Closed Bug 426261 Opened 15 years ago Closed 14 years ago

Crash in nsNavHistoryContainerResultNode::GetSortType() during bookmark synchronization

Categories

(Firefox :: Bookmarks & History, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Firefox 3

People

(Reporter: ondrej, Assigned: ondrej)

References

Details

(Keywords: crash, Whiteboard: [RC2+])

Attachments

(4 files, 5 obsolete files)

Bug 422743 has fixed a crash in nsVoidArray::Count() during bookmark synchronization, however, there is yet another crash in method nsNavHistoryContainerResultNode::GetSortType().

(In reply to bug 422743 comment #18)
> With my large bookmark set, I still saw the following, with two different sync
> attempts:
> http://crash-stats.mozilla.com/report/index/3ca87c05-fe93-11dc-9ead-001a4bd43ef6
> http://crash-stats.mozilla.com/report/index/7ab4fb82-fe93-11dc-a233-001a4bd43ef6
> 
> I spent a while over the weekend trying deleting groups of bookmarks, then
> creating fixed sized folders to keep adding to try and find some magic number.
> As near as I can tell, no particular number of bookmarks was triggering it, but
> there was definitely an issue somehow with the structure of the folders, or the
> depth, or the naming.
Flags: blocking-firefox3?
Drivers: the same segment of Foxmarks users afflicted by bug 422743 also experience this crash. this should block.
This is also a problem on the MacOS version of Firefox 3b5.

The following is a workaround that worked on the Mac.

I was able to get Foxmarks to work without losing my bookmarks. I made a backup of the bookmarks.html file, then deleted the profile directory and reinstalled Foxmarks (Do not run the setup wizard). Of course this wiped out my passwords, settings and cookies. My major concern is getting Foxmarks again.

I then quit Firefox 3b5, replaced the bookmarks.html with the backup copy and started Firefox. Instead of the wizard I used the preference for Foxmarks and told it to upload the bookmarks, replacing the server side. That worked and has been working for several sync's since.

Even with a clean install of Firefox, then the beta Foxmarks, trying to download from the server first, with or without the wizard, crashes Firefox.
I suspect the cycle collector running at wrong time . I would like to somehow disable it, or decrease possibility that it will run and test that it does not crash anymore. Is there some easy way how to do this in existing build, or should I modify something like nsJSContext::MaybeCC(PRBool aHigherProbability) and create a test build?

  if (mParent)
    return mParent->GetSortType();

This really should not be giving access violation.
Dietrich can you please create a test build which includes this patch. CycleCollector will be disabled and if the crash will not be reproducible we will know that the problem is related to it. The built will leak memory enormously, but it should be possible to sync with Foxmarks.
Flags: blocking-firefox3? → blocking-firefox3+
Here's a summary of what users have reported to me thus far:

1) On Linux: the test build core dumps on Firefox startup. So unfortunately no new info there.

2) On Windows: several users reported that syncing with Foxmarks continues to crash the test build of Firefox, but not in a way that produces a crash report. One user had Visual Studio installed and was thus able to get Windows to fire up a debugger. He reports:

Unhandled exception at 0x1047043f in firefox.exe: 0xC0000005: Access 
violation reading location 0x05534938.

This is all the information I have to date.
Test build core dumps on startup as well.
Test build core dumps on a Mac on startup as well.
(In reply to comment #8)
> Test build core dumps on a Mac on startup as well.
> 

I can confirm the same from another Mac OSX Foxmarks use. Here's an excerpt from the crash log; full log available if you're interested:

Process:         firefox-bin [204]
Path:            /Applications/Minefield.app/Contents/MacOS/firefox-bin
Identifier:      org.mozilla.firefox
Version:         3.0pre (3.0pre)
Code Type:       X86 (Native)
Parent Process:  launchd [1]

Date/Time:       2008-04-10 08:02:50.869 +0100
OS Version:      Mac OS X 10.5.2 (9C7010)
Report Version:  6

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000ffffffab
Crashed Thread:  0

Thread 0 Crashed:
0   ???                           	0x01ab5790 0 + 28006288
1   libmozjs.dylib                	0x000d4bab JS_CompareValues + 5460
2   libmozjs.dylib                	0x000e01a3 js_Invoke + 2984
3   libmozjs.dylib                	0x000a858d JS_ExecuteScript + 56
4   XUL                           	0x00ce6d32 XRE_GetFileFromPath + 251220
5   XUL                           	0x00ce745b XRE_GetFileFromPath + 253053
6   XUL                           	0x0148f44f NS_GetComponentRegistrar_P + 18951
7   XUL                           	0x0148f912 NS_GetComponentRegistrar_P + 20170
8   XUL                           	0x0148fa02 NS_GetComponentRegistrar_P + 20410
9   XUL                           	0x0148ff17 NS_GetComponentRegistrar_P + 21711
10  XUL                           	0x014614d2 NS_InitXPCOM3_P + 1520
11  XUL                           	0x00c8e7d1 0xc8a000 + 18385
12  XUL                           	0x00c9261b XRE_main + 10183
13  org.mozilla.firefox           	0x00001d25 start + 641
14  org.mozilla.firefox           	0x00001bb2 start + 270
15  org.mozilla.firefox           	0x00001acd start + 41
Attached file Problems in Foxmarks
I wanted to create another patch, which would disable garbage collection on other place, but I can see too many errors in console regarding Foxmarks. Upload of bookmarks did not finish. I have just updated to 2.0.44.17.
Attachment #314437 - Attachment is obsolete: true
(In reply to comment #10)

> I wanted to create another patch, which would disable garbage collection on
> other place, but I can see too many errors in console regarding Foxmarks.
> Upload of bookmarks did not finish. I have just updated to 2.0.44.17.
> 

This looks like an extension installation problem -- you'll probably find that if you re-install that it straightens itself out. We've seen this before but never understood what causes it.

-Todd

* Call to xpconnect wrapped JSObject produced this error:  *
[Exception... "'Error opening input stream (invalid filename?)' when calling met
hod: [nsIFactory::createInstance]"  nsresult: "0x8057001e (NS_ERROR_XPC_JS_THREW_STRING)"  location: "<unknown>"  data: no]
************************************************************
WARNING: Cannot create startup observer : service,@foxcloud.com/extensions/foxma
rks;1: file c:/usr/src/trunk/mozilla/embedding/components/appstartup/src/nsAppSt
artupNotifier.cpp, line 113
(In reply to comment #11)
> (In reply to comment #10)
> 
> > I wanted to create another patch, which would disable garbage collection on
> > other place, but I can see too many errors in console regarding Foxmarks.
> > Upload of bookmarks did not finish. I have just updated to 2.0.44.17.
> > 
> 
> This looks like an extension installation problem -- you'll probably find that
> if you re-install that it straightens itself out. We've seen this before but
> never understood what causes it.
> 
> -Todd

I made fresh check out and built a release version and still have the same problem.

Error: [Exception... "'Error opening input stream (invalid filename?)' when calling method: [nsIFactory::createInstance]"  nsresult: "0x8057001e (NS_ERROR_XPC_JS_THREW_STRING)"  location: "<unknown>"  data: no]

When I force sync, it shows "Foxmarks Status" window which I cannot close (cancel does not react, no status is shown).
Ondrej,

I guess I wasn't clear -- to fix this, you'll need to reinstall the Foxmarks extension, not Firefox.

-Todd
(In reply to comment #13)
> Ondrej,
> 
> I guess I wasn't clear -- to fix this, you'll need to reinstall the Foxmarks
> extension, not Firefox.
> 
> -Todd
> 

;-) 

I did that 3 times on old profile, did not help. 
I installed in on fresh profile, did not help.
I have built a release version from trunk and installed the extension, did not help.
(In reply to comment #14)
> (In reply to comment #13)
> > Ondrej,
> > 
> > I guess I wasn't clear -- to fix this, you'll need to reinstall the Foxmarks
> > extension, not Firefox.
> > 
> > -Todd

AVG (v. 7.5) Resident Shield Protection with "Scan Potentially Unwanted Programs" turned on (default) disables access to foxmarks-utils.js file. This is the minimal code that  is recognized as unwanted:

while (i < input.length) {

  enc1 = this._keyStr.indexOf(input.charAt(i++));
  enc2 = this._keyStr.indexOf(input.charAt(i++));
  enc3 = this._keyStr.indexOf(input.charAt(i++));
  enc4 = this._keyStr.indexOf(input.charAt(i++));

  chr1 = (enc1 << 2) | (enc2 >> 4);
}

I leave it up to you to change the code or contact AVG (if you have any AVG users) .
Dietrich would you please create another try server build? I have disabled garbage collection on another place.
(In reply to comment #15)
> 
> AVG (v. 7.5) Resident Shield Protection with "Scan Potentially Unwanted
> Programs" turned on (default) disables access to foxmarks-utils.js file.
> 
> (snip)
>
> I leave it up to you to change the code or contact AVG (if you have any AVG
> users) .
> 

Ah, yes, that would explain it. We did hear about a problem with AVG's misidentification of Foxmarks as malware for a few days last week. Apparently AVG's latest virus file has fixed the problem. Sorry if you wasted time tracking this down -- I wouldn't have guessed this was the problem given the behavior you described.
Is there a new build available against which I can ask people to test?
(In reply to comment #19)
The linux build coredumps on Ubuntu while trying to sync bookmarks with the foxmarks extension, although pretty late when writing to the sync file (the last line in the foxmarks log is: "NotifyObservers: Changed"
(In reply to comment #20)
> The linux build coredumps on Ubuntu while trying to sync bookmarks with the
> foxmarks extension, although pretty late when writing to the sync file (the
> last line in the foxmarks log is: "NotifyObservers: Changed"
> 

This sounds like the same crash we've been seeing. Do you have a crash report url?
No, sorry. It just coredumps without the crash reporter coming up (in contrast to regular 3.0b5)
Todd, we need Windows test users, that were not able to sync before, the try build works fine for me on Windows and I can sync in both directions without any problems.
I just tested with an install of the test build and an empty profile on Windows XP. I get basically the same behavior. It crashes during "writing sync file" with a windows error report, but not a talkback error, and no crash report.
(In reply to comment #24)
> I just tested with an install of the test build and an empty profile on Windows
> XP. I get basically the same behavior. It crashes during "writing sync file"
> with a windows error report, but not a talkback error, and no crash report.
> 

I just heard from another user (also on XP) that reported the same experience -- crash during sync with no crash report.

Todd, I believe that you are referring to me. I was just commenting on this bug when you provided your comment.
(In reply to comment #26)
> Todd, I believe that you are referring to me. I was just commenting on this bug
> when you provided your comment.
> 

I was indeed referring to you -- just trying to facilitate!
Todd, can we get some steps to reproduce here, and perhaps a regression range (ie: when did you start seeing this?) We're unable to reproduce this. Also, do you know what number of your users this is affecting?
Todd just wrote me, that he was able to reproduce crash on his computer with one of the crashing account and that he can give me access to it. Hopefully, we will be able to solve this soon.
I also had the bug 426261/422743 crash with foxmarks after syncing in a machine with a big number of bookmark changes.
Submitted a number of crash dumps since the bug comments mentioned problems collecting dumps:
http://crash-stats.mozilla.com/report/index/06b2b83d-1232-11dd-8fb2-0013211cbf8a?p=1
http://crash-stats.mozilla.com/report/index/4dbe9f73-1232-11dd-8ff5-0013211cbf8a?p=1
http://crash-stats.mozilla.com/report/index/a22da3d1-1232-11dd-a077-001cc4e2bf68?p=1
http://crash-stats.mozilla.com/report/index/4bfa89c0-1233-11dd-90ff-001cc45a2c28?p=1
http://crash-stats.mozilla.com/report/index/efd25563-1233-11dd-bce2-0013211cbf8a?p=1

They look more like 422743-fixed though so might not be very helpful..

For the problems collecting crash dumps - try with only a blank tab open. If I am logged in to gmail in any open tab I get a windows crash instead of breakpad submit-dialog window (have the appcompat.txt files from this as well if useful) - otherwise it submits just fine. (Possibly a breakpad bug?)

For a workaround to get foxmarks working again do a bookmark upload from one machine, then manually copy foxmarks-baseline-xxxx.json + places.sqlite from that machine to all other machines before doing a sync again.

Firefox 3.0b5, Foxmarks 2.0.44.17
If you have flash in a tab, the crash reporter gets suppressed.  Known bug in the current flash version, they have a fix that won't beat us out the door.
Actually, apparently its out now.  Anyone getting that problem please be sure to update flash.

At this point we don't have a clear sense or scope of users affected by this bug, and it doesn't seem like we have a solid understanding of the root cause, so we're not going to block on this, but we'll take a fix once we have one, even if that means 3.0.1.
Flags: wanted1.9.0.x+
Flags: wanted-firefox3+
Flags: blocking-firefox3-
Flags: blocking-firefox3+
(In reply to comment #32)
> At this point we don't have a clear sense or scope of users affected by this
> bug, and it doesn't seem like we have a solid understanding of the root cause,
> so we're not going to block on this, but we'll take a fix once we have one,
> even if that means 3.0.1.
> 

We estimate that at full deployment this would impact on the order of 100k users -- shipping without a fix would be very bad for us.
Flags: blocking-firefox3- → blocking-firefox3?
Severity: normal → critical
Keywords: crash
I think a better solution than blocking-firefox3+ is having Foxmarks to not mark themselves at compatible with 3.0.*  At this point, making this bug blocking-firefox3+ is committing to holding it until we fully understand what is causing the issue and then fixing that issue. Without any insight into what the time cost is there, we have to assume that we're talking about holding ship for an indefinite period of time. I don't think we're prepared to do that in order to support this one add-on.

(This is not a statement made lightly, but we need to balance the value of holding ship on a browser used by upwards of 100M people in order to support an issue that might affect 100k user of this single add-on.)

In the meantime, hopefully new crash data will help illuminate us and allow us to make a more informed decision. Please feel free to renominate.
Flags: blocking-firefox3? → blocking-firefox3-
Attached file Full stack from crash
I am able to reproduce the crash now with account from Todd. It is crashing on the same place - the mParent object called in in GetSortType() is most likely already released from memory when we access it. However, the mIndentLevel of the current folder is 3 and I was able to reproduce this bug even with cycle collector switched off. 

I will have more time for this bug tomorrow.
Blocks: 411821
(In reply to comment #35)
> Created an attachment (id=318246) [details]
> Full stack from crash
> 
> I am able to reproduce the crash now with account from Todd. It is crashing on
> the same place - the mParent object called in in GetSortType() is most likely
> already released from memory when we access it. However, the mIndentLevel of
> the current folder is 3 and I was able to reproduce this bug even with cycle
> collector switched off. 
> 
> I will have more time for this bug tomorrow.
> 

Did you get any further in the analysis?
Foxmarks opens a container, reads all children, put them into its own array and closes the container. I suspect, that cycle collector later deletes parent containers and the items in the array point to invalid memory.

To confirm this, I have changed nsINavHistoryResultNode and made parent attribute read/write, in foxmarks-places.js on places where children are put into the array, I have set the pointer to parent to be null.

Synchronization finished without a crash. Not sure, if this would be an acceptable workaround, so I'm analyzing deeper why this happens.
Status: NEW → ASSIGNED
The piece that puzzles me is how refcounting is supposed to work for mParent. I'm probably missing something, but if the crash is indeed caused by mParent being non-null and invalid as Ondrej suggests, this seems like the key piece.

There are only a few places where mParent is set to a non-null value, but none of them explicitly manage refcounts. In nsNavHistoryContainerResultNode::FillStats() for instance, we iterate over this's children, setting each child's mParent = this.

The child nodes can then be handed out to clients, but if the assignment above didn't result in this's refcount being incremented, that parent could indeed be deleted while the child still held a reference to it. And that could cause this kind of crash.

The guidelines for ownership in XPCOM suggest that children shouldn't own references to their parents, but as near as I can tell that guideline assumes that the lifespan of a child is entirely contained by its parent. That's not the case in Places, where it's possible to get your hands on a child (via GetChildAt, for instance) and then drop the parent. If I understand correctly, this means that the parent could be garbage-collected with the child still holding a reference to the garbage-collected parent.

Is that perhaps what's going on here? Apologies in advance if I'm being dense.
Not sure if this will help, but I only experience the crashes on initial sync under Ubuntu 8.04 Hardy 64-bit.

That's "download from server; delete local bookmarks".

If I try it on different XP Pro 32-bit machines, it's fine. FFF3b5
(In reply to comment #38)
> There are only a few places where mParent is set to a non-null value, but none
> of them explicitly manage refcounts. In
> nsNavHistoryContainerResultNode::FillStats() for instance, we iterate over
> this's children, setting each child's mParent = this.

This is not called when doing sync.

> The guidelines for ownership in XPCOM suggest that children shouldn't own
> references to their parents, but as near as I can tell that guideline assumes
> that the lifespan of a child is entirely contained by its parent. That's not
> the case in Places, where it's possible to get your hands on a child (via
> GetChildAt, for instance) and then drop the parent. If I understand correctly,
> this means that the parent could be garbage-collected with the child still
> holding a reference to the garbage-collected parent.

There is no crash if I add NS_ADDREF(this) in nsNavHistoryContainerResultNode::InsertChildAt. 
(In reply to comment #40)
> 
> There is no crash if I add NS_ADDREF(this) in
> nsNavHistoryContainerResultNode::InsertChildAt. 
> 

Great! That's really encouraging. Some questions:

1) Do you believe that the change is correct? The flip side of things getting deleted underneath you is things overstaying their welcome -- are you confident that this change won't introduce a memory leak?

2) Do the couple of other places where mParent gets set also need the same NS_ADDREF treatment?
See bug 387203 for discussion of a similar scenario and treatment.

If I'm understanding it correctly, we want to make mParent an nsRefPtr, and add a cycle collection rule for it.
(In reply to comment #42)
> See bug 387203 for discussion of a similar scenario and treatment.
> 
> If I'm understanding it correctly, we want to make mParent an nsRefPtr, and add
> a cycle collection rule for it.
> 

Sounds right to me, though this is really above my pay grade. I found the following relevant doc; is there anything else that one should look at to learn more about this?

http://developer.mozilla.org/en/docs/Interfacing_with_the_XPCOM_cycle_collector
Turning mParent into nsRefPtr revealed other bug causing crash:

  for (PRUint32 i = 0; i < list->Length(); i++) {
    nsNavHistoryFolderResultNode* folder = list->ElementAt(i);
    if (folder) {
      PRUint32 nodeIndex;
------ Here folder->mChildren->Count() still works:
      nsNavHistoryResultNode* node = folder->FindChildById(aItemId, &nodeIndex);
      // if ExcludeItems is true we don't update non visible items
      if (node &&
          (!folder->mOptions->ExcludeItems() ||
           !(node->IsURI() || node->IsSeparator())) &&
------ Here folder->mChildren->Count() already crashes:
          folder->StartIncrementalUpdate()) {
        node->OnItemChanged(aItemId, aProperty, aIsAnnotationProperty, aValue);
      }
    }
  }

I have solved this by maintaining reference count when adding and removing elements to the list.  So now it crashes on another place:

nsNavHistoryFolderResultNode::ClearChildren(PRBool unregister)
{
  for (PRInt32 i = 0; i < mChildren.Count(); i ++)
    mChildren[i]->OnRemoving();
  mChildren.Clear();

  if (unregister && mContentsValid) {
    if (mResult) {
------- Here is looks as if the current object (this) was released:
      mResult->RemoveBookmarkFolderObserver(this, mItemId);
      mIsRegisteredFolderObserver = PR_FALSE;
    }
  }
  mContentsValid = PR_FALSE;
}
Ondrej,

Can you please attach the change(s) you made that produced these results? It would be good in particular to see how you defined the cycle collection rule...

-Todd
(In reply to comment #45)
> Ondrej,
> 
> Can you please attach the change(s) you made that produced these results? It
> would be good in particular to see how you defined the cycle collection rule...
> 
> -Todd
> 

I did not get that far, I have simply changed mParent to be nsRefPtr and it started crashing elsewhere, I have fixed that crash (apparently pointers were stored without maintaining references) and it started crashing a little later. But yes, this may be because the cycle collector is missing the rules. I will work on it.
Just a note on scheduling - we are already code frozen for FF3 but please continue to work on a solution for this - we can always pick it up in the first 3.0.1 release or whatever is the nearest opportunity.
This seems to be fixing the crash. A brief look at memory consumption when syncing huge bookmarks collection suggests that it is not leaking, but I did not check it yet.
Attachment #320820 - Flags: review?(dietrich)
That looks really promising, Ondrej! Can someone (Dietrich?) produce a build with this patch that we can ask users to test against?

Where did you learn how this worked? NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS is not exactly what I would have expected. ;-)
(In reply to comment #49)
> Where did you learn how this worked?
> NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS is not exactly what I
> would have expected. ;-)

The macros are rather trivial, this just makes sure, that the correct interface is used because mParent inherits from nsNavHistoryResultNode and implements nsINavHistoryContainerResultNode, so it has nsISupports twice and cast would be ambiguous.
(In reply to comment #51)
> builds:
> 
> https://build.mozilla.org/tryserver-builds/2008-05-14_09:14-dietrich@mozilla.com-bug426261-cc/
> 

Success! I was part of the group with persistent crashes on initial sync, but I just downloaded the build Dietrich posted and was able to complete the initial sync successfully. Thanks everyone.
There was a bug in the patch - releasing the reference was done too early. I hope someone can suggest a better array, so that I do not have to maintain reference count manually.

I started playing with XPCOM_MEM_LEAK_LOG, but I'm not quite sure what should I expect. I will compare the result with and without the patch.
Attachment #320820 - Attachment is obsolete: true
Attachment #320994 - Flags: review?(dietrich)
Attachment #320820 - Flags: review?(dietrich)
that change was my 1 comment so far :)

another problem is that this patch is leaking. set that env var, start up, open the library, close it, and shutdown, and the leak log shows places nodes leaking (not leaking w/o the patch).
Comment on attachment 320994 [details] [diff] [review]
Use ref count for parent and related problems v.1

>Index: toolkit/components/places/src/nsNavHistoryResult.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryResult.cpp,v
>retrieving revision 1.145
>diff -u -8 -p -d -r1.145 nsNavHistoryResult.cpp
>--- toolkit/components/places/src/nsNavHistoryResult.cpp	6 May 2008 20:58:45 -0000	1.145
>+++ toolkit/components/places/src/nsNavHistoryResult.cpp	14 May 2008 22:28:56 -0000
>@@ -99,17 +99,26 @@ inline PRInt32 ComparePRTime(PRTime a, P
> inline PRInt32 CompareIntegers(PRUint32 a, PRUint32 b)
> {
>   return a - b;
> }
> 
> 
> // nsNavHistoryResultNode ******************************************************
> 
>-NS_IMPL_CYCLE_COLLECTION_0(nsNavHistoryResultNode)
>+
>+NS_IMPL_CYCLE_COLLECTION_CLASS(nsNavHistoryResultNode)
>+
>+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsNavHistoryResultNode)
>+  NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mParent)
>+NS_IMPL_CYCLE_COLLECTION_UNLINK_END 
>+
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN(nsNavHistoryResultNode)
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mParent, nsINavHistoryContainerResultNode);
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END
> 
> NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION(nsNavHistoryResultNode)
>   NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsISupports, nsINavHistoryResultNode)
>   NS_INTERFACE_MAP_ENTRY(nsINavHistoryResultNode)
> NS_INTERFACE_MAP_END
> 
> NS_IMPL_CYCLE_COLLECTING_ADDREF_AMBIGUOUS(nsNavHistoryResultNode, nsINavHistoryResultNode)
> NS_IMPL_CYCLE_COLLECTING_RELEASE_AMBIGUOUS(nsNavHistoryResultNode, nsINavHistoryResultNode)
>@@ -3701,16 +3710,20 @@ nsNavHistoryResult::nsNavHistoryResult(n
>   mRootNode->mResult = this;
> }
> 
> PR_STATIC_CALLBACK(PLDHashOperator)
> RemoveBookmarkFolderObserversCallback(nsTrimInt64HashKey::KeyType aKey,
>                                       nsNavHistoryResult::FolderObserverList*& aData,
>                                       void* userArg)
> {
>+  for (PRUint32 i = 0; i < aData->Length(); i ++) {
>+    NS_RELEASE(aData->ElementAt(i));
>+    aData->RemoveElementAt(i);
>+  }
>   delete aData;
>   return PL_DHASH_REMOVE;
> }
> 
> nsNavHistoryResult::~nsNavHistoryResult()
> {
>   // delete all bookmark folder observer arrays which are allocated on the heap
>   mBookmarkFolderObservers.Enumerate(&RemoveBookmarkFolderObserversCallback, nsnull);
>@@ -3874,16 +3887,17 @@ nsNavHistoryResult::AddBookmarkFolderObs
>     mIsBookmarkFolderObserver = PR_TRUE;
>   }
> 
>   FolderObserverList* list = BookmarkFolderObserversForId(aFolder, PR_TRUE);
>   if (list->IndexOf(aNode) != list->NoIndex) {
>     NS_NOTREACHED("Attempting to register an observer twice!");
>     return;
>   }
>+  NS_ADDREF(aNode);
>   list->AppendElement(aNode);
> }
> 
> 
> // nsNavHistoryResult::RemoveHistoryObserver
> 
> void
> nsNavHistoryResult::RemoveHistoryObserver(nsNavHistoryQueryResultNode* aNode)
>@@ -3904,17 +3918,18 @@ nsNavHistoryResult::RemoveAllBookmarksOb
> 
> void
> nsNavHistoryResult::RemoveBookmarkFolderObserver(nsNavHistoryFolderResultNode* aNode,
>                                                  PRInt64 aFolder)
> {
>   FolderObserverList* list = BookmarkFolderObserversForId(aFolder, PR_FALSE);
>   if (! list)
>     return; // we don't even have an entry for that folder
>-  list->RemoveElement(aNode);
>+  if (list->RemoveElement(aNode))
>+    NS_RELEASE(aNode);
> }
> 
> 
> // nsNavHistoryResult::BookmarkFolderObserversForId
> 
> nsNavHistoryResult::FolderObserverList*
> nsNavHistoryResult::BookmarkFolderObserversForId(PRInt64 aFolderId, PRBool aCreate)
> {
>@@ -4135,20 +4150,20 @@ nsNavHistoryResult::OnItemChanged(PRInt6
>   rv = bookmarkService->GetFolderIdForItem(aItemId, &folderId);
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   FolderObserverList* list = BookmarkFolderObserversForId(folderId, PR_FALSE);
>   if (!list)
>     return NS_OK;
> 
>   for (PRUint32 i = 0; i < list->Length(); i++) {
>-    nsNavHistoryFolderResultNode* folder = list->ElementAt(i);
>+    nsRefPtr<nsNavHistoryFolderResultNode> folder = list->ElementAt(i);
>     if (folder) {
>       PRUint32 nodeIndex;
>-      nsNavHistoryResultNode* node = folder->FindChildById(aItemId, &nodeIndex);
>+      nsRefPtr<nsNavHistoryResultNode> node = folder->FindChildById(aItemId, &nodeIndex);
>       // if ExcludeItems is true we don't update non visible items
>       if (node &&
>           (!folder->mOptions->ExcludeItems() ||
>            !(node->IsURI() || node->IsSeparator())) &&
>           folder->StartIncrementalUpdate()) {
>         node->OnItemChanged(aItemId, aProperty, aIsAnnotationProperty, aValue);
>       }
>     }
>Index: toolkit/components/places/src/nsNavHistoryResult.h
>===================================================================
>RCS file: /cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryResult.h,v
>retrieving revision 1.56
>diff -u -8 -p -d -r1.56 nsNavHistoryResult.h
>--- toolkit/components/places/src/nsNavHistoryResult.h	11 Apr 2008 16:22:02 -0000	1.56
>+++ toolkit/components/places/src/nsNavHistoryResult.h	14 May 2008 22:28:57 -0000
>@@ -378,17 +378,17 @@ public:
>     NS_ASSERTION(IsFolder(), "Not a folder");
>     return reinterpret_cast<nsNavHistoryFolderResultNode*>(this);
>   }
>   nsNavHistoryQueryResultNode* GetAsQuery() {
>     NS_ASSERTION(IsQuery(), "Not a query");
>     return reinterpret_cast<nsNavHistoryQueryResultNode*>(this);
>   }
> 
>-  nsNavHistoryContainerResultNode* mParent;
>+  nsRefPtr<nsNavHistoryContainerResultNode> mParent;
>   nsCString mURI; // not necessarily valid for containers, call GetUri
>   nsCString mTitle;
>   nsString mTags;
>   PRUint32 mAccessCount;
>   PRInt64 mTime;
>   nsCString mFaviconURI;
>   PRInt32 mBookmarkIndex;
>   PRInt64 mItemId;
Attachment #320994 - Flags: review?(dietrich) → review-
oops, didn't mean the comment :P

please get sicking to review the cycle collector rules, as he did same for bug 387203.
Comment on attachment 320994 [details] [diff] [review]
Use ref count for parent and related problems v.1

Jonas, the patch is leaking. Can you please take a look and give an advice?
Attachment #320994 - Flags: review?(jonas)
You'll have to find everything that holds those nodes, for example you should probably add mRootNode to nsNavHistoryResult's cycle collection macros.
Just wanted to note that despite numerous attempts I had been unable to sync bookmarks on two computers.... a Mac Pro and MacBook Air, both running Leopard.  However, the recent upgrade to Foxmarks 2.0.44.20 resolved the issue on both machines, and I am now a happy synchronized camper.
Cheers,
Peter O. 
(In reply to comment #58)
> You'll have to find everything that holds those nodes, for example you should
> probably add mRootNode to nsNavHistoryResult's cycle collection macros.
> 

This is already there. Strange is that I have the leak even if I have zero bookmarks and mParent is not filled at all. The only history result node that gets created is "Bookmarks Toolbar Folder" - it is created twice. At the end it is once collected by the cycle collector and once it stays having two references outstanding. So far I did not find who is holding the other reference.
I just wanted to echo what Peter Olesiuk said. I have a rather large set of bookmarks - 1.5MB and with Foxmarks 2.0.44.17 I wasn't able to get the bookmarks synched even once. I always got a crash on the initial sync. However, 2.0.44.20 fixes that and I am able to sync fine with 2.0.44.20 on FF3 RC1 and 2.0.44.20 on one FF2 and 2.0.44.17 on another FF2.
(In reply to comment #61)
> I just wanted to echo what Peter Olesiuk said. I have a rather large set of
> bookmarks - 1.5MB and with Foxmarks 2.0.44.17 I wasn't able to get the
> bookmarks synched even once. I always got a crash on the initial sync. However,
> 2.0.44.20 fixes that and I am able to sync fine with 2.0.44.20 on FF3 RC1 and
> 2.0.44.20 on one FF2 and 2.0.44.17 on another FF2.
> 

After seeing Garo's comment, I tried FF3 RC1 in conjunction with Foxmarks xxxxx.20. I tried to download from the server (rather than a normal sync) and experienced a crash, but when I restarted and let a sync proceed as normal, I was able to sync without issue. Good stuff.
Just want to add:
FireFox 3.0 RC1
Foxmarks 2.0.44.20

If one of your browser crashes after updating to Foxmarks 2.0.44.20 and/or FireFox 3.0 RC1, do this:

Option #1:
a) Export your bookmarks
b) Import your exported bookmarks to your other browser
c) Do a Sync

If it still won't work, do
Option #2:
a) On your crashing browser; Force foxmarks to UPLOAD to server (overwriting the server with the bookmarks on your crashing browser)
b) On your not-crashing browser; Issue a "Merge"
c) Check if it was successful;
d) If a success; go back to your crashing browser; Issue a "Merge"

Both options worked for me.  I have more than a thousand bookmarks if I'm not mistaken :p

Hope that helps.  It might be related still.
You need to traverse/unlink all strong references to nsNavHistoryResultNodes, and you made the bookmark folder observers hash hold strong references. This one doesn't leak for me.
Attachment #320994 - Attachment is obsolete: true
Attachment #322410 - Flags: review?
Attachment #320994 - Flags: review?(jonas)
Attachment #322410 - Flags: review? → review?(dietrich)
Given this affects an extension which uses places if we can get a safe fix I'd like it in 3.0.1 or if safe enough RC2.
Whiteboard: [RC2?]
It might make sense to make FolderObserverList a nsTArray<nsRefPtr<nsNavHistoryFolderResultNode> > instead, but that'll make ENUMERATE_BOOKMARK_FOLDER_OBSERVERS more expensive (copying the FolderObserverList will addref all observers). On the other hand, ENUMERATE_BOOKMARK_FOLDER_OBSERVERS makes a copy presumably because the callback can remove observers from the array, don't they need to addref the observers anyway?
I can confirm, that the account that was crashing constantly before does not crash anymore with the latest patch from Peter and that the latest patch does not leak.

I have reviewed the cc rules that Peter added and I find them very clear and correct. Thank you Peter, I was not that close to discover them myself, but now they look so obvious.

I can try tomorrow changing the type of the FolderObserverList. But the comment in the code says, a copy is necessary to prevent operating on items that may be added to the array too.
Dietrich: this is now blocking final ship of 3.0; can we get a review ASAP?
Flags: blocking-firefox3- → blocking-firefox3+
Whiteboard: [RC2?] → [RC2+]
Comment on attachment 322410 [details] [diff] [review]
Use ref count for parent and related problems v.1.1

r=mano
Attachment #322410 - Flags: review?(dietrich) → review+
Keywords: checkin-needed
This patch turns FolderObserverList into nsTArray<nsRefPtr<nsNavHistoryFolderResultNode> > so that we do not have to maintain ref count manually. It works, does not crash and does not leak, but I'm not quite sure, whether it is better, so I'm not pushing instead of the reviewed patch.
This is the same as v.1.1, but with nsTArray<nsRefPtr<nsNavHistoryFolderResultNode> >, which is slightly safer. I just checked this in.
Attachment #322410 - Attachment is obsolete: true
Attachment #322596 - Attachment is obsolete: true
Attachment #322647 - Flags: review+
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [RC2+] → [RC2+][needs to land in mozilla-central]
Target Milestone: --- → Firefox 3
Flags: wanted1.9.0.x+
Whiteboard: [RC2+][needs to land in mozilla-central] → [RC2+]
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
You need to log in before you can comment on or make changes to this bug.