Closed Bug 476292 Opened 11 years ago Closed 11 years ago

Crashes on startup on OS X and Linux [@ nsNavBookmarks::IsRealBookmark] [@ PL_DHashTableOperate]

Categories

(Toolkit :: Places, defect, P1, critical)

defect

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: smichaud, Assigned: sdwilsh)

References

Details

(4 keywords)

Crash Data

Attachments

(1 file, 2 obsolete files)

https://crash-stats.mozilla.com reports 124 of these over the last 96 hours (98 on Mac and 26 on Linux).  This is equivalent to 784 crashes over two weeks, and probably makes this a topcrasher.  (I can't get crash-stats to report over a longer period than 96 hours.)

http://crash-stats.mozilla.com/?do_query=1&product=Firefox&query_search=stack&query_type=exact&query=nsNavBookmarks%3A%3AIsRealBookmark(long%20long)&date=&range_value=96&range_unit=hours

I just saw a couple of these myself (with yesterday's Shiretoko nightly, on two different Macs, on startup).  Here are my Breakpad IDs:

bp-c9a68bb9-cb71-4af5-840a-ff1db2090131
bp-924b3999-f091-4c70-9e1b-8403a2090130
Keywords: topcrash
Severity: normal → critical
Flags: blocking-firefox3.1?
> This is equivalent to 784 crashes over two weeks

Oops, goofed up my math.  124 crashes in 4 days is equivalent to 434 crashes over two weeks.

Still probably a topcrasher, though.
Component: Bookmarks & History → Places
Flags: blocking-firefox3.1?
Product: Firefox → Toolkit
QA Contact: bookmarks → places
Flags: blocking1.9.1?
Looks like this is during module creation?  Maybe the bookmarks hashtable isn't created yet...
I'm able to reproduce this crash since I made some regression tests with older 3.0 builds. Everytime when I start a current nightly build with a profile, which was in use by an older 3.0 build, Minefield crashes.

Steps:
1. Load e.g. following build: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2008/04/2008-04-03-04-trunk/
2. Create a fresh profile and run it once with the above build
3. Start a current Minefield with the same profile

After step 3 we crash on startup. AFAIR that happened a while back. Marking as regression. I'll try to find the regression range.
Regressed between:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090127 Minefield/3.2a1pre

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090128 Minefield/3.2a1pre

Checkins: http://tinyurl.com/b97u38

I suspect bug 449406.

The top 5 stack frames:

0  	XUL  	PL_DHashTableOperate  	pldhash.c:615
1 	XUL 	nsNavBookmarks::IsRealBookmark 	nsTHashtable.h:170
2 	XUL 	nsNavHistory::CalculateFrecency 	toolkit/components/places/src/nsNavHistory.cpp:7207
3 	XUL 	nsNavHistory::RecalculateFrecenciesInternal 	toolkit/components/places/src/nsNavHistory.cpp:7259
4 	XUL 	nsNavHistory::RecalculateFrecencies 	toolkit/components/places/src/nsNavHistory.cpp:7221
5 	XUL 	nsNavHistory::Init 	toolkit/components/places/src/nsNavHistory.cpp:551
Version: Trunk → 1.9.1 Branch
I've seen crash [@ PL_DHashTableOperate] once with debug build of mozilla-central once on OpenSolaris. But I couldn't reproduce it.

The last 2 lines in terminal is:
###!!! ASSERTION: nsTHashtable was not initialized properly.: 'mTable.entrySize', file ../../../../dist/include/xpcom/nsTHashtable.h, line 163
dist/bin/run-mozilla.sh: line 143: 29786: Memory fault(coredump)

The stack is:
  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [8] PL_DHashTableOperate(table = 0xf3a9f9dc, key = 0x8041538, op = PL_DHASH_LOOKUP), line 613 in "pldhash.c"
  [9] nsTHashtable<nsBaseHashtableET<nsTrimInt64HashKey,long long> >::GetEntry(this = 0xf3a9f9dc, aKey = 139LL), line 165 in "nsTHashtable.h"
  [10] nsBaseHashtable<nsTrimInt64HashKey,long long,long long>::Get(this = 0xf3a9f9dc, aKey = 139LL, pData = 0x8041520), line 126 in "nsBaseHashtable.h"
  [11] nsNavBookmarks::IsRealBookmark(this = 0xf3a9f980, aPlaceId = 139LL), line 917 in "nsNavBookmarks.cpp"
  [12] nsNavHistory::CalculateFrecency(this = 0xf3223c00, aPlaceId = 139LL, aTyped = 0, aVisitCount = 1, aURL = CLASS, aFrecency = 0x8041648), line 7207 in "nsNavHistory.cpp"
  [13] nsNavHistory::RecalculateFrecenciesInternal(this = 0xf3223c00, aStatement = 0xf329a9a0, aCount = 50), line 7258 in "nsNavHistory.cpp"
  [14] nsNavHistory::RecalculateFrecencies(this = 0xf3223c00, aCount = 50, aRecalcOld = 0), line 7221 in "nsNavHistory.cpp"
  [15] nsNavHistory::Init(this = 0xf3223c00), line 550 in "nsNavHistory.cpp"
Finally I found a way to reproduce it: With certain firefox profile, switch between Firefox 3.0.5 and Firefox 3.2pre can 100% reproduce this issue.

The problem is mBookmarksHash was not initialized before IsRealBookmark().
Assignee: nobody → sdwilsh
Status: NEW → ASSIGNED
Priority: -- → P1
Whiteboard: [needs patch]
Target Milestone: --- → mozilla1.9.2a1
To be clear - I think this is a P1 blocker because we'll crash during a migration, which we'll hit when we upgrade b2 users to b3.  I know what the cause is, I just need to track down why it isn't happening.  I fully expect to have a patch up by the end of the day.
Blocking as per comment 8; please move to P2 if this is discovered to not happen during migration.
Flags: blocking1.9.1? → blocking1.9.1+
I need help getting proper steps to reproduce here.  I'm creating a new profile, and running it with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090129 Shiretoko/3.1b3pre.  I then load it into a mozilla-central build that I just built, and I cannot get a crash.

Any advice?

For what it is worth, I don't see how it's even possible to for the hashtable to not be initialized with the code.
Keywords: qawanted
Shawn, see my comment 3. You should better run a 3.0 build. It crashes all the time on my box when selecting the same profile with a 3.1b3pre or 3.2a1pre build afterward.
ack, I thought that was a 3.0 build...  I'm not sure when it got upgraded :(
Keywords: qawanted
Attached patch v0.1 (obsolete) — Splinter Review
This doesn't pass all the tests yet, and I need to write a unit test.  It's good for a first pass review though.

Right now it's failing at http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/tests/sync/test_database_sync_after_quit_application_with_removeAllPages.js#95
The entry has a frecency of 340.  Looking into it more.
Attachment #360192 - Flags: review?(dietrich)
The test fails because the event we dispatch runs *after* the test, but before quit-application is dispatched.  I think I need to spin the event loop a bit at the right time here so the test doesn't fail.  Awesome.
Cycling the event loop does in fact fix the test failure, so this patch is good.  Tomorrow I'll have a patch with a test that will crash without the fix, and won't crash with the fix.
Comment on attachment 360192 [details] [diff] [review]
v0.1

>diff --git a/toolkit/components/places/src/nsNavBookmarks.cpp b/toolkit/components/places/src/nsNavBookmarks.cpp
>--- a/toolkit/components/places/src/nsNavBookmarks.cpp
>+++ b/toolkit/components/places/src/nsNavBookmarks.cpp
>@@ -906,16 +906,19 @@ nsNavBookmarks::UpdateBookmarkHashOnRemo
>                            reinterpret_cast<void*>(&aPlaceId));
>   return NS_OK;
> }
> 
> 
> PRBool
> nsNavBookmarks::IsRealBookmark(PRInt64 aPlaceId)
> {
>+  NS_ABORT_IF_FALSE(mBookmarksHash.IsInitialized(),
>+                    "Bookmark hashtable has not been initialized!");
>+

how would this happen, given the other changes?

>diff --git a/toolkit/components/places/src/nsNavHistory.cpp b/toolkit/components/places/src/nsNavHistory.cpp
>--- a/toolkit/components/places/src/nsNavHistory.cpp
>+++ b/toolkit/components/places/src/nsNavHistory.cpp
>@@ -519,16 +519,23 @@ nsNavHistory::Init()
>     pbi->AddObserver(PREF_BROWSER_HISTORY_EXPIRE_DAYS_MIN, this, PR_FALSE);
>     pbi->AddObserver(PREF_BROWSER_HISTORY_EXPIRE_SITES, this, PR_FALSE);
>   }
> 
>   observerService->AddObserver(this, gQuitApplicationMessage, PR_FALSE);
>   observerService->AddObserver(this, gXpcomShutdown, PR_FALSE);
>   observerService->AddObserver(this, gAutoCompleteFeedback, PR_FALSE);
>   observerService->AddObserver(this, NS_PRIVATE_BROWSING_SWITCH_TOPIC, PR_FALSE);
>+  // In case we've either imported or done a migration from a pre-frecency
>+  // build, we will calculate the first cutoff period's frecencies soon.
>+  if (mDatabaseStatus == DATABASE_STATUS_CREATE ||
>+      mDatabaseStatus == DATABASE_STATUS_UPGRADED) {
>+    (void)observerService->AddObserver(this, PLACES_INIT_COMPLETE_EVENT_TOPIC,
>+                                       PR_FALSE);
>+  }

s/soon/once the rest of places infrastructure has initialized/

r=me so far
Attachment #360192 - Flags: review?(dietrich) → review+
(In reply to comment #16)
> how would this happen, given the other changes?
It shouldn't, but that's why it's an assertion.
Whiteboard: [needs patch]
Whiteboard: [ETA: Feb-03]
To be clear, the crash only happens when the bookmarks service get's called before the history service, which apparently only happens on the upgrade path.
Attached patch v1.0 (obsolete) — Splinter Review
With tests fixes and previous comments addressed.  I backed off the code changes, and the test crashes.
Attachment #360192 - Attachment is obsolete: true
Attachment #360389 - Flags: review?(mak77)
Attachment #360389 - Flags: review?(dietrich)
Whiteboard: [ETA: Feb-03] → [needs review dietrich][needs review MaK77]
Attachment #360389 - Flags: review?(dietrich) → review+
Comment on attachment 360389 [details] [diff] [review]
v1.0


>+++ b/toolkit/components/places/tests/unit/test_crash_476292.js
>@@ -0,0 +1,61 @@
>+/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
>+/* vim:set ts=2 sw=2 sts=2 et: */
>+/* ***** BEGIN LICENSE BLOCK *****
>+ * Version: MPL 1.1/GPL 2.0/LGPL 2.1
>+ *
>+ * The contents of this file are subject to the Mozilla Public License Version
>+ * 1.1 (the "License"); you may not use this file except in compliance with
>+ * the License. You may obtain a copy of the License at
>+ * http://www.mozilla.org/MPL/
>+ *
>+ * Software distributed under the License is distributed on an "AS IS" basis,
>+ * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
>+ * for the specific language governing rights and limitations under the
>+ * License.
>+ *
>+ * The Original Code is Places Test code.
>+ *
>+ * The Initial Developer of the Original Code is
>+ * Mozilla Corporation.
>+ * Portions created by the Initial Developer are Copyright (C) 2009
>+ * the Initial Developer. All Rights Reserved.
>+ *
>+ * Contributor(s):
>+ *  Shawn Wilsher <me@shawnwilsher.com> (Original Author)
>+ *
>+ * Alternatively, the contents of this file may be used under the terms of
>+ * either the GNU General Public License Version 2 or later (the "GPL"), or
>+ * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
>+ * in which case the provisions of the GPL or the LGPL are applicable instead
>+ * of those above. If you wish to allow use of your version of this file only
>+ * under the terms of either the GPL or the LGPL, and not to allow others to
>+ * use your version of this file under the terms of the MPL, indicate your
>+ * decision by deleting the provisions above and replace them with the notice
>+ * and other provisions required by the GPL or the LGPL. If you do not delete
>+ * the provisions above, a recipient may use your version of this file under
>+ * the terms of any one of the MPL, the GPL or the LGPL.
>+ *
>+ * ***** END LICENSE BLOCK ***** */
>+
>+/**
>+ * This tests a crash during startup found in bug 476292 that was caused by
>+ * getting the bookmarks service during nsNavHistory::Init when the bookmarks
>+ * service was created before the history service was.
>+ */
>+
>+function run_test()
>+{
>+  // First, we need to move our old database file into our test profile
>+  // directory.  This will trigger DATABASE_STATUS_UPGRADED (CREATE is not
>+  // sufficient since there will be no entires to update frecencies, which

typo/nit: "entries to update frecencies for"

r=me otherwise
Whiteboard: [needs review dietrich][needs review MaK77] → [needs review MaK77]
Comment on attachment 360389 [details] [diff] [review]
v1.0

>diff --git a/toolkit/components/places/tests/unit/test_crash_476292.js b/toolkit/components/places/tests/unit/test_crash_476292.js
>new file mode 100644

>+function run_test()
>+{
>+  // First, we need to move our old database file into our test profile
>+  // directory.  This will trigger DATABASE_STATUS_UPGRADED (CREATE is not
>+  // sufficient since there will be no entires to update frecencies, which
>+  // causes us to get the bookmarks service in the first place).
>+  let dbFile = do_get_file("toolkit/components/places/tests/unit/bug476292.sqlite");
>+  let profD = Cc["@mozilla.org/file/directory_service;1"].
>+             getService(Ci.nsIProperties).
>+             get(NS_APP_USER_PROFILE_50_DIR, Ci.nsIFile);

you could probably use profileDir that is defined in head_bookmarks

>+  dbFile.copyTo(profD, "places.sqlite");
>+
>+  // Now get the bookmarks service.  This will crash when the bug exists.
>+  let hs = Cc["@mozilla.org/browser/nav-bookmarks-service;1"].
>+           getService(Ci.nsINavBookmarksService);

nit: hs? hwv you probably don't need to assign it.

otherwise looks good
Attachment #360389 - Flags: review?(mak77) → review+
Whiteboard: [needs review MaK77]
(In reply to comment #21)
> you could probably use profileDir that is defined in head_bookmarks
it's actually not defined since it throws usually.
Version: 1.9.1 Branch → Trunk
Attached patch v1.1Splinter Review
Addresses comments and is a pushable patch (has commit message and user already set).
Attachment #360389 - Attachment is obsolete: true
I can't see myself being able to land this any time soon, so somebody else should.  The good news is that I've got a patch up with the commit message already, so it's dirt simple to do.
Keywords: checkin-needed
Whiteboard: [can land]
Flags: in-testsuite+
Whiteboard: [can land]
Duplicate of this bug: 476545
Verified on trunk and 1.9.1 with builds on OS X and Windows:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090205 Minefield/3.2a1pre

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090205 Shiretoko/3.1b3pre
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsNavBookmarks::IsRealBookmark] [@ PL_DHashTableOperate]
You need to log in before you can comment on or make changes to this bug.