Closed Bug 555014 Opened 14 years ago Closed 14 years ago

Intermittent failure in test_tags.js | (NS_ERROR_FAILURE) [nsINavBookmarksService.removeFolderChildren]

Categories

(Toolkit :: Places, defect)

x86
Windows Server 2003
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [transaction.commit() failure])

WINNT 5.2 mozilla-central debug test xpcshell on 2010/03/25 07:23:33
s: win32-slave39

TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-debug-unittest-xpcshell\build\xpcshell\tests\test_places\queries\test_tags.js | test failed (with xpcshell return code: 0), see following log:
TEST-UNEXPECTED-FAIL | (xpcshell/head.js) | [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsINavBookmarksService.removeFolderChildren]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: e:/builds/moz2_slave/mozilla-central-win32-debug-unittest-xpcshell/build/xpcshell/tests/test_places/queries/head_queries.js :: remove_all_bookmarks :: line 484"  data: no]
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1269643325.1269645664.28269.gz
WINNT 5.2 mozilla-central debug test xpcshell on 2010/03/26 15:42:05
s: win32-slave33
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1269818957.1269821900.12566.gz
WINNT 5.2 mozilla-central debug test xpcshell on 2010/03/28 16:29:17
s: win32-slave12
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1270222859.1270225236.14731.gz
WINNT 5.2 mozilla-central debug test xpcshell on 2010/04/02 08:40:59
s: win32-slave39
test_tags.js fails here, when it calls out to remove_all_bookmarks in head_common.js:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/tests/queries/test_tags.js#656

Because the transaction commit in nsNavBookmarks::RemoveFolderChildren fails:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/src/nsNavBookmarks.cpp#1636

test_tags.js calls remove_all_bookmarks as the final step in the final test and then once again after all tests have finished.  It looks like all tests finish and this first call succeeds, but the second fails, at line 849.  (From the full log linked from comment 4.)

So this might be fixed by removing that second call, and it's unnecessary anyway.  But the second call shouldn't fail.

NS_ERROR_FAILURE on commit indicates only one of two things: either Storage doesn't have a transaction in progress, or COMMIT TRANSACTION failed.

What I guess is happening is that the commit tries to get a reserved lock, but some other thread has already got one, or a shared lock, or its writes are otherwise pending.

What kind of background DB updating are we doing recently that might be running concurrently?
Bug 509970 comment 21 shows this same error.  There may be more clues there:  That test emits this warning right before the failure:

WARNING: attempting to update frecency for a bogus place: file /builds/slave/mozilla-central-linux-debug/build/toolkit/components/places/src/nsNavHistory.cpp, line 7667

The final assertion that succeeds in that test is:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/tests/unit/test_removeVisitsByTimeframe.js#147

Which is the final line in one of the sub-tests, but not the final one.  remove_all_bookmarks is called after every sub-test.
The warning from bug 509970 comment 21 is interesting.

1) nsNavBookmarks::RemoveFolderChildren gets the children of the folder, 2) deletes them, and then 3) updates the frecencies of their corresponding URLs.  The warning comes from the call at point 3, when the children's URLs apparently are no longer in the DB.  Either the URLs were invalid before point 1, or became invalid after point 1.  I doubt the former, since if that were happening it seems like a ton (well, even more) of our tests would be failing, and more often.  If the latter, could point 2 be triggering it?

Marco, could the expiration-on-idle stuff be running concurrently?  That would explain both the commit failure and the warning.  If this runs after point 2 but before point 3, that's bad news:

279   // Expire orphan URIs from the database.
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/src/nsPlacesExpiration.js#279
idle observer for expiration is added at first Notify call, that should happen about 3 minutes after startup, thus i think expiration should not run during majority of tests (practically it should run only during expiration tests).
I could be wrong due to a bug, that can be verified adding an expiration topic observer in head_ file and dumping when/if we get it.

I think could be that nsPlacesDBFlush has a lock on the db due to bookmarks removal and we fail committing. Another reason to remove temp tables in such a case.

If you can make any debugging on this, please do.
nsPlacesDBFlush was my first thought too, but given the UpdateFrecency warning the expiration stuff seems more likely.  Actually any idle update service could potentially be a problem during testing.

> about 3 minutes after startup, thus i think expiration should not run during
> majority of tests

That may well be true, but if it's false only a few times the result is random oranges, that's why they're random.  AIUI the same xpcshell environment is used for all tests, so if >= 3 minutes have elapsed before it starts the Places tests...  And given how slow some of the machines are and how many tests there are, it doesn't seem unreasonable that it's bound to happen sometimes.

Anyway, I'll try to debug.
(In reply to comment #9)
> AIUI the same xpcshell environment is used
> for all tests, so if >= 3 minutes have elapsed before it starts the Places
> tests...  And given how slow some of the machines are and how many tests there
> are, it doesn't seem unreasonable that it's bound to happen sometimes.

this is true, but each test is run separately, indeed you have to reinstance all services etc. Thus i think (and hope) timers will start when each test starts.
Depends on: 559678, 559682
Whiteboard: [orange] → [orange][transaction.commit() failure]
Resolving Places oranges that don't appear on central from at least 6 months.
Check dependencies for bugs that could have fixed the issue.
Please reopen if they re-appear in central.
Status: NEW → RESOLVED
Closed: 14 years ago
Depends on: 521225
Resolution: --- → FIXED
Whiteboard: [orange][transaction.commit() failure] → [transaction.commit() failure]
You need to log in before you can comment on or make changes to this bug.