Closed Bug 804479 Opened 12 years ago Closed 4 years ago

Intermittent test_password_tracker.js, test_history_tracker.js, test_bookmark_store.js, test_forms_tracker.js | test failed (with xpcshell return code: -2147483645), "ASSERTION: thread pool wasn't shutdown: '!mPool'"

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 78
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox-esr68 --- wontfix
firefox76 --- wontfix
firefox77 --- wontfix
firefox78 --- fixed

People

(Reporter: philor, Assigned: lina)

References

Details

(Keywords: assertion, intermittent-failure, Whiteboard: [test disabled on debug][leave open])

Attachments

(2 files, 1 obsolete file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=16366053&tree=Firefox
Rev3 WINNT 5.1 mozilla-central debug test xpcshell on 2012-10-22 19:51:58 PDT for push 48502b61a63e
slave: talos-r3-xp-012

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_password_tracker.js | test failed (with xpcshell return code: -2147483645), see following log:
...
Sync.AddonsReconciler	DEBUG	Stopping listening and removing AddonManager listeners.

TEST-INFO | (xpcshell/head.js) | test 1 finished

TEST-INFO | (xpcshell/head.js) | exiting test
Sync.SyncScheduler	DEBUG	Clearing sync triggers and the global score.

TEST-PASS | (xpcshell/head.js) | 14 (+ 0) check(s) passed

TEST-INFO | (xpcshell/head.js) | 0 check(s) todo
WARNING: nsExceptionService ignoring thread destruction after shutdown: file e:/builds/moz2_slave/m-cen-w32-dbg/build/xpcom/base/nsExceptionService.cpp, line 166
Sync.Tracker.Passwords	TRACE	Saving json to disk: weave/changes/passwords.json
###!!! ASSERTION: thread pool wasn't shutdown: '!mPool', file e:/builds/moz2_slave/m-cen-w32-dbg/build/netwerk/base/src/nsStreamTransportService.cpp, line 435
nsStringStats
 => mAllocCount:           5107
 => mReallocCount:          566
 => mFreeCount:            3850  --  LEAKED 1257 !!!
 => mShareCount:           9857
 => mAdoptCount:            567
 => mAdoptFreeCount:        565  --  LEAKED 2 !!!
(totally bogus stack)
###!!! ASSERTION: thread pool wasn't shutdown: '!mPool', file e:/builds/moz2_slave/m-cen-w32-dbg/build/netwerk/base/src/nsStreamTransportService.cpp, line 435
<<<<<<<
Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32-debug/1350956531/firefox-19.0a1.en-US.win32.crashreporter-symbols.zip
PROCESS-CRASH | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_password_tracker.js | application crashed (minidump found)
Crash dump filename: c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\7b5dfed6-143b-4dc8-a785-fea3c6b03e76.dmp
Operating system: Windows NT
                  5.1.2600 Service Pack 2
CPU: x86
     GenuineIntel family 6 model 23 stepping 10
     2 CPUs

Crash reason:  EXCEPTION_BREAKPOINT
Crash address: 0x3f1279

Thread 0 (crashed)
 0  mozalloc.dll!mozalloc_abort(char const * const) [mozalloc_abort.cpp:48502b61a63e : 23 + 0x0]
    eip = 0x003f1279   esp = 0x0012b9b8   ebp = 0x0012b9c0   ebx = 0x00000040
    esi = 0x10261440   edi = 0x10260670   eax = 0x00000000   ecx = 0xad09a820
    edx = 0x10361f48   efl = 0x00000206
    Found by: given as instruction pointer in context
 1  xul.dll!NS_DebugBreak_P [nsDebugImpl.cpp:48502b61a63e : 410 + 0xc]
    eip = 0x018550a9   esp = 0x0012b9c8   ebp = 0x0012bde4
    Found by: call frame info
 2  xul.dll!nsStreamTransportService::~nsStreamTransportService() [nsStreamTransportService.cpp:48502b61a63e : 435 + 0x36]
    eip = 0x005a4b73   esp = 0x0012bdec   ebp = 0x0012be14
    Found by: call frame info
 3  xul.dll!nsStreamTransportService::Release() [nsStreamTransportService.cpp:48502b61a63e : 460 + 0x58]
    eip = 0x005a51cc   esp = 0x0012be08   ebp = 0x0012be14
    Found by: call frame info
 4  xul.dll!nsCOMPtr<nsIEventTarget>::~nsCOMPtr<nsIEventTarget>() [nsCOMPtr.h:48502b61a63e : 492 + 0x5]
    eip = 0x0058f0f2   esp = 0x0012be1c   ebp = 0x0012be34
    Found by: call frame info
 5  xul.dll!nsAsyncStreamCopier::~nsAsyncStreamCopier() [nsAsyncStreamCopier.cpp:48502b61a63e : 42 + 0x17]
    eip = 0x00592afb   esp = 0x0012be28   ebp = 0x0012be34
    Found by: call frame info
 6  xul.dll!nsAsyncStreamCopier::`scalar deleting destructor'(unsigned int) + 0xa
    eip = 0x00592f48   esp = 0x0012be30   ebp = 0x0012be34
    Found by: call frame info
 7  xul.dll!nsAsyncStreamCopier::Release() [nsAsyncStreamCopier.cpp:48502b61a63e : 95 + 0x55]
    eip = 0x00592715   esp = 0x0012be3c   ebp = 0x0012be4c
    Found by: call frame info
 8  xul.dll!DoDeferredRelease<nsISupports *> [XPCJSRuntime.cpp:48502b61a63e : 543 + 0x5]
    eip = 0x01166281   esp = 0x0012be54   ebp = 0x0012be78
    Found by: call frame info
 9  xul.dll!XPCJSRuntime::GCCallback(JSRuntime *,JSGCStatus) [XPCJSRuntime.cpp:48502b61a63e : 749 + 0xa]
    eip = 0x0116a900   esp = 0x0012be64   ebp = 0x0012be78
    Found by: call frame info
10  mozjs.dll!Collect [jsgc.cpp:48502b61a63e : 4668 + 0x4]
    eip = 0x02cac2cb   esp = 0x0012be80   ebp = 0x0012beb4
    Found by: call frame info
11  mozjs.dll!js::GC(JSRuntime *,js::JSGCInvocationKind,js::gcreason::Reason) [jsgc.cpp:48502b61a63e : 4688 + 0x16]
    eip = 0x02cac42f   esp = 0x0012bebc   ebp = 0x0012bed4
    Found by: call frame info
12  mozjs.dll!js::GCForReason(JSRuntime *,js::gcreason::Reason) [jsfriendapi.cpp:48502b61a63e : 160 + 0xe]
Whiteboard: [orange]
Summary: Intermittent test_password_tracker.js | test failed (with xpcshell return code: -2147483645), "ASSERTION: thread pool wasn't shutdown: '!mPool'" → Intermittent test_password_tracker.js, test_history_tracker.js, test_bookmark_store.js, test_forms_tracker.js | test failed (with xpcshell return code: -2147483645), "ASSERTION: thread pool wasn't shutdown: '!mPool'"
Blocks: 722956
No longer blocks: 438871
https://tbpl.mozilla.org/php/getParsedLog.php?id=17385637&tree=Mozilla-Inbound

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_forms_tracker.js | test failed (with xpcshell return code: -2147483645), see following log:
PROCESS-CRASH | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_forms_tracker.js | application crashed (minidump found)
Thread 0 (crashed)
Richard, please can you take a look at this, it's quite high up the top-oranges list. Thank you :-)
Flags: needinfo?(rnewman)
Hmm. Last test:

add_test(function cleanup() {
   _("Clean up.");
  PlacesUtils.history.removeAllPages();
  if (tracker._lazySave) {
    tracker._lazySave.clear();
  }
  run_next_test();
});


  saveChangedIDs: function T_saveChangedIDs() {
    Utils.namedTimer(function() {
      Utils.jsonSave("changes/" + this.file, this, this.changedIDs);
    }, 1000, this, "_lazySave");
  },


Is this test really, really, kicking off a delayed write as the last thing it does before it quits?

hg blame says that this was added by <benjamin@python.org>.

Benjamin, did you do this for some particular reason?
Flags: needinfo?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #58)

> Is this test really, really, kicking off a delayed write as the last thing
> it does before it quits?

Ah, no, I'm misreading.

Presumably this doesn't cancel an existing lazy save. Let me look deeper.
OS: Windows XP → All
Hardware: x86 → All
Attached patch Tracker part. v1 (obsolete) — Splinter Review
This might be a r?, but starting gently.

The tracker responds to changes by scheduling a background timer to asynchronously write changedIDs to disk.

Our tests attempt to cancel this timer when they finish.

There's a problem with that: if the write starts by the time the tests finish, canceling the timer doesn't do anything. We finish up when there's still an async write occurring.

The easiest solution is to control precisely how we interact with writes. In most cases that's "don't flush". In one case I've added logic to flush, but wait for the write to finish before running the next test.

I would like to add a rudimentary check that the flush has resulted in an actual write to disk, even though we don't currently do so.

test_bookmark_store is presumably a similar issue.
Assignee: nobody → rnewman
Status: NEW → ASSIGNED
Attachment #688888 - Flags: feedback?(gps)
Attachment #688888 - Attachment is obsolete: true
Attachment #688888 - Flags: feedback?(gps)
Attachment #689002 - Flags: review?(gps)
Comment on attachment 689002 [details] [diff] [review]
Proposed patch. v2

Review of attachment 689002 [details] [diff] [review]:
-----------------------------------------------------------------

Sure.

The things worth mentioning:

1) I really wish we had a way to register test cleanup functions. This seems to be a giant workaround because we don't have those.
2) I wouldn't be surprised if there were other tests touching the tracker that will also need to be updated.

::: services/sync/modules/engines.js
@@ -76,5 @@
>    resetScore: function T_resetScore() {
>      this._score = 0;
>    },
>  
> -  saveChangedIDs: function T_saveChangedIDs() {

T_saveChangedIDs WTF? I shouldn't ask questions.

@@ +91,2 @@
>      Utils.namedTimer(function() {
> +      Utils.jsonSave("changes/" + this.file, this, this.changedIDs, cb);

While you are here, do you think we should self-clear _lazySave? I'd argue namedTimer() should do this automatically. But, that change would require some auditing.

::: services/sync/tests/unit/test_history_tracker.js
@@ +63,5 @@
> +  tracker.onSavedChangedIDs = function () {
> +    _("changedIDs written to disk. Proceeding.");
> +    // Turn this back off.
> +    tracker.persistChangedIDs = false;
> +    tracker.onSavedChangedIDs = undefined;

delete tracker.onSavedChangedIDs?

::: services/sync/tests/unit/test_password_tracker.js
@@ +19,5 @@
> +  initTestLogging("Trace");
> +
> +  Log4Moz.repository.getLogger("Sync.Engine.Passwords").level = Log4Moz.Level.Trace;
> +  Log4Moz.repository.getLogger("Sync.Store.Passwords").level = Log4Moz.Level.Trace;
> +  Log4Moz.repository.getLogger("Sync.Tracker.Passwords").level = Log4Moz.Level.Trace;

This is so wrong. First, initTestLogging() adjusts the entire Services. tree. Second, the loggers are "Services." prefixed the last I checked. Nuke these 3 lines.

::: services/sync/tests/unit/test_tracker_addChanged.js
@@ +49,5 @@
> +    // Verify the write by reading the file back.
> +    Utils.jsonLoad("changes/tracker", this, function (json) {
> +      do_check_eq(5, json[id]);
> +      tracker.persistChangedIDs = false;
> +      tracker.onSavedChangedIDs = undefined;

delete tracker.onSavedChangedIDs?
Attachment #689002 - Flags: review?(gps) → review+
(In reply to Gregory Szorc [:gps] from comment #63)

> 1) I really wish we had a way to register test cleanup functions. This seems
> to be a giant workaround because we don't have those.

That'd be a big improvement, but a big change: it would have to know when timers and async background writes complete.


> 2) I wouldn't be surprised if there were other tests touching the tracker
> that will also need to be updated.

Happy to let them come out of the woodwork on their own.


> > -  saveChangedIDs: function T_saveChangedIDs() {
> 
> T_saveChangedIDs WTF? I shouldn't ask questions.

Inorite.
 

> While you are here, do you think we should self-clear _lazySave? I'd argue
> namedTimer() should do this automatically. But, that change would require
> some auditing.

I'd rather not touch this code any more than necessary. It needs to die anyway.


> This is so wrong. First, initTestLogging() adjusts the entire Services.
> tree. Second, the loggers are "Services." prefixed the last I checked. Nuke
> these 3 lines.

Old code.

*shrug*
https://hg.mozilla.org/mozilla-central/rev/569db4506cb1
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Humph.

Let's see if this changes signature in any way….

This might be failing just flushing a different file, so there are more moles we can whack before we give up.
Target Milestone: mozilla20 → ---
Thank you for looking at this btw :-)
Whiteboard: [orange]
TBPL uses keyword=intermittent-failure rather than [orange] now (I've been meaning to post to dev.* about this, will do so within the next day or two). Removing [orange].
Whiteboard: [orange]
rnewman, any more ideas on this? :-)
Failure rate too high; disabled on debug runs whilst rnewman investigates:
https://hg.mozilla.org/integration/mozilla-inbound/rev/17c89659d75a
Whiteboard: [test disabled on debug][leave open]
(In reply to TinderboxPushlog Robot from comment #185)
> mak77%bonardo.net
> https://tbpl.mozilla.org/php/getParsedLog.php?id=18699420&tree=Mozilla-
> Inbound
> Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound opt test xpcshell on
> 2013-01-10 21:46:07
> slave: talos-mtnlion-r5-020
> 
> TEST-UNEXPECTED-FAIL |
> /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/
> test_history_tracker.js | test failed (with xpcshell return code: 0), see
> following log:
> TEST-UNEXPECTED-FAIL |
> /builds/slave/talos-slave/test/build/xpcshell/tests/services/common/tests/
> unit/head_helpers.js | 2 == 1 - See following stack:
> TEST-UNEXPECTED-FAIL |
> /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/
> test_history_tracker.js | 2147500036 - See following stack:

This is a mis-star, but maybe useful…
I'm afraid I don't have time to work on this, and it's disabled anyway. Unassigning.
Assignee: rnewman → nobody
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox

A landed patch and 7 years of inactivity, seems like we can probably close this..?

Status: REOPENED → RESOLVED
Closed: 12 years ago4 years ago
Resolution: --- → FIXED
Assignee: nobody → bugzilla

A landed patch which disabled several tests on debug, so that we get no information about whether or not the code hits other more dangerous assertion failures in the paths tested by those tests.

If you want to call that success, then you need to land a patch which removes the mention of this bug in the comments by the disabled tests, and change the comments to something like "we just don't care whether or not we assert in these tests, so they're disabled on debug." Then you could reasonably close the bug as WONTFIX, that being the closest resolution to DONTCARE.

Assignee: bugzilla → nobody
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

I went the other way and re-enabled them—Try looks good so far, and we can always disable them again.

Assignee: nobody → lina
Pushed by kcambridge@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/be02f68833f2
Re-enable all Sync tests on debug builds. r=markh

No oranges in the last couple of days, so calling this bug squashed. 🐞🚫

Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: