Closed Bug 684783 Opened 9 years ago Closed 9 years ago

Intermittent test_errorhandler.js | test failed (with xpcshell return code: 0) from test_errorhandler.js | undefined == error.engine.reason.unknown_fail

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13
Tracking Status
firefox-esr10 --- wontfix

People

(Reporter: philor, Assigned: rnewman)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files)

This has been around at least since last week some time, but we'd far rather push into orange than actually file it.

https://tbpl.mozilla.org/php/getParsedLog.php?id=6282373
Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2011-09-05 11:09:58 PDT for push c8139d064682

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_errorhandler.js | test failed (with xpcshell return code: 0), see following log:
...
TEST-PASS | c:/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js | [test_sync_engine_generic_fail : 794] true == true

TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js | undefined == error.engine.reason.unknown_fail - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 445
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: _do_check_eq :: line 539
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_check_eq :: line 560
JS frame :: c:/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js :: test_sync_engine_generic_fail :: line 798
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: _run_next_test :: line 865
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 412

TEST-INFO | (xpcshell/head.js) | exiting test

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\head.js | 2147500036 - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 445
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: _run_next_test :: line 868
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 412
I believe there are two issues here. One is trivial: we mix up "catapult" and "steam". It turns out that this is a red herring.

The real issue is that there's a race between the reset-file-log handler (which resets all the Sync status flags!) and the checks that run after sync() is called.

On my machine we hit the checks *then* the file log handler. In the log in Comment 0, the reverse is true, and so the checks fail, because Status has been reset.

This patch includes fixes for the former (a simple replace) and the latter: it moves the checks inside the file log handler. Tada!

Tests continue to pass on my machine (not that that's saying much). I will kick off a Win opt + Snow Leopard try build tonight.
Assignee: nobody → rnewman
Status: NEW → ASSIGNED
Attachment #558410 - Flags: review?(philipp)
Try build:

https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=0dec028e88eb

Will check back in on this in the morning.
Attachment #558410 - Flags: review?(philipp) → review+
Landed in inbound:

http://hg.mozilla.org/integration/mozilla-inbound/rev/2dc945c5c66e
Whiteboard: [orange] → [orange][inbound]
Also pushed to s-c to help debug a new random orange there: http://hg.mozilla.org/services/services-central/rev/df706d14c53d
Whiteboard: [orange][inbound] → [orange][inbound][fixed in services]
http://hg.mozilla.org/mozilla-central/rev/2dc945c5c66e
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: [orange][inbound][fixed in services] → [orange]
Target Milestone: --- → mozilla9
Just saw this again on the s-c to m-c merge so I don't think this is fixed :(
Status: RESOLVED → REOPENED
OS: Windows XP → All
Hardware: x86 → All
Resolution: FIXED → ---
Target Milestone: mozilla9 → ---
Attached patch Follow-up, v1Splinter Review
I took another look at this.

The last log shows the reset-file-log handler being invoked before the sync has begun. The file log is only reset on login error, sync error, or sync completion.

My hypothesis, then, is that resetFileLog was called in an earlier test, at the conclusion of a sync, but that the NetUtil.asyncCopy() call finishes after our subsequent test has installed the reset-file-log observer.

The solution implemented in this patch is to only install our observer on weave:engine:sync:finish, which will take place when the requested sync is well underway, but before the sync concludes and the log processing logic is called. That should ensure that it's the current test scenario that's in play, and we're testing the right thing.

This passes on my machine. gps, take a look?
Attachment #602677 - Flags: review?(gps)
Comment on attachment 602677 [details] [diff] [review]
Follow-up, v1

Review of attachment 602677 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.

Please nuke the trailing whitespace on L1631.
Attachment #602677 - Flags: review?(gps) → review+
https://hg.mozilla.org/services/services-central/rev/857e7b6723d4
Status: REOPENED → ASSIGNED
Whiteboard: [orange] → [orange][fixed in services]
https://hg.mozilla.org/mozilla-central/rev/857e7b6723d4
Status: ASSIGNED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Whiteboard: [orange][fixed in services] → [orange]
Target Milestone: --- → mozilla13
Whiteboard: [orange]
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.