Closed Bug 711263 Opened 13 years ago Closed 13 years ago

xpcshell test failure: test_addons_reconciler.js | 1 == 3


(Firefox :: Sync, defect)

Windows XP
Not set





(Reporter: gps, Assigned: gps)



(Keywords: intermittent-failure)


(2 files, 2 obsolete files)

We have a consistent test failure on Windows machines with add-on sync:

TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_addons_reconciler.js | 1 == 3 - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 453
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: _do_check_eq :: line 547
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_check_eq :: line 568
JS frame :: c:/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_addons_reconciler.js :: <TOP_LEVEL> :: line 139
JS frame :: resource://services-sync/addonsreconciler.js :: <TOP_LEVEL> :: line 241
JS frame :: resource://services-sync/util.js :: <TOP_LEVEL> :: line 814
JS frame :: resource://gre/modules/NetUtil.jsm :: <TOP_LEVEL> :: line 174

I can't reproduce this on my Windows build machine. I have an official build machine request pending in 711257. RelEng says I'll get the machine tomorrow.
This patch fixes a race condition when saving add-on reconciler state by waiting for the save operation to complete before returning.

Unfortunately, since this is happening in an event listener, we can't fire a callback and perform the async properly: the saves need to "block" the caller of the listener so we can only have 1 save operation in progress. I suppose I could add a semaphore and complicate logic so the event loop isn't spun. But, this is simple and there is plenty of event loop spinning happening in this file already.

A bunch of people aren't around, so whoever gets to the review first wins.
Assignee: nobody → gps
Attachment #582378 - Flags: review?(rnewman)
Attachment #582378 - Flags: review?(philipp)
Attachment #582378 - Flags: review?(mconnor)
I should mention that a try build seemed to fix the xpcshell test failures on Windows:
Comment on attachment 582378 [details] [diff] [review]
Fix race condition when saving add-on reconciler state

Attachment #582378 - Flags: review?(rnewman)
Attachment #582378 - Flags: review?(philipp)
Attachment #582378 - Flags: review?(mconnor)
Attachment #582378 - Flags: review+
Pushed to s-c:

philikon, tracy, and tchung all agreed that landing on a departed train was warranted since this (hopefully) fixes an orange.
Whiteboard: [fixed in services]
Pushed to m-c:
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla11
Backed out because of test failures:
TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser-window/browser_bug596336.js | Version should be correct - Got 1.0, expected 2.0
Stack trace:
    JS frame :: chrome://mochikit/content/browser-test.js :: test_is :: line 402
    JS frame :: chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser-window/browser_bug596336.js :: check_addon :: line 59
    JS frame :: chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser-window/browser_bug596336.js :: <TOP_LEVEL> :: line 76
    JS frame :: resource://gre/modules/AddonManager.jsm :: safeCall :: line 56
    JS frame :: resource://gre/modules/AddonManager.jsm :: <TOP_LEVEL> :: line 1030
    JS frame :: resource:///modules/XPIProvider.jsm :: <TOP_LEVEL> :: line 3172
    JS frame :: resource:///modules/XPIProvider.jsm :: <TOP_LEVEL> :: line 5137
    JS frame :: resource:///modules/XPIProvider.jsm :: <TOP_LEVEL> :: line 4087
Resolution: FIXED → ---
Whiteboard: [fixed in services]
Blocks: 438871
Whiteboard: [orange]
Comment on attachment 582993 [details] [diff] [review]
Don't register AddonsReconciler with add-on manager unless necessary

This delays registration of the add-ons reconciler with the add-on manager until a) Sync is ready and the add-ons engine is enabled b) startListening() has been called explicitly.

I modified the unit tests to call startListening() explicitly instead of going through observers because it is more elegant. I did try to make it go through observers, but there were all kinds of async weirdness happening there. The underlying problem was I was falling off the end of xpcshell's run_test() without calling run_next_test() because I was waiting on yet another observer (from the weave:service:ready handler in the reconciler). I since ripped out this additional "I am registered" observer notification from the reconciler and replaced with a standard JS callback.

I've executed the extensions mochitests with this patch applied and they passed. I have a try build in progress, just in case. I'd probably want to land this on m-i instead of s-c because it needs to get into m-c quickly to kill an orange there.
Attachment #582993 - Attachment is patch: true
Incorporated feedback from philikon on IRC. Had to refactor the tracker test a little bit to ensure the reconciler is listening before changes are made.
Attachment #582993 - Attachment is obsolete: true
Attachment #582993 - Flags: review?(rnewman)
Attachment #583010 - Flags: review?(philipp)
Removed callback per philikon's IRC input.
Attachment #583010 - Attachment is obsolete: true
Attachment #583010 - Flags: review?(philipp)
Attachment #583021 - Flags: review?(philipp)
Attachment #583021 - Flags: review?(philipp) → review+
Landed attachment #583021 [details] [diff] [review] in s-c:

Will ping others to see what to do about landing elsewhere.
Pushed to m-c (with edmorley's permission):

Hopefully this causes tree to go green. If not, it will be a long night.
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Try run for e0e08e025c97 is complete.
Detailed breakdown of the results available here:
Results (out of 264 total builds):
    exception: 38
    success: 128
    warnings: 32
    failure: 66
Builds (or logs if builds failed) available at:
Whiteboard: [orange]
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.