Closed
Bug 712715
Opened 13 years ago
Closed 13 years ago
Intermittent failure in test_addon_nonrestartless_xpi.js
Categories
(Firefox :: Sync, defect)
Firefox
Sync
Tracking
()
RESOLVED
FIXED
mozilla13
People
(Reporter: gps, Assigned: gps)
References
Details
(Whiteboard: [qa-])
Attachments
(1 file, 2 obsolete files)
8.50 KB,
patch
|
rnewman
:
review+
|
Details | Diff | Splinter Review |
test_addon_nonrestartless_xpi.js fails intermittently. This is likely due to bug 710448.
Assignee | ||
Comment 1•13 years ago
|
||
At least part of the problem is a race condition between the TPS test runner and code in the addons engine. This change works around the race by forcing an additional sync.
Comment 2•13 years ago
|
||
Am I right in thinking that the two syncs are essentially glorified sleep() calls?
Assignee | ||
Comment 4•13 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #2) > Am I right in thinking that the two syncs are essentially glorified sleep() > calls? Yup.
Comment 5•13 years ago
|
||
OK, then please choose one of these, in order of increasing preference: * Comment that you're performing a sync for the side-effect of taking time, not because you want to ensure that some synchronization has occurred. * Add a sleep operation to TPS, so at least the test is explicit and doesn't introduce another fallible network hit. * Add a timeout-controlled polling operation to TPS, so we can poll Addons.verifyNot for some period of time. The timeout can be much longer than the sleep that we'd be willing to use. * Directly perform whatever side-effect syncing has. If we need to spin the event loop 9 times to ensure that the addons engine is fully set up, we can do that and say why. See head_helpers.js, waitForZeroTimer, which is AWFUL but necessary, and used six times in our unit tests: function waitForZeroTimer(callback) { // First wait >100ms (nsITimers can take up to that much time to fire, so // we can account for the timer in delayedAutoconnect) and then two event // loop ticks (to account for the Utils.nextTick() in autoConnect). * Somehow tie the appropriate TPS operations to events to achieve a strict order, rather than trying to impose an order by guessing delays. I don't know how feasible this is. I would love to get to Level 4 in this sequence, but I'll settle for Level 1 and a followup to do better!
Assignee | ||
Comment 6•13 years ago
|
||
I believe this patch achieves the highest desirable outcome from rnewman's list. There are effectively 3 parts, but they are all related to the same systemic issue: race conditions between Sync and TPS actions. 1) *all* TPS phases now wait on weave:service:ready before starting. Before, it was possible you could start performing actions before Sync was initialized. I think we just got lucky in that it didn't (most of the time). 2) When logging in and when resetting data, TPS will wait for the start tracking event to fire before proceeding. Before, it did not wait on this. 3) A new action "EnsureTracking" is available and utilized by the addons tests. This action ensures Sync is logged in and that tracking is active. This works around the original issue in this bug. It is required by the addons engine only because the addons reconciler waits for start-tracking before becoming active. Before, we had a race condition between Sync and TPS and the reconciler wasn't fully initialized before TPS starting performing actions. Since it wasn't initialized, performed actions were getting lost. These were discovered on the 2nd sync (after start-tracking was called for the first time) and the engine regained a proper view on the world. So, I'm confident there was no bug inside the addons engine. I've confirmed all TPS tests pass with this applied. However, I did also have bug 710448 applied, as that addresses buggy behavior exposed by TPS tests. I'm not sure if TPS tests will run cleanly with just this patch.
Attachment #590976 -
Attachment is obsolete: true
Attachment #590976 -
Flags: review?(rnewman)
Attachment #591313 -
Flags: review?(rnewman)
Comment 7•13 years ago
|
||
Comment on attachment 591313 [details] [diff] [review] Make TPS wait for Sync events Review of attachment 591313 [details] [diff] [review]: ----------------------------------------------------------------- ::: services/sync/tps/extensions/tps/modules/tps.jsm @@ +547,5 @@ > + if (!Weave.Status.ready) { > + Logger.logInfo("Waiting on Sync service to become ready..."); > + let cb = Async.makeSyncCallback(); > + Svc.Obs.add("weave:service:ready", cb); > + Async.waitForSyncCallback(cb); I tend to prefer the brevity and OO of let cb = Async.makeSpinningCallback(); Svc.Obs.add("weave:service:ready", cb); cb.wait(); but no big deal. @@ +557,5 @@ > + // service:ready event, this is required to ensure all handlers have > + // executed. > + Utils.nextTick(function() { > + this._executeTestPhase(file, phase, settings); > + }, this); Equivalent to Utils.nextTick(this._executeTestPhase.bind(this, file, phase, settings)); no? @@ +572,5 @@ > + * This is called by RunTestPhase() after the environment is validated. > + */ > + _executeTestPhase: function _executeTestPhase(file, phase, settings) { > + try { > + // TODO Unregister observers on unload. Bug number? @@ +577,4 @@ > Services.obs.addObserver(this, "weave:service:sync:finish", true); > Services.obs.addObserver(this, "weave:service:sync:error", true); > + Services.obs.addObserver(this, "weave:engine:start-tracking", true); > + Services.obs.addObserver(this, "weave:engine:stop-tracking", true); Can we add these first? Makes sense from a chronological perspective. @@ +720,5 @@ > + let cb = Async.makeSyncCallback(); > + Svc.Obs.add("weave:engine:start-tracking", cb); > + Async.waitForSyncCallback(cb); > + Svc.Obs.remove("weave:engine:start-tracking", cb); > + Logger.logInfo("weave:engine:start-tracking observed!"); This pattern has come up twice. How about function waitForEvent(event) { let cb = Async.makeSpinningCallback(); Svc.Obs.add(event, cb); cb.wait(); Svc.Obs.remove(event, cb); Logger.logInfo(event + " observed!"); } @@ +725,5 @@ > + } > + > + let cb = Async.makeSyncCallback(); > + Utils.nextTick(cb); > + Async.waitForSyncCallback(cb); Ooh, janky. @@ +796,5 @@ > Logger.AssertEqual(Weave.Status.service, Weave.STATUS_OK, "Weave status not OK"); > Weave.Svc.Obs.notify("weave:service:setup-complete"); > this._loggedIn = true; > + > + this.waitForTracking(); start-tracking is notified in onStartup, which is in charge of the basics like creating IDs and registering engines. It seems really backward to be logging in and wiping the server before this has happened. Does anything break if you wait for tracking *before* the login call?
Attachment #591313 -
Flags: review?(rnewman) → feedback+
Assignee | ||
Comment 8•13 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #7) > @@ +725,5 @@ > > + } > > + > > + let cb = Async.makeSyncCallback(); > > + Utils.nextTick(cb); > > + Async.waitForSyncCallback(cb); > > Ooh, janky. You are telling me. What would you prefer? A timer? > > @@ +796,5 @@ > > Logger.AssertEqual(Weave.Status.service, Weave.STATUS_OK, "Weave status not OK"); > > Weave.Svc.Obs.notify("weave:service:setup-complete"); > > this._loggedIn = true; > > + > > + this.waitForTracking(); > > start-tracking is notified in onStartup, which is in charge of the basics > like creating IDs and registering engines. It seems really backward to be > logging in and wiping the server before this has happened. > > Does anything break if you wait for tracking *before* the login call? I /think/ I tried this and encountered hangs waiting for tracking. The login/setup code in tps.jsm is really janky. I'm pretty sure it is doing a lot of things wrong. However, the login aspects of service.js are one of the parts of Sync I still don't grok too well, so I can't say anything beyond it tingles my spider senses.
Comment 9•13 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #8) > > Ooh, janky. > > You are telling me. What would you prefer? A timer? Heh! Naw. Not a comment about the code; a comment about the architecture Other than explicitly lifting the event loop spinning code in Async ("Async.spinFor(1);"), there's nothing better :/ Ugly systemic design means ugly code. > > Does anything break if you wait for tracking *before* the login call? > > I /think/ I tried this and encountered hangs waiting for tracking. I would encourage you to try again, on the off chance that it works. If it works, it'll avoid a whole class of possible bugs. (I'm not even sure how it manages to work right now!) > The login/setup code in tps.jsm is really janky. I'm pretty sure it is doing > a lot of things wrong. However, the login aspects of service.js are one of > the parts of Sync I still don't grok too well, so I can't say anything > beyond it tingles my spider senses. Having reworked it a couple of times, I can assure you that it's awful and doesn't make any sense. I'm happy to let this through with the other nits addressed (if it breaks, we'll know about it!), but I'd like you to try the tracking-first approach, make sure your recollection is correct.
Assignee | ||
Comment 10•13 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #9) > I'm happy to let this through with the other nits addressed (if it breaks, > we'll know about it!), but I'd like you to try the tracking-first approach, > make sure your recollection is correct. I moved the "wait for tracking" call to before Service.login() and the test runner hangs. I think this is happening because the user prefs aren't defined until the first time Service.login() and thus onStartup() doesn't have a sane environment, so it never emits start-tracking. I could be wrong.
Assignee | ||
Comment 11•13 years ago
|
||
I think I hit all the feedback points.
Attachment #591313 -
Attachment is obsolete: true
Attachment #592942 -
Flags: review?(rnewman)
Comment 12•13 years ago
|
||
Comment on attachment 592942 [details] [diff] [review] Make TPS wait for Sync events, v2 r+, but please first explore setting the credentials prefs and inverting the operations, see if that works.
Attachment #592942 -
Flags: review?(rnewman) → review+
Assignee | ||
Comment 13•13 years ago
|
||
https://hg.mozilla.org/services/services-central/rev/b6968b815726 Will file follow-up for review feedback.
Whiteboard: [fixed in services]
Updated•13 years ago
|
Whiteboard: [fixed in services] → [fixed in services][qa-]
Assignee | ||
Comment 14•13 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/b6968b815726
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in services][qa-] → [qa-]
Target Milestone: --- → mozilla13
Assignee | ||
Comment 15•13 years ago
|
||
This was really fixed by bug 712542. The changes in this bug did hopefully address timing issues in the TPS test harness, however.
Depends on: 712542
Updated•6 years ago
|
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in
before you can comment on or make changes to this bug.
Description
•