Closed Bug 981848 Opened 6 years ago Closed 6 years ago

Don't force login() of the user each time the browser gets started, which can cause hangs

Categories

(Testing Graveyard :: TPS, defect, major)

defect
Not set
major

Tracking

(firefox29 wontfix, firefox30 fixed, firefox31 fixed)

RESOLVED FIXED
mozilla31
Tracking Status
firefox29 --- wontfix
firefox30 --- fixed
firefox31 --- fixed

People

(Reporter: edwong, Assigned: whimboo)

References

Details

Attachments

(1 file, 4 obsolete files)

Not sure if Weave.Svc.Prefs.set should fire the weave:service:setup-complete event. This could be an issue with the update to TPS

1. install tps
2. run this:
runtps --testfile=/Users/Edwin/dev/services-central/services/sync/tests/tps/test_bug562515.js  --binary=/Applications/FirefoxNightly.app/Contents/MacOS/firefox 

test calls:
[Sync, SYNC_WIPE_CLIENT],

which calls:
Weave.Svc.Prefs.set("firstSync", "wipeClient");

CROSSWEAVE TEST PASS: executing action DELETE on bookmarks
CROSSWEAVE INFO: starting action: [null,{"menu":[{"uri":"http://www.google.com","loadInSidebar":true,"tags":["computers","google","internet","www"],"location":"menu"}],"menu/foldera":[{"uri":"http://www.yahoo.com","title":"testing Yahoo","location":"menu/foldera"}]}]
CROSSWEAVE INFO: executing action VERIFY-NOT on bookmark {"uri":"http://www.google.com","loadInSidebar":true,"tags":["computers","google","internet","www"],"location":"menu"}
CROSSWEAVE INFO: executing action VERIFY-NOT on bookmark {"uri":"http://www.yahoo.com","title":"testing Yahoo","location":"menu/foldera"}
CROSSWEAVE TEST PASS: executing action VERIFY-NOT on bookmarks
CROSSWEAVE INFO: starting action: [null,"wipe-client"]
CROSSWEAVE INFO: executing Sync wipe-client
CROSSWEAVE INFO: Setting client credentials.
CROSSWEAVE INFO: ----------event observed: fxaccounts:onlogin
CROSSWEAVE INFO: ----------event observed: weave:service:setup-complete
CROSSWEAVE INFO: ----------event observed: weave:engine:start-tracking
<TEST HANGS>

usually we get this:
CROSSWEAVE INFO: weave:service:setup-complete observed!
Blocks: 981706
safe to assume wipe is also hanging on these:
TEST-UNEXPECTED-FAIL | test_client_wipe.js | test did not complete
TEST-UNEXPECTED-FAIL | test_bug563989.js | test did not complete
* calls: Weave.Svc.Prefs.set("firstSync", "wipeRemote");
I do not see this problem with my upcoming refactoring patch on bug 966434 being applied. Lets re-evaluate once it has been landed.
Looks like this is only a problem when you run all the tests in the suite. Running the test alone doesn't cause this hang. So please forget my comment 2. It will not fix it.

We should really get this fixed ASAP. For now I will disable the test on bug 966434 so it doesn't hang our testrun.
Severity: normal → major
Actually multiple tests are involved here. I see the same with test_bug563989.js. So disabling might not be an option.
This might be a race condition here in our tps code. First we check the internal variable if such an event has already happened. If not we are registering a listener for the observer notification. If the notification actually happens in between, we will miss it.

I will work on that next because that's blocking our CI runs.
Assignee: nobody → hskupin
OS: Mac OS X → All
Hardware: x86 → All
Summary: weave:service:setup-complete fails to fire after wipe-client sync → TPS sometimes fails to notice the 'weave:service:setup-complete' notification
Looks like that the problem here lays somewhere else:
http://mxr.mozilla.org/mozilla-central/source/services/sync/tps/extensions/tps/resource/fxaccounts.jsm#51

Not sure why but in such a case the cb.wait() does not return even we call cb(null) in the final promise callback. Not sure what is blocking here, but I will investigate that now.
Status: NEW → ASSIGNED
Summary: TPS sometimes fails to notice the 'weave:service:setup-complete' notification → TPS sometimes wait forever in waitForSyncCallback() during fxaccounts login
So this not only happens during the login phase but also e.g. when dumping the bookmarks before quitting the browser in bookmarks.jsm line 44. In a debug build I don't see it yet. I will continue here with a self-build opt build, which hopefully let me get going.
Summary: TPS sometimes wait forever in waitForSyncCallback() during fxaccounts login → TPS sometimes wait forever in cb.wait() which is waitForSyncCallback()
It looks like that this hang happens when we don't run an initial sync a second time we use a profile. So most likely this is caused by an already signed-in user. I'm not sure if it is ok to call client.signIn() and fxAccounts.setSignedInUser() each time we start Firefox. Don't we run into conflicts with Firefox's own signIn process? Shouldn't we only do the above if no user has been signed in yet, or if it's username is different? Given that we might not need the latter approach (we always have a single account only) it should be trivial to fix. I quick spot-check made it work for me locally. I only have to wait for the correct events.
Component: Server: Sync → TPS
Product: Mozilla Services → Testing
Attached patch WIP v1 (obsolete) — Splinter Review
This is a WIP for the proposed solution. It currently fixes a lot of our tests! So when you work on broken tests make sure to work on top of it.

It's far away from being ready given that we might have to do a major refactoring of tps to better differentiate between fxaccounts and the old sync authentication. I will work on that today.
Summary: TPS sometimes wait forever in cb.wait() which is waitForSyncCallback() → Don't force login() of the user each time the browser gets started, which can cause hangs
Attached patch WIP v1 (obsolete) — Splinter Review
Ups, that was the wrong patch. Here the correct one.
Attachment #8396966 - Attachment is obsolete: true
Good news! With the patch on this bug we will be able to get nearly all tps tests passing! Only bug 983642 and bug 983628 will remain at least on my Linux box.
This patch fixes the problems for me and creates a new abstraction layer for auth types.

Andrei, would you mind testing it for the fxaccounts and old sync auth type? I would love to get your feedback that everything is working. Otherwise I would also like to hear from Chris if it is correctly done regarding the fxaccounts usage. Thanks.
Attachment #8396969 - Attachment is obsolete: true
Attachment #8397111 - Flags: review?(jgriffin)
Attachment #8397111 - Flags: feedback?(ckarlof)
Attachment #8397111 - Flags: feedback?(andrei.eftimie)
Comment on attachment 8397111 [details] [diff] [review]
Don't force login after each start v1

Review of attachment 8397111 [details] [diff] [review]:
-----------------------------------------------------------------

In general this works fine.
I do see some failures related to addon tests which I didn't see without this patch.
I am a bit concerned with the restartless and the non-restartless addon tests which hang. This didn't happen without this patch. We might need to adjust something in those tests. We'll need to investigate those a bit more.

Here's what a full report looks like for me (OSX, fx_account). This also has the patch from bug 983642 applied.
> TEST-PASS | test_sync.js | 
> TEST-PASS | test_prefs.js | 
> TEST-UNEXPECTED-FAIL | test_tabs.js | [phase3] Exception caught: ASSERTION FAILED! error locating tab No traceback available
> TEST-PASS | test_passwords.js | 
> TEST-PASS | test_history.js | 
> TEST-PASS | test_formdata.js | 
> TEST-PASS | test_bug530717.js | 
> TEST-PASS | test_bug531489.js | 
> TEST-PASS | test_bug538298.js | 
> TEST-PASS | test_bug556509.js | 
> TEST-PASS | test_bug562515.js | 
> TEST-PASS | test_bug563989.js | 
> TEST-PASS | test_bug535326.js | 
> TEST-PASS | test_bug501528.js | 
> TEST-PASS | test_bug575423.js | 
> TEST-PASS | test_bug546807.js | 
> TEST-UNEXPECTED-FAIL | test_history_collision.js | [phase4] Exception caught: ASSERTION FAILED! Uri visits found in history database, but they shouldn't be No traceback available
> TEST-PASS | test_privbrw_formdata.js | 
> TEST-PASS | test_privbrw_passwords.js | 
> TEST-PASS | test_privbrw_tabs.js | 
> TEST-PASS | test_bookmarks_in_same_named_folder.js | 
> TEST-PASS | test_client_wipe.js | 
> TEST-PASS | test_special_tabs.js | 
> TEST-PASS | test_addon_sanity.js | 
> TEST-UNEXPECTED-FAIL | test_addon_restartless_xpi.js | test did not complete
> TEST-UNEXPECTED-FAIL | test_addon_nonrestartless_xpi.js | test did not complete
> TEST-UNEXPECTED-FAIL | test_addon_reconciling.js | [phase06] Exception caught: ASSERTION FAILED! addon restartless-xpi@tests.mozilla.org is present, but it shouldn't be No traceback available
> TEST-UNEXPECTED-FAIL | test_addon_wipe.js | [phase02] Exception caught: ASSERTION FAILED! add-on is disabled: restartless-xpi@tests.mozilla.org No traceback available

I haven't gotten to test the old sync service yet. A full testrun takes quite a while.
Attachment #8397111 - Flags: feedback?(andrei.eftimie) → feedback+
All the tests were working fine for me without a hang. So it might be some random and intermittent failures. If those keep failing, we will investigate them in a separate bug. Thanks for running the tests!
Comment on attachment 8397111 [details] [diff] [review]
Don't force login after each start v1

Review of attachment 8397111 [details] [diff] [review]:
-----------------------------------------------------------------

As long as the account you're signing in to has previously verified its email, nothing stands out as funny here.
Attachment #8397111 - Flags: feedback?(ckarlof) → feedback+
Comment on attachment 8397111 [details] [diff] [review]
Don't force login after each start v1

Review of attachment 8397111 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm!

::: services/sync/tps/extensions/tps/resource/tps.jsm
@@ +93,3 @@
>  
> +  _init: function TPS__init() {
> +    // Check if Firefox Accounts are enabled

nit: s/are/is

@@ +565,2 @@
>        Logger.logInfo("Firefox version: " + Services.appinfo.version);
> +      Logger.logInfo('Firefox Accounts enabled: '+ this.fxaccounts_enabled);

nit: add a space before +

@@ +648,4 @@
>          this_phase.push([this.WipeServer]);
>        }
>  
> +      // If there has been specified a custom server, set it now

nit:  better worded as "If a custom server was specified, set it now"
Attachment #8397111 - Flags: review?(jgriffin) → review+
Patch which fixes the nits. Also I moved both auth_*.jsm files under a new subfolder called auth to get a better structure. I will land on inbound in a bit.
Attachment #8397111 - Attachment is obsolete: true
Attachment #8397603 - Flags: review+
Inbound is closed.  So here the final patch with the correct commit message for landing.
Attachment #8397603 - Attachment is obsolete: true
Attachment #8397604 - Flags: review+
I'm not around the next hours so it would be good if someone can land this for me on inbound. Thanks.
Keywords: checkin-needed
(In reply to Henrik Skupin (:whimboo) from comment #19)
> I'm not around the next hours so it would be good if someone can land this
> for me on inbound. Thanks.

np :) landed as https://hg.mozilla.org/integration/mozilla-inbound/rev/2dcb65629793
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/2dcb65629793
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Comment on attachment 8397604 [details] [diff] [review]
Don't force login after each start v2.1

[Approval Request Comment]
Bug caused by (feature/regressing bug #): None
User impact if declined: None, but I will fix a broken behavior of TPS when testing sync for fxa
Testing completed (on m-c, etc.): m-c
Risk to taking this patch (and alternatives if risky): None
String or IDL/UUID changes made by this patch: None
Attachment #8397604 - Flags: approval-mozilla-aurora?
Comment on attachment 8397604 [details] [diff] [review]
Don't force login after each start v2.1

looks like you might want this for beta too? please nom if so
Attachment #8397604 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Actually I cannot land this patch on mozilla-aurora because we need the patch on bug 985212 backported first.
Depends on: 985212
Whiteboard: [needs patch on bug 985212 landed first on each branch]
https://hg.mozilla.org/releases/mozilla-aurora/rev/59cc6bd58ed9
Whiteboard: [needs patch on bug 985212 landed first on each branch]
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.