Closed Bug 1018022 Opened 10 years ago Closed 9 years ago

Polling for verification mail stops after a relatively short time, which may "stall" sync migration

Categories

(Firefox :: Sync, defect)

defect
Not set
normal
Points:
5

Tracking

()

VERIFIED FIXED
Firefox 38
Iteration:
38.1 - 26 Jan
Tracking Status
firefox36 --- verified
firefox37 --- verified
firefox38 --- verified

People

(Reporter: markh, Assigned: adw)

References

Details

Attachments

(2 files, 3 obsolete files)

Currently Fx stops polling for email verification after a reasonably short time.  As the sync migration process will typically involve waiting for the user to create an FxA account and handle the verification mail, the fact we stop polling may "stall" the verification process - we will never see the fact the user is verified and thus will not be able to complete migration.

A work-around will be for the user to restart the browser (our migration process will need to account for that happening in any case), but that kinda sucks from a UX perspective.

Given we probably don't want to poll forever in the general case, it might be OK to expose a way so we can poll forever in the current session and have the migration code enable this while the migration is in progress.
Whiteboard: p=5
Whiteboard: p=5 → p=5 [qa?]
Blocks: 999910
No longer blocks: migratesync
Points: --- → 5
Flags: qe-verify+
Flags: firefox-backlog+
Whiteboard: p=5 [qa?]
FTR, bug 966517 exists to move the polling to a separate module - if we decide to go ahead with that, we should probably do both at the same time.
See Also: → 966517
Assignee: nobody → adw
Status: NEW → ASSIGNED
Iteration: --- → 38.1 - 26 Jan
See Also: → 1122355
See Also: → 1122443
Attached patch patch (obsolete) — Splinter Review
Here's a patch that does what we talked about in bug 1122443 comment 2.  I know we're waiting to hear back from Shane in that bug, so no worries if you choose to defer review until then -- well, if we get to the end of the iteration and we're still waiting, then worries.
Attachment #8552148 - Flags: review?(mhammond)
Comment on attachment 8552148 [details] [diff] [review]
patch

Review of attachment 8552148 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good with comments and the outcome of bug 1122443 addressed

::: services/fxaccounts/FxAccounts.jsm
@@ +840,5 @@
>  
> +  // Poll email status using truncated exponential back-off.
> +  pollEmailStatusAgain: function (currentState, sessionToken, timeoutMs) {
> +    let ageMs = Date.now() - this.pollStartDate;
> +    if (ageMs < this.POLL_SESSION) {

I think we should invert this condition, log a debug message saying we have given up and early-return.

@@ +841,5 @@
> +  // Poll email status using truncated exponential back-off.
> +  pollEmailStatusAgain: function (currentState, sessionToken, timeoutMs) {
> +    let ageMs = Date.now() - this.pollStartDate;
> +    if (ageMs < this.POLL_SESSION) {
> +      if (timeoutMs == undefined) {

=== here?

@@ +843,5 @@
> +    let ageMs = Date.now() - this.pollStartDate;
> +    if (ageMs < this.POLL_SESSION) {
> +      if (timeoutMs == undefined) {
> +        let currentSec = Math.ceil(ageMs / 1000);
> +        let timeoutSecs = Math.pow(2, currentSec) + 1;

I don't think this is the same as we most recently discussed - 2^n where n == number of seconds seems like it will grow too fast.

::: services/fxaccounts/FxAccountsCommon.js
@@ +79,5 @@
>  exports.CERT_LIFETIME      = 1000 * 3600 * 6;  // 6 hours
>  exports.KEY_LIFETIME       = 1000 * 3600 * 12; // 12 hours
>  
>  // Polling timings.
> +exports.POLL_SESSION       = 1000 * 60 * 20;   // 20 minutes

I think it's worth adding a concise comment above each constant for how it's used (and I'm not convinced a max of 20 minutes is that great - the cost for additional minutes is 1 poll per minute which seems low - but yeah, let's see what bug 1122443 says before worrying about this.
Attachment #8552148 - Flags: feedback+
Comment on attachment 8552148 [details] [diff] [review]
patch

Review of attachment 8552148 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccounts.jsm
@@ +788,5 @@
>      if (why == "start") {
>        // If we were already polling, stop and start again.  This could happen
>        // if the user requested the verification email to be resent while we
>        // were already polling for receipt of an earlier email.
> +      this.pollStartDate = Date.now();

We have some percentage (~15-20%) of Sync users which never verify their email. If these users don't remove their unverified account from Firefox. They will needlessly poll the server on each browser start. It makes sense to do one check on browser start, but if the user added their account 3 weeks ago, and it's still not verified, it probably never will be.

Possible solutions: 

1) Remove their account from the browser if they are still not verified after n weeks. 
2) Persist pollStateDate across browser restart.
Attached patch patch 2 (obsolete) — Splinter Review
Yikes, I am dumb.  This just addresses Mark's comment 3.  I'll leave off the r? while we're thinking about Chris's comment -- good point -- and waiting for bug 1122443.
Attachment #8552148 - Attachment is obsolete: true
Attachment #8552148 - Flags: review?(mhammond)
(In reply to Chris Karlof [:ckarlof] from comment #4)
> 2) Persist pollStateDate across browser restart.

This seems like the right thing to me, but we don't currently persist any per-account info in Firefox as far as I can tell, but maybe Mark can correct me.  So we'd need to set up some per-account storage first, like prefs.

I wonder if we can approximate it by only polling in certain cases, and in other cases not poll at all, just check once and then stop.  Wouldn't we only want to poll in two cases?  When the user first signs up, and when they click a Resend Verification Email button?  All other times, e.g. when FxAccounts.jsm is loaded and initialized, don't poll at all, just check once.
> This seems like the right thing to me, but we don't currently persist any per-account info in Firefox as far as I can tell, but maybe Mark can correct me.  So we'd need to set up some per-account storage first, like prefs.

FxA stores information in at least two places: 1) signedInUser.json in the profile dir, and 2) the password manager. It also uses the prefs for some configuration, although I don't think it currently uses the prefs for any account state. 

We want this state to be cleared if the user logs out. signedInUser.json is a good fit for that, and since you only need to write this value once on login, it might make sense to include it in the JSON blob before it gets written to disk. 

Here's a excerpt of FxAccounts.jsm where the user state gets written to disk after a login (http://mxr.mozilla.org/mozilla-central/source/services/fxaccounts/FxAccounts.jsm?force=1#459):

  setSignedInUser: function setSignedInUser(credentials) {
    log.debug("setSignedInUser - aborting any existing flows");
    this.abortExistingFlow();

    let record = {version: this.version, accountData: credentials};
    let currentState = this.currentAccountState;
    // Cache a clone of the credentials object.
    currentState.signedInUser = JSON.parse(JSON.stringify(record));

    // This promise waits for storage, but not for verification.
    // We're telling the caller that this is durable now.
    return this.signedInUserStorage.set(record).then(() => {
      this.notifyObservers(ONLOGIN_NOTIFICATION);
      if (!this.isUserEmailVerified(credentials)) {
        this.startVerifiedCheck(credentials);
      }
    }).then(result => currentState.resolve(result));
  },

If we want to persist this state across restart, I propose we rename pollStartDate to something like "addedAt", and add that value to the record right before we set it. Then the polling logic can use that as a reference point to make decisions and it will be persisted. FWIW, I think "addedAt" will semantically make more sense in the account record than pollStartDate, and may be more generally useful in the future.
Drew and I just chatted about this, and we decided not to solve all the polling issues here - we would like this bug uplifted so keeping it simple sounds like the right approach.

So this bug will (hopefully!) make an incremental improvement - it should allow us to poll longer but less frequently, with an explicit goal to not increase the number of polls done, even by users who never verify.

Drew will open a new bug to look at a bigger improvement to the polling - with one possible outcome being that it is better to invest in push or similar so that polling can be avoided completely, and also look into the possibility of automatically signing the unverified user out after some period, with the intent being that this larger improvement would ride the trains.

Speak now or forever hold your peace ;)
Attached patch patch 3 (obsolete) — Splinter Review
Chris, just to add on to what Mark said, this bug was originally about a quick way of improving the UX during the verification step of migration, and we'd like to keep it focused on that -- basically by extending the amount of time that we poll for, to allow more time for the user to respond to the verification email, while not decreasing the frequency so much that they're left waiting a long time after clicking the verification link.

This patch extends the polling period to 20 minutes.  In the first 2 minutes, we poll every 5 seconds.  In the remaining 18 minutes, we poll every 15 seconds.  That comes out to 96 total requests, 24 in the first 2 minutes, 72 in the remaining 18.  That's actually 4 fewer than the 100 requests we currently make at 3 seconds between requests for 5 minutes.  (We could keep the total number at 100 by polling for 19 minutes at the end instead of 18, but 21 total minutes is an odd number, and maybe it's a server-side bonus that fewer total requests are made.)

We're definitely open to changing the details of how many requests we make during what times, how we bucketize the polling in other words.  What do you think?
Attachment #8552603 - Attachment is obsolete: true
Attachment #8552792 - Flags: feedback?(ckarlof)
Makes sense. As you as you aren't increasing the overall polling, that's fine. 

We could use some larger decrease on the polling load though. Bug 1122443 is open for that, and it would be nice to get some near term improvements in for that sometime soon.
Attachment #8552792 - Flags: feedback?(ckarlof) → feedback+
Attachment #8552792 - Flags: review?(mhammond)
Comment on attachment 8552792 [details] [diff] [review]
patch 3

Review of attachment 8552792 [details] [diff] [review]:
-----------------------------------------------------------------

awesome, thanks!

::: services/fxaccounts/FxAccounts.jsm
@@ +296,5 @@
>   */
>  function FxAccountsInternal() {
>    this.version = DATA_FORMAT_VERSION;
>  
>    // Make a local copy of these constants so we can mock it in testing

nit: I guess the comment should now say "this constant"
Attachment #8552792 - Flags: review?(mhammond) → review+
See Also: → 1124941
https://hg.mozilla.org/mozilla-central/rev/f26bb81f9326
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 38
Comment on attachment 8553404 [details] [diff] [review]
landed patch with nit fixed

Approval Request Comment
[Feature/regressing bug #]: Firefox Account signup

[User impact if declined]: A slightly worse UX because Firefox will stop polling for FxA account verification after 5 minutes instead of 20

[Describe test coverage new/current, TreeHerder]: No new test coverage, already somewhat covered by existing test

[Risks and why]: Low risk, only changing the amount of time and frequency that Firefox polls for new FxA account verification

[String/UUID change made/needed]: None
Attachment #8553404 - Flags: approval-mozilla-aurora?
Attached patch Beta (36) patchSplinter Review
Approval Request Comment (same as comment 14)
[Feature/regressing bug #]: Firefox Account signup

[User impact if declined]: A slightly worse UX because Firefox will stop polling for FxA account verification after 5 minutes instead of 20

[Describe test coverage new/current, TreeHerder]: No new test coverage, already somewhat covered by existing test

[Risks and why]: Low risk, only changing the amount of time and frequency that Firefox polls for new FxA account verification

[String/UUID change made/needed]: None
Attachment #8555970 - Flags: approval-mozilla-beta?
Attachment #8555970 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8553404 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I see this bug has automated tests, is there need for manual testing? If so can you please offer some guidance? Thanks
Flags: needinfo?(adw)
Hi Bogdan, sure, you'll need to create a new profile, open the main menu panel, click the sign up for Sync button near the bottom, and create a new Firefox account.  You should receive an email about needing to verify your account.  Don't click the link in the email.  Instead, wait some number of minutes close to 20 but less than 20 (say 15) and then click the link.  Then the page in Firefox should reload and say something like "account created", and there should be a popup on the menu panel button that says "Sync enabled".

Another test would be to wait more than 20 minutes.  Then nothing should happen.

Instead of actually waiting, you could probably move your system clock forward by the same number of minutes.
Flags: needinfo?(adw)
(In reply to Drew Willcoxon :adw from comment #18)
> Hi Bogdan, sure, you'll need to create a new profile, open the main menu
> panel, click the sign up for Sync button near the bottom, and create a new
> Firefox account.  You should receive an email about needing to verify your
> account.  Don't click the link in the email.  Instead, wait some number of
> minutes close to 20 but less than 20 (say 15) and then click the link.  Then
> the page in Firefox should reload and say something like "account created",
> and there should be a popup on the menu panel button that says "Sync
> enabled".
> 
> Another test would be to wait more than 20 minutes.  Then nothing should
> happen.
> 
> Instead of actually waiting, you could probably move your system clock
> forward by the same number of minutes.

Thanks Drew. 
I reproduced the initial 'issue' using an old Nightly build, polling for email verification did not work after more then 5 minutes. 
I verified on Windows 7 64-bit, Mac OS X 10.9.5 and Ubuntu 14.04 32bit using latest Aurora, Firefox 36 RC build 2 and Firefox 37 beta 1 that Firefox stops polling for email verification only after 20 minutes and not earlier.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: