Closed Bug 1081158 Opened 5 years ago Closed 5 years ago

Firefox fails to ask me to reconnect my other browser after a password change

Categories

(Firefox :: Sync, defect)

35 Branch
defect
Not set
Points:
2

Tracking

()

VERIFIED FIXED
Firefox 39
Iteration:
39.2 - 23 Mar
Tracking Status
firefox35 --- wontfix
firefox36 - wontfix
firefox37 + verified
firefox38 + fixed
firefox39 --- verified
firefox-esr31 --- unaffected

People

(Reporter: rfeeley, Assigned: markh)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 1 obsolete file)

I recently changed my Firefox Account password while on my work computer. Weeks later, at home, I noticed that I was getting the yellow notification box (i.e. sync has not been able to connect for 14 days).

I should instead have seen:

https://www.dropbox.com/s/bdkjnymctj4h9gz/Desktop_Custom_Menu_Reconnect.pdf

https://www.dropbox.com/s/r2k3uyhynezffc5/Desktop%20-%20Preferences%20-%20Cannot%20Connect.pdf

Would love to see automated test coverage of this process.
> Weeks later, at home, I noticed that I was getting the yellow notification box (i.e. sync has not been able to connect for 14 days).

So that error message is in one sense accurate, but you're saying the "reconnect to sync" error should have precedence. When you need to re-enter your password on the second device, a different, similar looking yellow notification is supposed to show ("incorrect username or password"). 

But you're saying that the incorrect error states were not displayed in the hamburger menu and sync pref panel?
Attached image Legacy sync errors
Well as luck would have it, I have all the screenshots. Notice that all of them are dead-ends in terms of resolving the problem.
Flags: firefox-backlog+
(In reply to Ryan Feeley from comment #2)
> Created attachment 8503319 [details]
> Legacy sync errors
> 
> Well as luck would have it, I have all the screenshots. Notice that all of
> them are dead-ends in terms of resolving the problem.

The infobar saying you have the incorrect username or password has a button that should take you to the prefs window, which should offer the link to reauth.  So I'm a bit confused as to what exactly you are saying seeing the screenshots do show the expected info bar.  It sounds like you *didn't* see this before the 14 days or something, but I'm not sure.

Are you able to repro the issue?
Flags: needinfo?(rfeeley)
The only UI elements indicating the problem were the notification boxes. The preferences panel was in the normal state with the Manage and Disconnect button, no Reconnect button. Same for the menu items (which just opened preferences). So there was no way to reconnect aside from Disconnecting and then signing in again. I'll try and repro.
Flags: needinfo?(rfeeley)
One screenshot (not faked, just showed for a few minutes) depicts this state where the legacy notification box appears but the preferences is not reflecting it.
(In reply to Ryan Feeley from comment #5)
> One screenshot (not faked, just showed for a few minutes) depicts this state
> where the legacy notification box appears but the preferences is not
> reflecting it.

That "just" the generic "haven't been able to sync for 14 days" message - no notification in sync prefs is expected there.  The sync prefs only show something when in the "incorrect username or password" state.

I'm not sure *why* you haven't been able to sync for 14 days, but Firefox doesn't think it's due to a need to reauth.  Do you have logs?
Blocks: 1119667
I can reproduce this, and it's bad :(  Requesting 36 but that might be optimistic.  We should certainly try for 37.

STR:
* Change your FxA password, then restart the browser.  You *may* see the infobar once here once we detect the change.
* Restart the browser.

You will never see the "reauth needed" info bar.  Sync will never again automatically attempt to Sync.  (Note that manually triggering a sync *will* show the infobar)

The 2 things that conspire against us are:

* shouldReportError() treats a messing clusterURL as a "mid sync 401" - ie, a node reassignment.  But we previously dropped the cluster URL, so shouldReportError() is treating this very first error as "mid-sync", which is wrong in an FxA world.

* Because the error is treated as a fatal login error no new syncs are scheduled (not that they would succeed, and the point above means we still aren't going to prompt - so really this is somewhat moot)
OS: Mac OS X → All
Hardware: x86 → All
See Comment 7 for more context, but the short of it is that the fact we don't have a cluster URL isn't a good indication that an error is "mid-sync".  Thus, when the browser starts after a previous auth failure and fails to login, no error is reported.

This patch has shouldReportError() take a param which is the notification topic that caused us to be called.  If the topic is one that explicitly indicates the login failed we always consider it reportable.

I did consider checking the |lastSyncReassigned| pref, but the fact this is persisted means it doesn't work - if we startup with |lastSyncReassigned="true"| and fail to login, that pref will never get reset as a sync can never complete.  I personally think we should consider keeping that flag in-memory so we never startup thinking we failed mid-sync, but that's likely to be fragile.

Richard, what do you think?
Attachment #8558881 - Flags: feedback?(rnewman)
I agree that this is bad but I don't think this is either bad enough to ship in a point release or cause us to delay releasing. As we're going to build with beta 7 tomorrow, I think it's pretty late to take this change in 36 unless it ended up being a one liner. I think we should try to fix this in 37 if possible. I'm tracking for 37 but am not committing to fixing the problem in this release. We will need a fix soon (preferably in the next week or so) and will have to weight the risk of regression against the benefit of fixing this problem.
Mark - Looks like you're working on this one. Hope you don't mind that I assigned the bug to you.
Assignee: nobody → mhammond
Iteration: --- → 38.2 - 9 Feb
Status: NEW → ASSIGNED
Comment on attachment 8558881 [details] [diff] [review]
0001-Bug-1081158-ensure-we-report-all-login-related-error.patch

Review of attachment 8558881 [details] [diff] [review]:
-----------------------------------------------------------------

I don't see anything wrong with your analysis, so if this beats your STR, rock on.

::: services/sync/modules/policies.js
@@ +853,5 @@
>      }
>  
>      // We got a 401 mid-sync. Wait for the next sync before actually handling
>      // an error. This assumes that we'll get a 401 again on a login fetch in
>      // order to report the error.

You probably need to adjust this to say "We might have received a 401 mid-sync." I'd be inclined to leave a pointer to this bug.

::: services/sync/tests/unit/test_errorhandler.js
@@ +470,5 @@
>    yield deferred.promise;
>  });
>  
> +// Test that even if we don't have a cluster URL, a login failure due to
> +// authentication errors are always reported.

s/are/is

@@ +472,5 @@
>  
> +// Test that even if we don't have a cluster URL, a login failure due to
> +// authentication errors are always reported.
> +add_identity_test(this, function test_shouldReportLoginErrorWithNoCluster() {
> +  // Ensure no clusterURL - any error not specific to loggin in should repor.

s/loggin/logging
Attachment #8558881 - Flags: feedback?(rnewman) → review+
Hi Mark, can you assign a point value.
Flags: qe-verify?
Flags: needinfo?(mhammond)
https://hg.mozilla.org/integration/fx-team/rev/8f8af8afd1a6
Points: --- → 2
Flags: qe-verify?
Flags: qe-verify+
Flags: needinfo?(mhammond)
I think you broke something in xpshell with this commit. Can you look or shall I backout?
Iteration: 38.2 - 9 Feb → 38.3 - 23 Feb
It seems like a 401 during login *might* still be a node reassignment with the legacy identity.  So a better approach is probably to just treat |Status.login == LOGIN_FAILED_LOGIN_REJECTED| as "always report the error"

There are 2 cases outside of the identity that sets this status:

* https://dxr.mozilla.org/mozilla-central/source/services/sync/modules/service.js#753 - when we got a 404 on info/collections, then re-fetched the cluster URL, then *again* got a 404.

* https://dxr.mozilla.org/mozilla-central/source/services/sync/modules/stages/cluster.js#45, when the legacy identity got a 400 after fetching the cluster URL.

So it seems the chance of a "false-positive" with checking Status.login is fairly low.
Attachment #8558881 - Attachment is obsolete: true
Attachment #8562473 - Flags: review?(rnewman)
Iteration: 38.3 - 23 Feb → 39.1 - 9 Mar
Attachment #8562473 - Flags: review?(rnewman) → review+
https://hg.mozilla.org/mozilla-central/rev/03d1fd491515
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 39
Comment on attachment 8562473 [details] [diff] [review]
0001-Bug-1081158-ensure-we-report-all-login-related-error.patch

Approval Request Comment
[Feature/regressing bug #]: FxA Sync
[User impact if declined]: Users may never be notified that they need to login to Sync
[Describe test coverage new/current, TreeHerder]: Tests continue to pass, landed on central
[Risks and why]: Low risk limited to sync, simple patch
[String/UUID change made/needed]: None
Attachment #8562473 - Flags: approval-mozilla-beta?
Attachment #8562473 - Flags: approval-mozilla-aurora?
QA Contact: catalin.varga
Florin/Catalin, if possible, it would be great to verify this fix on Nightly before uplifting for Beta 4 on Mon, Mar 9.
I've tried reproducing this bug using FF 37.0b3 using the scenarios from the bug and the scenario from comment 7 without success. In both cases the behavior was the expected one eg:
- this infobar https://www.dropbox.com/s/bdkjnymctj4h9gz/Desktop_Custom_Menu_Reconnect.pdf is displayed after every restart.
- this state https://www.dropbox.com/s/r2k3uyhynezffc5/Desktop%20-%20Preferences%20-%20Cannot%20Connect.pdf is present in about:preferences#sync 

Due to the fact that I did not manage to reproduce the issue I can't provide a trustworthy verification.
Mark - can you address this feedback in comment 21 with updates to your changes?
Flags: needinfo?(mhammond)
I spoke with Ryan (reporter) about this bug. He tried to verify with me and was unclear how the behaviour had changed. Specifically, he's still seeing the yellow notification bar instead of the prompt that he's expecting. I'm reopening the bug on his behalf as this bug doesn't seem to be fixed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Iteration: 39.1 - 9 Mar → 39.2 - 23 Mar
(In reply to Catalin Varga [QA][:VarCat] from comment #21)
> I've tried reproducing this bug using FF 37.0b3 using the scenarios from the
> bug and the scenario from comment 7 without success.

Yes, that's my fault - comment 7 isn't as accurate as it could be, and I think you aren't hitting the state causing this bug (which is the fault of the instructions!)  Try this:

* With Sync configured, change your FxA password on the same device (ie, Options->Sync->Manage).
* Wait for the browser to report the "please re-enter your password" state without shutting down.  Note that this might be a few hours, although if you create a new boolean preference "services.sync.debug.ignoreCachedAuthCredentials" set to true it should happen almost immediately (no need to restart after setting this pref)
* Once you are in this state, use about:config to check that "services.sync.clusterURL" preference has become the empty string - this is the key to triggering this bug.  If the cluster URL is not blank when restarting the conditions for this bug are not met.
* Restart the browser and either wait for sync, or force a new sync.

Without this patch you should not see the "please re-enter your password" state we saw above, but Sync will just silently fail. With this patch you should see that "please re-enter your password" state as soon as Sync tries to start.

(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #23)
> I spoke with Ryan (reporter) about this bug. He tried to verify with me and
> was unclear how the behaviour had changed. Specifically, he's still seeing
> the yellow notification bar instead of the prompt that he's expecting. I'm
> reopening the bug on his behalf as this bug doesn't seem to be fixed.

I suspect there is some mis-communication here, as reproducing the original bug means getting into this state, then waiting 14 days for the "Haven't been able to sync for a long time" infobar, as during those 14 days we've never asked them to enter a new password.  Ryan, can you please clarify what you are currently seeing on Nightly regarding this?

But FTR, I just re-verified the problem on Aurora (so I could correct the STR above) and verified it fixed on a locally built nightly.
Flags: needinfo?(mhammond) → needinfo?(rfeeley)
I'm seeing the yellow notification bar, and the message in in-content sync preferences (though it lacks the warning icon), but the toolbar menu is not showing the Reconnect to Sync message until much later. I expanded it a dozen times while typing this message and it didn't show up until I entered and exited customize mode.

How about we just remove weave notification bars altogether? The updated UI handles the issues, but are just lagging behind a bit. I'd rather have one lagging, than two that are out of sync.
Flags: needinfo?(rfeeley)
(In reply to Ryan Feeley from comment #25)
> How about we just remove weave notification bars altogether? The updated UI
> handles the issues, but are just lagging behind a bit. I'd rather have one
> lagging, than two that are out of sync.

Yeah, we should consider that.  It may also be that the issue you are seeing is not exactly the issue I found - the symptoms seem the same but they may not be identical.

However, I believe the patch here is still an improvement, and that if it can be verified by Catalin it should be uplifted.  Catalin, can you please try and re-verify using the instructions in comment 24?
Flags: needinfo?(catalin.varga)
Attached image syncfinalstate.jpg
I've retested this bug using the steps from comment 25 using the following environment:

FF 39
Build Id: 20150315030236
OS: Win 7 x64 and Mac Os X 10.7.5

Restarting the browser after services.sync.clusterURL will set the browser in the following state:

- on the main window the following error message is displayed:
"Sync encountered an error while syncing: Incorrect account name or password. Sync will automatically retry this action."
- in the hamburger menu you are displayed as logged in to the Firefox account( this is an issue that remains to be solved I've logged bug 1143643 for this issue)
- in about:preferences#sync
you are not displayed as logged in and the message "Please sign in to reconnect account@email.com" is being displayed.

For a better understanding of the final state I've uploaded this print screen .

This patch is definitely an improvement from the initial bug when nothing was being displayed after the restart and the sync was just silently failing.
Flags: needinfo?(catalin.varga)
Comment on attachment 8562473 [details] [diff] [review]
0001-Bug-1081158-ensure-we-report-all-login-related-error.patch

Given that this is definitely an improvement and is a low risk patch, I think we should take this in 37. Catalin has filed follow-up bug 1143643. Beta+ Aurora+

Ryan - Can you please file a follow up for the issue as you described it in comment 25?
Flags: needinfo?(rfeeley)
Attachment #8562473 - Flags: approval-mozilla-beta?
Attachment #8562473 - Flags: approval-mozilla-beta+
Attachment #8562473 - Flags: approval-mozilla-aurora?
Attachment #8562473 - Flags: approval-mozilla-aurora+
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
I verified this fix using:

FF 37.RC2 
Build Id: 20150326190726
OS: Win 7 x64, Ubuntu 14.04 x86, Mac Os X 10.9.5
Removing qe-verify flag since this was verified for Nightly 39 and Beta 37.
Flags: qe-verify+
I also verified this fix using 
FF 38
Build Id: 20150330004006

and for some reason the fix is not working properly on FirefoxDeveloperEdition.
After the new password is entered "please re-enter your password" continues to show it,s like the new password cannot be successfully saved. 
Not sure if related with this issue but on this step :

* Once you are in this state, use about:config to check that "services.sync.clusterURL" preference has become the empty string - this is the key to triggering this bug.  If the cluster URL is not blank when restarting the conditions for this bug are not met.

the behavior on DeveloperEdition is that the pref gets complete deleted and not just has his value emptied.
Flags: needinfo?(mhammond)
Flags: needinfo?(rfeeley)
Any thought here on behavior for DevEdition mentioned in comment 33?
Sorry for the delay, but this is going to be out of date now. Please file another bug if strangeness can still be reproduced.
Flags: needinfo?(markh)
You need to log in before you can comment on or make changes to this bug.