Closed Bug 1182679 Opened 9 years ago Closed 9 years ago

Firefox Account "Communication Preferences" page is just a spinning throbber

Categories

(Cloud Services :: Server: Firefox Accounts, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: dholbert, Unassigned)

References

()

Details

Attachments

(1 file)

STR:
 1. Visit https://accounts.firefox.com/ and log in.
   --> You end up at "Manage Account" page

 2. Click the very first link on this page, "Communication preferences"


ACTUAL RESULTS:
I end up at an empty page (no text) with a throbber that spins forever.

EXPECTED RESULTS:
Some useful "communication preferences" page. (Or, no "communication preferences" link at all, if we don't currently have any content for that page.)


The URL of the page with the never-ending throbber is:
https://accounts.firefox.com/settings/communication_preferences
I suspect you hit an outage on the "basket" service that backs this page.  There's a bug where errors from that server are not properly reflected in our UI, which should be fixed in the next release:

  https://github.com/mozilla/fxa-content-server/pull/2711

If you're able to reproduce this reliably, it'd be great to get a devtools network view to confirm whether it is indeed a failing request to this service.

> (Or, no "communication preferences" link at all, if we don't currently have any content for that page.)

We're generally not in the habit of shipping empty placeholder pages for things that haven't been implemented yet ;-)
(In reply to Ryan Kelly [:rfkelly] from comment #1)
> I suspect you hit an outage on the "basket" service that backs this page. 
> There's a bug where errors from that server are not properly reflected in
> our UI, which should be fixed in the next release

Ah, OK!

> If you're able to reproduce this reliably, it'd be great to get a devtools
> network view to confirm whether it is indeed a failing request to this
> service.

I can still reproduce this right now (from a different computer, on a different network, ~7 hours later). So I think I can reproduce this reliably, and it's probably not just an outage (unless it's a really long outage).

I'll see what sort of data I can capture...

> We're generally not in the habit of shipping empty placeholder pages for
> things that haven't been implemented yet ;-)

(OK, good. :) I wasn't sure - that sort of thing does happen in browser UI sometimes, in Nightly. It's good we're not doing that in a production website, though. :))
So I see this JS error in the Console when loading this page:
{
  Unexpected error 778ac72e.main.js:7:7994
  s<.logError() 778ac72e.main.js:7
  l() 778ac72e.main.js:7
  D() 778ac72e.main.js:2
  s.partial/c() 778ac72e.main.js:2
  n<.beforeRender/<() 778ac72e.main.js:12
  F() 778ac72e.main.js:6
  E()
}

Surely that makes the problem clear? :D
Yikes.  /cc'ing a couple of FxA front-end experts onto this bug to see what they make of it
Not sure what the Unexpected error stack in comment 3 is about, but this from the network panel may be more relevant -- it looks like I'm getting a "400 Bad Request" response to my "lookup-user" GET request.

If I highlight that line, I see this information on the right:
  Request URL: https://accounts.firefox.com/basket/lookup-user
  Request method: GET
  Status code: 400 Bad Request
Anything unusual about the email address you used to sign up for FxA?
Nope. Just my username @ cs.stanford.edu.

Also, I can reproduce this (throbber + comment 3 & comment 5) with a fresh profile, in a Firefox release version (v38, slightly out of date), as well as in my normal browsing profile w/ Nightly. So, this isn't a interaction with an add-on / obscure local pref, nor is it a recent Nightly-channel browser bug.
:dholbert - the stack you showed in https://bugzilla.mozilla.org/show_bug.cgi?id=1182679#c3 is addressed by https://github.com/mozilla/fxa-content-server/pull/2730 I think, a fix for which was merged in to `master`. 

:vladikoff we should cherry-pick that fix back to next release train-41. 

Getting that stack trace is consistent with what happens when you get a 500 or 404 with the request to lookup-user while basket is not up. But I'll need to look into what the 400 is about.
I looked at some logs and the 400's that dholbert experienced were returned from basket after we had authenticated the bearer token that was pass to our proxy-to-basket.

So, 1. we should follow up with basket if they can see why these particular requests were rejected.
2. There is PR to handle 5xx errors coming from basket but apparently the error handling in the front end is not handling other error responses as it should. https://github.com/mozilla/fxa-content-server/pull/2711
(In reply to John Morrison [:jrgm] from comment #8)
> :dholbert - the stack you showed [...] is addressed by
> https://github.com/mozilla/fxa-content-server/pull/2730 I think, a fix for
> which was merged in to `master`.

Just to make sure I'm understanding -- that commit just makes error-reporting work better, right? (i.e. it won't actually fix my original problem here, but it'll produce a useful error report instead of the stack in comment 3.)

> Getting that stack trace is consistent with what happens when you get a 500
> or 404 with the request to lookup-user while basket is not up. But I'll need
> to look into what the 400 is about.

FWIW, I'm still seeing the same behavior [perma-throbber & 400 response to my GET in network panel]  which I suspect means my problem is unrelated to basket being up/down. (unless it's been down for days)
Trying to run down some possible causes on the ExactTarget side. Did you opt-in to the email thread (subscribe) through the Accounts UI?
Flags: needinfo?(dholbert)
Probably not (I tend not to opt into mailing lists), but I'm not sure.
Flags: needinfo?(dholbert)
> Did you opt-in to the email thread (subscribe) through the Accounts UI?

Specifically, that's the "Get the latest news on Mozilla and Firefox" checkbox that presents in the account creation screen.  I suggest that you almost certainly didn't check it by accident :-)
I created my Firefox Account back in February 2014; I don't know if we even had that checkbox at that time.  If we did, though, I don't remember whether I checked it (but I probably didn't).
Nope, the checkbox is newer than that.
Cool. Sounds like I never saw the checkbox, then. (and maybe that's part of what's causing weird server behavior here)
We are logging 4xx and 5xx errors to Heka, with the body returned by the Basket server using the |server.basketproxy| and |auth.unauthorized| namespaces.


:jrgm, do you have access to the basket proxy server data? I am unable to see it in Kibana to dig in myself, or maybe I'm just not looking hard enough.
Flags: needinfo?(jrgm)
So, yeah, these aren't in Kibana. These are not oauth auth issues; basket returns 400 and that's all the detail there is in the logs. 

Seems to me someone (not me) should look at the logs that are returning the 400, which obviously can say more about the reason.
Flags: needinfo?(jrgm)
ni? :pmac to see if we can find anything about this 400 errir in the logs on the basket side.  :pmac, Comment 5 and Comment 7 contain the most relevant info.
Flags: needinfo?(pmac)
Looking at the basket view code, if it's specifically a 400 error, then the view doesn't think it's received an email address or token to look up. That seems unlikely, but I don't have a way of getting any more info about this specific request. I just tried my account and everything seems to be working well. My guess (hope) is that this is specific to dholbert's account, but I can't be sure yet.
Flags: needinfo?(pmac)
> Looking at the basket view code, if it's specifically a 400 error, then the view doesn't think
> it's received an email address or token to look up.

IIUC utils.py:get_user_data can also return a 400 response if there's an error from ExactTarget.  Maybe this is due to some problem in his record on the ET backend?

:pmac, can you please try duplicating the request that's being made on the backend and see if basket returns a 400 directly for you?  I don't have the necessary keys to try it myself.  The request would be:

  GET /lookup-user?email=dholbert%40cs.stanford.edu
  X-API-Key:  our-prod-api-key-that-I-dont-have-access-to


(Also, marking moco-only because I just de-obfuscated Daniel's email after he carefully avoided writing it in Comment 7)
Group: mozilla-employee-confidential
Flags: needinfo?(pmac)
(In reply to Ryan Kelly [:rfkelly] from comment #21)
> (Also, marking moco-only because I just de-obfuscated Daniel's email after
> he carefully avoided writing it in Comment 7)

Thanks, I appreciate the concern! I'm fine leaving this bug public; my same email address is already public in our hg logs (and via hgweb). --> clearing confidential flag.
Group: mozilla-employee-confidential
Work to show the user "System unavailable. Please try again soon" on 4xx and 5xx errors has been merged into the content-server in https://github.com/mozilla/fxa-content-server/pull/2711.
(In reply to Ryan Kelly [:rfkelly] from comment #21)
> :pmac, can you please try duplicating the request that's being made on the
> backend and see if basket returns a 400 directly for you?  I don't have the
> necessary keys to try it myself.

done. it does indeed return a 400. I believe it is saying that ET can not find this email address.

Ben, can you find the email referenced in comment #21 in Exact Target?
Flags: needinfo?(pmac) → needinfo?(bniolet)
Negative. Email address doesn't exist on Master_Subscribers.
Flags: needinfo?(bniolet)
So, is this expected behaviour that we should be handling gracefully on our end, e.g. treating this error as equivalent to a 404?  In other cases where the user doesn't exist we've seen 404 responses.
Flags: needinfo?(pmac)
Okay. I think I have this figured out. The issue is that dholbert is in a pending state. He's submitted his email at some point, but never clicked the confirmation link in the opt-in email. So basket should be returning this:

{"status": "ok", "format": "H", "newsletters": [], "created-date": "6/17/2015 12:10:53 AM", "lang": "en-US", "confirmed": false, "country": "", "token": "*********************", "master": false, "email": "dholbert@cs.stanford.edu", "pending": false}

in a 200 response. The problem is that there's an error in the `get_user_data` method that tries to look up users in the "confirmed" table using whatever was passed in, but only token works for that table. I've got a fix in that is working for my local basket. Hopefully I'll have this patch landed and pushed in the next few days (have to wait on IT for a push).

In the mean time you're right that a 404 is the proper code for a user that doesn't exist, and 400 should be for a network error talking to ET. Either case should be handled on your end, but if you want to treat 400s differently in the UI then that should be fine and basket will behave correctly soon.

This leaves us with the question of what to do with users like this. Should we mark them as confirmed, which would move whatever newsletter they requested initially into a subscribed state, or just go ahead with whatever they've requested on your side. Since they've got a token you can use that to add them to subscribe and opt them into the newsletter from the FxA UI. Or have we already worked this out?
Flags: needinfo?(pmac)
Depends on: 1185999
> This leaves us with the question of what to do with users like this. Should we mark them as confirmed, which would move whatever newsletter they requested initially into a subscribed state, or just go ahead with whatever they've requested on your side.

Shane and Sai worked on handling 500 vs 404 vs 400. ni'd
Flags: needinfo?(stomlinson)
> This leaves us with the question of what to do with users like this. Should we mark them as confirmed, which would move whatever newsletter they requested initially into a subscribed state, or just go ahead with whatever they've requested on your side.

Subscribing the user to newsletters they subscribed to "long ago" might be surprising, or at least a vector that can be used to annoy someone. I could go subscribe you to a bunch of newsletters, you never verify because you don't want the newsletters, and then you sign up for FxA and opt-in to the newsletter from there. Corner case? Yeah.

Is it difficult to ignore that original request, or drop the entry from the "needs to verify" table and just do the new request?
Flags: needinfo?(stomlinson)
Agreed. I don't think it'll be hard to do, and in fact I think that's what's already happening since users that subscribe through FxA don't go through our double-opt-in process because they've had their email verified by virtue of the signup process. Thanks for the help.
So IIUC, the fix here is a combination of:

* FxA surfacing 400 errors from basket as useful error UI rather than a spinning throbber
* Basket correctly handling this corner-case of uncompleted opt-in

In which case, both of the fixes are in and we just have to wait for fresh deployments before confirming that it all works as expected.  Daniel, can you could please try this again sometime next week and let us know if it's fixed?
That's my understanding as well Ryan. I've got a fix committed, just need to do a push to prod and we should be back to the expected behavior of 404s on unknown users, and 200s with useful data. The basket bug for this depends on this one, and I'll update it when it goes to prod.
((In reply to Ryan Kelly [:rfkelly] from comment #31)
> Daniel, can you could please try this again sometime next week and let us know if it's
> fixed?

Sure (1.5 weeks later, sorry for slight delay).

I'm still seeing the bug -- page is just a spinning throbber, and I'm still seeing a 400 response in the network devtools panel, for a query to https://accounts.firefox.com/basket/lookup-user

I think (?) this might be expected, per comment 32 here. (Sounds like the fix isn't in production yet?)  pmac, please let me know when I should retest.
Flags: needinfo?(pmac)
Right, not in prod yet. I'll be getting that pushed out hopefully tomorrow. Apologies for the delay.
Flags: needinfo?(pmac)
This is pretty old now, I'm going to close it out on the assumption that the fix worked.  Daniel, if you happen to take the time to retest this and it's still a problem, please re-open.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Yup, looks good. Thanks!
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: