Bug 509552 (Closed): Opened 16 years ago, Closed 16 years ago

Making Client handle 401s correctly

Categories

(Firefox :: Sync, defect, P1)

Tracking

RESOLVED FIXED
People

(Reporter: telliott, Assigned: anant)

Attachments

(1 file, 1 obsolete file)

When the client receives a 404 from the server, it means the user is not on that cluster. If they've been an active user, they have probably been moved to another cluster. The client needs to check back with the user API node call and see whether the cluster the user has been assigned to has changed. Fixing this will free us up to migrate users off sj1, and then we can put in the fixes on that table for monitoring.
Priority: -- → P2
Target Milestone: --- → 0.6
Flags: blocking-weave1.0+
OS: Mac OS X → All
Priority: P2 → P1
Hardware: x86 → All
Blocks: 508889
Assignee: nobody → anant
Are there no other cases as to why a 404 may be generated? Do we simply catch all 404s on any request to the server and try to fetch the correct cluster?
The only way a 404 is generated is if the user is not found in the local auth db. That either means they previously existed here but don't any more, or they moved. It's possible that there was an error. If so, the checkNode call will still return the value of the original server, in which case the client should back off and try again later.
GET /.../collection/non-existent-id returns 404. Comment #2 is correct if we limit it to searches on the collection. Collections are never explicitly created or deleted, searching an empty (=never used) one would just return nothing. But operations on individual records are a different story.
Argh, I'm an idiot. Authentication failure returns a 401, not a 404.
So I think what we need is:
1) Make sure auth failures, even after the initial login, and even during a sync, bubble up to the service where they can be dealt with.
2) The service needs a backoff algorithm plus the ability to retry whatever it was doing, if it was doing something. It might need to log the user out and wait for user input if it cannot recover; in that case it should fire an additional notification so the UI can present the problem to the user (e.g. as a warning icon).
3) Make sure we check nodes at most once in X minutes, to avoid loops/spamming the server. This needs to be part of the backoff algorithm.
That sounds correct. It's basically: on a 401, if the client hasn't checked the central server in the last X minutes, check with the central server.
* If the node has changed, retry at the new node.
* If the node hasn't changed, alert the user to the auth problem.
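The flow above can be sketched roughly as follows. This is an illustrative sketch only: `makeClient`, `checkNode`, and the constant name are assumptions for the example, not the actual Weave client code.

```javascript
// Throttle for node checks: the "X minutes" from the comment above
// (the patch discussed below uses 5 minutes; that value is assumed here).
const MIN_NODE_CHECK_MS = 5 * 60 * 1000;

// checkNode: asks the central server which cluster this user is on.
// now: clock function, injectable so the throttle can be tested.
function makeClient(checkNode, now) {
  let lastNodeCheck = -Infinity; // never checked yet
  let node = null;               // cluster URL currently in use

  return {
    handle401() {
      const t = now();
      if (t - lastNodeCheck < MIN_NODE_CHECK_MS) {
        // Node was checked recently: treat this as a real auth problem
        // and surface it to the UI instead of looping on the server.
        return { action: "auth-error" };
      }
      lastNodeCheck = t;
      const newNode = checkNode();
      if (newNode !== node) {
        node = newNode;
        return { action: "retry", node }; // user moved: retry at new node
      }
      return { action: "auth-error" };    // same node: genuine 401
    },
  };
}
```

The injected clock makes the once-per-X-minutes backoff deterministic to test, which matters since the whole point is to avoid spamming the central server.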
Summary: Making Client handle 404s correctly → Making Client handle 401s correctly
On login and sync, if we encounter a 401 and have not updated the cluster in the last 5 minutes, we fetch the new cluster and retry the operation. If the cluster was fetched within the last 5 minutes, the 401 exception is propagated upward.
Attachment #395225 - Flags: review?(thunder)
Actually check whether the exception raised in engine.sync() is a RequestException, and only then handle the 401.
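The type check described in this comment can be sketched like so. The `RequestException` class and helper names here are assumptions for illustration, not the actual Weave classes.

```javascript
// Minimal stand-in for an HTTP request error carrying a status code.
class RequestException extends Error {
  constructor(status) {
    super(`HTTP ${status}`);
    this.status = status;
  }
}

// Only treat the failure as the cluster-changed case when it is an HTTP
// 401 request error; any other exception from sync() is rethrown as-is.
function syncWithRetry(sync, updateClusterAndRetry) {
  try {
    return sync();
  } catch (e) {
    if (e instanceof RequestException && e.status === 401)
      return updateClusterAndRetry();
    throw e; // not an HTTP 401: propagate unchanged
  }
}
```

Without the `instanceof` check, an unrelated exception from the engine (a coding bug, say) would silently trigger a node refetch instead of surfacing.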
Attachment #395225 - Attachment is obsolete: true
Attachment #395232 - Flags: review?(thunder)
Attachment #395225 - Flags: review?(thunder)
Comment on attachment 395232: Handle a 401 from the server for cluster-changed case (v2)

Looks good. r=thunder
Attachment #395232 - Flags: review?(thunder) → review+
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Can this be triggered from the client for testing?
Flags: in-litmus?
Not without installing the admin suite to move you from one node to another. I don't believe that's up yet.
Flags: in-litmus? → in-litmus-
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox