Closed
Bug 742485
Opened 13 years ago
Closed 13 years ago
Node assignment sends invalid "null" cluster response when no node is available
Categories
(Cloud Services :: Server: Registration, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: jbonacci, Assigned: telliott)
References
Details
(Whiteboard: [qa+])
Attachments
(1 file)
7.36 KB, patch | rfkelly: review+
atoll and I tested this against stage re: No node found
Disable node assignments in Stage
Set up a new profile/account
Wait for first sync or force a Sync Now
Error bar appears about "NS_ERROR_UNKNOWN_HOST"
Cluster assignment ends up being https://null/
Sync log looks like this:
...etc...
1333567974789 Sync.Service INFO Account created: BLAH
1333567974803 Sync.Status DEBUG Status.service: service.client_not_configured => success.status_ok
1333567974803 Sync.AddonsReconciler INFO Registering as Add-on Manager listener.
1333567974803 Sync.AddonsReconciler DEBUG Adding change listener.
1333567974868 Sync.Service DEBUG User-Agent: Firefox/11.0 FxSync/1.14.0.20120312181643.
1333567974868 Sync.Service INFO Starting sync at 2012-04-04 12:32:54
1333567974868 Sync.Service DEBUG In sync: should login.
1333567974869 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok
1333567974869 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok
1333567974869 Sync.Service INFO Logging in user vonxzneli6fhmoby3opjh3xujp6j2o7x
1333567974869 Sync.Service DEBUG Finding cluster for user vonxzneli6fhmoby3opjh3xujp6j2o7x
1333567974990 Sync.Resource DEBUG mesg: GET success 200 https://stage-auth.services.mozilla.com/user/1.0/BLAH/node/weave
1333567974990 Sync.Resource DEBUG GET success 200 https://stage-auth.services.mozilla.com/user/1.0/BLAH/node/weave
1333567974991 Sync.Service DEBUG Cluster value = https://null/
1333567974991 Sync.Service DEBUG Setting cluster to https://null/
1333567974991 Sync.Service DEBUG Caching URLs under storage user base: https://null/1.1/BLAH/
1333567975085 Sync.Service DEBUG verifyLogin failed: NS_ERROR_UNKNOWN_HOST JS Stack trace: Res_get()@resource.js:483 < ()@service.js:749 < WrappedNotify()@util.js:148 < verifyLogin()@service.js:717 < ()@service.js:1006 < WrappedNotify()@util.js:148 < WrappedLock()@util.js:103 < WrappedCatch()@util.js:77 < WeaveSvc_login()@service.js:980 < ()@service.js:1272 < WrappedCatch()@util.js:77 < sync()@service.js:1268
1333567975085 Sync.Status DEBUG Status.login: error.login.reason.no_username => error.login.reason.network
1333567975085 Sync.Status DEBUG Status.service: success.status_ok => error.login.failed
1333567975085 Sync.Status DEBUG Status.login: error.login.reason.network => error.login.reason.network
1333567975085 Sync.Status DEBUG Status.service: error.login.failed => error.login.failed
1333567975086 Sync.Tracker.Clients DEBUG client.name preference changed
1333567975086 Sync.Tracker.Clients WARN Attempted to add undefined ID to tracker
1333567975087 Sync.SyncScheduler DEBUG Clearing sync triggers and the global score.
1333567975087 Sync.SyncScheduler DEBUG Next sync in 86400000 ms.
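For anyone reading the log above: the leading number on each line is a timestamp in milliseconds since the Unix epoch. A minimal Python sketch for splitting a line into its fields follows. The names `LogEntry` and `parse_sync_log_line` are made up for illustration, and the sketch assumes the fields are whitespace-separated as pasted here.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class LogEntry(NamedTuple):
    when: datetime   # timestamp converted to an aware UTC datetime
    logger: str      # e.g. "Sync.Service"
    level: str       # e.g. "DEBUG"
    message: str     # remainder of the line


def parse_sync_log_line(line: str) -> LogEntry:
    """Split one Sync log line into timestamp, logger, level, and message."""
    millis_text, logger, level, message = line.strip().split(None, 3)
    millis = int(millis_text)
    # Integer arithmetic keeps the millisecond part exact.
    when = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    when += timedelta(milliseconds=millis % 1000)
    return LogEntry(when, logger, level, message)


entry = parse_sync_log_line(
    "1333567974991 Sync.Service DEBUG Cluster value = https://null/"
)
print(entry.when.isoformat(), entry.logger, entry.level, entry.message)
```

The decoded time for the "Starting sync" line is 19:32:54 UTC, which lines up with the 12:32:54 in its message if the client clock was on UTC-7.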
| Reporter | ||
Comment 1•13 years ago
|
||
It takes about 60 seconds to test this: in staging, for Sync 1.1 at least, run UPDATE weave.available_nodes SET available=0; That immediately broke my Aurora 2012-04-04 UI: the progress bar is stuck at 0% and, if I do Sync Now, a yellow error bar complains about NS_ERROR_UNKNOWN_HOST, probably because one of the Sync URL preferences in about:config is, in its entirety, "user/". (jbonacci got the cluster URL "https://null/" because the client is failing to handle the server's "no nodes available" null response.)
- R.
Updated•13 years ago
|
Priority: -- → P1
Summary: Sync returns "null" cluster when no node is available → Sync client mis-handles "null" cluster response when no node is available
Comment 2•13 years ago
|
||
If you "Sync Now" and an error is encountered, there is supposed to be an error bar telling you why Sync didn't work. The bug is the crummy error message.
I think it's also fair to notify the user, even during an automatic sync, if they were not assigned a node.
| Reporter | ||
Comment 3•13 years ago
|
||
Yeah, atoll saw that error bar. I did not get one, which is rather strange.
But I got the accompanying error message in the sync log.
There is/was great UI for this, back in the day. rnewman found it on IRC:
12:00 <rnewman> browser-syncui.js:157, onSyncDelay
That may be of some relevance in relinking it in the modern UI.
Comment 5•13 years ago
|
||
We have a test for this at test_service_cluster.js:90. Test output:
A 'null' response won't make a difference either.
Sync.Identity INFO Username changed. Removing stored credentials.
Sync.Identity INFO Basic password has no value. Removing.
Sync.Identity INFO Sync Key has no value. Deleting.
Sync.Service DEBUG Finding cluster for user jimdoe
Sync.Resource DEBUG No authenticator found.
Sync.Resource DEBUG mesg: GET success 200 http://localhost:8080/user/1.0/jimdoe/node/weave
Sync.Resource DEBUG GET success 200 http://localhost:8080/user/1.0/jimdoe/node/weave
Sync.Service DEBUG Cluster value = null
TEST-PASS | /Users/gps/src/services-central/obj-ff-dbg/_tests/xpcshell/services/sync/tests/unit/test_service_cluster.js | [test_setCluster : 92] false == false
TEST-PASS | /Users/gps/src/services-central/obj-ff-dbg/_tests/xpcshell/services/sync/tests/unit/test_service_cluster.js | [test_setCluster : 93] http://weave.user.node/ == http://weave.user.node/
I don't suppose it is possible that the server is sending "https://null/" as the response body? I can easily add some logging and test on stage...
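To make the distinction concrete: the test above covers a body that is literally null, while the suspicion is a body that is a well-formed URL whose host happens to be "null". A rough Python sketch of a check that rejects both cases follows. This is not the client's code (the real client is JavaScript), `parse_cluster_response` is a made-up name, and treating the hostnames "null" and "none" as invalid is an assumption for illustration only.

```python
from typing import Optional
from urllib.parse import urlsplit


def parse_cluster_response(body: str) -> Optional[str]:
    """Return a usable cluster URL, or None if no node was assigned."""
    value = body.strip()
    # Case 1: the documented "no node" signal, a literal null body.
    if value in ("", "null"):
        return None
    parts = urlsplit(value)
    # Case 2: not an http(s) URL with a host at all.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    # Case 3 (assumption): a stringified null that was wrapped into a URL.
    if parts.hostname.lower() in ("null", "none"):
        return None
    # Normalize to a trailing slash.
    return value if value.endswith("/") else value + "/"


for body in ("null", "https://null/", "http://weave.user.node/"):
    print(repr(body), "->", parse_cluster_response(body))
```

A check like this only papers over the symptom on the client side; if the server really is sending "https://null/", that still needs fixing at the source.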
Comment 6•13 years ago
|
||
The HTTP response body sent by the server is "https://null/".
I added some additional logging to service.js and captured it on stage:
1333648724641 Sync.Service INFO Find cluster response code: 200
1333648724641 Sync.Service INFO Find cluster response body: 'https://null/'
1333648724641 Sync.Service DEBUG Cluster value = https://null/
Not sure if production is affected. If so, that would be very bad. Also, I'm not sure I got the right Bugzilla component.
Component: Firefox Sync: Backend → Server: Registration
QA Contact: sync-backend → reg-server
Summary: Sync client mis-handles "null" cluster response when no node is available → Node assignment sends invalid "null" cluster response when no node is available
| Assignee | ||
Comment 7•13 years ago
|
||
Had to rewrite an entire test suite to find it, but eventually tracked it down. There's a small hole in the communication between node and snode. This patch is 2 lines to fix it, then a whole lot of new testing :P
Assignee: nobody → telliott
Attachment #612673 -
Flags: review?(rkelly)
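For readers who don't know the node/snode code, here is a purely hypothetical Python sketch of how a hole like this can arise: one layer reports "no node" as a null value, and the next layer formats whatever it receives into a URL. None of these function names come from the actual patch; the real change is in the attachment.

```python
from typing import Optional


def format_node_unguarded(node: Optional[str]) -> str:
    """Hypothetical buggy formatter: wraps any value into a URL."""
    # A stringified "null" from the other layer ends up as the hostname.
    return "https://%s/" % node


def format_node_guarded(node: Optional[str]) -> str:
    """Hypothetical fixed formatter: pass the 'no node' signal through."""
    if node is None or node in ("", "null"):
        return "null"
    return "https://%s/" % node


print(format_node_unguarded("null"))   # the bogus cluster URL
print(format_node_guarded("null"))     # the plain "no node" signal
print(format_node_guarded("node1.example.com"))
```

The guarded version returns the bare null signal that the client test in comment 5 already handles, instead of a URL the client will try to resolve.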
| Reporter | ||
Comment 8•13 years ago
|
||
:telliott thanks for the digging.
Services QA will add this to our "must test" list for AITC client.
| Reporter | ||
Updated•13 years ago
|
Whiteboard: [qa+]
Comment 9•13 years ago
|
||
I remember reading it on IRC, but can somebody please document the production (non-?)impact of this bug in a bug comment?
| Assignee | ||
Comment 10•13 years ago
|
||
The early assessment is correct. If we (as the only people using the node/snode split) ever run out of nodes, the user will get https://null/ as their cluster.
This has never happened and would pretty much require a DoS to occur, but it would be client-affecting and that's why we'll fix it.
Updated•13 years ago
|
Attachment #612673 -
Flags: review?(rkelly) → review+
| Assignee | ||
Comment 11•13 years ago
|
||
Fixed in http://hg.mozilla.org/services/server-node-assignment/rev/d3b1cda1f77f
Will open a bug to do a production push.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
| Reporter | ||
Comment 12•13 years ago
|
||
Verified as part of the Production push for core and node.
Status: RESOLVED → VERIFIED