Closed
Bug 1087636
Opened 11 years ago
Closed 9 years ago
Suboptimal handling of 503s during /meta/global fetch
Categories
(Firefox :: Sync, defect, P2)
Firefox
Sync
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox50 | --- | fixed |
People
(Reporter: rfkelly, Assigned: eoger)
References
Details
(Whiteboard: [qa+])
Attachments
(1 file)
Currently running Aurora: 35.0a2 (2014-10-21)
There was a db problem on my sync node and I got the following error log, along with an "unknown error" yellow bar. Two things in the log strike me as suboptimal.
First, a 503 on GET of /meta/global is treated like a 404 rather than network error:
* GET fail 503 https://sync-131-us-west-2.sync.services.mozilla.com/1.5/9079281/storage/meta/global
* No meta/global record on the server. Creating one.
Second, a 503 on PUT of /meta/global produces the "unknown error" yellow bar, so it's probably not checking Retry-After and triggering the right retry logic:
* PUT fail 503 https://sync-131-us-west-2.sync.services.mozilla.com/1.5/9079281/storage/meta/global
* Unable to upload new meta/global. Failing remote setup.
Updated•11 years ago
|
Whiteboard: [qa+]
Reporter | ||
Comment 1•9 years ago
|
||
> the following error log
Which I don't seem to have attached. I feel like this has been dealt with in subsequent bugs, :markh may remember more explicitly
Comment 2•9 years ago
|
||
(In reply to Ryan Kelly [:rfkelly] from comment #0)
> First, a 503 on GET of /meta/global is treated like a 404 rather than
> network error:
>
> * GET fail 503
> https://sync-131-us-west-2.sync.services.mozilla.com/1.5/9079281/storage/
> meta/global
> * No meta/global record on the server. Creating one.
It looks like this is still the case - service.js, in _remoteSetup(), has a special case for 401, but all other responses are treated as above.
> Second, a 503 on PUT of /meta/global produces the "unknown error" yellow
> bar, so it's probably not checking Retry-After and triggering the right
> retry logic:
The code appears to be checking the retry-after header (but only that header, not the x-weave-* variations etc) on 503 in that case - but only on that put (ie, the earlier 503 on get did not check for retry-after). I actually can't see how the error bar would have come up in that case - but the error bar is gone now anyway.
I think this should remain open to fix the handling of the failed get - treating everything that's not 401 the same seems wrong.
Flags: firefox-backlog+
OS: Windows 7 → All
Priority: -- → P2
Hardware: x86_64 → All
Assignee | ||
Comment 3•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/52619/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/52619/
Attachment #8752437 -
Flags: review?(markh)
Comment 4•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
https://reviewboard.mozilla.org/r/52619/#review49614
This function is complex - I think we need new tests to exercise the changes here (ie, if the existing tests still all pass, clearly they aren't covering enough!)
::: services/sync/modules/service.js:1087
(Diff revision 1)
> newMeta.isNew = true;
>
> this.recordManager.set(this.metaURL, newMeta);
> let uploadRes = newMeta.upload(this.resource(this.metaURL));
> if (!uploadRes.success) {
> this._log.warn("Unable to upload new meta/global. Failing remote setup.");
Let's add uploadRes.status to the log message here (checkServerError doesn't always log it)
::: services/sync/modules/service.js:1116
(Diff revision 1)
> STORAGE_VERSION > parseFloat(remoteVersion)) {
>
> this._log.info("One of: no meta, no meta storageVersion, or no meta syncID. Fresh start needed.");
>
> // abort the server wipe if the GET status was anything other than 404 or 200
> - let status = this.recordManager.response.status;
> + status = this.recordManager.response.status;
ISTM this should no longer be necessary (ie, that status is guaranteed to be the same value it was before)? Maybe changing the new |let status| above to be |const status|?
(What I'm concerned about is that this.recordManager.response is effectively a "global" and it's difficult to guarantee it hasn't changed via a side-effect of calling another function etc, and the comment here certainly implies it is intended to reflect the original get. It *looks* like it should still be the same object, but it's hard to ensure this as the world changes. This entire function sucks and is hard to rationalize about)
::: services/sync/modules/service.js:1118
(Diff revision 1)
> this._log.info("One of: no meta, no meta storageVersion, or no meta syncID. Fresh start needed.");
>
> // abort the server wipe if the GET status was anything other than 404 or 200
> - let status = this.recordManager.response.status;
> + status = this.recordManager.response.status;
> if (status != 200 && status != 404) {
> this.status.sync = METARECORD_DOWNLOAD_FAIL;
ISTM that you've basically prevented this block from being entered?
Attachment #8752437 -
Flags: review?(markh)
Assignee | ||
Comment 5•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52619/diff/1-2/
Attachment #8752437 -
Flags: review?(markh)
Comment 6•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
https://reviewboard.mozilla.org/r/52619/#review51142
::: services/sync/modules/service.js:1063
(Diff revision 2)
> // anything (e.g. key setup).
> _remoteSetup: function _remoteSetup(infoResponse) {
> let reset = false;
>
> this._log.debug("Fetching global metadata record");
> let meta = this.recordManager.get(this.metaURL);
Looking at this more, we can't reference .response.status here - this.recordManager.get() might return a cached version - but only the *record* is cached, not the response - so there's a chance this.recordManager.response.status will be from a completely unrelated request here?
It also looks like the existing code has the same problem down in the "// abort the server wipe if" block, but the reference a few lines down from here is OK as it ensures it really did make the request.
::: services/sync/modules/service.js:1083
(Diff revision 2)
>
> // ... fetch the current record from the server, and COPY THE FLAGS.
> let newMeta = this.recordManager.get(this.metaURL);
> + status = this.recordManager.response.status;
>
> - // If we got a 401, we do not want to create a new meta/global - we
> + if (status === 404) {
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style says to avoid "strict" equality and I don't see a reason you actually need them here.
::: services/sync/modules/service.js:1100
(Diff revision 2)
> - this._log.warn("Unable to upload new meta/global. Failing remote setup.");
> + this._log.warn("Unable to upload new meta/global (" + uploadRes.status + "). Failing remote setup.");
> this.errorHandler.checkServerError(uploadRes);
> return false;
> }
> + } else if (status === 200 && newMeta) {
> + newMeta.isNew = newMeta.isNew;
This looks wrong?
::: services/sync/modules/service.js:1109
(Diff revision 2)
> }
>
> // Switch in the new meta object and record the new time.
> meta = newMeta;
> this.metaModified = infoResponse.obj.meta;
> + } else if (status !== 200 && status !== 404) {
The 404 check looks wrong here - I'd have thought that's impossible given the changes above. Also, there should either be a trailing "else", or a comment saying why there isn't one - ie, it *looks* like maybe this should be written:
} else if (status != 200) {
logError = true;
} // else status is 200 and we got the current meta.
but this is really quite difficult to rationalize about.
::: services/sync/modules/service.js
(Diff revision 2)
> STORAGE_VERSION > parseFloat(remoteVersion)) {
>
> this._log.info("One of: no meta, no meta storageVersion, or no meta syncID. Fresh start needed.");
>
> - // abort the server wipe if the GET status was anything other than 404 or 200
> - let status = this.recordManager.response.status;
see comment above: ie, it's not clear to me that the existing check for status is valid - but also not clear that it's invalid - I'm not clear how this code finds itself in that state yet.
Attachment #8752437 -
Flags: review?(markh)
Comment 7•9 years ago
|
||
(In reply to Mark Hammond [:markh] from comment #6)
> Comment on attachment 8752437 [details]
> MozReview Request: Bug 1087636 - Better handling of server errors in GET
> /meta/global in Sync. r=markh
>
> https://reviewboard.mozilla.org/r/52619/#review51142
Aargh - reviewboard lost the "summary" comment I added. In short, I think this existing code is way too complex and possibly broken, so I think we need to restructure/rewrite parts/lots of it before we can actually fix it. Let's chat more about this.
Assignee | ||
Comment 8•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52619/diff/2-3/
Attachment #8752437 -
Attachment description: MozReview Request: Bug 1087636 - Better handling of server errors in GET /meta/global in Sync. r=markh → Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
Attachment #8752437 -
Flags: review?(markh)
Updated•9 years ago
|
Assignee | ||
Comment 9•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52619/diff/3-4/
Comment 10•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
https://reviewboard.mozilla.org/r/52619/#review58514
This looks great, thanks!
::: services/sync/tests/unit/test_service_sync_remoteSetup.js:66
(Diff revision 4)
> "/1.1/johndoe/storage/crypto/keys": upd("crypto", keysWBO.handler()),
> "/1.1/johndoe/storage/crypto": upd("crypto", cryptoColl.handler()),
> "/1.1/johndoe/storage/clients": upd("clients", clients.handler()),
> - "/1.1/johndoe/storage/meta/global": upd("meta", wasCalledHandler(meta_global)),
> - "/1.1/johndoe/storage/meta": upd("meta", wasCalledHandler(metaColl)),
> - "/1.1/johndoe/info/collections": collectionsHelper.handler
> + "/1.1/johndoe/storage/meta": upd("meta", wasCalledHandler(metaColl))
> + };
> + handlers[GLOBAL_PATH] = upd("meta", wasCalledHandler(meta_global));
Can't we just have these directly in the |handlers| assignment?
Attachment #8752437 -
Flags: review?(markh) → review+
Assignee | ||
Comment 11•9 years ago
|
||
Comment on attachment 8752437 [details]
Bug 1087636 - Better handling of server errors in GET /meta/global in Sync.
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52619/diff/4-5/
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 12•9 years ago
|
||
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/fx-team/rev/652fa69526ff
Better handling of server errors in GET /meta/global in Sync. r=markh
Keywords: checkin-needed
Comment 13•9 years ago
|
||
bugherder |
Updated•7 years ago
|
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in
before you can comment on or make changes to this bug.
Description
•