Closed Bug 1141639 Opened 11 years ago Closed 10 years ago

taskcluster-base: Define JSON schema(s) for error responses from REST API end-points

Categories

(Taskcluster :: Services, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pmoore, Assigned: jonasfj)

References

Details

Attachments

(1 file)

57 bytes, text/x-github-pull-request
jonasfj
: review+
Details | Review
When we call REST API endpoints and we get a non-200 "OK" response, the response body typically serves informative content in json format. See this full raw response for an http 401 error: HTTP/1.1 401 Unauthorized Content-Length: 157 Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT Access-Control-Allow-Origin: * Access-Control-Request-Method: * Connection: keep-alive Content-Type: application/json; charset=utf-8 Date: Tue, 10 Mar 2015 16:51:07 GMT Server: Cowboy Via: 1.1 vegur X-Powered-By: Express { "message": "Malformed URL parameters", "error": [ { "message": "Parameter 'runId' does not match [0-9]+", "error": "\u0000" } ] } It would be useful to define one or more json schemas that define the syntax of the error messages, so that they can be programmatically inspected. It might be necessary to provide more than one json schema if different types of services which to provide different data structures, or if different http response codes might issue different types of data. However a single schema might be enough. This could also be used by client libraries to return structured error data back from api call methods.
s/which to provide/wish to provide/ :p
We absolutely should do this. But we need to figure out what kinds of errors we have, then extend base.API to add not just a request.reply method, but also a request.error[Type] method. So let's try to list error types: 400, Input Syntax Malformed When schema validation fails we can probably do some structure form of error info. Info: some form of structured error message from JSON schema validator. 400, Input Invariant Violation When some extra check we do on the input is violated. Example: `task.deadline` is in the past. Info: details = {...} human readable JSON dictionary with data subset that caused violation. (not meant to be interpreted by computers) 401, Authorization Failed When the API end-point requires authentication, but: - No authentication is provided, - repeated nonce is detected, - Signature is invalid, - the credentials have expired, or - the client doesn't have scopes required Info: - reason: signature-missing, signature-invalid, signature-expired, credentials-expired (this will be an enum, but we may add additional entries later) - list of scopes required. (no info about what scopes the client has) 404, Resource Not Found When an entity, task, run, etc. isn't found. Info: (no structured info) 409, Request Conflict When a request fails because a client-side assumption about server state failed. Example: trying to claim a task, but the task is already claimed. This can happen when racing and Info: (no structured info) 413, Input Too Large When input is too large, example task with a definition that is > x kb. Info: size of input given and max allowed input size. (I'm not sure we need this error) 500, Internal Server Error When something bad happens server side. Info: incidentId (uuid) that is printed to logs along with any error messages. (error messages are NOT returned to client as the trace mean leak secrets) In addition to whatever structured info attached, all the error messages also contain a message explain the error in human readable terms. Preferably long enough to do so in some form of details. ### Include an error code: I suggest we include a `code` property with a value as: InputSyntaxMalformed, InputInvariantValidation, AuthorizationFailed, ... as listed above. We could do a statically typed list with more specific error code, like a code for every possible input invariant violation. But it would be a pain to maintain this list, I doubt it would be used much. We could aso make a dynamic list, so we just make up error codes as we go along. Honestly though I suspect that it doesn't pay off to categories errors further. We are going to handle some error in special ways like 404 and 409, which are valid when racing, or getting in-exact advice from azure queue storage. But for an input invariant violation like task.deadline being in the past, there never any good reason to be handling this kind of error. It only makes sense to log it. The same goes for an error like task.expires < task.deadline, I don't ever see a case were we would want to handle it in a catch-block.
Summary: Define json schema(s) for error responses from REST API endpoints → taskcluster-base: Define JSON schema(s) for error responses from REST API end-points
Note to self: We might actual want to include an incidentId with all error message. Why not it'll only make them easier to find in the log. Granted we should fix logging too. And differentiate between warnings and debug messages.
I think the problem here is deciding on what granularity is useful when handling errors. Relying on HTTP status code is a little too coarse, but I think inventing special errors codes for every possible condition is too broad and makes it hard to handle errors without forgetting some. And I think it's hard to formalize JSON schemas describing the contents of a "code" dependent "info" object. See Comment 2, for examples of what info contains depending on "code". @garndt, do you have any thoughts on this, I think docker-worker is one of the very few cases that actually cares about what kind of error was returned.
Flags: needinfo?(garndt)
As I look over docker-worker and our interaction with the queue, it seems the worker is more concerned with just trying to determine if it should retry the request (although taskcluster-client should handle this and now there is a retry/backoff) or if it's a fatal thing and if so, is the task failed or an exception. IF there was a way to determine this without relying on HTTP codes that would be awesome.
Flags: needinfo?(garndt)
See Also: → 1159656
Component: TaskCluster → General
Product: Testing → Taskcluster
Component: General → Platform Libraries
Attached file github PR
Please give this a review... It requires: let api = new base.API({ ... errorCodes: { MyCustomError: 404 // a mapping from custom errors to error code ... } }); And api.router({...., raven}), may be provided where raven is null or an raven client for sentry: https://github.com/getsentry/raven-node Hence, exception will be sent to sentry so we can keep track of them. (Not providing a raven client is strictly optional).
Assignee: nobody → jopsen
Status: NEW → ASSIGNED
Attachment #8705737 - Flags: review?(jhford)
Comment on attachment 8705737 [details] [review] github PR Carrying on with "This looks good to me! I'm looking forward to using it." as r+
Attachment #8705737 - Flags: review?(jhford) → review+
Blocks: 1226597
This landed a few days ago... just updated taskcluster-base..
meant to close this...
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: Platform Libraries → Services
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: