Closed
Bug 1435280
Opened 8 years ago
Closed 7 years ago
Validation of json-e inputs to API endpoints
Categories
(Taskcluster :: Services, enhancement)
Taskcluster
Services
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: pmoore, Unassigned)
Details
We don't have an easy way to specify in our API references that a given input contains a json-e template, and provide strict validation against that template input to guarantee that it can only generate json that complies to a given json schema. We can only specify that an input itself complies to a given json schema, not the transformation of it as json-e.
Perhaps our API references need to have an additional optional parameter to specify that the input is a json-e template, that when rendered, should validate against the given input json schema, rather than the input json directly needing to validate against the given input json schema...
Something like:
"jsonEInput": true
(default false)
Not sure if we should handle this for output schemas too, or just input schemas.
| Reporter | ||
Comment 1•8 years ago
|
||
Note, this does somewhat further bind json-e to our core platform, and make it a more critical piece. We should think carefully about whether we really want to impose this legacy transformation technology on our platform at its core. Note, in the case of task templating (such as used by taskcluster-github and hooks service), a decision task already achieves the same without any binding to a legacy templating technology, giving users the freedom to use tools and languages of their choice, rather than our legacy tool supported only in two languages.
| Reporter | ||
Comment 2•8 years ago
|
||
s/legacy/proprietary/g :-)
Comment 3•8 years ago
|
||
There's also a problem that we're going to have to execute user-generated templates in order to validate the output. That means that a bug in the JSON-E execution could be used to access other parts of program memory or the program environment. We should probably figure out how to sandbox the JSON-E execution context. If we're going to enable this in production, we should strongly consider consulting with the security engineering group to do a thorough security review of JSON-E.
Comment 4•8 years ago
|
||
When our APIs accepts a json-e template then correct way to document it in json-schema is:
type: object
description:
json-e template that given a context: `{param1, param2, ...}` renders to `{prop1: ..., prop2: ...}`,
which bla bla bla...
You might mention which json-schema the result should validate against.. Or you may not.
But ultimately, a json-e template is an object and all you can do in the schema is to give it a human description.
IMPORTANT:
We cannot statically evaluate if a json-e template is error-free.
We can only determine if a json-e template is error-free with respect to a specific input.
It might be possible, but doubtful, that in some future we could determine what parameter typing a json-e template requires.
I'm not sure we'll get it done. Regardless I'm confident that we can't express valid json-e templates in json-schemas, so we shouldn't try.
I don't know why we would want a programmatic way to identify that an object is a json-e template.
If we did, we could just indicate it with:
$ref: 'https://taskcluster.github.io/json-e/schema.json'
Technically, I think this document would say nothing of importance... Because it can't really make much in terms of checks.
But it could be used as a trick to indicate that a property is a json-e template.
I see no reason we would want this.
@pmoore, can we close this?
----
> rather than our legacy tool supported only in two languages
json-e is neither legacy nor proprietary -- and it's implemented in 3 languages :)
> If we're going to enable this in production, we should strongly consider consulting with the security engineering group to do a thorough security review of JSON-E.
Agree, json-e is a security risk.
But note: there is a reason we wrote our own interpreter, that is specifically done with security in mind.
But I'm not oppose to the idea of using a sub-process for json-e evaluation in services that do this at a low frequency.
I've filed an issue for json-e to have a low-privileged sandboxed mode of evaluation, offering extra security:
https://github.com/taskcluster/json-e/issues/230
(that is strictly unrelated to this bug)
Flags: needinfo?(pmoore)
| Reporter | ||
Comment 5•8 years ago
|
||
(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #4)
> I don't know why we would want a programmatic way to identify that an object
> is a json-e template.
Until now, API endpoints have validated inputs, so that data that was structurally invalid couldn't be submitted. For example, if you submit a task when calling queue.createTask, we have a schema that checks you've provided a data structure that the queue expects. Now if you create a hook with a task template, you could submit an invalid task template, which would be accepted, and only when the hook is triggered, would the hooks service notice that the generated task is structurally invalid.
So we've lost the ability to detect invalid inputs at submission time. Furthermore, we don't have a formal way to define that the input should be a json-e template that renders to json that complies with a given json schema. We can only write this in comments.
> @pmoore, can we close this?
I would prefer not to. If you disagree, let's meet on vidyo to discuss.
Flags: needinfo?(pmoore)
Comment 6•8 years ago
|
||
We have never had perfect validation -- it's possible to call createTask with a payload that the worker won't accept, for example. There are lots of other checks in various API calls that could be structural or not, depending on your perspective.
There's been a lot of debate within the team about whether to have the queue check the well-formedness of task payloads. I think the same is true here.
I agree that it is impossible, given a JSON-e template, to validate that all possible outputs would conform to a given JSON schema (in finite time..). So I think that option is off the table. I do think we could do some minor validation of JSON-e, rejecting a few common typos and errors using schema only. For example, if an object has a property that begins with a single '$', then it must be one of the defined operators, and depending on which operator it matches, the other properties must fit a particular schema. This would still miss typing errors, undefined variables, and so on -- I'm pretty sure those are not expressible in JSON-schema -- but might be useful all the same.
I think the benefit here is small over just requiring `{type: object, description: "...JSON-e..."}`, but if someone wants to dig into JSON schemas, please don't let me stop you :)
Comment 7•8 years ago
|
||
So far, the two people who understand the situation say that this is impossible.
Pete, compare this to task.extra -- it's a place where a user can put arbitrary content. So it has no defined structure. It happens that arbitrary content is sort-of executable (in a very limited, loop-free language), which brings things like the halting problem into play.
You might also compare it to task.payload.command: while we can verify that the command is structurally correct (for various definitions of correct depending on worker implementation), we can't in any way verify that it is "correct" without executing it in the context of a task.
So I think that from a user perspective, the places where we use JSON-e -- like hooks' task templates -- are programs that the user writes and then debugs by trying to execute them. That's not something the user would expect a schema to be able to determine.
Can we close this?
Flags: needinfo?(pmoore)
| Reporter | ||
Comment 8•7 years ago
|
||
Fair enough, sounds like json-e templates fundamentally can't be validated, other than checking they are valid json which we already do. Thanks for the feedback.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(pmoore)
Resolution: --- → INVALID
| Assignee | ||
Updated•7 years ago
|
Component: Platform Libraries → Services
You need to log in
before you can comment on or make changes to this bug.
Description
•