Closed Bug 1145220 Opened 9 years ago Closed 6 years ago

Our deploy process for api references and json schemas should ensure that all references continue to validate against schemas

Categories

(Taskcluster :: Services, defect, P5)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pmoore, Unassigned)

In other words, if we e.g. update a reference, such as:
http://references.taskcluster.net/queue/v1/api.json

then our deployment process should automatically validate that it conforms to:
http://schemas.taskcluster.net/base/v1/api-reference.json

i.e. not allow a reference to be deployed that does not validate against its schema.

Likewise we should probably ensure that if a schema is updated, that all existing references that validate against it, can still validate successfully. This is slightly trickier, since we would need to know which references point to any given schema. However, maybe that can be solved if we bump or hash the version number, so that we never replace an old schema (I believe we currently reuse "v1").
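As a rough illustration of the "hash the version" idea, here is a minimal sketch that derives a schema filename from the schema's content, so a changed schema gets a new name instead of overwriting "v1". The function name, the "api-reference" stem, and the 12-character digest length are assumptions made for this example, not anything that exists in taskcluster-base.

```python
import hashlib
import json


def content_addressed_name(schema: dict, stem: str = "api-reference") -> str:
    """Return a filename that changes whenever the schema content changes.

    Hypothetical helper: the stem and digest length are illustrative only.
    """
    # Serialize with sorted keys and fixed separators so that logically
    # identical schemas always produce the same bytes, and so the same hash.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{stem}-{digest[:12]}.json"
```

With names like this an old schema is never replaced, so a reference that validated when it was published keeps pointing at the exact schema it was checked against.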

Currently http://references.taskcluster.net/queue/v1/api.json does not validate against http://schemas.taskcluster.net/base/v1/api-reference.json which I have raised separately in bug 1134264.
This is already done: they validate against the schemas that are in the taskcluster-base repository.
I think this is more stable than loading from references.taskcluster.net in production, which is
sketchy for robustness.

If this is satisfactory, let's close this and push new references from taskcluster-base to references.taskcluster.net. The utility for doing so is in tc-base, so I can do it. (Note: I added a comment about it in another branch.)

> Likewise we should probably ensure that if a schema is updated, that all existing references that
> validate against it, can still validate successfully. 
Ideally, yes. In practice, we shouldn't start doing this until we've decided that the
reference format is stable. We clearly have outstanding issues with it, such as the discussion about
using swagger and the ability to add querystring parameters to GET requests (at minimum).

@pmoore, I think it's best to be practical for now and not lock things down by declaring them stable.
That's why we have version numbers such as 0 and 0.2.0.

Note, I'm open to starting design discussions on getting to a point where we're reasonably confident locking down how references look.
I think part of this is considering swagger/json-hyper-schema vs. our own. Part of it is considering
pulse guardian and discussing if exchange references could be a pulse feature. And part of it involves
figuring out how docs should be generated on-deploy of components.
Flags: needinfo?(pmoore)
Summary: Our deploy process for api references and json schemas should ensure that all references continue to validate against schemas → taskcluster-base: Our deploy process for api references and json schemas should ensure that all references continue to validate against schemas
OK, let's park this then for a later time, agreed.

Maybe an auto-deploy after an r+ and merge would be good just to make sure published schemas are consistent with latest versions in vcs, but the other matters we can sketch out when the schema definition framework decisions have been made.

Thanks!
Flags: needinfo?(pmoore)
Component: TaskCluster → General
Product: Testing → Taskcluster
Component: General → Platform Libraries
Component: Platform Libraries → Platform and Services
See Also: → 1433672
Component: Platform and Services → Redeployability
Blocks: 1457584
Assignee: nobody → pmoore
Pete, this could happen easily in the tc-references build process if you want to do it there.

Also, we will want to enforce some compatibility between references (as in, something to detect if we make a non-backward-compatible change to an API method).  Maybe that could happen there, too?
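A compatibility check of that kind could be sketched roughly as follows. This assumes a reference is a JSON object with an "entries" list whose items carry "name", "method", and "route" keys; that shape, and the rule that only removals and method/route changes count as breaking, are assumptions for this example rather than a statement of what tc-references does.

```python
from typing import List


def breaking_changes(old_ref: dict, new_ref: dict) -> List[str]:
    """List changes in new_ref that would break callers of old_ref.

    Assumed reference shape: {"entries": [{"name", "method", "route"}, ...]}.
    """
    old_entries = {e["name"]: e for e in old_ref.get("entries", [])}
    new_entries = {e["name"]: e for e in new_ref.get("entries", [])}
    problems = []
    for name, old in sorted(old_entries.items()):
        new = new_entries.get(name)
        if new is None:
            # An API method that existed before has disappeared.
            problems.append(f"{name}: removed")
            continue
        for field in ("method", "route"):
            if old.get(field) != new.get(field):
                problems.append(
                    f"{name}: {field} changed from "
                    f"{old.get(field)!r} to {new.get(field)!r}"
                )
    return problems
```

Newly added methods are deliberately not reported, since adding a method does not break existing callers. A build step could fail whenever the returned list is non-empty.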
Priority: -- → P5
Per a discussion on IRC today, my understanding of the problem is that:

1. taskcluster-references is a build artifact of a cluster
2. taskcluster-references contains a static list of schemas
3. taskcluster-lib-api has a duplicated copy of the api reference schema

With those in mind, my understanding of the problem here is that we need to figure out which API references schema to use when validating API references.

My thought is that we should pick the semver-greatest version of lib-api from the package.json of all services which are part of a deployment.  Since we cannot make backwards-incompatible changes to the reference schema, the semver-greatest version of lib-api in use will have the most complete schema, and should be used.  Whatever is doing the deployment should load all the package.json files and figure out which version of lib-api to use the schema from.  Unless we lock in the file path in the deployment process, we should probably have a field in the lib-api package.json which is a path relative to package.json where the schema is to be found.
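The selection step described above might look something like the sketch below. It assumes each service's package.json has already been parsed into a dict, that the dependency is listed under "dependencies" as "taskcluster-lib-api", and that a range such as "^10.0.0" can be treated as its base version. Pre-release tags are ignored. All of these are simplifying assumptions for illustration.

```python
import re
from typing import Iterable, Tuple

Version = Tuple[int, int, int]


def parse_version(spec: str) -> Version:
    """Extract major.minor.patch from a dependency spec such as '^10.0.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", spec)
    if match is None:
        raise ValueError(f"cannot find a major.minor.patch version in {spec!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def greatest_lib_api(package_jsons: Iterable[dict],
                     dep: str = "taskcluster-lib-api") -> Version:
    """Return the greatest version of `dep` used by any service."""
    versions = [
        parse_version(pkg["dependencies"][dep])
        for pkg in package_jsons
        if dep in pkg.get("dependencies", {})
    ]
    if not versions:
        raise ValueError(f"no service depends on {dep}")
    # Tuples of ints compare numerically, so 10.0.0 beats 9.12.0,
    # which a plain string comparison would get wrong.
    return max(versions)
```

The deployment tooling would then read the schema from whichever installed copy of lib-api matches the returned version.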
I think this (plus bug 1476602) will be easy enough to solve, and while not critical to forward progress on r13y it will be good to clear up this unfinished business.

Each service publishes a version of its API references that matches the schema in the version of tc-lib-api that it links against.  That document should include a $schema indicating which schema that is (with some complexity around what hostname that URI should have).  Tc-lib-api can validate that at service test time, and tc-references can re-verify it (against its own copy of the schema) at cluster build time.
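A minimal sketch of the re-verification step, assuming the gathered schemas are held in a dict keyed by the URI that appears in each reference's $schema field. The actual validation is delegated to a caller-supplied function, because real JSON Schema validation needs a proper library. The `require_keys` helper here only checks the schema's "required" list and is a stand-in for illustration, not a real validator.

```python
from typing import Callable, Dict


def verify_reference(reference: dict,
                     schemas: Dict[str, dict],
                     validate: Callable[[dict, dict], None]) -> None:
    """Raise ValueError unless `reference` names a known schema and passes it."""
    schema_uri = reference.get("$schema")
    if schema_uri is None:
        raise ValueError("reference does not declare a $schema")
    schema = schemas.get(schema_uri)
    if schema is None:
        raise ValueError(f"reference points at unknown schema {schema_uri}")
    # `validate` is expected to raise on failure.
    validate(reference, schema)


def require_keys(document: dict, schema: dict) -> None:
    """Stand-in validator: checks only the top-level 'required' keys."""
    missing = [key for key in schema.get("required", []) if key not in document]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
```

The same function could run in both places mentioned above, at service test time with lib-api's copy of the schema and at cluster build time with the gathered copy.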

To avoid duplication, we can have tc-lib-api upload its version of the schema in each service.  Then when tc-references gathers all the schemas and references, if it finds two different schemas with the same filename, it can throw an error.  But assuming v3 is the same everywhere and v4 is the same everywhere, it can just gather them up and serve them.
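The gathering rule could be sketched as below, assuming the input is a mapping from service name to the schemas that service uploaded, keyed by filename. The function name and input shape are assumptions for this example.

```python
from typing import Dict


def gather_schemas(per_service: Dict[str, Dict[str, dict]]) -> Dict[str, dict]:
    """Merge the schemas uploaded by every service into one set.

    Raises ValueError if two services upload different content under
    the same filename.
    """
    gathered: Dict[str, dict] = {}
    origin: Dict[str, str] = {}
    for service, schemas in sorted(per_service.items()):
        for filename, schema in schemas.items():
            if filename in gathered and gathered[filename] != schema:
                raise ValueError(
                    f"{filename}: {service} and {origin[filename]} "
                    "uploaded different schemas"
                )
            # First upload wins; identical later uploads are no-ops.
            gathered.setdefault(filename, schema)
            origin.setdefault(filename, service)
    return gathered
```

Because the result contains only what services actually uploaded, a schema version that no service uses simply does not appear in the output.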

This means that when no services publish API references with a particular schema (say, v1) then tc-references will no longer serve that schema (so /schemas/common/api-reference-v1.json might go from working in one build of Taskcluster to a 404 in the next, if nothing in the second build refers to that version).  But that seems OK, and the most practical way to avoid it is to duplicate the schemas into tc-references which brings its own problems.
Assignee: pmoore → dustin
Summary: taskcluster-base: Our deploy process for api references and json schemas should ensure that all references continue to validate against schemas → Our deploy process for api references and json schemas should ensure that all references continue to validate against schemas
Assignee: dustin → nobody
We solved this in RFC 128.
Blocks: 1506980
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: Redeployability → Services

I would like to take it!

It's already solved! I'm curious how you came to this bug?

I was just looking for unsolved bugs on Bugzilla and didn't realize it was already solved!
Sorry
