Closed Bug 1463260 Opened 6 years ago Closed 6 years ago

docs builds are failing because rootUrl isn't set

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: bstack)

References

Details

AssertionError [ERR_ASSERTION]: A `rootUrl` must be provided to taskcluster-lib-validate!
    at validator (/app/node_modules/taskcluster-lib-validate/src/validate.js:24:3)
    at setup (/app/src/main.js:39:23)
    at /app/node_modules/taskcluster-lib-loader/lib/loader.js:337:28
    at tryCallOne (/app/node_modules/promise/lib/core.js:37:12)
    at /app/node_modules/promise/lib/core.js:123:15
    at flush (/app/node_modules/asap/raw.js:50:29)
    at _combinedTickCallback (internal/process/next_tick.js:131:7)
    at process._tickDomainCallback (internal/process/next_tick.js:218:9)
    at Function.Module.runMain (module.js:695:11)
    at startup (bootstrap_node.js:191:16)
    at bootstrap_node.js:612:3

which makes sense -- we have no rootUrl at build time.  That needs to come at runtime.
The rootUrl is necessary to generate full $id URLs.

17:16:31 <•dustin> so I think there are two issues
17:16:33 <•dustin> 1. `schema` or `input`/`output` properties in references JSON
17:17:08 <•dustin> those could be easily relativized (just keep the values from the declarations, and require consumers to apply libUrl.schema(..))
17:17:40 <•dustin> 2. `$ref` used between schemas in the same app -- which hasn't been addressed yet at all
Blocks: 1459751
Assignee: dustin → bstack
Status: NEW → ASSIGNED
https://tools.ietf.org/html/draft-handrews-json-schema-01#page-11
```
8.1.  Initial Base URI

   RFC3986 Section 5.1 [RFC3986] defines how to determine the default
   base URI of a document.

   Informatively, the initial base URI of a schema is the URI at which
   it was found, or a suitable substitute URI if none is known.
```

so a relative URI is to be treated as relative to the current schema.


```
8.2.  The "$id" Keyword

   The "$id" keyword defines a URI for the schema, and the base URI that
   other URI references within the schema are resolved against.  A
   subschema's "$id" is resolved against the base URI of its parent
   schema.  If no parent sets an explicit base with "$id", the base URI
   is that of the entire document, as determined per RFC 3986 section 5
   [RFC3986].

8.2.1.  Identifying the root schema

   The root schema of a JSON Schema document SHOULD contain an "$id"
   keyword with an absolute-URI [RFC3986] (containing a scheme, but no
   fragment), or this absolute URI but with an empty fragment.
```

So a schema document with `{"$id": "/schemas/foo/v1/bar.json#"}` isn't allowed.  I suspect that doing disallowed things will cause various json-schema implementations to barf.

Maybe the right answer here is to make tc-lib-validate load all of the schemas, and be willing to write them out with relative (invalid!) top-level $id properties, but not actually validate anything until it's been given a rootUrl.  Sort of the way we have APIBuilder + rootUrl -> API, we would have ValidatorBuilder + rootUrl -> Validator.

The change you just made to tc-lib-api is to allow APIBuilder to produce a rootUrl-free API reference.  So I think we'd do the same for lib-validate, where a ValidatorBuilder can produce a rootUrl-free (invalid!) schema.  We would then add code to whatever is serving /schemas that will fix the $id's before serving.

We'll need to do similar pre-processing when generating clients that process the schemas (Java, Go).  Maybe those just apply the testRootUrl().

The other (maybe simpler) option is to use a temporary rootUrl for everything before deployment, like `https://rootUrl`.  Then we would have `{"$id": "https://rootUrl/foo/v1/bar.json#"}` which is valid and can be used for client and docs generation.  We can still have code dynamically rewrite those when serving /schemas.

By the way, I added an item to the service-migration gist to make all $ref's relative.  That should fix the $ref issue I mentioned in comment 1.
Blocks: 1462761
Thinking about this overnight, I think I like a combination of both options:

Define ValidatorBuilder -> Validator and ExchangesBuilder -> Exchanges just like APIBuilder -> API.

ValidatorBuilder.schemas, ExchangesBuilder.reference, and APIBuilder.reference return "generic" schema refs and IDs of the form `https://taskcluster/<path>`.

Validator.schemas, Exchanges.reference, and API.reference return "localized" schema refs and IDs based on the rootUrl.

The advantage is that at each stage, everything is consistent, and $id's are fully formed.  It's just that the "generic" version schema URIs are only URNs (not suitable for locating the resource, just a name), while in the "localized" version they are URLs (you'll get the schema if you fetch it).

The docs deployment or runtime will need to localize the schemas and references, which should be a pretty easy thing to define.

Maybe "ValidatorBuilder" etc. aren't the greatest names..
I think this makes sense. I also don't super-like the names but I can live with it.

In my head the steps to localize schemas/references in docs deployment/runtime (I think I lean towards deployment here) will just be 

schema: deepclone while replacing https://taskcluster with $rootUrl in every $id and $ref field
reference: go through and replace https://taskcluster with $rootUrl in every input/output field (and wherever they are referenced in exchanges)

Will that be sufficient?
Yes -- and we can save some effort by using relative schema URIs in exchange and API references.  I don't really mind what we do there
See Also: → 1091780
Commit pushed to master at https://github.com/taskcluster/taskcluster-lib-api

https://github.com/taskcluster/taskcluster-lib-api/commit/c61d80522f0de6636929335706cd8aa746d0b678
Bug 1463260 - [BREAKING] Make references and schemas redeployability-friendly (#100)

This library now requires taskcluster-lib-validate 11.0.0 or higher, and thus requires a schemaset instead of a validator as input.

The `input` and `output` values of each API method must now be simple filenames (`foo-request.yaml`) without any path; they are suffixed to the schema URL, service name, and version at runtime.

The references generated by this library are now entirely relative to the rootUrl -- that is, they need not be specialized per deployment of Taskcluster.  Specifically, the `input` and `output` properties in `api-

Changes:
* Bug 1463260 - Fix rootUrl for docs writing
* make schema.{input,output} always relative URIs
relative to the service schema, <rootUrl>/schemas/<serviceName>
* upgrade taskcluster-lib-validate
OK, all of the libraries are landed, and I've updated
  https://gist.github.com/djmitche/010023b33f05bd57752397a21e0763e7

Tc-secrets is, I think, ready to land, but I've held off pushing the button.
Commits pushed to master at https://github.com/taskcluster/taskcluster-docs

https://github.com/taskcluster/taskcluster-docs/commit/27a7bed668426277a5405db0bdf34aaf4190575a
Bug 1463260 - Render new format of schemas and refs

https://github.com/taskcluster/taskcluster-docs/commit/0343f80e001e12ea115c7a7f691a11a867d1b720
Merge pull request #261 from taskcluster/render-new-schemas-refs

Bug 1463260 - Render new format of schemas and refs
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: Redeployability → Services
You need to log in before you can comment on or make changes to this bug.