Implementing a Rust-based code generator
(Firefox Build System :: General, enhancement, P3)
(Reporter: bdk, Unassigned)
I'm trying to extend UniFFI to autogenerate JS backends for application-services components. It's been tried before, but hopefully this time will work.

One question I have is how to integrate this into the build system. The basic question here is: how can we have the build system build a Rust binary that then gets used to generate other files?

HOST_RUST_PROGRAM can create a binary, but there are some additional requirements:
- The binary needs to be built before it's required for code generation.
  - nalexander suggests the export or pre-export tier. Would that work? How can I configure that?
- We need to be able to use the binary to generate the code.
  - Could GeneratedFile work? I think the only thing needed is some way for the script to figure out the path of the compiled binary. (A rough sketch follows this list.)
- Am I missing any others?
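To make that concrete, here is a rough moz.build sketch of the wiring being asked about. The program name, script name, inputs, and output file are all placeholders, and whether the tier/ordering requirement can be expressed at all is exactly the open question:

```python
# moz.build sketch -- names and outputs are placeholders.

# Build the generator as a host binary (HOST_RUST_PROGRAMS is the
# moz.build variable backing the HOST_RUST_PROGRAM idea above).
HOST_RUST_PROGRAMS += ["uniffi-bindgen-gecko-js"]

# Run a Python script that invokes the compiled host binary to produce
# the generated sources; how the script finds that binary is the part
# that still needs an answer.
GeneratedFile(
    "UniFFIGeneratedScaffolding.cpp",
    script="gen_uniffi_bindings.py",
    entry_point="generate",
    inputs=["uniffi.toml"],
)
```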
(In reply to Ben Dean-Kawamura from comment #0)
> HOST_RUST_PROGRAM can create a binary, but there are some additional requirements:
> - The binary needs to be built before it's required for code generation.
>   - nalexander suggests the export or pre-export tier. Would that work? How can I configure that?
> - We need to be able to use the binary to generate the code.
>   - Could GeneratedFile work? I think the only thing needed is some way for the script to figure out the path of the compiled binary.

Yes: once you have a genuniffibindings (or whatever), GeneratedFile can have a gen_uniffi_bindings.py that invokes $topobjdir/host/genuniffibindings.

> - Am I missing any others?

In theory this is pretty similar to the various existing Rust bindings stuff we have. Does this really need to be in the tree, or can it be a separate toolchain task like cbindgen? That might simplify things enormously, at the cost of slowing update cycles and probably some other things.

If you try to do this all in-tree, you're going to have a hard time accommodating artifact builds, which by definition don't have the compile environment needed to produce genuniffibindings. We don't really have any way to "take the generated inputs from the upstream artifacts", but we are growing scenarios where we want that. See some of my thoughts over in https://bugzilla.mozilla.org/show_bug.cgi?id=1751762, which would be one path to supporting this.
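A minimal sketch of what such a GeneratedFile script could look like, assuming the $topobjdir/host/genuniffibindings location mentioned above and a made-up command-line interface for the generator (the real uniffi-bindgen-gecko-js CLI may differ):

```python
# gen_uniffi_bindings.py -- sketch only; binary location and CLI are assumptions.
import os
import subprocess

import buildconfig  # exposes topobjdir/substs to in-tree generation scripts


def generate(output, *inputs):
    # Locate the compiled host program in the object directory.
    bindgen = os.path.join(buildconfig.topobjdir, "host", "genuniffibindings")

    # Run it over the declared inputs and write whatever it prints into
    # the GeneratedFile output.
    result = subprocess.run(
        [bindgen, *inputs],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    output.write(result.stdout)
```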
Reporter | Comment 2 • 3 years ago
I think treating it like cbindgen should work okay. Does that get built during mach bootstrap? How can I add a toolchain task?

Is it possible to keep the Rust crate in-tree, but exclude it from the normal build system? That way you could make some changes to the code and test the effects without having to publish a commit. If you had to run mach bootstrap or some other command to update the tools, that wouldn't be too bad.
Comment 3 • 3 years ago
I'm going to move some Matrix discussion over here, with a bit more detail.

> I think treating it like cbindgen should work okay. Does that get built during mach bootstrap?

cbindgen gets built in CI (in the same way that Rust is built in CI) when the version changes in toolchains.yml.
./mach bootstrap (and auto-bootstrapping) just downloads the platform-specific compiled artifact from Taskcluster.

> Is it possible to keep the Rust crate in-tree, but exclude it from the normal build system?

Yep, we do this for geckodriver (also see build-geckodriver.sh).

> That way you could make some changes to the code and test the effects without having to publish a commit.

If you have the code locally, you can follow your build script's steps on your own machine (running cargo build in your project, I'm guessing) to work on it. Additionally, once you have a toolchain task configured, you can run it on try to test it further.

> How can I add a toolchain task?

Your knowledge and mine here are at about the same level :)
I'd recommend looking in taskcluster/ci/toolchain/geckodriver.yml and following how it works.
You'll probably want to add it to auto-bootstrapping, which can be done somewhere around bootstrap.configure (unsure offhand).
I'm a little fuzzy here, but hopefully this is enough information to get you well on your way.
Glandium has been managing a fair chunk of toolchains, so I'd recommend asking him for tips on how to improve ergonomics for iterating on your new toolchain.
Reporter | Comment 4 • 3 years ago
CCing Glandium. I'm hoping you can help me understand a few things about toolchain tasks:
- How does downloading toolchain artifacts work? If I bump the version on a toolchain component and push the commit to try, how does the try build find the new artifact? How do regular builds find the older artifact before the new commit lands?
- Is there any system to manually rebuild toolchain binaries locally? If I wanted to test a change to geckodriver, is there a way for me to get the new version without creating a commit and pushing it?
(In reply to Ben Dean-Kawamura from comment #4)
> CCing Glandium. I'm hoping you can help me understand a few things about toolchain tasks:
> - How does downloading toolchain artifacts work? If I bump the version on a toolchain component and push the commit to try, how does the try build find the new artifact? How do regular builds find the older artifact before the new commit lands?

The "version" of a toolchain is determined by a hash of specific inputs: see https://searchfox.org/mozilla-central/rev/48c71577cbb4ce5a218cf4385aff1ff244dc4432/taskcluster/ci/toolchain/geckodriver.yml#20-25. So builds always find their toolchains by looking at those hashes, allowing lots of different versions to co-exist.

> - Is there any system to manually rebuild toolchain binaries locally? If I wanted to test a change to geckodriver, is there a way for me to get the new version without creating a commit and pushing it?

Generally, toolchain build scripts are "just" doing some existing process and putting the result in the right form for other jobs to consume. So you can "rebuild" geckodriver locally with cargo, etc. Then you put it in place and just use it in the build while you iterate. Eventually, you push it to try, it gets built, and you start consuming it without the local build part.
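To illustrate the hash-based versioning described above (this is a conceptual sketch, not the actual taskgraph code): the key a build looks up is derived from the toolchain task's inputs, so changing any input yields a new artifact while older trees keep resolving to the old one.

```python
# Conceptual illustration only -- not how taskgraph actually computes keys.
import hashlib


def toolchain_digest(build_script: str, attributes: dict) -> str:
    """Digest the inputs that define a toolchain build."""
    h = hashlib.sha256()
    h.update(build_script.encode())
    for key in sorted(attributes):
        h.update(f"{key}={attributes[key]}".encode())
    return h.hexdigest()[:12]


# Bumping a fetch revision (or editing the build script) changes the digest,
# so a try push and an older tree each find their own artifact.
old = toolchain_digest("build-geckodriver.sh", {"fetch": "rev-aaaa"})
new = toolchain_digest("build-geckodriver.sh", {"fetch": "rev-bbbb"})
assert old != new
```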
Reporter | Comment 6 • 3 years ago
> Then you put it in place and just use it in the build while you iterate

Can you explain more about this process? Should I put it in ~/.mozbuild like cbindgen? In RunCBindgen.py I notice it uses buildconfig.substs["CBINDGEN"]. How can I arrange for our binary to be listed in buildconfig.substs?

Thanks for all the help!
(In reply to Ben Dean-Kawamura from comment #6)
> > Then you put it in place and just use it in the build while you iterate
>
> Can you explain more about this process? Should I put it in ~/.mozbuild like cbindgen? In RunCBindgen.py I notice it uses buildconfig.substs["CBINDGEN"]. How can I arrange for our binary to be listed in buildconfig.substs?

This is all determined at configure time. Continuing with our geckodriver example, see this code run as part of mach configure -- although that is rather more complicated, with the non-trivial default and the handling of !COMPILE_ENVIRONMENT. For a simpler example, see perhaps NSIS or git. Basically, check_prog is what you want. I don't know all the details around auto-bootstrapping toolchains, but I think it's based on the local-toolchain flag and the bootstrap flag to check_prog.
> Thanks for all the help!

yw. :glandium or :mhentges can fill in the details about auto-bootstrapping; I've not yet needed to add such a toolchain.
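A rough moz.configure sketch of the check_prog approach described above. The variable name, program name, and toolchain task name are hypothetical, and the exact bootstrap/when wiring would need to match however the real toolchain ends up being defined:

```python
# moz.configure sketch -- hypothetical names throughout.

# check_prog finds the tool (or fetches it via the bootstrap flag
# mentioned above when it isn't present locally) and records its path
# in substs under the given variable name.
uniffi_bindgen = check_prog(
    "UNIFFI_BINDGEN_GECKO_JS",
    ("uniffi-bindgen-gecko-js",),
    bootstrap="uniffi-bindgen-gecko-js",  # placeholder toolchain task name
    when=compile_environment,             # assumed guard for artifact builds
)
```

If something along these lines worked, a generation script could then read buildconfig.substs["UNIFFI_BINDGEN_GECKO_JS"] the same way RunCBindgen.py reads buildconfig.substs["CBINDGEN"].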
Reporter | Comment 8 • 3 years ago
We discussed this in #build last week, and it looks like installing uniffi-bindgen-gecko-js as a toolchain artifact isn't going to work for downstream builds. The current plan is to manually generate the source files and check them in. However, I haven't completely given up hope of auto-generating them at build time. Could this work? (A rough sketch follows the list.)
- For non-artifact builds:
  - Add a HOST_RUST_PROGRAM that builds uniffi-bindgen-gecko-js. As nalexander pointed out, this would need to run in the pre-export tier, since .h files are generated in the export tier. Is there currently a way to specify the tier a program is built in?
  - Add GeneratedFile / GENERATED_WEBIDL_FILES entries for all auto-generated files. (If we were only generating .cpp, .h, and .jsm files and not .webidl, would that simplify the task?)
- For artifact builds:
  - Can we skip generating the .webidl, .cpp, and .h files in this case, since we are downloading an already-built libxul?
  - For the .jsm files, I think glandium suggested building an artifact with the generated files, then downloading/extracting it here.
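A rough moz.build sketch of the non-artifact/artifact split being proposed, using the COMPILE_ENVIRONMENT config flag as the guard; the program and file names are placeholders, and the artifact-build path for the .jsm files (downloading a pre-generated artifact) isn't shown:

```python
# moz.build sketch -- placeholder names.
if CONFIG["COMPILE_ENVIRONMENT"]:
    # Only non-artifact builds can compile the generator and therefore
    # produce the .webidl/.cpp/.h sources that end up inside libxul.
    HOST_RUST_PROGRAMS += ["uniffi-bindgen-gecko-js"]
    GENERATED_WEBIDL_FILES += ["UniFFIGenerated.webidl"]
```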
Reporter | Comment 9 • 2 years ago
I don't think this is going to happen. We committed the initial UniFFI code a month or two back. Instead of auto-generating the code at build time, we added a mach command to generate it manually and checked the output in. If we were to revisit auto-generation, it would probably be with a Python-based generator, since that fits in much better with the build ecosystem.