Closed Bug 1756214 Opened 3 years ago Closed 2 years ago

Implementing a Rust-based code generator

Categories

(Firefox Build System :: General, enhancement, P3)

enhancement

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: bdk, Unassigned)

Details

I'm trying to extend UniFFI to autogenerate JS backends for application-services components. It's been tried before, but hopefully this time will work.

One question I have is how to integrate this into the build system. The basic question here is: how can we have the build system build a Rust binary that then gets used to generate other files?

HOST_RUST_PROGRAM can create a binary, but there are some additional requirements:

  • The binary needs to be built before it's required for code generation.
    • nalexander suggests the export or pre-export tier. Would that work? How can I configure that?
  • We need to be able to use the binary to generate the code.
    • Could GeneratedFile work? I think the only thing needed is some way for the script to figure out the path of the compiled binary.
  • Am I missing any others?

(In reply to Ben Dean-Kawamura from comment #0)

I'm trying to extend UniFFI to autogenerate JS backends for application-services components. It's been tried before, but hopefully this time will work.

One question I have is how to integrate this into the build system. The basic question here is: how can we have the build system build a Rust binary that then gets used to generate other files?

HOST_RUST_PROGRAM can create a binary, but there are some additional requirements:

  • The binary needs to be built before it's required for code generation.
    • nalexander suggests the export or pre-export tier. Would that work? How can I configure that?
  • We need to be able to use the binary to generate the code.
    • Could GeneratedFile work? I think the only thing needed is some way for the script to figure out the path of the compiled binary.

Yes: once you have a genuniffibindings or whatever, GeneratedFile can have gen_uniffi_bindings.py that invokes $topobjdir/host/genuniffibindings.
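
A minimal sketch of what that GeneratedFile script could look like. The binary name comes from the comment above; the `$topobjdir/host/` layout, the `main` entry point, and the binary's command line are assumptions on my part:

```python
# gen_uniffi_bindings.py -- hypothetical GeneratedFile script.
import os
import subprocess

def bindgen_path(topobjdir, name="genuniffibindings"):
    # Assumes the compiled HOST_RUST_PROGRAM lands at $topobjdir/host/<name>.
    return os.path.join(topobjdir, "host", name)

def main(output, udl_file):
    # buildconfig is importable inside build actions and exposes topobjdir,
    # which is how the script finds the compiled binary.
    import buildconfig
    subprocess.check_call(
        [bindgen_path(buildconfig.topobjdir), udl_file], stdout=output
    )
```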

  • Am I missing any others?

In theory this is pretty similar to the various existing Rust bindings stuff we have. Does this really need to be in the tree, or can it be a separate toolchain task like cbindgen? That might simplify things enormously, at the cost of slowing update cycles and probably some other things.

If you try to do this all in-tree, you're going to have a hard time accommodating artifact builds, which by definition don't have the compile environment needed to produce genuniffibindings. We don't really have any way to "take the generated inputs from the upstream artifacts", but we are growing scenarios where we want that. See some of my thoughts over in https://bugzilla.mozilla.org/show_bug.cgi?id=1751762, which would be one path to supporting this.

I think treating it like cbindgen should work okay. Does that get built during mach bootstrap? How can I add a toolchain task?

Is it possible to keep the Rust crate in-tree, but exclude it from the normal build system? That way you could make some changes to the code and test the effects without having to publish a commit. If you had to run mach bootstrap or some other command to update the tools, that wouldn't be too bad.

I'm going to move some Matrix discussion over here, with a bit more details.

I think treating it like cbindgen should work okay. Does that get built during mach bootstrap?

cbindgen gets built in CI (much like Rust gets built in CI when the version changes in toolchains.yml).
./mach bootstrap (and auto-bootstrapping) just downloads the platform-specific compiled artifact from Taskcluster.

Is it possible to keep the rust crate in-tree, but exclude it from the normal build system?

Yep, we do this for geckodriver (also see build-geckodriver.sh).

That way you could make some changes to the code and test the effects without having to publish a commit.

If you have the code locally, you can follow your build script's steps on your own machine (running cargo build in your project, I'm guessing) to work on it. Additionally, once you have a toolchain task configured, you can run it on try to test it further.

How can I add a toolchain task?

Your knowledge and mine are at about the same level here :)
I'd recommend looking in taskcluster/ci/toolchain/geckodriver.yml and following how it works.

You'll probably want to add it to auto-bootstrapping, which can be done somewhere around bootstrap.configure (unsure off the top of my head).


I'm a little fuzzy here, but hopefully this is enough information to get you well on your way.
Glandium has been managing a fair chunk of toolchains, so I'd recommend asking him for tips on how to improve ergonomics for iterating on your new toolchain.

CCing Glandium. I'm hoping you can help me understand a few things about toolchain tasks:

  • How does downloading toolchain artifacts work? If I bump the version on a toolchain component and push the commit to try, how does the try build find the new artifact? How do regular builds find the older artifact before the new commit lands?
  • Is there any system to manually rebuild toolchain binaries locally? If I wanted to test a change to geckodriver, is there a way for me to get the new version without creating a commit and pushing it?
Priority: -- → P3

(In reply to Ben Dean-Kawamura from comment #4)

CCing Glandium. I'm hoping you can help me understand a few things about toolchain tasks:

  • How does downloading toolchain artifacts work? If I bump the version on a toolchain component and push the commit to try, how does the try build find the new artifact? How do regular builds find the older artifact before the new commit lands?

The "version" of a toolchain is determined by a hash of specific inputs: see https://searchfox.org/mozilla-central/rev/48c71577cbb4ce5a218cf4385aff1ff244dc4432/taskcluster/ci/toolchain/geckodriver.yml#20-25. So builds always find their toolchains by looking at those hashes, allowing lots of different versions to co-exist.
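
To make the hash-based lookup concrete, here is a minimal Python sketch (not the actual taskgraph code, which digests the full task definition) of how a digest over a toolchain's declared inputs yields a stable cache key: any revision whose inputs are unchanged resolves to the same artifact, while a version bump produces a new key and triggers a fresh build.

```python
# Illustrative only: the real toolchain digests are computed by taskgraph
# over the task definition and its listed files, not by this function.
import hashlib

def toolchain_digest(inputs):
    """Digest a mapping of {path: file contents} into a cache key."""
    h = hashlib.sha256()
    for path, contents in sorted(inputs.items()):
        h.update(path.encode("utf-8"))
        h.update(contents)
    return h.hexdigest()
```

Two trees with identical inputs share the same artifact; a try push that changes toolchains.yml gets a new digest and so builds and consumes its own artifact, while older revisions keep resolving the old one.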

  • Is there any system to manually rebuild toolchain binaries locally? If I wanted to test a change to geckodriver, is there a way for me to get the new version without creating a commit and pushing it?

Generally, toolchain build scripts are "just" doing some existing process and putting the result in the right form for other jobs to consume. So you can "rebuild" geckodriver using cargo, etc. Then you put it in place and just use it in the build while you iterate. Eventually, you push it to try, it gets built, and you start consuming it without the local build part.

Then you put it in place and just use it in the build while you iterate

Can you explain more about this process? Should I put it in ~/.mozbuild like cbindgen? In RunCBindgen.py I notice it uses buildconfig.substs["CBINDGEN"]. How can I arrange for our binary to be listed in buildconfig.substs?

Thanks for all the help!

(In reply to Ben Dean-Kawamura from comment #6)

Then you put it in place and just use it in the build while you iterate

Can you explain more about this process? Should I put it in ~/.mozbuild like cbindgen? In RunCBindgen.py I notice it uses
buildconfig.substs["CBINDGEN"]. How can I arrange for our binary to be listed in buildconfig.substs?

This is all determined at configure time. Continuing with our geckodriver example, see this code run as part of mach configure -- although that is rather more complicated with the non-trivial default and handling !COMPILE_ENVIRONMENT. For a simpler example, see perhaps NSIS or git. Basically, check_prog is what you want. The details around auto-bootstrapping toolchains I don't know entirely, but I think it's based on the local-toolchain flag and the bootstrap flag to check_prog.
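
As a concrete illustration of that last point, a moz.configure fragment along these lines would put the path into buildconfig.substs. The UNIFFI_BINDGEN variable name and the program name here are assumptions on my part, modeled on the cbindgen setup:

```python
# Hypothetical moz.configure sketch: check_prog stores the resolved path
# under the given variable name, which then shows up in buildconfig.substs.
check_prog(
    "UNIFFI_BINDGEN",
    ("uniffi-bindgen-gecko-js",),
    allow_missing=True,
    # bootstrap ties into the auto-bootstrapping/toolchain machinery
    # mentioned above (details assumed).
    bootstrap="uniffi-bindgen-gecko-js",
)
```

A GeneratedFile script could then read buildconfig.substs["UNIFFI_BINDGEN"] the same way RunCBindgen.py reads buildconfig.substs["CBINDGEN"].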

Thanks for all the help!

yw. :glandium or :mhentges can fill in the details about auto-bootstrapping, I've not yet needed to add such a toolchain.

We discussed this in #build last week and it looks like installing uniffi-bindgen-gecko-js as a toolchain artifact isn't going to work for downstream builds. The current plan is manually generating the source files and checking them in. However, I haven't completely given up hope of auto-generating them at build time. Could this work?

  • For non-artifact builds:
    • Add a RustHostProgram that builds uniffi-bindgen-gecko-js. As nalexander pointed out, this would need to run in the pre-export tier, since .h files are generated in the export tier. Is there currently a way to specify the tier a program is built in?
    • Add GeneratedFile / GENERATED_WEBIDL_FILES entries for all auto-generated files (If we were only generating .cpp, .h, and .jsm files and not .webidl, would that simplify the task?)
  • For artifact builds:
    • Can we skip generating the .webidl, .cpp, and .h files in this case since we are downloading an already built libxul?
    • For the .jsm files, I think glandium suggested building an artifact with the generated files, then downloading/extracting it here.
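
The moz.build wiring that the non-artifact plan imagines might look roughly like the sketch below. All file names and the GeneratedFile shape are illustrative assumptions; in particular, there is currently no documented way to pin which tier a host program is built in, which is exactly the open question above.

```python
# Hypothetical moz.build fragment (names, paths, and flags are assumptions):
HOST_RUST_PROGRAMS += ["uniffi-bindgen-gecko-js"]

GeneratedFile(
    "UniFFIGenerated.cpp",
    "UniFFIGenerated.h",
    script="gen_uniffi_bindings.py",
    # '!' marks an objdir input, here the compiled host binary, so the
    # generation step depends on the binary having been built first.
    inputs=["!dist/host/bin/uniffi-bindgen-gecko-js", "interface.udl"],
)
```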

I don't think this is going to happen. We landed the initial UniFFI code a month or two back. Instead of auto-generating the code at build time, we added a mach command to generate it manually and checked the output in. If we were to revisit auto-generation, I think it would probably be with a Python-based generator, since that fits in much better with the build ecosystem.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX