Closed Bug 1298422 Opened 3 years ago Closed 3 years ago

Colocate vendored crates from crates.io

Categories

(Firefox Build System :: General, defect)

defect
Not set

Tracking

(firefox51 fixed)

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: Nika, Assigned: ted)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Right now we have a single crate vendored in tree from crates.io: byteorder. It is currently stored in:

/media/libstagefright/binding/byteorder

This makes sense, as byteorder is currently only used by mp4parse, but in the future I expect that crates like byteorder which are vendored from crates.io will be used in many places. It would be nice to have a single directory which all of our vendored crates will end up in.

As an example, if bug 1293362 lands, it would want to vendor the `libc` crate, which will almost certainly be used by other projects as well.
We've discussed this in various meetings and bug 1294565, and we've settled on `third_party/rust`. We probably need to fix bug 1294565 to do this, or at least fix anything blocking that work such that we *can* run cargo vendor successfully.
This is just a rough cut, but it seems to work. I'm not sure if I removed all the build system bits that we need to. This patch does the following:
1) Add a workspace to toolkit/library/rust/Cargo.toml to pull in the top-level gtest crate in, so that we can vendor the full set of crates used by libxul and gtest in one fell swoop.
2) Change the reference to byteorder in the mp4parse crate to point at crates.io instead of a path.
3) Update toolkit/library/rust/Cargo.lock file to match. I just ran a local build, I don't know that there's actually a way to update Cargo.lock from Cargo.toml aside from running `cargo build`? We'll have to add a way to opt-out of building with `--frozen` for this case (I know we discussed this.)
4) Run cargo-vendor like so: cargo vendor --sync toolkit/library/rust/Cargo.lock third-party/rust
5) hg rm media/libstagefright/binding/byteorder/
6) hg addremove third-party/rust . I noticed when playing with cargo-vendor that it doesn't remove old crates, it only adds crates, so when we implement the mach vendor command we may need to rm third-party/rust before running cargo vendor.
7) Add a .cargo/config to point cargo at the vendored sources. I put it at the top level of the repo, which seems like the most sensible way to do things because cargo will search upward from the crate directory, so this will ensure that if we add new top-level crates anywhere in the tree they'll use the vendored crates.

I think that's all of it. I thought it'd be good to try this out manually before we tried to turn it into a mach command in bug 1294565.
Assignee: nobody → ted
> I just ran a local build, I don't know that there's actually a way to update Cargo.lock from Cargo.toml aside from running `cargo build`?

Another possible operation would be `cargo fetch`, but the ideal command here is `cargo generate-lockfile`. Right now that only works if Cargo.lock doesn't exist, however, and that's just a bug we should fix!

> I noticed when playing with cargo-vendor that it doesn't remove old crates, it only adds crates

Indeed! I'd be totally down for blowing away the directory to start out, but I figured it'd be useful if you're working with multiple Cargo.lock files vendored into the same directory. I'd also be fine adding a flag like `--delete` to `cargo-vendor` which had this behavior by default.

> Add a .cargo/config to point cargo at the vendored sources

This either may or may not want to be checked into the hg repo depending on the exact situation. With Cargo vendoring it should always be ok to pull from crates.io instead, so this shouldn't be strictly required unless you're explicitly trying to avoid the network (like CI infrastructure).

It may be useful for online development to hit crates.io to update dependencies naturally, but that mode could be off by default? Another option would be that the "configure script" for gecko would emit the `.cargo/config` file so it's not necessarily checked into the repo.

That's all kinda the same though, so just some thoughts!

> cargo will search upward from the crate directory

Yeah and to clarify, the current behavior is looking upwards from the cwd of the invocation of the `cargo` tool.
(In reply to Alex Crichton [:acrichto] from comment #4)
> Another possible operation would be `cargo fetch`, but the ideal command
> here is `cargo generate-lockfile`. Right now that only works if Cargo.lock
> doesn't exist, however, and that's just a bug we should fix!

Ah, yes!

> > I noticed when playing with cargo-vendor that it doesn't remove old crates, it only adds crates
> 
> Indeed! I'd be totally down for blowing away the directory to start out, but
> I figured it'd be useful if you're working with multiple Cargo.lock files
> vendored into the same directory. I'd also be fine adding a flag like
> `--delete` to `cargo-vendor` which had this behavior by default.

It's not a particularly big deal, honestly. It's just one extra line to remove the vendor directory before invoking cargo.

> > Add a .cargo/config to point cargo at the vendored sources
> 
> This either may or may not want to be checked into the hg repo depending on
> the exact situation. With Cargo vendoring it should always be ok to pull
> from crates.io instead, so this shouldn't be strictly required unless you're
> explicitly trying to avoid the network (like CI infrastructure).
> 
> It may be useful for online development to hit crates.io to update
> dependencies naturally, but that mode could be off by default? Another
> option would be that the "configure script" for gecko would emit the
> `.cargo/config` file so it's not necessarily checked into the repo.

So this has come up in other discussions, since we definitely want developers to be able to update Cargo.toml files with new dependencies and be able to rebuild without having to re-vendor. If we have this cargo configuration in place, and someone adds a new dependency and builds without --frozen, will that work? (Currently we're always passing --frozen if cargo supports it, so we need to at least fix that.)

> > cargo will search upward from the crate directory
> 
> Yeah and to clarify, the current behavior is looking upwards from the cwd of
> the invocation of the `cargo` tool.

Thanks for the clarification! I think this won't actually do the right thing then, since we're invoking cargo from the objdir.
> If we have this cargo configuration in place, and someone adds a new dependency and builds without --frozen, will that work?

No. If crates.io is replaced with a local vendor directory, then if you add a new dependency it won't be found in the vendor directory, so that'll fail. If you add a dependency on a crate that's already in the vendor directory, however, that will work.

For fetching a new dependency to work, you'll have to delete the .cargo/config [source] configuration, and then cargo-vendor will need to be rerun before committing as presumably the CI builders will always build with --frozen.
Comment on attachment 8786733 [details]
bug 1298422 - vendor byteorder into third-party/rust.

https://reviewboard.mozilla.org/r/75656/#review73632

This all looks pretty reasonable, but I assume there are going to be some changes to deal with the cargo build failures, so I'm just going to cancel the review for the moment.
Attachment #8786733 - Flags: review?(nfroyd)
Okay, that patch should fix things. I'll run it by try and see how that goes. Changes over the first patch:
a) Put a top-level Cargo.toml in the repo root, which just has [workspace], pointing to our other top-level crates.
b) Put a Cargo.lock file next to it, and remove the other in-tree Cargo.lock files. Workspaces use a single Cargo.lock.
c) Write `$objdir/.cargo/config` using `CONFIGURE_SUBST_FILES` to point at $topsrcdir/third_party/rust.
d) Actually use $topsrcdir/third_party/rust, since I had apparently spelled it "third-party" previously. (Also I just realized that I did not change the patch description to match.)

I'm pretty confident the .cargo/config works because I had bad content in there while working on this patch revision and cargo raised an error. :)

We can support something like an `--enable-rust-developer` mode to let developers add dependencies without vendoring by just not writing the .cargo/config file.
Comment on attachment 8786733 [details]
bug 1298422 - vendor byteorder into third-party/rust.

https://reviewboard.mozilla.org/r/75656/#review73712

r=me, but this can't land because of the unstable Cargo requirement for workspace support, right?

::: Cargo.toml:1
(Diff revision 3)
> +[workspace]
> +members = ["toolkit/library/rust", "toolkit/library/gtest/rust"]

`[workspace]` is nightly Cargo only at the moment, correct?  So this would work OK in automation (where we mix-and-match stable Rust and unstable Cargo), but it wouldn't work so great for people on stable Rust developing?

Do we need to use workspaces right now, or can we just make `byteorder` a normal dependency of `mp4parse` and have everything work for the time being?  Oh...does that not work because of Cargo's insistence on hierarchical directories?

::: python/mozbuild/mozbuild/frontend/emitter.py
(Diff revision 3)
> -        # This is not exactly how cargo works (cargo would permit
> -        # identically-named packages with different versions in the same
> -        # library), but we want to try and avoid using multiple
> -        # different versions of a single package for code size reasons.

Can you file a followup bug about putting this check back?  I'm not excited about the possibility of multiply-versioned crates being allowed, though I think Lars has pointed out that we will need to have some mechanism for (temporarily) permitting it, so as to not make dependency updating super-onerous.  Maybe some mode where we permit multiple versions on nightly, or with particular flags set in the `[metadata]` of Cargo.toml, but not otherwise?

I assume `cargo vendor` permits multiple versions of crates to be vendored.

::: Cargo.lock:2
(Diff revision 3)
> +name = "mp4parse-gtest"
> +version = "0.1.0"

It's really weird that this is the `[root]` in `Cargo.lock`, rather than a `[[package]]`, but I'll trust that this works...
Attachment #8786733 - Flags: review?(nfroyd) → review+
So this broke in automation and I'm not sure why:
https://treeherder.mozilla.org/logviewer.html#?job_id=26691146&repo=try#L5201

 13:03:12 INFO - env CARGO_TARGET_DIR=. RUSTC=/builds/slave/try-lx-00000000000000000000000/build/src/rustc/bin/rustc /builds/slave/try-lx-00000000000000000000000/build/src/cargo/bin/cargo build --release --frozen --manifest-path /builds/slave/try-lx-00000000000000000000000/build/src/toolkit/library/rust/Cargo.toml --target=i686-unknown-linux-gnu --verbose --
 13:03:12     INFO -      Updating registry `https://github.com/rust-lang/crates.io-index`
 13:03:12     INFO -  error: failed to fetch `https://github.com/rust-lang/crates.io-index`
 13:03:12     INFO -  Caused by:
13:03:12 INFO - attempting to update a git repository, but --frozen was specified

Is the cargo we're using in automation new enough to support workspaces?
(In reply to Nathan Froyd [:froydnj] from comment #11)
> Can you file a followup bug about putting this check back?  I'm not excited
> about the possibility of multiply-versioned crates being allowed, though I
> think Lars has pointed out that we will need to have some mechanism for
> (temporarily) permitting it, so as to not make dependency updating
> super-onerous.  Maybe some mode where we permit multiple versions on
> nightly, or with particular flags set in the `[metadata]` of Cargo.toml, but
> not otherwise?

I can do that. If you'd prefer, I can alternately try to fix that check while removing the path dependency check. I honestly didn't expect this patch to get r+ off the bat, I just ripped this code out to try to get it to build. :)

> I assume `cargo vendor` permits multiple versions of crates to be vendored.

I'm pretty sure it does, which is why it originally put version numbers in the vendored directories, but I don't know enough about cargo dependency handling to get things into a state where this would be required.
I tested, vendoring multiple versions with cargo vendor works fine:
https://github.com/luser/vendor-test/tree/vendor-multiple-versions/vendor

You can see there's both a `bitflags` and a `bitflags-0.6.0` in there. (Alex changed `cargo vendor` to leave out the version number when there's only one version in use.)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #13)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d57d3b82fb97

I split the workspace changes out to a separate patch, since that won't work well until we require a cargo new enough to understand them. This patch is still failing with the same error on try though:
 06:35:35     INFO -  env CARGO_TARGET_DIR=. RUSTC=/builds/slave/try-lx-00000000000000000000000/build/src/rustc/bin/rustc /builds/slave/try-lx-00000000000000000000000/build/src/cargo/bin/cargo build --release --frozen --manifest-path /builds/slave/try-lx-00000000000000000000000/build/src/toolkit/library/rust/Cargo.toml --target=i686-unknown-linux-gnu --verbose --
 06:35:35     INFO -      Updating registry `https://github.com/rust-lang/crates.io-index`
 06:35:35     INFO -  error: failed to fetch `https://github.com/rust-lang/crates.io-index`
 06:35:35     INFO -  Caused by:
06:35:35 INFO - attempting to update a git repository, but --frozen was specified

Alex: I could use some help figuring out what's going on here.
Flags: needinfo?(acrichton)
Hm I wonder if the .cargo/config isn't being picked up? In theory it shouldn't even attempt to hit a git repository if .cargo/config is being used.

Perhaps some extra debug logs from Cargo through `RUST_LOG=cargo` would help?
Flags: needinfo?(acrichton)
The answer on IRC is that the cargo we're using in automation is not new enough to support source replacement.
Depends on: 1299888
I filed https://github.com/rust-lang/cargo/issues/3066 about improving cargo's source replacement to support using our vendored crates but optionally pulling new crates from crates.io.
Depends on: 1299971
No longer depends on: 1299888
Blocks: 1294565
See Also: 1294565
Blocks: 1301190
Comment on attachment 8786733 [details]
bug 1298422 - vendor byteorder into third-party/rust.

https://reviewboard.mozilla.org/r/75656/#review73712

> `[workspace]` is nightly Cargo only at the moment, correct?  So this would work OK in automation (where we mix-and-match stable Rust and unstable Cargo), but it wouldn't work so great for people on stable Rust developing?
> 
> Do we need to use workspaces right now, or can we just make `byteorder` a normal dependency of `mp4parse` and have everything work for the time being?  Oh...does that not work because of Cargo's insistence on hierarchical directories?

I split the workspace bits out to a separate patch.

> Can you file a followup bug about putting this check back?  I'm not excited about the possibility of multiply-versioned crates being allowed, though I think Lars has pointed out that we will need to have some mechanism for (temporarily) permitting it, so as to not make dependency updating super-onerous.  Maybe some mode where we permit multiple versions on nightly, or with particular flags set in the `[metadata]` of Cargo.toml, but not otherwise?
> 
> I assume `cargo vendor` permits multiple versions of crates to be vendored.

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1301190
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #22)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=aae2ea6d15ca

I got hung up because I didn't generate the Cargo.lock files with a new-enough cargo (despite my best efforts). I got it sorted out locally, so that should work now. If the try builds are green I'll finally get this landed.
I still broke SM-tc(pkg) builds because they need to copy .cargo/config.in into the SM source package, but that shouldn't be hard to fix.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #25)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=c0da7411a2f0

I patched the SM-tc(pkg) script, so hopefully that build will be green. I also removed a bunch of tests from test_emitter that were testing the Cargo.toml validation code that I had removed, so that should hopefully make the rest of the builds green.
Blocks: 1302704
https://hg.mozilla.org/mozilla-central/rev/0ee12557d6e6
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.