Closed Bug 1323557 Opened 7 years ago Closed 6 years ago

Need a way to easily modify vendored crates for local testing and try pushes

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox61 fixed)

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: bholley, Assigned: myk)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

We've discussed this before, but I just hit it in practice.

Sometimes we need code changes in an upstream crate. Upstreaming is time-consuming, and so we want to make sure everything works before doing that. This includes local testing, code review, and try pushes.

We already have a copy of the sources in-tree, so it would be super nice to just edit those and keep hacking. Right now that doesn't work because it messes up the checksums, so my preferred solution is some simple way to just disable the cargo checksum verification. We'd obviously have a commit hook to prevent this from getting checked into m-c.

Yehuda and Alex have pushed back on this in the past, and suggested that we should just make it really easy to check in local overrides. I'm trying that now, and hitting a few speed bumps:
(1) .cargo/config.in is a pre-processed file, which the build system turns into OBJDIR/.cargo.config. I was worried that this would take a lot of time and require lots of stuff to be rebuilt, but thankfully it doesn't seem to.
(2) I need to manually cp -r the vendored sources somewhere, and add them as a separate git commit so that I can see what I actually changed.
(3) I get inexplicable versioning warnings: https://pastebin.mozilla.org/8952223
Needinfo Alex on the feasibility of disabling checksums, or alternate ideas.
Flags: needinfo?(acrichton)
The workflow I'd currently recommend is:

1. Determine which crate needs changes.
2. Copy vendor/$crate somewhere else in tree
3. Modify Cargo.toml with:

    [replace]
    '$crate:$version' = { path = "path/to/copy" }

4. Modify the copied sources
5. Eventually upstream those changes
6. Eventually update vendor dir and delete copy

Does that make sense?

As an aside I wouldn't recommend `paths` as it doesn't handle changes in the dependency graph or feature set, which often happens from what I've seen.
Flags: needinfo?(acrichton)
I explicitly tested this when we were hashing out vendoring, and it seems to work fine in this test crate:
https://github.com/luser/vendor-test/commit/1e146ca6be23e7585473c8a9febcb2a6310911d9

You should be able to just use [replace], but put the path to the already-vendored copy there, and then edit the vendored sources without issues.
That workflow seems a bit heavyweight. I would if we could make it more turnkey via `mach`.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> I explicitly tested this when we were hashing out vendoring, and it seems to
> work fine in this test crate:
> https://github.com/luser/vendor-test/commit/
> 1e146ca6be23e7585473c8a9febcb2a6310911d9
> 
> You should be able to just use [replace], but put the path to the
> already-vendored copy there, and then edit the vendored sources without
> issues.

Oh neat! Yeah that seems like a great compromise.

(In reply to Gregory Szorc [:gps] from comment #4)
> That workflow seems a bit heavyweight. I would if we could make it more
> turnkey via `mach`.

Turnkey is great. I think the only decision is whether it belongs in cargo or mach. Alex, does Ted's workflow from comment 3 (pointing to the same vendored sources as a [replace]) seem like something cargo could do?
Flags: needinfo?(acrichton)
Oh I think a workflow like this definitely belongs in Cargo in one way shape or form. For example Servo really wants a push-button "fork this and let me edit it" as well. If Gecko wants to start out with some logic in mach though I wouldn't mind getting experience there (which is faster than landing in Cargo) and then using those lessons to inform an implementation in Cargo.
Flags: needinfo?(acrichton)
I think as a first cut it'd be useful to add a mach command. I don't know what you'd want to call it, `mach fork-crate <foo>`? It would have to figure out what Cargo.toml files reference crate `foo` and add the [replace] there, and maybe just print a message like "You can now make changes in 'third_party/rust/foo'".

Is this something we'd want to allow to land in-tree anywhere but try? Would we want to have a hook to forbid this sort of [replace] in our Cargo.toml files, assuming that if you really need to patch a dependency you'd fork it and point at our fork?
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #7)
> Is this something we'd want to allow to land in-tree anywhere but try?

I think it is not.

> Would
> we want to have a hook to forbid this sort of [replace] in our Cargo.toml
> files, assuming that if you really need to patch a dependency you'd fork it
> and point at our fork?

Agreed.
Note that we may still need to use [replace] if we need to fork something down the dependency chain, but we should be replacing it with something that has its own repo, rather than just our own code in m-c.
Product: Core → Firefox Build System
What are the necessary next steps to making this better?
The instructions in this bug and its duplicates don't seem to work anymore. What worked for me was:

 1. Copy the /third_party/rust/yourcrate/ directory to a directory *outside* of the mozilla source directory and apply your modifications there.
 2. In /Cargo.toml (the one in the root directory), add the following line to the [patch.crates-io] section:

yourcrate = { path = "path/to/your/fork/of/yourcrate" }
Random idea: `mach try` could detect when vendored sources are modified and perform any necessary actions to appease Cargo. We already create temporary commits during `mach try`. So continued hacking up of the versioned code should be acceptable as long as it doesn't materially change functionality.
(In reply to Gregory Szorc [:gps] from comment #15)
> Random idea: `mach try` could detect when vendored sources are modified and
> perform any necessary actions to appease Cargo. We already create temporary
> commits during `mach try`. So continued hacking up of the versioned code
> should be acceptable as long as it doesn't materially change functionality.

I don't think it makes sense for this to be in mach try, because generally we encourage people to build their code before pushing to try, and code won't build until cargo is satisfied. So it needs to be a standalone workflow/
Nowadays we should have enough support in Cargo that all you need to do is to add [patch.crates-io] pointing directly at the sources in third_party, but if that doesn't work please feel free to file a bug on Cargo!
(In reply to Alex Crichton [:acrichto] from comment #17)
> Nowadays we should have enough support in Cargo that all you need to do is
> to add [patch.crates-io] pointing directly at the sources in third_party,
> but if that doesn't work please feel free to file a bug on Cargo!

It doesn't work, but the problem isn't Cargo.  Mozbuild deletes third_party/rust before calling `cargo update`:

https://searchfox.org/mozilla-central/source/python/mozbuild/mozbuild/vendor_rust.py#290-297

And then `cargo update` fails because it can't find the patched source directory in third_party:

> > ./mach vendor rust
>  0:01.42 rm -rf /Users/myk/Projects/gecko/third_party/rust
> error: failed to resolve patches for `https://github.com/rust-lang/crates.io-index`
> 
> Caused by:
>   failed to load source for a dependency on `test`
> 
> Caused by:
>   Unable to update file:///Users/myk/Projects/gecko/third_party/rust/test
> 
> Caused by:
>   failed to read `/Users/myk/Projects/gecko/third_party/rust/test/Cargo.toml`
> 
> Caused by:
>   No such file or directory (os error 2)

Leaving third_party/rust in place before calling `cargo update` resolves the problem:

> > ./mach vendor rust
>       Adding test v0.1.0 (file:///Users/myk/Projects/gecko/third_party/rust/test)
>     Removing test v0.1.0

And it doesn't seem to cause other problems (on my machine, anyway).  However, we're presumably deleting that directory for a reason.

Ted: it looks like you did that over in bug 1294565, describing the process in bug 1294565, comment 14.  Is it necessary in all cases, or can we stop doing it, at least when vendoring locally?

(I've pushed a prospective patch that stops doing it unconditionally, but I can alter that to do so conditionally, perhaps when a flag is specified.)
(In reply to Myk Melez [:myk] [@mykmelez] from comment #19)
> Leaving third_party/rust in place before calling `cargo update` resolves the
> problem:
> 
> > > ./mach vendor rust
> >       Adding test v0.1.0 (file:///Users/myk/Projects/gecko/third_party/rust/test)
> >     Removing test v0.1.0
> 
> And it doesn't seem to cause other problems (on my machine, anyway). 
> However, we're presumably deleting that directory for a reason.
> 
> Ted: it looks like you did that over in bug 1294565, describing the process
> in bug 1294565, comment 14.  Is it necessary in all cases, or can we stop
> doing it, at least when vendoring locally?

Let's say crate C version Y contains file F.  Crate C version Z deletes F.

We have version Y vendored as third_party/rust/$C.  Does vendoring version Z into third_party/rust/$C correctly delete F if we don't blow away the directory beforehand?  (I don't think `cargo vendor` concerns itself with this scenario, nor should it.)

Alternate, related scenario: we depend on crate C version Y and version Z, so we have:

third_party/rust/$C-$Y
third_party/rust/$C (implicitly version Z)

We remove all dependencies on version Y.  When we re-vendor packages, how do we automatically get rid of third_party/rust/$C-$Y ?

Starting from a clean slate solves lots of issues.
(In reply to Nathan Froyd [:froydnj] from comment #20)
> Let's say crate C version Y contains file F.  Crate C version Z deletes F.
> 
> We have version Y vendored as third_party/rust/$C.  Does vendoring version Z
> into third_party/rust/$C correctly delete F if we don't blow away the
> directory beforehand?

Yes.  On my branch that preserves third_party/rust, if I vendor a test crate C, version Y, which contains file F:

https://github.com/mykmelez/gecko/commit/da6f5524e9cd91f8289a20506549e42153b6e6a4

And then I update the vendoring to version Z, which deletes file F:

https://github.com/mykmelez/gecko/commit/7e39a7e4500de40a7fe2d52f46046f5996ef22ae

Then file F gets removed from the tree.


> Alternate, related scenario: we depend on crate C version Y and version Z,
> so we have:
> 
> third_party/rust/$C-$Y
> third_party/rust/$C (implicitly version Z)
> 
> We remove all dependencies on version Y.  When we re-vendor packages, how do
> we automatically get rid of third_party/rust/$C-$Y ?

With no further changes, vendoring both versions of package C:

https://github.com/mykmelez/gecko/commit/19c153ce570097353af52a93b5521efcc14fef8f

And then un-vendoring version Y:

https://github.com/mykmelez/gecko/commit/740cc03de1b3123ab32150c50b9adca47df4cc34

Results in version Y's directory (third_party/rust/C-0.1.0 in this example) remaining in the tree.

However, if I remove the --no-delete flag from the `cargo vendor` invocation:

https://github.com/mykmelez/gecko/commit/b54f9f4eccc6d5982dba795ae152f4adda9167cb

And then re-vendor:

https://github.com/mykmelez/gecko/commit/09c17f6675a2285c35c63211a5fccd0d3a8658b8

Then version Y's directory gets removed from the tree.


> Starting from a clean slate solves lots of issues.

I can certainly imagine this.  It's why I `mach clobber` when in doubt!  Nevertheless, I haven't been able to find any issues so far in this case.  Are there other scenarios you're concerned about?  Perhaps --no-delete is itself necessary for other reasons?
Comment on attachment 8969762 [details]
Bug 1323557 - preserve vendored crates when re-vendoring; .mielczarek

https://reviewboard.mozilla.org/r/238572/#review244984

I was curious, and as it turns out cargo-vendor did not used to delete unused crates from the vendored directory, but that changed last year:
https://github.com/alexcrichton/cargo-vendor/commit/4e7e56a38fb6bb810816122e577b377d1f186f05#diff-639fbc4ef05b315af92b4d836c31b023

So this should be fine.
Attachment #8969762 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #22)
> Comment on attachment 8969762 [details]
> Bug 1323557 - preserve vendored crates when re-vendoring; .mielczarek
> 
> https://reviewboard.mozilla.org/r/238572/#review244984
> 
> I was curious, and as it turns out cargo-vendor did not used to delete
> unused crates from the vendored directory, but that changed last year:
> https://github.com/alexcrichton/cargo-vendor/commit/
> 4e7e56a38fb6bb810816122e577b377d1f186f05#diff-
> 639fbc4ef05b315af92b4d836c31b023
> 
> So this should be fine.

It looks like Kats added --no-delete to the cargo-vendor invocation in response to that change.  In bug 1369156, comment 23 (and 24), he says: "So I did this locally. Turns out the next step in this yak-shaving adventure is to figure out why cargo-vendor is suddenly deleting a whole pile of crates that we still need… This is (unsurprisingly) a result of https://github.com/alexcrichton/cargo-vendor/commit/4e7e56a38fb6bb810816122e577b377d1f186f05. We'll need to add --no-delete to our invocation of cargo vendor."

I don't see that in my local testing, however.
(In reply to Myk Melez [:myk] [@mykmelez] from comment #23)
> It looks like Kats added --no-delete to the cargo-vendor invocation in
> response to that change.

IIRC I had to do this because at that time, we were running cargo-vendor for each of the top-level crates in m-c (toolkit/library/rust, toolkit/library/gtest/rust, js/whatever, etc.) and so running it for the second crate would delete the deps of the first crate, and so on. Since then, we've combined all the top-level crates into a workspace and just run cargo-vendor once on the workspace so this problem has gone away.
Yeah, the way we invoke cargo-vendor has changed since we fixed bug 1381576 so that should no longer be necessary. We run it just once, so it can have full control over the vendoring directory.
Comment on attachment 8969762 [details]
Bug 1323557 - preserve vendored crates when re-vendoring; .mielczarek

https://reviewboard.mozilla.org/r/238572/#review245300

::: python/mozbuild/mozbuild/vendor_rust.py:297
(Diff revision 1)
>  
>          # We use check_call instead of mozprocess to ensure errors are displayed.
>          # We do an |update -p| here to regenerate the Cargo.lock file with minimal changes. See bug 1324462
>          subprocess.check_call([cargo, 'update', '-p', 'gkrust'], cwd=self.topsrcdir)
>  
>          subprocess.check_call([cargo, 'vendor', '--quiet', '--no-delete', '--sync', 'Cargo.lock'] + [vendor_dir], cwd=self.topsrcdir)

Oops, I missed this, but you will need to remove the `--no-delete` here as well. cargo-vendor should only delete crates that are no longer necessary.
Pushed by myk@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1374936db018
preserve vendored crates when re-vendoring; r=ted.mielczarek
Assignee: nobody → myk
Status: NEW → ASSIGNED
https://hg.mozilla.org/mozilla-central/rev/1374936db018
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Blocks: 1459661
After landing this, I realized that a fix was needed in cargo-vendor for dependencies vendored from crates.io (i.e. most vendored crates).  (My tests didn't catch the issue because they were vendoring crates from a GitHub repo.)

That fix landed in https://github.com/alexcrichton/cargo-vendor/pull/71, and acrichto has published an updated version of cargo-vendor.  Before using the fix in this bug, install the new version of cargo vendor:

cargo install --force cargo-vendor

Then you can patch a vendored crate:

[patch.crates-io]
zip = { path = "third_party/rust/zip" }

And re-vendor to update Cargo.lock:

$ mach vendor rust
      Adding zip v0.3.1 (file:///path/to/workdir/third_party/rust/zip)
    Removing zip v0.3.1
You need to log in before you can comment on or make changes to this bug.