Closed Bug 1625281 Opened 6 months ago Closed 2 months ago

RUST_PROGRAMS are not kept up to date in dist/bin

Categories

(Firefox Build System :: General, defect)

defect

Tracking

(firefox80 fixed)

RESOLVED FIXED
mozilla80
Tracking Status
firefox80 --- fixed

People

(Reporter: agashlin, Assigned: froydnj)

Details

Attachments

(2 files)

To reproduce, on Windows (the update agent is Windows-only and is the only use of RUST_PROGRAMS that I know of so far):

  1. put ac_add_options --enable-update-agent in mozconfig
  2. ./mach build, there will now be an updateagent.exe in <object dir>/dist/bin
  3. modify toolkit/components/updateagent/src/main.rs (e.g. change "Mozilla" to "Mozilla2").
  4. ./mach build again, the exe in dist/bin will not have changed from step 2. The exe in the Cargo target dir, <object dir>/toolkit/components/updateagent/x86_64-pc-windows-msvc/release/updateagent.exe, is new.
  5. ./mach build a third time, now the exe in dist/bin is up to date

I'm guessing that when the .exe already existed in dist/bin, force-cargo-program-build doesn't get to run before the target that copies from the Cargo target dir to dist/bin.

Just to confirm, because I don't want to get out my Windows machine at the moment, what are the contents of $objdir/toolkit/components/updateagent/backend.mk?

Hm, OK. So the code to install the rust binaries lives in

https://searchfox.org/mozilla-central/source/config/makefiles/target_binaries.mk#10-15

and that code runs this awful looking Make monstrosity:

https://searchfox.org/mozilla-central/source/config/rules.mk#1077-1091

The code to compile the rust binaries lives at:

https://searchfox.org/mozilla-central/source/config/makefiles/rust.mk#356-371

I think what's happening (possibly because of Windows paths?) is that some dependency chains are not coming out correctly, and something inside make is not registering the dependency between "install the rust program" and "compile the rust program" and so you have a race. I think I'm going to have to get out the Windows machine to debug this.

Assignee: nobody → nfroyd

OK, so make -p says:

# makefile (from 'backend.mk', line 12)
RUST_PROGRAMS = x86_64-pc-windows-msvc/release/updateagent.exe

x86_64-pc-windows-msvc/release/updateagent.exe: force-cargo-program-build

../../../dist/bin/updateagent.exe: x86_64-pc-windows-msvc/release/updateagent.exe x86_64-pc-windows-msvc/release/updateagent.exe
	$(install_targets_sanity)
	$(call install_cmd,$(IFLAGS2) '$<' '$(@D)')

target:: x86_64-pc-windows-msvc/release/updateagent.exe
target::
target:: ../../../dist/bin/updateagent.exe

which I believe is a correct dependency graph, since installing the updateagent in dist/bin/ depends on actually compiling the updateagent, which in turn depends on running cargo. I don't see where the failure in compiling the things but then not installing the thing is coming from...unless there are problems with timestamps of the actual files?

After a successful build, what does ls -l --time=style=+%s.%N on dist/bin/updateagent.exe and its corresponding file in toolkit/components/updateagent look like? After an unsuccessful build, what do those timestamps look like?

Thanks for digging into this Nathan! I seem to recall there was a problem like this before (not for Rust) with the path separators going from / to \, but I guess that's not the issue here.

After first ./mach build:

$ ls -l --time-style=+%s.%N obj-x86_64-pc-mingw32/dist/bin/updateagent.exe obj-x86_64-pc-mingw32/toolkit/components/upd ateagent/x86_64-pc-windows-msvc/release/updateagent.exe
-rwxr-xr-x 1 Adam Administrators 288768 1585330496.000000000 obj-x86_64-pc-mingw32/dist/bin/updateagent.exe
-rwxr-xr-x 2 Adam Administrators 288768 1585330826.000000000 obj-x86_64-pc-mingw32/toolkit/components/updateagent/x86_64-pc-windows-msvc/release/updateagent.exe

$ md5sum obj-x86_64-pc-mingw32/dist/bin/updateagent.exe obj-x86_64-pc-mingw32/toolkit/components/updateagent/x86_64-pc-
windows-msvc/release/updateagent.exe
856b484a339b9edda204aca88859a41a *obj-x86_64-pc-mingw32/dist/bin/updateagent.exe
48af19f8d6fb68544f86fbdde3f971d1 *obj-x86_64-pc-mingw32/toolkit/components/updateagent/x86_64-pc-windows-msvc/release/updateagent.exe

After second ./mach build:

$ ls -l --time-style=+%s.%N obj-x86_64-pc-mingw32/dist/bin/updateagent.exe obj-x86_64-pc-mingw32/toolkit/components/upd
ateagent/x86_64-pc-windows-msvc/release/updateagent.exe
-rwxr-xr-x 1 Adam Administrators 288768 1585330826.000000000 obj-x86_64-pc-mingw32/dist/bin/updateagent.exe
-rwxr-xr-x 2 Adam Administrators 288768 1585330826.000000000 obj-x86_64-pc-mingw32/toolkit/components/updateagent/x86_64-pc-windows-msvc/release/updateagent.exe

$ md5sum obj-x86_64-pc-mingw32/dist/bin/updateagent.exe obj-x86_64-pc-mingw32/toolkit/components/updateagent/x86_64-pc- windows-msvc/release/updateagent.exe
48af19f8d6fb68544f86fbdde3f971d1 *obj-x86_64-pc-mingw32/dist/bin/updateagent.exe
48af19f8d6fb68544f86fbdde3f971d1 *obj-x86_64-pc-mingw32/toolkit/components/updateagent/x86_64-pc-windows-msvc/release/updateagent.exe

OK, I can reproduce this here. The minimal repro is to build a few files in the root objdir (.cargo/config, etc.) and then cd to toolkit/components/updateagent, where you can make target, make edits, make target, and make target again. The second one will build but not install.

I can't decide whether we are giving a bad graph to make, or whether make is making a hash out of the graph we do give it. The graph looks like:

          +--- dist/bin/updateagent [---> x86_64/release/updateagent]
          |            |
target ---+            |
          |            v
	  +--- x86_64/release/updateagent ---> force-cargo-program-build

AFAICT from make -n -d target, make sees the "lower" updateagent node, and determines that it has to build the force-cargo-program-build node, which is fine. The dist/bin/updateagent node is then considered, and make considers its input to not be new enough to trigger a rebuild of dist/bin/updateagent... which ignores that x86_64/release/updateagent might have been updated by the previous build step!

Once cargo's updateagent' is rebuilt, everything works out OK; make sees that cargo'supdateagentis already newer thandist/bin/updateagent`, and schedules an installation of the latter.

This seems incredibly weird to me, but I think it is a peculiar consequence of using double-colon rules to hang dependencies off of target (comment 4). The documentation for make says:

Double-colon rules with the same target are in fact completely separate from one another. Each double-colon rule is processed individually, just as rules with different targets are processed.

The double-colon rules for a target are executed in the order they appear in the makefile. However, the cases where double-colon rules really make sense are those where the order of executing the recipes would not matter.

Double-colon rules are somewhat obscure and not often very useful; they provide a mechanism for cases in which the method used to update a target differs depending on which prerequisite files caused the update, and such cases are rare.

I guess the first paragraph is saying that the dependency chains for double-colon rules do not actually interact in any way? So the above vertical dependency doesn't actually exist and the bracketed dependency is the one that's actually considered? (The debugging output seems to support this hypothesis, as the debugging output for each dependency of target is not related to the other(s) by indentation...)

If that is the case, then a) fixing the problem here is a bit involved and b) why haven't we noticed this before? Have people just dealt with the problem and muttered imprecations in the direction of the build system every time?

If it's not the case, then I am at a loss to explain what is going on.

OK, I hacked things to not use double-colon rules and we still get the same result. So...that's terrible. Different debugging output which says that make's internal dependency graph must look different, but same result.

I don't know whether things would be any better if make had visibility into cargo's dependency graph, so that make could tell that cargo's updateagent is obviously going to be rebuilt, or whether we have to gut the dependency graph somehow.

(In reply to Nathan Froyd [:froydnj] from comment #6)

b) why haven't we noticed this before? Have people just dealt with the problem and muttered imprecations in the direction of the build system every time?

I've generally used cargo --package updateagent directly, only building with mach when finally testing config and packaging, so I didn't run into it before. Now that it's in the tree it's a different story.

I should add that updateagent landed, configed off, very recently, and as the first non-testing use of RUST_PROGRAMS it isn't too surprising that there are things like this to shake out. It isn't too hard to run ./mach build && ./mach build for now.

Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is -- (Backlog,) indicating it has has not been previously triaged, the bug's Severity is being updated to -- (default, untriaged.)

Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is -- (Backlog,) indicating it has has not been previously triaged, the bug's Severity is being updated to -- (default, untriaged.)

Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is -- (Backlog,) indicating it has has not been previously triaged, the bug's Severity is being updated to -- (default, untriaged.)

Severity: normal → S3

The severity of these bugs was changed, mistakenly, from normal to S3.

Because these bugs have a priority of --, indicating that they have not been previously triaged, these bugs should be changed to Severity of --.

Severity: S3 → --

Adam, out of curiosity, does applying https://paste.mozilla.org/Ggh3ZyB7 solve the problem of needing to double-build?

Flags: needinfo?(agashlin)

(In reply to Nathan Froyd [:froydnj] from comment #14)

Adam, out of curiosity, does applying https://paste.mozilla.org/Ggh3ZyB7 solve the problem of needing to double-build?

It does seem to solve the problem on my machine, so I am going to go with it.

Flags: needinfo?(agashlin)

Doing this means that make will correctly understand that the underlying
program has changed and therefore it needs to trigger install rules.

Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/df0c703cdb10
force make to update timestamp caches when building rust programs; r=firefox-build-system-reviewers,rstewart

That also solved it for me, thank you!

Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
You need to log in before you can comment on or make changes to this bug.