Closed Bug 1656993 Opened 11 months ago Closed 10 months ago

Create and require a top-level `mach` virtualenv in `~/.mozbuild`

Categories

(Firefox Build System :: General, task)

task

Tracking

(firefox81 fixed)

RESOLVED FIXED
81 Branch
Tracking Status
firefox81 --- fixed

People

(Reporter: rstewart, Assigned: rstewart)

References

Details

Attachments

(1 file, 1 obsolete file)

No description provided.

In two different places we've been encountering issues regarding 1) how we configure the system Python environment and 2) how the system Python environment relates to the virtualenvs that we use for building, testing, and other dev tasks. Specifically:

  1. With the push to use glean for telemetry in mach, we are requiring (or rather, strongly encouraging) the glean_sdk Python package to be installed with bug 1651424. mach bootstrap upgrades the library using your system Python 3 in bug 1654607. We can't vendor it due to the package containing native code. Since we generally vendor all code required for mach to function, requiring that the system Python be configured with a certain version of glean is an unfortunate change.

  2. The build uses the vendored glean_parser for a number of build tasks. Since the vendored glean_parser conflicts with the globally-installed glean_sdk package, we had to add special ad-hoc handling to allow us to circumvent this conflict in bug 1655781.

  3. We begin to rely more and more on the zstandard package during build tasks, this package again being one that we can't vendor due to containing native code. Bug 1654994 contained more ad-hoc code which subprocesses out from the build system's virtualenv to the SYSTEM python3 binary, assuming that the system python3 has zstandard installed.

As we rely more on glean_sdk, zstandard, and other packages that are not vendorable, we need to settle on a standard model for how mach, the build process, and other mach commands that may make their own virtualenvs work in the presence of unvendorable packages.

With that in mind, this patch does all the following:

  1. Separate out the mach virtualenv_packages from the in-build virtualenv_packages. Refactor the common stuff into common_virtualenv_packages.txt. Add functionality to the virtualenv_packages manifest parsing to allow the build virtualenv to "inherit" from the parent by pointing to the parent's site-packages. The in-virtualenv feature from bug 1655781 is no longer necessary, so delete it.

  2. Add code to bootstrap, as well as a new mach command create-mach-environment to create virtualenvs in ~/.mozbuild.

  3. Add code to mach to dispatch either to the in-~/.mozbuild virtualenvs (or to the system Python 3 for commands which cannot run in the virtualenvs, namely bootstrap and create-mach-environment).

  4. Remove the "add global argument" feature from mach. It isn't used and conflicts with (3).

  5. Remove the --print-command feature from mach which is obsoleted by these changes.

This has the effect of allowing us to install packages that cannot be vendored into a "common" place (namely the global ~/.mozbuild virtualenvs) and use those from the build without requiring us to hit the network. Miscellaneous implementation notes:

  1. We allow users to force running mach with the system Python if they like. For now it doesn't make any sense to require 100% of people to create these virtualenvs when they're allowed to continue on with the old behavior if they like. We also skip this in CI.

  2. We needed to duplicate the global-argument logic into the mach script to allow for the dispatch behavior. This is something we avoided with the Python 2 -> Python 3 migration with the --print-command feature, justifying its use by saying it was only temporarily required until all mach commands were running with Python 3. With this change, we'll need to be able to determine the mach command from the shell script for the forseeable future, and committing to this forever with the cost that --print-command incurs (namely mach startup time, an additional .4s on my machine) didn't seem worth it to me. It's not a ton of duplicated code.

Assignee: nobody → rstewart
Status: NEW → ASSIGNED
Pushed by rstewart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eff0a199fae6
Create and require by default global `virtualenv`s in `~/.mozbuild` for `mach` r=mhentges,ahal
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch

Are multiple trees (potentially at quite different revisions) going to cause problems by sharing one venv?

Immediately, you shouldn't worry about it. At some point in the future when we start to make stronger guarantees about the state of the ~/.mozbuild virtualenvs, that is potentially a cause for concern though. I can't tell you exactly what the timetable is here (for now, I'm just rolling out the feature and seeing what feedback we get, and we'll make improvements over time as they appear to be safe bets).

The easiest workaround will be to set a per-tree MOZBUILD_STATE_PATH. If you have the MOZBUILD_STATE_PATH environment variable set, that location will be used instead of ~/.mozbuild; you can just configure one for each of your trees and you'll avoid any possible issues due to conflicting revisions. If this isn't feasible for you for any reason, then let me know your use case and we can work on alternatives.

Side note: one of my goals is to prevent SILENT errors due to virtualenv incompatibilities. So if the problem you're describing comes up the goal is to fail mach as fast as possible with a description of what went wrong and how to fix it.

I don't have a specific use case but I use multiple trees all the time (examples: working on different patches in different trees, can only reproduce something in a specific type of build like opt or debug, etc). Using a different mozbuild dir for each tree would mean all of the downloaded things in the mozbuild directory might need to be downloaded again. If there was a env var just for the virtualenv location that might be good.

For other caches in the state dir we ended up using a hash of the working directory path to ensure that state isn't shared between different repos; particularly with git worktree-based workflows people can end up with multiple checkouts on very different revisions (e.g. some trees based on autoland and some based on esr).

I also wonder why running mach create-mach-environment is necessary as a matter of course. Couldn't we detect that the state is missing and just automatically run the setup rather than erroring out (I just discovered that this change broke the wpt sync because all the mach commands just started failing).

To expand on that last part a little; mach commands are approximately public API for interacting with a gecko source tree. I think we should treat them as such and try to avoid making breaking changes where possible, and where not possible at least announce the changes in advance so that consumers have some opportunity to adapt.

After this, I was getting ModuleNotFoundError: No module named 'zstandard' during bootstrap but resolved it with removing .mozbuild/_virtualenvs and re-running.

Regressions: 1660197
Regressions: 1664058

In Ricky's earlier patch, the case in which get_command was set was
removed.
Accordingly, its usage in CommandAction will always be evaluated to
False, and it can be removed.

Comment on attachment 9175735 [details]
Bug 1656993: remove unused get_command logic from CommandAction

Revision D90198 was moved to bug 1665164. Setting attachment 9175735 [details] to obsolete.

Attachment #9175735 - Attachment is obsolete: true
Regressions: 1670167
You need to log in before you can comment on or make changes to this bug.