Open Bug 1593778 Opened 5 years ago Updated 2 years ago

[mozbuild] Consolidate virtualenv handling code behind a common interface

Categories

(Firefox Build System :: General, enhancement)

enhancement

Tracking

(Not tracked)

People

(Reporter: mars, Unassigned)

Details

We are currently running 4 separate virtualenvs with two virtualenv management systems and two interpreters. This state of the world will persist until the entire source tree's Python 2 and Pipenv code is changed. The Python interface to these virtualenvs in our code would be easier to use if virtualenv use and management was consolidated behind a single interface. A new ProjectInterpreters object accessible as a property of MozbuildObject and mach could do the job.

Some of the virtualenv code's callers want to ensure specific virtualenvs are configured, some callers want to activate virtualenvs, and some callers want to query specific virtualenvs or the currently running interpreter. We also want to re-use virtualenvs whenever possible to save mach execution time and ensure consistent repeatable builds. However, the code that handles these responsibilities is scattered across MozbuildObject, init.configure, mozbuild.virtualenv, mozbuild.python_utils, and mozbuild.python.MachCommands.run_python_tests(). This makes re-use and maintenance challenging. Many of these clusters of virtualenv activity in the codebase overlap, too, making it challenging to find the right place to add new functionality or fix bugs.

A ProjectInterpreters object could help solve these problems. It could be a consolidated interface for mach and mozbuild to manage Pipenv, Python 2, and Python 3.

Some potential benefits:

  • ProjectInterpreters would become a single source of truth for virtualenv names and paths. The special names init and init_py3 would be contained in ProjectInterpreters instead of hard-coded and manipulated across the codebase.
  • The multiple methods and properties of MozbuildObject that handle the different kinds of virtualenvs could move to the ProjectInterpreters object, simplifying MozbuildObject.
  • Code that has to pick a specific interpreter could ask ProjectInterpreters for the necessary information. E.g. project_interpreters.py2, project_interpreters.py3, project_interpreters.testrunner or project_interpreters.running. Once the code has a reference to the virtualenv it wants it can .ensure() the virtualenv exists, ensure it's up to date, activate it, etc.
  • Future changes to the test and build infrastructure to, e.g., change how test virtualenvs are handled (Pipenv) or change the Python version would be contained within the ProjectInterpreters object.

The ProjectInterpreters object could also serve as a consolidated point for the "minimum Python version" magic number. I see the value 3.5.0 scattered through the codebase in various representations (string, tuple, LooseVersion, StrictVersion). It would be nice to consolidate that magic constant at a single location in the codebase.

That change would make a good follow-up bug.

@chmanchester @ahal I'd appreciate your feedback.

Flags: needinfo?(cmanchester)
Flags: needinfo?(ahal)

I don't really have any objections, though I would note a few things:

  1. Pipenv was a hack from a different time and both :davehunt and I would like to get rid of it. Rather than building infrastructure that tries to manage it nicely, I'd recommend blocking on removing it from the tree and using pip-tools + regular virtualenvs instead.

  2. I've been thinking recently that we should abolish init and init_py3 completely. If commands need to install additional out-of-tree packages, then they should create an entirely new virutalenv specific to that command. Trying to share a single virtualenv across commands is a recipe for trouble. Fyi the majority of commands don't activate either init or init_py3.

  3. It feels a little like ProjectInterpreters is in a similar space as pyenv. I wonder if we could use that or some other existing tool to manage the legwork. I know :glandium was adamant that we couldn't rely on pyenv in bootstrapping, but I don't remember why (e.g assuming we vendor it).

But yeah, our Python management could definitely use an overhaul.

Flags: needinfo?(ahal)

(In reply to Andrew Halberstadt [:ahal] from comment #3)

  1. I've been thinking recently that we should abolish init and init_py3 completely. If commands need to install additional out-of-tree packages, then they should create an entirely new virutalenv specific to that command. Trying to share a single virtualenv across commands is a recipe for trouble. Fyi the majority of commands don't activate either init or init_py3.

The init (and soon, init_py) virtualenvs are also used during the build. The PYTHON and PYTHON3 substs point to those virtualenvs, and they are used to run python scripts for GENERATED_FILES and such. What do you suggest we do for that if those virtualenvs go away?

I'd suggest we have a virtualenv called build and make the build system use that instead. Right now any command can call self._activate_virtualenv() and the init venv is activated. My proposal would be to instead call self._activate_virtualenv("<name>"). Basically remove the default venv.

I'm fine with the proposal overall, although I'll note there may be significant reasons for disparities we haven't accounted for yet, so this might be a rather large undertaking. For instance, the configure process doesn't have easy access to a MozbuildObject and may want to call into standalone utility functions. Overall I'd like to see us move decisively in the direction of having fewer special situations to worry about rather than expend effort on them, i.e., getting everything on Python 3 rather than working on the ergonomics of Python 3 and Python 2 co-existing. I'm also partial to Andrew's suggestion of using standard tools wherever possible.

Flags: needinfo?(cmanchester)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.