Closed Bug 1746390 Opened 3 years ago Closed 3 years ago

"pip check" is fatal when building >=firefox-95

Categories

(Firefox Build System :: Mach Core, defect, P3)

Firefox 95
defect

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1755516

People

(Reporter: juippis, Unassigned)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0

Steps to reproduce:

Try to build Firefox from source on a system where pip is installed and "pip check" does not pass.

Actual results:

Build failed with:
Traceback (most recent call last):
  File "/var/tmp/portage/www-client/firefox-95.0/work/firefox-95.0/./mach", line 167, in <module>
    main(sys.argv[1:])
  File "/var/tmp/portage/www-client/firefox-95.0/work/firefox-95.0/./mach", line 159, in main
    mach = check_and_get_mach(os.path.dirname(os.path.realpath(__file__)))
  File "/var/tmp/portage/www-client/firefox-95.0/work/firefox-95.0/./mach", line 146, in check_and_get_mach
    return load_mach(dir_path, mach_path)
  File "/var/tmp/portage/www-client/firefox-95.0/work/firefox-95.0/./mach", line 134, in load_mach
    return mach_initialize.initialize(dir_path)
  File "/var/tmp/portage/www-client/firefox-95.0/work/firefox-95.0/build/mach_initialize.py", line 291, in initialize
    _activate_python_environment(topsrcdir)
  File "/var/tmp/portage/www-client/firefox-95.0/work/firefox-95.0/build/mach_initialize.py", line 233, in _activate_python_environment
    raise Exception(
Exception: According to "pip check", the current Python environment has package-compatibility issues.

ERROR: www-client/firefox-95.0::gentoo failed (configure phase)

Expected results:

The whole pip check seems to be automagic, with no way to control it via the build system: if pip is found to be installed, the check runs. In my opinion, pip compatibility checking should be a configurable option in the build system.

There are a few patches that either disable pip check altogether or make its failure non-fatal:
https://git.exherbo.org/desktop.git/tree/packages/net-www/firefox/files/firefox-non-fatal-pip-check.patch
https://828999.bugs.gentoo.org/attachment.cgi?id=758715
https://828999.bugs.gentoo.org/attachment.cgi?id=758913

But I'd hope for a more permanent resolution for this issue.

The Bugbug bot thinks this bug should belong to the 'Firefox Build System::General' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → General
Product: Firefox → Firefox Build System

The whole pip check seems to be automagic, with no way to control it via the build system: if pip is found to be installed, the check runs. In my opinion, pip compatibility checking should be a configurable option in the build system.

There are a couple of ways to work around this problem:

  1. Don't set MACH_USE_SYSTEM_PYTHON=1 or MOZ_AUTOMATION=1 (I'm not sure which you're using): these environment variables tell the build system "hey, use packages from an external source", and, to avoid runtime failures, Mach runs pip check to guarantee some stability.
    • By removing the option, Mach will automatically manage its own "Python virtualenv", and will ignore potential issues with the system's environment since it's unused
  2. Fix the issues in the system environment that pip check has identified
  3. Manually create a virtualenv, pip install some packages in it, leave MACH_USE_SYSTEM_PYTHON=1 set, then invoke Mach using that virtualenv's python. pip check will still run, but it'll succeed if you set up the virtualenv right.
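
For options 2 and 3, it can help to see what "pip check" reports for the interpreter that will run Mach before starting a build. The following is only a minimal sketch: the function name pip_check and its injectable run parameter are made up for illustration and are not part of Mach. It relies only on the real "python -m pip check" command, which exits with a non-zero status when it finds incompatible or missing dependencies.

```python
import subprocess
import sys


def pip_check(python=sys.executable, run=subprocess.run):
    """Run `pip check` under the given interpreter.

    Returns (ok, report), where ok is True when pip exits with status 0.
    `run` is injectable (hypothetical convenience) so the function can be
    exercised without a real pip installation.
    """
    result = run(
        [python, "-m", "pip", "check"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode == 0, result.stdout.strip()
```

Calling pip_check("/path/to/venv/bin/python") against a manually created virtualenv (the path is a placeholder) shows whether that environment would pass the same check, and the returned report lists the conflicts to fix.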
Component: General → Mach Core

1: We use MACH_USE_SYSTEM_PYTHON=1 since python is guaranteed to be present, and in general we dislike using bundled libraries. This is a problem on source-based distributions. Trying MACH_USE_SYSTEM_PYTHON=1 MOZ_AUTOMATION=0 unfortunately doesn't pass either.

2: These happen randomly on users' machines. Obviously it'd be ideal if their pip installations were fine, but some are Python developers and conflicts are bound to happen.

3: What's the interaction between "when pip is found" and "when it's not"? Building Firefox succeeds without pip, so the system Python installation is evidently fine, and we'd always prefer that, with anything pip-related disabled. Pip is not a required dependency, so it feels really weird that it can crash the build like that.

Network access is disabled during build time, which prevents using "pip install". But I believe I saw the vendored libs being shipped. Still, the system python should be enough for us without problems.

I understand the importance of "pip check" for stability as you said, but could there be an option upstream to make it non-fatal? And/or could everything pip-related be put behind a common configure option?

in general we dislike using bundled libraries

Note that MACH_USE_SYSTEM_PYTHON=1 doesn't affect whether or not bundled/vendored Python libraries are used: it influences whether non-vendored libraries (such as native libraries like zstandard) are manually pip-installed, or fetched from the existing system. It's primarily used for the use case where network access isn't available within the build context, but packages that would be "pip-installed" are still needed.

These happen randomly on users' machines. Obviously it'd be ideal if their pip installations were fine, but some are Python developers and conflicts are bound to happen.

That's fair, but if a conflict exists, it could just as easily manifest as a byzantine error deep within the Firefox build, which would be harder to diagnose from this bug tracker :)
I wish that we could just do pip check [packages-that-we'd use], but I don't believe that's possible.

What's the interaction between "when pip is found" and "when it's not"? Building Firefox succeeds without pip, so the system Python installation is evidently fine, and we'd always prefer that, with anything pip-related disabled. Pip is not a required dependency, so it feels really weird that it can crash the build like that.

Oh, I see, this is a good point. I suppose that there are three cases which can occur:

  1. Regular usage on developer machines: optional native libraries (glean, zstandard, psutil) are pip-installed, then the build occurs
  2. Regular usage on CI: no network access, but "optional" native libraries that are mandatory in the CI context (psutil, sometimes zstandard) are pre-installed at worker creation, then the build runs and is informed to use system packages
  3. Edge case: Don't pip-install at build-time, but also don't attempt to use any of the "optional" packages from the system. This case is triggered when:
    • pip isn't installed, as you've mentioned
    • pip is installed, but none of the "optional" packages are available.

The reasoning for that last case was to "fuzzily" determine whether system packages should be used or not: if an environment provides an optional package (like zstandard), then assume that we should use the environment's package. If no such package is available (such as cases in CI where the python environment was super old), then assume that we should ignore the environment and "make do" with just vendored packages.
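
The three cases and the "fuzzy" fallback can be summarised as a small decision function. This is a rough sketch of the behaviour as described in this comment, not Mach's actual code: the function name, parameter names and return strings are all hypothetical.

```python
def native_package_source(use_system_python, pip_installed, system_optional_packages):
    """Pick where "optional" native packages would come from.

    use_system_python: True when MACH_USE_SYSTEM_PYTHON=1 or MOZ_AUTOMATION=1 is set.
    pip_installed: True when pip is available in the system environment.
    system_optional_packages: names of optional packages (e.g. zstandard,
        psutil) that the system environment already provides.
    """
    if not use_system_python:
        # Case 1: developer machine, optional packages are pip-installed.
        return "pip"
    if pip_installed and system_optional_packages:
        # Case 2: the environment provides optional packages, so assume the
        # system packages should be used. Per the thread, this is where
        # "pip check" runs and can abort the build.
        return "system"
    # Case 3 (edge case): no pip, or no optional package available, so
    # make do with vendored packages only.
    return "vendored-only"
```

On a Portage user's machine with pip and, say, psutil installed, this heuristic lands in the second branch even though the packager wanted the third, which is the mismatch discussed here.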


Anyways, that's a lot of history and fun details, but the core pieces here are:

  • The build will always have to be able to succeed while only using vendored packages (no pip install, no using system packages).
  • Mach's strategy for consciously ignoring the system python packages is insufficient: the "check if any optional package is available" technique is failing for the portage use case.

We can just expose another environment variable, but there's already some movement towards improving the way Mach uses/ignores "optional" packages at a structural level. Perhaps the solution is to have gentoo continue with its existing Firefox patches until we deploy the more elegant, maintainable solution.

We can just expose another environment variable

Was going to propose something like this, PIP_CHECK_ALLOW_FAIL=1 :)
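
To make the proposal concrete, here is a hedged sketch of how such a switch might behave. PIP_CHECK_ALLOW_FAIL is only the name suggested above and was never a real Mach variable; enforce_pip_check and its parameters are likewise hypothetical.

```python
import os
import sys

MESSAGE = (
    'According to "pip check", the current Python environment has '
    "package-compatibility issues."
)


def enforce_pip_check(check_passed, report="", environ=None):
    """Raise on a failed pip check unless PIP_CHECK_ALLOW_FAIL=1 is set.

    Returns True when the check passed, False when a failure was tolerated.
    """
    environ = os.environ if environ is None else environ
    if check_passed:
        return True
    if environ.get("PIP_CHECK_ALLOW_FAIL") == "1":
        # Downgrade the failure to a warning and let the build continue.
        print("warning: " + MESSAGE, report, sep="\n", file=sys.stderr)
        return False
    raise Exception(MESSAGE)
```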

but there's already some movement towards improving the way Mach uses/ignores "optional" packages at a structural level. Perhaps the solution is to have gentoo continue with its existing Firefox patches until we deploy the more elegant, maintainable solution.

Glad to hear it. Yes, for 95.0.1 I guess the patch will suffice. Thanks for your replies! I hope I've managed to explain our concerns about this current implementation and why it's causing us some headache.

Priority: -- → P3

The severity field is not set for this bug.
:mhentges, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mhentges)
Severity: -- → S4
Flags: needinfo?(mhentges)

In 96.0, it looks like export MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_MACH_SITE=1 currently does what we want, without patching the code.

That's a fair workaround, though I can't guarantee how long it'll remain stable for.
Ideally, an improved situation as alluded to in this comment will land before our mechanism for tracking compatible sites changes.

See Also: → 1755516

I'm going to dupe this over to bug 1755516, as the proposed MACH_NATIVE_PACKAGE_SOURCE variable should cover the portage use case :)

Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Resolution: --- → DUPLICATE
See Also: 1755516

Thank you for the heads-up and for keeping me in the loop! We're currently exporting these:

export MACH_USE_SYSTEM_PYTHON=1
export MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_MACH_SITE=1
export MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_BUILD_SITE=1

And as a last resort this was added:

diff --git a/python/mach/mach/site.py b/python/mach/mach/site.py
index 8fef9bfaf8..61c3101c11 100644
--- a/python/mach/mach/site.py
+++ b/python/mach/mach/site.py
@@ -940,6 +940,9 @@ def _assert_pip_check(topsrcdir, pthfile_lines, virtualenv_name):
     If there's an incompatibility, raise an exception and allow it to bubble up since
     it will require user intervention to resolve.
     """
+
+    return True
+
     if os.environ.get(
         f"MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_{virtualenv_name.upper()}_SITE", None
     ):

So at least pip isn't causing us failures anymore. But happy to have this solved by upstream, if possible. I'll keep an eye on the new bug, and adapt for the upcoming versions.
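
The context lines of the patch show how the per-site variable name is built from the virtualenv name, which is why both the MACH and BUILD variants are exported above. A small sketch of that lookup follows. The helper name site_asserted_compatible is hypothetical, and only the f-string is taken from the diff; what Mach does after the lookup is not shown in the patch, so the "skip the check" reading is an assumption based on the observed behaviour.

```python
import os


def site_asserted_compatible(virtualenv_name, environ=None):
    """Return True when the user has asserted compatibility for this site."""
    environ = os.environ if environ is None else environ
    # Same key construction as in the diff context above.
    key = f"MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_{virtualenv_name.upper()}_SITE"
    # Any non-empty value counts, mirroring `if os.environ.get(key, None):`.
    return bool(environ.get(key))
```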

The fix for this issue has landed here.
You should now be able to replace your patch and environment variables with:
export MACH_BUILD_PYTHON_NATIVE_PACKAGE_SOURCE=none

Let me know if that works!
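
For packagers who drive the build from a script, the switch-over might look like the sketch below. The helper mach_build_env and the OLD_WORKAROUNDS tuple are hypothetical names. The variable names themselves are the ones quoted in this bug, and "none" is the only value confirmed here.

```python
import os

# Variables exported as workarounds earlier in this bug.
OLD_WORKAROUNDS = (
    "MACH_USE_SYSTEM_PYTHON",
    "MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_MACH_SITE",
    "MACH_SYSTEM_ASSERTED_COMPATIBLE_WITH_BUILD_SITE",
)


def mach_build_env(base_env=None):
    """Return a copy of the environment with the old workarounds replaced."""
    env = dict(os.environ if base_env is None else base_env)
    for name in OLD_WORKAROUNDS:
        env.pop(name, None)
    env["MACH_BUILD_PYTHON_NATIVE_PACKAGE_SOURCE"] = "none"
    return env
```

The result can be passed as the env argument of subprocess.run(["./mach", "build"], env=mach_build_env(), check=True) from the top of the source tree.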

(In reply to Mitchell Hentges [:mhentges] 🦀 from comment #11)

The fix for this issue has landed here.
You should now be able to replace your patch and environment variables with:
export MACH_BUILD_PYTHON_NATIVE_PACKAGE_SOURCE=none

Let me know if that works!

(sorry for the late reply, but...) It does seem like our headaches with broken pip are gone now and this works :) I've removed all of our temporary workarounds and people are reporting it works, even if their pip checks aren't healthy.

Thanks a lot and big congratulations to the Mozilla team for the 100th release!
