Implement custom Python module importer for mozilla-central



6 years ago
6 months ago


(Reporter: gps, Unassigned)


(Blocks: 1 bug)


Firefox Tracking Flags

(Not tracked)




6 years ago
mozilla-central has an interesting usage of Python. We have a bunch of Python packages checked into the tree in python/*, build/, testing/mozbase/*, and other misc places. As part of configuring the tree, we create a virtualenv in the "object directory" (the output directory for the build). We don't do |python install| to populate the virtualenv because this results in .py files being copied into the objdir. So, changes to the source files in the tree aren't reflected unless the virtualenv is repopulated. Since people change these files all the time, this extra step would be a burden to the development workflow.

For a while we used |python develop|, which simply creates a .pth file in the virtualenv pointing back to the source directory. But, we didn't like this either because it created a bunch of needless egg-info directories in the source directory. Yuck. So, now we just create the .pth files directly, bypassing (Well, we still call in a few cases, but only where it is needed.)
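For illustration, here is a minimal sketch of writing a .pth file by hand and having Python pick it up. The temporary directories stand in for the source directory and the virtualenv's site-packages, and the names `write_pth`, `mozilla-demo`, and `pth_demo_pkg` are hypothetical, not anything from the tree.

```python
import os
import site
import sys
import tempfile


def write_pth(site_dir, name, source_dirs):
    """Write <name>.pth into site_dir, listing one source directory per line."""
    pth_path = os.path.join(site_dir, name + ".pth")
    with open(pth_path, "w") as fh:
        for d in source_dirs:
            fh.write(os.path.abspath(d) + "\n")
    return pth_path


# Throwaway directories standing in for the srcdir and the virtualenv.
src_dir = tempfile.mkdtemp(prefix="srcdir-")
fake_site = tempfile.mkdtemp(prefix="site-packages-")
with open(os.path.join(src_dir, "pth_demo_pkg.py"), "w") as fh:
    fh.write("VALUE = 42\n")

write_pth(fake_site, "mozilla-demo", [src_dir])

# addsitedir() processes *.pth files in the directory, as happens for
# site-packages at interpreter startup.
site.addsitedir(fake_site)

import pth_demo_pkg  # resolved through the path added by the .pth file
```

Nothing is copied in this sketch: the module is imported straight from the source directory, so edits there are visible on the next run.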

The biggest remaining problem is that as part of running Python, our source directory gets polluted with pyc files. For those who don't know Python, when Python's default module importer runs, it caches the bytecode output of the parser to a .pyc file in the same directory as the source .py file. When it imports a module, if the pyc file exists and is not older than the source .py file, the pyc file is loaded and parser overhead is skipped. Anyway, we don't like .pyc files in the source tree because of pollution and because they occasionally lead to breakage. For example, if you rename a file and you try to import under the old name and a .pyc for the old name is present, Python will happily import it. Doh!
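The rename problem can be reproduced in a few lines. This is a sketch assuming Python 3, where a .pyc sitting directly next to the (former) source location is still importable without its source; `old_name` and `new_name` are hypothetical module names, and `py_compile` is used to recreate the Python 2 layout explicitly.

```python
import importlib
import os
import py_compile
import sys
import tempfile

src_dir = tempfile.mkdtemp(prefix="stalepyc-")
old_source = os.path.join(src_dir, "old_name.py")
with open(old_source, "w") as fh:
    fh.write("WHO = 'old_name'\n")

# Python 2 wrote old_name.pyc next to the source; reproduce that layout.
py_compile.compile(old_source,
                   cfile=os.path.join(src_dir, "old_name.pyc"),
                   doraise=True)

# "Rename" the module: the source moves, the stale .pyc stays behind.
os.rename(old_source, os.path.join(src_dir, "new_name.py"))

sys.path.insert(0, src_dir)
importlib.invalidate_caches()

import old_name  # still succeeds, served from the stale old_name.pyc
```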

Unfortunately, Python provides very little control over how pyc files are managed. In Python 2.7+, you have the option to disable bytecode/pyc file writing via PYTHONDONTWRITEBYTECODE/sys.dont_write_bytecode. This could be considered if there isn't an adverse performance impact. AFAIK nobody has measured this to rule this out. But, this really feels like using a chainsaw to perform surgery. In Python 3.x, the semantics of pyc files change (they are written to a __pycache__ directory), but they are still written alongside the source files, so the same problems apply.
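As a quick sketch of the "chainsaw" option, the following sets sys.dont_write_bytecode before an import and checks that no bytecode was left next to the source. It assumes Python 3 (for `importlib.invalidate_caches`); the module name `nobytecode_demo` is hypothetical. Setting PYTHONDONTWRITEBYTECODE in the environment has the same effect for the whole process.

```python
import importlib
import os
import sys
import tempfile

src_dir = tempfile.mkdtemp(prefix="nobytecode-")
with open(os.path.join(src_dir, "nobytecode_demo.py"), "w") as fh:
    fh.write("ANSWER = 1\n")

sys.path.insert(0, src_dir)
sys.dont_write_bytecode = True  # must be set before the import happens
importlib.invalidate_caches()

import nobytecode_demo

# Neither a .pyc file nor a __pycache__ directory should have appeared.
leftovers = [name for name in os.listdir(src_dir)
             if name.endswith(".pyc") or name == "__pycache__"]
```

The cost is that every process re-parses every module it imports, which is the performance question that would need measuring.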

Anyway, it appears the only way to work around the limitations in Python while satisfying our development workflow will be to change Python's default module importing behavior to write pyc files to a separate directory tree. Fortunately, we seem to have some control over this.

PEP 302 defines import hooks where we can register a custom module finder and loader. In theory, we should be able to register one in our virtualenv which handles all imports, effectively replacing the built-in importer. Alternatively, we could monkeypatch __builtin__.__import__. But, I /think/ we can accomplish everything via import hooks.
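To make the idea concrete, here is a rough sketch of such a hook, written against Python 3's importlib rather than the Python 2 machinery this bug was filed against. It is not a worked-out design: `SrcTreeFinder`, `RedirectingLoader`, and the directory names are hypothetical, and it relies on the standard source loader routing bytecode reads and writes through `get_data`/`set_data`. The finder delegates the search to the built-in path finder and only swaps the loader for plain source files under the source root.

```python
import importlib.abc
import importlib.machinery
import importlib.util
import os
import sys
import tempfile


class RedirectingLoader(importlib.machinery.SourceFileLoader):
    """Source loader that keeps bytecode under cache_root, not next to the source."""

    def __init__(self, fullname, path, src_root, cache_root):
        super().__init__(fullname, path)
        self.src_root = src_root
        self.cache_root = cache_root

    def _redirect(self, path):
        # Only bytecode paths are moved; reads of the source are untouched.
        if not path.endswith(".pyc"):
            return path
        rel = os.path.relpath(os.path.dirname(self.path), self.src_root)
        return os.path.normpath(
            os.path.join(self.cache_root, rel, os.path.basename(path)))

    def get_data(self, path):
        return super().get_data(self._redirect(path))

    def set_data(self, path, data, **kwargs):
        return super().set_data(self._redirect(path), data, **kwargs)


class SrcTreeFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that claims source modules living under src_root."""

    def __init__(self, src_root, cache_root):
        self.src_root = os.path.abspath(src_root)
        self.cache_root = os.path.abspath(cache_root)

    def find_spec(self, fullname, path=None, target=None):
        spec = importlib.machinery.PathFinder.find_spec(fullname, path, target)
        if spec is None or type(spec.loader) is not importlib.machinery.SourceFileLoader:
            return None  # extensions, zips, etc. go to the remaining finders
        origin = os.path.abspath(spec.origin)
        if not origin.startswith(self.src_root + os.sep):
            return None
        spec.loader = RedirectingLoader(
            fullname, origin, self.src_root, self.cache_root)
        spec.cached = spec.loader._redirect(
            importlib.util.cache_from_source(origin))
        return spec


# Demo: temporary directories stand in for the srcdir and the objdir.
src_root = tempfile.mkdtemp(prefix="srcdir-")
cache_root = tempfile.mkdtemp(prefix="objdir-pyc-")
with open(os.path.join(src_root, "hook_demo_mod.py"), "w") as fh:
    fh.write("GREETING = 'hello'\n")

sys.dont_write_bytecode = False  # the demo needs bytecode writing enabled
sys.path.insert(0, src_root)
sys.meta_path.insert(0, SrcTreeFinder(src_root, cache_root))

import hook_demo_mod

src_entries = sorted(os.listdir(src_root))
cached = [n for n in os.listdir(cache_root) if n.endswith(".pyc")]
```

Because the search itself is left to the built-in path finder, sys.path, .pth-derived entries, and C extension modules keep their normal behavior; only the bytecode location changes for files under the source root.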

The trick to implementing a finder and loader that replaces the built-in one is ensuring it covers all the cases we need it to. The built-in importer has support for zip files and other arcane functionality. However, I /think/ that we don't need all this extra support. mozilla-central relies on the core Python distribution and whatever is in mozilla-central. So, I /think/ this means we only need to worry about:

1) Importing .py files from traversing a directory
2) Following .pth files
3) Discovering and importing C extension modules when appropriate
4) Honoring sys.path

I've never looked at the full logic of the built-in importer, so I could be horribly naive. FWIW, Python 3.3 moves the default importer into a pure Python module, so looking at the source of Python 3.3 for deciphering behavior might be easier than looking at C.


6 months ago
Product: Core → Firefox Build System