Closed Bug 1566427 Opened 5 years ago Closed 4 years ago

Compiler gating is still not good

Categories

(Core :: JavaScript: WebAssembly, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla76
Tracking Status
firefox76 --- fixed

People

(Reporter: lth, Assigned: lth)

References

Details

Attachments

(2 files)

Bug 1565768 (and bug 1558165 before it) came about because a predicate like IonCanCompile() is only half the story: it checks whether Ion could in principle compile wasm (or in this case asm.js) code on the platform in question. But the other half of the story is whether ion is actually enabled on the platform.

Similarly, HasCompilerSupport() checks whether some compiler is available that could in principle compile wasm code on the platform in question, but does not check whether any of those compilers are enabled/disabled by a pref, and ignores the question of whether the compiler supports the feature set needed by the code that is to be compiled. (In a number of cases the answer will be "no" for Cranelift but we know from Bugzilla traffic that people are enabling Cranelift in Firefox, presumably in Nightly.) That last bit is probably not really fixable but possibly means that we should simply disallow the use of Cranelifts in some contexts.

I see code using these predicates as though they mean something else. For example, code in TryInstantiate in AsmJS.cpp checks HasCompilerSupport(), as if that means anything when we're trying to compile asm.js, but it doesn't, not really.

We should further try to clean these up.

I think fixing this will amount to implementing fallible compilation with a well-understood fallback chain. Even when Cranelift is successful we'll probably want Ion for a long time to support 32-bit platforms and MIPS, and we should just deal with the problem.

If we assume that (a) we can always compile asm.js code as JS and (b) we can always compile non-asm.js wasm code with Rabaldr, then:

  • Odin compilation should in general become fallible with a fallback to JS (not a fallback to Rabaldr)
  • Ion and Cranelift wasm compilation should in general become fallible with a fallback to Rabaldr

In particular:

  • Odin needs to check whether Ion-for-wasm is available on the current platform (here we can assume that Ion supports the necessary feature set but not that Ion-for-wasm is available, eg for ARM64), and if not it needs to fall back to JS.
  • Baldr makes a determination about the first tier to compile for, and if this is Ion or Cranelift then it must be prepared for that tier to fail at some point during compilation with information about unsupported features or an unsupported platform, at which point we'll fall back to Rabaldr

A consequence of this is that every wasm feature will have to be supported for Rabaldr (including SIMD). This is probably OK (modulo SIMD, which is a big job). We could instead designate another compiler (Cranelift is the obvious choice) as the infallible compiler, but since we need baseline compilation for debugging it's probably just as well to make the choice that Rabaldr is it.

Assignee: nobody → lhansen

A very minor consequence of the proposed fallback chain is that some wasm code that is currently rejected by Rabaldr but allowed by Ion must probably be allowed by Rabaldr: this is code with static frame sizes > 256KB. Rabaldr rejects those programs simply because such large frames are implausible, but Ion's static limit is larger I believe.

It would also be nice to clean up the plumbing we have for feature flags, since it's a lot of plumbing per feature.

I'm going to upload my current patches for safekeeping. A couple of patterns are emerging. One is to create predicates that are more knowledgable and use those. The existing IonCanCompile() determines whether Ion in principle works on the platform, but does not take into account whether Ion is actually selected by command line switches. So there is a new predicate, IonAvailable(), which tests IonCanCompile as well as the command line switch for Ion. But Ion really is not available if debugging is selected, because only baseline can be used for debugging. So the command line switch for debugging is tested as well. (More on this below.)

Another pattern is to delegate the knowledge of feature implementation to the compiler itself. For example, HasReftypesSupport() is under #ifdef control, but in the case of ENABLE_WASM_REFTYPES, it calls HasFeature(Feature::Reftypes), which will check for all the compilers that are available whether they support the feature. HasGcSupport() is even more complicated; under ENABLE_WASM_GC it checks both that the GC feature is enabled and that HasFeature(Feature::GCTypes).

These patterns simplify the code and centralize the testing of clusters of attributes that together decide whether a feature is available, but they're not without problems. For example, to get the GC feature using this system one has to specify --wasm-gc --wasm-compiler=baseline, which is a little clunky, and it may lead to more complicated test cases. For another, some attributes are hanging off the realm and sometimes there is no realm (as when the shell is instantiated to perform a wasm compilation in a subprocess during testing). I'm still struggling to understand some of these uses, but it's clear that we can't keep having the code that we have now and something has to change, and more tightly controlling the allowable combinations is probably what we have to do, even if that means usage is more complicated for testing.

Attached file compiler-gating.bundle
Blocks: 1478632
Priority: P3 → P2

First, we introduce <Compiler>Available(JSContext*) to test for the
actual availability of a compiler, irrespective of wasm features that
are enabled. These predicates take into account whether a compiler is
compiled into the executable, whether it supports the hardware,
whether it is selected by options/switches, and whether it can be used
as a result of the runtime environment (Ion and Cranelift are not
available if the debugger is observing the page). We switch to using
these predicates most places that used <Compiler>CanCompile() or
options().wasm<Compiler>(), since those don't tell the full story.

Second, we implement a priority order of the optimizing compilers that
always prioritizes Ion before Cranelift and always makes Cranelift
unavailable if Ion is available. (This priority order may change in
the future or may become platform-dependent. Changing it is trivial,
with the above changes.) The default compiler selection in both
browser and shell remains Baseline+Ion, and to get Cranelift in the
browser, the user must both enable Cranelift and disable Ion in
about:config. (This is a change from the current code, which will
prioritize Cranelift over Ion, since Cranelift is off by default.)

Third, we rename HasCompilerSupport() as HasPlatformSupport(), since
the predicate does not test whether compilers are available, only
whether they are present in the executable and support the hardware.

The intent here is to have as few semantic changes in compiler
selection as possible, though some are inevitable in corner cases, and
those wishing to test Cranelift will run into the changes most often.
A few test cases are included here to try to ensure that things work
as they should.

My current thinking is roughly this:

(a) Compiler availability (patch just submitted) is unaware of experimental features that may or may be selected. The compiler is selected based on #ifdefs, platform, config options, and runtime environment (debugger or not). At most one baseline compiler and one optimizing compiler are selected based on a static and well-understood policy.

(b) Compilation strategy for a given module is based on the module's size and the measured attributes of the hardware, as now, but does not take into account any experimental features that may or may not be selected. We pick either tiered or optimized-only mode for the module.

(c) We require the baseline compiler to support all features any optimized compiler offers. So if we compile baseline-first (in tiered mode) it will succeed if we support the feature at all. But if the tier2 compiler then fails, we can fall back to baseline code. If we compile optimized-first (in optimized-only mode) but the optimizing compiler fails, we cab fall back to the baseline compiler by restarting the compilation on the baseline compiler. In both cases the mode for the module ends up being baseline-only. Notably we do not try the other optimized compiler even if it is present and possible on the hardware because (1) this has no utility in the field and (2) it makes testing harder, not easier.

(d) The set of features that we "offer" is the set implemented by the baseline compiler, and this is the set of features recognized by the verifier too, irrespective of which compilers that are available for compilation. This can lead to surprising results during testing: the verifier will say that the GC feature is available, but if the baseline compiler has been disabled by a command line switch then verification will pass but compilation will fail. I don't think we should try to fix this problem; it is a problem with testing, and the testing just needs to be exposed to more and better predicates. It is fine to throw a "can't compile" error.

Patches for (b) and (c) and (d) forthcoming.

Patch updated.

I want to walk back comment 7. My current thinking is that I'll just land part (a) with the additional wrinkle that compiler selection will take into account the set of experimental features enabled, and call that good for now; that cleans up a number of things and can be refined further. A fallback scheme is too complicated at this time.

A consequence of no fallback is that experimental features that are not supported by Ion cannot be enabled by default in Nightly because they would disable Ion. For example, --wasm-gc really requires the enablement switch, as would SIMD if we land baseline-only SIMD support. But that's the current situation, which is OK for now.

Attachment #9130125 - Attachment description: Bug 1566427 - Improved compiler availability computation. r?bbouvier → Bug 1566427 - Improved compiler availability computation. r=bbouvier
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/59b6e23c9926
Improved compiler availability computation.  r=bbouvier
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/110e70d237da
Backed out changeset 59b6e23c9926 for spidermonkey bustage CLOSED TREE

Error easily fixed, but led to some more testing which found another issue, which is being fixed...

Flags: needinfo?(lhansen)
Blocks: 1620986
Depends on: 1621208
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ce48cc49397b
Improved compiler availability computation.  r=bbouvier
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9d03915989aa
Backed out changeset ce48cc49397b for causing unexpected failures in python/mozbuild/mozbuild/test/configure/lint.py CLOSED TREE

Linting problem with updates to moz.configure. Worked around this by creating a vacuous dependency on nightly, though surely there must be a better way.

Flags: needinfo?(lhansen)
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1313df275548
Improved compiler availability computation.  r=bbouvier
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76
Regressions: 1621933

Technically 1621933 is not a regression, it is a bug uncovered by the changes in this patch. (I'm being pedantic.)

See Also: → 1623889
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: