Closed
Bug 1279312
Opened 8 years ago
Closed 8 years ago
Baldr: Use a hidden argument register for a TLS pointer
Categories
(Core :: JavaScript Engine: JIT, defect, P1)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
FIXED
mozilla50
Tracking | Status | |
---|---|---|
firefox50 | --- | fixed |
People
(Reporter: jolesen, Assigned: jolesen)
References
Details
Attachments
(6 files, 3 obsolete files)
23.46 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
1.47 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
4.69 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
10.00 KB,
patch
|
bhackett1024
:
review+
|
Details | Diff | Splinter Review |
1.80 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
7.85 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
WebAssembly modules need a per-thread per-module TLS data structure that can hold information like: - The current stack limit. - Pointer to global variables. - Heap size. - Heap base. - Wasm TLS variables for the module. This allows us to run multiple instances of the same module without having to specialize the generated code to each instance. The TLS pointer can be passed to compiled wasm functions in a designated ABI register. If the register allocator known that this register is not clobbered by a function call, it is not necessary to reserve the register. It would be permitted to spill the TLS pointer between function calls, making more registers available for computing stuff.
Assignee | ||
Comment 1•8 years ago
|
||
When enabled with CompileArgs::useTLSPointerArgument, WebAssembly functions expect to be passed a hidden argument in WasmTLSPointerReg which is a pointer to a TLSData struct. Temporarily allocate a TLSData instance in the wasm::Module itself. When wasm supports multithreading, we will need to allocate a TLSData instance per thread per module. This patch generates code to pass the TLS pointer to WebAssembly functions, preserving it through intra-module calls. The pointer is not used for anything yet, and the the TLS pointer register is not currently preserved across function calls.
Attachment #8761776 -
Flags: feedback?(luke)
Assignee | ||
Comment 2•8 years ago
|
||
When enabled with CompileArgs::useTlsStackLimit, get the stack limit from TlsData::stackLimit instead of SymbolicAddress::StackLimit. Since the TLS pointer register is available at every function prologue, the over-recursion check is the same cost as using the statically linked address. Initialize the TlsData stack limit during Module::staticallyLink. We probably need to update this limit if the context's stack limit is modified later.
Attachment #8761777 -
Flags: feedback?(luke)
Assignee | ||
Comment 3•8 years ago
|
||
Luke, this is just a prototype. It's passing a TLS pointer around in the right register, but none of the register allocator optimizations are there yet. Note that we're making a copy of the stack limit in staticallyLink. If the stack limit is changes later, the wasm modules won't notice. Do we need a hook to update all the copies of the stack limit? I want to avoid a double indirection in every function prologue, hence all the copies.
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → jolesen
Comment 4•8 years ago
|
||
Comment on attachment 8761776 [details] [diff] [review] Pass a TLS pointer hidden argument to WebAssembly functions. Review of attachment 8761776 [details] [diff] [review]: ----------------------------------------------------------------- Agreed with the strategy here, looks good. ::: js/src/asmjs/WasmStubs.cpp @@ +136,5 @@ > + // This is the third argument in the EntryFuncPtr prototype. > + if (usesTlsPointer) { > +#if defined(JS_CODEGEN_X86) > + masm.loadPtr( > + Address(masm.getStackPointer(), EntryFrameSize + masm.framePushed() + 2 * sizeof(void*)), Pre-existing, b/c we already do this for argv, but we really should be using ABIArgIter (like below) to determine where and how to load TLS reg from. Also, it might be good to move this loading of the TLS reg to right before the call so that we don't have to depend on TLSReg being disjoint from ABINonArgReturnReg(1,2) (which are used as scratch registers before the call). ::: js/src/asmjs/WasmTypes.h @@ +774,5 @@ > bool useSignalHandlersForInterrupt; > > + // Do WebAssembly functions take a hidden TLS pointer argument in > + // `WasmTlsPointerReg`? > + bool useTlsPointerArgument; Totally fine to keep this in the patch for testing but I think by the time we'd land it we'd simply have it always on. Or is there a platform dependency that might disable it? I can't think of one.
Attachment #8761776 -
Flags: feedback?(luke) → feedback+
Comment 5•8 years ago
|
||
Comment on attachment 8761777 [details] [diff] [review] Use TLS pointer for the stack limit check. Review of attachment 8761777 [details] [diff] [review]: ----------------------------------------------------------------- Cool, makes sense. ::: js/src/asmjs/WasmTypes.h @@ +779,5 @@ > > + // Should we use the stackLimit in TlsData when checking for over-recursion, > + // or the global AddressOf(StackLimit)? > + // This requires useTlsPointerArgument to be true. > + bool useTlsStackLimit; Same basic point of, for landing, let's have it always on. ::: js/src/jit/CodeGenerator.cpp @@ +8824,5 @@ > + if (useTlsStackLimit) { > + // Get the per-thread stack limit from the `TlsData` struct pointed > + // to by the `WasmTlsPointerReg` hidden argument register. > + masm.branchPtr(Assembler::AboveOrEqual, > + Address(WasmTlsPointerReg, offsetof(wasm::TlsData, stackLimit)), We're also depending here on WasmTlsPointerReg being disjoint from the regs which are clobbered inside GenerateFunctionPrologue (ABINonArgReg(0,1)). In general, we try to informally capture these constraints by putting the necessarily-disjoint registers in Assembler-*.h. See, e.g., WasmTableCall*Reg. I'm thinking maybe the best way is to add WasmTlsPointerReg as a sibling to ABINonArgReg(0,1) since that's the constraint here. Also, perhaps WasmTlsReg would be a shorter-yet-still-accurate name?
Attachment #8761777 -
Flags: feedback?(luke) → feedback+
Assignee | ||
Comment 6•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #4) > Comment on attachment 8761776 [details] [diff] [review] > Pass a TLS pointer hidden argument to WebAssembly functions. > ::: js/src/asmjs/WasmStubs.cpp > @@ +136,5 @@ > > + // This is the third argument in the EntryFuncPtr prototype. > > + if (usesTlsPointer) { > > +#if defined(JS_CODEGEN_X86) > > + masm.loadPtr( > > + Address(masm.getStackPointer(), EntryFrameSize + masm.framePushed() + 2 * sizeof(void*)), > > Pre-existing, b/c we already do this for argv, but we really should be using > ABIArgIter (like below) to determine where and how to load TLS reg from. That would be nice, yes. > Also, it might be good to move this loading of the TLS reg to right before > the call so that we don't have to depend on TLSReg being disjoint from > ABINonArgReturnReg(1,2) (which are used as scratch registers before the > call). Since the TLS pointer is arriving in an argument register, we have to capture it before setting up arguments for the call. IntArgReg2 would be clobbered otherwise. > ::: js/src/asmjs/WasmTypes.h > @@ +774,5 @@ > > bool useSignalHandlersForInterrupt; > > > > + // Do WebAssembly functions take a hidden TLS pointer argument in > > + // `WasmTlsPointerReg`? > > + bool useTlsPointerArgument; > > Totally fine to keep this in the patch for testing but I think by the time > we'd land it we'd simply have it always on. Or is there a platform > dependency that might disable it? I can't think of one. No, it seems to work an all platforms. I thought you wanted a dynamic switch for this? Anyway, it's useful for performance testing.
Assignee | ||
Comment 7•8 years ago
|
||
WebAssembly functions now expect to be passed a hidden argument in WasmTlsReg which is a pointer to a TlsData struct. Temporarily allocate a TlsData instance in the wasm::Instance itself. When wasm supports multithreading, we will need to allocate a TlsData instance per thread per module instance. This patch generates code to pass the TLS pointer to WebAssembly functions, preserving it through intra-module calls. The pointer is not used for anything yet, and the the TLS pointer register is not currently preserved across function calls.
Attachment #8771112 -
Flags: review?(luke)
Assignee | ||
Updated•8 years ago
|
Attachment #8761776 -
Attachment is obsolete: true
Assignee | ||
Comment 8•8 years ago
|
||
Get the stack limit from TlsData::stackLimit instead of SymbolicAddress::StackLimit. Since the TLS pointer register is available at every function prologue, the over-recursion check is the same cost as using the statically linked address.
Attachment #8771113 -
Flags: review?(luke)
Assignee | ||
Updated•8 years ago
|
Attachment #8761777 -
Attachment is obsolete: true
Assignee | ||
Comment 9•8 years ago
|
||
WebAssembly functions take a TLS pointer argument and now ensure that the WasmTlsReg register has the same value when they return. This is not yet exploited by the register allocator which still thinks that all registers are clobbered by function calls.
Attachment #8771115 -
Flags: review?(bbouvier)
Assignee | ||
Comment 10•8 years ago
|
||
- Add a virtual isCallPreserved() method to LNode which allows a call instruction to indicate that it preserves the values of some registers. Use this hook in BacktrackingAllocator when processing a call instruction. - Add a preservesTlsReg() property to MAsmJSCall and use this to implement the LAsmJSCall::isCallPreserved() method. - Mark intra-module WebAssembly calls as preserving the TLS pointer register. This change allows the backtracking register allocator to leave the TLS pointer register alone in small functions that don't need it for something else. There are probably more improvements to be done if we need to split the live range of the TLS pointer register. For example, BacktrackingAllocator::splitAcrossCalls() will still split that live range at all calls.
Attachment #8771116 -
Flags: review?(bhackett1024)
Assignee | ||
Comment 11•8 years ago
|
||
Treat WasmTlsReg as a pinned register in code generated by the baseline compiler. This ensures that the register is not clobbered by any Wasm functions. Use the stack limit stored in TlsData instead of the StackLimit symbolic address in baseline prologues.
Attachment #8771117 -
Flags: review?(lhansen)
Assignee | ||
Comment 12•8 years ago
|
||
This symbolic address is not used any longer. It has been replaced by the stackLimit entry in the TlsData struct.
Attachment #8771118 -
Flags: review?(luke)
Updated•8 years ago
|
Attachment #8771116 -
Flags: review?(bhackett1024) → review+
Comment 13•8 years ago
|
||
Comment on attachment 8771117 [details] [diff] [review] Use WasmTlsReg in baseline Wasm compiler. Review of attachment 8771117 [details] [diff] [review]: ----------------------------------------------------------------- Could you add a little prose to the block at the top of the file that defines ScratchRegX86 to note that in addition to not being one of the WasmTableCall registers, ScratchRegX86 should also not be WasmTlsReg? ::: js/src/asmjs/WasmBaselineCompile.cpp @@ +6279,5 @@ > #endif > + // In the baseline compiler we treat WasmTlsReg as a pinned register. > + // In the optimizing compiler it is not pinned, except as a fixed hidden > + // argument register. > + availGPR_.take(WasmTlsReg); So this might be OK for right now but once we add 64-bit support on x86 it may actually cause us to be register-starved since we'll need two registers for every operand. Right now there are six registers available on x86 (after the scratch register was taken out), giving us three pairs to play with. If dropping down to five causes a lot of turbulence in the code generation it will probably be better to save on entry and restore on call-outs. That said, it may be that scratchRegX86 can be part of a scratch-register pair and that we only need five registers to have three pairs, so we'll see. (Hannes is working on 64-bit support.)
Attachment #8771117 -
Flags: review?(lhansen) → review+
Comment 14•8 years ago
|
||
Comment on attachment 8771115 [details] [diff] [review] Preserve TLS pointer register across calls. Review of attachment 8771115 [details] [diff] [review]: ----------------------------------------------------------------- This looks good, but I'm not too sure to understand why this is needed, if the register allocator knows that it must not clobber the TLS register? Won't this force reloads of the TLS register at the end of intra-function calls? Or is it because as calls will split live ranges, then the TLS register can be dead after a call till the end of a function? function g() {} function f() { // TLS alive g(); // TLS dead // MAsmJSVoidReturn in f() makes TLS live again? } ::: js/src/jit/Lowering.cpp @@ +4136,5 @@ > MOZ_CRASH("Unexpected asm.js return type"); > + > + // Preserve the TLS pointer we were passed in `WasmTlsReg`. > + MDefinition* tlsPtr = ins->getOperand(1); > + if (tlsPtr->type() == MIRType::Pointer) In which cases can't it be a pointer? It seems to be it could either be not defined (tlsPtr == nullptr) or the type be MIRType::Pointer. @@ +4149,5 @@ > + auto* lir = new(alloc()) LAsmJSVoidReturn; > + > + // Preserve the TLS pointer we were passed in `WasmTlsReg`. > + MDefinition* tlsPtr = ins->getOperand(0); > + if (tlsPtr->type() == MIRType::Pointer) ditto
Attachment #8771115 -
Flags: review?(bbouvier) → review+
Assignee | ||
Comment 15•8 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #13) > Comment on attachment 8771117 [details] [diff] [review] > Use WasmTlsReg in baseline Wasm compiler. > > Review of attachment 8771117 [details] [diff] [review]: > ----------------------------------------------------------------- > > Could you add a little prose to the block at the top of the file that > defines ScratchRegX86 to note that in addition to not being one of the > WasmTableCall registers, ScratchRegX86 should also not be WasmTlsReg? Good idea, will do. > ::: js/src/asmjs/WasmBaselineCompile.cpp > @@ +6279,5 @@ > > #endif > > + // In the baseline compiler we treat WasmTlsReg as a pinned register. > > + // In the optimizing compiler it is not pinned, except as a fixed hidden > > + // argument register. > > + availGPR_.take(WasmTlsReg); > > So this might be OK for right now but once we add 64-bit support on x86 it > may actually cause us to be register-starved since we'll need two registers > for every operand. Right now there are six registers available on x86 > (after the scratch register was taken out), giving us three pairs to play > with. If dropping down to five causes a lot of turbulence in the code > generation it will probably be better to save on entry and restore on > call-outs. That said, it may be that scratchRegX86 can be part of a > scratch-register pair and that we only need five registers to have three > pairs, so we'll see. Maybe the TLS pointer could be treated as a special kind of local on x86?
Assignee | ||
Comment 16•8 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #14) > Comment on attachment 8771115 [details] [diff] [review] > Preserve TLS pointer register across calls. > > Review of attachment 8771115 [details] [diff] [review]: > ----------------------------------------------------------------- > > This looks good, but I'm not too sure to understand why this is needed, if > the register allocator knows that it must not clobber the TLS register? > Won't this force reloads of the TLS register at the end of intra-function > calls? Or is it because as calls will split live ranges, then the TLS > register can be dead after a call till the end of a function? The idea is to keep the TLS pointer in an SSA value between calls instead of pinning the register everywhere. In code with dense calls, this generates the same machine code as pinning the register once the register allocator understands that the TLS register is preserved by function calls. It will simply choose WasmTlsReg for that live range. In computational code with high register pressure and no calls, however, the register allocator can split the range of the TLS pointer SSA value and spill it to the stack. This frees up the register where it's needed for something else. So yes, if the body of a function has high register pressure, the TLS register will be used for computations and there will be a restore of the TLS pointer before returning. But if the register pressure is low, no spills or even move instructions are needed since the TLS pointer is already in the right register for returning. > > + if (tlsPtr->type() == MIRType::Pointer) > > In which cases can't it be a pointer? It seems to be it could either be not > defined (tlsPtr == nullptr) or the type be MIRType::Pointer. Good catch. This code was left over from an earlier version of the patch where the use of a TLS pointer was controlled by a flag. I'll just delete the 'if'.
Assignee | ||
Comment 17•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4f1a23b3ac50
Comment 18•8 years ago
|
||
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #15) > (In reply to Lars T Hansen [:lth] from comment #13) > > Comment on attachment 8771117 [details] [diff] [review] > > Use WasmTlsReg in baseline Wasm compiler. > > > > Review of attachment 8771117 [details] [diff] [review]: > > ----------------------------------------------------------------- > > Maybe the TLS pointer could be treated as a special kind of local on x86? Yes, I think so too. The medium-term plan for the baseline compiler is to cache locals in registers in basic blocks, and to "pre-cache" (ie not eagerly spill) incoming register arguments, and that would work just fine for the TLS register.
Updated•8 years ago
|
Priority: -- → P1
Updated•8 years ago
|
Attachment #8771113 -
Flags: review?(luke) → review+
Updated•8 years ago
|
Attachment #8771118 -
Flags: review?(luke) → review+
Comment 19•8 years ago
|
||
Comment on attachment 8771112 [details] [diff] [review] Pass a TLS pointer hidden argument to WebAssembly functions. Review of attachment 8771112 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/asmjs/WasmInstance.cpp @@ +422,5 @@ > +void > +Instance::updateStackLimit(JSContext* cx) > +{ > + // Capture the stack limit for cx's thread. > + tlsData_.stackLimit = *(void**)cx->stackLimitAddressForJitCode(StackForUntrustedScript); Looking at Gecko again, I think it's fine to set this once, in the Instance() constructor, passing a JSContext* arg. ::: js/src/asmjs/WasmIonCompile.cpp @@ +1784,5 @@ > const FuncImportGenDesc& funcImport = f.mg().funcImports[funcImportIndex]; > const Sig& sig = *funcImport.sig; > > FunctionCompiler::CallArgs args(f, lineOrBytecode); > + if (!EmitCallArgs(f, sig, false, &args)) Hmm, I'd like to understand why the TLS pointer isn't passed for indirect calls. Currently, in trunk, call_indirect always stays intra-module. Even when inter-module support is added, my understanding is that we'll want the TLS reg live in WasmTlsReg at the callsite so, right before hand, we can push it, load the callee's TLS reg, load the callee's HeapReg/GlobalReg (hopefully not for much longer...), do the call, then after pop the TLS reg (preserving the same call invariant as normal calls) and restore HeapReg/GlobalReg. I also don't understand how, if we don't pass the TLS reg, we don't crash: the callee is expecting WasmTlsReg to be live on entry, right? ::: js/src/asmjs/WasmStubs.cpp @@ +132,5 @@ > if (usesHeap) > masm.loadAsmJSHeapRegisterFromGlobalData(); > > +// Put the per-thread, per-module TLS pointer into WasmTlsReg. > +// This is the third argument in the ExportFuncPtr prototype. nit: comment needs indent
Comment 20•8 years ago
|
||
Comment on attachment 8771112 [details] [diff] [review] Pass a TLS pointer hidden argument to WebAssembly functions. Review of attachment 8771112 [details] [diff] [review]: ----------------------------------------------------------------- Oops, I was misreading "EmitCallImport" as "EmitCallIndirect", hence the massive confusion. Yes, this makes great sense, fantastic patch, thanks!
Attachment #8771112 -
Flags: review?(luke) → review+
Assignee | ||
Comment 21•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=91c496cccf9f
Assignee | ||
Comment 22•8 years ago
|
||
Tests are failing js\src\jit-test\tests\asm.js\testFunctionPtr.js on Windows XP 32-bit with baseline.
Assignee | ||
Comment 23•8 years ago
|
||
I can reproduce the x86 baseline crash of asm.js\testFunctionPtr.js on macOS. A baseline-generated prolog is crashing in the stack limit check because %esi (the WasmTlsReg on x86) is 0. The problem, probably: The baseline compiler's emitCallImport() does not save the pinned WasmTlsReg, and the called imported function is not guaranteed to do so either. While %esi is a callee-saved register in the C calling convention, JIT code does not preserve it. So we need a mechanism for allocating a stack slot for the TLS pointer so it can be saved during import calls. I may as well bite the bullet and treat it as a special kind of local as discussed above.
Comment 24•8 years ago
|
||
Jakob, can't you use the mechanism that is already there that is used for saving HeapReg and GlobalReg on other architectures?
Assignee | ||
Comment 25•8 years ago
|
||
Allocate an additional slot in localInfo_ and use it to save the incoming TLS pointer. When setting up arguments for a function call, get the TLS pointer from that local slot. Also preserve the TLS pointer register by reloading it before returning. This makes the Baseline ABI compatible with the Ion ABI.
Attachment #8773445 -
Flags: review?(lhansen)
Assignee | ||
Updated•8 years ago
|
Attachment #8771117 -
Attachment is obsolete: true
Assignee | ||
Comment 26•8 years ago
|
||
Lars: The new baseline patch simply stores the TLS pointer in a local slot and reloads it when needed. I didn't allocate a slot in the stk_ stack to avoid interfering with wasm semantics. I think if we want better register allocation for the TLS pointer, we should push it onto stk_ before anything else, and leave it there. I guess this is very similar to what you wanted to do for the normal incoming register arguments. The TLS pointer should be able to participate in these future register allocation optimizations. Additionally, sync() could be taught that intra-module calls don't clobber WasmTlsReg.
Assignee | ||
Comment 27•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2221cda52dff
Assignee | ||
Comment 28•8 years ago
|
||
Comment on attachment 8773445 [details] [diff] [review] Use WasmTlsReg in baseline Wasm compiler. Moving review to Luke since Lars is busy this week.
Attachment #8773445 -
Flags: review?(lhansen) → review?(luke)
Comment 29•8 years ago
|
||
Comment on attachment 8773445 [details] [diff] [review] Use WasmTlsReg in baseline Wasm compiler. Review of attachment 8773445 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/asmjs/WasmBaselineCompile.cpp @@ +6656,5 @@ > > const ValTypeVector& args = func_.sig().args(); > > + // localInfo_ contains an entry for every local in locals_, followed by > + // entries for speical locals. Currently the only special local is the TLS s/speical/special/
Attachment #8773445 -
Flags: review?(luke) → review+
Assignee | ||
Comment 30•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/29f104c4c3ce6d8b89d6cead50a2bdab8de42731 Bug 1279312 - Pass a TLS pointer hidden argument to WebAssembly functions. r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/866d7307db91fd01cd26b50aba25b8396d716d93 Bug 1279312 - Use TLS pointer for the stack limit check. r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/6816fc6e9b61b38d2558e672fe9708bf1e4a6a9c Bug 1279312 - Preserve TLS pointer register across calls. r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/e3c513c3208856343feedd686a1997f9832f17fc Bug 1279312 - Handle call-preserved registers in register allocator. r=bhackett https://hg.mozilla.org/integration/mozilla-inbound/rev/d1b51032934a3afa50ac756ae26700084d11f9d3 Bug 1279312 - Use WasmTlsReg in baseline Wasm compiler. r=lth https://hg.mozilla.org/integration/mozilla-inbound/rev/ededa984c0bd6a1ede20619c9ba9c9373daf94a1 Bug 1279312 - Remove SymbolicAddress::StackLimit. r=luke
Comment 31•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/29f104c4c3ce https://hg.mozilla.org/mozilla-central/rev/866d7307db91 https://hg.mozilla.org/mozilla-central/rev/6816fc6e9b61 https://hg.mozilla.org/mozilla-central/rev/e3c513c32088 https://hg.mozilla.org/mozilla-central/rev/d1b51032934a https://hg.mozilla.org/mozilla-central/rev/ededa984c0bd
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox50:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in
before you can comment on or make changes to this bug.
Description
•