Closed
Bug 1436811
Opened 6 years ago
Closed 6 years ago
Baldr: reset stack pointer in signal handler on Trap::StackOverflow
Categories
(Core :: JavaScript Engine: JIT, enhancement)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
DUPLICATE
of bug 1428453
People
(Reporter: jolesen, Unassigned)
Details
Currently, the over-recursion stack overflow check for wasm functions will jump to out-of-line code after each function that resets the stack pointer before jumping to the trap handler. This is to make sure the trap handler has enough stack space to run. Now that we have frame pointers everywhere, this functionality can be moved to the signal handler: prologue: push TLSReg push %rbp mov %rsp, %rbp sub #frame, %rsp stack_check: cmp %rsp, limit(TLSReg) jb good ud2 ; TrapSite(StackOverflow) good: ... SIGILL signal handler: When detecting a StackOverflow trap, pop the last frame that failed to allocate by using the frame pointer: %rsp = %rbp + 3*8 ; return address, TLS pointer, frame pointer. %rbp = *%rbp This saves precious executable memory since the stack overflow trap no longer needs its own trap stub after each function. This would also be a first step towards using stack guard pages instead of explicit stack overflow checking: When we get a SIGSEGV from a stack guard page, the signal handler can pop enough frames to make space for the trap handler to run. This is possible with a frame pointer.
Comment 1•6 years ago
|
||
I think you're right, but the change I was planning (when converting StackOverflow from an OldTrap to new-style Trap) was to just have a ud2 that is jumped over by the stack overflow branch. (New traps never have out-of-line paths.) I got hung up on converting the bad-indirect-call-signature trap (because it's within the prologue) but this stack overflow trap should be easy to convert from old to new so I can do that soon if it helps.
Reporter | ||
Comment 2•6 years ago
|
||
The question is where to we rewind the stack pointer after it overflowed. If the function failed to allocate a MB of frame, the trap handler won't be able to run without rewinding the sp.
Comment 3•6 years ago
|
||
Oh sorry, I missed that. So on linux, we use SA_ONSTACK which means we're save to run the signal handler. On OSX, we simply use a separate thread which waits on debug ports. On Windows, we apparently do nothing today, but I just discovered SetThreadStackGuarantee() which could help (seems like breakpad should be setting that regardless...). Regardless, though, I worry that we'll run into OS problems if SP shoots past the end-of-stack when we ud2. Actually, that seems like a vague concern today if a random (e.g., profiling) signal interrupts PC after the increment and before the stack-overflow check. It seems like what we should do is: - if the framePushed is small (<100 bytes or so), don't bother fixing up sp; we have a large amount of space before the actual end of native stack - if the framePushed is large, do the more expensive `(tls->stackLimit - sp) > framePushed)` test *before* incrementing sp, and fault if this fails Thus, there is never any fixup and probably the large-framePushed case overhead is amortized by the actual call. Thoughts?
Comment 4•6 years ago
|
||
Also, once we use ud2 and remove the ool path, it seems good to move this into GenerateFunctionPrologue() (passing in a new 'needsOverflowCheck' bool).
Comment 5•6 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #3) > Oh sorry, I missed that. So on linux, we use SA_ONSTACK which means we're > save to run the signal handler. On OSX, we simply use a separate thread > which waits on debug ports. On Windows, we apparently do nothing today, but > I just discovered SetThreadStackGuarantee() which could help (seems like > breakpad should be setting that regardless...). Ooh, interesting. Breakpad uses SetUnhandledExceptionFilter() [1] to forward unhandled exceptions to a dedicated thread, so it should be safe (like OSX), but our vectored exception handlers are a different story. SetThreadStackGuarantee() only applies to the calling thread so we should probably set it on every thread that could trigger an exception we want to handle. [1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms680634.aspx
Reporter | ||
Comment 6•6 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #3) I think testing `(tls->stackLimit - sp) > framePushed)` in large frames is a good idea. This also avoids the potential integer overflow when a huge frame is larger than the numerical value of the stack pointer so the stack pointer adjustment would wrap around. (I don't know if any OS places stacks that close to the null page). Moving the whole thing into GenerateFunctionPrologue() would certainly make things easier for Cretonne, where modeling this cleanly is tricky.
Comment 7•6 years ago
|
||
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #5) Yeah. On IRC, Ted said we are known to handle stack overflows on Windows. I guess maybe it's usually "ok" or there is a default value that is big enough? (In reply to Jakob Stoklund Olesen [:jolesen] from comment #6) Ok, great. I can do this (and bump it ahead of other patches in my priority queue) or you can if you'd like.
Reporter | ||
Comment 8•6 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #7) > Ok, great. I can do this (and bump it ahead of other patches in my priority > queue) or you can if you'd like. I'll implement my original plan [1] which is essentially the small-frame case handled Cretonne-side. This unblocks me and other users of Cretonne. Then I can simply stop using Cretonne's stack_check instruction once GenerateFunctionPrologue() handles the big frames. [1] https://github.com/stoklund/cretonne/issues/234
Comment 9•6 years ago
|
||
To keep the patch sequence contained, I put the patch in bug 1428453. Should be able to land soon.
Updated•6 years ago
|
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
You need to log in
before you can comment on or make changes to this bug.
Description
•