Closed Bug 644900 Opened 15 years ago Closed 15 years ago

Windows requires stack to grow one page at a time

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

Q2 11 - Wasabi

People

(Reporter: wmaddox, Assigned: wmaddox)

References

Details

(Whiteboard: WE:2801119,fixed-in-nanojit)

Attachments

(2 files, 4 obsolete files)

Draft patch -- needs testing 15 years ago William Maddox 3.22 KB, patch		Details \| Diff \| Splinter Review
Insert stack probes to properly grow Windows stack for large frames 15 years ago William Maddox 3.82 KB, patch		Details \| Diff \| Splinter Review
Patch for tr-spicy. Uses old i386 register names (EAX instead of rEAX, etc.). 15 years ago William Maddox 3.21 KB, patch		Details \| Diff \| Splinter Review
Patch for tr-spicy. Uses old i386 register names (EAX instead of rEAX, etc.). 15 years ago William Maddox 3.21 KB, patch	edwsmith : review+	Details \| Diff \| Splinter Review
Insert stack probes to properly grow Windows stack for large frames, --stkskip flag for lirasm 15 years ago William Maddox 10.82 KB, patch	edwsmith : review+	Details \| Diff \| Splinter Review
Patch for tr-spicy, includes backport of MOVLMI on X64. 15 years ago William Maddox 8.72 KB, patch	rreitmai : review+	Details \| Diff \| Splinter Review

William Maddox

Assignee

Description

•

15 years ago

Windows uses a single guard page for extending the stack. If allocating a frame larger than a single page, special care must be taken that each page is first written in sequential order in the direction of stack growth. See http://support.microsoft.com/kb/100775 . The JIT does not currently respect this requirement.

William Maddox

Assignee

Comment 1

•

15 years ago

Attached patch Draft patch -- needs testing (obsolete) — Details — Splinter Review

Insert a sequence of dummy writes to fault and commit new stack pages. This should be faster for reasonably-sized frames than calling an out-of-line function like Microsoft's __chkstk() function, which supposedly just does something similar.

William Maddox

Assignee

Updated

•

15 years ago

Assignee: nobody → wmaddox

Status: NEW → ASSIGNED

Whiteboard: WE:2801119

Target Milestone: --- → Q2 11 - Wasabi

Dan Smith

Updated

•

15 years ago

Flags: in-testsuite-

Flags: flashplayer-qrb+

Flags: flashplayer-injection-

Flags: flashplayer-bug+

OS: Windows XP → All

Priority: -- → P2

William Maddox

Assignee

Comment 2

•

15 years ago

Attached patch Insert stack probes to properly grow Windows stack for large frames (obsolete) — Details — Splinter Review

Attachment #521722 - Attachment is obsolete: true

William Maddox

Assignee

Comment 3

•

15 years ago

Attached patch Patch for tr-spicy. Uses old i386 register names (EAX instead of rEAX, etc.). (obsolete) — Details — Splinter Review

Attachment #521992 - Flags: review?(edwsmith)

William Maddox

Assignee

Comment 4

•

15 years ago

Attached patch Patch for tr-spicy. Uses old i386 register names (EAX instead of rEAX, etc.). (obsolete) — Details — Splinter Review

Last post was wrong version -- Updated.

Attachment #521992 - Attachment is obsolete: true

Attachment #521992 - Flags: review?(edwsmith)

Attachment #521994 - Flags: review?(edwsmith)

Edwin Smith

Comment 5

•

15 years ago

Comment on attachment 521994 [details] [diff] [review] Patch for tr-spicy. Uses old i386 register names (EAX instead of rEAX, etc.). Is it possible to add a lirasm test case for this? It would probably be better to write a constant to the stack than the contents of EAX. stretching, but: EAX could have a pointer, causing conservative retention. or eax could somehow be attacker-controlled. that reminds me, we do have an api for random bits that could be used, too. nit: house style is braces on top right. conditional R+: use VMPI_getVMPageSize() for the page size instead of hard coding 4096.

Attachment #521994 - Flags: review?(edwsmith) → review+

William Maddox

Assignee

Comment 6

•

15 years ago

Attached patch Insert stack probes to properly grow Windows stack for large frames, --stkskip flag for lirasm — Details — Splinter Review

1) Uses VMPI_getVMPageSize() instead of hard-coded value. 2) A lirasm test case is provided. It is principally useful for examining the verbose disassembly output, but it can be used with the next feature as well. 3) Lirasm is extended to facilitate experimentation. The --stkskip N switch causes Lirasm to push approximately N Kbytes of stack before beginning execution of any code fragments. This is done by simply performing the execution at the end of a chain of recursions in which each recursive call pushes roughly 1Kbyte of memory. (The method declares one local array 1K in length. Frame overhead and spill slots are not accounted for.) With a sufficiently-large choice of N, one can provide that the first fragment will execute near (i.e., within a page) of the committed extent of the stack. It's up to the user to choose an appropriate value for N, so as to keep Lirasm from needing to know the details of its own stack usage, including the invocation of the compiler proper. 4) There's a bit of refactoring in Lirasm to avoid duplicating code with the --stkskip option. parseOptionalInt() handles options that behave similarly to --stkskip and --random.

Attachment #521991 - Attachment is obsolete: true

Attachment #521994 - Attachment is obsolete: true

Dan Smith

Comment 7

•

15 years ago

Cool extension to lirasm. Do we know whether other OSes require sequential page accesses for large stack allocations?

William Maddox

Assignee

Comment 8

•

15 years ago

(In reply to comment #7) > Cool extension to lirasm. > > Do we know whether other OSes require sequential page accesses for large stack > allocations? Googling around a bit, it appears that this is widely recognized as a Windows issue, but I was unable to find a satisfactorily detailed and definititive description how Linux and MacOS manage their user-mode stack. On both Linux and MacOSX, there is a fixed stack limit set in the executable, by a ulimit call (or shell command), or by default. It is typically 8MB. Access outside of the reserved stack region produces a segfault exception. In Windows, there is a similar reservation (1MB by default), but the space is mapped one page at a time on demand. Empirically, I verified that on MacOSX, one can allocate a single frame sufficiently large as to extend into the last reserved page without failure. It is not clear whether the OS is extending the mapped stack region dynamically, or whether the entire reserved area is mapped at the outset, though most likely committed lazily as the pages are touched. By default, at least, GCC does not generate stack probes or anything similar in the function prologues on either MacOSX or Linux, so I suspect they work the same way.

William Maddox

Assignee

Comment 9

•

15 years ago

In multi-threaded programs, threads are generally allocated a fixed-size stack, without the guarantee of an unallocated region following, as would be the case in the usual storage layout of a single-threaded process. Guard pages are used in this case to detect stack overflow, so that overflow does not cause silent corruption. There is an option in GCC to generate stack checks, including guard page probes. If a nanojit host is expecting to use such mechanisms, then probes may be needed in the generated code. The situation differs in Windows in that stack space that is nominally allocated is incrementally committed. At present, it appears to be assumed that nanojit-generated code will not produce a stack overflow, or that the host will explicitly generate LIR to perform an overflow check as we do in Tamarin. BTW, I question whether the check we are doing in Tamarin is adequate. We compare the address of the method frame with the stack limit minstack, but the method frame is allocated prior to the variables. The frame can extend beyond this point, limited only by NJ_MAX_STACK_ENTRY, e.g, 16KB on i386 or 32KB on SPARC. I imagine the only reason we are getting away with this is that we allocate some slop with kStackMargin, which is woefully small on the embedded platforms.

William Maddox

Assignee

Updated

•

15 years ago

Attachment #522065 - Flags: review?(edwsmith)

Edwin Smith

Updated

•

15 years ago

Attachment #522065 - Flags: review?(edwsmith) → review+

William Maddox

Assignee

Comment 10

•

15 years ago

Pushed to nanojit-central: http://hg.mozilla.org/projects/nanojit-central/rev/670889ebdbca

William Maddox

Assignee

Comment 11

•

15 years ago

Attached patch Patch for tr-spicy, includes backport of MOVLMI on X64. — Details — Splinter Review

Unfortunately, the patch for nanojit-central relies on some changes made since the Spicy branch. On i386, the old register name EBP is used instead of rEBP. On X64, the MOVLMI opcode emitter is needed, to generate instructions of the form 'movl disp(r), imm32'. Support for immediate moves to memory was added recently.

Attachment #523501 - Flags: review?(rreitmai)

Rick Reitmaier

Updated

•

15 years ago

Attachment #523501 - Flags: review?(rreitmai) → review+

William Maddox

Assignee

Comment 12

•

15 years ago

Pushed to tr-spicy: http://hg.mozilla.org/users/stejohns_adobe.com/tr-spicy/rev/25d1d0ab335e

William Maddox

Assignee

Updated

•

15 years ago

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Dan Schaffer

Comment 13

•

15 years ago

marking as verified.

Status: RESOLVED → VERIFIED

William Maddox

Assignee

Comment 14

•

15 years ago

This bug should remain open until the change has propagated to tamarin-redux and tracemonkey.

Whiteboard: WE:2801119 → WE:2801119,fixed-in-nanojit

Nicholas Nethercote [inactive]

Comment 15

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/70dc21f4187f(In reply to comment #14) > This bug should remain open until the change has propagated to tamarin-redux > and tracemonkey. Yes. People haven't been following protocol lately. I recently cleaned up https://developer.mozilla.org/en/NanojitMerge which describes the procedure, now might be a good time for everyone to refresh their memories.

Status: VERIFIED → RESOLVED

Closed: 15 years ago → 15 years ago

Tamarin Bot

Comment 16

•

15 years ago

changeset: 6227:0c16b8944742 user: William Maddox <wmaddox@adobe.com> summary: Bug 644900 - Generate probes for stack expansion when allocating large frames on Windows (r=edwsmith) http://hg.mozilla.org/tamarin-redux/rev/0c16b8944742

Ed Morley [:emorley]

Updated

•

15 years ago

Depends on: 656017

You need to log in before you can comment on or make changes to this bug.