Closed Bug 506013 Opened 15 years ago Closed 15 years ago

MMGC_GCENTER must use real stack pointer, not address of stack variable

Categories

(Tamarin Graveyard :: Garbage Collection (mmGC), defect, P1)

x86
macOS
defect

Tracking

(Not tracked)

VERIFIED FIXED
flash10.1

People

(Reporter: treilly, Assigned: lhansen)

References

Details

(Whiteboard: Has patch)

Attachments

(5 obsolete files)

Using the the address of the jmp_buf didn't work on windows and is breaking on PPC as well. Need to go back to getting the real stack pointer.
Blocks: 489345
I assume this has been fixed...?
Severity: normal → blocker
Priority: -- → P1
Target Milestone: --- → flash10.1
Flags: flashplayer-qrb+
No its a subtle problem that only manifested on windows but could come back depending on compiler behavior. Say I have: MMGC_GCENTER(gc); GCObject *o = new (gc) GCObject(); ... collection ... o may or may not be in the region the GC is scanning if we use the address of a stack object to determine the stack boundaries. We need to use the stack pointer like the MMGC_GET_STACK_EXTENTS macro does. There are no known instances of this bug in the wild since usually there's interleaving frames between the MMGC_GCENTER and the use of stack only referenced GC objects.
Severity: blocker → normal
Summary: MMGC_GCENTER broken → MMGC_GCENTER must use real stack pointer, not address of stack variable
this has nothing to do with OOM removing dependency
No longer blocks: 489345
Attached patch Implementation for MacTel (obsolete) — Splinter Review
Also cleaned up the use of AVMSYSTEM_, which is not right here - should use the normal ifdefs (MMGC_WHATEVER since we're in MMgc code).
Attachment #417932 - Flags: review?(treilly)
The correct general fix is to factor this as a macro: VMPI_getStackPointer(uintptr_t* sp) which reliably stores a value in *sp that represents the most extreme (usually lowest) address in the current frame that might contain a pointer to a managed object.
Attachment #417932 - Flags: review?(treilly) → review+
With the quicklist code I'm seeing selftest errors on linux also, so I think we really need to fix this.
Blocks: 528338
Comment on attachment 417932 [details] [diff] [review] Implementation for MacTel redux changeset: 3335:52cf465a973e
Attachment #417932 - Attachment is obsolete: true
Factored out the stack base computation too, seemed silly not to. Here's the patch as we talked about. Let me know what you think, notably about the documentation for the two functions, and whether they are implementable as specified. When I think about it, though, one thing puzzles me: Why does VMPI_getStackPointer need to be a macro that uses inline assembler and even then isn't always reliable? (After all we don't control how the compiler allocates stack space, nothing prevents the compiler from allocating stack dynamically.) Would it not be equally reliable (or no more unreliable) and a lot more portable if VMPI_getStackPointer is just a function that returns the address of a local stack variable? Or do we want the flexibility of doing as well as possible, with a per-platform implementation?
Assignee: treilly → lhansen
Status: NEW → ASSIGNED
Attachment #418145 - Flags: review?(treilly)
Attachment #418145 - Flags: review?(treilly) → review+
Note that the selftest ST_mmgc_threads.st tends to provoke this problem. I have commented out that test in the source file. It needs to be re-enabled when the problem has been fixed.
Flags: in-testsuite?
This is better but still regresses Win64 and Solaris, and I don't understand why. They fail smoke testing each in their different way.
Attachment #418145 - Attachment is obsolete: true
watson 2464396 is a instance of this in the wild, win64.
Theoretically the MMGC_GET_STACK_EXTENTS mechanism has two problems (in both the existing incarnation and the rewritten one): - even if it correctly saves callee-registers -- and we only have a heuristic approach to that with longjmp -- it is not guaranteed that the jmp_buf that holds those registers won't be repurposed for other local variables, as its address is not taken and passed to some routine that's not visible to the compiler in the current context - callee-saved registers may be saved into local stack slots and the registers reused for other values before we spill into the jmp_buf. (So the jmp_buf would holds irrelevant values.) This can be a problem if the compiler-generated stack slots holding those saved values are at lower addresses than the stack pointer we compute. The BDW collector handles this differently: the function that flushes callee-save registers to the stack takes a function as an argument which it calls in a non-tail fashion. The latter computes a conservatively correct stack pointer (in a portable fashion), then eagerly scans the "hot" end of the stack to pick up register contents before returning. The hot part of the stack is defined roughly as the difference between SP and the GC entry point (when the GC is running). We could do something similar, except that MMGC_GC_ROOT_THREAD and the way that is used pprobably gets in the way. The BDW collector also has more sophisticated approaches to flushing registers than setjmp, though it does fall back on setjmp on a number of platforms.
On the BDW GC mailing list (http://www.hpl.hp.com/hosted/linux/mail-archives/gc/2007-February/001588.html), Hans Boehm is quoted as saying roughly this: "I've been trying to get away from this issue, particularly in gc7, by increasingly capturing the register set using either __builtin_unwind_init() or getcontext(), which would presumably need to be adjusted anyway when the register set grows. (Sometimes setjmp has also been used in this way, but that sometimes doesn't save a sufficient part of the register set.) Unfortunately, none of this is 100% portable. Thus we're stuck using whatever is available, which sometimes involves explicitly traversing the set of known registers, and ficing that code when we have to." and later in the thread this: "I think setjmp implementations generally have two choices: 1) Save the callee-save registers in the buffer. Longjmp restores the state. 2) Save little more than the position on the stack in the buffer. Longjmp unwinds the stack, restoring callee-saves registers as it goes. (Type 2 implementations probably perform better, since they do less work in setjmp, but can't handle certain kinds of (ab)uses.) Type (1) implementations are more common, but not universal. Type (2) implementations save way too little state for the GC, which is why it now prefers something like getcontext if it's available."
This slices the problem differently by making the saving of register content a porting interface, which I think is roughly right, and following the BDW collector by calling to a subroutine while the saved registers are definitely on the stack.
Attachment #418177 - Attachment is obsolete: true
I can split this patch if you think it's unwieldy, but it has the following: - Removes MMGC_GC_ROOT_THREAD and MMGC_GET_STACK_EXTENTS, they are not workable in practice. - Introduces VMPI_getThreadStackBase to compute the highest stack address for the calling thread; this code was previously an #ifdef nest in GC.cpp and partly replaces MMGC_GET_STACK_EXTENTS. - Introduces VMPI_callWithRegistersSaved to invoke a function when the register state has been reliably flushed to the stack and a conservative stack pointer can be obtained. Partly replaces MMGC_GET_STACK_EXTENTS. - Introduces GC::CreateRootFromCurrentStack to replace MMGC_GC_ROOT_THREAD - Re-enables the threads selftest that was previously disabled - Drive-by fixes to the selftest system; AVMPLUS_SELFTEST is obsolete, the new name is VMCFG_SELFTEST, but this was not caught in the rewrite Anyway previous comments on this bug motivate the split into functions and abolishing the macros, I think. After landing a little bit of cleanup needs to be done in the Flash Player to remove uses of MMGC_GC_ROOT_THREAD but it all looks like local rewrites.
Attachment #420719 - Attachment is obsolete: true
Attachment #420893 - Flags: review?(treilly)
Whiteboard: Has patch
The first 3 lines of GetStackTop should be removd, ie it should just return GetStackEnter, the MMGC_GCENTER system is used everywhere now.
Attachment #420893 - Flags: review?(treilly) → review+
Nevermind I see the performance code your talking about, we don't need VMPI_getThreadStackBase and should remove it in the future.
(In reply to comment #16) > The first 3 lines of GetStackTop should be removd, ie it should just return > GetStackEnter, the MMGC_GCENTER system is used everywhere now. (In reply to comment #17) > Nevermind I see the performance code your talking about, we don't need > VMPI_getThreadStackBase and should remove it in the future. This is intriguing, but I'm not sure what to conclude from these two comments (especially the last one). Could you elaborate some more? Should I leave things be as they are in the patch now or should I remove VMPI_getThreadStackBase as you propose?
Comment on attachment 420893 [details] [diff] [review] Factor VMPI_callWithRegistersSaved and VMPI_getThreadStackBase, v2 Pushed to TR: changeset: 3483:b19b1793f01c changeset: 3484:60f29cabb614
Attachment #420893 - Attachment is obsolete: true
I'll close this but I'll open another one to track removal of VMPI_getThreadStackBase and we can move the discussion there.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Engineering work item. Marking as verified.
Status: RESOLVED → VERIFIED
(In reply to comment #21) > Engineering work item. Marking as verified. This was a serious bug that manifested itself in subtle ways on various platform, not a work item (the way I understand that term). Just for the record...
Are there any vm/abs testcases that can be written to verify the fix and include in our suite?
(In reply to comment #23) > Are there any vm/abs testcases that can be written to verify the fix and > include in our suite? Tricky. The quick list work (in bug #528338) uncovered this on some platforms, but this patch then landed before that, so the bug was never visible in tamarin-redux, only locally to me. Quoting Tommy above, "Watson 2464396 is a instance of this in the wild, win64". The Win64 problem was also reported to me by email from the APE group (Jim Sandman), and they were using stock FlashRuntime.
in-testuite+, selftest changes made in changeset 3483:b19b1793f01c
Flags: in-testsuite? → in-testsuite+
NB Lars opened bug 539401 to track VMPI_getThreadStackBase's utility/removal.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: