Closed Bug 242518 Opened 20 years ago Closed 18 years ago

CheckStackGrowthDirection Assertion

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: dougt, Assigned: brendan)

References

Details

(Whiteboard: not holding RC1 for this - but will take a patch if it appears)

Attachments

(4 files)

I have compiled mozilla using VC7.net running XP.   At startup, I see a fatal
assertion during a call to JS_SetThreadStackLimit().  Apparently, this assertion
is failing!:

JS_ASSERT(&dummy2 < dummy1addr);

In the debugger, I can not access the |int dummy1| in the
JS_SetThreadStackLimit.  The error listed implies that no stack frame was created.  

When I change the function CheckStackGrowthDirection to not use the |static|
keyword, this problem went away.

Doug
Here is the asm with and without the use of the |static| keyword:

With |static| keyword:

JS_PUBLIC_API(void)
JS_SetThreadStackLimit(JSContext *cx, jsuword limitAddr)
{
00CF5A0E  push        ebp  
00CF5A0F  mov         ebp,esp 
00CF5A11  push        ecx  
00CF5A12  push        esi  
#ifdef DEBUG
    int dummy1;

    CheckStackGrowthDirection(&dummy1, limitAddr);
00CF5A13  lea         eax,[ebp-4] 
00CF5A16  lea         ecx,[ebp-4] 
00CF5A19  cmp         eax,ecx 
00CF5A1B  push        edi  
00CF5A1C  mov         esi,offset string "c:/builds/mozilla/trunk7/mozilla"...
(0D56098h) 
00CF5A21  jb          JS_SetThreadStackLimit+28h (0CF5A36h) 
00CF5A23  push        702h 
00CF5A28  push        esi  
00CF5A29  push        offset string "&dummy2 < dummy1addr" (0D562F8h) 
00CF5A2E  call        @ILT+3580(_JS_Assert) (0CF1E01h) 
00CF5A33  add         esp,0Ch 
00CF5A36  mov         edi,dword ptr [limitAddr] 
00CF5A39  lea         eax,[dummy1] 
00CF5A3C  cmp         edi,eax 
00CF5A3E  jb          JS_SetThreadStackLimit+45h (0CF5A53h) 
00CF5A40  push        703h 
00CF5A45  push        esi  
00CF5A46  push        offset string "limitAddr < (jsuword)&dummy2" (0D57168h) 
00CF5A4B  call        @ILT+3580(_JS_Assert) (0CF1E01h) 
00CF5A50  add         esp,0Ch 
#endif


Without |static| keyword:


JS_PUBLIC_API(void)
JS_SetThreadStackLimit(JSContext *cx, jsuword limitAddr)
{
00CF5A0E  push        ebp  
00CF5A0F  mov         ebp,esp 
00CF5A11  push        esi  
#ifdef DEBUG
    int dummy1;

    CheckStackGrowthDirection(&dummy1, limitAddr);
00CF5A12  mov         esi,dword ptr [limitAddr] 
00CF5A15  lea         eax,[limitAddr] 
00CF5A18  push        esi  
00CF5A19  push        eax  
00CF5A1A  call        @ILT+3680(_CheckStackGrowthDirection) (0CF1E65h) 
#endif
...


#ifdef DEBUG
void
CheckStackGrowthDirection(int *dummy1addr, jsuword limitAddr)
{
00CF55F7  push        ebp  
00CF55F8  mov         ebp,esp 
00CF55FA  push        ecx  
    int dummy2;

#if JS_STACK_GROWTH_DIRECTION > 0
    JS_ASSERT(dummy1addr < &dummy2);
    JS_ASSERT((jsuword)&dummy2 < limitAddr);
#else
    /* Stack grows downward, the common case on modern architectures. */
    JS_ASSERT(&dummy2 < dummy1addr);
00CF55FB  lea         eax,[ebp-4] 
00CF55FE  cmp         eax,dword ptr [ebp+8] 
00CF5601  push        esi  
00CF5602  mov         esi,offset string "c:/builds/mozilla/trunk7/mozilla"...
(0D56098h) 
00CF5607  jb          CheckStackGrowthDirection+25h (0CF561Ch) 
00CF5609  push        702h 
00CF560E  push        esi  
00CF560F  push        offset string "&dummy2 < dummy1addr" (0D562F8h) 
00CF5614  call        @ILT+3580(_JS_Assert) (0CF1E01h) 
00CF5619  add         esp,0Ch 
    JS_ASSERT(limitAddr < (jsuword)&dummy2);
00CF561C  lea         eax,[dummy2] 
00CF561F  cmp         dword ptr [limitAddr],eax 
00CF5622  jb          CheckStackGrowthDirection+40h (0CF5637h) 
00CF5624  push        703h 
00CF5629  push        esi  
00CF562A  push        offset string "limitAddr < (jsuword)&dummy2" (0D57168h) 
00CF562F  call        @ILT+3580(_JS_Assert) (0CF1E01h) 
00CF5634  add         esp,0Ch 
00CF5637  pop         esi  
#endif


Notice that with the static keyword, it is as if the function has been inlined.  
Is there a workaround so I can get my build to run?
Severity: normal → blocker
Workaround: use --disable-optimize in your .mozconfig.
Looks like this doesn't happen automatically when you --enable-debug
The debug build doesn't seem to work at all when optimations are enabled, which
I have to manually turn off when I'm using a .mozconfig.

Reverting back to normal severity.
Severity: blocker → normal
*** Bug 243617 has been marked as a duplicate of this bug. ***
I checked in a workaround.  Someone tell me why it's legal in standard C to lie
about addresses of locals, or else report a bug to MS.

/be
The first assertion is gone but it still crashes. First comes an assertion in
_CrtIsValidHeapPointer, then _BLOCK_TYPE_IS_VALID, then DAMAGE: before free
block etc.
Attached patch proposed fixSplinter Review
bbaetz suggested making the stack growth direction testing function be
recursive.  Please let me know how this works, and I'll check it in if it cures
the symptom.

/be
No go, same assertions as with the first one. The stack check doesn't fail, but
the debug heap detects the problems I mentioned in comment #6. I have a feeling
the real problem is somewhere else.
If we're battling the optimizer, aren't we going to loose or endanger of loosing
when it gets smarter? If it's inlining functions is there any real requirement
that the two variables occur "on the stack" in the order they would have been if
not inlined?

This is only going to be safe isf optimization is turned off as Comment 3 suggests.
Do you know what exactly caused this problem? For example build from 20040501
works fine.
> 00CF5A13  lea         eax,[ebp-4]
> 00CF5A16  lea         ecx,[ebp-4]
> 00CF5A19  cmp         eax,ecx

Moving the same address into two register and comparing them will obviously 
give you an equality. I would think this is a compiler bug.

You could change the assertion to jsuword)&dummy2 <= limitAddr but that
bothers me. Alternatively you could try a volatile temporary in
JS_SetThreadStackLimit, something like this:

    int dummy1;
    jsuword volatile dummy2 = limitAddr;

    CheckStackGrowthDirection(&dummy1, dummy2);

Volatility might make the optimizer back off.
20040507 crashes.
Volitile or maybe assigning different values. It still makes me queezy to assume
this will work with future optimizations.

Does the C or C++ standard state any requirement that locals must reside on a
stack or even the same stack? I don't have the books at home, so can't look it up.
Is anyone else seeing the heap corruption that Ere reports?  There's no such bug
on other compiler/OS combinations that I know of.  It's hard to see how any
stack growth direction misconfiguration could lead to it.

/be
20040504 fails.
Build from 20040501 works fine, 20040502 crashes.
I don't know what causes this, but the assertion doesn't fail on 20040501 and
fails on 20040502 so it's either a really bad coincidence or the problems are
related.
Ok, I believe the problem is that --enable-optimize is on as a default (checkin
by cls at 2004-04-30 21:01 bonsai time) as Aaron said in comment #3. Apparently
it's causing more problems than the growth check error with VC++ .NET.
Why are we optimizing debug builds by default? That's going to create a lot of
fun for debuggers.
The default Mozilla build state is now compiler-optimized (--enable-optimize),
non-debug (--disable-debug) by default.  --enable-debug & --disable-optimize are
2 distinct switches as they've always been.  This problem might be fixed by
applying the patch from bug 243079 since NSPR was using the wrong runtime
library in the --enable-optimize --enable-debug case.
Ok, then that explains that, my .mozconfig was created before the change in
defaults, when optimize was disabled by default.

Should debug code be disabled if optimizations are turned on? When would you
want both? I want symbols and optimizations, but not debug code and optimizations.
I use optimization and debug code often enough.  Faster execution with sanity
checks, tastes great, less filling.

/be
Then, would it be worth it to carve the code in question out into a separate
file and compile it with no optimization?
Well, for me it's no problem now that I know I need to use --disable-optimize.
The bigger problem was that it took more than a week to find it out and in the
meantime I've been unable to do anything.
As per https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=135255
StackGrowthDirection() is wrongly inlined sometimes and causes mozilla to not
start.	This patch is from a suggestion made by Ulrich, who says this is what
glibc did for a while to prevent inlining.
Ports will need this on branches.  Reviewing in a minute.

/be
Flags: blocking1.7.x+
Flags: blocking-aviary1.0+
Comment on attachment 162972 [details] [diff] [review]
Fix for inlined stack growth detection under GCC

>+#ifdef __GNUC__

Nit: Extra newline here to space the #ifdef vertically the same as you do the
#else and #endif.

>+#if (__GNUC__ > 2)
>+__attribute__((noinline))
>+#endif
>+static int StackGrowthDirection(int *dummy1addr)
>+{
>+    int *dummy2 = alloca (sizeof (int));
>+
>+    return (dummy2 < dummy1addr) ? -1 : 1;
>+}
>+
>+#else /* __GNUC__ */

Nit: /* !__GNUC__ */ here and after the #endif.

>+
> static int StackGrowthDirection(int *dummy1addr)
> {
>     int dummy2;
> 
>     return (&dummy2 < dummy1addr) ? -1 : 1;
> }
> 
>+#endif /* __GNUC__ */

What to do for Windows?  There's probably an MS no-inline pragma.  Anyone know
it?

/be
Should be able to tag the method as __declspec(noinline) for MSVC.
Assignee: general → brendan
Could we add a JS_HAVE_ALLOCA define to jsosdep.h to choose between the two
versions of the StackGrowthDirection implementation, please?  And while we're at
it, add JS_AVOID_INLINE_DECL (or other suitable name) to choose the __declspec()
or __attribute__() stuff depending on compiler?  Then we remove
platform-specific noise from the source code and put it into configuration files
where it belongs.  (And where odd-platform embedders will look for it.)

--scole 
Whiteboard: not holding RC1 for this - but will take a patch if it appears
I think brendan reported that this might be best fixed on the trunk and have
some shake out there.  renominate if a solid, well tested patch appears and
there is still time in the short aviary 1.0 schedule
Flags: blocking-aviary1.0+ → blocking-aviary1.0-
*** Bug 276285 has been marked as a duplicate of this bug. ***
OS: Windows XP → All
Hardware: PC → All
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.8beta
*** Bug 270878 has been marked as a duplicate of this bug. ***
I think I persuaded dougt to take this one.  Thanks, Doug!

/be
Assignee: brendan → dougt
Status: ASSIGNED → NEW
This bug can back to me because I was talking to brendan about a new assert I
was seeing.  On arm wince, the stack grows down and I assert on:

JS_ASSERT(limitAddr < (jsuword)&dummy2);


Reading over this bug, it seams, imo, like a alot of thought has gone in on
something that really doesn't matter all that much.  :-)  

I would like to remove both limitAddr checks.  if you are porting or cross
compiling, I think the dummy1addr check is good enough to tell if your
misconfigured JS_STACK_GROWTH_DIRECTION.  

Any objection to removing the limitAddr assertions?
dougt, as I said to you I'm fine with losing assertions even to the point of
hardcoding, or using an autoconf test.  Most stacks grow down (anyone know of a
counterexample architecture?).  Thanks,

/be
PA-RISC, i960, rare modes of ARM, Intel 8051, Dec-20... all platforms unlikely
to be running this code though.
Ah, my beloved DECSystem-2060!  Memories....  I loved that architecture.

/be
In a recent checkin for Windows CE compilation issues, I checked in the
following patch.  This resolves the issue that I was seeing. Should we keep
this bug open to completely remove the CheckStackGrowthDirection function?
Yeah, let's clean this up.  Someone (who might just have been cc'd ;-) feel free
to take off my list.

/be
Assignee: dougt → brendan
In version 3.194 of jsapi.c, i fixed/removed this assertion.  This was part of a
Windows CE fixup (bug 242518); the patch was reviewed by shaver.

I think we can mark this bug as fixed.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Doug, I was wondering whether comment 39 ("Should we keep this bug open to
completely remove the CheckStackGrowthDirection function?") meant there was more
to do under this bug.

/be

removes CheckStackGrowthDirection.

maybe we can also rid ourselves of JS_STACK_GROWTH_DIRECTION?
Attachment #186961 - Flags: review?(brendan)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 186961 [details] [diff] [review]
removes CheckStackGrowthDirection

Sure, but check this in -- we'll be that much closer to requiring downward
stack growth.  I'll figure out what to do with the +1/-1 macro.

/be
Attachment #186961 - Flags: review?(brendan)
Attachment #186961 - Flags: review+
Attachment #186961 - Flags: approval1.8b3+
Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.210; previous revision: 3.209
done
Thanks Doug -- I'll do the rest later.

/be
Status: REOPENED → ASSIGNED
Flags: blocking1.7.5+
Flags: blocking-aviary1.0-
Target Milestone: mozilla1.8beta1 → mozilla1.9alpha
We hardcode now, and don't try to check with &local assertions -- to prone to optimization breaking such attempts to prove facts about stack growth direction. If some real platform needs the stack to grow upward, file a new bug.

/be
Status: ASSIGNED → RESOLVED
Closed: 19 years ago18 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: