Closed Bug 151066 Opened 22 years ago Closed 22 years ago

Crash calling 'Variables' in jsparse.c

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 2000
defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: jrgmorrison, Assigned: brendan)

References

Details

(Keywords: crash, js1.5)

Attachments

(6 files, 4 obsolete files)

My win32, optimized, mozilla trunk build crashes with the stack(s) below.
I pulled the source at about 7pm. I believe this is fallout from the checkin
to bug 137000. The crash occurs every time I start the browser.

stephend, I think, has also seen this crash.

/* Crash location inside 'Variables' in jsparse.c. `fun', below, is null */

     if (!js_AddNativeProperty(cx, obj, (jsid)atom,
                               currentGetter, currentSetter,
                               SPROP_INVALID_SLOT,
                               pn2->pn_attrs | JSPROP_SHARED,
crash =>                       SPROP_HAS_SHORTID, fun->nvars)) {
         ok = JS_FALSE;
     }

Stack for the case where 'component.reg' doesn't exist:

Variables(JSContext * 0x00ec6b00, JSTokenStream * 0x00eab018, JSTreeContext * 
0x0012faa4) line 2049 + 3 bytes
Statement(JSContext * 0x00ec6b00, JSTokenStream * 0x00eab018, JSTreeContext * 
0x0012faa4) line 1708 + 10 bytes
Statements(JSContext * 0x00000001, JSTokenStream * 0x00edfc70, JSTreeContext * 
0x0012faa4) line 886 + 14 bytes
js_CompileTokenStream(JSContext * 0x00ec6b00, JSObject * 0x00ea5b78, 
JSTokenStream * 0x00eab018, JSCodeGenerator * 0x0012faa4) line 392 + 12 bytes
CompileTokenStream(JSContext * 0x00ec6b00, JSObject * 0x00ea5b78, JSTokenStream 
* 0x00eab018, void * 0x00ec6b80, int * 0x00000000) line 2846 + 19 bytes
JS_CompileFileHandleForPrincipals(JSContext * 0x00ec6b00, JSObject * 
0x00ea5b78, const char * 0x00ec50b8, _iobuf * 0x7803bc10, JSPrincipals * 
0x00ebef0c) line 3026 + 13 bytes
mozJSComponentLoader::GlobalForLocation(mozJSComponentLoader * const 
0x0012faa4, const char * 0x00ee7fd0, nsIFile * 0x00eac3a0) line 1187 + 28 bytes
mozJSComponentLoader::ModuleForLocation(mozJSComponentLoader * const 
0x0012faa4, const char * 0x00ee7fd0, nsIFile * 0x00eac3a0) line 999
mozJSComponentLoader::AttemptRegistration(mozJSComponentLoader * const 
0x0012faa4, nsIFile * 0x00eac3a0, int 0x00000000) line 835
mozJSComponentLoader::AutoRegisterComponent(mozJSComponentLoader * const 
0x00e9adf8, int 0x00000000, nsIFile * 0x00eac3a0, int * 0x00000001) line 769 + 
43 bytes
mozJSComponentLoader::RegisterComponentsInDir(mozJSComponentLoader * const 
0x0012faa4, int 0x00000000, nsIFile * 0x00eac3a0) line 593
mozJSComponentLoader::AutoRegisterComponents(mozJSComponentLoader * const 
0x00e9adf8, int 0x00000000, nsIFile * 0x00e8eb20) line 547
nsComponentManagerImpl::AutoRegisterImpl(nsComponentManagerImpl * const 
0x0012faa4, int 0x00000000, nsIFile * 0x00000000, int 0x00000001) line 3151
nsComponentManagerImpl::AutoRegister(nsComponentManagerImpl * const 0x1002e6f2, 
int 0x0025d774, nsIFile * 0x00000000) line 3049 + 18 bytes
nsCOMPtr_base::assign_from_helper(nsCOMPtr_base * const 0x0012faa4, const 
nsCOMPtr_helper & {...}, const nsID & {...}) line 78
main1(int 0x00000001, char * * 0x00252ba8, nsISupports * 0x00252bd0) line 1305 
+ 8 bytes
main(int 0x00000001, char * * 0x00252ba8) line 1805 + 26 bytes
WinMain(HINSTANCE__ * 0x00400000, HINSTANCE__ * 0x00400000, char * 0x00133338, 
HINSTANCE__ * 0x00400000) line 1825 + 23 bytes
MOZILLA! WinMainCRTStartup + 308 bytes


Stack for the case where 'component.reg' does already exist:

Variables(JSContext * 0x00f5c188, JSTokenStream * 0x01943ce0, JSTreeContext * 
0x0012f6c8) line 2049 + 3 bytes
Statement(JSContext * 0x00f5c188, JSTokenStream * 0x01943ce0, JSTreeContext * 
0x0012f6c8) line 1708 + 10 bytes
Statements(JSContext * 0x00f5c188, JSTokenStream * 0x01943ce0, JSTreeContext * 
0x0012f6c8) line 886 + 14 bytes
js_CompileTokenStream(JSContext * 0x00f5c188, JSObject * 0x00ebae20, 
JSTokenStream * 0x01943ce0, JSCodeGenerator * 0x0012f6c8) line 392 + 12 bytes
CompileTokenStream(JSContext * 0x00f5c188, JSObject * 0x00ebae20, JSTokenStream 
* 0x01943ce0, void * 0x00f5c208, int * 0x00000000) line 2846 + 19 bytes
JS_CompileUCScriptForPrincipals(JSContext * 0x00f5c188, JSObject * 0x00ebae20, 
JSPrincipals * 0x00f2efbc, const unsigned short * 0x01959018, unsigned int 
0x00000b56, const char * 0x0012f80c, unsigned int 0x00000001) line 2926 + 13 
bytes
nsJSContext::CompileScript(nsJSContext * const 0x00000000, const unsigned short 
* 0x01959018, int 0x00000b56, void * 0x00ebae20, nsIPrincipal * 0x00f2efbc, 
const char * 0x0012f80c, unsigned int 0x00000001, const char * 0x01047284 
`string', void * * 0x019803e8) line 787 + 27 bytes
nsXULPrototypeScript::Compile(nsXULPrototypeScript * const 0x0012f6c8, const 
unsigned short * 0x01959018, int 0x00000b56, nsIURI * 0x01945078, int 
0x00000001, nsIDocument * 0x00ef7858, nsIXULPrototypeDocument * 0x0190b08c) 
line 5465
nsXULDocument::OnStreamComplete(nsXULDocument * const 0x01945078, 
nsIStreamLoader * 0x0197a0e0, nsISupports * 0x00000000, unsigned int 
0x00000000, unsigned int 0x00000b56, const char * 0x0196f3a8) line 5867 + 36 
bytes
nsStreamLoader::OnStopRequest(nsStreamLoader * const 0x00000000, nsIRequest * 
0x00ef788c, nsISupports * 0x00000000, unsigned int 0x00000000) line 163
nsJARChannel::OnStopRequest(nsJARChannel * const 0x0195dccc, nsIRequest * 
0x019586ec, nsISupports * 0x00000000, unsigned int 0x00000000) line 606 + 28 
bytes
nsOnStopRequestEvent::HandleEvent(nsOnStopRequestEvent * const 0x0012f6c8) line 
213
PL_HandleEvent(PLEvent * 0x01973fcc) line 597
PL_ProcessPendingEvents(PLEventQueue * 0x10031ae9) line 526 + 6 bytes
_md_EventReceiverProc(HWND__ *
My Linux debug build is fine.  Weird.

/be
brendan backed out rev 3.105 of jsinterp.c which avoids the crash in my 
win32 opt. build, pending further investigation.
                                                                      
I pulled source at the same time as John, but my debug Mozilla
builds launch OK on both WinNT and Linux -
What's more, jrgm's crash makes no sense to me.  I cannot see how fun could be
null at the place implicated by the stack.  But I believe jrgm.  So I'm insane.

/be
Just to chime in with support for John's mental health condition - I saw this
yesterday as well, same exact stack.  In fact, I think I saw it before John, but
at that point, I was the only one who did see it, so I neglected to file
(pending confirmation).
Great. A win32 debug build does _not_ crash, but a full clobber of my win32 
optimized build still crashes with rev 3.105 or jsinterp.c. Perhaps
an uninitialized variable...
This would appear to be an optimizer bug (or abuse-of-optimizer). Rev 3.105
of jsinterp.c collapsed js_GetArgument and js_GetLocalVariable to simply
do 'return JS_TRUE;'. Both are then used as function pointers in jsparse.c.
This seems to get mangled somehow.

With rev 3.105, at the crash in Variables, |currentGetter| and |currentSetter|
have the same value and both (according to the msvc debugger) point to 
js_GetLocalVariable. |fun| is null.

But when I stuck a printf into each of js_GetArgument and js_GetLocalVariable,
the crash did not happen, currentGetter and currentSetter had different values,
and fun (presumably) points to a valid JSFunction.
Someone file a bug with M$.  They're in violation of ISO C.

I guess I'll hack an #ifdef'd solution when I get back in town on Thursday.

/be
Blocks: 137000
Severity: blocker → critical
Keywords: js1.5, mozilla1.0.1
Blocks: 149801
Ok, this ought to do it.  jgrm, can you test?  I don't have a windows build
handy right now.  Still waiting to hear back from email to jband and rpotts
about the best way to test the MSVC optimizer level.

/be
I don't know of any way to detect optimization levels at build time with the MS
compiler. The only MS predefines I know of are listed here:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/HTML/_predir_predefined_macros.asp

FWIW, unless you've verified for sure that this is only a problem when _MSC_VER
== 1200, I'd suggest leaving that version test out and just pay the cost for all
MS compilers.

Also FWIW, I took a stab at finding this problem in the MS public bug info. I
didn't find this actual problem. But - 
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q254226
- has an example of a #pragma to turn off an optimization at function level.
Someone *might* want to fiddle with that approach.
I tried the JS_MSVC_HACK_AROUND patch, and while I didn't crash and could start 
and run, something in JS was obviously broken (i.e., profile manager didn't 
show me any profile names, prefs panel did not set pref values when opened and
the pref dialog would not close once opened). I remember that when I stuck 
the printf's in the functions to make them unique, I had to set it in both 
or I was seeing similar bustage. (And creating a calling JS_MSVC_HACK_AROUND2
seems to work, but that's getting pretty ugly). 

I tried to play around with |#pragma optimize(STR, off)| around the 
JS_GetArgument and js_GetLocalVariable. I tried setting both "g" and the 
special "" (everything) to off, and then back on. I only managed to move 
the crash elsewhere in the code.

I tried the various subvalues of /O1, namely /Og /Os /Oy /Ob1 /Gs /Gf /Gy with
the test program. The setting that causes f() and g() to get the same address
is /Gy, if that's a clue to any compiler gurus.
for STR, "g", 
> for STR, "g", 

Ignore that line. Just bad editing of comments.
Attached file reduced C test program
Here's the test program I wrote, with the functions f and g that jrgm
referenced.

/be
Actually, this is more of a linker thing that a compiler thing.  I think what
you want is to give the linker the options:
    /opt:ref /opt:noicf
or at any rate, that's what I gather from the online msvc documentation.  If the
link command comes straight from the compiler, then that should be "/link
/opt:ref /link /opt:noicf".  Note that I haven't actually *tried* this, just
surfed around MS's online documentation.

--scole
Hey Steve. Thanks for the pointer (no pun intended). 

The normal, optimized (but not MOZ_PROFILE=1) link command for js3250.dll is:

link /NOLOGO /DLL /OUT:js3250.dll /PDB:js3250.pdb /SUBSYSTEM:WINDOWS \
     <*.obj> js3240.res <*.lib>

where <*.obj> is 'jsapi.obj jsarena.obj jsarray.obj jsatom.obj jsbool.obj
jscntxt.obj jsdate.obj jsdbgapi.obj jsdhash.obj jsdtoa.obj jsemit.obj
jsexn.obj jsfun.obj jsgc.obj jshash.obj jsinterp.obj jslock.obj jslog2.obj
jslong.obj jsmath.obj jsnum.obj jsobj.obj jsopcode.obj jsparse.obj jsprf.obj
jsregexp.obj jsscan.obj jsscope.obj jsscript.obj jsstr.obj jsutil.obj
jsxdrapi.obj prmjtime.obj'

and <*.lib> is 'fdlibm/fdm.lib ../../dist/lib/nspr4.lib
../../dist/lib/plc4.lib ../../dist/lib/plds4.lib kernel32.lib user32.lib
gdi32.lib winmm.lib wsock32.lib advapi32.lib'

This crashes, and js_GetArgument and js_GetLocalVariable have the same address.

However, adding '/OPT:noicf' results in no crash, and functions have different 
addresses. This doesn't appear to affect the (disk) size of js3250.dll.

link /NOLOGO /DLL /OUT:js3250.dll /PDB:js3250.pdb /SUBSYSTEM:WINDOWS \
     /OPT:noicf                                                      \
     <*.obj> js3240.res <*.lib>

See 
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/
_core_.2f.opt.asp for definitions of /opt:ref and /opt:[no]icf

So this appears to fix this (although its use may not be suitable for some 
other reason; perhaps this can be "prelinked" to a smaller unit before linking 
with js3250.dll if we don't want to make this "global" to 
all of js/src). 
It would also be really nice to have some sort of ASSERT in the code that
verifies the fact that these two functions must have different addresses as
well.  (Of course, this bug only shows up when optimizations are turned on and
ASSERT is turned off, so it'd have to be something different...)

This is especially important to us here, because we don't use the standard
makefile anyway (though we do use msvc sometimes), and I would have missed this
potential problem...

--scole
I should note that I was building with mozilla/js sources pulled to the tip,
and jsinterp.c reverted to rev 3.105, which was the version that initially was
reported to crash.

I shold also note that when I mention "prelinking", I'm wildly waving my hands
in the air and wondering if someone knows a clever trick to do something like
that, or whether that is just a stupid comment. :-)
This patch tries a useless export of one of the stub functions, in the hope
that that will distinguish it from the unexported stub.  If that doesn't work
(jrgm, this means you :-), try exporting both stubs.  If *that* doesn't work, I
think we should turn off the optimization in the js/src windows build goop --
I'd welcome a patch from someone with a windows build who can help, should it
come to that.

/be
Attachment #88855 - Attachment is obsolete: true
No joy, even when exporting both stubs. Crash is the same and the addresses
are still folded to be the same.

Options for a build system fix are to link js3250.dll (or a subset of it)
with /opt:noicf or to compile jsinterp.c without setting /Gy (i.e., use 
"/Og /Os /Oy /Ob1 /Gs /Gf" in place of "/O1" as the opt flag for cl jsinterp.c).
Can someone who knows how (I have no clue) please file a bug on MSVC?

/be
Status: NEW → ASSIGNED
Just for lurning, here's a patch for gmake Makefile.in which prevents '/Gy' 
from being used in compiling jsinterp.c. With that patch, and rev 3.105 of 
jsinterp.c, I don't crash and I don't have any (apparent) problems in running
the build. 

But that's a trial balloon and I expect there's a better way to do this (and 
I don't know how to do this for nmake builds).
attachment 90034 [details] [diff] [review] didn't quite work for me (spaces instead of tabs, no quotes
around $(COMPILE_CFLAGS), using path with /cygdrive), but this variant does. 
(The first part I could understand if the patch were cut&pasted, and the last
if you were doing a srcdir build, but I can't understand how the middle one
worked unless there's weird variation between gmake or shell versions.)
Man. I must have attached the wrong patch (intent all the same, but minus
the tabs->spaces and the plus the ''). Also, yeah, I was using a srcdir build.

It also might be nice is we matched the equivalent of /\s-Gy/, i.e., with 
leading whitespace, but I can't find the requisite sed syntax (\s and [:space:]
just get ignored).

Anyways, I also gave some effort to trying some of the pragma's in msvc to see 
if I could get the compiler to see things our way, but no luck.

Anyone have a way to do the gmake patch equivalent in nmake?

Cc'ing rpotts in hope he'll take pity on us nmake losers.

/be
i don't believe that nmake supports the 'modern pattern matching' that gmake
does :-(

i'd suspect that the best we can do is break out the build rule for jsinterp.c :-(
i'll try to attach a patch as soon as possible...

i hope i'm wrong tho :-)
-- rick
I've attached an 'untested patch' for makefile.win which *should* disable /Gy.

I used a cheap trick to avoid dealing with *all* the different possible settings
for $(OPTIMIZER)...

Basically, I add /Od which turns off *all* optimizations, and then I specify the
optimizations that are wanted.  So hopefully this will work...

Otherwise, i'll need to explicitly deal with all the different values that
$(OPTIMIZER) can have :-(

Can someone with an optimized build test this out and let me know what happens?

thanks,
-- rick
Sorry to say, but rpott's patch doesn't quite fix the problem. (The insertion
of the '/Od /Og...' flags has to move below '$(CFLAGS)' since CFLAGS also adds
'/O1' in an optimized build). However, '-Gy' is set by OS_CFLAGS in
config/WIN32 and winds up as part of CFLAGS. The problem is that while setting
'/Od' will unset '/O1', it doesn't appear to unset '-Gy' when set
explicity. So, without a way to munge the CFLAGS (or set it privately for
mozilla/js), this won't work for nmake builds.

But, I think we can fix this by setting some linker flags, and that it is
possibly the right way to fix this. Here's what I (unfortunately only vaguely)
understand.

Enabling '/Gy' allows "the compiler to package individual functions in the
form of packaged functions (COMDATs)."  Linker option /OPT:REF will "exclude
unreferenced packaged functions" and will turn on /OPT:ICF. /OPT:ICF will
"perform identical COMDAT folding", and can be disabled with /OPT:NOICF.

So, the fix is add /OPT:REF /OPT:NOICF to the linker flags for optimized
builds on win32 (for both nmake and gmake builds). I'll attach a diff
comparing the /VERBOSE linker output for the current default build, and a
second build with those options enabled. You can see that the only difference
is the elimation of the ICF's, and that we were only saving 79 bytes anyways
(some of the savings ill-gotten as well). Note that js_GetArgument and 
js_GetLocalVariable were part of a group of five functions that were being
folded together (all of which just return true in the function body).

[What I don't understand is that in the current nmake build system, I don't
see us setting /OPT:REF anywherer, but it seems to be taking effect
nonetheless (based on a comparison of the /VERBOSE output from the linker from
a run with those explictly enabled, and a current default build).]

Maybe pschwartau can run the js regression tests in an optimized gmake (or
nmake) build with jsinterp.c reverted to rev 1.105 (the crashing rev) and see
if this patch produces any problems.

((p.s., the "Discarded" section, visible at the top of the diff, notes dead
code that is thrown out. Or, at least, if I add the function below to
jsinterp.c, I get these lines added in the /VERBOSE linker output:

  Discarded "`string'" (??_C@_0CB@NOEM@unique?5but?5uncalled?5function@) \
            from jsinterp.obj
  Discarded _js_JunkFunction from jsinterp.obj

+ JSBool
+ js_JunkFunction(JSContext *cx, JSObject *obj, jsval id, jsval *vp)
+ {
+     printf("unique but uncalled function");
+     return JS_TRUE;
+ }

However, I'm not clear why, e.g., js_str_escape is discarded, but I didn't
follow all of the trail of defines, etc., to see if it is indeed not
ultimately used. But, to be clear, we were already discarding js_str_escape,
etc., even without this proposed linker change, so adding these options is
not changing that situation.))
This fixes the crash. I have a feeling I may be using the wrong defines and/or
variables in the makefile(s) syntax, so corrections welcome.

Can we see if the JS regression tests are happy in an optimized build?
Attachment #90034 - Attachment is obsolete: true
Attachment #91085 - Attachment is obsolete: true
For the makefile.win case, you 'should' be able to use $(LLFLAGS) instead of
$(LFLAGS)...

$(LLFLAGS) was originally intended for 'local linker flags'...  And config.mak
folds them into LFLAGS (along with OS_LFLAGS and MOZ_LFLAGS)...

-- rick
Thanks, Rick. Setting "LLFLAGS= /OPT:REF /OPT:NOICF" in makefile.win works
correctly. I'll update the makefile patch if/when there are comments on the 
gmake side.
I'd like to get past this for 1.1.  Asa, what do you say?

/be
Blocks: 1.0.1
Keywords: crash
Is there a reason none of these are marked reviewed/super-reviewed?  Is this
going into trunk?  Is this wanted for 1.0.1 as is implied? 
The patch that is going in is attachment 91211 [details] [diff] [review], the change to the makefile. (The
other part of this fix is to restore rev 3.105 of jsinterp.c, which was r=/sr=
on bug 137000).
Can I get r= from someone (several someones who test it seem best to me -- jrgm,
pschwartau?) on attachment 91211 [details] [diff] [review]?  I'll sr= it.

/be
Comment on attachment 91211 [details] [diff] [review]
updated patch to use LLFLAGS in makefile.win; Makefile.in unchanged

sr=bryner
Attachment #91211 - Flags: superreview+
Comment on attachment 91211 [details] [diff] [review]
updated patch to use LLFLAGS in makefile.win; Makefile.in unchanged

my patch, so can't r=jrgm

but, I'll take that as r=brendan and bryner has sr=
Attachment #91211 - Flags: review+
Comment on attachment 91211 [details] [diff] [review]
updated patch to use LLFLAGS in makefile.win; Makefile.in unchanged

a=asa (on behalf of drivers) for checkin to 1.1
Attachment #91211 - Flags: approval+
Fixed on branch and trunk.  Thanks again, jrgm and rpotts.

/be
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Keywords: mozilla1.0.1fixed1.0.1
Resolution: --- → FIXED
Marking Verified - my current WinNT trunk and branch builds
do not crash on startup -
Status: RESOLVED → VERIFIED
Keywords: verified1.0.1
Flags: testcase-
*** Bug 326005 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: