Closed Bug 83388 Opened 24 years ago Closed 24 years ago

gcc-2.96 -O2 jsinterp.c busts dialogs (OK with gcc-2.95, gcc-3.0 or just -O)

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
major

Tracking

()

VERIFIED FIXED

People

(Reporter: bbaetz, Assigned: bryner)

References

Details

Attachments

(1 file)

While mozilla is about 10% faster when compiled with -O2 rather than -O (bug 53486), password dialogs didn't work with gcc-2.96. (bryner originally noticed this, and I confirmed it) I got annoyed with having to use a slower build to check mail, and tracked this down. It turns out that compiling jsinterp.c with -O instead of -O2 fixes the problem. With -O2, a dump() at the end of xpfe/global/resources/content/commonDialog.js:commonDialogOnOK() returned the correct results, but something was happening before the results got back to the window watcher. There are two uninitialzed variable warnings for that file, but I don't know if that is relevant. This may turn out to be a compiler bug - someone who knows the JS engine needs to look at it. -O -fstrict-alising works fine, as does -O2 -fno-strict-aliasing. I don't know if that points to an aliasing bug (either in gcc or the engine) or not. dbaron, you build with gcc-3.0. I know that it doesn't work with optimisations on ATM, but can you try just building that one file with -O2, and see if you notice the problem?
Blocks: O2
I build daily with -O2 using gcc-2.95.4 and it works fine.
i see the password bug when building with 02 on RH7.1 (RH gcc 2.96-81) If i'm unfortunate enough to try contact my mailserver during maintainance or whatever, i get a dialog box saying "the service is temporary unavailable - try again in a few minutes". When i then try again, i am prompted for the password. And mozilla does not see the password i key in then. I have to quit moz, start a nightly instead, key in the password at the prompt - and then it gets set right again.
I actually build my optimized build with gcc 2.96 due to bug 76604, so I see this as well, but that doesn't add much information. :-(
cc'ing Patrick, jband -
Ok, so I'm seeing this too (with a fresh-squeezed -O2 build, using 2.96-81). Information that might be interesting: When wallet auto-fills in the username and password for me, in the password dialog, and I click "OK", it works just fine. When wallet *doesn't* auto-fill for me, but I type everything myself, that's when it seems to fail. rogerl: Any ideas?
The variable that's getting messed up en-route to the window watcher is the integer parameter to nsDialogParamBlock::SetString. It should be 7 and 6, but it's 0 and 0: http://lxr.mozilla.org/seamonkey/source/xpfe/global/resources/content/commonDialog.js#235 http://lxr.mozilla.org/seamonkey/source/embedding/components/windowwatcher/src/nsDialogParamBlock.cpp#81
When I put a printf statement to print Int32 values in js/src/xpconnect/src/xpcconvert.cpp line 387, it started working fine (and of course printed the right value). In fact, just a printf for wstrings in the same function seems to fix it.
dbaron: implies that it *might* not even be an jsinterp.c problem. Are you certain your code change did not include compiling with different flags or something? i.e. maybe it was not the addition of the code that mattered at all? Maybe the optimizer has a problem with the cast use of the 'void* d' param? Does it make a difference to apply the same pattern to the nsXPTType::T_I32 case that is used in the other similar cases?: Index: xpcconvert.cpp =================================================================== RCS file: /cvsroot/mozilla/js/src/xpconnect/src/xpcconvert.cpp,v retrieving revision 1.63 diff -u -r1.63 xpcconvert.cpp --- xpcconvert.cpp 2001/05/22 08:42:34 1.63 +++ xpcconvert.cpp 2001/06/03 03:32:30 @@ -392,8 +392,9 @@ *((int16*)d) = (int16) ti; break; case nsXPTType::T_I32 : - if(!JS_ValueToECMAInt32(cx, s, (int32*)d)) + if(!JS_ValueToECMAInt32(cx, s, &ti)) return JS_FALSE; + *((int32*)d) = (int32) ti; break; case nsXPTType::T_I64 : if(JSVAL_IS_INT(s))
FWIW, it is easier for me to imagine that the optimizer in that compiler is doing something wrong in the giant switch statement of jsinterp.c than with the stuff in xpcconvert.cpp. Does the fun casting in XPCConvert::JSData2Native look like illegal C++ to anyone? I have to wonder if the experiments are really focused enough yet to give meaningful clues. I'm not clear on why bbaetz is specifically pointing at jsinterp.c or if dbaron's change is isolated from other changes (intentional or otherwise).
jband: I'm pointing at jsinterp.c because recompiling just that file fixes the problems. What happened is that I tried a dump at the end of the dialog js, and saw that the value was correct. Compiling all of the js dir without -O2 also kept it working, so I just tried compiling individual files with -O2 until it broke... I then did make clean, and just recompiled jsinterp.c without -O2 and it started working. recompiling with -O2 -> break. The fix/breaking of jsinterp is definately repeatable, and unless that .c file in #included somewhere, I can't see how it cvould be any other file. I didn't try all the files though.... However, a debug build of mozilla with just jsinterp.c compiled -O2 didn't break things either. dbaron said that this started happening with a change from about two weeks ago (I presume he means http://bonsai.mozilla.org/cvsview2.cgi? diff_mode=context&whitespace_mode=show&file=jsinterp.c&root=/cvsroot&subdir=mozi lla/js/src&command=DIFF_FRAMESET&rev1=3.80&rev2=3.81 ) dbaron, did backing out that change fix the problem? You did tell me, but I can't remember what you said. Maybe someone who knows x86 assembler could look at the differences in code?
> Does it make a difference to apply the same pattern to the nsXPTType::T_I32 > case that is used in the other similar cases? No. > dbaron said that this started happening with a change from about two weeks ago > (I presume he means http://bonsai.mozilla.org/cvsview2.cgi? > diff_mode=context&whitespace_mode=show&file=jsinterp.c&root=/cvsroot&subdir=mo > zilla/js/src&command=DIFF_FRAMESET&rev1=3.80&rev2=3.81 ) > > dbaron, did backing out that change fix the problem? You did tell me, but I > can't remember what you said. I never said that. It must've been someone else.
Oops. s/dbaron/bryner. Sorry. For more info, I have no problems on my -O2 --enable-debug build.
I said I thought it started happening around that time, but never tested backing out that particular checkin.
Target Milestone: --- → Future
This doesn't seem to be a problem with gcc3 (although I was compiling on a different machine, I admit).
beard, why did you Future this? Netscape certainly shouldn't be shipping anything build with -O rather than -O2; and I wouldn't think Mozilla would want to either.
Ah, I didn't realize that this bug was not a problem in 2.95 and earlier. Never mind.
Changing summary to avoid confusion. Maybe Brendan could make something of this? (runs and hides...)
Assignee: rogerl → brendan
Summary: Dialogs don't work when jsinterp.c is compiled with -O2 and gcc-2.96 → gcc-2.96 -O2 jsinterp.c busts dialogs (OK with gcc-2.95, gcc-3.0 or just -O)
Target Milestone: Future → ---
You asked for it. /be
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → WONTFIX
Marking Verified Wontfix -
Status: RESOLVED → VERIFIED
D'oh!
Theres an updated redhat gcc errata, I'm going to see if that has the bug. If it does, then if someone can show me how to determine the gcc version from a Makefile.in, I'll submit a patch for js/Makefile.in (there is already a compiler-workarround in there for AIX) I'm running with an unconditional patch now, and have been for several weeks.
The more recent gcc package does not fix the problem. However, I was able to isolate exactly what's going wrong. In this call, from xpfe/global/resources/content/commonDialog.js: gCommonDialogParam.SetString(6, editField1.value); the first parameter is getting mangled to a 0 by the time it gets to the actual call to nsDialogParamBlock::SetString. I'm not sure how to debug this any further... brendan? jband?
You could just upgrade to gcc3. :-) Also, see my comments about SetString, etc., above.
drepper figured this one out. jsnum.h uses some macros that aren't safe with gcc's aliasing optimizations. Patch coming up, so I'll reopen the bug and take it.
Status: VERIFIED → REOPENED
Resolution: WONTFIX → ---
taking
Assignee: brendan → bryner
Status: REOPENED → NEW
Status: NEW → ASSIGNED
Attached patch patchSplinter Review
The patch uses gcc extensions (the braced group within the macro), which is why the new code is only turned on for gcc. We could do this portably if we required the sh_u variable be passed into the macro, but that would require changing all the call sites.
Just a few more words. I proposed to have the special gcc code and leave the questionable code for othe r compilers. As far as I know (and I've seen a lot) there is no other C/C++ compiler which performs this optimization. This is why the bug didn't turned out earlier. A general solution would either require to change the interface of many of the macros in jsnum.h (all callers would have to be changed) or the introduction of helper functions. These helper functions cannot be inlined since inline is an ISO C99 feature as well. If there comes another compiler with this optimization another customized solution can be found. Given the current trend in the industry this compiler will likely have tons of gcc extensions and therefore the use of statement expressions might be exactly what would be needed. So, I think the proposed patch is exactly what should be done. It has no negative impacts on code generation and does not require non-local source changes. So, in case this is valid: r=drepper
So, does this mean that gcc-3.0 isn't doing an optimisation which 2.96 did? Or does 3.0 just not have -fstrict-aliasing on by default?
> So, does this mean that gcc-3.0 isn't doing an optimisation which 2.96 did? Or > does 3.0 just not have -fstrict-aliasing on by default? I just looked at the sources. gcc3 does before this optimization at -O2. But this does not mean that the result for the code in question must be the same. The alias analysis does not do any code transformations itself. It just annotates the internal representation of the code with dependency information which other optimization passes then can use to move code around (or eliminate code). So, it's sheer luck if gcc3 can compile the code correctly. gcc2.95 and before did not have this optimization and they are not affected but the proposed change doesn't hurt.
drepper: OK, just checking if we needed to file a gcc bug report
bryner: please to be respecting local indentation (the Emacs modeline specifies this) and bracing (obvious) style. Other than that, sr=brendan@mozilla.org. /be
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
bbaetz: does this fix the problem? If so, please mark Verified - thanks.
I'll check today
QA Contact: pschwartau → bbaetz
verified, opt -O2 cvs build from this morning on linux with gcc-2.96
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: