1.46 KB, patch
Josh Aas: review-
|Details | Diff | Splinter Review|
7.70 KB, patch
Josh Aas: review+
|Details | Diff | Splinter Review|
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/417.2 (KHTML, like Gecko) Safari/417.1 Build Identifier: Trunk On build 8F1099 of Mac OS X on Intel, an unoptimized debug build of Firefox crashes on launch in XPTC_InvokeByIndex with a misaligned stack. The misalignment happens within XPTC_InvokeByIndex itself, which doesn't appear to be properly aligning the first call in the inline asm. Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand. 0x8fe12ee4 in __dyld_stub_binding_helper_interface () 2: /x $esp = 0xbfffd8b8 1: x/i $pc 0x8fe12ee4 <__dyld_stub_binding_helper_interface+18>: movdqa %xmm0,32(%esp) (gdb) bt #0 0x8fe12ee4 in __dyld_stub_binding_helper_interface () #1 0x01008000 in ?? () #2 0x0108e45d in XPTC_InvokeByIndex (that=0x1c3b690, methodIndex=3, paramCount=1, params=0xbfffdb4c) at xptcinvoke_unixish_x86.cpp:147 #3 0x2b1bf00f in XPCWrappedNative::CallMethod (ccx=@0xbfffdd74, mode=CALL_GETTER) at xpcwrappednative.cpp:2139 #4 0x2b1e6297 in XPCWrappedNative::GetAttribute (ccx=@0xbfffdd74) at xpcwrappednativejsops.cpp: 1925 #5 0x2b1c6263 in XPC_WN_GetterSetter (cx=0x1c32280, obj=0x204b068, argc=0, argv=0x1c439c8, vp=0xbfffdec4) at xpcwrappednativejsops.cpp:1476 #6 0x0054da41 in js_Invoke (cx=0x1c32280, argc=0, flags=2) at jsinterp.c:1177 #7 0x0054de69 in js_InternalInvoke (cx=0x1c32280, obj=0x204b068, fval=33862752, flags=0, argc=0, argv=0x0, rval=0xbfffe9bc) at jsinterp.c:1274 #8 0x0054e149 in js_InternalGetOrSet (cx=0x1c32280, obj=0x204b068, id=29580784, fval=33862752, mode=JSACC_READ, argc=0, argv=0x0, rval=0xbfffe9bc) at jsinterp.c:1317 #9 0x0057a70e in js_GetProperty (cx=0x1c32280, obj=0x204b068, id=29580784, vp=0xbfffe9bc) at jsobj.c:2931 #10 0x0055e1a0 in js_Interpret (cx=0x1c32280, pc=0x1c434ad "5", result=0xbfffed80) at jsinterp.c: 3327 #11 0x0054e526 in js_Execute (cx=0x1c32280, chain=0x204af98, script=0x1c43450, down=0x0, flags=0, result=0xbfffef10) at jsinterp.c:1407 #12 0x00512e20 in JS_ExecuteScript (cx=0x1c32280, obj=0x204af98, script=0x1c43450, rval=0xbfffef10) at jsapi.c:4009 #13 0x2b1d35a2 in mozJSComponentLoader::GlobalForLocation (this=0x1c314d0, aLocation=0x1c30f60 "rel:jsconsole-clhandler.js", aComponent=0x1c31600, aGlobal=0xbffff090) at mozJSComponentLoader.cpp:1567 #14 0x2b1d3803 in mozJSComponentLoader::ModuleForLocation (this=0x1c314d0, registryLocation=0x1c30f60 "rel:jsconsole-clhandler.js", component=0x1c31600, status=<incomplete type>) at mozJSComponentLoader.cpp:917 #15 0x2b1d42be in mozJSComponentLoader::AttemptRegistration (this=0x1c314d0, component=0x1c31600, deferred=0) at mozJSComponentLoader.cpp:755 #16 0x2b1d488b in mozJSComponentLoader::AutoRegisterComponent (this=0x1c314d0, when=0, component=0x1c31600, registered=0xbffff304) at mozJSComponentLoader.cpp:682 #17 0x2b1d11f2 in mozJSComponentLoader::RegisterComponentsInDir (this=0x1c314d0, when=0, dir=0x1c0e110) at mozJSComponentLoader.cpp:590 #18 0x2b1d128b in mozJSComponentLoader::AutoRegisterComponents (this=0x1c314d0, when=0, aDirectory=0x1c0e110) at mozJSComponentLoader.cpp:546 #19 0x01062178 in nsComponentManagerImpl::AutoRegisterNonNativeComponents (this=0x1c0c910, spec=0x1c0e110) at nsComponentManager.cpp:3262 #20 0x010647cf in nsComponentManagerImpl::AutoRegisterImpl (this=0x1c0c910, when=0, inDirSpec=0x0, fileIsCompDir=1) at nsComponentManager.cpp:3232 #21 0x01064882 in nsComponentManagerImpl::AutoRegister (this=0x1c0c910, aSpec=0x0) at nsComponentManager.cpp:3404 #22 0x01010d3c in NS_InitXPCOM3_P (result=0xbffff738, binDirectory=0x1c0c3a0, appFileLocationProvider=0xbffff768, staticComponents=0x54ab0, componentCount=1) at nsXPComInit.cpp:616 #23 0x00003433 in ScopedXPCOMStartup::Initialize (this=0xbffff738) at nsAppRunner.cpp:595 #24 0x00009259 in XRE_main (argc=1, argv=0xbffff8d4, aAppData=0x549e0) at nsAppRunner.cpp: 2141 #25 0x000024aa in main (argc=1, argv=0xbffff8d4) at nsBrowserApp.cpp:61 (gdb) Reproducible: Always Steps to Reproduce: 1) Build Firefox with --disable-optimize, --enable-debug, and --disable-tests. 2) Run. Actual Results: Firefox crashes. Expected Results: Not crashing is often considered to be proper behavior.
*** Bug 313399 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of 312929 ***
This is not related to 312929. It's crashing in a different part of the code.
Unoptimized builds don't work anyway (gcc 4.0.0 5026). Bug 305827. This stack is different, though, so I'll leave this alone.
Oh. Well, how hard can that be to fix? :)
This (and everything else in Firefox) is really annoying to debug because I can't get a build with debug information to launch. When I build with --disable-optimize, --enable-debug, and --disable-tests, Firefox quits on launch because em is NULL in XRE_main. Obviously that isn't the case for a non-debug build, since otherwise I wouldn't be able to launch Firefox at all. But --disable-optimize doesn't give me debug information; it just builds with -O0. I wonder how anyone else gets this to work. It must work, or else nobody would debug Firefox.
I don't follow. Or maybe I do. --disable-optimize --enable-debug crashes, but --disable-optimize alone or with --disable-debug doesn't? Hey, if it's only going to crash in some builds, I'd be glad it's at least crashing debugging is on! If you want to turn off assertions and all of the other in-code (de)buggery, you want --enable-optimize="-O0 -g" --disable-debug. Yeah, I know, it looks weird, but we don't actually do anything with --enable-optimize other than set C[XX]FLAGS. Trust me, I'm a doctor.
P.S. Bug 305827 is fixed on the trunk now, so at least on ppc, we're able to launch unoptimized builds, with or without debugging, produced by gcc 4.
Looks like --disable-optimize doesn't work. I'm not sure what I was seeing earlier when I thought this was triggered by --enable-debug. Anyway, thanks much for the --enable-optimize="-O1 -g" tip -- that's helped out a lot. I suspect the problem with -O0 is that some functions which Firefox expects to be inlined aren't getting inlined, which really just means that judicious application of the always_inline attribute should probably happen. Not that I'm volunteering. :)
Created attachment 201363 [details] [diff] [review] Fixes stack alignment problems with XPTC_InvokeByIndex Previous patches to XPTC_InvokeByIndex didn't account for the variable amount of stack space used for the parameters added to the stack by invoke_copy_to_stack. Since the amount of alignment required is dependent on the size of those parameters, the code could crash when the number of parameters wasn't 0. The previous code also aligned the stack after the adjustment for the parameters invoke_copy_to_stack copies, but that was wrong because the subsequent calls need the copied parameters on top of the stack without an uninitialized stack alignment adjustment following them. This patch does not align the stack for the call to invoke_copy_to_stack. Since that's defined in the same module, calls to it shouldn't go through the dyld stub and therefore alignment shouldn't be necessary there. Aligning that call would be a little annoying -- doable, but I don't feel like doing it right now, particularly because it doesn't appear to be necessary.
(In reply to comment #9) > Looks like --disable-optimize doesn't work. I'm not sure what I was seeing > earlier when I thought this was triggered by --enable-debug. Anyway, thanks > much for the --enable-optimize="-O1 -g" tip -- that's helped out a lot. I > suspect the problem with -O0 is that some functions which Firefox expects to > be inlined aren't getting inlined, which really just means that judicious > application of the always_inline attribute should probably happen. Not that > I'm volunteering. :) That's exactly what happened in bug 305827, now fixed on the trunk. The fix hasn't made it to the branch (yet?), so if you're working with 1.8 (1.5), you'll want to apply my patch from that bug in your tree if you hope to get anything useful done. If your --disable-optimize builds are crashing in QuickDraw, then 305827 is it for sure, and your inlining predicition is spot-on. If you've got another crash that kinda sorta seems like mumble inlining mumble, gimme a new bug. (Especially if it's reproducible on ppc. No x86 Mac here.) Thanks.
Oops...I should've mentioned that I've already applied your patch. :)
And what I'm seeing isn't actually a crash, but rather an assertion failure which causes the app to quit. I don't know if it's reproducible on PowerPC -- I actually don't have any PPC Macs running Tiger around here.
Comment on attachment 201363 [details] [diff] [review] Fixes stack alignment problems with XPTC_InvokeByIndex With this patch, I see this about 50% of the time I try to start Firefox on an Intel Mac: Host Name: biko Date/Time: 2005-11-09 11:39:09.792 -0600 OS Version: 10.4.3 (Build 8F1111) Report Version: 4 Command: firefox-bin Path: /Users/josh/src/ff_1.8_opt/mozilla/ff_1.8_opt_obj/dist/DeerPark.app/Contents/MacOS/firefox-bin Parent: WindowServer  Version: 1.5 (1.5) PID: 8900 Thread: Unknown Link (dyld) error: lazy pointer not found
(In reply to comment #14) > With this patch, I see this about 50% of the time I try to start Firefox on an > Intel Mac: > > Host Name: biko > Date/Time: 2005-11-09 11:39:09.792 -0600 > OS Version: 10.4.3 (Build 8F1111) > Report Version: 4 > Link (dyld) error: > > lazy pointer not found That wouldn't be due to this patch, but it isn't good. I haven't tried Firefox on that build yet, though. Can you try building without this patch and seeing whether you can reproduce the problem there? If so, write up a separate bug about it and I'll take a look.
I can't reproduce it without your patch. I'll try another build though.
Comment on attachment 201363 [details] [diff] [review] Fixes stack alignment problems with XPTC_InvokeByIndex A fresh build does not have that problem odd. r=josh
So, from reading the patch, it looks like what the old code did (ifdef XP_MACOSX) was (using % for modulus): esp <- (((esp - n) - 4) % 16) + 4 and the new code is doing esp <- (((esp - (n + 4)) % 16) + (n + 4)) - n These two expressions look the same to me. What's the difference, or did I read something wrong?
(Oops, I didn't actually mean modulus, I meant "A % B == A minus (A mod B)".)
The existing code always uses n = 4, the patch makes it use n = 4 + (paramCount << 3).
Exactly. It's confusing that inline asm argument references look the same as constants, but the initial sub %8, %%esp means that we're adjusting %esp by a variable amount (n, specifically). We weren't accounting for that before and always adjusting by 4. That works when n is 0, but not in other cases. Now we adjust the stack by n+4 before aligning, then align, then move the stack back so after the other parts of the code are done playing with it it'll be aligned properly at the site of the call.
Those expressions above include the movement of the ifdef XP_MACOSX block from after the "subl %8, %%esp\n\t" /* make room for params */ to before it. As far as I can tell, the changes inside that block exactly cancel out the movement of that block to the other side of the line above. I don't see how any of your explanations refute my argument: they point to changes in the code, but I still don't see how the changes as a whole don't cancel each other out. What's wrong with the mathematical expressions above?
So I'll restate the two expressions I have above, with an extra pair of parentheses around the outside, and annotate each pair of parentheses with which operation is directly inside those parentheses: Old: esp <- ((((esp - n) - 4) % 16) + 4) DCBA New: esp <- ((((esp - (n + 4)) % 16) + (n + 4)) - n) AGCF E E The operations are: A: "subl %8, %%esp\n\t" /* make room for params */ B: "subl $0x4, %%esp\n\t" C: "andl $0xfffffff0, %%esp\n\t" /* make sure stack is 16-byte aligned */ D: "addl $0x4, %%esp\n\t" E: "g" (n + 4), /* %9 */ F: "subl %9, %%esp\n\t" G: "addl %9, %%esp\n\t" Are my expressions wrong in some way? Or are they not equivalent in some way? In what case does the result end up different?
dbaron, I see what you're saying. The current behavior is: ESP -= n } Identical to ESP -= 4 } ESP -= (n+4) ESP &= 0xfffffff0 ESP += 4 Augmented by this patch, the behavior is equivalent: ESP -= (n+4) ESP &= 0xfffffff0 ESP += (n+4) } Identical to ESP -= n } ESP += 4 The existing code is cleaner and it's easier to see what's going on. The cross-platform "subl %8, %%esp" serves the same function on all platforms. Eric, can you confirm that you see this crash on a fresh unpatched build?
I'll try to get back to this tonight and address everyone's comments here.
Comment on attachment 201363 [details] [diff] [review] Fixes stack alignment problems with XPTC_InvokeByIndex Marking sr- for lack of an explanation of what this changes; feel free to re-request if there is one.
Sorry...real life is interfering. I'll try to get back to this soon.
Comment on attachment 201363 [details] [diff] [review] Fixes stack alignment problems with XPTC_InvokeByIndex Doesn't work, see dbaron's comments.
Created attachment 208059 [details] [diff] [review] Align stack for all calls I'm pretty sure I know what's going on here. There are two CALLs in XPTC_InvokeByIndex. The asm ensures that the stack is 16-byte aligned for the second CALL, but the stack will always be offset by 8 bytes for the first CALL to invoke_copy_to_stack. invoke_copy_to_stack is CALLed directly by IP offset, so the misaligned stack doesn't cause an immediate problem. invoke_copy_to_stack does call IsPtrData through a stub, though, and this is where the alignment requirement is enforced. This doesn't present a problem at -O1 or higher only because IsPtrData is inlined and no CALL is made inside invoke_copy_to_stack. That procedure returns, our asm aligns the stack for the second call, and everything proceeds as it should. We shouldn't assume that the stack doesn't need to be aligned for the first CALL. We should adhere to the ABI and guarantee alignment any time we CALL. This patch makes that change by aligning for the first CALL instead of the second, and manipulating ESP accordingly between CALLs so that the stack remains aligned throughout.
Created attachment 208060 [details] [diff] [review] Align stack for all calls, and restore ESP more intelligently This is the same as the previous patch, except it doesn't do any unnecessary ESP manipulation when it's done, it simply puts ESP back the way it found it. This patch also includes clean-up to turn the XP_MACOSX macros into KEEP_STACK_16_BYTE_ALIGNED ones, and adds the appropriate #define to xptc_platforms_unixish_x86.h.
Created attachment 208082 [details] [diff] [review] Align stack for all calls, and restore ESP more intelligently I noticed that the previous patch wouldn't build on machines that don't require aligned stacks, because of overzealous #ifdefing. This fixes that problem, and replaces the temp3 placeholder with saved_esp. I also noticed that the parameter number for fn_copy (%0) was incorrect (%3). I took ESP out of the clobber spec for OS X because save/restore is now handled directly in that asm block, so ESP is not clobbered. The compiler didn't know what to do about ESP anyway (hence the requirement to save/restore).
Comment on attachment 208082 [details] [diff] [review] Align stack for all calls, and restore ESP more intelligently Unfortunately, this doesn't work. "Link (dyld) error: lazy pointer not found"
Could you run it in the debugger and get the backtrace when dyld sends its SIGTRAP? Thanks.
Oh, the args after "this" are in the wrong spot.
Created attachment 208121 [details] [diff] [review] Align stack for all calls, set up parameters properly I believe this version works for alignment in xptcinvoke. It maintains the parameter area for the second CALL above the area for the first CALL; both are aligned, and invoke_copy_to_stack is pointed into the second CALL's parameter area for its destination parameter. dbaron, I'd appreciate a review of this asm. An -O0 build now crashes elsewhere. I don't believe the new crash is related to xptcall.
The problems we're now seeing on the DTK are a race condition within dyld. In conjunction with this patch, disabling hyperthreading results in a stable application. See http://www.opensource.apple.com/darwinsource/10.4.4.ppc/dyld-44.2/src/dyld.cpp, bindLazySymbol. ADC login required. Note that this code is not equivalent to the shipping x86 code. We don't yet know if shipping x86 systems are susceptible to the dyld problem. The xptcall invoke stack alignment issue remains a very real problem. In order to run on x86, we need to get this x86 patch in. (Since we already did an alignment patch once over the summer, we'll call this patch a REalignment fix.) dbaron, could you please take a look?
Comment on attachment 208121 [details] [diff] [review] Align stack for all calls, set up parameters properly Looks good to me, works fine. This should be tested on a system that does not define KEEP_STACK_16_BYTE_ALIGNED since this patch modifies the way things are done on such systems. I don't have a setup to do that at the moment.
I can actually test this without the 16-byte alignment stuff. I'll just turn it off and run on PPC Mac OS X.
Seems to work fine without KEEP_STACK_16_BYTE_ALIGNED being defined.
(In reply to comment #38) > I can actually test this without the 16-byte alignment stuff. I'll just turn it > off and run on PPC Mac OS X. ppc doesn't use any of this x86 code, but the only changes on systems that don't use KEEP_STACK_16_BYTE_ALIGNED are just to get the parameter indices right.
blocking22.214.171.124: this is required for an x86 Mac release.
Checked in to CAMINO_1_0_BRANCH. Still waiting on review, dbaron!
Comment on attachment 208121 [details] [diff] [review] Align stack for all calls, set up parameters properly dbaron says he won't have a chance to review. shaver, if you decide to use your free pass too, I'm going to have to start picking superreviewers who haven't advertised their x86 asm prowess.
Comment on attachment 208121 [details] [diff] [review] Align stack for all calls, set up parameters properly Superreview isn't going to require deep knowledge of x86 assembly here, it's integration review; the review should be on the basis of the code analysis, and I see that josh has reviewed, so sr=shaver.
Landed on trunk. We need this on the branches.
This fix appears to have broken BeOS builds. See bug 326838 for details.
Comment on attachment 208121 [details] [diff] [review] Align stack for all calls, set up parameters properly b181=shaver
Comment on attachment 208121 [details] [diff] [review] Align stack for all calls, set up parameters properly approved for 1.8.0 branch, a=dveditz for drivers
Awaiting review on bug 326838 before landing this on branches, so as to avoid busting BeOS on the branches.
Checked in on 1_8_0 and 1_8 branches.