Last Comment Bug 312929 - Firefox does not launch on latest Intel Mac OS X build (8F1099)
: Firefox does not launch on latest Intel Mac OS X build (8F1099)
Status: RESOLVED FIXED
: fixed1.8
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: PowerPC Mac OS X
: -- critical (vote)
: ---
Assigned To: Josh Aas
:
:
Mentors:
Depends on: 297326
Blocks:
  Show dependency treegraph
 
Reported: 2005-10-18 15:51 PDT by Josh Aas
Modified: 2005-10-24 20:02 PDT (History)
11 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
crash log (2.55 KB, text/plain)
2005-10-18 15:55 PDT, Josh Aas
no flags Details
Align the stack inside nsXPTCStubBase::Stub##n (1.91 KB, patch)
2005-10-21 01:21 PDT, Eric Albert
doug.turner: review+
dbaron: superreview+
asa: approval1.8rc1+
Details | Diff | Splinter Review

Description Josh Aas 2005-10-18 15:51:57 PDT
Firefox does not launch on latest Intel Mac OS X build (8F1099).
Comment 1 Josh Aas 2005-10-18 15:55:27 PDT
Created attachment 200005 [details]
crash log

> Exception:  EXC_BAD_INSTRUCTION (0x0002)

Probably misaligned stack. From the seed notes for 8F1099:

"The 16-byte stack alignment requirement in the Intel ABI is now
aggressively enforced. Applications which crash with an illegal
instruction exception are probably misaligning the stack. This should
only affect assembly code; all code built with GCC should automatically
align the stack correctly."
Comment 2 Chris McAfee 2005-10-18 17:39:20 PDT
We have hit this problem in my group at Apple, this sounds like some assembly
somewhere, NSPR? is not 16-byte aligned.

Someone in my group took a look at this stack and suggested that
frames 1 and 14 of Thead 0 looks like a good place to start looking.

I can give more details in email and/or hook people up with OS people,
send me mail.
Comment 3 Josh Aas 2005-10-19 10:45:09 PDT
If anyone else who cares is as uneducated on this subject as I am, you can visit
http://tinyurl.com/965zr for an idea of what is going on.
Comment 5 Josh Aas 2005-10-20 17:14:57 PDT
bsmedberg: We use "xptcinvoke_unixish_x86.cpp" not "xptcinvoke_gcc_x86_unix.cpp".

McAfee: Almost assuredly the problem is in "xptcinvoke_unixish_x86.cpp". We are
short on people who know x86 assembly, and if you know someone who could take a
look that would be great. See 297326 for what we had to do for the first Mac OS
X for Intel builds.
Comment 6 Chris McAfee 2005-10-20 17:22:41 PDT
I have a few people to ask, just sent mail.
Comment 7 Eric Albert 2005-10-21 01:21:25 PDT
Created attachment 200322 [details] [diff] [review]
Align the stack inside nsXPTCStubBase::Stub##n
Comment 8 Eric Albert 2005-10-21 01:26:39 PDT
Firefox wasn't aligning the stack inside nsXPTCStubBase::Stub5.  I've attached diffs against mozilla TOT 
which do this.  With this change in place I can launch DeerPark and browse the web a bit.  I eventually 
hit a crash inside some plugin code that isn't due to a misaligned stack, so I'm guessing that's a 
separate problem in TOT.

For what it's worth, I'm not sure why nsXPTCStubBase::Stub##n has inline asm at all.  As far as I can tell, 
the function could be written like this:
  return PrepareAndDispatch(firstArg, n, &secondArg);

You can't quite do that today because the stubs are defined as taking no arguments in 
xptcstubsdecl.inc, but that's clearly wrong.  They obviously do take arguments, and if they were 
declared like that this code would be a lot more maintainable.
Comment 9 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2005-10-21 14:52:40 PDT
(In reply to comment #8)
> You can't quite do that today because the stubs are defined as taking no
arguments in 
> xptcstubsdecl.inc, but that's clearly wrong.  They obviously do take
arguments, and if they were 
> declared like that this code would be a lot more maintainable.

The problem is that they take variable numbers of arguments using normal (not
varargs) calling conventions, and the numbers and types are determined by
information contained in data structures that the callee has access to.
Comment 10 Josh Aas 2005-10-21 15:02:09 PDT
With Eric's patch, Firefox is up and running again. Thanks!
Comment 11 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2005-10-21 15:26:46 PDT
Comment on attachment 200322 [details] [diff] [review]
Align the stack inside nsXPTCStubBase::Stub##n

>+#define ALIGN_STACK	\
>+    "andl $0xfffffff0, %%esp\n\t" \
>+    "subl $0x4, %%esp\n\t"

You could put an 

"addl $0x4, %%esp\n\t" \

in as the first line of this macro to avoid wasting 16 bytes in one of the 4
possible cases (although we probably never hit that one; I suspect there's no
room for variation and if we hit that one nobody would have filed the bug)

With or without that, sr=dbaron.
Comment 12 Josh Aas 2005-10-21 16:15:24 PDT
Comment on attachment 200322 [details] [diff] [review]
Align the stack inside nsXPTCStubBase::Stub##n

Requesting 1.8 approval. Consider carefully - while we don't need this fix for
FF 1.5, we will need it on the Mozilla 1.8 branch within the next 2-3 months.
It is absolutely necessary in order to get FF running on Intel Macs.

So, we either take the risk now or in a later point release. Perhaps get some
guidance on the risk factor from dbaron.
Comment 13 Chris McAfee 2005-10-21 16:43:26 PDT
The diff to me looks like MACOSX/i386 only, we could test this with
other OS's that use the xptcstubs_unixish_x86.cpp file to build, hmm
looks like linux/i386 does?
Comment 14 Josh Aas 2005-10-21 18:02:51 PDT
Landed on trunk with dbaron's suggestion from comment #11 (it seems to work
fine). Forgot to give Eric credit in the cvs commit credit. Won't forget on
branch landing whenever that happens. Thanks Eric!
Comment 15 Eric Albert 2005-10-21 19:03:04 PDT
(In reply to comment #9)
> The problem is that they take variable numbers of arguments using normal (not
> varargs) calling conventions, and the numbers and types are determined by
> information contained in data structures that the callee has access to.

The weird thing about that, though, is that the stubs don't seem to pass those
extra arguments on to the function they're calling or use those extra arguments.
Maybe that's something specific to the x86 implementation -- I haven't looked
at any of the others -- but PrepareAndDispatch only takes three arguments.  If
we pass fewer than two arguments it'll be passed junk, and if we pass more than
that they'll get ignored.  Because we're calling PrepareAndDispatch rather than
jumping to it, it doesn't really have a way to access extra arguments unless it's
both written in assembly and doing some very nasty stuff.

Or maybe I'm missing something very important here. :)  Which wouldn't be
surprising considering that I hadn't looked at this code before last night....
Comment 16 Mark Mentovai 2005-10-21 20:48:18 PDT
Magic!

or...

The stub doesn't pass arguments to PrepareAndDispatch in accordance with the
standard calling convention, it passes a pointer to the base of the arguments in
memory.  The args argument that PrepareAndDispatch receives points to the
arguments that the stub received in its own stack frame.  methodIndex, the same
as the stub number, informs PrepareAndDispatch of the quantity and types of the
arguments.

Other architectures work similarly, although the calling conventions are of
course different.
Comment 17 :Gavin Sharp [email: gavin@gavinsharp.com] 2005-10-22 11:35:26 PDT
*** Bug 313398 has been marked as a duplicate of this bug. ***
Comment 18 Josh Aas 2005-10-23 22:47:53 PDT
Landed on branch. Thanks Eric!

Note You need to log in before you can comment on or make changes to this bug.