Closed Bug 272369 Opened 20 years ago Closed 20 years ago

[s390] firefox -register results in SIGSEGV

Categories

(Core :: XPCOM, defect)

1.7 Branch
x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: caillon, Assigned: bugzilla)

Details

(Keywords: crash, fixed-aviary1.0.4, fixed1.7.8)

Attachments

(3 files)

Same for thunderbird.


#0  0x77f42830 in nsQueryInterface::operator() (this=0x719c78,
    aIID=@0x7776212c, answer=0x7fffdac4) at nsCOMPtr.cpp:47
#1  0x77f42992 in nsCOMPtr_base::assign_from_qi (this=0x7fffdae0, qi=
      {mRawPtr = 0x719c78}, iid=@0x7fffdac4) at nsCOMPtr.cpp:96
#2  0x77755bb6 in XPCWrappedNative::GetNewOrUsed (ccx=@0x7fffdcf8,
    Object=0x7776212c, Scope=0x4e4d18, Interface=0x714db0,
    resultWrapper=0xf775e086) at nsISupportsUtils.h:208
#3  0x7773d8c0 in XPCConvert::NativeInterface2JSObject (ccx=@0x7fffdcf8,
    dest=0x7fffdca0, src=0x719c78, iid=0x7fffdce8, scope=0x7776212c, pErr=0x0)
    at xpcprivate.h:3153
#4  0x7773fcf8 in XPCConvert::NativeData2JS (ccx=@0x7fffdcf8, d=0x7fffde50,
    s=0x7776212c, type=@0x80004003, iid=0x7fffdce8, scope=0x53d3b0, pErr=0x0)
    at nsCOMPtr.h:1211
#5  0x77750eae in nsXPCWrappedJSClass::CallMethod (this=0x71b8f0,
    wrapper=0x7fffdd90, methodIndex=12, info=0x544f00, nativeParams=0x7fffdf68)
    at xpcwrappedjsclass.cpp:1226
#6  0x7774c588 in nsXPCWrappedJS::CallMethod (this=0x71bbe0, methodIndex=8492,
    info=0x719c78, params=0x80004003) at xpcprivate.h:2139
#7  0x77f307fc in PrepareAndDispatch (self=0x71bbe0, methodIndex=12,
    a_gpr=0x7fffe040, a_fpr=0x7fffe040, a_ov=0x7fffe034)
    at xptcstubs_linux_s390.cpp:161
#8  0x77f30b74 in nsXPTCStubBase::Stub12 (this=0x719c78) at xptcstubsdef.inc:14
#9  0x76f9a87e in RDFContainerUtilsImpl::IsA (this=0x719c78,
    aDataSource=0x77f30b38, aResource=0x7776212c, aType=0x7fffdac4)
    at nsRDFContainerUtils.cpp:477
#10 0x76f9a902 in RDFContainerUtilsImpl::IsContainer (this=0x4a05a0,
    aDataSource=0x71bbe0, aResource=0x719c78, _retval=0x7fffe0d8)
    at nsRDFContainerUtils.cpp:229
#11 0x76f98cb2 in RDFContainerImpl::Init (this=0x71b7e8, aDataSource=0x71bbe0,
    aContainer=0x719c78) at nsRDFContainer.cpp:166
#12 0x77f305c4 in XPTC_InvokeByIndex (that=0x71b7e8, methodIndex=2004230444,
    paramCount=2, params=0xf7e6a82e) at xptcinvoke_linux_s390.cpp:239
#13 0x00000000 in ?? ()
(gdb) frame 0
#0  0x77f42830 in nsQueryInterface::operator() (this=0x719c78,
    aIID=@0x7776212c, answer=0x7fffdac4) at nsCOMPtr.cpp:47
47                                      status = mRawPtr->QueryInterface(aIID,
answer);
(gdb) p mRawPtr
$1 = (nsISupports *) 0x1
Oh, and this is with the patches for bug 264324 and bug 264326
and it affects s390x as well
Attached patch Incomplete patchSplinter Review
Still doesn't work, but from Pete Zaitcev who says these changes are quite
necessary:

----
OK, I think I did everything I could. Patch is attached.

This thing might have even worked, but the method pointer is bogus,
and it happens way before assembler gets involved. It comes from here:

    PRUint32 *vtable = *(PRUint32 **)that;
#if defined(__GXX_ABI_VERSION) && __GXX_ABI_VERSION >= 100 /* G++ V3 ABI */
    PRUint32 method = vtable[methodIndex];
#else /* not G++ V3 ABI  */
    PRUint32 method = vtable[methodIndex + 2];
#endif /* G++ V3 ABI */

----
I'm currently investigating this problem.  It appears that the glue code
makes assumptions about GCC behaviour that no longer hold true for GCC 3.4;
in particular the xptcinvoke code only works if invoke_count_word is *not*
inlined (but 3.4 does inline it), and the xptcstubs code only works if the
compiler generates a stack backchain (which 3.4 doesn't by default).

Unfortunately the 264326 patch made the problem worse; the current stubs code
won't work correctly even with an old compiler.  Pete Zaitcev's patch also
does not solve the core problems, in fact it introduced even *more* code that
relies on the presence of a backchain.

As this whole code is a bit fragile, I would prefer to simply use additional
compiler flags for building these two files that cause modern compilers to
again behave as this code expects.  (In particular reimplementing the stubs
code *without* using the stack backchain appears impossible -- unless we go
to a full assembler implementation.)  I'm testing a patch ...

I don't think it would be out of line to have two versions of the stubs files,
one compiled for one compiler and one compiled for the other.
Using this patch I was able to build and run firefox on
s390-ibm-linux and s390x-ibm-linux.  The patch reverts the
problematic parts of the earlier fix attempt, and uses compiler
options to make GCC 3.4 behave as the assembly stubs expect:

-fno-strict-aliasing
  (several of the routines violate C aliasing rules)
-fno-inline
  (so that the InvokeByIndex function allocates a stack frame)
-fomit-frame-pointer
  (so that %r11 is available for allocation, otherwise we might
   not be able to satisfy 6 inputs + 6 clobbers in GPRs)
-mbackchain
  (so that xptcstubs has the backchain available)

Note that older compilers will accept those options as well, so
that it should be possible to build firefox with any 3.x compiler.
I will try that but have to note that without the reverted patch it was not
possible to run regxpcom of mozilla on s390(x) compiled with gcc3.3.x.

I will retest with this patch both firefox and mozilla
I cannot pass Dr. Weigand's comment #5. My patch did NOT "rely on presense of
a backchain". I assume he meant the xpcall part. In reality, it forms a whole
new frame, then stores the old r15 into the backchain slot to be used later.
I could as well put it, say, into r11, if only I knew how to tell gcc nicely
not to evaluate other asm arguments into it.

Another thing, existing code for xpcall is wrong. It does not create a new
frame before calling invoke_copy_to_stack. Instead, it tries to move it
quietly by doing this:
        "l     2,0(15)\n\t"
        "la    15,0(3,7)\n\t"
        "st    2,0(15)\n\t"
If no overflow arguments are present, it only moves it down by 32 bytes.
Arguments end in the "undefined" area at offset 96 - 32 = 64, which is
"saved f6". As overflow arguments start to appear, the overlap changes,
and eventually the just created arguments will overwrite saved register
values which the very first stm of XPTC_InvokeByIndex stored.

Now I'm not saying that my patch was 100% correct, for a couple of
reasons.

First reason is the use of r7. Old code does not use it as an overflow
argument pointer. Instead, a) xpstub simply adds 96 to r15 and expects to
find arguments there, and b) xpcall uses r7 as equivalent of frame pointer.
If that's how you want it, my patch has to be changed to tuck overflow
area right behind the frame.

Second reason is that in xptstub I had no idea how that's called, so
I went by what was there. If you want to step one more frame up with
l 15,0(15), well, I had no way to know. I wonder though.

In any case, what I wrote about the frame overlap and corruption of
registers saved by XPTC_InvokeByIndex prologue seems correct. Should be
easy to test, too. Just creat a method with 20 arguments.
without thinking about the details, I've tested the compile-options and
reverting the patch for mozilla trunk and aviary.
Both "work" with attachment 167746 [details] [diff] [review] where work means that build is successful and
regxpcom resp. firefox-register doesn't segfault.
This was tested for gcc 3.3.3 and 3.3.4 used by SUSE.
I haven't made further tests up to now.
Pete,  
  
the code as modified by your patch never stores to the backchain. 
However, re-reading your patch in view of your comment, it would 
appear that instead of the line 
+        "s     7,0(15)\n\t" 
you may have intended to write 
+        "st    7,0(15)\n\t" 
which would indeed have stored the to the backchain slot. 
 
With that change my comment as to relying on -mbackchain is moot, 
but even so it would be dangerous as it relies on the *called* 
function not *changing* the backchain slot.  This assumption is 
dangerous as the whole 96 byte stack bias area 'belongs' to the 
called function, and e.g. GCC 4.0 (or Red Hat's 3.4) when invoked 
with the -mpacked-stack switch will generate code that does modify 
all of the stack bias area. 
 
Now, as to %r7, this register has no special meaning whatsoever 
defined by ABI.  In particular, the argument area is defined by 
the ABI to start at %r15 + 96 (on function entry), and has nothing 
to do with %r7.  Thus I'm somewhat confused about your comment 
(it certainly explains your change to xptstubs, but that doesn't 
make it correct :-/). 
 
[ Note that the fact that the overflow area starts at %r15 + 96 
is the whole *reason* why InvokeByIndex is so ugly.  If I could 
simply allocate the overflow area anywhere, everything would be 
much simpler.  It is also the reason why the xptstubs code would 
probably be impossible to implement without a backchain -- there 
is no way from within an inline assembly to determine the value 
of %r15 *at the start of the function* containing the assembly, 
except by reading the backchain value. ] 
 
As to the InvokeByIndex assembly not allocating a stack frame, 
that's of course a correct observation, and it is indeed a bug 
*if InvokeByIndex is a leaf function*!  That's just what I said 
in my first comment.  If InvokeByIndex is *not* a leaf function 
(and it isn't unless invoke_count_word is inlined), then its 
function prolog already allocates a stack frame, and the assembly 
uses that frame to provide the stack bias area to its callee, 
just like a compiler-generated call would. 
 
Wolfgang, 
 
if by 'gcc3.3.x' in comment #8 you meant the compiler 
provided by SuSE (e.g. SLES8 SPS3 or SLES9), then this 
compiler has by default the backchain disabled (just 
like vanilla 3.4 and above do, but unlike a vanilla 3.3), 
so you'd see the same problem as I described for 3.4 with 
the original xptstubs code. 
 
Comment on attachment 167746 [details] [diff] [review]
xptcall fix for s390(x)

as this patch seems to fix the problems, we should check this in as long as no
better solution is available (or possible at all)
Attachment #167746 - Flags: superreview?
Attachment #167746 - Flags: review?(shaver)
one addition: the patch is meant for trunk and not aviary, where only the
Makefile change would be needed. So if another version is built out for aviary
it should go there, too. The CFLAGS change is very low risk.
Comment on attachment 167746 [details] [diff] [review]
xptcall fix for s390(x)

r=shaver
Attachment #167746 - Flags: review?(shaver) → review+
Attachment #167746 - Flags: superreview? → superreview?(dougt)
Attachment #167746 - Flags: superreview?(dougt) → superreview+
Comment on attachment 167746 [details] [diff] [review]
xptcall fix for s390(x)

I just took the liberty of checking this fix in to the trunk.
Severity: normal → critical
thanks Chris. I have no CVS access so I've hoped that somebody take it ;-)
So we can close this?
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Yeah this can be closed.  I think all of Pete's concerns have been addressed
too, but if not those can be separate bugs.
Moving products so I can request approval for 1.7.7
Component: General → XPCOM
Flags: review+
Product: Firefox → Core
Version: 1.0 Branch → 1.7 Branch
Comment on attachment 167746 [details] [diff] [review]
xptcall fix for s390(x)

Want to check this into the aviary and 1.7 branches for the distributors who
care about s390(x).
Attachment #167746 - Flags: approval1.7.8?
Attachment #167746 - Flags: approval-aviary1.0.4?
Comment on attachment 167746 [details] [diff] [review]
xptcall fix for s390(x)

a=mkaply
Attachment #167746 - Flags: approval1.7.8?
Attachment #167746 - Flags: approval1.7.8+
Attachment #167746 - Flags: approval-aviary1.0.4?
Attachment #167746 - Flags: approval-aviary1.0.4+
This is a rediffed version of the interdiff between this patch and the patch
that this was reverting.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: