Closed Bug 961488 Opened 10 years ago Closed 10 years ago

Segmentation fault in mozJSComponentLoader::Import on ppc32

Categories

(Core :: XPConnect, defect)

PowerPC
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32
Tracking Status
firefox29 --- wontfix
firefox30 --- fixed
firefox31 --- fixed
firefox32 --- fixed
b2g-v1.4 --- fixed

People

(Reporter: stevensn, Assigned: gaston)

References

Details

(Whiteboard: [qa-])

Attachments

(7 files, 2 obsolete files)

Attached file gdb stack trace
On Linux / powerpc32 on mozilla-central 61fd0f987cf2 

firefox gets a segmentation fault on startup in mozJSComponentLoader::Import


#3  0x0b7c8c64 in mozJSComponentLoader::Import (this=0x48e1bbe0, registryLocation=..., 
    targetValArg=<error reading variable: Cannot access memory at address 0xffffff82>, cx=0x48e1d940, 
    optionalArgc=0 '\000', retval=<error reading variable: Cannot access memory at address 0xffffff82>)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/loader/mozJSComponentLoader.cpp:1094
#4  0x0b6c04f4 in nsXPCComponents_Utils::Import (this=0x48ef92e0, registryLocation=..., targetObj=
    <error reading variable: Cannot access memory at address 0xffffff82>, cx=0x48e1d940, optionalArgc=0 '\000', 
    retval=<error reading variable: Cannot access memory at address 0xffffff82>)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCComponents.cpp:2733
#5  0x09a04360 in NS_InvokeByIndex ()
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_asm_ppc_linux.S:86
#6  0x0b7724ec in Invoke (this=0xbfa990c8)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:2550
#7  Call (this=<optimized out>)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:1890
.
.
.


I haven't been seeing this one my ppc64 machine
9409405e0739 seems uneffected.
81bced59e8b3 also seems uneffected
9bcc52594322 effected 
7cd28a857f88 (+bcbe93f41547 so it links) is uneffected.
357883f2205b effected
dec061c862ff couldn't build because of bustage
259c34b488f7 + bcbe93f41547 is uneffected
f017ae03bb6c effected
913a1c5086c5 uneffected

Looks like bug 939294 introduced this
Blocks: 939294
Steve, thanks for the notification. This is an endian problem in the handle. It's accessing 0xffffff82, which is of course the tag, not the value. That is a large bug, however, so I can't find where just yet.
On first review, one bandaid might be to do the endian swap for the handle in the xptcall -- this should work and wouldn't impact the tier-1 little endian platforms. Boris, do you think this is workable, or do you suspect something else? Or would this be a bsmedberg question?
Flags: needinfo?(bzbarsky)
That sounds more like a bsmedberg question to me.

That said the code used to pass a Value& and now passes a HandleValue, but the two should be binary-compatible, no?  So it's not clear to me why we need any xptcall changes...
Flags: needinfo?(bzbarsky)
Component: JavaScript Engine → XPConnect
OK. I'm not familiar enough with how RootedValues work, though, to know how the pointers are extracted. It must be something wrong in the xpconnect changes I'm not seeing yet.
RootedValue just has an internal Value member.  A Handle has a Value* member.  The conversion is like so:

1135 template <typename T> template <typename S>
1136 inline
1137 Handle<T>::Handle(const Rooted<S> &root,
1138                   typename mozilla::EnableIf<mozilla::IsConvertible<S, T>::value, int>::Type dummy)
1139 {
1140     ptr = reinterpret_cast<const T *>(root.address());
1141 }

Here root.address() returns an S*.  In the case when both S and T are Value, there is no particular magic...

The patches in bug 939294 just changed the headers generated by xpidl to use "HandleValue" instead of "const Value&" and "MutableHandleValue" instead of "Value*" and fixed up the implementations of the methods to match the new signature.  There were no changes to argument conversions or xptcall....
Steve, are you sure about that regression range?
Flags: needinfo?(steve)
I have done a few tests that verify that things work before f017ae03bb6c and don't after.
Okay, unless anyone sees something obvious, I'll try to debug this when I have a 29 build up for TenFourFox, but that'll be a couple weeks to finish the merging.
I've spent some time in the debugger on this and haven't figured it out.  I just don't have a good enough understanding of what the data structures in memory should look like to spot the problem. I have a feeling it might be around the time ConvertDependentParams or ConvertIndependentParams is called but I'm just not sure,

I will update the bug if I'm able to figure out anything in the meantime.
Flags: needinfo?(steve)
I believe ConvertIndependentParams is where the relevant bits should be happening.
If that's the case, then, I'm looking at XPCConvert::JSData2Native(). I don't have a build up to test, but do the RootedValues appear to be nsXPTType::T_JSVAL? Because if so,

    case nsXPTType::T_JSVAL :
        *((jsval*)d) = s;
        break;

I wonder about swapping the two jsval words here unless this will mess up other things.
(i.e., turn a JSVAL of 0xffffff82NNNNNNNN into 0xNNNNNNNNffffff82 if big-endian)
The question is this.  "s" is already a Value.  *d will be treated as a Value, right?  Why is any swapping needed when copying a Value to another Value?

Or is there some sort of weirdness for jsval vs Value here?  How is jsval defined on the relevant platform?
Hmm.  I guess "s" is a HandleValue.  But that didn't change in bug 939294.

And jsval is a typedef for Value across the board, as far as I can tell.
Is it possible it just unmasked some previous endian issue that was not exposed before?

The problem appears to me -- and I may be utterly wrong -- that something downstream of the call thunked the offending 64-bit jsval into a 32-bit pointer. On a little-endian platform, the pointer is at byte offset zero, so it just works. On big-endian, it's at byte offset four, so you get the tag, which is bogus.

I just don't know how to detect the situation, since your comment alleges any jsval will get here, even doubles (which probably should NOT be byteswapped).
Any jsval will get here, yes.

Someone probably needs to step through what's going on and see where things go awry.  :(
P don't thikn the problem is with the conversion in XPCConvert::JSData2Native

s ptr asPtr is 0xffffff82 


Breakpoint 1, XPCConvert::JSData2Native (d=0xbfffc350, s=JSVAL_VOID, type=..., useAllocator=true, iid=0xbfffc478, pErr=0xbfffc488)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCConvert.cpp:441
441	        *((jsval*)d) = s;
(gdb) p s
$7 = JSVAL_VOID
(gdb) disable pretty-printer .* SpiderMonkey
15 printers disabled
0 of 15 printers enabled
(gdb) p s
$8 = {<js::HandleBase<JS::Value>> = {<js::ValueOperations<JS::Handle<JS::Value> >> = {<No data fields>}, <No data fields>}, 
  ptr = 0xbfffc470}
(gdb) s
JS::Handle<JS::Value>::operator JS::Value const& (this=0xbfffc4d8) at ../../../dist/include/js/RootingAPI.h:482
warning: Source file is more recent than executable.
482	    operator const T&() const { return get(); }
(gdb) s
JS::Handle<JS::Value>::get (this=0xbfffc4d8) at ../../../dist/include/js/RootingAPI.h:476
476	    const T& get() const { return *ptr; }
(gdb) p *ptr
$9 = {data = {asBits = 18446743532543672320, s = {tag = JSVAL_TAG_UNDEFINED, payload = {i32 = 0, u32 = 0, boo = 0, str = 0x0, obj = 0x0, 
        ptr = 0x0, why = JS_ELEMENTS_HOLE, word = 0, uintptr = 0}}, asDouble = -nan(0xfff8200000000), asPtr = 0xffffff82}}
(gdb) s
XPCConvert::JSData2Native (d=0xbfffc350, s=..., type=..., useAllocator=true, iid=0xbfffc478, pErr=0xbfffc488)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCConvert.cpp:442
442	        break;
(gdb) 
759	    return true;
(gdb) 
mozilla::AutoJSContext::~AutoJSContext (this=0xbfffc0b0, __in_chrg=<optimized out>) at ../../../dist/include/nsCxPusher.h:103
warning: Source file is more recent than executable.
103	private:
(gdb) s
mozilla::Maybe<mozilla::AutoCxPusher>::~Maybe (this=0xbfffc0b8, __in_chrg=<optimized out>) at ../../../dist/include/mozilla/Maybe.h:41
41	    ~Maybe() { if (constructed) asT().~T(); }
(gdb) s
XPCConvert::JSData2Native (d=0xbfffc350, s=<error reading variable: Cannot access memory at address 0x1>, type=..., useAllocator=true, 
    iid=0xbfffc478, pErr=0xbfffc488) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCConvert.cpp:760
760	}
(gdb) p s
Cannot access memory at address 0x1
(gdb) p d
$10 = (void *) 0xbfffc350
(gdb) s
ConvertIndependentParam (i=1 '\001', this=0xbfffc310)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:2172
2172	    return true;
(gdb) p dp
$11 = (nsXPTCVariant *) 0xbfffc350
(gdb) p *dp
$12 = {<nsXPTCMiniVariant> = {val = {i8 = -1 '\377', i16 = -1, i32 = -126, i64 = -541165879296, u8 = 255 '\377', u16 = 65535, 
      u32 = 4294967170, u64 = 18446743532543672320, f = -nan(0x7fff82), d = -nan(0xfff8200000000), b = 255, c = 255 '\377', 
      wc = -1 u'\xffff', p = 0xffffff82, j = {data = {asBits = 18446743532543672320, s = {tag = JSVAL_TAG_UNDEFINED, payload = {i32 = 0, 
              u32 = 0, boo = 0, str = 0x0, obj = 0x0, ptr = 0x0, why = JS_ELEMENTS_HOLE, word = 0, uintptr = 0}}, 
          asDouble = -nan(0xfff8200000000), asPtr = 0xffffff82}}}}, ptr = 0xbfffc350, type = {<XPTTypeDescriptorPrefix> = {
      flags = 26 '\032'}, <No data fields>}, flags = 3 '\003'}
(gdb) 
p mArgv[1]
$23 = {data = {asBits = 18446743555268535344, s = {tag = JSVAL_TAG_OBJECT, payload = {i32 = 1250026544, u32 = 1250026544, 
        boo = 1250026544, str = 0x4a81e430, obj = 0x4a81e430, ptr = 0x4a81e430, why = 1250026544, word = 1250026544, 
        uintptr = 1250026544}}, asDouble = -nan(0xfff874a81e430), asPtr = 0xffffff87}}
So if I'm reading your gdb session correctly, this implies that the pointer is already wrong, at least for the JSVAL_TAG_OBJECT (asPtr should be a real pointer).
Okay, I have TenFourFox 29 running (it's horrifically buggy right now but Australis does work in a basic sense on 10.4).

The good news: It doesn't look like an endian problem in handles.
The bad news: I don't know what I did to get it running, but I can't replicate it on OS X/ppc.

The only guidance I can give is that I think moz.build in xpcom/reflect/xptcall/src/md/unix is wrong -- it was compiling Darwin/x86, not /ppc, and I couldn't get it to link initially because it was missing all the PowerPC glue symbols. I eventually just hardcoded it in our local changesets because we only build PPC anyway. You might want to look and see what it's actually compiling on Linux/ppc32 -- I bet it's doing something different than ppc64.
(Blame log says that was bug 932178, btw, fwiw.)
Another user (from #debianppc) has confirmed the same stack trace on a debian ppc32 machine.
I've found that if I make this change

-- a/js/xpconnect/loader/mozJSComponentLoader.cpp
+++ b/js/xpconnect/loader/mozJSComponentLoader.cpp
@@ -1097,20 +1097,20 @@ mozJSComponentLoader::UnloadModules()
 NS_IMETHODIMP
 mozJSComponentLoader::Import(const nsACString& registryLocation,
                              HandleValue targetValArg,
                              JSContext *cx,
                              uint8_t optionalArgc,
                              MutableHandleValue retval)
 {
     MOZ_ASSERT(nsContentUtils::IsCallerChrome());
-    RootedValue targetVal(cx, targetValArg);
     RootedObject targetObject(cx, nullptr);
     if (optionalArgc) {
+        RootedValue targetVal(cx, targetValArg);

I get a bit further, it then calls ImportInto which will cause the Import function to be invoked again which will then segfault setting retval

#0  0x096cd604 in JS::Value::setObject (this=0xffffff82, obj=...) at ../../../dist/include/js/Value.h:982
#1  0x096cecb0 in js::MutableValueOperations<JS::MutableHandle<JS::Value> >::setObject (this=0xfffe9acc, obj=...)
    at ../../../dist/include/js/Value.h:1632
#2  0x0b3ae73c in mozJSComponentLoader::Import (this=0xf7df3950, registryLocation=..., targetValArg=..., cx=0xf7d81e40, 
    optionalArgc=0 '\000', retval=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/loader/mozJSComponentLoader.cpp:1148
#3  0x0b2ac9a0 in nsXPCComponents_Utils::Import (this=0xf5cb4500, registryLocation=..., targetObj=..., cx=0xf7d81e40, optionalArgc=
    0 '\000', retval=...) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCComponents.cpp:2733
#4  0x0958b1c4 in NS_InvokeByIndex ()
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_asm_ppc_linux.S:86
#5  0x0b355248 in Invoke (this=0xfffe9d70)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:2408
#6  Call (this=<optimized out>)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:1748
#7  XPCWrappedNative::CallMethod (ccx=..., mode=XPCWrappedNative::CALL_METHOD)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:1714
#8  0x0b35fe60 in XPC_WN_CallMethod (cx=0xf7d81e40, argc=1, vp=0xf42e80f8)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1282
#9  0x0dd986e8 in CallJSNative (args=..., native=0xb35fa7c <XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*)>, cx=0xf7d81e40)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/jscntxtinlines.h:239
#10 js::Invoke (cx=0xf7d81e40, args=..., construct=js::NO_CONSTRUCT)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:476
#11 0x0dda8588 in Interpret (cx=0xf7d81e40, state=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:2619
#12 0x0dd98358 in js::RunScript (cx=0xf7d81e40, state=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:423
#13 0x0dd995f0 in js::ExecuteKernel (cx=0xf7d81e40, script=..., scopeChainArg=..., thisv=..., type=js::EXECUTE_GLOBAL, evalInFrame=
    ..., result=0x0) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:631
#14 0x0dd99820 in js::Execute (cx=0xf7d81e40, script=..., scopeChainArg=..., rval=0x0)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:668
#15 0x0db94f94 in ExecuteScript (cx=0xf7d81e40, obj=..., scriptArg=..., rval=0x0)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/jsapi.cpp:4747
#16 0x0db951d4 in JS_ExecuteScriptVersion (cx=0xf7d81e40, obj=<error reading variable: value has been optimized out>, 
    script=<error reading variable: value has been optimized out>, version=JSVERSION_ECMA_5)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/jsapi.cpp:4772
#17 0x0b3addc4 in mozJSComponentLoader::ObjectForLocation (this=0xf7df3950, aComponentFile=0xf7c3b800, aURI=0xf7df42f0, aObject=
    0xf4265ca0, aTableScript=0xf4265ca4, aLocation=0xf4265ca8, aPropagateExceptions=true, aException=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/loader/mozJSComponentLoader.cpp:1026
#18 0x0b3af344 in mozJSComponentLoader::ImportInto (this=0xf7df3950, aLocation=..., targetObj=..., callercx=0xf7d81e40, vp=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/loader/mozJSComponentLoader.cpp:1244
#19 0x0b3ae698 in mozJSComponentLoader::Import (this=0xf7df3950, registryLocation=..., targetValArg=..., cx=0xf7d81e40, 
    optionalArgc=0 '\000', retval=...)


Also of note:
0xffffff82 is what you get on a 32 bit platform for  JSVAL_TAG_UNDEFINED
This also happens on SPARC 32bit/Solaris. (Firefox 29)
HandleValue is a structure, which contains a ptr.
On SPARC, structures are passed by making a copy of the structure and passing a pointer to the copy.
But the JS code just passes the ptr in HandleValue to Import().

I guess it might be the same problem on PowerPC.
Thanks steve for pointing me at that bug - i'm also hitting a segfault during make install on OpenBSD/powerpc with 29.0b8, during startup cache precompilation, which usually shows problems with the js engine, so that might be the same issue. Building the same version on OpenBSD/sparc64 now to double-check, and trying 29.0b2 on powerpc too.
I'm able to reproduce this issue using 29.0b2 on PowerPC 32 bits (tested running 32 bits userland on top of a 64 bits kernel).

29.0b2 completes build and install successfully on PPC64.
Just to add the contrary data point, I just rebuilt from 29 beta with the TenFourFox changesets, and it's still working fine on OS X/ppc. In fact, I'm maybe a week or two away from issuing a beta to our 32-bit PowerPC audience. So I'm really thinking there's a problem with xptcall -- that's the only code that should be different. We should be exercising exactly the same code in JS otherwise.
And fwiw, 29.0b2 also fails for me on OpenBSD/ppc, while 29.0b8 builds fine on OpenBSD/sparc64. So a 32-bits issue only ?
Since xptcall was suspected, i thought about https://bugzilla.mozilla.org/show_bug.cgi?id=958234 but that also landed in 28, so i doubt this is it.
Also, the js shell itself runs fine, i can do some really basic operations in it... if anyone has a more complicate usecase to trigger a segfault...
Actually (Boris, correct me if I'm wrong), that lends *more* credence to the xptcall theory, because the js shell doesn't use it.
That was was my thought on reading comment 34 too, yes.
This patch solved the "make package" issue on SPARC for me.

I've only tested with SPARC 32bit on Solaris.

I'm not sure
1) Is there any other platform needs this fix?
2) Does this fix introduce regressions? (if there's raw pointer or ref used for JSVAL)
3) Is there other struct passing in/out in xpt call interface? We may have calling conventions issues on SPARC, too.
I've seen one during "make check" several months ago, but not sure if it still exists.
This fix is probably required on the other 32-bit big-endian platforms (except Rhapsody PPC/OS X PPC, which is our port, and does work).

If it were going to break anything, I'd think this would pretty much break everything.
(In reply to Ginn Chen from comment #37)
> Created attachment 8407973 [details] [diff] [review]
> the patch that works for me on SPARC
> 
> This patch solved the "make package" issue on SPARC for me.
> 
> I've only tested with SPARC 32bit on Solaris.
> 
> I'm not sure
> 1) Is there any other platform needs this fix?

*sparc_openbsd* is used on sparc64/v9 (we dont have firefox on sparc32/v8) which afaict is unaffected.

I'll see if a similar fix can be used/applied on *ppc_openbsd*.
I think the issue is about calling conventions. So it's not something about 32bit vs 64bit, little endian vs big endian.

But, Landry is right, SPARC 64 (v9) should not be changed. Because the calling convention of SPARCv9 is "Small data structures and some floating point arguments are now passed directly in registers."
I thought i had it.. but alas, not yet. With that patch, make install is fine w/ 29.0b8 on OpenBSD/ppc (so startup cache precompilation is ok via xpcshell), firefox starts (i see the checking addons window) but as soon as the main win is painted, it crashes:

firefox(13862) in free(): error: bogus pointer (double free?) 0xffffff82
Abort trap (core dumped)

Note that i have no idea what i'm doing, just shuffling code around randomly trying to mimic sparc fix..
Remove the change for SPARC v9.
i.e. xptcinvoke_sparc64_openbsd.cpp and xptcinvoke_sparcv9_solaris.cpp should not be changed.
Attachment #8407973 - Attachment is obsolete: true
(In reply to Landry Breuil (:gaston) from comment #41)
> firefox(13862) in free(): error: bogus pointer (double free?) 0xffffff82
> Abort trap (core dumped)
> 

Can you attach the core stack?
This patch seems to let me start firefox on ppc32 on linux
Ignore my previous comment I was running the wrong binary.
I am actually seeing something similar to what Landry is:

Stack trace
#0  0x0fb3e06c in free () from /lib/libc.so.6
#1  0x0f661054 in moz_free (ptr=<optimized out>) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/memory/mozalloc/mozalloc.cpp:46
#2  0x0b65baac in NS_Free (ptr=<optimized out>) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/xpcom/base/nsMemoryImpl.cpp:211
#3  0x0c8abc3c in Free (ptr=<optimized out>) at ../../../dist/include/nsMemory.h:42
#4  CleanupParam (type=..., param=..., this=0xfffec5b8)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:2336
#5  CallMethodHelper::~CallMethodHelper (this=0xfffec5b8, __in_chrg=<optimized out>)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:1802
#6  0x0c8ac1a0 in XPCWrappedNative::CallMethod (ccx=..., mode=mode@entry=XPCWrappedNative::CALL_GETTER)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:1713
#7  0x0c8ad9d4 in GetAttribute (ccx=...) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/xpcprivate.h:2084
#8  XPC_WN_GetterSetter (cx=cx@entry=0x1025cff0, argc=0, vp=0xfffec8f0)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1321
#9  0x0e06b8ec in CallJSNative (args=..., native=0xc8ad790 <XPC_WN_GetterSetter(JSContext*, unsigned int, JS::Value*)>, cx=0x1025cff0)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/jscntxtinlines.h:239
#10 js::Invoke (cx=cx@entry=0x1025cff0, args=..., construct=construct@entry=js::NO_CONSTRUCT)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:476
#11 0x0e06c25c in js::Invoke (cx=cx@entry=0x1025cff0, thisv=..., fval=..., argc=argc@entry=0, argv=argv@entry=0x0, 
    rval=<error reading variable: value has been optimized out>)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:532
#12 0x0e06c430 in js::InvokeGetterOrSetter (cx=0x1025cff0, obj=0xe893c280, fval=..., argc=0, argv=0x0, rval=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Interpreter.cpp:604
#13 0x0dfb35c0 in js::Shape::get (this=<optimized out>, cx=0x1025cff0, receiver=..., obj=<optimized out>, pobj=<optimized out>, vp=...)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/src/vm/Shape-inl.h:46
(In reply to Ginn Chen from comment #43)
> (In reply to Landry Breuil (:gaston) from comment #41)
> > firefox(13862) in free(): error: bogus pointer (double free?) 0xffffff82
> > Abort trap (core dumped)
> > 
> 
> Can you attach the core stack?

Not really, gdb is unusable with libxul on OpenBSD, it segfaults.. but i suppose it's the same thing steve is seeing.
A few more details:

 0x0b35b6d4 in CleanupParam (type=..., param=..., this=0xfffeae30)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedNative.cpp:2336
2336                nsMemory::Free(param.val.p);
(gdb) p param
$3 = (nsXPTCMiniVariant &) @0xfffeae60: {val = {i8 = -1 '\377', i16 = -1, i32 = -126, i64 = -541165879296, u8 = 255 '\377', u16 = 65535, u32 = 
    4294967170, u64 = 18446743532543672320, f = -nan(0x7fff82), d = -nan(0xfff8200000000), b = 255, c = 255 '\377', wc = -1 u'\xffff', p = 0xffffff82, 
    j = {data = {asBits = 18446743532543672320, s = {tag = JSVAL_TAG_UNDEFINED, payload = {i32 = 0, u32 = 0, boo = 0, str = 0x0, obj = 0x0, ptr = 0x0, 
            why = JS_ELEMENTS_HOLE, word = 0, uintptr = 0}}, asDouble = -nan(0xfff8200000000), asPtr = 0xffffff82}}}}
(gdb) p param.val
$4 = {i8 = -1 '\377', i16 = -1, i32 = -126, i64 = -541165879296, u8 = 255 '\377', u16 = 65535, u32 = 4294967170, u64 = 18446743532543672320, f = 
    -nan(0x7fff82), d = -nan(0xfff8200000000), b = 255, c = 255 '\377', wc = -1 u'\xffff', p = 0xffffff82, j = {data = {asBits = 18446743532543672320, 
      s = {tag = JSVAL_TAG_UNDEFINED, payload = {i32 = 0, u32 = 0, boo = 0, str = 0x0, obj = 0x0, ptr = 0x0, why = JS_ELEMENTS_HOLE, word = 0, 
          uintptr = 0}}, asDouble = -nan(0xfff8200000000), asPtr = 0xffffff82}}}
(gdb) p param.val.p
$5 = (void *) 0xffffff82
(gdb) 

p type
$6 = (nsXPTType &) @0xfffeae6c: {<XPTTypeDescriptorPrefix> = {flags = 230 '\346'}, <No data fields>}
(gdb) 


What should val.p be for JSVAL_TAG_UNDEFINED ? nullptr or a pointer to real dynamically allocated memory?
I'm not 100% sure, but I don't think that if the tag is UNDEFINED that the payload *must* be nullptr.
So 29 will ship broken on ppc, still need to compile successfully aurora & central to crosscheck..
The PowerPC ABIs are similar *enough* between PowerOpen (AIX/OS X) and SysV (everything else) that you could maybe adopt with some changes what we're doing in OS X. In particular, if you look at xptcinvoke_ppc_rhapsody.cpp (and for that matter xptcinvoke_ppc_aix.cpp), we just dump everything on the stack; we don't have this "gprs" array business keeping track. It's a lot simpler.

You'd also need to change xptcinvoke_asm_ppc_openbsd.S so that it's using the same calling registers for invoke_copy_to_stack in xptcinvoke_asm_ppc_rhapsody.s, and you'd need to account for the changes in the stack frame (OS X and AIX use the arguments block provided for that purpose which is mandatory for PowerOpen, but optional on SysV). One conceivable way is to pull down a big frame in NS_InvokeByIndex and pass the extra space to invoke_copy_to_stack.
(In reply to Cameron Kaiser [:spectre] from comment #50)
> The PowerPC ABIs are similar *enough* between PowerOpen (AIX/OS X) and SysV
> (everything else) that you could maybe adopt with some changes what we're
> doing in OS X. In particular, if you look at xptcinvoke_ppc_rhapsody.cpp
> (and for that matter xptcinvoke_ppc_aix.cpp), we just dump everything on the
> stack; we don't have this "gprs" array business keeping track. It's a lot
> simpler.
> 
> You'd also need to change xptcinvoke_asm_ppc_openbsd.S so that it's using
> the same calling registers for invoke_copy_to_stack in
> xptcinvoke_asm_ppc_rhapsody.s, and you'd need to account for the changes in
> the stack frame (OS X and AIX use the arguments block provided for that
> purpose which is mandatory for PowerOpen, but optional on SysV). One
> conceivable way is to pull down a big frame in NS_InvokeByIndex and pass the
> extra space to invoke_copy_to_stack.

That would require understanding what all this means, which is unfortunately not the case :)
I can try to look at this myself, but I don't have *BSD or Linux running on my Power systems (they're either OS X or AIX which are not affected). Right now I'm trying to get TenFourFox 31 off the ground in my copious spare time between a day job and a Master's degree. If no one else is able to take this over, I'll try to get it building on one of the spare machines, but it's going to be a while before I can get to it (i.e., at least a couple months).

Gustavo, I don't know if anyone in your group has any spare cycles?

The good news is that because Linux and *BSD use the same PowerPC 32-bit ABI, once one is fixed, the fix can be adapted to the other, assuming that the problem is correctly understood.
(In reply to Cameron Kaiser [:spectre] from comment #52)
> Gustavo, I don't know if anyone in your group has any spare cycles?

I'm working on this. So far I got a smaller reproducer and some feeling of what is wrong. Not sure if I will have much time to work on this today though.

Here is the reproducer I'm working on:

echo -e 'Cu = Components.utils;\nCu.import("m");' > test.js
xpcshell -f test.js

Comparing ppc32 and ppc64, looks like Cu gets assigned JSVAL_TAG_NULL on ppc64, but JSVAL_TAG_OBJECT with a bogus pointer on ppc32. The interpreter segfaults later trying to dereference the bogus pointer.
I'm still trying to understand exactly where that happens and why.
Tried aurora branch, and it also segfaults on the same minimal testcase.
It took me a while to understand all the xpcom bits, but I think I got to the bottom of it.

Boris is right on comment 8 regarding Value& and HandleValue binary compatibility, however, there is a special C++ ABI rule that says that objects must be passed by reference if they have nontrivial constructors. See http://mentorembedded.github.io/cxx-abi/abi.html#value-parameter

HandleValue does have non-trivial constructors, so nsXPCComponents_Utils::Import() expects a reference to HandleValue, not a reference to Value, which is actually being passed by CallMethodHelper::ConvertIndependentParam().
If so, then I wonder how this worked at all on OS X/ppc. We should fall into the same trap, right? But our build of 29 works fine (on both G5 32-bit and 7450 G4).
(In reply to Cameron Kaiser [:spectre] from comment #57)
> If so, then I wonder how this worked at all on OS X/ppc. We should fall into
> the same trap, right? But our build of 29 works fine (on both G5 32-bit and
> 7450 G4).

I do not know why/if it works fine on OS X/ppc.
In the case of Linux/ppc64, it has the same issue. However, there seems to be a different issue that fails earlier and less catastrophically. The end result being it never gets to nsXPCComponents_Utils::Import().
> See http://mentorembedded.github.io/cxx-abi/abi.html#value-parameter

Isn't that specific to Itanium?  I guess it's possible Linux ppc32 has a similar ABI rule...
(In reply to Boris Zbarsky [:bz] from comment #59)
> > See http://mentorembedded.github.io/cxx-abi/abi.html#value-parameter
> 
> Isn't that specific to Itanium?  I guess it's possible Linux ppc32 has a
> similar ABI rule...

AFAIK, that rule applies to all arches (what makes me wonder how even x86 works).
(In reply to Boris Zbarsky [:bz] from comment #59)
> > See http://mentorembedded.github.io/cxx-abi/abi.html#value-parameter
> 
> Isn't that specific to Itanium?  I guess it's possible Linux ppc32 has a
> similar ABI rule...

Even though it's called the "Itanium" ABI, it applies to all instances of Linux/*BSD/etc.  OS X uses it too.
Hmm.  So looking at the rule some more, it talks about a nontrivial _copy_ constructor, not just any constructor.  Does Handle actually have that?  The closest thing I see to it is:

407     template <typename S>
408     Handle(Handle<S> handle,
409            typename mozilla::EnableIf<mozilla::IsConvertible<S, T>::value, int>::Type dummy = 0)
410     {
411         static_assert(sizeof(Handle<T>) == sizeof(T *),
412                       "Handle must be binary compatible with T*.");
413         ptr = reinterpret_cast<const T *>(handle.address());
414     }

Is that the one you're looking at?
(In reply to Gustavo Luiz Duarte from comment #60)
> (In reply to Boris Zbarsky [:bz] from comment #59)
> > > See http://mentorembedded.github.io/cxx-abi/abi.html#value-parameter
> > 
> > Isn't that specific to Itanium?  I guess it's possible Linux ppc32 has a
> > similar ABI rule...
> 
> AFAIK, that rule applies to all arches (what makes me wonder how even x86
> works).

Well, that makes sense in one respect; OS X/ppc's xptcall throws everything on the stack and there's always a parameter area. However, the parameter area is not obligatory in SysV ABI except when necessary to accommodate arguments. What does the stack look like during the Linux ppc32 call?

If something is spilling over, or invoke_copy_to_stack has a stroke, xptcinvoke_asm_ppc_*.S could be setting its argregs from the wrong location. It might be worth throwing a trap instruction in there and seeing what the stack looks like just before the call (and what address r30 has).
For the crash of Free(), the cause is:

We did
+              // On SPARC, we need to pass a pointer to HandleValue
+              *((void**)l_d) = &l_s->ptr;
to fix the Import() call.

But the stub call gets into
nsXPCWrappedJSClass::CallMethod()

At https://hg.mozilla.org/releases/mozilla-release/file/b598d062a9ed/js/xpconnect/src/XPCWrappedJSClass.cpp#l1381

We do
  1381         if (!XPCConvert::JSData2Native(&pv->val, val, type,
  1382                                        !param.IsDipper(), &param_iid, nullptr))
  1383             break;

The &pv->val is "&l_s->ptr" not "l_s->ptr".
So it overwrites the pointer.

However, I don't know if it is the only place that has the problem.
(In reply to Boris Zbarsky [:bz] from comment #62)
> Hmm.  So looking at the rule some more, it talks about a nontrivial _copy_
> constructor, not just any constructor.  Does Handle actually have that?

Yeah, this is tricky. So I looked again at the ppc64 asm and looks like it is not passing targetObj by reference as I suggested, though I can't be sure as that code is never executed on ppc64.

Today I poked a colleague and we wrote a reproducer and I compiled it (gcc-c++-4.8.2-7.fc20, optimizations disabled) on a few platforms for comparison:


template <typename T>
class HandleBase
{
};

template <typename T>
class Handle : public HandleBase<T>
{
public:
  const T *ptr;
};

class Value
{
public:
  int i;
};

typedef Handle<Value> HandleValue;

class Test
{
public:
  Test ();

  int method (HandleValue targetObj);
};

int Test::method (HandleValue targetObj)
{
  if (targetObj.ptr == 0)
    return 1;
  return 0;
}


Linux/PPC32 OUTPUT:
00000000 <_ZN4Test6methodE6HandleI5ValueE>:
   0:   94 21 ff e0     stwu    r1,-32(r1)
   4:   93 c1 00 18     stw     r30,24(r1)
   8:   93 e1 00 1c     stw     r31,28(r1)
   c:   7c 3f 0b 78     mr      r31,r1
  10:   90 7f 00 08     stw     r3,8(r31)
  14:   7c 9e 23 78     mr      r30,r4
  18:   81 3e 00 00     lwz     r9,0(r30)
  1c:   2f 89 00 00     cmpwi   cr7,r9,0
  20:   40 9e 00 0c     bne     cr7,2c <_ZN4Test6methodE6HandleI5ValueE+0x2c>
  24:   39 20 00 01     li      r9,1
  28:   48 00 00 08     b       30 <_ZN4Test6methodE6HandleI5ValueE+0x30>
  2c:   39 20 00 00     li      r9,0
  30:   7d 23 4b 78     mr      r3,r9
  34:   39 7f 00 20     addi    r11,r31,32
  38:   83 cb ff f8     lwz     r30,-8(r11)
  3c:   83 eb ff fc     lwz     r31,-4(r11)
  40:   7d 61 5b 78     mr      r1,r11
  44:   4e 80 00 20     blr


Linux/PPC64 OUTPUT:
0000000000000000 <._ZN4Test6methodE6HandleI5ValueE>:
   0:   fb e1 ff f8     std     r31,-8(r1)
   4:   f8 21 ff c1     stdu    r1,-64(r1)
   8:   7c 3f 0b 78     mr      r31,r1
   c:   f8 7f 00 70     std     r3,112(r31)
  10:   f8 9f 00 78     std     r4,120(r31)
  14:   e9 3f 00 78     ld      r9,120(r31)
  18:   2f a9 00 00     cmpdi   cr7,r9,0
  1c:   40 9e 00 0c     bne     cr7,28 <._ZN4Test6methodE6HandleI5ValueE+0x28>
  20:   39 20 00 01     li      r9,1
  24:   48 00 00 08     b       2c <._ZN4Test6methodE6HandleI5ValueE+0x2c>
  28:   39 20 00 00     li      r9,0
  2c:   7d 23 4b 78     mr      r3,r9
  30:   38 3f 00 40     addi    r1,r31,64
  34:   eb e1 ff f8     ld      r31,-8(r1)
  38:   4e 80 00 20     blr
  3c:   00 00 00 00     .long 0x0
  40:   00 09 00 00     .long 0x90000
  44:   80 01 00 01     lwz     r0,1(r1)


Linux/X86_64 OUTPUT:
0000000000000000 <_ZN4Test6methodE6HandleI5ValueE>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   8:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
   c:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  10:   48 85 c0                test   %rax,%rax
  13:   75 07                   jne    1c <_ZN4Test6methodE6HandleI5ValueE+0x1c>
  15:   b8 01 00 00 00          mov    $0x1,%eax
  1a:   eb 05                   jmp    21 <_ZN4Test6methodE6HandleI5ValueE+0x21>
  1c:   b8 00 00 00 00          mov    $0x0,%eax
  21:   5d                      pop    %rbp
  22:   c3                      retq


Linux/I386 OUTPUT:
00000000 <_ZN4Test6methodE6HandleI5ValueE>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 45 0c                mov    0xc(%ebp),%eax
   6:   85 c0                   test   %eax,%eax
   8:   75 07                   jne    11 <_ZN4Test6methodE6HandleI5ValueE+0x11>
   a:   b8 01 00 00 00          mov    $0x1,%eax
   f:   eb 05                   jmp    16 <_ZN4Test6methodE6HandleI5ValueE+0x16>
  11:   b8 00 00 00 00          mov    $0x0,%eax
  16:   5d                      pop    %ebp
  17:   c3                      ret
Clearly, we can see that ppc32 has a different behavior than ppc64 and x86_64. The parameter is being passed by reference on ppc32 while it is being passed by value on ppc64 and x86_64. I'm not used to the i386 ABI, but to me that looks like a parameter passing by value though the stack, is that right?

I couldn't find any reference explaining why ppc32 works differently. Tomorrow I will poke the compiler folks, they may have some answers.
(In reply to Gustavo Luiz Duarte from comment #66)
> I couldn't find any reference explaining why ppc32 works differently.
> Tomorrow I will poke the compiler folks, they may have some answers.

The calling conventions is just defined like that for ppc32 and SPARC 32.
"structures are passed by making a copy of the structure and passing a pointer to the copy."
> The parameter is being passed by reference on ppc32

Interesting. On OS X/ppc, the code is closer to ppc64:

% gcc-mp-4.8 -c -o test test.cpp
% otool -tv test
test:
(__TEXT,__text) section
__ZN4Test6methodE6HandleI5ValueE:
00000000        stmw    r30,0xfff8(r1)
00000004        stwu    r1,0xffd0(r1)
00000008        or      r30,r1,r1
0000000c        stw     r3,0x48(r30)
00000010        stw     r4,0x4c(r30)
00000014        lwz     r2,0x4c(r30)
00000018        cmpwi   cr7,r2,0x0
0000001c        bne     cr7,0x28
00000020        li      r2,0x1
00000024        b       0x2c
00000028        li      r2,0x0
0000002c        or      r3,r2,r2
00000030        addi    r1,r30,0x30
00000034        lmw     r30,0xfff8(r1)
00000038        blr

Just to see if this was a compiler change, I checked with 4.6:

test:
(__TEXT,__text) section
__ZN4Test6methodE6HandleI5ValueE:
00000000        stw     r30,0xfff8(r1)
00000004        stw     r31,0xfffc(r1)
00000008        stwu    r1,0xffd0(r1)
0000000c        or      r30,r1,r1
00000010        stw     r3,0x48(r30)
00000014        stw     r4,0x4c(r30)
00000018        lwz     r0,0x4c(r30)
0000001c        cmpwi   cr7,r0,0x0
00000020        bne     cr7,0x2c
00000024        li      r0,0x1
00000028        b       0x30
0000002c        li      r0,0x0
00000030        or      r3,r0,r0
00000034        addi    r1,r30,0x30
00000038        lwz     r30,0xfff8(r1)
0000003c        lwz     r31,0xfffc(r1)
00000040        blr
It turns out the Linux ppc32 and ppc64 ABIs are much more different than I expected.

Indeed the ppc32 ABI states that aggregate types shall be treated as a pointer to the object. You can get a copy of the spec here: https://www.power.org/documentation/power-architecture-32-bit-abi-supplement-1-0-embeddedlinuxunified/

So, what would be the proper fix here? Something along the lines of Ginn's patch (special cased for Linux/ppc32)? Or go back to JS::Value* for jsobj?
(In reply to Cameron Kaiser [:spectre] from comment #68)
> Interesting. On OS X/ppc, the code is closer to ppc64:

Yeah, the OS X/ppc ABI does state that. Item 3 on "parameter passing":
"The caller places structures (struct elements) with only one noncomposite member in general-purpose or floating-point registers, depending on whether the member is an integer or a floating-point value. For example, the caller places a structure comprised of a float member in a floating-point register, not a general-purpose register. When registers of the required type are exhausted, the caller places structures in the parameter area."
See: https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/LowLevelABI/100-32-bit_PowerPC_Function_Calling_Conventions/32bitPowerPC.html


This is relevant in case we decide to go along the lines of Ginn's patch, we must be sure to special case only Linux/pp32 (apparently SPARC/ppc32 too), not all ppc32 platforms.
(In reply to Gustavo Luiz Duarte from comment #70)
> must be sure to special case only Linux/pp32 (apparently SPARC/ppc32 too),

Geez! I mean SPARC 32.
Did anyone actually tried Ginn's patch on their platform ? I did in att #8408418 for OpenBSD/ppc, but it failed when starting firefox... (see comment 41) - afaict, we're using the same ABI as linux there.

(In reply to Ginn Chen from comment #64)
> For the crash of Free(), the cause is:
> 
> We did
> +              // On SPARC, we need to pass a pointer to HandleValue
> +              *((void**)l_d) = &l_s->ptr;
> to fix the Import() call.
> 
> But the stub call gets into
> nsXPCWrappedJSClass::CallMethod()
> 
> At
> https://hg.mozilla.org/releases/mozilla-release/file/b598d062a9ed/js/
> xpconnect/src/XPCWrappedJSClass.cpp#l1381
> 
> We do
>   1381         if (!XPCConvert::JSData2Native(&pv->val, val, type,
>   1382                                        !param.IsDipper(), &param_iid,
> nullptr))
>   1383             break;
> 
> The &pv->val is "&l_s->ptr" not "l_s->ptr".
> So it overwrites the pointer.

Was it referring to the double-free crash i was seeing ?
If so, you did modify the code there too, or not ? One should use pv->val, not &pv->val ?
While i have no idea what i'm doing and just throwing ideas in the air, could we make &pv->val | pv->val conditional to type_tag == nsXPTType::T_JSVAL like its done with T_INTERFACE in https://hg.mozilla.org/releases/mozilla-release/file/b598d062a9ed/js/xpconnect/src/XPCWrappedJSClass.cpp#l1374 ?
Also, that chunk a bit upper:

  1355         if (param.IsDipper())
  1356             pv = (nsXPTCMiniVariant*) &nativeParams[i].val.p;
  1357         else
  1358             pv = (nsXPTCMiniVariant*) nativeParams[i].val.p;

makes me wonder here what IsDipper does, and if it could be related to (or used the same way as) T_JSVAL ?
http://mxr.mozilla.org/mozilla-central/ident?i=JSData2Native JSData2Native is not called in lots of places, so maybe it's worth trying to adapt all those calls to pass the unreferenced version of the first param in the T_JSVAL case?
(In reply to Landry Breuil (:gaston) from comment #72) 
> > The &pv->val is "&l_s->ptr" not "l_s->ptr".
> > So it overwrites the pointer.
> 
> Was it referring to the double-free crash i was seeing ?
> If so, you did modify the code there too, or not ? One should use pv->val,
> not &pv->val ?

It is not double-free.

XPCConvert::JSData2Native should write to "l_s->ptr" (JSVal).
But because we pass "&l_s->ptr" for SPARC 32/ppc32.
It overwrites "&l_s->ptr" with a JSVal. (0xffffff82 something)
When we want to clean up and free &l_s->ptr, it crashes since 0xffffff82 is not a valid pointer.

On SPARC 32, if I do,

if (type_tag == nsXPTType::T_JSVAL) {
    if (!XPCConvert::JSData2Native(*(void**)(&pv->val), val, type,
        !param.IsDipper(), &param_iid, nullptr))
        break;
} else {
    if (!XPCConvert::JSData2Native(&pv->val, val, type,
        !param.IsDipper(), &param_iid, nullptr))
        break;
}

Firefox starts fine.

But I don't think it's a proper fix.


Ideally, if we know whether the stub we're invoking is native code or wrappered js code, so we can pass &l_s->ptr or l_s->ptr conditionally. I think the problem is gone.
But I don't know how can we know if the stub is wrapped js code.


Go back to JS::Value* is safer, but the code has changed a lot since Bug 939294.
Just to complete everything, I ran the test on AIX. This has the same ABI as OS X/ppc, so it should compile the same way, and it does:

% gcc -c -o test test.cpp
% objdump -d test

test:     file format aixcoff-rs6000

Disassembly of section .text:

0000000000000000 <._ZN4Test6methodE6HandleI5ValueE>:
   0:   93 e1 ff fc     stw     r31,-4(r1)
   4:   94 21 ff d0     stwu    r1,-48(r1)
   8:   7c 3f 0b 78     mr      r31,r1
   c:   90 7f 00 48     stw     r3,72(r31)
  10:   90 9f 00 4c     stw     r4,76(r31)
  14:   80 1f 00 4c     lwz     r0,76(r31)
  18:   2f 80 00 00     cmpwi   cr7,r0,0
  1c:   40 9e 00 10     bne-    cr7,2c <._ZN4Test6methodE6HandleI5ValueE+0x2c>
  20:   38 00 00 01     li      r0,1
  24:   90 1f 00 18     stw     r0,24(r31)
  28:   48 00 00 0c     b       34 <._ZN4Test6methodE6HandleI5ValueE+0x34>
  2c:   38 00 00 00     li      r0,0
  30:   90 1f 00 18     stw     r0,24(r31)
  34:   80 1f 00 18     lwz     r0,24(r31)
  38:   7c 03 03 78     mr      r3,r0
  3c:   80 21 00 00     lwz     r1,0(r1)
  40:   83 e1 ff fc     lwz     r31,-4(r1)
  44:   4e 80 00 20     blr

So the problem, at least on PowerPC, is limited to SysV ABI (which is Linux/ and *BSD/ppc).

I don't think you can go back to JS::Value* just for one set of arches (easily), and while I'm not a JS peer I think I can say with high confidence such a patch would not be accepted.

However, Ginn, that approach seems like it would work in the general case. Why do you not consider it a proper fix?
Like I said, if I change code in xpcom/reflect/xptcall/src/md/unix/, pass &l_s->ptr or l_s->ptr conditionally on whether the stub we're invoking is native code or wrappered js code, I think it is a proper fix.

If I change code of js/xpconnect/src/XPCWrappedJSClass.cpp, around line 1381, I'm not quite comfortable, because
1) It's not the only place use pv->val.
2) It's not the only place that call XPCConvert::JSData2Native(&pv->val, ...)

So how can I know it's the only place that need to be handled specially for SPARC32 and ppc32?

At least I need someone who know the code to have a look.
Rejoice! With this patch, i was able to build and run m-i tip, i'm even posting this patch from nightly on ppc.

Mozilla/5.0 (X11; OpenBSD macppc; rv:32.0) Gecko/20100101 Firefox/32.0

Gustavo, Steve, if you could try it on linux .. (adapting the xpctinvoke bits to linux of course)

Now, remains to check whether both jsdata2native changes are needed or only one, and if it should be within #ifdef (ppc|sparc)&sysv_abi or if it doesnt break other archs/abi (i doubt). I'll also test it more, and will backport it to 29/30beta. But this time i hope we're on the right track.
Attachment #8423318 - Flags: feedback?(steve)
Attachment #8423318 - Flags: feedback?(gustavold)
The same patch allows 30.0b2 to build and run - after adding some debug printfs (the best debugging method ever...) i noticed that only the first JSData2Native codepath (ie XPCWrappedJSClass.cpp#1360) was called with T_JSVAL objects, and not the one on line 1432. So i suppose the second chunk isnt needed, or something special needs to be done to exercise this codepath.
Comment on attachment 8423318 [details] [diff] [review]
OpenBSD/ppc wip patch

attachment 8423318 [details] [diff] [review]  + attachment 8408720 [details] [diff] [review]
give me a working firefox (m-c) on my ppc32/linux machine.
Attachment #8423318 - Flags: feedback?(steve) → feedback+
At first glance, that second codepath is only reached for the T_INTERFACE_IS type.  In particular, that whole code chunk is only reached when IsDependent(), which is never true for T_JSVAL.
(In reply to Steve Singer (:stevensn) from comment #81)
> Comment on attachment 8423318 [details] [diff] [review]
> OpenBSD/ppc wip patch
> 
> attachment 8423318 [details] [diff] [review]  + attachment 8408720 [details] [diff] [review]
> [diff] [review]
> give me a working firefox (m-c) on my ppc32/linux machine.

However this combination if patches gives me a segfault on ppc64 linux:

#0  0x00003fffb4570c34 in XPCConvert::JSData2Native (d=0xfff9000000000000, s=..., type=..., useAllocator=<optimized out>, iid=
    0x3fffffff89f8, pErr=0x0) at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCConvert.cpp:588
#1  0x00003fffb459883c in nsXPCWrappedJSClass::CallMethod (this=0x3fff95e3d800, wrapper=<optimized out>, methodIndex=<optimized out>, info_=
    0x3fffa75fa980, nativeParams=0x3fffffff8bf0)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedJSClass.cpp:1363
#2  0x00003fffb4583fa8 in CallMethod (params=0x3fffffff8bf0, info=0x3fffa75fa980, methodIndex=3, this=0x3fff95a93700)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedJS.cpp:517
#3  nsXPCWrappedJS::CallMethod (this=0x3fff95a93700, methodIndex=<optimized out>, info=0x3fffa75fa980, params=0x3fffffff8bf0)
    at /media/nfs/usb_drive_src/firefox/mozilla-central-hg/src/js/xpconnect/src/XPCWrappedJS.cpp:507
I'm not really surprised... so it means that calls for a big horrible #ifdef archs_and_platforms_affected_by_this_because_of_special_passing_conventions ?
I don't think there's any way around that. I certainly wouldn't tamper with the existing code.
so something like:

#if (defined(XP_UNIX) && !defined(XP_MACOSX)) && ((defined(__sparc__) && !defined(__sparc64__)) || (defined(__powerpc__) && !defined (__powerpc64__)))

around the if (JS_VAL) would work for everyone ?

Afaic, OpenBSD/ppc has:
#define __powerpc__ 1
#define __powerpc 1

and OpenBSD/sparc64 has (it doesnt need the fix):
#define sparc 1
#define __sparc__ 1
#define __sparc 1
#define __sparc64__ 1
#define __sparcv9__ 1
#define __sparc_v9__ 1
martin, any luck you could try to backport/forwardport/test the xptcinvoke_ppc_openbsd.cpp chunk to xptcinvoke_ppc_netbsd.cpp or should i just assume netbsd needs the same treatment ?
If we want to fix this in this cycle and hopefully get it backported to beta before 30 is released, we'd better get this rollin'... so lets try this first. steve,gustavo,martin... i'd like your feedback on this too :)
ginn, if you want a chunk added for sparc/solaris, let me know which one.. and cameron, if you could check this doesnt break osx/ppc too, would be great!
Assignee: nobody → landry
Attachment #8426537 - Flags: review?(bzbarsky)
Comment on attachment 8426537 [details] [diff] [review]
put the XPCWrappedJSClass.cpp chunk within a huge #ifdef

r=me on the bits here, though don't sparc xptcall bits need changing too?

Also, is this not a problem in the opposite direction, calling from JS to C++?  If so, why not?
Attachment #8426537 - Flags: review?(bzbarsky) → review+
yeah sparc/solaris and netbsd/(ppc|sparc?) probably needs changing too at the same time. As for the opposite direction.. i have no idea.
Oh, I see.  You're handling those via the xptcinvoke changes for xpconnect.

What about for the JITs?  In particular, for ion, which will directly invoke methods that have a HandleValue in the signature.  Or do the relevant platforms not have ion?
Comment on attachment 8426537 [details] [diff] [review]
put the XPCWrappedJSClass.cpp chunk within a huge #ifdef

Review of attachment 8426537 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/reflect/xptcall/src/md/unix/xptcinvoke_ppc_linux.cpp
@@ +46,5 @@
>      
>      for(uint32_t i = 0; i < paramCount; i++, s++) {
> +        if(s->IsPtrData()) {
> +            if(s->type == nsXPTType::T_JSVAL)
> +                tempu32 = &s->ptr;

You need a cast, here, otherwise, it fails to build.

And you need the same thing in sparc files.
Attachment #8426537 - Flags: feedback-
Sorry, no chance for me to test NetBSD/ppc any time soon. Should I go find some other NetBSD person to test? I assume the same change is needed.
I am traveling this week(pgcon) and won't be able to test this on my ppc machines until Sunday but the patch looks fine.

None of the effected platforms currently work with ion.
(In reply to Boris Zbarsky [:bz] from comment #91)
> Oh, I see.  You're handling those via the xptcinvoke changes for xpconnect.
> 
> What about for the JITs?  In particular, for ion, which will directly invoke
> methods that have a HandleValue in the signature.  Or do the relevant
> platforms not have ion?

Only OS X/ppc has a JIT, and we're not affected (in fact, I just finished fixing my code generator to work with generational GC).
Comment on attachment 8426537 [details] [diff] [review]
put the XPCWrappedJSClass.cpp chunk within a huge #ifdef

Review of attachment 8426537 [details] [diff] [review]:
-----------------------------------------------------------------

Compiles fine on OS X with the expected code path being built, but I'll give it a more thorough test when my 31 build is complete.
Attachment #8426537 - Flags: feedback+
(In reply to Landry Breuil (:gaston) from comment #39)
> (In reply to Ginn Chen from comment #37)
> > Created attachment 8407973 [details] [diff] [review]
> > the patch that works for me on SPARC
> > 
> > This patch solved the "make package" issue on SPARC for me.
> > 
> > I've only tested with SPARC 32bit on Solaris.
> > 
> > I'm not sure
> > 1) Is there any other platform needs this fix?
> 
> *sparc_openbsd* is used on sparc64/v9 (we dont have firefox on sparc32/v8)
> which afaict is unaffected.

I was wrong here, xpctinvoke_sparc_openbsd.cpp would be used if we had ffx on sparc32/v8, on sparc64/v9 where it works we use xpctinvoke_sparc64_openbsd.cpp. Phew.
And here's a (hopefully) final patch adding the missing cast from comment #92, applying the same fix to xptcinvoke_ppc_netbsd.cpp, and merging the sparc32 bits from ginn in attachment #8408619 [details] [diff] [review].
Attachment #8427950 - Flags: review?(bzbarsky)
Probably should throw an && !defined(_AIX) in there too, since AIX is not susceptible to this even though I have no idea if the current code even builds on AIX anymore.
Comment on attachment 8427950 [details] [diff] [review]
Final patch, taking care of ppc/sparc

Review of attachment 8427950 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/reflect/xptcall/src/md/unix/xptcinvoke_sparcv9_solaris.cpp
@@ +42,5 @@
> +        *l_d = (uint64_t)&l_s->ptr;
> +      } else
> +      {
> +        *l_d = (uint64_t)l_s->ptr;
> +      }

*sparcv9_solaris.cpp are used for sparc64. Which is left out of the XPCWrappedJSClass.cpp change.
Comment on attachment 8427950 [details] [diff] [review]
Final patch, taking care of ppc/sparc

Review of attachment 8427950 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/XPCWrappedJSClass.cpp
@@ +1361,4 @@
>          if (!XPCConvert::JSData2Native(&pv->val, val, type,
>                                         !param.IsDipper(), &param_iid, nullptr))
>              break;
> +#endif

Why not a
#if
   if () {
     if ()
   } else
#endif
   {
     if ()
   }

form?

How about the other JSData2Native below? You used to patch it.
Comment on attachment 8427950 [details] [diff] [review]
Final patch, taking care of ppc/sparc

Going to defer to Mike on the xptcall bits....
Attachment #8427950 - Flags: review?(bzbarsky) → review?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #100)

> *sparcv9_solaris.cpp are used for sparc64. Which is left out of the
> XPCWrappedJSClass.cpp change.

Well, i think they're untested on solaris/sparc64 (Ginn ?) so i wonder if i should let it out of the patch. OpenBSD/sparc64 dont need it.

(In reply to Mike Hommey [:glandium] from comment #101)

> Why not a
> #if
>    if () {
>      if ()
>    } else
> #endif
>    {
>      if ()
>    }
> 
> form?

Yeah i considered it but was wondering if it'd be pleasant to the eye.. can do.

> How about the other JSData2Native below? You used to patch it.

As bz said in an earlier comment, the other JSData2Native call is not reached in the T_JSVAL case.
Add !defined(_AIX), rework the #ifdef layout, and take out the sparcv9_solaris.cpp bits until someone tests them for real.
Attachment #8427950 - Attachment is obsolete: true
Attachment #8427950 - Flags: review?(mh+mozilla)
Attachment #8428213 - Flags: review?(mh+mozilla)
Attachment #8428213 - Flags: review?(bzbarsky)
Just for info: Iceweasel 30 is available for Debian Sid. It works without any problems on my AmigaOne X1000 PowerPC system (Nemo board with PA6T cpu).

Screenshot: http://forum.hyperion-entertainment.biz/download/file.php?id=1178&mode=view

Thank you very much indeed for your hard work!
attachment 8428213 [details] [diff] [review] works for me on Linux both ppc32 and ppc64.
review ping, to get this tested soon and backported to beta before 30 is released ?
Attachment #8428213 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8428213 [details] [diff] [review]
Final patch, taking care of ppc/sparc

r=me
Attachment #8428213 - Flags: review?(bzbarsky) → review+
Sorry for the late response.

It will be great if you can use
+    ((defined(__sparc) && !defined(__sparcv9) && !defined(__sparcv9__)) || \
instead of
+    ((defined(__sparc__) && !defined(__sparc64__)) || \

Solaris Studio defines
__sparc
__sparcv8 (or __sparcv9 for 64bit)

gcc/Solaris defines
sparc
__sparc__
__sparc
and
__sparcv9 for 64bit
After countless push races lost, managed to land it, including ginn's last minute changes:

https://hg.mozilla.org/integration/mozilla-inbound/rev/d16fcad6aa60
Comment on attachment 8428213 [details] [diff] [review]
Final patch, taking care of ppc/sparc

technically, this should be a=NPOTB but i never know.. if uplifiting, take what was pushed in the end to inbound (see previous comment)
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 939294
User impact if declined: crash at runtime on linux/bsd/solaris/ppc32/sparc32, and this since ffx 29
Testing completed (on m-c, etc.): tested by numerous ppc/sparc users
Risk to taking this patch (and alternatives if risky): none for tier1 platforms
Attachment #8428213 - Flags: approval-mozilla-beta?
Attachment #8428213 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/d16fcad6aa60
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Attachment #8428213 - Flags: approval-mozilla-beta?
Attachment #8428213 - Flags: approval-mozilla-beta+
Attachment #8428213 - Flags: approval-mozilla-aurora?
Attachment #8428213 - Flags: approval-mozilla-aurora+
See Also: → 1015027
Whiteboard: [qa-]
Depends on: 1153707
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: