Closed Bug 608696 Opened 9 years ago Closed 9 years ago

Cross compiling libffi is completely broken on TM

Categories

(Firefox Build System :: General, defect)

x86
All
defect
Not set

Tracking

(blocking2.0 betaN+)

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: paul.biggar, Assigned: paul.biggar)

References

Details

(Keywords: regression, Whiteboard: [fixed-in-tracemonkey])

Attachments

(4 files)

The patch for bug 605133 had passed tryserver (as commit X) when I pushed it to the tracemonkey tree (as commit X), which it promptly broke. If it was going to break the tree, then it seems like a bug that it passed tryesrver.

From looking at the logs, tracemonkey failed when it couldn't build js-config. Tryserver didn't seem to have a problem with that, perhaps due to not-clobbering the previous build (I'm not sure).


Tryserver:
commit ID: 3cb9e5913d42
build log: http://ftp.mozilla.org/pub/mozilla.org/firefox/tryserver-builds/pbiggar@mozilla.com-3cb9e5913d42/tryserver-macosx64/tryserver-macosx64-build4096.txt.gz

Tracemonkey:
commit ID: 66f4a212edeb
build log: http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1288347743.1288347823.26124.gz&fulltext=1
tryserver does full clobbers for each build, whereas tracemonkey doesn't.  You may have run into a dependency issue.
Is there a way to prevent this in future? Perhaps doing a full clobber when configure.in changes, or similar?
(The other possible cause is if your TryServer build was based on a different older? changeset then what was on tip of Tracemonkey branch when you landed on Tracemonkey. If that was the case, you were not comparing like-for-like builds.)

(In reply to comment #1)
> tryserver does full clobbers for each build, whereas tracemonkey doesn't.  You
> may have run into a dependency issue.

(In reply to comment #2)
> Is there a way to prevent this in future? Perhaps doing a full clobber when
> configure.in changes, or similar?

As this looks like a bug in Makefile dependencies, pushing this over to BuildConfig.
Component: Release Engineering → Build Config
OS: Mac OS X → All
Product: mozilla.org → Core
QA Contact: release → build-config
Version: other → unspecified
We do generally attempt to do a full rebuild when configure.in changes. It's possible that something doesn't work right with js/src/configure.
(In reply to comment #4)
> We do generally attempt to do a full rebuild when configure.in changes. It's
> possible that something doesn't work right with js/src/configure.

Yes, the bug was in js/src/configure.in. So I guess we should do a full rebuild when js/src/configure.in changes too?
We have logic in client.mk that purports to handle this ...
(In reply to comment #6)
> We have logic in client.mk that purports to handle this ...

Which is this logic? I looked through and can't figure it out. Do you mean that js/src/configure is included in the CONFIGURES variable? Where does it go from there?
(In reply to comment #8)
> build (which is what tinderbox runs) depends on objdir/Makefile.in and

OK, it certainly looks like it should work, but I can't really see why it didn't.
I'm seeing this again after tracemonkey commit b1094f628602. OSX failed and needed a clobber, even though the changes were to js/src/configure.in.

Log is here: http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1288870394.1288870464.29482.gz&fulltext=1
(In reply to comment #8)
> build (which is what tinderbox runs) depends on objdir/Makefile.in and
> objdir/config.status
> 
> http://mxr.mozilla.org/mozilla-central/source/client.mk#344
> 
> which depend on CONFIG_STATUS_DEPS
> 
> http://mxr.mozilla.org/mozilla-central/source/client.mk#314
> 
> which include CONFIGURES
> 
> http://mxr.mozilla.org/mozilla-central/source/client.mk#282
> 
> which depends on the .in's
> 
> http://mxr.mozilla.org/mozilla-central/source/client.mk#278

ok, configure::configure.in succeeds

Generating /builds/slave/tracemonkey-macosx64/build/js/src/configure using autoconf
cd /builds/slave/tracemonkey-macosx64/build/js/src; /opt/local/bin/autoconf-2.13
cd obj-firefox/i386

then we run [root] configure:

/builds/slave/tracemonkey-macosx64/build/configure

Then we dump into js/src/configure (and apparantly load the root config.cache):

configuring in js/src
running /bin/sh /builds/slave/tracemonkey-macosx64/build/js/src/configure  --target=i386-apple-darwin10.2.0 --with-macos-sdk=/Developer/SDKs/MacOSX10.5.sdk --enable-application=browser --enable-update-channel=nightly-tracemonkey --enable-update-packaging --enable-tests --enable-codesighs --disable-install-strip --enable-debug-symbols=-gdwarf-2 --enable-threadsafe --enable-ctypes --disable-shared-js --with-nspr-cflags='-I/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/include/nspr' --with-nspr-libs='-L/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/lib -lplds4 -lplc4 -lnspr4' --with-dist-dir=../../dist --prefix=/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist --with-sync-build-files=/builds/slave/tracemonkey-macosx64/build --cache-file=../.././config.cache --srcdir=/builds/slave/tracemonkey-macosx64/build/js/src
loading cache ../.././config.cache

and then fail when we delve into the libffi configure....

creating ./config.status
creating config/autoconf.mk
creating js-config.h
js-config.h is unchanged
invoking make to create js-config script
rm -f js-config.tmp
sed < /builds/slave/tracemonkey-macosx64/build/js/src/js-config.in > js-config.tmp \
	-e 's|@prefix@|/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist|' \
	-e 's|@exec_prefix@|/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist|' \
	-e 's|@includedir@|/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/include|' \
	-e 's|@libdir@|/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/lib|' \
	-e 's|@MOZILLA_VERSION@||' \
	-e 's|@LIBRARY_NAME@|mozjs|' \
	-e 's|@NSPR_CFLAGS@|-I/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/include/nspr|' \
	-e 's|@JS_CONFIG_LIBS@|-L/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/lib -lplds4 -lplc4 -lnspr4  -lm |' \
	-e 's|@MOZ_JS_LIBS@|-L/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/dist/lib -lmozjs|' \
	&& mv js-config.tmp js-config && chmod +x js-config
configuring in ctypes/libffi
running /bin/sh /builds/slave/tracemonkey-macosx64/build/js/src/ctypes/libffi/configure  --disable-shared --enable-static --disable-raw-api --with-pic --cache-file=/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/js/src/ctypes/libffi/config.cache --srcdir=/builds/slave/tracemonkey-macosx64/build/js/src/ctypes/libffi
configure: loading cache /builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/js/src/ctypes/libffi/config.cache
configure: error: `build_alias' was set to `x86_64-apple-darwin10.2.0' in the previous run
configure: error: `host_alias' was set to `i386-apple-darwin10.2.0' in the previous run
configure: error: in `/builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/js/src/ctypes/libffi':
configure: error: changes in the environment can compromise the build
configure: error: run `make distclean' and/or `rm /builds/slave/tracemonkey-macosx64/build/obj-firefox/i386/js/src/ctypes/libffi/config.cache' and start over
configure: error: /builds/slave/tracemonkey-macosx64/build/js/src/ctypes/libffi/configure failed for ctypes/libffi
configure: error: /builds/slave/tracemonkey-macosx64/build/js/src/configure failed for js/src
*** Fix above errors and then restart with               "make -f client.mk build"
make[2]: *** [configure] Error 1
make[1]: *** [obj-firefox/i386/Makefile] Error 2
make: *** [build] Error 2


Hope that helps diagnosis
Nice catch Callek!

We probably want this to block final since it's going to do bizarre and horrible things randomly.  I'll have access to a Mac this weekend (thanks to releng) and will poke around.
blocking2.0: --- → ?
OK, the problem is that when js/src/ reconfigures it doesn't end up passing the cross compile flags to libffi.  This implies that when we get to the end of the configure script either CROSS_COMPILE isn't set or somehow the target has become something else (which seems unlikely.)

The cross compile logic in js/src/configure.in is substantially different from the logic in the root configure.in.  Is there a reason for that?
(In reply to comment #13)
> The cross compile logic in js/src/configure.in is substantially different from
> the logic in the root configure.in.  Is there a reason for that?

Where? I recently synced up the logic (bug 605133 - the prelude to this bug). I looked through anything that said CROSS_COMPILE, and didn't see anything different (on tracemonkey tree, it hasn't made it to m-c).
Oh, bah.  I was reading m-c, assuming that we'd merged recently.
Yeah, so that patch in Bug 605133 totally hosed the way we configure libffi.
Blocks: 605133
Keywords: regression
Summary: Push passed tryserver but broke the build → Cross compiling libffi is completely broken on TM
I'd say this fragment was the cause:

@@ -721,22 +909,16 @@ if test -n "$_WIN32_MSVC"; then
         # VS2008: http://msdn.microsoft.com/en-us/library/323b6b3k.aspx
         AC_DEFINE(JS_STDDEF_H_HAS_INTPTR_T)
         ;;
     esac
 fi
 
 fi # COMPILE_ENVIRONMENT
 
-if test "$cross_compiling"  = "yes"; then
-    CROSS_COMPILE=1
-else
-    CROSS_COMPILE=
-fi
-
 # Check to see if we are running in a broken QEMU scratchbox.
 # We know that anything below 1.0.16 is broken.
 AC_CHECK_PROGS(SBCONF, sb-conf ve, "")
 if test -n "$SBCONF"; then
     _sb_version=`$SBCONF ve`
     _sb_version_major=`echo $_sb_version | cut -f1 -d.`
     _sb_version_minor=`echo $_sb_version | cut -f2 -d.`
     _sb_version_point=`echo $_sb_version | cut -f3 -d.`



Or this one (but probably the last one):

-
-    dnl If we cross compile for ppc on Mac OS X x86, cross_compiling will
-    dnl have erroneously been set to "no", because the x86 build host is
-    dnl able to run ppc code in a translated environment, making a cross
-    dnl compiler appear native.  So we override that here.
-    cross_compiling=yes
You're right, but that hunk was merely wallpaper over the real issue, which is that we don't export CROSS_COMPILE from the root configure.in to js/src.  I'm testing a patch for that now.
Assignee: nobody → khuey
Status: NEW → ASSIGNED
Attachment #490749 - Flags: review?(ted.mielczarek)
(In reply to comment #19)
> Created attachment 490749 [details] [diff] [review]
> export CROSS_COMPILE to the subconfigure

I don't think this is right. js/src/configure.in needs to figure out whether it's cross-compiling for itself (which is why nothing else is passed via environmental variables.
(In reply to comment #20)
>(which is why nothing else is passed via
> environmental variables.

Oh, MOZ_MEMORY is done that way. Hmm, that'll have to be pulled out, that's pretty nasty (and explains some of the nasty code in js/src/configure.in).
Why does it need to determine that on its own?  The root configure.in doesn't do any such determination; it relies on the value passed in by the mozconfig.
(In reply to comment #22)
> Why does it need to determine that on its own?  The root configure.in doesn't
> do any such determination; it relies on the value passed in by the mozconfig.

Good point, but js/src/configure.in can be used without Mozilla, and therefore without mozconfig.
(In reply to comment #22)
> Why does it need to determine that on its own?  The root configure.in doesn't
> do any such determination; it relies on the value passed in by the mozconfig.

In fact, why do we rely on getting this from mozconfig? Are we able to get this on our own?
(In reply to comment #24)
> (In reply to comment #22)
> > Why does it need to determine that on its own?  The root configure.in doesn't
> > do any such determination; it relies on the value passed in by the mozconfig.
> 
> In fact, why do we rely on getting this from mozconfig? Are we able to get this
> on our own?

The way it works is mozconfig is a varient of doing

configure --foo

with the definitions in mozconfig.

And technically you don't need a mozconfig for m-c anything (its just much more convenient that way).

The idea that the js/src/configure will take values already utilized in {srcdir}/configure is fine and normal, and should continue to work when run by itself

See for example, both root and js/src/configure have the following comment at the moment:

|351 dnl Set CROSS_COMPILE in the environment when running configure
352 dnl to use the cross-compile setup for now|

and thats the comment right above where khuey is changing is js/src/...

AND m-c root configure has: |if test -n "$CROSS_COMPILE" -a "$target" != "$host"; then| for the same block of code this is trying to do here...

this is the right change.

(Not the most ideal, maybe. But a correct fix none the less)
Attachment #490749 - Flags: feedback+
(In reply to comment #25)
> > In fact, why do we rely on getting this from mozconfig? Are we able to get this
> > on our own?
> And technically you don't need a mozconfig for m-c anything (its just much more
> convenient that way).

Of course, but there is no need to require a mozconfig flag for something we are able to figure out automatically. Is it not a cross-compile if $host != $target?
 
> The idea that the js/src/configure will take values already utilized in
> {srcdir}/configure is fine and normal, and should continue to work when run by
> itself

Yes, but js/src/configure, as far as I can tell from the code, cross-compiled without CROSS_COMPILE until now. So we are now _adding_ it as a requirement. We could simply revert the changes I identified above instead, and keep things the ay they were.


> See for example, both root and js/src/configure have the following comment at
> the moment:
> 
> |351 dnl Set CROSS_COMPILE in the environment when running configure
> 352 dnl to use the cross-compile setup for now|
> 
> and thats the comment right above where khuey is changing is js/src/...
> 
> AND m-c root configure has: |if test -n "$CROSS_COMPILE" -a "$target" !=
> "$host"; then| for the same block of code this is trying to do here...
> 
> this is the right change.

No, the right change is the reversion I identified. This change will also work if the only goal is to make the whole build work, but it adds and extra requirement on SpiderMonkey-only builds. Look at the code history, and lets put it back to the way it worked.

Actually, the right change is to remove the environmental variable requirement from both j.s.c.in and c.in.

You make a good point about the comment, and back when we split j.s.c.in off c.in it was true and required. So we should also change the comment.
CROSS_COMPILE is sort of a hack. autoconf has built-in tests for cross-compiling, but it simply compiles a binary and tries to run it. As mentioned in those diff hunks in comment 17, that test didn't work for compiling PPC binaries on a Mac OS X host, since you could in fact run the binaries. It also won't work when compiling x86_64->i386, since you can run those binaries too, thus the CROSS_COMPILE hack.

Alternately, we could fix the libffi autoconf bit to just check that $host != $target.
(In reply to comment #27)
> Alternately, we could fix the libffi autoconf bit to just check that $host !=
> $target.

Can we set CROSS_COMPILE to $host != $target, and so not use mozconfig-/environmental-variable-supplied value?

(in practice we'd be removing CROSS_COMPILE, renaming it to something private like "cross_compile", which is the result of $host != $target.)
This reverts the CROSS_COMPILE part, which fixes spidermonkey config. This fixes bug 612809 (broke cross-compile) aswell, which the export problem doesn't.
blocking2.0: ? → betaN+
Attachment #491196 - Flags: review?(ted.mielczarek)
Attachment #491196 - Flags: review?(ted.mielczarek) → review+
Attachment #490749 - Flags: review?(ted.mielczarek) → review-
Assignee: khuey → pbiggar
http://hg.mozilla.org/tracemonkey/rev/301b97a20042
Whiteboard: [fixed-in-tracemonkey]
So, for those keeping score at home, what happened was:

- We broke the code that passed the host and target arch as arguments to libffi's configure.
- This means libffi assumes that the host arch is the target arch and sets the target arch to x86_64 on all builds.
-- If the config.cache in the objdir exists and is from before that patch landed, it says that the target arch is i386.  Libffi sees that the current target and the cached target aren't the same and errors out, which prompted this bug.
-- If the config.cache doesn't exist (because we're building on tryserver, or because philor set the builders to clobber), libffi proceeds without erroring out.
- If we've gotten this far, libffi pulls the values of CC and CXX from the environment, which the universal build glue has set to gcc-4.2 -arch i386.
- Thus, even though the target is x86_64, the build proceeds with the i386 compiler.  This is why this didn't fail to link or die on unit tests.

So, turns out this was mostly cosmetic about what target arch libffi thinks it's building for :-P
Don't ask me why, but this isn't fixed.  TM is still building libffi with an ostensible target of x64 and this bit us on the merge to m-c.
Whiteboard: [fixed-in-tracemonkey]
I'm getting breakage on clobber builds of the JS shell, 32-bit linux.  Attached is my config.log.

My configure line is this:

CC='gcc -m32' CXX='g++ -m32' AR=ar ../configure \
        --enable-debug --enable-valgrind --disable-optimize \
        --target=i686-pc-linux-gnu
32-bit JS shell builds were broken for me on Mac, too.  Adding --host=x86_64-darwin to my configure line fixed that.  But Linux builds are still broken, I can't find a --host option to fix them.  Unless I add --with-system-nspr!  Then I get lots of warnings about incompatible .so files with that, but it builds.

Here's the error, BTW:

  ../jscpucfg.cpp:48:21: error: prtypes.h: No such file or directory
  ../jscpucfg.cpp:81:2: error: #error "Endianess not defined."

Why do we #include prtypes.h?

pbiggar said this on IRC: "I don't think we know we require prtypes.h. That's new in a recent version of nspr, and we have a note in configure.in that we dont need it. so maybe we do now"
(In reply to comment #34)
> pbiggar said this on IRC: "I don't think we know we require prtypes.h. That's
> new in a recent version of nspr, and we have a note in configure.in that we
> dont need it. so maybe we do now"

I am significantly less confident in saying this since discovering that prtypes.h has been used since mercurial revision 1.
Right, the note was that PR_STATIC_ASSERT got moved from prlog.h to prtypes.h in a recent NSPR, and spidermonkey doesn't depend on that.
Attached patch patchSplinter Review
This finally fixes cross-compilation for all cross-compilation configure lines people have given.  I have no idea why this fixes it though.

It doesn't fix the very simple case, but that never worked:

  ./configure --target=i386-apple-darwin10.4.0
Attachment #491854 - Flags: review?(ted.mielczarek)
Attachment #491854 - Flags: review?(ted.mielczarek) → review+
http://hg.mozilla.org/tracemonkey/rev/1532a8fcc7d3
Whiteboard: [fixed-in-tracemonkey]
http://hg.mozilla.org/mozilla-central/rev/c2316a213ef0
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.