Closed Bug 715298 Opened 14 years ago Closed 14 years ago

Fennec Java builds succeed on "linux-ix-slave" builders, but fail on "try-linux-slave" builders

Categories

(Release Engineering :: General, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cpeterson, Assigned: coop)

References

Details

(Whiteboard: [mobile][try])

Attachments

(2 files)

I have a couple changesets (with Fennec Java changes) I have pushed to the try servers. When my Fennec builds are (randomly?) assigned to any "linux-ix-slave" builder, my builds succeed. When my builds are assigned to any "try-linux-slave" builder, my builds fail. The errors are that the ProGuard Java .class shrinker/optimizer can't find a couple methods in Android SDK libs, but AFAICT the builders have the same Android SDK and JDK versions and paths.

> 1) specific changeset used in all cases

* CHANGESETS PUSHED TO TRY SERVER: https://tbpl.mozilla.org/?tree=Try&rev=8ac71ab745e5

8ac71ab745e5
4e1d2cef3695
921bab94b9af
757fb03b09c3
0e422aa201ca
c9c6f7ed46a5
64be73b3b014

> 2) you say "builds on try machines" - I assume you are doing push to try from a local repo that was the same changeset in all of these cases?

Yes, I have local patches in my hg patch queue that I pushed to try. I tested ~2 different patches when trying to debug my build problems, but those patches were functionally equivalent, and the patches landed on both "linux-ix-slave" and "try-linux-slave" builders.

hg qnew --currentdate --currentuser --edit --message "try: -b do -e -p android -u none -t none" TRY
hg push --force -rtip ssh://hg.mozilla.org/try

* BUILDS SUCCEEDED ON THESE BUILDERS:

linux-ix-slave08
linux-ix-slave10
linux-ix-slave10
linux-ix-slave11
linux-ix-slave11
mv-moz2-linux-ix-slave23
mv-moz2-linux-ix-slave23

* BUILDS FAILED ON THESE BUILDERS:

try-linux-slave06
try-linux-slave06
try-linux-slave19
try-linux-slave20
try-linux-slave28
try-linux-slave28
try-linux-slave23

* BUILD ERROR:

ProGuard, version 4.4
Reading input...
Reading program directory [/builds/slave/try-andrd-dbg/build/obj-firefox/mobile/android/base/classes]
Reading library directory [/tools/android-sdk-r13/platforms/android-13]
Initializing...
Warning: org.mozilla.gecko.GeckoApp$30: can't find referenced method 'void setZOrderOnTop(boolean)' in class android.view.SurfaceView
Warning: org.mozilla.gecko.GeckoApp$30: can't find referenced method 'void setZOrderMediaOverlay(boolean)' in class android.view.SurfaceView
:cpeterson, could you attach your mozconfigs to the bug? (I assume, but asking explicitly, that the same mozconfig was being used in both the successful and failing cases?)
I see this in the try build for:

builder: try-android-xul
slave: mv-moz2-linux-ix-slave22
starttime: 1325532788.85
results: success (0)
buildid: 20120102113320
builduid: 25fff2ef53714e968cd0f21242be42a6
revision: 8ac71ab745e5

mozconfig dump:

# Global options
mk_add_options MOZ_MAKE_FLAGS=-j4

# Nightlies only since this has a cost in performance
ac_add_options --enable-js-diagnostics

# Build Fennec
ac_add_options --enable-application=mobile
ac_add_options --disable-elf-hack

# Android
ac_add_options --target=arm-linux-androideabi
ac_add_options --with-endian=little
ac_add_options --with-android-ndk="/tools/android-ndk-r5c"
ac_add_options --with-android-sdk="/tools/android-sdk-r13/platforms/android-13"
ac_add_options --with-android-tools="/tools/android-sdk-r13/tools"
ac_add_options --with-android-toolchain=/tools/android-ndk-r5c/toolchains/arm-linux-androideabi-4.4.3/prebuilt/linux-x86
ac_add_options --with-android-platform=/tools/android-ndk-r5c/platforms/android-5/arch-arm
ac_add_options --with-system-zlib
ac_add_options --enable-update-channel=${MOZ_UPDATE_CHANNEL}

export JAVA_HOME=/tools/jdk6
export MOZILLA_OFFICIAL=1

ac_add_options --with-branding=mobile/xul/branding/nightly

I would like to compare that to a failed build on the non-try side to make sure that the ndk and sdk environment vars are matching.
:bear, my change only affects the "try-android" builds, not "try-android-xul". The mozconfig you copied is for "builder: try-android-xul".

:joduinn, do you want the mozconfig I use to build locally? I'm not sure which mozconfig the try servers are using; I didn't specify a particular mozconfig when pushing. The android nightly builds use this mozconfig:

https://hg.mozilla.org/mozilla-central/file/tip/mobile/android/config/mozconfigs/android/nightly
I just want to compare the two mozconfigs. I took a look at two of the slaves you listed and saw that they had the proper Android SDKs in place, but one had some older info because it has been around longer than the other. My hunch is that you may not be using the mozconfig on try that you think/assume you are, and I just want to rule that out.
Here is the mozconfig I use to compile locally on my Mac:

ANDROID_MIN_VERSION=5
ANDROID_TARGET_VERSION=13
ANDROID_NDK_DIR="/Users/cpeterson/Code/google/android-ndk-r5c"
ANDROID_SDK_DIR="/Users/cpeterson/Code/google/android-sdk-macosx"

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/OBJDIR
mk_add_options MOZ_MAKE_FLAGS="-j9 -s"

ac_add_options --disable-crashreporter
ac_add_options --disable-pedantic
ac_add_options --disable-tests
ac_add_options --with-ccache

# Android SDK
ac_add_options --with-android-version=$ANDROID_MIN_VERSION
ac_add_options --with-android-ndk="$ANDROID_NDK_DIR"
ac_add_options --with-android-platform="$ANDROID_NDK_DIR/platforms/android-$ANDROID_MIN_VERSION/arch-arm"
ac_add_options --with-android-toolchain="$ANDROID_NDK_DIR/toolchains/arm-linux-androideabi-4.4.3/prebuilt/darwin-x86"
ac_add_options --with-android-sdk="$ANDROID_SDK_DIR/platforms/android-$ANDROID_TARGET_VERSION"
ac_add_options --with-android-tools="$ANDROID_SDK_DIR/tools"

# Android options
ac_add_options --enable-application=mobile/android
ac_add_options --target=arm-linux-androideabi
ac_add_options --with-endian=little
> My hunch is that you may not be using the mozconfig on try that you think/assume you are
> and I just want to rule that out.

:bear, how can I check which mozconfig is being used on the try servers? The changeset I am pushing to try does not change mozconfig or rely on any new mozconfig flags.

I wonder if the linux-ix-slave and try-linux-slave builders could be using different Java versions? The Java version on my dev machine is javac 1.6.0_29.
Blocks: Proguard
(In reply to Chris Peterson (:cpeterson) from comment #6)
> :bear, how can I check which mozconfig is being used on the try servers? The
> changeset I am pushing to try does not change mozconfig or rely on any new
> mozconfig flags.

Chris: if you check the build log for your try run, search for "mozconfig" and you should find two build steps early on in the log: one that fetches the mozconfig (got mozconfig), and then a second step that shows the contents of the .mozconfig that the build will use (cat .mozconfig). Here's a recent example try log that contains those steps:

https://tbpl.mozilla.org/php/getParsedLog.php?id=8502559&tree=Try

(In reply to Chris Peterson (:cpeterson) from comment #6)
> I wonder if the linux-ix-slave and try-linux-slave builders could be using
> different Java versions? The Java version on my dev machine is javac
> 1.6.0_29.

Doubtful, since in theory we control those things, but I will double-check to be sure.
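Those two log steps can also be pulled out of a saved full log with a couple of lines of shell. The sketch below runs against a stand-in log file; the "got mozconfig" / "cat .mozconfig" step names come from this comment, but the exact delimiter text tbpl emits is an assumption, so adjust the patterns to match a real log.

```shell
# Sketch: extract the mozconfig a try build actually used from a downloaded
# build log. The step-delimiter format below is a stand-in, not necessarily
# the exact text tbpl writes.
LOG=try-build.log

# Stand-in log contents for demonstration.
cat > "$LOG" <<'EOF'
========= Started got mozconfig =========
========= Finished got mozconfig =========
========= Started cat .mozconfig =========
ac_add_options --enable-application=mobile
ac_add_options --target=arm-linux-androideabi
========= Finished cat .mozconfig =========
EOF

# Print only the mozconfig contents the build used.
sed -n '/Started cat .mozconfig/,/Finished cat .mozconfig/p' "$LOG" |
  grep '^ac_add_options'
```

With a real log downloaded from tbpl, only the `sed`/`grep` pipeline at the end is needed.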
Assignee: joduinn → coop
OS: Mac OS X → Linux
Priority: -- → P3
I compared two of the builders mentioned in comment #0. It makes me sad that the output is not identical:

[cltbld@linux-ix-slave08 ~]$ rpm -qa | grep jdk | sort
jdk1.5-1.5.0_10-0moz1
jdk1.6-1.6.0_17-0moz1
[cltbld@linux-ix-slave08 ~]$ cd /tools
[cltbld@linux-ix-slave08 tools]$ cd
[cltbld@linux-ix-slave08 ~]$ ls -ld /tools/jdk*
lrwxrwxrwx  1 root   root     19 Jan 15  2010 /tools/jdk -> /tools/jdk-1.5.0_10
drwxr-xr-x  9 root   root   4096 Nov  9  2006 /tools/jdk-1.5.0_10
drwxr-xr-x 10 cltbld cltbld 4096 Mar 19  2010 /tools/jdk-1.6.0_17
lrwxrwxrwx  1 root   root     19 May  5  2010 /tools/jdk6 -> /tools/jdk-1.6.0_17
[cltbld@linux-ix-slave08 ~]$ rpm -qa | grep android | sort
android-ndk5-r5c-0moz3
android-ndk-r4c-0moz3
android-sdk13-r13-0moz1
android-sdk-r8-0moz3
[cltbld@linux-ix-slave08 ~]$ ls -ld /tools/android*
lrwxrwxrwx 1 root root   22 Jun 17  2010 /tools/android-ndk -> /tools/android-ndk-r4c
drwxr-xr-x 6 root root 4096 Jul  1  2010 /tools/android-ndk-r4c
drwxr-xr-x 9 root root 4096 Jul 20 13:58 /tools/android-ndk-r5c
lrwxrwxrwx 1 root root   21 Jun 17  2010 /tools/android-sdk -> /tools/android-sdk-r8
drwxr-xr-x 8 root root 4096 Aug  9 09:29 /tools/android-sdk-r13
drwxr-xr-x 5 root root 4096 Jun 21  2010 /tools/android-sdk-r8

[cltbld@try-linux-slave06 ~]$ rpm -qa | grep jdk | sort
jdk1.5-1.5.0_10-0moz1
jdk1.6-1.6.0_17-0moz1
[cltbld@try-linux-slave06 ~]$ ls -ld /tools/jdk*
lrwxrwxrwx  1 root   root     19 Jul 10  2009 /tools/jdk -> /tools/jdk-1.5.0_10
drwxr-xr-x  9 root   root   4096 Nov  9  2006 /tools/jdk-1.5.0_10
drwxr-xr-x 10 cltbld cltbld 4096 Mar 19  2010 /tools/jdk-1.6.0_17
lrwxrwxrwx  1 root   root     19 May 18  2010 /tools/jdk6 -> /tools/jdk-1.6.0_17
[cltbld@try-linux-slave06 ~]$ rpm -qa | grep android | sort
android-ndk5-r5c-0moz3
android-ndk-r4c-0moz3
android-ndk-r5c-0moz1
android-sdk13-r13-0moz1
android-sdk-r8-0moz3
[cltbld@try-linux-slave06 ~]$ ls -ld /tools/android*
lrwxrwxrwx 1 root root   22 Jun 17  2010 /tools/android-ndk -> /tools/android-ndk-r4c
drwxr-xr-x 6 root root 4096 Jul  2  2010 /tools/android-ndk-r4c
lrwxrwxrwx 1 root root   22 Jul 14 14:18 /tools/android-ndk-r5 -> /tools/android-ndk-r5c
drwxr-xr-x 9 root root 4096 Jul 19 23:51 /tools/android-ndk-r5c
lrwxrwxrwx 1 root root   21 Jun 17  2010 /tools/android-sdk -> /tools/android-sdk-r8
drwxr-xr-x 8 root root 4096 Aug  9 11:38 /tools/android-sdk-r13
drwxr-xr-x 5 root root 4096 Jun 21  2010 /tools/android-sdk-r8

Chris: would the extra android-ndk (r5) on the try-linux-* slaves be responsible for what you're seeing?

I really need to look at two build logs (one pass, one fail) to find out what's going on, though. I can start mining the changesets you've provided to try to find matching build logs, but if you already have links to them, they would be appreciated.
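One way to make the package delta between the two slaves jump out, rather than eyeballing the listings, is to diff the two sorted `rpm -qa` captures with `comm`. The package names below are copied from the output in this comment; the file names are placeholders.

```shell
# Sketch: diff the android package lists captured from the two slaves.
# Package names are copied from the rpm -qa output above; file names are
# made up for illustration.
printf '%s\n' \
  android-ndk5-r5c-0moz3 android-ndk-r4c-0moz3 \
  android-sdk13-r13-0moz1 android-sdk-r8-0moz3 | sort > linux-ix-slave08.pkgs
printf '%s\n' \
  android-ndk5-r5c-0moz3 android-ndk-r4c-0moz3 android-ndk-r5c-0moz1 \
  android-sdk13-r13-0moz1 android-sdk-r8-0moz3 | sort > try-linux-slave06.pkgs

# comm -13 prints lines unique to the second file, i.e. packages installed
# only on try-linux-slave06.
comm -13 linux-ix-slave08.pkgs try-linux-slave06.pkgs
# -> android-ndk-r5c-0moz1
```

On the real machines the inputs would come from `rpm -qa | sort` on each host instead of `printf`.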
Whiteboard: [mobile][try]
> Chris: would the extra android-ndk (r5) on the try-linux-* slaves be responsible for what you're seeing?

I don't think the android-ndk-r5 -> android-ndk-r5c symlink causes any problems.
:coop, I've attached two build logs: a successful build (from a linux-ix-slave) and a failed build (from a try-linux-slave). The logs show that the same mozconfig is downloaded (mobile/android/config/mozconfigs/android/nightly) and the cat'd contents are the same.
(In reply to Chris Peterson (:cpeterson) from comment #12)
> :coop, I've attached two build logs: a successful build (from a
> linux-ix-slave) and a failed build (from a try-linux-slave).
>
> The logs show that the same mozconfig is downloaded
> (mobile/android/config/mozconfigs/android/nightly) and the cat'd contents
> are the same.

OK, I've pulled try-linux-slave28 (the slave used in the BAD log) and am going to do some testing to find out what the java delta is between this machine and a GOOD machine.
Status: NEW → ASSIGNED
Priority: P3 → P2
(In reply to Chris Cooper [:coop] from comment #13)
> OK, I've pulled try-linux-slave28 (the slave used in the BAD log) and am
> going to do some testing to find out what the java delta is between this
> machine and a GOOD machine.

hg has been unavailable due to the downtime, but I did do a quick comparison of the jdk dirs between mv-moz2-linux-ix-slave23 (GOOD) and try-linux-slave28 (BAD):

[cltbld@mv-moz2-linux-ix-slave23 jdk]$ rsync -e ssh -nav /tools/jdk-1.5.0_10 try-linux-slave28.build.mozilla.org:/tools
cltbld@try-linux-slave28.build.mozilla.org's password:
building file list ... done
jdk-1.5.0_10/jre/lib/
jdk-1.5.0_10/jre/lib/charsets.jar
jdk-1.5.0_10/jre/lib/deploy.jar
jdk-1.5.0_10/jre/lib/javaws.jar
jdk-1.5.0_10/jre/lib/jsse.jar
jdk-1.5.0_10/jre/lib/plugin.jar
jdk-1.5.0_10/jre/lib/rt.jar
jdk-1.5.0_10/jre/lib/ext/
jdk-1.5.0_10/jre/lib/ext/localedata.jar
jdk-1.5.0_10/jre/lib/i386/client/
jdk-1.5.0_10/jre/lib/i386/client/classes.jsa
jdk-1.5.0_10/lib/
jdk-1.5.0_10/lib/tools.jar
I added "java -version" to the Fennec makefile to log the exact JDK version installed on the builders. I ran a few try builds until I hit a try-linux-slave and a linux-ix-slave. They report the same JDK versions installed, so my build errors (on try-linux-slave) must be a different problem.

* slave: try-linux-slave13 (MY BUILD FAILED)

java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)

* slave: linux-ix-slave07 (MY BUILD SUCCEEDED)

java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
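The same check can be scripted instead of eyeballed: compare the captured VM banner lines from the two slaves directly. The strings below are copied from this comment; on a real slave each would be captured with something like `java -version 2>&1 | tail -1`.

```shell
# Sketch: compare the captured `java -version` VM banners from the two
# slaves. The strings are copied from the build logs quoted above.
bad='Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)'
good='Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)'

if [ "$bad" = "$good" ]; then
  echo "VM banners match"
else
  echo "VM banners differ"
fi
# -> VM banners match
```

Matching banners here only rule out the JDK version itself; the java vs. javac VM difference found later in this bug needed a separate `javac -J-version` probe.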
(In reply to Chris Peterson (:cpeterson) from comment #15)
> I added "java -version" to the Fennec makefile to log the exact JDK version
> installed on the builders. I ran a few try builds until I hit a
> try-linux-slave and a linux-ix-slave. They report the same JDK versions
> installed, so my build errors (on try-linux-slave) must be a different
> problem.

cpeterson: I don't know much java myself. How about I set aside one known GOOD slave and one known BAD slave and let you poke around at them directly?
Sounds good.

btw, you mentioned earlier that the try-linux-* VMs would be phased out. What is the time frame of the phase-out? That may make this investigation unnecessary. <:)
(In reply to Chris Peterson (:cpeterson) from comment #17)
> btw, you mentioned earlier that the try-linux-* VMs would be phased out.
> What is the time frame of the phase-out? That may make this investigation
> unnecessary. <:)

Not soon enough to be relevant here, sadly.
Depends on: 719810
(In reply to Chris Cooper [:coop] from comment #16)
> cpeterson: I don't know much java myself. How about I set aside one known
> GOOD slave and one known BAD slave and let you poke around at them directly?

I've set aside the following slaves for you:

* try-linux-slave28 (BAD)
* linux-ix-slave07 (GOOD)

I already had them pulled for investigation, so that seemed easiest. I'll send you connection details out-of-band.

Please re-assign the bug back to me once you've completed your investigation and I'll repatriate the slaves.
Assignee: coop → cpeterson
:coop, I am done with the try-linux-slave28 and linux-ix-slave07 builders. Please feel free to reimage them.

The ProGuard errors "went away" when I changed its -libraryjars classpath from "$(ANDROID_SDK)" to "$(ANDROID_SDK)/android.jar". ProGuard's documentation says both directory and .jar -libraryjars classpaths should work, but for some reason specifying only the directory causes try-linux-slave builders to barf. I don't know why, but I have a reasonable workaround.

After poking around the try-linux-slave28 and linux-ix-slave07 builders, the only difference I can see between their JDK environments is that linux-ix-slave07's /tools/jdk6/bin/java is a "Server VM", but its /tools/jdk6/bin/javac invokes a "Client VM" java.

* try-linux-slave28 (BAD) Java versions:

$ java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)

$ javac -J-version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)

* linux-ix-slave07 (GOOD) Java versions:

$ java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)

$ javac -J-version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing) <-- javac invokes "Client VM" java when default java is "Server VM"??
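For reference, the workaround described above amounts to changing one -libraryjars entry in the ProGuard configuration. This is a sketch of the before/after options, not the exact Fennec makefile text; $(ANDROID_SDK) is assumed to expand to the SDK platform directory (e.g. /tools/android-sdk-r13/platforms/android-13) as in the mozconfigs earlier in this bug.

```
# Before: pass the SDK platform directory as the library classpath.
# ProGuard's docs say a directory should work, but this failed on the
# try-linux-slave builders.
-libraryjars $(ANDROID_SDK)

# After: name android.jar explicitly. This worked on both builder pools.
-libraryjars $(ANDROID_SDK)/android.jar
```

Pointing -libraryjars at android.jar directly also avoids ProGuard scanning whatever else happens to be in the platform directory, which may be why it behaves more consistently across machines.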
Assignee: cpeterson → coop
Blocks: 721395
Blocks: 721396
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering