Closed Bug 971841 Opened 10 years ago Closed 10 years ago

Install ant on builders

Categories

(Release Engineering :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: blassey, Assigned: Callek)

Attachments

(6 files, 5 obsolete files)

      No description provided.
Which builds need this? And is there a minimum version required?
Flags: needinfo?(blassey.bugs)
coming out of mobile automation meeting -- I'll provide more details soon.
Assignee: nobody → bugspam.Callek
Flags: needinfo?(blassey.bugs)
It's needed for bug 971101. Been speaking with Callek, I don't know what the minimum version needed is, but he tells me that 1.7.2 is available already so we're going to test that version and go from there.
Attachment #8375072 - Flags: review?(bhearsum)
Attachment #8375072 - Flags: review?(bhearsum) → review+
Live in production.
Turns out this was also needed (discovered during try push)
Attachment #8376007 - Flags: review?(bhearsum)
Attachment #8376007 - Flags: review?(bhearsum) → review+
Live in production.
we now have the answer to "what version of ant is required?" and it's 1.8.0
Sooo, turns out the version of ant we have isn't new enough, failed blassey's try run.

Dustin recommends we take the srpm for what we have [1] and compare it against the fedora18 srpm [2] and then build a new rpm with all our deps/packages for our uses.

And for my notes (since might as well build related stuff):

[root@bld-centos6-hp-015.build.scl1.mozilla.com ~]# for i in `repoquery --whatrequires ant`; do echo "-$i" ; echo "`repoquery --requires $i | grep ant`" | sed -e "s/^/ \\\-\-/"; done
-ant-0:1.7.1-13.el6.x86_64
 \--
-ant-antlr-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
 \--antlr
-ant-antunit-0:1.1-4.el6.noarch
 \--ant
 \--config(ant-antunit) = 1.1-4.el6
-ant-apache-bcel-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-apache-bsf-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-apache-log4j-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-apache-oro-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-apache-regexp-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-apache-resolver-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-commons-logging-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-commons-net-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-contrib-0:1.0-0.10.b2.el6.noarch
 \--ant >= 1.6.2
-ant-javamail-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-jdepend-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-jmf-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-jsch-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-junit-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-ant-nodeps-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
-ant-scripts-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
-ant-swing-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
-ant-trax-0:1.7.1-13.el6.x86_64
 \--ant = 1.7.1-13.el6
 \--ant-nodeps = 1.7.1-13.el6
-jetty-eclipse-0:6.1.24-2.el6.noarch
 \--ant >= 1.6
-opengrok-0:0.9-1.el6.noarch
 \--ant

Which means we don't actually have *all that much* to update here
Assignee: bugspam.Callek → sbruno
I discussed this with Callek last week, and he provided me info about how to proceed.

I then compared the spec files relative to the 1.7 and 1.8 before rebuilding: I'd like to go through the differences with Callek before proceeding.

A diff report between the two can be found here: http://people.mozilla.org/~sbruno/ant_1.7-1.8.html
(In reply to Simone Bruno [:simone] from comment #11)
> I discussed this with Callek last week, and he provided me info about how to
> proceed.
> 
> I then compared the spec files relative to the 1.7 and 1.8 before
> rebuilding: I'd like to go through the differences with Callek before
> proceeding.

Yea we'll chat tomorrow AM (Eastern Time)

> A diff report between the two can be found here:
> http://people.mozilla.org/~sbruno/ant_1.7-1.8.html

That's a rather larger diff than I anticipated.

Out of curiosity can you also run a diff from the ant_1.7 from fedora with the spec from ant at http://puppetagain.pub.build.mozilla.org/data/repos/yum/mirrors/centos/6/latest/os/x86_64/ant-1.7.1-13.el6.x86_64.rpm

(We're using the CentOS spec, not fedora, and might as well base our update off that, if it makes sense to -- though I do suspect a larger diff)
I retrieved the spec files for Fedora 1.7, Fedora 1.8 and CentOS 1.7, and I merged the changes Fedora{1.7 -> 1.8} into CentOS 1.7 in order to create a new spec file to be used for our CentOS 1.8 build.

Actually, since the only differences between Fedora 1.7 and CentOS 1.7 were some Group definitions, which in Fedora 1.8 have also been updated so that they are identical to CentOS 1.7, the resulting file is identical to the Fedora 1.8 spec file.

According to what I see in (e.g) http://hg.mozilla.org/build/puppet/file/ed8f5fbccfa1/modules/packages/manifests/mozilla/ccache.spec, I changed the Release tag to moz1 in order to be able to "iterate on the rpm itself [...] and distinguish it properly from an upstream version that may come along" (in Callek's words).
Attached file ant_1.8.4_CentOS_moz.spec (obsolete) —
This is the spec file I am going to attempt to build to create the required ant package, as resulting from the merge described in the previous comment.
Attachment #8387550 - Flags: feedback+
Brad,

we're blocked here atm, with a "what to do" moment:

http://people.mozilla.org/~sbruno/1.7_CentOS-1.8_fedora.html line #278, it requires apache-commons-logging to build, I don't actually see it used in the source (https://github.com/apache/ant/tree/ANT_18_BRANCH/lib ) from a glance.

Our repos don't have apache-commons-logging right now, (also note apache-commons-net below it)

ftp://mirror.switch.ch/pool/4/mirror/fedora/linux/releases/18/Fedora/source/SRPMS/a/apache-commons-logging-1.1.1-20.fc18.src.rpm

It's looking like we'll need to pull in all sorts of rpms for this :/  
* Thu May  6 2010 Stanislav Ochotnicky <sochotnicky@redhat.com> - 1.1.1-1
- Rename and rebase from jakarta-commons-logging

(from the apache-commons-logging spec)

This realization significantly increases the time investment and ETA, since we need to make sure any packages we add won't break things, and rebuilding ant ourselves adds extra pain. I'm tempted to ask simone to do the following:

Locally try installing, on a throw-away loan instance, all the packages that this ant spec builds (e.g. anything with %package); for any that are missing a dep from our repo, get and install those deps as well.

We'll want to purge all newly installed packages and/or --downgrade each time we test one package to be extra careful (so we know what also gets upgraded/installed each time)

I expect a longer delay with this plan, but is certainly better/faster than manually taking all these deps and trying to build them from scratch.

Once we have a full array of new rpms to install, we'd add them to our yum repos, not as a releng-made rpm (since these were upstream).

simone does that sound doable? blassey does that sound ok?
Flags: needinfo?(sbruno)
Flags: needinfo?(blassey.bugs)
Callek, what sort of ETA are we looking at for that? If it is quite long we may want to reevaluate. As Nick pointed out Android is moving to a new build system called Gradle anyway. So if it is a similar level of effort, we might be better off going in that direction.
Flags: needinfo?(blassey.bugs)
NI to Callek for the ETA
Flags: needinfo?(bugspam.Callek)
Let's plan for simone spending his monday and possibly tuesday with the above effort (ala: c#15) and seeing where that gets us in a solid ETA once scope is better figured out.

Me and you can try and chat briefly on wednesday before your plethora of meetings and see the next steps (without bogging down the Mobile Testing meeting), how's that sound?
Flags: needinfo?(bugspam.Callek)
I was wondering whether the packaged rpm is the only way to proceed here.

Isn't it possible to use a manually installed ant in the context of the build steps which require it? The steps should be relatively simple. Something like:

1. Downloading http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.4-bin.tar.gz
2. expanding it to a conveniently located folder
3. set up ANT_HOME and adjust the PATH variable to point to $ANT_HOME/bin (export PATH=$PATH:$ANT_HOME/bin:$JAVA_HOME/bin)

Whenever I used ant in previous jobs I used it this way. Are there any technical reasons why this cannot be done in this case?
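
The steps above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not what was deployed: the function name install_ant_from_tarball and the install prefix are hypothetical, the tarball is assumed to be downloaded already, and it is assumed to unpack to a single top-level directory. Unlike the export line above, the sketch prepends $ANT_HOME/bin to PATH so that it takes precedence over any system-installed ant.

```shell
#!/bin/sh
# Hypothetical helper: unpack an already-downloaded Ant binary tarball
# and point ANT_HOME/PATH at it. Names and layout are assumptions.
install_ant_from_tarball() {
    tarball="$1"   # e.g. apache-ant-1.8.4-bin.tar.gz
    prefix="$2"    # directory to unpack into

    mkdir -p "$prefix" || return 1
    tar xf "$tarball" -C "$prefix" || return 1

    # Assumes the archive has exactly one top-level directory.
    top=$(tar tf "$tarball" | head -n 1 | cut -d/ -f1)
    ANT_HOME="$prefix/$top"
    # Prepend so this ant wins over an older one already on PATH.
    PATH="$ANT_HOME/bin:$PATH"
    export ANT_HOME PATH
}

# Possible use (not run here):
#   install_ant_from_tarball apache-ant-1.8.4-bin.tar.gz "$HOME/tools" && ant -version
```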
Flags: needinfo?(sbruno)
Hi, I was off sick for a few days and I am now working on this again.

I will follow the approach outlined in the previous comment, performing the following steps:

1 - upload to the current tooltool server the required version of apache ant
2 - update the releng tooltool manifest with it
3 - update the code so that the aforementioned manifest is actually used to retrieve the binary
4 - add the instructions to expand the binary in an appropriate folder
5 - change PATH variable so that the involved build steps use the binary ant installation in the previously mentioned folder, and update ANT_HOME env variable

My expected ETA is tomorrow, I will update the bug with progress or issues.
Attached patch bug_971841.diff (obsolete) — Splinter Review
I added two new files to the tooltool server:
- the ant binary tar.bz2 file
- a new version of the file setup.sh mentioned in the releng.manifest files (more on this below) for android.

The manifests have been modified to fetch the ant package and the updated version of setup.sh.

Finally, the mozconfig.linux file has been updated to setup ANT_HOME and update the path so that the desired version of ant is used.

In order to facilitate the review, I ran some commands on relengwebadm.private.scl3 which make it possible to verify the changes introduced to setup.sh, and that the ant version uploaded to tooltool is correct.

The first command is a diff between the old and new versions of setup.sh:

[sbruno@relengwebadm.private.scl3 ~]$ diff /mnt/netapp/relengweb/tooltool/pvt/build/sha512/f630174deaf91d92317df52571fce29d7fd141da917f638423d5f5dd6f9f09a67d867d37b047687e18ef6d6dd1288f14f16c9ab31791d672d1ceb2400910181f /mnt/netapp/relengweb/tooltool/pvt/build/sha512/974584133bcee7648bfca4bf95145239926ece2ab788a9b9850d3ad77c390af6efcf4d888c5f9654201262d9c8b8ecf16022ada0182854dc8a89ecf14649193e
7d6
< # ant 1.8.0 in apache-ant-1.8.0, as per Bug 971841
12d10
< rm -rf apache-ant-1.8.0
17d14
< tar xf apache-ant-1.8.0-bin.tar.bz2

The following command checks that the md5 of the ant package retrieved via tooltool is correct, comparing it to http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.0-bin.tar.bz2.md5

[sbruno@relengwebadm.private.scl3 ~]$ md5sum /mnt/netapp/relengweb/tooltool/pvt/build/sha512/a0972898bd5ddf74d4f29cf7d43b6500f26f7bb30446498e93b5c90975a7615d45722d832ee7289b16aa3e8bf11495263f7aa48859742e360d69277bac37bace
f0cd9b3cfd1d969656971f055ea2e06f  /mnt/netapp/relengweb/tooltool/pvt/build/sha512/a0972898bd5ddf74d4f29cf7d43b6500f26f7bb30446498e93b5c90975a7615d45722d832ee7289b16aa3e8bf11495263f7aa48859742e360d69277bac37bace
[sbruno@relengwebadm.private.scl3 ~]$ wget http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.0-bin.tar.bz2.md5 && cat apache-ant-1.8.0-bin.tar.bz2.md5
--2014-03-18 09:55:19--  http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.0-bin.tar.bz2.md5
Resolving archive.apache.org... 192.87.106.229, 140.211.11.131, 2001:610:1:80bc:192:87:106:229
Connecting to archive.apache.org|192.87.106.229|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34 [text/plain]
Saving to: “apache-ant-1.8.0-bin.tar.bz2.md5”

100%[=================================================================================================================================================================>] 34          --.-K/s   in 0s

2014-03-18 09:55:20 (8.55 MB/s) - “apache-ant-1.8.0-bin.tar.bz2.md5” saved [34/34]

f0cd9b3cfd1d969656971f055ea2e06f
[sbruno@relengwebadm.private.scl3 ~]$
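
The manual comparison above could be automated along these lines. This is a sketch only: verify_md5 is a hypothetical helper, and it assumes the published digest file carries the hex MD5 as its first whitespace-separated field (possibly followed by a carriage return).

```shell
#!/bin/sh
# Hypothetical helper: compare a local file's MD5 with a published digest file.
# Assumes the first field of the digest file's first line is the hex MD5.
verify_md5() {
    file="$1"
    digest_file="$2"

    # Strip any carriage returns before reading the expected digest.
    expected=$(tr -d '\r' < "$digest_file" | awk 'NR == 1 { print $1 }')
    actual=$(md5sum "$file" | awk '{ print $1 }')

    if [ "$expected" = "$actual" ]; then
        echo "OK: $actual"
    else
        echo "MISMATCH: expected $expected, got $actual" >&2
        return 1
    fi
}

# Possible use (not run here):
#   verify_md5 apache-ant-1.8.0-bin.tar.bz2 apache-ant-1.8.0-bin.tar.bz2.md5
```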

Please note that I assumed that a java environment is properly configured (java is present in the path, JAVA_HOME is correctly set, ...) in the context in which ant will run - if this assumption is not satisfied, the ant execution will obviously fail.

Finally, if we follow this approach, the changes introduced previously, which install ant and ant-regexp as mock packages, will need to be reverted since they are no longer necessary.

Thanks Callek for all the info you provided so far, and (preemptive thanks) for the review!
Attachment #8392995 - Flags: review?(bugspam.Callek)
Not sure yet what is going on here, but once I took this patch and a patch blassey did + a slight forced update to use this ant version (since it's not actually in path where brad needs it) I got:

Updated file /builds/slave/try-and-a6-0000000000000000000/build/obj-firefox/embedding/android/geckoview_example/proguard-project.txt
rm -f /builds/slave/try-and-a6-0000000000000000000/build/obj-firefox/embedding/android/geckoview_example/res/layout/main.xml
../../../dist/bin/nsinstall /builds/slave/try-and-a6-0000000000000000000/build/embedding/android/geckoview_example/main.xml res/layout/
rm -f /builds/slave/try-and-a6-0000000000000000000/build/obj-firefox/embedding/android/geckoview_example/AndroidManifest.xml
../../../dist/bin/nsinstall /builds/slave/try-and-a6-0000000000000000000/build/embedding/android/geckoview_example/AndroidManifest.xml /builds/slave/try-and-a6-0000000000000000000/build/obj-firefox/embedding/android/geckoview_example
echo jar.libs.dir=libs >> project.properties
ant debug
Error: JAVA_HOME is not defined correctly.
  We cannot execute /tools/jdk6/bin/java
make[4]: *** [bin/GeckoViewExample-debug.apk] Error 1

This means that something is wrong with the jdk/java binary [or path] we're using... not quite sure what yet.

This was https://tbpl.mozilla.org/?tree=Try&rev=536fd7a11d2f
To followup, it sounds like (from a cursory look) that /tools/jdk6 does not exist.

we install http://rpm.pbone.net/index.php3/stat/4/idpl/25506358/dir/centos_6/com/java-1.6.0-openjdk-devel-1.6.0.0-3.1.13.1.el6_5.x86_64.rpm.html (give or take a slight ver adjustment) into the mock environ, which is to /usr

Also of ref is /tools/jdk6 has been JAVA_HOME and in PATH for a LOOOOng while, is also in mozconfigs and was in the puppet-manifests repo (which used to serve the old non-mock based builders):

http://mxr.mozilla.org/build/source/puppet-manifests/modules/packages/manifests/devtools.pp#172

Sooooo I *think* we need to stop setting JAVA_HOME (in mozconfigs and buildbot) and this will work; alternatively we might need to set JAVA_HOME to the /usr location. Hard for me to know with how tired I am right now.

Either way this will need some more testing before we can call a deploy safe or ready
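
One way to guard against a stale JAVA_HOME like this, as a rough sketch only: sanitize_java_home is a hypothetical helper, not part of any patch here, and it assumes that with JAVA_HOME unset the tools fall back to whatever java is on PATH.

```shell
#!/bin/sh
# Hypothetical helper: drop JAVA_HOME when it does not point at a usable JDK,
# on the assumption that callers then fall back to `java` from PATH.
sanitize_java_home() {
    if [ -n "${JAVA_HOME:-}" ] && [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME=$JAVA_HOME has no executable bin/java; unsetting" >&2
        unset JAVA_HOME
    fi
}

# Possible use before the build step (not run here):
#   sanitize_java_home && ant debug
```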
The assumption I was talking about in my last comment is not satisfied.

The key questions are:
- which java version do we want to use when we run ant?
- how is/should that be installed on our slaves?

Depending on the answers to those, we will need to set JAVA_HOME and setup PATH changes (to use the appropriate java binaries). Also, references to non existing folders (e.g.: /tools/jdk6) should be removed from the code to keep our repos clean.

Since java is installed in the mock environment to execute this task, it would be natural to use that java installation, but there's a possibility that I am missing something here so don't take this as a suggestion :-)
So removing JAVA_HOME worked, I'll get a patch up for buildbot for that (in another clear bug, blocking this one)
Depends on: 985600
:simone

I have a few things to touch up here, but can you adjust your setup.sh to read from "apache-ant-bin.tar.bz2"  (e.g. make it version agnostic, for the simple reason of avoiding needing to modify setup.sh if we ever take an update)

We can do something like the clang_version in m-c's other .manifest files for tooltool to specify what version of ant we're using of course.
Flags: needinfo?(sbruno)
Attached patch bug_971841.diff (obsolete) — Splinter Review
Steps I performed:

- I repackaged apache-ant-1.8.0-bin.tar.bz2 so that it includes a folder named apache-ant (instead of apache-ant-1.8.0), and renamed it apache-ant-bin.tar.bz2.
  This has been uploaded to tooltool as e28b7a12fbbef02ad742958df8dd356ea2adb8ef79e95cd8eb8dbc953eb4cc11888969dac7d636187fd3ace9c63d9a6bc3d7795021c1d811a843e413fe5e52c9

- I modified setup.sh so that it cleans and unpacks the version-agnostic apache-ant folder; the new version of setup.sh has been uploaded to tooltool as f584b01f967b3f0b4877871c6078c339381344061d94914e5a720babf8d2564640ef051b221886204d6b207b2bdb5526b2b51ec9560dd8b12856d2cf0eb5ec37

- I modified all references to the two previously mentioned files in releng manifests

- I modified references from apache-ant-1.8.0 to apache-ant in mozconfig.linux as well
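
For reference, the repackaging step could look something like the sketch below. It is an illustration under assumptions, not the commands actually run: repackage_ant is a hypothetical helper, and it relies on GNU tar's -a option to choose the compression from the output file's suffix. It prints the SHA-512 of the result because tooltool identifies files by that digest.

```shell
#!/bin/sh
# Hypothetical helper: repack an Ant tarball so its top-level directory is
# version-agnostic. Relies on GNU tar's -a (compression chosen from suffix).
repackage_ant() {
    src="$1"       # e.g. apache-ant-1.8.0-bin.tar.bz2
    old_dir="$2"   # e.g. apache-ant-1.8.0
    new_dir="$3"   # e.g. apache-ant
    dest="$4"      # e.g. apache-ant-bin.tar.bz2

    work=$(mktemp -d) || return 1
    tar xf "$src" -C "$work" &&
        mv "$work/$old_dir" "$work/$new_dir" &&
        tar -caf "$dest" -C "$work" "$new_dir"
    status=$?
    rm -rf "$work"
    [ "$status" -eq 0 ] || return "$status"

    # Print the digest that would identify the new artifact.
    sha512sum "$dest"
}

# Possible use (not run here):
#   repackage_ant apache-ant-1.8.0-bin.tar.bz2 apache-ant-1.8.0 apache-ant apache-ant-bin.tar.bz2
```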
Attachment #8392995 - Attachment is obsolete: true
Attachment #8392995 - Flags: review?(bugspam.Callek)
Attachment #8394043 - Flags: review?(bugspam.Callek)
Flags: needinfo?(sbruno)
Comment on attachment 8394045 [details] [diff] [review]
bug_971841_II.diff - updating manifests with used version of ant

Callek, I uploaded this patch to follow the same pattern used for clang, as you suggested, but... After all, I hope you will not approve it.
I don't like to have this information in two places (in the manifests, and within the ant artifact itself) for the risk of future misalignment - the version is implied in the tooltool sha512 id, so I'd not add it explicitly here.
Comment on attachment 8394043 [details] [diff] [review]
bug_971841.diff

soooo.... we can't use this setup.sh anymore, it bitrotted against http://hg.mozilla.org/mozilla-central/diff/7bb92c73225e/mobile/android/config/tooltool-manifests/android/releng.manifest

Please remove the 2 you added to tooltool in this bug (so we don't pollute tooltool with unused things) and re-do it with a newly updated setup.sh...
Attachment #8394043 - Flags: review?(bugspam.Callek) → review-
Attached patch bug_971841.diff (obsolete) — Splinter Review
Patch updated as the underlying code-base changed.

I also removed from tooltool the following artifacts:
- f630174deaf91d92317df52571fce29d7fd141da917f638423d5f5dd6f9f09a67d867d37b047687e18ef6d6dd1288f14f16c9ab31791d672d1ceb2400910181f
- f584b01f967b3f0b4877871c6078c339381344061d94914e5a720babf8d2564640ef051b221886204d6b207b2bdb5526b2b51ec9560dd8b12856d2cf0eb5ec37
Attachment #8394043 - Attachment is obsolete: true
Attachment #8394659 - Flags: review?(bugspam.Callek)
Comment on attachment 8394659 [details] [diff] [review]
bug_971841.diff

Review of attachment 8394659 [details] [diff] [review]:
-----------------------------------------------------------------

without having looked, and to be clear, this patch *does* use the setup.sh that updated the android sdk?
Short answer: yes.

Long answer:

the version with updated android sdk (the current one before my patch) is 5aa... (beginning of sha512 hash);
my version is 328...

I uploaded the two versions here for convenience.

Here is the diff between the two:

Simones-MacBook-Pro:bug sbruno$ diff 5aa_setup.sh 328_setup.sh
5a6
> # ant in $topsrcdir/apache-ant
8a10
> rm -rf apache-ant
11a14
> tar xf apache-ant-bin.tar.bz2
Attachment #8394846 - Attachment mime type: application/x-sh → text/plain
Attachment #8394845 - Attachment mime type: application/x-sh → text/plain
Currently failing because of a class not found exception (class org.apache.tools.ant.launch.Launcher).

Such class is included in ANT_HOME/lib/ant-launcher.jar, though.

ANT_HOME is correctly set in mozconfig.linux. Is it possible that this setting is not read in the context of the ant execution?

The classpath used by ant is determined as follows (so setting ANT_HOME should be enough for a successful execution):

 Additional directories to be searched may be added by using the -lib option. The -lib option specifies a search path. Any jars or classes in the directories of the path will be added to Ant's classloader. The order in which jars are added to the classpath is as follows:

    -lib jars in the order specified by the -lib elements on the command line
    jars from ${user.home}/.ant/lib (unless -nouserlib is set)
    jars from ANT_HOME/lib

(Source: http://ant.apache.org/manual/running.html)
Adding an "ant -diagnostics" command before the execution of "ant debug" would tell us whether ANT_HOME is correctly picked up.
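
A pre-flight check along these lines might help narrow it down. This is a sketch with a hypothetical helper name, check_ant_home; it only confirms that ANT_HOME is visible in the environment where it runs and that the launcher jar is where the manual says it should be.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm ANT_HOME is set in this environment
# and contains the launcher jar before invoking ant.
check_ant_home() {
    if [ -z "${ANT_HOME:-}" ]; then
        echo "ANT_HOME is not set in this environment" >&2
        return 1
    fi
    if [ ! -f "$ANT_HOME/lib/ant-launcher.jar" ]; then
        echo "missing $ANT_HOME/lib/ant-launcher.jar" >&2
        return 1
    fi
    echo "ANT_HOME=$ANT_HOME looks usable"
}

# Possible use in the build step (not run here):
#   check_ant_home && ant -diagnostics && ant debug
```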
So, I think we're close to done here, though there is some sort of build error I can't figure out.

I'm happy to dive back in and diagnose it, or give someone access to the host, or whatever.... just tell me.

The details are in pastebin ( https://callek.pastebin.mozilla.org/4677873 ) right now, saved for "1 month"; I'm happy to get it dumped to this bug if someone has the inclination. The error seems to correspond to this part of build.mk in the ant dir of the android sdk we are using. Which means it's some sub-project/sub-ant-invoke that is causing the problem afaict


                <!-- no need to build the deps as we have already
                     the full list of libraries -->
                <subant failonerror="true"
                        buildpathref="project.library.folder.path"
                        antfile="build.xml">
                    <target name="nodeps" />
                    <target name="${project.libraries.target}" />
                    <property name="emma.coverage.absolute.file" location="${out.absolute.dir}/coverage.em" />
                </subant>


Any ideas brad...
Flags: needinfo?(blassey.bugs)
I deleted the android-16 packages from my laptop and I'm now able to reproduce locally
Flags: needinfo?(blassey.bugs)
punting this patch to brad since this working blocks him and I'm told glandium/et-al are loaded up.
Attachment #8397501 - Flags: review?(blassey.bugs)
Simone, thanks for *all* your work here. This is a corrected patch that makes it work. Since you authored the orig I'm asking for r? from you, and from armen as well.

I'll be backing out my buildbot-configs changes soon as well.
Assignee: sbruno → bugspam.Callek
Attachment #8387550 - Attachment is obsolete: true
Attachment #8394045 - Attachment is obsolete: true
Attachment #8394659 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8394045 - Flags: review?(bugspam.Callek)
Attachment #8394659 - Flags: review?(bugspam.Callek)
Attachment #8397503 - Flags: review?(sbruno)
Attachment #8397503 - Flags: review?(armenzg)
Attachment #8397503 - Flags: review?(sbruno) → review+
Attachment #8397501 - Flags: review?(blassey.bugs) → review+
Attachment #8397503 - Flags: review?(armenzg) → review+
Component: General Automation → General