trying to upgrade the Android SDK to version 29 triggers weird aidl execution failures
Categories
(Firefox Build System :: Android Studio and Gradle Integration, defect)
Tracking
(firefox76 fixed)
Tracking | Status | |
---|---|---|
firefox76 | --- | fixed |
People
(Reporter: froydnj, Assigned: froydnj)
References
Details
Attachments
(1 file)
Explanation forthcoming in the commit message for the patch I am about to post.
Assignee | ||
Comment 1•4 years ago
|
||
It started, as most things do, with a software upgrade.
Upgrading the Android SDK version produced mysterious build errors:
java.lang.RuntimeException: ... Error while executing process /builds/worker/fetches/android-sdk-linux/build-tools/29.0.3/aidl ...
Looking a little further back in the log, one found the mysterious
error:
/builds/worker/fetches/android-sdk-linux/build-tools/29.0.3/aidl: error while loading shared libraries: /builds/worker/fetches/clang/lib/libc++.so: file too short
And when one investigated that file, one found that all libc++.so
contained
was:
INPUT(libc++.so.1 -lc++abi)
which was obviously not any kind of loadable shared library.
The libc++.so
file was meant as a redirect, an artifact placed so people
could say -lc++
on their compiler command lines and things would Just Work.
The intent was that programs would link to libc++.so.1
(or whatever other
soversion was in use) and that they wouldn't have to worry about -lc++abi
or
any other private implementation details of libc++. Why, then, was aidl
linking to libc++.so
?
Looking at ldd aidl
on a local machine says:
...
libc++.so => $HOME/.mozbuild/android-sdk-linux/build-tools/29.0.3/./lib64/libc++.so (0x00007f8b4b2d1000)
...
and poking at aidl
with readelf --dynamic -W
indicated a partial solution
to the problem:
...
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/../lib64:$ORIGIN/lib64]
...
aidl
contained a DT_RUNPATH
entry for libc++.so
that specified to the
linker "find my libc++.so
at a path relative to my current binary location."
Presumably this was done so that aidl
would link to a specific libc++.so
and not pick up a random libc++.so
from the system on which aidl
was
running.
...Except that setting LD_LIBRARY_PATH
, which we specify during our builds for
various reasons, takes precedence over DT_RUNPATH
. So invoking aidl
was
looking for libc++.so
along LD_LIBRARY_PATH
, finding this linker script
that was meant to be used only by the compiler, and attempting to use it like
a real shared library.
Great, we understand the problem now; what do we do about it?
Unwinding our use of LD_LIBRARY_PATH
is an involved task. One further
wrinkle that wasn't mentioned above is that not every Android build that we
run failed after the SDK update: our x86-64 fuzzing build was fine, as was the
ARM PGO instrumentation build. Presumably, those builds set LD_LIBRARY_PATH
in slightly different ways compared to our other builds, and those differences
in setting LD_LIBRARY_PATH
contribute to aidl
somehow not getting invoked
with LD_LIBRARY_PATH
set. It is not obvious to me how difficult getting a
consistent LD_LIBRARY_PATH
setting is.
This next part is conjecture -- informed conjecture, but conjecture
nonetheless. When we invoke gradle for the first time, a separate daemon
process is started, presumably to make various Java startup problems go away.
What we would like to have happen, and what appears to happen, is that
everything that gradle does is actually spawned from the daemon itself, not
from the invoked gradle process. So if we could get the daemon invoked with
the correct environment (i.e. no LD_LIBRARY_PATH
present), everything would
just work, because aidl
would be launched from an environment that will
respect its DT_RUNPATH
setting.
This change assumes that gradle does not invoke things that depend on
LD_LIBRARY_PATH
, which is not at all clear; I think there are cases where we
can go mach -> gradle -> mach -> ... -> clang
. But those cases don't seem
to come up in automation (perhaps due to every build being a clobber), and if
they come up on local developer machines, local developer machines seem
unlikely to have LD_LIBRARY_PATH
set (mandatory invocation of xkcd#1172).
All of that is to say that this patch takes a semi-hacky approach to the
problem: when we invoke gradle, we make sure that we're invoking it with an
environment that doesn't contain LD_LIBRARY_PATH
. And doing so avoids all
of the problems outlined above.
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7f9799e1d3d5 delete LD_LIBRARY_PATH before invoking gradle; r=firefox-build-system-reviewers,rstewart
Comment 3•4 years ago
|
||
bugherder |
Description
•