Produce a universal (fat) GV AAR with native libraries for multiple architectures in automation

ASSIGNED
Assigned to

Status

defect
P1
normal
ASSIGNED
6 months ago
a day ago

People

(Reporter: fluffyemily, Assigned: nalexander, NeedInfo)

Tracking

(Blocks 2 bugs)

unspecified
Unspecified
Android

Firefox Tracking Flags

(firefox-esr60 wontfix, firefox65 wontfix, firefox66 wontfix, firefox67 affected)

Details

(Whiteboard: [geckoview:fenix:p2])

Attachments

(5 attachments, 2 obsolete attachments)

Reporter

Description

6 months ago
A big pain point for developers is having a separate artifact per architecture, especially as GV apps add more dependencies with native code (e.g. Rust components). Providing a fat AAR alongside the architecture-dependent ones would help with this.
The technical work is covered by Bug 1485045, which is about to land.  I'm mutating this to be about _publishing_.  One way to do this is to make a new automation job which depends on all of the underlying architecture builds (x86/arm/arm64) and aggregates a universal/fat AAR.  There are probably other ways to do this; for example, we could do this in a different automation system (i.e., not in TC).  There are pros and cons to that.
Depends on: 1485045
Summary: Create a fat GV AAR for working with multiple architectures → Publish a fat GV AAR for working with multiple architectures
Points: --- → 8
Whiteboard: [geckoview:fenix:p1]
James says this is a P1 and that Nick offered to investigate.
Assignee: nobody → nalexander
Priority: -- → P1

Updated

5 months ago
Product: Firefox for Android → GeckoView
Summary: Publish a fat GV AAR for working with multiple architectures → Publish a universal (fat) GV AAR with native libraries for multiple architectures
I've started thinking about this.  A few notes:

1) There's a question of where to put this.

a) We could make a TC job that does this work as part of an "N-way" artifact build.  That's also very Mozilla-specific -- and without precedent in our existing automation.  (The old macOS universal builds actually compiled the native code twice as part of one "build" job.)

b) We could make a TC job that just invokes Python to repack an AAR.  That's a very Mozilla-specific (and maybe even Mozilla build-system specific!) way to achieve this.

c) We could make a TC job that invokes Gradle to repack an AAR in some way.

I'm most enamored of c).  We have lots of precedent for Android Gradle jobs that look "build-like": i.e., do a DISABLE_COMPILE_ENVIRONMENT build to compile Java or whatever, and then do some Gradle stuff (including publishing the GeckoView AARs).  The people who are most knowledgeable about the Android packaging bits will probably be more comfortable in Gradle/Groovy/Kotlin than in Mozilla-specific build-system Python.  And there's the possibility of growing a Gradle plugin if this is helpful -- or of adapting one of the existing ones, like https://github.com/NicoToast/fat-aar (which doesn't look quite like what I want to do right now).
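
For comparison, option b) (repacking in plain Python) could look roughly like the sketch below.  This is only an illustration under stated assumptions, not the approach taken in the patches: `merge_aars` is a hypothetical helper, the per-architecture AARs are assumed to be ordinary zip files, and the native library prefix is a parameter because the layout of the real inputs isn't confirmed here (`jni/` is the documented AAR location).

```python
import hashlib
import zipfile


def merge_aars(arch_aars, out_path, native_prefix="jni/"):
    """Merge single-architecture AARs into one multi-architecture AAR.

    arch_aars: paths to per-architecture AARs (hypothetical inputs).
    native_prefix: directory holding native libraries inside each AAR.
    Returns the sorted list of entry names written to out_path.
    """
    seen = {}  # entry name -> sha256 of its content
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as out:
        for path in arch_aars:
            with zipfile.ZipFile(path) as aar:
                for info in aar.infolist():
                    if info.is_dir():
                        continue
                    data = aar.read(info.filename)
                    digest = hashlib.sha256(data).hexdigest()
                    if info.filename in seen:
                        # Two inputs must never supply the same native library.
                        if info.filename.startswith(native_prefix):
                            raise ValueError("duplicate native library: " + info.filename)
                        # Shared entries must be byte-identical across inputs.
                        if seen[info.filename] != digest:
                            raise ValueError("inputs disagree on " + info.filename)
                        continue
                    seen[info.filename] = digest
                    out.writestr(info, data)
    return sorted(seen)
```

The strict "shared entries must be identical" check is a design choice of the sketch: it fails loudly instead of silently picking one architecture's copy.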

2) There's a question of what we're trying to repack and how much we want to assert about the consistency of the inputs.

The library packing part is straightforward now that we aren't compressing Gecko libraries: Bug 1486524.  After Bug 1485045, there shouldn't be much (hopefully, any!) architecture-specific JVM code.  But I expect we'll find that the input omni.ja files encode "interesting" details of the target architecture that we'll have to smooth out.  I'll start work on this locally and capture the current state of the world.
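
One way to capture that state of the world is to hash every entry of each input archive and report the entries that differ.  The sketch below is a minimal version, assuming the inputs can be opened as ordinary zip archives (which may not hold for optimized omni.ja files); `differing_entries` and the labels are hypothetical names.

```python
import hashlib
import zipfile


def differing_entries(archives):
    """Report entries that are not byte-identical in every archive.

    archives: dict mapping a label (e.g. an architecture) to a zip path.
    Returns {entry name: {label: sha256 hex digest, or None if missing}}.
    """
    digests = {}
    for label, path in archives.items():
        with zipfile.ZipFile(path) as z:
            for name in z.namelist():
                digest = hashlib.sha256(z.read(name)).hexdigest()
                digests.setdefault(name, {})[label] = digest

    report = {}
    for name, by_label in digests.items():
        values = [by_label.get(label) for label in archives]
        # More than one distinct value means a mismatch or a missing entry.
        if len(set(values)) > 1:
            report[name] = dict(zip(archives, values))
    return report
```

An empty report means the inputs agree entry for entry; anything else is a candidate for smoothing out.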

3) There's a question of how beetmover tasks interact with new build-like jobs.

I'm concerned that beetmover will not be happy with a job that produces a GV AAR but *not* a Fennec APK.  That is, as far as I know, unprecedented.  However, looking at Bug 1470942, I see a special "beetmover_geckoview" TC kind that _only_ consumes target.maven.zip, AFAICT.  And it already heavily filters the builds it applies to, so it looks like it should be possible to introduce this new build task and make it work with beetmover_geckoview.

While I'm here, I see that we're producing a target.maven.zip in Python at https://searchfox.org/mozilla-central/source/mobile/android/mach_commands.py#467.  We might be able to make that happen in Gradle; there's probably no need to have that layer of Python massaging in there.  See https://github.com/mozilla/application-services/blob/bb0e1f576f2d0118c7e3c35612171ccebf9c7baf/publish.gradle#L243-L258.

jlorenzo: how do you feel about 3)?  Do you foresee any issues?
Depends on: 1470942, 1486524
Flags: needinfo?(jlorenzo)
Status: NEW → ASSIGNED
Depends on: 1517878
Depends on: 1517882
This doesn't work as written, for reasons that I don't understand.  Johan, can you help with this part?

Depends on D15773
Attachment #9034510 - Attachment description: Bug 1508976 - Produce a multi-architecture GeckoView "fat AAR". #firefox-build-system-reviewers,snorp,agi → Bug 1508976 - Produce a multi-architecture GeckoView "fat AAR".
Attachment #9034512 - Attachment description: Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR. #firefox-build-system-reviewers → Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR.
I see a (yaml) lint error -- quelle surprise -- but overall this is looking reasonable on try.  Let's see if

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4da101077fba0707d72c465edd51e0063e99ebe0

comes back green after my cleaning pass.  I wasn't really able to make progress on the Nightly + beetmover version; there are a lot of assumptions coded into the CI configuration that will require an actual releng person (hopefully, Johan) to help with.  But: progress!
https://treeherder.mozilla.org/#/jobs?repo=try&author=nalexander%40mozilla.com&selectedJob=220077559

is failing with an actual difference in modules/AppConstants.jsm; my initial guess is HAVE_USR_LIB64_DIR but it'll have to wait until next week.  In any case, we'll be able to work around this easily enough (or make it not be different, like I'm doing in Bug 1517878).

If anybody wants to see a well-formed (I hope!) target.maven.zip, an older try push like

https://treeherder.mozilla.org/#/jobs?repo=try&author=nalexander%40mozilla.com&selectedJob=219921167

sketches the idea.  (But I used the x86_64 job for the work before adding the new job, so it's not exactly the same.)
Attachment #9034510 - Attachment description: Bug 1508976 - Produce a multi-architecture GeckoView "fat AAR". → Bug 1508976 - Produce a multi-architecture GeckoView "fat AAR". #firefox-build-system-reviewers,snorp,agi
Attachment #9034512 - Attachment description: Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR. → Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR. #firefox-build-system-reviewers

(In reply to Nick Alexander :nalexander [he/him] from comment #9)

https://treeherder.mozilla.org/#/jobs?repo=try&author=nalexander%40mozilla.com&selectedJob=220077559

is failing with an actual difference in modules/AppConstants.jsm; my initial guess is HAVE_USR_LIB64_DIR but it'll have to wait until next week. In any case, we'll be able to work around this easily enough (or make it not be different, like I'm doing in Bug 1517878).

Sadly, my guess was not correct. The actual issue is that MOZ_GECKO_PROFILER is true on arm, aarch64, x86 but false on x86_64. Based on Bug 1360322 there's at least a little work to be done to get support for x86_64. (Build support, at least.)
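
A mismatch like this is easy to spot mechanically once the per-architecture build flags are in hand. The sketch below assumes they are available as plain dictionaries (how they get extracted from each build is not shown, and `inconsistent_flags` is a hypothetical helper); the example values are the ones reported above.

```python
def inconsistent_flags(configs):
    """Return the flags whose value is not the same for every architecture.

    configs: {architecture: {flag name: value}}.
    Returns {flag name: {architecture: value, or None if unset}}.
    """
    flags = set()
    for cfg in configs.values():
        flags.update(cfg)

    report = {}
    for flag in sorted(flags):
        values = {arch: cfg.get(flag) for arch, cfg in configs.items()}
        if len(set(values.values())) > 1:
            report[flag] = values
    return report


# Values as described in this comment.
configs = {
    "arm": {"MOZ_GECKO_PROFILER": True},
    "aarch64": {"MOZ_GECKO_PROFILER": True},
    "x86": {"MOZ_GECKO_PROFILER": True},
    "x86_64": {"MOZ_GECKO_PROFILER": False},
}
mismatches = inconsistent_flags(configs)
```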

mstange: can you tell me if x86_64 should be supported (so this is just misconfiguration) or if work is needed here? Can you link to tickets if they're filed? Thanks!

Flags: needinfo?(mstange)

I talked to jlorenzo on Slack about this. Just to capture that here:

I would like to split the ticket in half: the initial task (non-Nightly) and publishing (Nightly).

jlorenzo: can you skim the first bits (and the last bit, which doesn’t work) and green light landing the initial bits? If you see something that should block landing the first bit, let me know and we’ll do it all together. If you could do that today that would be helpful, and then we can follow up by hammering out the beetmover details Friday/next week. (And I'll file the ticket to follow up with publishing Nightly, of course.)

Also, a green try is at

https://treeherder.mozilla.org/#/jobs?repo=try&revision=efd7de78f34839a8c70ed084b14d96edfc6ca6ba&selectedJob=220455409

The target.maven.zip is at

https://queue.taskcluster.net/v1/task/ecXXps7pRRmezcpP_MtiOw/runs/0/artifacts/public/build/target.maven.zip
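
For scripting against other tasks, the link above can be rebuilt from its parts. This is a small sketch that assumes the URL shape of that one link generalizes (root, task ID, run number, artifact name); `artifact_url` is a hypothetical helper, not an existing API.

```python
def artifact_url(task_id, name, run=0, root="https://queue.taskcluster.net/v1"):
    """Build an artifact URL shaped like the link quoted in this comment."""
    return "{}/task/{}/runs/{}/artifacts/{}".format(root, task_id, run, name)


url = artifact_url("ecXXps7pRRmezcpP_MtiOw", "public/build/target.maven.zip")
```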

Depends on: 1518557

(In reply to Nick Alexander :nalexander [he/him] from comment #10)

mstange: can you tell me if x86_64 should be supported (so this is just misconfiguration) or if work is needed here? Can you link to tickets if they're filed? Thanks!

Yes, just a misconfiguration.

Flags: needinfo?(mstange)

Replied over Slack per comment 11. Patch reviewed on Phabricator. Please let me know if I can help in any other way.

Flags: needinfo?(jlorenzo)

Hi folks, a status update on this ticket. I haven't landed it 'cuz I wanted to investigate two things:

  1. Is this even the right approach? This makes one AAR with lib/$ARCH/*.so for multiple architectures. In application-services we have a similar pattern but aggregating across $FEATURE instead of $ARCH. There, we have one composite/fat ("megazord") AAR with multiple POM dependencies to capture the different features. We could do something similar -- have geckoview which internally depends on all the geckoview-$ARCH publications. I don't really have a definitive answer to this: what we have now works and is pretty simple; there's really no magic once you unpack our TC integration. There's more magic in the other approach.

  2. How does this interact with Gradle composite builds? The end game is to have a fat AAR (this ticket), an updated Android-Gradle plugin (Bug 1515248) and a streamlined variant configuration (Bug 1509539) in order to make composite builds work for local development happiness. I tested my WIP on this (very large!) stack and ... it works! More or less as I expected it to.
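
To make the alternative in item 1 concrete: an aggregating `geckoview` publication would carry a POM that lists each geckoview-$ARCH artifact as a dependency. The sketch below only illustrates the shape of such a POM; it is not what application-services publishes and not what this ticket lands. The group ID, version, `pom` packaging and `aggregate_pom` helper are all assumptions.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"


def aggregate_pom(group_id, artifact_id, version, arch_artifact_ids):
    """Return POM XML for an artifact that depends on per-architecture AARs."""
    project = ET.Element("project", xmlns=POM_NS)
    ET.SubElement(project, "modelVersion").text = "4.0.0"
    ET.SubElement(project, "groupId").text = group_id
    ET.SubElement(project, "artifactId").text = artifact_id
    ET.SubElement(project, "version").text = version
    ET.SubElement(project, "packaging").text = "pom"

    deps = ET.SubElement(project, "dependencies")
    for dep_id in arch_artifact_ids:
        dep = ET.SubElement(deps, "dependency")
        ET.SubElement(dep, "groupId").text = group_id
        ET.SubElement(dep, "artifactId").text = dep_id
        ET.SubElement(dep, "version").text = version
        ET.SubElement(dep, "type").text = "aar"
    return ET.tostring(project, encoding="unicode")


# Hypothetical coordinates, for illustration only.
pom_xml = aggregate_pom(
    "org.example", "geckoview", "1.0.0",
    ["geckoview-armeabi-v7a", "geckoview-x86"],
)
```

The "magic" in this approach lives in dependency resolution: the consumer's build tool has to pull in every per-architecture AAR, whereas the single fat AAR is just one file.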

So with all of that said, I will land this after polishing a few small tweaks, and then we can move on to getting it consumed in our downstream consumers:

  • [ ] Android Components
  • [ ] Reference Browser
  • [ ] Focus?

--artifact try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c949ff6adc1444b122056e2bce298d8e268d6e4
--no-artifact try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a6842500d1a4fc0a98e92affff5cd9583429adc4
Android jobs try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f7c1dc40e73f804a81a3e0e5dda4b429dbd2e8c3

Summary: Publish a universal (fat) GV AAR with native libraries for multiple architectures → Produce a universal (fat) GV AAR with native libraries for multiple architectures in automation
Blocks: 1522581
Attachment #9034513 - Attachment description: Bug 1508976 - Part 2: Add GeckoView multi-architecture fat AAR Nightly. r?jlorenzo → Bug 1522581 - Publish GeckoView multi-architecture fat AAR Nightly. r?jlorenzo

Comment on attachment 9034513 [details]
Bug 1522581 - Publish GeckoView multi-architecture fat AAR Nightly. r?jlorenzo

Revision D15774 was moved to bug 1522581. Setting attachment 9034513 [details] to obsolete.

Attachment #9034513 - Attachment is obsolete: true

Nick, what are the next steps to finish the fat AARs (and bug 1522581)?

How will the fat AARs impact Fenix APK size?

Flags: needinfo?(nalexander)
OS: Unspecified → Android

(In reply to Chris Peterson [:cpeterson] from comment #16)

Nick, what are the next steps to finish the fat AARs (and bug 1522581)?

I need some time to get back to this, or we need somebody in releng to push it across the line. I'd be thrilled if somebody else could push it across.

How will the fat AARs impact Fenix APK size?

They shouldn't impact single-arch Fenix APKs at all, 'cuz Gradle strips unused architectures. The underlying libraries are identical. If Fenix doesn't configure per-arch, it will multiply the APK size by the number of architectures, i.e., by 4: instead of shipping just {armeabi-v7a}, we'll ship {armeabi-v7a, aarch64, x86, x86_64}.
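
As a back-of-the-envelope check, the estimate can be split into an architecture-independent part and the per-ABI native libraries. That split and every size below are assumptions of this sketch, made up for illustration; per-arch packaging would be configured in the app's Gradle build (for example with ABI filters or ABI splits), which is not shown.

```python
def estimated_apk_bytes(shared_bytes, native_bytes_by_abi, packaged_abis=None):
    """Rough APK size: shared content plus native libraries per packaged ABI.

    With packaged_abis=None, every ABI present in the fat AAR is packaged.
    """
    abis = list(native_bytes_by_abi) if packaged_abis is None else packaged_abis
    return shared_bytes + sum(native_bytes_by_abi[abi] for abi in abis)


# Hypothetical sizes, for illustration only.
MB = 1024 * 1024
native = {"armeabi-v7a": 40 * MB, "aarch64": 45 * MB, "x86": 45 * MB, "x86_64": 48 * MB}
shared = 10 * MB

unfiltered = estimated_apk_bytes(shared, native)                 # all four ABIs
per_arch = estimated_apk_bytes(shared, native, ["armeabi-v7a"])  # one ABI only
```

With per-arch configuration the result matches what a single-architecture AAR would give, since the native libraries are the same files either way.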

I'm quite confident Fenix will configure per-arch so this will have roughly no impact on APK size.

Flags: needinfo?(nalexander)
See Also: → 1530757

[geckoview:fenix:p2] because this is important for Fenix but technically not a release blocker.

Whiteboard: [geckoview:fenix:p1] → [geckoview:fenix:p2]
Attachment #9034512 - Attachment description: Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR. #firefox-build-system-reviewers → Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR.
Attachment #9034510 - Attachment description: Bug 1508976 - Produce a multi-architecture GeckoView "fat AAR". #firefox-build-system-reviewers,snorp,agi → Bug 1508976 - Produce a multi-architecture GeckoView "fat AAR".

Comment 19

3 months ago
Pushed by nalexander@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4e5d97c93515
Produce a multi-architecture GeckoView "fat AAR". r=snorp,agi,froydnj
https://hg.mozilla.org/integration/autoland/rev/da57df805c56
Post: Don't include architecture in multi-architecture GeckoView artifactId. r=snorp
https://hg.mozilla.org/integration/autoland/rev/91c31d2a7706
Post: Include all targets in about:buildconfig in GeckoView fat AAR. r=froydnj

This allows using the existing VCS-based artifact crawling to
download the "raw" target.maven.zip from Android jobs without processing
it further. It's just put in a specific directory, ready for use.
This isn't a big deal in automation, where all URLs are known, but
it's very useful when building locally and the VCS and the pushlog
must be consulted to determine task URLs.

Depends on D24984

This follows the model set down for EME artifacts:

  • a new tier is added that uses mach artifact install --job ... to
    fetch artifacts
  • in automation, MOZ_ARTIFACT_TASK* is used to ensure the artifacts
    come from the correct tasks
  • the fetched artifacts are unpacked and specific inputs moved into
    places expected by the build and packager

In this case, the artifact fetching is complicated enough that I did
it in a new Mach command, mach android fat-aar. That command also
verifies that the fetched artifacts are compatible and that we're not
assembling a fat AAR that is nonsensical. The specific inputs are not
used in the Fennec APK that is produced; they're only used in the
GeckoView AAR that is produced.

The artifact fetching itself required tweaking to fetch only
target.maven.zip artifacts and to not unpack them.

The specific inputs used are the native libraries (libs/$ARCH/*.so)
and the architecture-specific preference files ($ARCH/greprefs.js and
defaults/pref/$ARCH/geckoview-prefs.js). None of these inputs are
impacted by l10n.
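
As an illustration of picking out just those inputs, the sketch below filters an archive's entries with the three patterns named above. It is not the code in mach android fat-aar: `input_patterns` and `select_inputs` are hypothetical helpers, and treating the patterns as paths relative to the archive root is an assumption.

```python
import fnmatch
import zipfile


def input_patterns(arch):
    """Glob patterns for the architecture-specific inputs listed above."""
    return [
        "libs/{}/*.so".format(arch),
        "{}/greprefs.js".format(arch),
        "defaults/pref/{}/geckoview-prefs.js".format(arch),
    ]


def select_inputs(archive_path, arch):
    """Return the sorted entry names in the archive that match the patterns."""
    patterns = input_patterns(arch)
    with zipfile.ZipFile(archive_path) as z:
        return sorted(
            name
            for name in z.namelist()
            if any(fnmatch.fnmatchcase(name, p) for p in patterns)
        )
```

Note that `fnmatch`'s `*` also matches `/`, so the library pattern would accept nested paths as well; a stricter check would be needed if that matters.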

Depends on D31571

Attachment #9034511 - Attachment description: Bug 1508976 - Post: Don't include architecture in multi-architecture GeckoView artifactId. r?snorp → Bug 1508976 - Post: Don't include architecture in multi-architecture GeckoView artifactId. r=snorp
Attachment #9065597 - Attachment is obsolete: true
Attachment #9034512 - Attachment description: Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR. → Bug 1508976 - Post: Include all targets in about:buildconfig in GeckoView fat AAR. r=froydnj
Attachment #9065597 - Attachment is obsolete: false
Attachment #9065597 - Attachment is obsolete: true