Open Bug 1882160 Opened 9 months ago Updated 8 months ago

Improve geckoview build times

Categories

(Firefox Build System :: Android Studio and Gradle Integration, enhancement)

Tracking

(Not tracked)

People

(Reporter: gbrown, Unassigned)

References

(Blocks 1 open bug)

Details

When the Fenix/Focus/android-components tasks are migrated to mozilla-central, many more tasks will depend on Geckoview builds.

Currently, on github, firefox-android tasks pull geckoview nightly from maven -- there's practically no wait.

On mozilla-central, these tasks benefit from using the current in-tree code for geckoview, but pay the price of needing to wait for geckoview to build. In particular, they rely on build-fat-aar. build-fat-aar relies on the x86/x86-64/arm7/aarch64 geckoview builds -- more waiting!

The migration effort is currently staged on oak; here's an oak-based try push with --artifact:
https://treeherder.mozilla.org/jobs?repo=try&revision=fce4b4aabe23c4b75a2b6c11b75b9328ddb25d60

Decision task start: 11:27:54
Geckoview platform builds start: 11:33:22
Geckoview build-fat-aar start: 11:42:51
Fenix build start: 11:54:44

Roughly, for artifact builds, geckoview platform builds take about 10 minutes and the fat-aar build takes about 12 minutes.

Here's an oak-based try push with --no-artifact:
https://treeherder.mozilla.org/jobs?repo=try&revision=d5a9d9274a49c5e1e904526e25a5eeb5dee88dca

Decision task start: 12:19:47
Geckoview platform builds start: 12:24:45
Geckoview build-fat-aar start: 12:46:48
Fenix build start: 12:59:16

Roughly, for non-artifact builds, geckoview platform builds take about 22 minutes and the fat-aar build takes about 12 minutes.
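For anyone who wants to re-derive the rough figures above, here is a small sketch of the arithmetic. It assumes each phase ends at the moment the next one starts (so queueing and task setup time are folded into the phase), which is the same approximation used in the estimates; the function and dictionary names are only illustrative.

```python
from datetime import datetime


def elapsed_minutes(start: str, end: str) -> float:
    """Minutes between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Start times copied from the two try pushes listed above.
PUSHES = {
    "artifact": {
        "decision": "11:27:54",
        "platform": "11:33:22",
        "fat_aar": "11:42:51",
        "fenix": "11:54:44",
    },
    "non-artifact": {
        "decision": "12:19:47",
        "platform": "12:24:45",
        "fat_aar": "12:46:48",
        "fenix": "12:59:16",
    },
}


def phase_durations(times: dict) -> dict:
    """Approximate each phase as the gap between consecutive start times."""
    return {
        "decision task": elapsed_minutes(times["decision"], times["platform"]),
        "platform builds": elapsed_minutes(times["platform"], times["fat_aar"]),
        "build-fat-aar": elapsed_minutes(times["fat_aar"], times["fenix"]),
        "total wait for Fenix": elapsed_minutes(times["decision"], times["fenix"]),
    }


for push, times in PUSHES.items():
    for phase, minutes in phase_durations(times).items():
        print(f"{push:>12}  {phase:<22} {minutes:5.1f} min")
```

The "total wait for Fenix" row is the number a Fenix developer actually experiences on try: the time from the decision task starting to the Fenix build starting.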

Improvements to both artifact and non-artifact build times are desired.

Geckoview builds currently use b-linux-gcp workers. Here are try pushes using b-linux-xlarge-gcp workers:

Artifact: https://treeherder.mozilla.org/jobs?repo=try&revision=13e9727726ad4c8c45d1d1da199f0f2ead5c6394
Non-artifact: https://treeherder.mozilla.org/jobs?repo=try&revision=5e25f50778ee8ed0c1eb0d771e147d82e99d4a87

I don't see any improvement in per-platform build times; there is a possible 2 minute improvement for build-fat-aar.

I believe that the TaskCluster-y way of solving this problem is to use the TaskCluster index. We want a hash that captures the state of GV, just like we have hashes that capture the state of toolchains. The index then lets us find tasks that were indexed under the expected hash. There's lots of support for this but toolchains might be the only place that really exercises that support. That is, you want things that will produce the same GV to re-use an existing one, and -- I claim -- it's worth the engineering effort to try to solve that problem.
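To make the idea concrete, here is a minimal sketch of a content hash over a set of source paths, plus an index route built from it. This is not the in-tree implementation: `digest_paths`, `index_route`, the route layout, and the glob patterns are all hypothetical, and choosing which paths actually determine the GV output is the hard part.

```python
import hashlib
from pathlib import Path


def digest_paths(root, patterns):
    """Hash the relative path and contents of every file matching `patterns`.

    Files are visited in sorted order of their POSIX-style relative path so
    the digest does not depend on filesystem enumeration order.
    """
    root = Path(root)
    files = {}
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file():
                files[path.relative_to(root).as_posix()] = path

    outer = hashlib.sha256()
    for rel in sorted(files):
        outer.update(rel.encode("utf-8"))
        outer.update(b"\0")  # separator between the name and the content hash
        outer.update(hashlib.sha256(files[rel].read_bytes()).digest())
    return outer.hexdigest()


def index_route(digest, name="geckoview-fat-aar"):
    """Build a hypothetical index route for a task that produced `digest`."""
    return f"example.cache.{name}.hash.{digest}"
```

A task whose inputs hash to a digest already present in the index could then be replaced by the indexed task instead of being rebuilt, for example `index_route(digest_paths("/path/to/checkout", ["mobile/android/**/*"]))`, where the checkout path and pattern are placeholders.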

Component: General → Android Studio and Gradle Integration

> Currently, on github, firefox-android tasks pull geckoview nightly from maven -- there's practically no wait.

The wait is there, it's just invisible to github actions because you get whatever geckoview is already available.

The question here would be what problem exactly you're trying to solve, because it's not clear from comment 0. Of course builds of firefox-android on m-c are going to have to wait for geckoview to be available. Is this about try as developers would use it? Ahal has recently added support for reusing tasks, and I was assuming this use case was one of the main reasons, if not the main one. So, ... is there something else that's not already covered?

Yes, we can already use --use-existing-tasks: https://treeherder.mozilla.org/jobs?repo=try&revision=d963461ac8110f76c3f5b819a3fd9b70acb29402 . That seems to work great, but requires the flag be specified explicitly (might be forgotten) and only helps on try.

I'm trying to capture a few performance concerns that have been raised recently:

  1. Fenix developers can currently create a pull request and see their Fenix builds running after just a few minutes; on try, there's a significant delay if they forget --use-existing-tasks.
  2. A local artifact build of geckoview often takes about 5 minutes, but we wait for 20 to 30 minutes for a geckoview artifact build on try (decision task, per-platform build task setup for 4 platforms, geckoview archive creation) -- could we do better?

For the taskcluster index approach, I think the hash would need to be based on everything in gecko and geckoview; have we attempted anything like that before?

IMO the lower hanging fruit here is to skip build-fat-aar in most cases, and only depend on that where we genuinely need/want the 4 archs.
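As a rough illustration of that suggestion, a task could pick its geckoview dependency based on the platforms it really needs, and fall back to build-fat-aar only when it needs all four. The function name and the per-platform task labels below are hypothetical; only the platform list and "build-fat-aar" come from comment 0.

```python
# The four platforms named in comment 0.
ALL_PLATFORMS = ("x86", "x86-64", "arm7", "aarch64")


def geckoview_dependencies(needed_platforms):
    """Return the task labels a downstream build should depend on.

    Depends on build-fat-aar only when every platform is needed; otherwise
    depends directly on the per-platform geckoview builds, skipping the
    extra fat-aar step.
    """
    needed = set(needed_platforms)
    if not needed:
        raise ValueError("at least one platform is required")
    unknown = needed - set(ALL_PLATFORMS)
    if unknown:
        raise ValueError(f"unknown platforms: {sorted(unknown)}")

    if needed == set(ALL_PLATFORMS):
        return ["build-fat-aar"]
    # Hypothetical label scheme for the per-platform geckoview builds.
    return [f"geckoview-build-{p}" for p in ALL_PLATFORMS if p in needed]


print(geckoview_dependencies(["aarch64"]))
print(geckoview_dependencies(ALL_PLATFORMS))
```

Going by the timings in comment 0, a task that depends on a single platform build instead of build-fat-aar would save roughly the 12 minutes the fat-aar step takes, assuming it can consume a single-architecture geckoview.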

Depends on: 1882407