Include Android-on-Mac builders in build pool

RESOLVED WONTFIX

Status

Infrastructure & Operations
Buildduty
RESOLVED WONTFIX
5 years ago
16 days ago

People

(Reporter: rnewman, Unassigned)

(Reporter)

Description

5 years ago
This is our primary mobile development platform, but plenty of times over the past year we've had bustage that wasn't visible in TBPL. The latest is Bug 779291 (tracking Bug 850089 for the fix), which makes it impossible to build Firefox for Android on Mac.

Can we get automated builds on our primary development environment, please?
How often? We can't scale Mac with Amazon.
On which trees/repos?

On that note, we would not be able to have it showing on TBPL by default (without using "&noignore=1"), since I don't think we should be running this on a per-checkin basis.
Ideally this should be per-checkin and showing on TBPL, and available on Try.

The goal here is for people to notice & back out when they bust this platform -- if it's hidden and/or too infrequent, then that won't happen.

(Also: if it's sufficiently infrequent, then developers will discover the bustage locally sooner / around the same time as TBPL hits it, so we get less value from having the automated builders at that point.)
(Reporter)

Comment 3

5 years ago
Let's start with agreeing on the goal -- we want to spot bustage on Mac Android, just as we spot it on Linux builds -- and then we can figure out what that means.

I would expect m-c and m-i by default, but perhaps there are some build-related trees that would also benefit.
Android-on-Mac isn't listed on https://developer.mozilla.org/en-US/docs/Supported_build_configurations, but it sounds like it's a Tier 2 or Tier 3 configuration, which we don't generally do builds for. (Exceptions being configurations that we may end up shipping, eg Metro or Windows x86-64.)
I actually don't think there is a need to run these builds on every single landing. In fact, the last few issues I'm aware of that happened on mac-android builds are entirely related to host tools. Changes to these are pretty rare, and yes, when they do happen, there is a chance of breakage. I don't think there are many breakages other than those caused by changes to host tools. So I think it boils down to:
- does this really happen so often? I know it happened twice in the past two weeks, but that's circumstantial at best.
- can we reliably-ish determine whether a changeset is affecting host tools?
(Reporter)

Comment 6

5 years ago
Ben: those are platforms for which we *target* builds for Firefox, not platforms on which we *develop* Firefox.

Mac + Android toolchain is the latter, not the former. It's the development environment on which most of a large team creates code for a tier 1 build configuration. We don't have lists of these, but they're pretty much the sane subset of {MacBook Pro, random Wintel} x {Mac, Windows, Linux} x {Clang, Android toolchain}.

The issue here is that it's a cross-compiled build configuration, and we're only cross-compiling from one of our two source environments: Linux -> Android, not Mac -> Android. When the latter breaks, individual developers find out when they pull. When the former breaks, the breaker finds out when they push, which is The Way Things Are Supposed To Work®.
(In reply to Richard Newman [:rnewman] from comment #6)
> The issue here is that it's a cross-compiled build configuration, and we're
> only cross-compiling from one of our two source environments: Linux ->
> Android, not Mac -> Android. When the latter breaks, individual developers
> find out when they pull. When the former breaks, the breaker finds out when
> they push, which is The Way Things Are Supposed To Work®.

The same is true for people building with clang or a gcc that is not 4.5 on Linux, and for people building with a version of MSVC that is not the one on the build servers. In fact, it's more true for these because the possibility of a breakage happening is not limited to host tools, but applies to the entire code base. Some people don't like that our builds have -Werror by default in some directories because of that.
Yet, we're not multiplying the build setups.
(Reporter)

Comment 8

5 years ago
(In reply to Mike Hommey [:glandium] from comment #5)
> I actually don't think there is a need to run these builds on every single
> landing. In fact, the last few issues I'm aware of that happened on
> mac-android builds are entirely related to host tools. 

I wouldn't call Bug 850089 a host tool problem. Check the backout diff:

https://hg.mozilla.org/integration/mozilla-inbound/rev/e8938a43c31a

C++, one Makefile change.

> - does this really happen so often? I know it happened twice in the past two
> weeks, but that's circumstantial at best.

It's been a fairly regular occurrence over the past year. I don't have statistics to say how frequent. I'd estimate monthly, but it might be more or less because I don't pull and build every changeset.

I don't know if we need to build on every push, or every other push, or in parallel to every PGO build (to make inbound merges easier).

What I *don't* want to have happen is that the causal link between push and breakage is lost. If we can keep that link without running a build on every push, fine; that's up to the regular sheriffs.
(In reply to Richard Newman [:rnewman] from comment #6)
> Ben: those are platforms for which we *target* builds for Firefox, not
> platforms on which we *develop* Firefox.
> 
> Mac + Android toolchain is the latter, not the former.

Ah, okay. I did misunderstand that a bit. However...we still don't have any history of supporting local developer setups (and yes, even though it's used by a lot of people, it is a local setup) on our infrastructure. That's not to say that it couldn't happen, just that it's not a decision that we'd want to make lightly.

The alternative here is for folks that need to compile Android to use a VM or request a second machine to run Linux on.
(In reply to Richard Newman [:rnewman] from comment #8)
> (In reply to Mike Hommey [:glandium] from comment #5)
> > I actually don't think there is a need to run these builds on every single
> > landing. In fact, the last few issues I'm aware of that happened on
> > mac-android builds are entirely related to host tools. 
> 
> I wouldn't call Bug 850089 a host tool problem. Check the backout diff:
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/e8938a43c31a
> 
> C++, one Makefile change.

It actually is, even if it doesn't look like so.

> > - does this really happen so often? I know it happened twice in the past two
> > weeks, but that's circumstantial at best.
> 
> It's been a fairly regular occurrence over the past year. I don't have
> statistics to say how frequent. I'd estimate monthly, but it might be more
> or less because I don't pull and build every changeset.

That seems about the same as my experience building with clang on linux...
(Reporter)

Comment 11

5 years ago
(In reply to Mike Hommey [:glandium] from comment #10)

> It actually is, even if it doesn't look like so.

I'll take your word for it :D


(In reply to Ben Hearsum [:bhearsum] from comment #9)

> Ah, okay. I did misunderstand that a bit. However...we still don't have any
> history of supporting local developer setups (and yes, even though it's used
> by a lot of people, it is a local setup) on our infrastructure. That's not
> to say that it couldn't happen, just that it's not a decision that we'd want
> to make lightly.

Historically we haven't had such a focus on cross-compiling. Now we do. I think that changes the game.


> The alternative here is for folks that need to compile Android to use a VM
> or request a second machine to run Linux on.

I'd rather throw that money at a build configuration… those two options are unpleasant enough that people wouldn't use them, so we'd just end up with a more messy playing field and even less chance of detecting bustage.
How long (roughly) does it take to have a clobber build for it? and dependent build?
Would breaking android-on-mac require a backout or tree closure? Who would be on the hook for fixing it up?
(In reply to Armen Zambrano G. [:armenzg] from comment #12)
> How long (roughly) does it take to have a clobber build for it? and
> dependent build?

Probably faster than a mac universal build, possibly slower than an android linux build on the same hardware, and definitely slower than a mac non-universal opt build (although I don't think we're doing any nowadays).
(Reporter)

Comment 15

5 years ago
(In reply to Armen Zambrano G. [:armenzg] from comment #12)
> How long (roughly) does it take to have a clobber build for it? and
> dependent build?

Takes about 40 minutes for a clobber build on my Mac, which takes 16 minutes for a clobber desktop build.

> Would breaking android-on-mac require a backout or tree closure? Who would
> be on the hook for fixing it up?

Same as any build bustage: backout, and some combination of the author and whoever knows how to fix it.

This is exactly the same as the situation now (see last night's backout), only with a much quicker alerting mechanism than "confused people on IRC wondering if they missed a clobber or if someone broke the build". :D
(Assignee)

Updated

5 years ago
Product: mozilla.org → Release Engineering
We simply don't have the Mac hardware resources to tackle this.
 
In fact, we're desperately trying to get as far out of the Mac hardware game for *building* as we can (bug 921040). If cross-compilation would be acceptable, we could consider that.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → WONTFIX
(In reply to Chris Cooper [:coop] from comment #16)
> If cross-compilation would be acceptable, we could consider that.

FWIW, sadly, the range of bugs that would be caught by android-on-mac builders *needs* a Mac environment. That being said, I don't think there's been a lot of bustage for android-on-mac builds recently, but I could be wrong.
(Reporter)

Comment 18

5 years ago
Things have been pretty stable recently, and I'm certainly aware of the resource limitations at play, so no significant complaint from me.
(Assignee)

Updated

16 days ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations