1617107 - Taskcluster task failing silently (bustage on base build prevents other pgo builds and tests from running)

I notice that the Android 4.0 API16+ pgo / run tasks in that range are marked as busted, and as fixed by commit https://hg.mozilla.org/integration/autoland/rev/c7172b32a80d6ec2be386c60d48f42b4dbbcf5a6. I don't see any Android 5.0 AArch64 pgo builds in that range, so I wonder whether the Android 5.0 AArch64 pgo builds depend on the Android 4.0 API16+ pgo builds. :mshal might be able to tell us...

Flags: needinfo?(gbrown) → needinfo?(mshal)

Michael Shal [:mshal]

Comment 3

•

5 years ago

Yeah, all Android PGO builds use the profile data from the android-api-16 build and run task. This comes from the 'use-pgo' attribute in the task definition:

https://searchfox.org/mozilla-central/rev/c1e3d3edd4a9b784971555dc74a5de23d768b2e1/taskcluster/ci/build/android.yml#232

So it will be difficult to get an Android 5.0 AArch64 PGO build in that range because the run task to generate the profile data had a high failure rate until the patch mentioned in #c2 was backed out.
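To make the shared dependency concrete, here is a minimal sketch of how builds could be grouped by the profile source named in their 'use-pgo' attribute. It is an illustration only: the function name `group_by_profile_source`, the dict layout, and the rule that `True` means "use the task's own profile" are assumptions for this example, not the actual taskgraph transform.

```python
def group_by_profile_source(tasks):
    """Group task labels by the profile source their 'use-pgo' value names.

    `tasks` maps a task label to a dict of its attributes (a hypothetical,
    simplified stand-in for the entries in android.yml).
    Assumption: `use-pgo: True` means the task uses its own profile data,
    while a string names another task whose profile data is reused.
    """
    groups = {}
    for label, definition in tasks.items():
        use_pgo = definition.get("use-pgo")
        if not use_pgo:
            # Not a PGO build, so it does not depend on any profile data.
            continue
        source = label if use_pgo is True else use_pgo
        groups.setdefault(source, []).append(label)
    # Sort the labels so the result is stable regardless of input order.
    return {source: sorted(labels) for source, labels in groups.items()}
```

If every Android PGO build ends up in a single group, a failure in that one profile source blocks all of them, which matches what is seen in this range.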

Flags: needinfo?(mshal)

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

5 years ago

Product: Taskcluster → Firefox Build System

Chris Manchester (limited bugmail, email directly)

Updated

•

5 years ago

Priority: -- → P3

Michael Shal [:mshal]

Comment 4

•

5 years ago

Should we close this as WONTFIX? If it is just failing within the regression range, I don't really see what we can do here.

Alexandru Ionescu (needinfo me) [:alexandrui]

Reporter

Comment 5

•

5 years ago

Well, this is preventing the sheriffs from identifying the culprits for some alerts (admittedly this doesn't happen often), but if there's nothing you can do here, you can close this.
It is probably worth telling us (the sheriffs) how to identify this, so we avoid spending too much time backfilling jobs like this. Is it happening on Android 5.0 AArch64 PGO only? I see 4 task definitions in android.yml containing use-pgo.

Flags: needinfo?(mshal)

Geoff Brown [:gbrown]

Updated

•

5 years ago

Summary: Taskcluster task failing silently → Taskcluster task failing silently (bustage on base build prevents other pgo builds and tests from running)

Michael Shal [:mshal]

Comment 6

•

5 years ago

I don't think I have the answer to that unfortunately. :tomprince, is there a better way to get feedback from taskcluster on what's happening here? If I understand correctly, I think the fundamental issue is that it can be hard to tell why retriggering a job isn't working (an android test in this case) if one of its dependencies earlier in the taskgraph is the one with the actual failure (android PGO profile generation here). Or is this something that would need to be solved in treeherder?

Flags: needinfo?(mshal) → needinfo?(mozilla)

Tom Prince [:tomprince]

Comment 7

•

5 years ago

We could perhaps error out if we find that any of the task dependencies used in backfilling have already failed. I'm not sure if that would be better or worse than the current situation, in the case that some pushes have broken jobs and others don't. This could maybe be improved as part of Bug 1585757.
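A minimal sketch of that check, under stated assumptions: the dependency graph and task states are passed in as plain dicts, and the helper names `find_failed_dependencies` and `check_backfill` are hypothetical. It does not call any Taskcluster or Treeherder API; it only shows the idea of walking a task's dependencies and refusing to backfill when one of them is already in the "failed" or "exception" state.

```python
from collections import deque

# Taskcluster task states that mean a dependency can never be satisfied.
FAILED_STATES = {"failed", "exception"}


def find_failed_dependencies(task_label, dependencies, states):
    """Return the sorted labels of all transitive dependencies that failed.

    `dependencies` maps a task label to the labels it depends on.
    `states` maps a task label to its current state string.
    Both are hypothetical inputs for this sketch.
    """
    seen = set()
    failed = []
    queue = deque(dependencies.get(task_label, ()))
    while queue:
        label = queue.popleft()
        if label in seen:
            continue
        seen.add(label)
        if states.get(label) in FAILED_STATES:
            failed.append(label)
        # Keep walking: the real failure may be several levels upstream.
        queue.extend(dependencies.get(label, ()))
    return sorted(failed)


def check_backfill(task_label, dependencies, states):
    """Raise instead of silently scheduling a task that can never run."""
    failed = find_failed_dependencies(task_label, dependencies, states)
    if failed:
        raise RuntimeError(
            "cannot backfill %s: failed dependencies: %s"
            % (task_label, ", ".join(failed))
        )
```

With a check like this, retriggering an Android test whose profile-generation task is busted would report the upstream task by name rather than leaving the job unscheduled with no explanation. Whether to raise or only warn is a design choice, since some pushes in a backfill range may be healthy while others are not.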

Component: General → Task Configuration

Flags: needinfo?(mozilla)

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Categories

(Firefox Build System :: Task Configuration, defect, P3)

Tracking

(Not tracked)

People

(Reporter: alexandrui, Unassigned)

Security

(public)
