Open Bug 1861689 Opened 2 years ago Updated 1 year ago

sometimes all chunks for a test suite with a specific config on a platform scheduled, should use fewer chunks instead (= less test suite overhead like task setup)

Categories

(Firefox Build System :: Task Configuration, task)

Tracking

(firefox121 affected)

REOPENED

People

(Reporter: aryx, Unassigned)

Details

Almost every push on autoland gets all web platform tasks for Android 7.0 x86-64 scheduled.

An exception with only a few tasks is this one.

We schedule every task if it is new (its label got added or changed), but this does not apply here.

The task count per push indicates this started on 2023-10-19, though not immediately: a few pushes after that date were still optimized (tasks deemed unnecessary were removed).

The log of a decision task for a push which scheduled all these tasks contains lines like this one:

optimize: test-android-em-7.0-x86_64-lite-qr/opt-geckoview-web-platform-tests-nofis-1 kept because of test (skip-unless-backstop-and-not-skip-unless-backstop-and-skip-unless-push-interval-10.0-and-skip-unless-schedules-or-bugbug-reduced-manifests-fallback-last-10-pushes-or-platform-disperse-or-skip-unless-backstop-and-skip-unless-push-interval-10.0-and-skip-unless-schedules-or-bugbug-reduced-manifests-fallback-low-or-platform-disperse)
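To see how many chunks of each suite were kept in a given push, the decision task log can be scanned for lines like the one above. The following is a minimal sketch: the regular expression is inferred only from that single example line, and the function name and the assumption that a chunk number is a trailing "-<digits>" on the label are illustrative, not part of taskgraph.

```python
import re
from collections import Counter

# Line shape inferred from the single example above (an assumption):
#   optimize: <task label> kept because of <reason text>
KEPT_RE = re.compile(r"optimize: (\S+) kept because of (.+)")

# Assumption: a chunked task label ends in "-<chunk number>".
CHUNK_SUFFIX_RE = re.compile(r"-\d+$")


def count_kept_chunks(log_lines):
    """Count kept tasks per label, with the trailing chunk number stripped."""
    counts = Counter()
    for line in log_lines:
        match = KEPT_RE.search(line)
        if match:
            label = match.group(1)
            counts[CHUNK_SUFFIX_RE.sub("", label)] += 1
    return counts
```

Comparing these counts between a push that scheduled everything and one that was optimized shows which suites kept all of their chunks.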

Marco, can you investigate what is happening here? The tasks sometimes get backlogged because the pool is insufficient for such a load.

Flags: needinfo?(mcastelluccio)

I think we need a new deployment of bugbug. Unfortunately automatic deployments have not been working lately because the task that is supposed to perform them is too slow at pushing images to Heroku and fails.

I did a manual deployment, so we should be good now. Let me know if this still happens.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(mcastelluccio)
Resolution: --- → FIXED

The issue persists: almost every push still gets all web-platform tasks scheduled for Android.

Marco, can you have another look?

Status: RESOLVED → REOPENED
Flags: needinfo?(mcastelluccio)
Resolution: FIXED → ---

I had a look at 6e5782df6da147ae5c916097b12efcfae7149b5c.

Looking at the bugbug-push-schedules.json artifact, I see bugbug did not select any Android web-platform-tests (none of them are in "reduced_tasks" or "tasks").
I also noticed the Android web-platform-tasks are not listed in "known_tasks". This means the scheduler will schedule them as it thinks they are new tasks: https://searchfox.org/mozilla-central/rev/11d085b63cf74b35737d9c036be80434883dd3f6/taskcluster/gecko_taskgraph/optimize/bugbug.py#161.
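The "new task" behaviour described here can be sketched as follows. This is a simplified illustration, not the code in gecko_taskgraph/optimize/bugbug.py: the function name is hypothetical, and it only assumes that "known_tasks", "tasks" and "reduced_tasks" are collections of task labels (iterating a list or a dict both yield labels).

```python
def should_schedule(label, schedules):
    """Return True if a task would be scheduled under the simplified rule.

    Rule sketched from the comment above (an assumption, not the real code):
    a label bugbug has never seen is treated as new and always scheduled;
    a known label is scheduled only if bugbug selected it.
    """
    known = set(schedules.get("known_tasks", []))
    if label not in known:
        # Unknown to bugbug: treated as a new task, so it is kept.
        return True

    selected = set(schedules.get("tasks", [])) | set(schedules.get("reduced_tasks", []))
    return label in selected
```

Under this rule, a suite whose labels are missing from "known_tasks" would be scheduled on every push regardless of what bugbug selected.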

Bugbug builds the "known_tasks" list by using the target-tasks.json file from the decision task. It looks like the target-tasks.json file does not contain the Android web-platform-tasks. Ahal, do you know why they are missing?

Flags: needinfo?(mcastelluccio) → needinfo?(ahal)

I see them in there. It's the nofis variant that's scheduled to run. E.g., search for test-android-em-7.0-x86_64-qr/debug-geckoview-web-platform-tests-nofis in this recent target-tasks.json artifact.

If they weren't in target-tasks.json, then they wouldn't run in the first place (i.e., they wouldn't even make it to the optimization stage). Unless something depends on them...

Flags: needinfo?(ahal)

Also, searching for test-android-em-7.0-x86_64-qr/debug-geckoview-web-platform-tests-nofis in the bugbug-push-schedules.json artifact from the same push shows that it is in known_tasks, but not in the selected tasks, so it must be getting picked for some other reason.

Right, CTRL+F in Firefox doesn't work when you view JSON in the pretty JSON viewer...
I see they are present in target-tasks.json and in bugbug-push-schedules even for the push I had looked at earlier.
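Since searching in the JSON viewer is unreliable, a small script can report which top-level keys of a downloaded artifact mention a label. This is only a sketch: the function name is hypothetical, and it assumes the artifact is a JSON object whose values are lists of strings or objects keyed by strings.

```python
def find_label(schedules, needle):
    """Map each top-level key to the entries under it that contain `needle`."""
    hits = {}
    for key, value in schedules.items():
        if not isinstance(value, (dict, list)):
            continue
        # Iterating a dict yields its keys; iterating a list yields its items.
        matches = sorted(
            entry for entry in value if isinstance(entry, str) and needle in entry
        )
        if matches:
            hits[key] = matches
    return hits


# Usage, with a locally saved copy of the artifact:
#   import json
#   with open("bugbug-push-schedules.json") as f:
#       print(find_label(json.load(f), "geckoview-web-platform-tests-nofis"))
```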

Like you said, they are not selected by bugbug, so in theory they should not run.

Actually, I spent a lot of time looking into this, and I think because we're using the bugbug-reduced-manifest strategy, the tasks and reduced_tasks values are unused. Instead you have to look at the group value. I notice there are quite a lot of high-confidence WPT groups in there. Could it be that every chunk just happens to have one?

But that raises the question: why is it only Android that is getting these manifests? IIRC, the platform-disperse strategy is supposed to spread the manifests out across platforms. Maybe that's where the real bug lies?

Marco, any thoughts?

Flags: needinfo?(mcastelluccio)

There are actually fewer manifests than for pushes which run everything, but the chunk count is the same - this explains ahal's assumptions.

The same behavior has been noticed for Linux debug browser-chrome: both M-swr and M-spi-nw have all chunks.

Summary: Android web-platform tests always scheduled on autoland, should have task optimization like other tasks → sometimes all chunks for a test suite with a specific config on a platform scheduled, should use fewer chunks instead (= less test suite overhead like task setup)

Sorry, I was away last week. Ahal, with chunking in the taskgraph it shouldn't be a problem, right? What you are describing should only happen without chunking in the taskgraph, where the chunks were fixed and we selected a chunk if it had at least one selected manifest. Am I missing something?
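The difference between the two chunking modes can be illustrated with a toy example. This is a sketch under stated assumptions, not the taskgraph implementation: both function names are hypothetical, and the dynamic variant splits by a fixed manifest count per chunk, whereas real chunking may balance by other criteria such as runtime.

```python
def fixed_chunks_selected(chunks, selected):
    """Fixed chunking: keep every chunk holding at least one selected manifest.

    `chunks` is a list of manifest lists; returns 1-based chunk numbers.
    """
    return [
        number
        for number, manifests in enumerate(chunks, start=1)
        if selected.intersection(manifests)
    ]


def dynamic_chunks(selected, manifests_per_chunk):
    """Dynamic chunking: re-chunk only the selected manifests."""
    ordered = sorted(selected)
    return [
        ordered[i : i + manifests_per_chunk]
        for i in range(0, len(ordered), manifests_per_chunk)
    ]


# Three fixed chunks, with one selected manifest in each of them.
chunks = [["a", "b"], ["c", "d"], ["e", "f"]]
selected = {"a", "c", "e"}

print(fixed_chunks_selected(chunks, selected))  # all three chunks are kept
print(len(dynamic_chunks(selected, 2)))         # the same manifests fit in two chunks
```

With fixed chunks, a few selected manifests spread across the suite are enough to keep every chunk, which would match the "fewer manifests, same chunk count" observation. With dynamic chunking the chunk count should shrink along with the manifest count.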

Flags: needinfo?(mcastelluccio) → needinfo?(ahal)

Yeah, that's right. I checked and we should be setting chunks dynamically. So I'm confused about what's going on here :/

Flags: needinfo?(ahal)