Closed Bug 1638485 Opened 5 years ago Closed 5 years ago

mach try auto runs xpcshell tests and android tests for browser-window-only changes (and is confused about artifact vs. non-artifact builds)

Categories

(Firefox Build System :: Task Configuration, defect)

defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: Gijs, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=19801dd00c8114ddc93f7ef9bf7422c222f2ef3b

I changed:

browser/base/content/browser-allTabsMenu.inc.xhtml
browser/base/content/browser-allTabsMenu.js
browser/base/content/browser.js
browser/components/originattributes/test/browser/browser_favicon_firstParty.js
browser/components/originattributes/test/browser/browser_favicon_userContextId.js

and it ran tasks as in the attachment.

The xpcshell runs were not necessary, and neither were the android runs. It could probably do without running arm or asan, and it seems confused about opt/shippable artifact builds, as it ran both, and in some cases seems to have run tests twice - though I'm pretty sure the "opt" artifact builds get shippable artifacts, and will thus be the same running code as the "full" pgo builds.

Thanks for the feedback!

Currently the ML algorithm doesn't select platforms at all (though this is planned for the future). We run a naive algorithm to prune platforms down after the fact.

We should put a SCHEDULES rule in moz.build saying that if a push only modified files under /browser, don't run Android platforms (and conversely for /mobile). However, before doing this we'll need to adjust the optimization such that tasks selected for SCHEDULES are still candidates for other optimizations afterwards.

Component: Try → Task Configuration

Note there will always be some false positives, the important thing is that they are not many and that most regressions are caught. Please keep notifying us anyway, as we can possibly optimize things.

There was only one Android task selected (test-android-em-7.0-x86_64/debug-geckoview-junit-e10s, which in turn made build-android-x86_64/debug be scheduled), the other Android ones were scheduled for other reasons independent of the ML (I wonder why we scheduled the Android TV tasks, :ahal do you know?). This one selection is a bit weird, I'll investigate.

The xpcshell tasks are a bit more plausible (e.g. there are probably many patches where browser/base/content/browser.js is modified with other files that could cause xpcshell failures, and I have no way to tell whether a given "historical" failure was due to a given file in a patch rather than another).

Regarding asan, it prefers non-asan over asan when possible.

Which ones are the arm tasks? The Windows Aarch ones? I'm not sure why these were scheduled, they shouldn't have (the ML didn't select any aarch task). :ahal can you check?

Regarding the opt/shippable artifact builds, currently we are selecting tasks from both; in the near future we will always choose "opt" for try and "shippable" for autoland.

(In reply to Andrew Halberstadt [:ahal] from comment #1)

Currently the ML algorithm doesn't select platforms at all (though this is planned for the future). We run a naive algorithm to prune platforms down after the fact.

The bugbug_reduced kind of does, although not in a super-smart way (I mean, it doesn't try to minimize the number of selected platforms, it just prefers some platforms over others).

We should put a SCHEDULES rule in moz.build saying that if a push only modified files under /browser, don't run Android platforms (and conversely for /mobile). However, before doing this we'll need to adjust the optimization such that tasks selected for SCHEDULES are still candidates for other optimizations afterwards.

This would be something like what we were discussing the other day: a rule which specifies tasks that should not run unless given paths are modified, but that can still be optimized away by the optimizer.

Flags: needinfo?(ahal)

(In reply to Marco Castelluccio [:marco] from comment #2)

This would be something like what we were discussing the other day: a rule which specifies tasks that should not run unless given paths are modified, but that can still be optimized away by the optimizer.

Come to think of it, that is the current behaviour already. Any(skip-unless-schedules, bugbug) means remove if either a SCHEDULES rule says to or bugbug says to.

I always get that negative mixed up and forget that Any means if any scheduler says not to schedule.

(In reply to Marco Castelluccio [:marco] from comment #2)

The xpcshell tasks are a bit more plausible (e.g. there are probably many patches where browser/base/content/browser.js is modified with other files that could cause xpcshell failures, and I have no way to tell whether a given "historical" failure was due to a given file in a patch rather than another).

No, browser.js (and, I think, any file in browser/base/content/) is not loaded in xpcshell tests, so it shouldn't be able to cause any xpcshell failures. You could probably push a removal of the entire directory to try and run xcpshell tests, and I doubt they'd so much as blink (assuming you didn't break packaging, which would break the builds, which we'd need for other tests, so we'd notice without the xpcshell runs).

There are browser/components/, browser/modules and browser/extensions subdirs that have xpcshell tests, but there aren't any for browser/base or browser/themes - they only concern the browser frontend.

(In reply to :Gijs (he/him) from comment #4)

(In reply to Marco Castelluccio [:marco] from comment #2)

The xpcshell tasks are a bit more plausible (e.g. there are probably many patches where browser/base/content/browser.js is modified with other files that could cause xpcshell failures, and I have no way to tell whether a given "historical" failure was due to a given file in a patch rather than another).

No, browser.js (and, I think, any file in browser/base/content/) is not loaded in xpcshell tests, so it shouldn't be able to cause any xpcshell failures. You could probably push a removal of the entire directory to try and run xcpshell tests, and I doubt they'd so much as blink (assuming you didn't break packaging, which would break the builds, which we'd need for other tests, so we'd notice without the xpcshell runs).

There are browser/components/, browser/modules and browser/extensions subdirs that have xpcshell tests, but there aren't any for browser/base or browser/themes - they only concern the browser frontend.

Sorry, I wasn't clear. I'm saying it's likely that there are previous commits where browser/base/content was modified together with other files that could cause xpcshell failures. We can't tell easily tell (for now) which files in those commits were responsible for the failures, so we can't rule out it was browser/base/content (we can with our human comprehension, but a machine can't). These kinds of false positives will be reduced when we'll start using code coverage data to filter out selected tests.

Which ones are the arm tasks? The Windows Aarch ones? I'm not sure why these were scheduled, they shouldn't have (the ML didn't select any aarch task). :ahal can you check?

Here are the dependent tasks:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dKSj47r1QGmIKZJJyFjEIQ/dependents

So looks like just the "build-signing" (which I thought wasn't supposed to cause a build to be scheduled) and "upload-generated-sources". I'm not sure when the latter runs or what even it really does. We should investigate.

Flags: needinfo?(ahal)

Ah, those aarch builds specify their own optimization:
https://searchfox.org/mozilla-central/source/taskcluster/ci/build/windows.yml#1265

So that means they wouldn't use our "always" strategy with |mach try auto|. Though, this try push isn't a multiple of 10, so they still should have been optimized away. In other words, I still don't know why they ran but found something else we'll need to fix. Filed bug 1639289 for it.

See Also: → 1639324

(In reply to Andrew Halberstadt [:ahal] from comment #3)

(In reply to Marco Castelluccio [:marco] from comment #2)

This would be something like what we were discussing the other day: a rule which specifies tasks that should not run unless given paths are modified, but that can still be optimized away by the optimizer.

Come to think of it, that is the current behaviour already. Any(skip-unless-schedules, bugbug) means remove if either a SCHEDULES rule says to or bugbug says to.

I always get that negative mixed up and forget that Any means if any scheduler says not to schedule.

Not for test-inclusive though, right? If we add a rule such as "run android only when /mobile" is modified, we'd run it whenever "/mobile" is modified, no matter what the smart scheduler says.
What if we drop "test-inclusive" altogether, and just use "test" everywhere? So those "inclusive" tests would still run only when specific paths are touched, but not every time those paths are touched.

There's still a use case for test-inclusive if the task is relatively short running and is impacted by relatively few files.. Python unittests come to mind, though those aren't in the test kind. I think it should be a case by case basis,though certainly there are tasks that can move over.

I filed bug 1643811 to add the SCHEDULES rule for /browser.

Since Andrew fixed most of the issues with builds in bug 1641131, and we have bugs on file for most of the other identified issues, let's close this.
If you keep using "mach try auto" (I hope you do!), please keep giving us feedback!

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
See Also: → 1650043
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: