Closed Bug 1195824 Opened 10 years ago Closed 9 years ago

Automatic backfilling should deal better with perma failures

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

References

(Blocks 1 open bug)

Details

RyanVM: you had some thoughts on this. Would you mind elaborating and helping us understand better?
ni? myself to keep this on my radar
Flags: needinfo?(ryanvm)
I'm not sure what more there is to add here beyond what's covered by the various deps of bug 1180732. I think the main point is we don't want unbounded backfilling on new jobs that fail (and won't have a previously-green run since they're new) and we don't want to bother with backfilling/retrying if a cause has already been identified and backed out. One other interesting observation I made, though, is that automatic backfilling/retriggering doesn't distinguish between visible and hidden jobs. So when a new permafailing job gets turned on (happened recently) or in general if you have a hidden permafailing job, you end up with a huge pile of backfilled jobs that are also failing in addition to auto-retries on them when they fail! I suspect that's a contributing factor to why we had some serious backlog issues last week.
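The conditions described in this comment can be read as a simple gate in front of the backfill logic. The following is only a sketch of that reading, not code from mozci or pulse-actions: the function name and all three parameters are hypothetical, and treating hidden jobs as ineligible is an assumption drawn from the observation above rather than a confirmed policy.

```python
def should_backfill(job_is_hidden, cause_backed_out, has_previous_green):
    """Hypothetical gate: decide whether a failed job is worth backfilling.

    All three inputs are assumptions made for illustration; none of them
    is a real mozci or Treeherder field name.
    """
    if job_is_hidden:
        # Hidden permafailing jobs only generate more failing jobs.
        return False
    if cause_backed_out:
        # The culprit is already known and backed out; nothing left to bisect.
        return False
    if not has_previous_green:
        # New job (or perma failure): no green run bounds the search.
        return False
    return True


print(should_backfill(job_is_hidden=False, cause_backed_out=False, has_previous_green=True))
print(should_backfill(job_is_hidden=True, cause_backed_out=False, has_previous_green=True))
```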
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #2)
> One other interesting observation I made, though, is that automatic
> backfilling/retriggering doesn't distinguish between visible and hidden
> jobs. So when a new permafailing job gets turned on (happened recently) or
> in general if you have a hidden permafailing job, you end up with a huge
> pile of backfilled jobs that are also failing in addition to auto-retries on
> them when they fail! I suspect that's a contributing factor to why we had
> some serious backlog issues last week.

Looks like Armen just filed bug 1197223 for this.
That's right. Thanks for the info!
I'm going to start looking into this.
Assignee: nobody → armenzg
https://github.com/mozilla/mozilla_ci_tools/commit/aeabd6f50e3e0ca97ab7de1ec7d2a81beb700c95

from mozci.mozci import (
    find_backfill_revlist
)

revlist = find_backfill_revlist(
    repo_url='http://hg.mozilla.org/integration/mozilla-inbound',
    buildername='Ubuntu VM 12.04 mozilla-inbound opt test gtest',
    revision='0eee1ce8d43c',
    max_revisions=7,
)
print revlist
[]

revlist = find_backfill_revlist(
    repo_url='http://hg.mozilla.org/integration/mozilla-inbound',
    buildername='Android 4.3 armv7 API 11+ mozilla-inbound opt test plain-reftest-5',
    revision='0eee1ce8d43c',
    max_revisions=7,
)
print revlist
[u'0eee1ce8d43c', u'aa291bcfb0e8', u'afd0786c65f5']
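The two results in this comment (an empty list for the gtest job, three revisions for the reftest job) are consistent with a bounded search for the last green run. The sketch below illustrates that idea only; it is not the mozci implementation. The function name, the `history` structure, and the status strings are all hypothetical, and the sample data is made up apart from the three revisions quoted above. It is written for Python 3, unlike the Python 2 session in the comment.

```python
def sketch_backfill_revlist(history, max_revisions):
    """Return revisions to backfill (newest first), or [] if no green run is in range.

    `history` is a hypothetical list of (revision, status) pairs, newest first,
    where status is "success", "failure", or None (job never ran on that push).
    """
    candidates = []
    for revision, status in history[:max_revisions]:
        if status == "success":
            # Last known good run found: every newer push needs coverage.
            return candidates
        candidates.append(revision)
    # No green run inside the window: assume a perma failure (or a new job)
    # and refuse to backfill instead of scheduling an unbounded pile of jobs.
    return []


# Made-up histories; only the first three revisions come from the comment.
reftest_history = [
    ("0eee1ce8d43c", "failure"),
    ("aa291bcfb0e8", None),
    ("afd0786c65f5", None),
    ("000000000000", "success"),  # placeholder revision
]
gtest_history = [("rev%d" % i, "failure") for i in range(10)]

print(sketch_backfill_revlist(reftest_history, max_revisions=7))
print(sketch_backfill_revlist(gtest_history, max_revisions=7))
```

Returning an empty list when the window is exhausted is the design choice that keeps a permafailing job from triggering backfills on every new push.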
Deployed a bunch of changes. It could affect:
* trigger missing jobs
* trigger talos jobs
* manual backfill

sheriffs, jmaher: Please let me know if you notice anything wonky.
Automatic backfilling is still running in dry-run mode only.
No issues so far. I've enabled automated backfilling.

Here's the first backfill:
> Oct 28 07:20:13 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:58d4fc52_Windows 7 32-bit mozilla-inbound opt test mochitest-2 will backfill [u'58d4fc528b3b', u'2730cc97c6ec', u'9a67e1d55e0d', u'b7dd8bf95c82', u'80f9778bb787'].

Here are a few jobs that were *not* backfilled:
> Oct 28 07:17:08 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:b5acf46a_Ubuntu VM 12.04 x64 mozilla-inbound debug test mochitest-jetpack will not backfill.
> Oct 28 07:19:03 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:53952bbf_Ubuntu VM 12.04 mozilla-inbound debug test gtest will not backfill.
> Oct 28 07:19:11 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:1e9c356a_b2g_emulator_vm mozilla-inbound opt test marionette-webapi will not backfill.
> Oct 28 07:20:28 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:c537a7eb_Ubuntu VM 12.04 mozilla-inbound debug test gtest will not backfill.
> Oct 28 07:20:33 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:1949b1c7_Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound opt test gtest will not backfill.
> Oct 28 07:20:45 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:58d4fc52_Windows 7 32-bit mozilla-inbound debug test gtest will not backfill.
> Oct 28 07:23:40 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:8d655f2a_Ubuntu VM 12.04 mozilla-inbound debug test mochitest-jetpack will not backfill.
> Oct 28 07:24:30 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:2730cc97_Windows 7 32-bit mozilla-inbound debug test gtest will not backfill.
> Oct 28 07:25:24 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:c537a7eb_Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound debug test gtest will not backfill.
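For anyone watching these logs, the BACKFILL-END lines can be summarized with a small parser. This is a sketch that assumes the line format is exactly what appears in the excerpts above; the regular expression and the helper name `parse_backfill_line` are illustrative and are not part of mozci or pulse-actions.

```python
import re

# Pattern inferred from the log excerpts in this comment (an assumption):
# BACKFILL-END:<short rev>_<buildername> will [not ]backfill[ <revlist>].
LINE_RE = re.compile(
    r"BACKFILL-END:(?P<rev>[0-9a-f]+)_(?P<buildername>.+?) "
    r"will (?P<negated>not )?backfill(?: (?P<revlist>\[.*?\]))?\.?\s*$"
)


def parse_backfill_line(line):
    """Return a dict describing one BACKFILL-END line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    # Pull the quoted revisions out of the Python 2 style list, e.g. [u'abc', u'def'].
    revisions = re.findall(r"'([0-9a-f]+)'", match.group("revlist") or "")
    return {
        "rev": match.group("rev"),
        "buildername": match.group("buildername"),
        "backfill": match.group("negated") is None,
        "revisions": revisions,
    }


sample_lines = [
    "Oct 28 07:20:13 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:"
    "58d4fc52_Windows 7 32-bit mozilla-inbound opt test mochitest-2 will backfill "
    "[u'58d4fc528b3b', u'2730cc97c6ec', u'9a67e1d55e0d', u'b7dd8bf95c82', u'80f9778bb787'].",
    "Oct 28 07:19:11 pulse-actions app/worker1.1: mozci INFO: BACKFILL-END:"
    "1e9c356a_b2g_emulator_vm mozilla-inbound opt test marionette-webapi will not backfill.",
]
parsed = [parse_backfill_line(line) for line in sample_lines]
for entry in parsed:
    print(entry["buildername"], "->", entry["revisions"] if entry["backfill"] else "skipped")
```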
Both hidden jobs:
* Ubuntu debug jetpack: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=Ubuntu%20debug%20jetpack&exclusion_profile=false&fromchange=39af5c53fad6
* gtest: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=gtest&fromchange=39af5c53fad6&exclusion_profile=false

To my surprise jetpack is *sometimes* green.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED