Closed Bug 1320431 Opened 9 years ago Closed 9 years ago

Unhide Taskcluster Windows tests when they can go a full week without a widespread infra bustage

Categories

(Tree Management Graveyard :: Visibility Requests, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: philor)

Details

They certainly aren't yet a tier-2 job: yesterday was an all-day "claim, expire, retry, retry, retry, retry, retry, exception", before that was very frequent failure to even upload a log, making it impossible to tell why they were failing, and now today it's "Could not install python package: Z:\task_1480140101\build\venv\Scripts\pip install --no-deps --timeout 120 -r Z:\task_1480140101\build\tests\config\mozbase_requirements.txt --no-index --find-links http://pypi.pub.build.mozilla.org/pub --trusted-host pypi.pub.build.mozilla.org failed after 5 tries!" Enough. You can have them visible again when they've made it a full seven days without mass bustage (a thing which sheriffs will not be looking for: once they are hidden we will not see them at all, for any reason, so you'll have to both watch over them and also say when you've kept them running for seven days).
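For whoever ends up watching the hidden jobs, the seven-day condition above can be checked mechanically. The following is only a minimal sketch: the function name `first_clean_streak_end`, its parameters, and the dates in the usage note are hypothetical, and it assumes someone has already recorded by hand which calendar days had a mass bustage (nothing here queries Treeherder or Taskcluster).

```python
from datetime import date, timedelta


def first_clean_streak_end(start, today, bustage_days, required_days=7):
    """Return the first date on which `required_days` consecutive
    bustage-free days were completed, or None if that has not happened yet.

    `bustage_days` is a hand-maintained collection of dates on which a
    widespread infra bustage was observed (an assumption of this sketch).
    """
    bad = set(bustage_days)
    streak = 0
    day = start
    while day <= today:
        if day in bad:
            # Any mass bustage resets the count to zero.
            streak = 0
        else:
            streak += 1
            if streak >= required_days:
                return day
        day += timedelta(days=1)
    return None
```

With hypothetical bustage recorded on 2016-11-25 and 2016-11-26 and counting from 2016-11-25, `first_clean_streak_end(date(2016, 11, 25), date(2016, 12, 5), [date(2016, 11, 25), date(2016, 11, 26)])` returns `date(2016, 12, 3)`; asked on 2016-12-02 instead, it returns `None`.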
- failure to upload artifacts on task failure: was tracked and fixed in bug 1311966
- claim, expire, retry: is tracked and worked around in bug 1320313 until a patch becomes available for generic-worker
- could not install python package: was a valid test failure. in-tree code must have changed long enough to cause the test failure and then been rolled back or patched, as those tests are succeeding now without intervention from myself. it happened over the weekend; nothing was changed in either the generic worker or the ami creation process, but the issue went away on its own.
buildbot windows slaves have a known issue (bug 1148087) whereby they resolve python dependencies from sources not in-tree. taskcluster win builders intentionally do not allow this. therefore python dependency errors in tc win builders (which are not mirrored by buildbot) are a good indication that a patch has introduced a new bug.
i will not be requesting these builds to be unhidden. i have no desire to see bugs like this one raised again when tc win builds have failed for good or bad reasons. it is not difficult for me to continue my work of migrating win builds and tests to taskcluster with these builds hidden. it's a minor inconvenience to unhide the builds in the treeherder ui. hiding them will make valid failures like the last one mentioned above less transparent to anyone who doesn't habitually unhide hidden builds, but i can see that there has to be some balance between visibility of good failures and invisibility of bad failures, and it's not my place to decide what that balance should be. i respect the sheriffs' decision to bias towards invisibility of bad failures.
The pypi failures started late Friday night Pacific time on all of inbound/central/autoland/aurora, and ended late Saturday night Pacific time on all of them. Could have been a bad AMI, could have been a busted host if pypi.pub.build.mozilla.org is implemented in a way that the TC Linux testers don't hit the same one as the TC Windows testers, could have been something I can't imagine, but it could not have been a bad patch that was backed out.
Unhidden since the cpp moved to tier-3 (the videopuppeteer job didn't, but we're pretty used to ignoring failures in it).
Assignee: nobody → philringnalda
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Tree Management → Tree Management Graveyard