win7 test instances get killed (t-w732-ix)

RESOLVED FIXED

Status

Type: defect
Priority: --
Severity: major
Status: RESOLVED FIXED
Opened: 3 years ago
Updated: Last year

People

Reporter: aryx
Assignee: Unassigned

Details

[23:16:21]	arr: KWierso: something is killing the w7 machines
[23:16:33]	arr: KWierso: my speculation was that it was another bad graphics patch
[23:16:45]	arr: similar to the one philor found killing w7 machines back in feb
[23:17:04]	arr: but I don't have any data to back that up other than the pattern that the machines are going down in
[23:18:01]	nthomas: I just picked three at random (t-w732-ix-038, 39, 43) and they all end on a blue 'Windows 7 32-bit mozilla-beta debug test marionette' or '... marionette-e10s'
[23:18:59]	nthomas: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&filter-searchStr=marionette
[23:19:14]	nthomas: so 9df0ea5a8f8f or 85ccaed0061c ?
[23:19:33]	nthomas: AutomatedTester: ^^
[23:20:52]	arr: nthomas: those bug summaries look similar to the ones that brought down the infra in feb, yeah
[23:21:07]	arr: stuff about window sizes
[23:21:07]	nthomas: AutomatedTester: looks like one of the two marionette changes that landed on beta is faulty, and is taking out the pool of win7 h/w slaves
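The spot check described above (pick a few dead machines and compare the last job each one ran) can be sketched as a small tally. This is only an illustration: the `last_jobs` mapping is hypothetical sample data shaped like the job names quoted in the log, which host ran which variant is made up, and the helper names `suite_family` and `common_last_suite` are not part of any Mozilla tooling.

```python
from collections import Counter

# Hypothetical sample data: the last job each dead machine ran before going down.
# The host names are the three mentioned in the log; the assignment of job
# variants to hosts is invented for illustration.
last_jobs = {
    "t-w732-ix-038": "Windows 7 32-bit mozilla-beta debug test marionette",
    "t-w732-ix-039": "Windows 7 32-bit mozilla-beta debug test marionette-e10s",
    "t-w732-ix-043": "Windows 7 32-bit mozilla-beta debug test marionette",
}


def suite_family(job_name):
    """Return the suite name (last word of the job name) without an '-e10s' suffix."""
    suite = job_name.split()[-1]
    suffix = "-e10s"
    if suite.endswith(suffix):
        suite = suite[: -len(suffix)]
    return suite


def common_last_suite(jobs_by_host):
    """Return (suite, count) for the suite that most often appears as a host's last job."""
    counts = Counter(suite_family(job) for job in jobs_by_host.values())
    return counts.most_common(1)[0]


if __name__ == "__main__":
    suite, count = common_last_suite(last_jobs)
    print(f"{count} of {len(last_jobs)} sampled hosts last ran a {suite} job")
```

If most sampled hosts share the same final suite, as here, that suite's recent changes are the natural place to look first.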
This looks similar to the issue we had in bug 1248347.
Looks like fallout from bug 1280101. Whatever it changed seems to trigger a problem only on beta, as aurora/central/inbound seem fine.
Blocks: 1280101
Summary: windows test instances get killed → win7 test instances get killed (t-w732-ix)
rebooted:

t-w732-ix-001
t-w732-ix-006
t-w732-ix-007
t-w732-ix-009
t-w732-ix-018
t-w732-ix-019
t-w732-ix-021
t-w732-ix-023
t-w732-ix-036
t-w732-ix-038
t-w732-ix-039
t-w732-ix-047
t-w732-ix-052
t-w732-ix-054
t-w732-ix-060
t-w732-ix-064
t-w732-ix-068
t-w732-ix-076
t-w732-ix-079
t-w732-ix-100
t-w732-ix-102
t-w732-ix-110
t-w732-ix-114
t-w732-ix-120
t-w732-ix-121
t-w732-ix-122
t-w732-ix-126
t-w732-ix-130
t-w732-ix-134
t-w732-ix-137
t-w732-ix-143
t-w732-ix-152
t-w732-ix-160
t-w732-ix-161
t-w732-ix-162
t-w732-ix-167
t-w732-ix-171
t-w732-ix-172
t-w732-ix-175
t-w732-ix-180
t-w732-ix-191
t-w732-ix-192
t-w732-ix-195
t-w732-ix-198
t-w732-ix-200
t-w732-ix-205
t-w732-ix-206
t-w732-ix-213
t-w732-ix-218
t-w732-ix-221
t-w732-ix-225
t-w732-ix-229
t-w732-ix-231
t-w732-ix-234
t-w732-ix-245
t-w732-ix-262
t-w732-ix-266
t-w732-ix-268
t-w732-ix-273
t-w732-ix-280
At this point we're waiting for builds and tests to confirm a single backout is sufficient. Lowering severity.
Severity: blocker → major
I've gone ahead and reopened trees, as things seem to have improved greatly over the last hour or two.
The debug marionette tests are green on 2b22de79c849, so it seems likely opt will be OK too. Let's resolve this FIXED.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Component: General Automation → General