Closed
Bug 680457
Opened 14 years ago
Closed 12 years ago
Network hiccups affected OPSI XP slaves
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P5)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: armenzg, Unassigned)
Details
(Whiteboard: [opsi])
Attachments
(1 file: image/png, 12.48 KB)
With all the issues between SCL and SJC we hit this problem.
The XP slaves gradually started hitting issues about 1-2 days ago. The problem took all of these slaves out of the pool:
04,05,08,11,13,14,15,16,17,18,23,26,27,28,29,31,32,33,36,37,41,43,52,54,56,57,59,60,62
The OPSI prompt is the following:
zugriffsverletzung bei Adresse 005C435E in Modul 'winst32.exe'. Lesen von Adresse 00000000
which means:
Access violation at address 005C435E in module 'winst32.exe'. Read of address 00000000
If we had had the buildbot-started check in place, we would have caught this earlier.
Not sure whether OPSI has a way of rebooting the slave upon failure, or at least of not blocking it.
Probably having a second OPSI master in SCL would have helped.
Not sure how much we want to get dragged into this, since these are not normal conditions and we are going to attempt to get rid of OPSI ASAP.
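For illustration only, here is a minimal sketch of the kind of check mentioned above: flag any slave whose last job finished longer ago than some threshold. This is not the actual buildbot-started check. The function name `find_stale_slaves`, the 6-hour threshold, and the sample hostnames and timestamps are all hypothetical, and it assumes the last-job times have already been collected into a dict.

```python
from datetime import datetime, timedelta


def find_stale_slaves(last_job_times, now, max_idle=timedelta(hours=6)):
    """Return sorted names of slaves whose last job ended more than max_idle ago.

    last_job_times: dict mapping slave name -> datetime of last finished job.
    The 6-hour default is an arbitrary, hypothetical threshold.
    """
    return sorted(
        name
        for name, finished in last_job_times.items()
        if now - finished > max_idle
    )


# Hypothetical sample data, not taken from any real report.
last_job_times = {
    "talos-r3-xp-022": datetime(2012, 1, 12, 16, 58, 41),
    "talos-r3-xp-040": datetime(2012, 1, 13, 9, 30, 0),
}
now = datetime(2012, 1, 13, 10, 0, 0)

stale = find_stale_slaves(last_job_times, now)
print(stale)
```

A check like this would only report the stuck slaves; rebooting or unblocking them would still need a separate mechanism.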
Comment 1 • 14 years ago (Reporter)
Comment 2 • 14 years ago
talos-r3-xp-038 too. Clicking OK didn't unstick it, and using ssh to reboot only resulted in the ssh service stopping. I've put it on the reboots list in bug 678883.
talos-r3-xp-09 & talos-r3-xp-039 were running the screensaver, and continued as soon as I connected with VNC and waggled the mouse - this is another OPSI + network glitch type of issue.
talos-r3-xp-025 & talos-r3-xp-053 were sitting with runslave.py still running despite a couple of ^C's. Rebooted them.
talos-r3-xp-045 is a long term issue (bug 661377).
Otherwise http://build.mozilla.org/builds/last-job-per-slave.txt looks clean.
Comment 3 • 14 years ago (Reporter)
talos-r3-xp-022 hit this same issue after finishing a reftest job ending at 2012-01-12 16:58:41.
Clicking "OK" did not do anything. RDP, VNC, and ssh all fail to help.
Comment 4 • 13 years ago
I don't know how much we can do here. I imagine this will remain broken until we move to GPO or invest some effort into puppet on Windows.
Severity: normal → enhancement
Component: Release Engineering → Release Engineering: Platform Support
OS: All → Windows XP
Priority: P4 → P5
QA Contact: release → coop
Whiteboard: [opsi]
Comment 5 • 12 years ago
Kittenherder and GPO will save us here.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Updated • 12 years ago (Assignee)
Product: mozilla.org → Release Engineering
Updated • 7 years ago (Assignee)
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated • 6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard