Closed Bug 600523 Opened 15 years ago Closed 15 years ago

w32-ix-slave34,36 having connectivity issues

Categories

(mozilla.org Graveyard :: Server Operations, task)

All
Other
task
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: jabba)

References

Details

(Whiteboard: [machines recovered, needs to be racked at scl])

Attachments

(1 file)

We've had it disconnect over a dozen times since hooking it up yesterday. It usually gets disconnected while running hg. Current IP address: 10.12.48.60.
Blocks: 588950
http://buildbot-master1.build.mozilla.org:8010/buildslaves/w32-ix-slave34?numbuilds=100 Anything that says 'retry' or 'exception' is most likely a disconnect.
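As a rough illustration of the rule of thumb above, here is a minimal sketch that tallies build results mentioning 'retry' or 'exception'. The function name, the marker list, and the sample result strings are all hypothetical; the sketch does not reflect the actual buildbot page format or any buildbot API.

```python
from collections import Counter

# Assumption: the two result words the comment above treats as likely disconnects.
DISCONNECT_MARKERS = ("retry", "exception")


def count_likely_disconnects(result_lines):
    """Count result lines that mention 'retry' or 'exception' (case-insensitive).

    Each line is counted at most once, under the first marker it matches.
    """
    counts = Counter()
    for line in result_lines:
        lowered = line.lower()
        for marker in DISCONNECT_MARKERS:
            if marker in lowered:
                counts[marker] += 1
                break
    return counts


if __name__ == "__main__":
    # Hypothetical result strings, not copied from the real builder page.
    sample = [
        "build successful",
        "exception slave lost",
        "retry",
        "failed compile",
    ]
    counts = count_likely_disconnects(sample)
    print(sum(counts.values()), "likely disconnects out of", len(sample), "builds")
```

A count like this only suggests disconnects; a 'retry' or 'exception' result can have other causes, so it would still need to be checked against the slave logs.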
Assignee: server-ops → jlazaro
w32-ix-slave36 is also having problems.
Summary: w32-ix-slave34 having connectivity issues → w32-ix-slave34,36 having connectivity issues
Attached image screenshot
I managed to catch this when I logged into slave 36. Notice the balloon popup on the bottom right.
We're going to swap network cables and switch ports on our next trip to Internap in an attempt to isolate (what is likely) a hardware problem.
Changed the network cables and switch port. Let's see if there's any improvement.
w32-ix-slave34 just hit problems at Fri Oct 1 03:05:17 2010. Did you reboot these machines? They shouldn't have reconnected to the master.
I did reboot them, my apologies. What kind of activity are you seeing now?
(In reply to comment #7)
> I did reboot them, my apologies. What kind of activity are you seeing now?

slave 34 hit a few disconnects last night, and I've shut down buildbot for now. Are you seeing anything on the switch?
Will file a bug with IX since this might be hardware-related
(In reply to comment #9)
> Will file a bug with IX since this might be hardware-related

Any update here?
Contacted IX support through email to get this looked at, as well as the other machines experiencing various issues
w32-ix-slave34 and w32-ix-slave36 were taken by Chris Williams of IX Systems to investigate the network connectivity issues. Will report back when I receive an update from IX.
Assignee: jlazaro → server-ops
Assignee: server-ops → jlazaro
Updating the bug here from the email:

Asset #4700
- Reported issue: NIC link resets/speed changes in Windows
- Issue not observed on our test network.
- Event log has record of link changes, indicating NIC is downgrading to 100Mbit due to link integrity issues.
- Suggest installing current Intel NIC driver from ftp://ftp.supermicro.nl/driver/LAN/Intel/PRO_v15.5.zip
- Admin credentials were not provided so we cannot test driver or run additional diagnostics.
Given that they haven't seen the issue on their network, are we sure it's not faulty hardware or cabling at SCL?
Comment #4 suggests that was tried.
Whiteboard: [machines recovered, needs to be racked at scl]
Assignee: jlazaro → jdow
Both machines are back online at Internap.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard