Closed
Bug 492033
Opened 16 years ago
Closed 15 years ago
Random disconnects when TCP SACK is enabled
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Assigned: dmoore)
Details
(Whiteboard: 07/22/2010)
If I'm connected to the MPT VPN, and I ssh to one of the machines in the build network, e.g. moz2-linux-slave02.build.mozilla.org, I'll occasionally get randomly disconnected from the machine.
Something is actually sending me TCP RST packets:
13:51:45.781316 IP moz2-linux-slave02.build.mozilla.org.ssh > 10.2.21.38.51851: Flags [R.], seq 2701, ack 2657, win 79, options [nop,nop,TS val 6684024 ecr 4245624108,nop,nop,sack 1 {2653630087:2653630183}], length 0
13:51:46.601533 IP moz2-linux-slave02.build.mozilla.org.ssh > 10.2.21.38.51851: Flags [R.], seq 2701:2749, ack 2657, win 79, options [nop,nop,TS val 6684241 ecr 4245624108], length 48
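For anyone trying to reproduce this, a minimal Python sketch of how tcpdump text output like the two lines above could be screened for resets. The function name `summarize_segment` and both regular expressions are illustrative assumptions based only on the line format shown in this report; they are not part of tcpdump itself.

```python
import re

# Matches the "Flags [...]" field that tcpdump prints for TCP segments.
FLAGS_RE = re.compile(r"Flags \[([^\]]*)\]")
# Matches a SACK block option such as "sack 1 {2653630087:2653630183}".
# It deliberately does not match "sackOK" (the SACK-permitted option).
SACK_RE = re.compile(r"sack \d+ \{[^}]*\}")


def summarize_segment(line):
    """Return (is_reset, carries_sack) for one tcpdump line, or None.

    None is returned when the line has no "Flags [...]" field, i.e. it
    does not look like a TCP segment in tcpdump's text format.
    """
    match = FLAGS_RE.search(line)
    if match is None:
        return None
    is_reset = "R" in match.group(1)  # "R" marks a segment with RST set
    carries_sack = SACK_RE.search(line) is not None
    return is_reset, carries_sack
```

Feeding each captured line through `summarize_segment` and keeping only those where the first element is true would list the RST segments, and the second element shows whether that segment also carried a SACK block.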
Disabling SACK on my machine (echo 0 > /proc/sys/net/ipv4/tcp_sack) seems to fix the issue, but I'd rather not have to do that.
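The same local toggle can be scripted. This is a minimal Python sketch that assumes a Linux host and root privileges for writing; the helper names `sack_enabled` and `set_sack` are hypothetical, and only the `/proc/sys/net/ipv4/tcp_sack` path comes from the command above.

```python
from pathlib import Path

# Path taken from the echo command in this report (Linux only).
TCP_SACK_PATH = Path("/proc/sys/net/ipv4/tcp_sack")


def sack_enabled(path=TCP_SACK_PATH):
    """Return True if the sysctl file holds anything other than "0"."""
    return Path(path).read_text().strip() != "0"


def set_sack(enabled, path=TCP_SACK_PATH):
    """Write "1" or "0" to the sysctl file (requires root on a real host)."""
    Path(path).write_text("1\n" if enabled else "0\n")
```

Calling `set_sack(False)` mirrors `echo 0 > /proc/sys/net/ipv4/tcp_sack`, and `set_sack(True)` restores the setting. The `path` parameter exists only so the helpers can be exercised against an ordinary file.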
Also, ssh'ing to mpt-vpn.mozilla.com first, and then ssh'ing from there to the desired machine, seems to avoid the issue.
Updated•16 years ago
Assignee: server-ops → dmoore
Assignee
Comment 1•16 years ago
This bug is due to the lack of TCP SACK support in our Cisco firewall software.
When we first encountered it, the status would have been WONTFIX. Cisco had not released a workaround at that time. They have recently released a software upgrade, however, which disables TCP SACK negotiation during the TCP handshake. The end result is the same as disabling SACK locally.
IT will have a discussion later today to determine a schedule for upgrading our firewalls.
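To illustrate what "disabling SACK negotiation during the handshake" can mean in general terms, here is a hedged Python sketch that overwrites the SACK-permitted option (TCP option kind 4) in a SYN's option bytes with NOPs, so neither endpoint agrees to use SACK. This is an assumption about the general technique, not Cisco's actual implementation, and `strip_sack_permitted` is a hypothetical name.

```python
EOL = 0             # end of option list
NOP = 1             # one-byte padding option
SACK_PERMITTED = 4  # two-byte option sent in SYN segments


def strip_sack_permitted(options: bytes) -> bytes:
    """Return the TCP option bytes with any SACK-permitted option NOP'd out.

    The total length is preserved, so the TCP data offset stays valid.
    A real middlebox would also have to recompute the TCP checksum.
    """
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == EOL:
            break
        if kind == NOP:
            i += 1
            continue
        if i + 1 >= len(out):
            break  # truncated option: leave the rest untouched
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break  # malformed length: leave the rest untouched
        if kind == SACK_PERMITTED:
            out[i:i + length] = bytes([NOP]) * length
        i += length
    return bytes(out)
```

Replacing the option with padding rather than deleting it keeps the header size unchanged. The effect for the endpoints matches what the comment above describes: the connection behaves as if SACK had been disabled locally.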
Assignee
Comment 2•16 years ago
Tentatively scheduled for the evening of 05/12
Updated•16 years ago
Flags: needs-downtime+
Whiteboard: 05/12 @ 7pm
Updated•16 years ago
Group: infra
Assignee
Comment 3•16 years ago
Running on the new software version now.
Secondary firewall upgrade scheduled for 05/14.
Whiteboard: 05/12 @ 7pm → 05/14 @ 7pm
Assignee
Comment 4•16 years ago
Firewall upgrade with SACK workaround is complete
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Reporter
Comment 5•16 years ago
I've been hitting this again lately. I haven't tried tcpdump to determine if I'm getting RST packets, but disabling SACK locally does seem to fix the problem.
Reporter
Comment 6•15 years ago
I'm still hitting this when SACK is enabled locally:
08:12:02.519529 IP production-master.build.mozilla.org.ssh > 10.2.21.86.53291: Flags [R.], seq 725:749, ack 825, win 57, options [nop,nop,TS val 4631086 ecr 2744937268], length 24
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 7•15 years ago
Fill me in here - we did a firewall upgrade to fix this but you're saying it's not fixed?
Assignee: dmoore → mrz
Reporter
Comment 8•15 years ago
Yes, I'm still getting random disconnects from various machines. I just managed to capture this when trying to ssh to production-master:
15:05:48.768830 IP production-master.build.mozilla.org.ssh > 10.2.21.38.60501: Flags [R.], seq 1837:1901, ack 1793, win 79, options [nop,nop,TS val 55677417 ecr 548682849], length 64
Updated•15 years ago
Assignee: mrz → dmoore
Assignee
Comment 9•15 years ago
We've confirmed that the patch for this problem actually regressed in a subsequent firewall firmware upgrade. We're investigating our current options.
Updated•15 years ago
Flags: needs-downtime+
Whiteboard: 05/14 @ 7pm → [blocked cisco]
Assignee
Comment 10•15 years ago
A subsequent upgrade has been made available to us from Cisco. We'll schedule an appropriate downtime window shortly.
Updated•15 years ago
Flags: needs-downtime+
Whiteboard: [blocked cisco] → 05/18/2010 @ 7pm
Updated•15 years ago
Whiteboard: 05/18/2010 @ 7pm
Updated•15 years ago
Whiteboard: [needs to be scheduled]
Updated•15 years ago
Whiteboard: [needs to be scheduled] → 07/20/2010
Updated•15 years ago
Whiteboard: 07/20/2010 → 07/22/2010
Assignee
Comment 11•15 years ago
FWSM upgrade was completed tonight, and (once again) we've enabled the TCP SACK workaround.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 15 years ago
Resolution: --- → FIXED
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard