HTTP Connections drain socket input before close to avoid RST

RESOLVED FIXED in mozilla19

Status

()

Core
Networking: HTTP
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: mcmanus, Assigned: mcmanus)

Tracking

unspecified
mozilla19
x86_64
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

5 years ago
If you close a socket with data pending on it in the kernel (but unconsumed by necko), a rst is generated to the peer.

that's a problem when the peer gets that rst and still has unconsumed data in its kernel buffers - that data is typically dumped.

Consider a spdy session that wants to finish up because 

0] we read a GOAWAY from the peer.

1] we generates our own GOAWAY frame with information interesting to the other server (last good id and reason code)

2] it sends that

3] it closes the socket

But that [0] GOAWAY was probably followed by a SSL Application Alert about the SSL session being shutdown in one direction. This data remains on the socket buffer when [3] happens and a RST is immediately generated.

And the race is on. If the RST gets received by the server's kernel before the userspace application on the server consumes the GOAWAY then the GOAWAY information is lost :(

This problem isn't limited to SPDY (as that SSL Alert message occurs in HTTPS too), but the synchronous nature of HTTP/1 makes it less likely that the unorderly shutdowns would impact anything outside of debugging.

I'm going to propose to work around this at the HTTP layer, not the transport layer. My patch does this just by a non blocking short read of the socket we're trying to close to pump any data (within reason) up from the kernel. The HTTP fix is kind of just a 99% fix (leaving the possibility of a race condition still alive while the socket transport layer could handle this with a real lingering close), but I'm not convinced that this is the right behavior for any arbitrary protocol to silently absorb.
(Assignee)

Updated

5 years ago
Assignee: nobody → mcmanus
(Assignee)

Comment 1

5 years ago
Created attachment 679773 [details] [diff] [review]
patch 0
Attachment #679773 - Flags: review?(cbiesinger)
Proper use of shutdown() in Necko would be nicer IMO.
(Assignee)

Comment 3

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #2)
> Proper use of shutdown() in Necko would be nicer IMO.

of course, but I already addressed this obliquely. PSM doesn't even hook shutdown and http shouldn't be dealing with timers et al to be sure this is done, and its not clear this is a cross protocol requirement. It gets complicated quickly.

we need to take the easy win in stuff like this and move on without feeling guilty about it.
Comment on attachment 679773 [details] [diff] [review]
patch 0

Why the total < 8000 check?

I'd move at least the SetEventSink call before the read, because there is no point in generating the progress callbacks here.
Attachment #679773 - Flags: review?(cbiesinger) → review+
(Assignee)

Comment 5

5 years ago
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #4)
> Comment on attachment 679773 [details] [diff] [review]
> patch 0
> 
> Why the total < 8000 check?
> 

The kernel is executing in parallel, so its conceivable this would never complete if something could fill the buffer as fast as we can read from it.. if you assume the kernel will get a higher priority than userspace on a really stressed system it starts to sound plausible enough to at least put a limit of some kind in there.. I made it 64K in my update.
(Assignee)

Comment 6

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3fa8d88ab2d

Comment 7

5 years ago
https://hg.mozilla.org/mozilla-central/rev/f3fa8d88ab2d
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
You need to log in before you can comment on or make changes to this bug.