Closed Bug 87817 Opened 23 years ago Closed 23 years ago

violation of HTTP 1.* standard when sending POST requests

Categories

(Core :: DOM: Core & HTML, defect, P3)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9.4

People

(Reporter: mbykova, Assigned: darin.moz)

References

()

Details

(Whiteboard: [PDT+] [ETA ready now] [fixed-on-trunk since sept 12])

Attachments

(3 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.4.5 i586; en-US; rv:0.9.1) Gecko/20010608
BuildID:    2001060810

I'm working on a web server and determined that Mozilla (as well as some other
browers) sends an extra CRLF (carriage return (13) and linefeed (10)) characters
after message-body in POST requests

The HTTP 1.1 RFC (#2068) states that nothing can follow the message-body of an
HTTP request (syntax of a generic message is on page 30 of the RFC).

RFC 2068, page 31:     
Note: certain buggy HTTP/1.0 client implementations generate an extra CRLF's
after a POST request. To restate what is explicitly forbidden by the BNF, an
HTTP/1.1 client must not preface or follow a request with an extra CRLF.

I could provide a byte-to-byte printout of a request that Mozilla sends showing
the mentioned characters, but I don't think it's necessary.

Reproducible: Always
Steps to Reproduce:
1.Go to http://x500.ohiou.edu:9999
2.Go to "Advanced search"
3.Enter "bykova" in the last name field
4.Submit the form
5.View the result 


Actual Results:  The web search engine reads only the number of characters
specified in the Content-Length field of a request and ignores all other
characters. Mozilla tries to send more characters and gets confused when it sees
the response. As a result, a part of the request and all of the response
(including the HTTP header) are dumped on the page. 

Expected Results:  The results of such POST requests should be the same as the
results of similar GET requests. Simple search on this web site uses the GET
method. Enter "bykova" in the text box on the main page
"http://x500.ohiou.edu:9999", submit, and see what the results should look like.

Altenatively, you can use Netscape and see how their results of advanced search
look. Netscape's resulting pages are correct.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Target Milestone: --- → mozilla0.9.3
-> pollmann
Assignee: neeti → pollmann
Component: Networking: HTTP → Form Submission
seeing similar effects on a number sites, not sure if they are caused by the
same bug.
I think I saw the code that did this with a comment that certain web 
servers/applications required that extra CRLF. But that is only a faint memory.
From the Apache docs,

  Trailing CRLF on POSTs

   This is a legacy issue. The CERN webserver required POST data to have
   an extra CRLF following it. Thus many clients send an extra CRLF that
   is not included in the Content-Length of the request. Apache works 
   around this problem by eating any empty lines which appear before a 
   request.


Is this what you were looking for?
Missed 0.9.3.
Target Milestone: mozilla0.9.3 → mozilla0.9.4
As a possible suggestion (if we need to leave it in there), could a
header be added to indicate that we are working around non-compliant 
servers?

Something akin to :

X-Post-CRLF: Adding CRLF to support CERN servers

It is a problem that has been bugging me for quite a while, but only now I have
clarity.  

The company I am at is developing some Proxy software and at this stage I am
intending to ensure that invalid HTTP requests in are made to be valid HTTP
requests out.  (Or at least that is the way I would like it to be).

-> critical.

The problem isn't that we generate extra CRLF, its that our content length is
wrong. This plays havoc when we do keepalive, which we now do more aggressively.
Apache eats the extra CRLF, and squid may do so as well. (the RFC says that
servers SHOULD do so).

I believe that if the content-length was right, this doesn't violate the specs,
because we're not sending an _extra_ CRLF in the request body, we're sending it
as _part of_ the request body. This then matches ns4's behaviour on unix, and
ie5.5 on windows, and ie5.0 on the mac.

I don't know why this didn't hit us before - we didn't do connections to
1.1-only servers until recently, and maybe 1.0 servers worked arround it, by
following the SHOULD? Anyway, this needs to be fixed ASAP. Lightly tested patch
coming. Nominating topembed since my http/1.1 persistent connection patch went
on that branch.

-> me, cause I have a patch.
Assignee: pollmann → bbaetz
Severity: normal → critical
Keywords: patch, review, topembed
this patch seems correct to me.  sr=darin, but pollmann should review this as well.
Logic seems fine, but let me play around with this for a little bit...
As no existing browser adds the trailing crlf to the content length, I fear that
this may cause regressions - namely, that by sending back a form post buffer
that has a trailing crlf in it, and added 2 to the content-length, we will see
form post parsing scripts try to parse the crlf, then choke.

How much testing have you done with these changes?

As a gut feeeling, I would prefer changing our behavior to be like that of IE
and just not add the trailing crlf at all, rather than this fix.  We might
reopen some 'FIXED' bugs, but at least some part of the set is known, which
allows us to check to see if the sites still assume that all Netscape browsers
send a crlf, and if so, we can start contacting them to get the assumption
fixed.  (And if it turns out to be a server problem, we'll know in advance that
the change is probably going to cause a much larger impact.)

For reference, the trailing crlf was added to nsFormFrame.cpp:
3.68 pollmann%netscape.com Oct 14 1999 Bugs 16450, 11979, 16576: Add back a CRLF
I took out of the form post headers; r=harishd
Bug 16450: Can't log in to http://my.netscape.com/
Bug 11979: Can't download files from http://www.microsoft.com/downloads/
Bug 16576: Can't view page at all: http://www.freshmeat.net/
No, they all add it to the content length. Thats the problem - we're different.
The problem isn't in adding the CRLF - we still do that with my patch. The
problem is that we then lie about the content length, because we don't take that
into account.

And that is what breaks these servers.
Status: NEW → ASSIGNED
The broken behaviour for HTTP posts is present in IE and Mozilla.  In the proxy
applicaton that I am working on, I look at the data and swallow a trailing CRLF 
pair if they come after a post.

(It was annoying to track down.  It was most prevelent when IE starts up and 
sends some data to codecs.microsoft.com - It 'POST's 63 bytes of data after
saying that it is going to send 61.)

Mozilla is consistent with other browsers, but violates the standard.  (The way
that we handle it in the proxy is that we swallow and make things standards
compliant.)



OK, so I did some more sniffing.

IE5.5SP1 sends the correct content-length if you're talking to a host, but the
incorrect content-length if you're talking to a proxy. this is independant of
the http version setting.
Let me look into this some more.
Keywords: patch, review
"The broken behaviour for HTTP posts is present in IE and Mozilla.  In the proxy
applicaton that I am working on, I look at the data and swallow a trailing CRLF 
pair if they come after a post."

RFC2616 says that you SHOULD do so.
-topembed for the moment
Keywords: topembed
I still think that this patch is the right thing to do, but I'll push this off
for the moment, unless someone else chimes in.
Target Milestone: mozilla0.9.4 → mozilla0.9.5
-> darin for resolution. We should test what other browsers do, but I think that
following the standard would be good
Assignee: bbaetz → darin
Status: ASSIGNED → NEW
tested the following browsers:

1) netscape 4.75 [linux] -> doesn't send \r\n, correct content-length
                         -> loads test page correctly

2) netscape 4.70 [winnt] -> sends trailing \r\n, incorrect content-length
                         -> loads test page incorrectly (results match mozilla)

3) IE 6.0        [winnt] -> sends trailing \r\n, incorrect content-length
                         -> loads test page correctly (headers all in 1 packet)
from these tests, it seems that the mismatch of trailing \r\n to content-length
isn't the entire problem b/c IE6.0, which also lies about the content-length,
somehow loads the page correctly.  there seems to be something else going on
here... i also noticed that IE sends all of the headers in the first data
packet.  perhaps this somehow explains why the server is not failing to load the
page when accessed via IE6, because otherwise, the POST request from mozilla
does not really differ from that sent by IE6.
i hacked mozilla to make it write out its POST request in one packet, and that
didn't seem to solve anything.
pollmann added the code in nsFormFrame.cpp which appends the CRLF to the POST
data stream.  in his checkin comments, he mentions that this fixes bug 16450,
bug 11979, and bug 16576.  i reviewed those bugs, and for the first and third
there is no indication from the bug reports that this patch fixed those
problems, and for the second, eric added a comment that says that it didn't
solve that problem.
later it was determined to be a separate bug.  so, it seems that we may be able
to simply stop sending the extra \r\n like NS4.75.
If IE sends a trailing CRLF are we really really sure that we don't want to?

What was the "separate bug" the problem was determined to be?
let me add that pollmann's patch had two parts:

1) added a CRLF before the Content-Length header (necessary given the way form
posting works in mozilla)

2) added a CRLF at the end of the POST data stream (IMO unnecessary, since
NS4.75 doesn't do this)
bbaetz: you'd have to read the 2nd bug report to see... it was something to do
with non-standard DOM IIRC, and no i'm not really really sure that removing the
trailing CRLF is correct, but i am leaning toward it.
Status: NEW → ASSIGNED
I would like someone to urgently review this bug. With HTTP/1.1 compliant
webservers such as iPlanet Webserver, this bug (sometimes) causes the server
to get seriously confused:
 "The browser sent a message the server could not understand".

Also note that this bug has a possibly very large impact:
 - users may not be able to submit posts. 
 - users may have to submit posts twice
 - you will be shipping a client which is propogating yet more
   standard-violating behaviour, which must be worked around by the servers

HTTP is one area we do NOT need any more gray-area. If you have ever written
HTTP client or server code, you will know that you're stepping into a quagmire,
especially when it comes down to Content Length and Keepalive behaviour.

I would like to get the PDT team to look at this (but I don't know how to
bring it to their attention).

This should be P1 IMO.
We have two mutually exclusive patches, and need to pick one.

darin's is arguably closer to the standard, but its not what IE does. r=bbaetz
on darin's patch, because the Standard is always right.

It should bake on the trunk for a bit, though.
I would like to reiterate what Steve Parkinson said. This bug violates the HTTP
RFC and causes major problems with Netscape & iPlanet web servers that comply
with the spec. This bug is a should not be a P3 and should be a stop ship.

BTW OS and Platform should be "all".

I think the right fix BTW is never to send an extra CRLF and send a
content-length that exactly matches the number of bytes sent in the body.
CRLF would screw things up. Remember that HTTP POST can be for binary data, not
just for form data. The servers have to be very strict for that reason, they
don't differentiate POST requests based on the MIME type submitted.

Keywords: nsbranch
Whiteboard: PDT
6.[01] shipped with this, and IE does it.

Again, assuming that it doesn't break anything, darin's patch is better than mine.
OS: Linux → All
Hardware: PC → All
Attachment #48223 - Flags: review+
With the test case cited at the top of this defect, at
http://x500.ohiou.edu:9999 , IE 5.5 works fine and so does Communicator 4.78. I
did not try NS 6.01, but NS 6.2 beta from 20010926 has the problem. So I would
not be so sure that IE has the same problem. Perhaps IE adds the CRLF - I did
not run a network trace - but it may include it in the content-length. That
would not be an HTTP violation, but rather something that the CGI application on
the server-side could deal with, without necessarily disturbing the HTTP
processing of ensuing requests. If the content-length is off though, it's a
violation that cannot be worked around.
What are the chances of getting this for the 0.9.4 branch?
Whiteboard: PDT → [PDT] [ETA ?]
Comment on attachment 48223 [details] [diff] [review]
patch to make mozilla stop sending the extra CRLF

sr=mscott
Attachment #48223 - Flags: superreview+
I have attached a network trace of the test case described above, made with
IPTRACE under OS/2 with Mozilla 0.9.4 . The dump shows clearly that the browser
is submitting 264 bytes of data in the POST request, with "0D 0A" bytes at the
end of the data on line 415 of the trace. However, the content-length is 262 .
This is clearly a violation of the spec.

On the other hand, the trace also shows that the server in this case is replying
immediately by echo'ing the headers of the request, so the server is also not
compliant here. It might be possible to find a better test case, but I don't
think it's necessary since the trace already shows the browser submitting an
invalid HTTP request.

jaime: we need _a lot_ of testing before considering this on the 0.9.4 branch.

i'm going to check it in today on the trunk, and then we'll just have to wait
and see what kind of regressions come out of it.  according to Eric Pollmann
there were bugs reported in the past that actually led to us adding the extra
\r\n at the end of the post data.  i'm anxious to see if any of these "bugs"
show up again.
Whiteboard: [PDT] [ETA ?] → [PDT] [ETA ?] ready-to-land
fixed-on-trunk (as of version 3.167 of nsFormFrame.cpp -- seems this patch
slipped in along with my checkin of bbaetz's SOCKS patch -- oops!)

so, i guess that means that we've actually been running with this patch since
september 12th.. and, i've not heard of a single regression since then.

i've also confirmed that this change did not make it on the 0.9.4 branch.
Whiteboard: [PDT] [ETA ?] ready-to-land → [PDT] [ETA ?] [fixed-on-trunk since sept 12]
so, i'm much more inclined to push for getting this patch on the 0.9.4 branch now.

gagan: your thoughts?
-> targeting mozilla 0.9.4 for possible checkin on the branch.
Target Milestone: mozilla0.9.5 → mozilla0.9.4
Marking as nsbranch+ to get on PDT radar.
Keywords: nsbranchnsbranch+
Whiteboard: [PDT] [ETA ?] [fixed-on-trunk since sept 12] → [PDT] [ETA ready now] [fixed-on-trunk since sept 12]
pls check it in tonight, if you can - PDT+
Whiteboard: [PDT] [ETA ready now] [fixed-on-trunk since sept 12] → [PDT+] [ETA ready now] [fixed-on-trunk since sept 12]
fixed-on-branch
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
VERIFIED: Win32, 0.9.4 - 2001100305.

Since darin fixed it, and its network behavior, Tom or I will retain QA
ownership of this bug.

Used example above and it worked.

+verifyme - need to load other platforms and check
Keywords: verifyme
VERIFIED MORE: (-verifyme)
Commercial 094 Linux 2001100404
Commercial 094 MacOS 2001100403
(the windows verficiation from before was commercial as well)
I haven't seen reports of this breaking on the trunk, so think we are done.
Status: RESOLVED → VERIFIED
Keywords: verifyme
*** Bug 103312 has been marked as a duplicate of this bug. ***
Component: HTML: Form Submission → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: