Closed
Bug 35407
Opened 24 years ago
Closed 24 years ago
Transfer-Encoding: chunked, extra CRLF after headers -> blank page
Categories
(Core :: Networking, defect, P5)
Core
Networking
Tracking
RESOLVED
FIXED
M16
People
(Reporter: pollmann, Assigned: ruslan)
Details
(Whiteboard: Proposed fix attached)
Attachments
(4 files)
The above site depends on HTTP Transfer-Encoding: chunked to do something reasonable. Currently the page just doesn't load at all. Test case up at http://blueviper/cgi-bin/chunked.cgi (on Unix systems here it's at /u/pollmann/public/work/cgi-bin/chunked.cgi).
Reporter
Comment 2•24 years ago
Of course it does work; we wouldn't be able to load 80% of the sites if it didn't. I have a test that proves it works just fine. Are you sure your test is correct and isn't missing <CR><LF> or anything?
Reporter
Comment 5•24 years ago
Not sure, but it is displayed in Nav. The original test case also displays in Nav but not in Mozilla.
Looking at your test case, I doubt it's correct if you run it from a Unix-based web server, because the format is: HEX<CRLF>BODY<CRLF> ... next chunk. Looks like you're missing CRs.
Reporter
Comment 8•24 years ago
I attached an old version of the test case; in the one on blueviper (and at /u/pollmann/public/work/cgi-bin/chunked.cgi), the CRLFs are indeed there. The format in the example is:

Transfer-Encoding: chunked<CRLF>
Content-Type: text/html<CRLF>
<CRLF>
<CRLF>
3c6<CRLF>
... Body of document ...
0<CRLF>
<CRLF>

This works fine in Nav but not in Mozilla. I will attach the version without the CRs stripped.
Reporter
Comment 9•24 years ago
Assignee
Comment 10•24 years ago
I'm still confused. 80% of the sites I see always send Transfer-Encoding: chunked when you advertise 1.1 (this includes Yahoo, Netcenter, etc.). Are you sure you're doing the right thing? Nav never advertises 1.1 and I'm not sure it supports transfer-encoding. I also have a test case with NES 4.1 which clearly shows that Mozilla does the right thing. Which build are you using? How does it behave?
Reporter
Comment 11•24 years ago
Hmm, my last attachment was also off a bit due to the proxy I was using changing a URL. (That made the hex count off by 15.) Actually, the format appears to be more like:

Transfer-Encoding: chunked<CRLF>
Content-Type: text/html<CRLF>
<CRLF>
<CRLF>
3c6<CRLF>
... Body of document (exactly 3c6 = 966 characters) ...<CRLF>
0<CRLF>
<CRLF>

Might the 0<CRLF><CRLF> at the end be the part that is confusing Necko? I'll attach yet another updated test case.
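For reference, a spec-conformant chunked body of the kind discussed above can be sketched as follows. This is a minimal illustration of the RFC 2616 wire format, not the actual test-case CGI; the payload string is made up:

```python
def chunk_encode(payload: bytes, chunk_size: int = 8) -> bytes:
    """Encode payload with HTTP/1.1 chunked transfer coding.

    Each chunk is: hex size, CRLF, data, CRLF. The body ends with a
    zero-size chunk followed by an empty line (the trailer position).
    """
    out = bytearray()
    for i in range(0, len(payload), chunk_size):
        chunk = payload[i:i + chunk_size]
        out += b"%x\r\n" % len(chunk)   # chunk-size line
        out += chunk + b"\r\n"          # chunk-data line
    out += b"0\r\n\r\n"                 # last-chunk + empty trailer
    return bytes(out)

# Example: a 20-byte payload becomes chunks of 8, 8, and 4 bytes.
encoded = chunk_encode(b"Hello, chunked world")
```

Note that the terminating `0<CRLF><CRLF>` is part of the encoding itself, which is exactly why a stray blank line looks like an early end-of-document to a strict parser.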
Reporter
Comment 12•24 years ago
Which build are you using? Pulled today, Windows NT. How does it behave? The page displays as a big blank white box, with "Document Done" displayed in the status bar. In viewer, I dumped the content model and got nothing.
Reporter
Comment 13•24 years ago
Reporter
Comment 14•24 years ago
I'm not claiming that this site is doing the right thing - the 0<CRLF><CRLF> at the end seems spurious - just forwarding on a bug report. :)
Assignee
Comment 15•24 years ago
I specifically tested on many sites today. I see it sending chunked responses (on GIFs, etc.). It works beautifully.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
Reporter
Comment 16•24 years ago
Yes, Transfer-Encoding: chunked works in general; the summary of this bug was too general. Let me rephrase it: is there any way to gracefully handle this malformed case? Both Nav and IE display the page with no problem, but we do not display anything. I realize it is malformed, but backwards compatibility is nice where it is not too painful to achieve. This is not a common case, so lowering severity.
Severity: normal → trivial
Status: RESOLVED → REOPENED
Keywords: 4xp
Priority: P3 → P5
Resolution: INVALID → ---
Summary: Transfer-Encoding: chunked doesn't work → Transfer-Encoding: chunked plus extra characters -> blank page
Assignee
Comment 17•24 years ago
Well, I'm not sure. If we have garbage at the end and we get a pipelined request, we're screwed, because the trailer would be the only delimiter we could rely on. We could probably ignore it for an HTTP 1.0 response, but not for an HTTP 1.1 response. If you know of a way to safely tolerate the garbage at the end, I'm open to suggestions.
Updated•24 years ago
Target Milestone: --- → M16
Comment 18•24 years ago
The more recent version of this chat, accessed through http://tpm.amc.anl.gov/TPMForum/, does work correctly.
Reporter
Comment 19•24 years ago
Well, that *was* the only real-world page I've seen with this problem. jar@ornl.gov, I have a question: what was changed in the new version to make it work with Gecko? If there was some obvious problem with the previous version, it seems like it might be best not to devote time and energy to fixing this bug.
Comment 20•24 years ago
I've been asked to comment on what changes were made in the source code of the "chat room" to fix the problem alluded to above.

First, you should realize that this code is designed to be a WWW-based "chat-type" room with the ability to also handle images, and to keep a history/record so others can view recent commentary, creating a short-term persistence of the "conversation". The application is designed for microscopists, who deal a lot with images and need a simple mechanism (other than email) to briefly share images in near real time without the necessity of creating individual WWW sites. A centralized set of Forum rooms where recent results can be shared and discussed is the model; consider it something akin to a chat room with a whiteboard. The difference is that it is platform independent and obviously free.

Initially, a client-pull model was tried but was immediately thrown out, since it automatically refreshes the browser every N seconds, which makes detailed observation and discussion of fixed images nearly impossible. The code was then designed to only refresh the page when the server-push routine detects a modification in the file(s) being observed. New code was written based upon a simple server push using "Content-type: multipart/x-mixed-replace" directly to a frame (one frame for text and one for images is the model). Preliminary testing on Macs using Nav 3.x and 4.x showed all was generally well and fine; however, testing on Windows with the latest version of the browser failed to transfer any data to the frame, resulting in blank text screens.

The solution to the above problem was to change the server-push Perl code. TPMForum6.pl basically used simple server push. A new version was written employing an nph (Non-Parsed-Header) version, and the problem appears to be solved.

The current working version of the "Forum" room, which now includes the ability to exchange images, can be found at http://tpm.amc.anl.gov/TPMForum/; this URL defaults to the current production version, which employs nph mode. There are still some annoying bugs in the image update code, but I believe that is not a browser issue. The problem version remains on this site and can be tested at http://tpm.amc.anl.gov/TPMForum/TPMForum6.pl. Nestor Zaluzec... Author of Forum Code...
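For readers unfamiliar with the technique, the server-push response described above can be sketched roughly as follows. This is a minimal illustration of the multipart/x-mixed-replace wire format, not Nestor's actual Perl; the boundary string and HTML payloads are made up:

```python
BOUNDARY = "ThisRandomString"  # hypothetical boundary string

def push_response(parts):
    """Build a multipart/x-mixed-replace response as one string.

    Each new part replaces the previous one in the browser frame,
    which is what makes 'server push' page updates work.
    """
    lines = [
        "HTTP/1.0 200 OK",
        f"Content-Type: multipart/x-mixed-replace;boundary={BOUNDARY}",
        "",
        f"--{BOUNDARY}",
    ]
    for body in parts:
        # Each part carries its own headers, a blank line, then the body.
        lines += ["Content-Type: text/html", "", body, f"--{BOUNDARY}"]
    return "\r\n".join(lines) + "\r\n"

resp = push_response(["<p>first update</p>", "<p>second update</p>"])
```

In a real deployment each part would be flushed to the socket as it becomes available rather than joined up front; the sketch only shows the byte layout the browser sees.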
Assignee
Comment 21•24 years ago
Gagan, do we support multipart/x-mixed-replace at all now?
Assignee
Comment 22•24 years ago
I'm looking at your site in the debugger (http://blueviper/cgi-bin/chunked.cgi) and it still appears not to be sending '\r\n', just '\n'. There's also another problem: the first chunk is preceded by <LF>. 4.x doesn't like it either, and displays the chunk header/trailer (because it probably doesn't understand chunked encoding in the first place). IE 5.01 doesn't work for me either. I can try relaxing the converter to just take '\n', but I don't know if it's going to break anything.
Keywords: beta2
Comment 23•24 years ago
Hmmm... interesting. All the info I have says that in server push you are supposed to send \n. I've never seen an example or written description saying that you should send \r\n, but I've only read the various books on CGI/Perl, not anything from Mozilla. Should it be all \n's? But if that is true, why does it work at all? I have added the \r to all \n's in the nph-serverpush.pl routine and tested it here. So far, everything that was working before for me is still working. So can I ask you to try it again? Nestor
Comment 24•24 years ago
We do support multipart/x-mixed-replace.
Reporter
Comment 25•24 years ago
ruslan: Hmm, I don't know why it's sending just \n; the source for the CGI has CR and LF in it. I'll look into it. Strange! In the meantime, the old version of Nestor's CGI definitely put out \r\n, so you can use that to test. Also, I'm pretty sure that not all modern web servers will send out \r\n for chunked encoding. The Apache 1.3.9 server on blueviper (my Linux box) just sends out \n. I don't know if it would break anything to accept either, but it might be interesting to test out at any rate.

Nestor: I don't know what the standard says about \r\n vs. \n; ruslan would be the expert here. I do know from experience that many web servers I've seen send \r\n, and some send just \n (including HTTP 1.1 servers, like the Apache 1.3.9 which is running on the Linux machine in my cube). Given that, however, it's interesting to note that Perl, even recent versions, has some platform quirks. On Windows, print "\n"; will actually print out "\r\n". It could be that the book you were reading (or the author of that particular section, at least) was describing how a CGI should be written if it was served by an NT web server, which could be different than how it should be written if served off of a Unix or Mac web server (again, depending on what the standard says, which I don't know).
Reporter
Comment 26•24 years ago
Ruslan: okay, now I'm pretty sure that Apache 1.3.9 doesn't *want* my CGI to be able to print out \r\n... The old version of the CGI I was using had embedded \r's in a hard-coded Perl string. I wasn't 100% sure that this would actually print out \r's; it turns out it did. I am now explicitly telling Perl to print out \r's like this:

print "Content-Type: text/html\r\n".
      "\r\n".
      "\r\n".
      "3c6\r\n".
      ... (document) ...
      "\r\n".
      "0\r\n".
      "\r\n";

I just checked, and indeed this does cause the CGI to print out \r's. (Log into blueviper, type "/home/httpd/cgi-bin/chunked.cgi > foo", then look at the file foo that is created; it has CR's in it!) However, when I access this CGI via the Apache web server:

telnet blueviper 80 > bar
GET /cgi-bin/chunked.cgi HTTP/1.1
Host: blueviper.mcom.com

the file that is created, bar, has no CR's in it. Apache stripped them away when running the CGI, even though it is returning the result as chunked. This seems a fairly reasonable argument to accept \n as well as \r\n, assuming it doesn't cause regressions elsewhere. However, I don't have thoughts one way or the other on what should be done with a page that has a stray chunk at the end as this one does (\n0\n\n). It seems reasonable to do whatever IE does.
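A chunked decoder relaxed in the way discussed here (accepting a bare LF wherever the spec requires CRLF) could be sketched like this. This is an illustration of the idea only, not the actual Necko converter:

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body, tolerating bare '\\n' line endings.

    RFC 2616 requires CRLF after the chunk-size line and after each
    chunk's data, but some servers/CGIs emit only LF.
    """
    out = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\n", pos)              # end of the size line
        size = int(data[pos:eol].rstrip(b"\r"), 16)
        pos = eol + 1
        if size == 0:                             # last-chunk: stop
            break
        out += data[pos:pos + size]
        pos += size
        if data[pos:pos + 2] == b"\r\n":          # skip CRLF or bare LF
            pos += 2
        elif data[pos:pos + 1] == b"\n":
            pos += 1
    return bytes(out)
```

The same input decodes identically whether the server sent `5\r\nhello\r\n0\r\n\r\n` or the LF-only `5\nhello\n0\n\n`, which is the relaxation being proposed.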
Assignee
Comment 27•24 years ago
The spec says CRLF very unambiguously. Also, Transfer-Encoding is usually something servers do themselves; Apache does it all the time under the cover. I think it's a bug to have a CGI spin out chunk-encoded output in the first place.
Comment 28•24 years ago
Guys... let me happily plead ignorance for a moment. Every message I've got so far has mentioned something called "Transfer-Encoding: chunked"... Sorry, folks, but it's not at all clear TO ME what this is nor where the problem lies. Can you point me somewhere, so I can look this up and see if it is actually something I am doing? The "books" I have been using are the O'Reilly series on Perl, CGI programming, and Web graphics. I don't have them handy right now to give you the authors' names or exact titles; they are most certainly written with a Unix focus.

Also... I need to get on a small soap box for a moment, since I frankly don't understand one of the comments I just saw. There is a fundamental point here: I am NOT trying to be compatible with IE, nor should you!! The Mozilla-based browsers have been a stalwart for us! The entire Department of Energy Materials Microcharacterization Collaboratory effort, as well as a large part of the Computational Science Initiative, is focused on server-push technology for providing collaborative services, including text, images, and video. All this technology was developed to be compatible with Netscape-based browsers, Apache servers, Perl, and Java, which we have been using now for 4 years. IE is basically not used anywhere within the collaboratory!!! To argue that "it seems reasonable to do whatever IE does" is ridiculous. You should make an effort to be compatible with the original technology. If you can also add IE compatibility, that is fine with me, but damn, don't take anything away!! I hope to hell you're not saying that IE is going to define how we do science, because IE just doesn't cut it, and neither do NT WWW servers. They do fine for business, but not for science. If Apache and equivalent servers on Unix and Mac machines strip the \r, then so be it; look into how to fix the new browser to look out for that. The old ones obviously did not have a problem with it, so why should the new one?

I'm honestly confused as to where the problem is and why suddenly Mozilla-based browsers are broken. The various bits of code that we have in DoE have been running now for anywhere from weeks to years. So what has changed that now causes things to break? That should be the question and the focus. If it is an obvious flaw that we have created, that's fine; point me there and we will fix it. But... Perl does what Perl does, and it is a very robust bit of code, designed to send the appropriate \r\n for the respective platforms. Using it for CGI programming is a large component of the community, and I would suggest that if this bit of code causes problems, there will likely be more out there that do too... Now, time to get down off the soap box... and try to find out how to fix things... Nestor
Reporter
Comment 29•24 years ago
Ruslan: Oops, I oversimplified in my test to see if \r\n was sent or \n (the Unix command-line sequence I provided above). Apache didn't send carriage returns because my giving only a Host header apparently isn't enough information. However, with the full headers we send, there *are* carriage returns being sent to Necko when you visit this URL: http://blueviper/cgi-bin/chunked.cgi. Can you please look at this again? I'm almost 100% sure that there are carriage returns sent. If you can't see this under the debugger, please give me a call. If you would like to trace through this with me, I can show you in my cube how I'm doing it. If you want to take a look yourself, I'm running the Perl script /u/pollmann/bin/httproxy that I wrote to proxy and log HTTP traffic. It is getting carriage returns logged to it in all of the right places. Also, according to the HTTP 1.1 spec, the response from this server is 100% legal: ftp://ftp.isi.edu/in-notes/rfc2616.txt

> The chunked encoding is ended by any chunk whose size is
> zero, followed by the trailer, which is terminated by an empty line.

This is borne out in the sample implementation in 19.4.6, which would handle the testcase correctly.

Nestor: Sorry, to clarify:

> To argue that "It seems reasonable to do whatever IE does." is ridiculous.
> You should make an effort to be compatible with original technology. If you
> can also add IE compatibility that is fine with me, but damn don't take

I meant that comment to apply specifically to cases that are neither covered by the spec nor by Nav 4.x behaviour. At the time I thought you were putting out an invalid HTTP 1.1 response, and Nav certainly didn't handle HTTP 1.1 responses. My thinking on the matter of specs and backwards compatibility is to follow these steps (I think quite a lot of people working on Mozilla try to do this whenever possible):

1) Well-formed case? Yes: follow the spec for all well-formed cases, done. No: go to step 2.
2) Was Nav 4.x's behaviour sane for this case? Yes: follow Nav 4.x's behaviour, done. No: go to step 3.
3) Is IE's behaviour sane for this case? Yes: follow IE's behaviour, done. No: go to step 4.
4) Do something sane.
Summary: Transfer-Encoding: chunked plus extra characters -> blank page → Transfer-Encoding: chunked, legal zero size trailing chunk -> blank page
Reporter
Comment 30•24 years ago
Ah HA! Fairly humorous... the problem was an extra CRLF between the headers and the chunk size. I just stepped through the nsHTTPChunkConv code: because the first characters it encounters are CRLF instead of a hex number, it assumes the size of the first chunk is 0; then, since the spec says zero-size chunks should end the document, it closes out the document. This might be just a bit too brutal?! The code in question is here: http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/nsHTTPChunkConv.cpp#254

case CHUNK_STATE_LENGTH:
    if (mLenBufCnt >= sizeof (mLenBuf) - 1)
        return NS_ERROR_FAILURE;
    rv = iStr -> Read (&c, 1, &rl);
    if (NS_FAILED (rv))
        return rv;
    streamLen--;
    if (isxdigit (c))
        mLenBuf[mLenBufCnt++] = c;
    else if (c == '\r') {
        mLenBuf[mLenBufCnt] = 0;
        sscanf (mLenBuf, "%x", &mChunkBufferLength);
        mState = CHUNK_STATE_LF;
    }
    break;

This problem was present in the original http://tpm.amc.anl.gov/TPMForum/TPMForum6.pl but is not there any more. You can still see it in my test case; if I remove the extra CRLF, the bug 'disappears'. Well, going by my above steps, this falls into category 3) Yes: IE does something very sane here and just ignores the extra CRLF. I think we should too. It would be easy to fix this problem: just loop before the first number, consuming CRLF's, or possibly even before every number. I can provide a patch if you would like.
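The proposed fix, consuming stray CRLFs before reading a chunk-size line, can be sketched as follows. This is a minimal illustration of the idea, not the actual nsHTTPChunkConv patch:

```python
def read_chunk_size(data: bytes, pos: int):
    """Return (chunk_size, new_pos), skipping stray leading CRLF/LF.

    Without the skip, a blank line before the first chunk size parses
    as an empty size line, which the strict converter treats as a
    terminating zero-size chunk and closes the document early.
    """
    while data[pos:pos + 1] in (b"\r", b"\n"):   # tolerate extra blank lines
        pos += 1
    eol = data.index(b"\r\n", pos)               # end of the size line
    size = int(data[pos:eol], 16)                # hex chunk size
    return size, eol + 2                         # position after CRLF
```

With the extra CRLF after the headers (as in the test case), `read_chunk_size(b"\r\n3c6\r\n...", 0)` still recovers the real first chunk size of 0x3c6 instead of a bogus zero.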
Summary: Transfer-Encoding: chunked, legal zero size trailing chunk -> blank page → Transfer-Encoding: chunked, extra CRLF after headers -> blank page
Reporter
Comment 31•24 years ago
Reporter
Updated•24 years ago
Whiteboard: Proposed fix attached
Assignee
Comment 32•24 years ago
OK, I need to decide something on this bug. Are there any other sites that show this bug? Strictly speaking, we don't have a bug in the browser, and the question is whether to relax the parser to handle this (malformed) case or not.
Reporter
Comment 33•24 years ago
Here's one...
Reporter
Comment 34•24 years ago
*** Bug 32877 has been marked as a duplicate of this bug. ***
Assignee
Comment 35•24 years ago
Copying the URL. I guess then we'll have to fix it. And why do people invent standards in the first place :-( http://www.ses-astra.com/index_poll.htm
Assignee
Comment 36•24 years ago
Fixed
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED