Closed Bug 51363 Opened 24 years ago Closed 24 years ago

Won't complete connection for logged in user of my.lycos.com

Categories

(Core :: Networking: HTTP, defect, P3)

x86
Windows 2000
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: mwitbrock, Assigned: darin.moz)

References

()

Details

Attachments

(4 files)

Clear out lycos.com cookies (if you have any).
Create a my.lycos.com account (maybe use another browser for this).
Go to my.lycos.com site and try to log in. Get weird 302 page text that looks as
if it may be failing to set some cookies correctly.

If lycos cookies are already set, attempts to contact my.lycos.com result in an
indefinite network wait.

This bug appeared about 1 or 2 weeks ago.

If it can be confirmed as a problem with something that is being sent from the
server, I can have people at this end (i.e. at Lycos) look into fixing it, but I
need your help in diagnosing the source of the problem.
Using latest build 2000090508 under Windows 95 I get some kind of cookie error 
when logging into http://my.lycos.com/

After hitting the "login" button this text appears:

HTTP/1.1 302 Object moved
Server: Microsoft-IIS/5.0
Date: Wed, 06 Sep 2000 02:01:36 GMT
Set-Cookie: lycos_sso=guid%3Dterrigena%7Cln%26app_id%3Dly_ml%26pref_sso%
3Dall_allow%26mod_time%3D968205800%26ctime%3D3145658497%26version%3D1.8%26state%
3D0%26secret%3D1829638da0eff06233cc42f3e6dfad29; expires=Sat, 03-Mar-2001 
10:01:36 GMT; domain=.lycos.com; path=/
Location: /
Content-Length: 122
Content-Type: text/html
Expires: Wed, 06 Sep 2000 02:01:36 GMT
Set-Cookie: REGTARGET=+; expires=Mon, 31-Dec-1990 05:00:00 GMT; 
domain=.lycos.com; path=/
Set-Cookie: 
LOGIN=HUID=6e4058e2444f01901c7008eb50783baf&UID=terrigena&GUID=terrigena&HASH=; 
expires=Tue, 31-Dec-2002 05:00:00 GMT; domain=.lycos.com; path=/
Set-Cookie: LYSPORTS=; expires=Tue, 01-Jan-1980 05:00:00 GMT; 
domain=.lycos.com; path=/
Set-Cookie: USER=COUNTRY=US&NETMINDMAIL=&PLANETALL=&INFOE=Y2hhcmxlcw%3D%
3D&EMAILNOTIFY=1&GENDER=M&BIRTHDATE=12%2F09%
2F1984&CITY=Schenectady&EMAIL=neoplazma%40home%2Ecom&FNAME=Terrigena&INFO=app%
5Ffeature&ZIPCODE=12345&LNAME=EarthBorn; expires=Tue, 31-Dec-2002 05:00:00 GMT; 
domain=.lycos.com; path=/
Set-Cookie: LYCOSWEATHER=; expires=Mon, 31-Dec-1990 05:00:00 GMT; 
domain=.lycos.com; path=/
Set-Cookie: P=TRAFFIC%
2DPERZONES=&LYSHOP=&SPM=&LOTTOZIP=&LYMB=&PRIMARYMAIL=&HORO=&REMINDERS=&RANDOMREF
=&SAVEDSEARCH=&TOPNEWS=&STOREDFILES=0&LYCHAT=&COOKIEUPDATE=9%2F5%2F2000+10%3A01%
3A37+PM&LYTV=&LOCALDIR=&BOOKS=&METRICWEATHER=0&COMICS=&FRAC=; expires=Tue, 31-
Dec-2002 05:00:00 GMT; domain=.lycos.com; path=/
Set-Cookie: LYCOSSTOCKS=; expires=Tue, 01-Jan-1980 05:00:00 GMT; 
domain=.lycos.com; path=/
Set-Cookie: MergedPortfolio=; expires=Tue, 01-Jan-1980 05:00:00 GMT; 
domain=.lycos.com; path=/
Set-Cookie: SSO=1; expires=Tue, 31-Dec-2002 05:00:00 GMT; domain=.lycos.com; 
path=/
Cache-control: private

<head><title>Object moved</title></head> <body><h1>Object Moved</h1>This object 
may be found <a HREF="/">here</a>.</body>
over to cookies component.
Assignee: asa → morse
Status: UNCONFIRMED → NEW
Component: Browser-General → Cookies
Ever confirmed: true
QA Contact: doronr → tever
It's not cookies but rather networking.  However I just spent enough time on bug 
50471 to realize that this is a dup of that bug.  Fortunately, a fix was checked 
in today.

*** This bug has been marked as a duplicate of 50471 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Component: Cookies → Networking
Which build should it show up as resolved in?
It's still not working in build 2000090508

Michael
The fix was checked in on 9-5 in the afternoon.  So I wouldn't expect it to be 
in a 9-5 build.  Try a build from 9-6 or later.
Still doesn't seem to work with Build 2000090708 on Windows 2000
(doesn't complete connection when logging in through
http://my.lycos.com/h/infosimple.asp?c=1)

Perhaps this isn't really resolved?
As I mentioned yesterday, the fix to 50471 doesn't seem to have fixed this.
I'm reopening... hope that's OK
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
.
Assignee: morse → gagan
Status: REOPENED → NEW
Target Milestone: --- → Future
Component: Networking → Cookies
Our MyLycos engineering team finally got a chance to look into this.
It turns out that the problem seems to be related to cookie length.
My Lycos user login results in an attempt to set a quite long (but < 4096 bytes)
cookie. This seems to be what causes moz and ns6pr3 to hang then time out.
Reducing the cookie length at our end solved the problem.

Debugging this identified another problem. Apparently Moz reacts badly to
attempts to delete non-existent cookies.

Lycos will implement a work-around to enable my lycos to work properly with
these bugs, but it would be good if they were fixed - presumably there are other
long cookies, and null deletes out there.

I'll take a look from this end and see if I can determine what is going on.  But 
one comment I didn't understand:

   > Apparently Moz reacts badly to attempts to delete non-existent cookies.

How are you attempting to delete them?  The only way that a site has for 
removing a cookie is to set another cookie having the same name and an 
expiration date sometime in the past.  So if the cookie is non-existent, you are 
saying that moz behaves badly at attempts to set cookies with past-dated 
expirations.  Am I interpreting your comment correctly?
to "delete" the cookies we are setting the Expiration date to a date in the 
past (December 31, 1990), as well as setting the domain and path.  We are not 
setting the contents of the cookie, however.

If we try to do the above on a cookie that doesn't exist, we get into a 
situation where the browser hangs.

As for the cookie size issue, we start to see problems with cookies around the 
1,210 - 1,215 byte mark for the entire domain.  You can edit individual cookies 
to get them over that mark, but if you log out of My Lycos and then try to log 
back in, we see hanging.
I'm not seeing the behavior described here at all.  Instead here is what I am 
seeing:

1. First time -- no cookies.  After logging in I get a flock of cookies, some 
expiring at end of session and some lasting longer.

2. Close browser and reenter.  Do not visit lycos yet.  At this point I see 
three lycos cookies -- login, p, and user.

3. Visit lycos.  Am not asked to login, apparently because I already had a login 
cookie.  Instead I am automatically logged in.  At this point I still have the 
same three cookies -- login, p, and user.

4. Log out of lycos.  At this point I have the following cookies -- lycos_sso 
from lycosnetwork.com, lycos_sso from lycos.com, and CookieStatus from 
tripod.com.  First two will expire at end of session, third is more permanent.

5. Close browser and reenter.  Do not visit lycos yet.  At this point I still 
have the tripod cookie and I also notice a double-click cookie named id.  That's 
strange -- I didn't have it when I exited the browser and I haven't visited any 
sites since.  Hmmm!

6. Go to lycos site.  I am asked to login.  No further cookies are set yet.

7. I log in.  Get the same flock of cookies that I observed in step 1.

If you are observing something different, please give step-by-step details of 
what you are doing and what cookies are set (use the cookie manager) after each 
step.  Bottom line, however, is that this is working for me.
Because of the strange appearance of the double-click cookie in step 5, I 
decided to rerun the test.  This time I also recorded exactly what cookies are 
getting set:

1. First time -- no cookies.  After logging in I get the following cookies that 
expire at the end of the session:

   lycos.com/lycos_sso, lycos.com/lycos_test,
   lycosnetwork.com/lycos_sso, lycosnetwork.com/test
   
and the following that live longer.

   lycos.com/LOGIN, lycos.com/P, lycos.com/USER 

2. Close browser and reenter.  Do not visit lycos yet.  At this point I see 
three lycos cookies -- LOGIN, P, and USER.

3. Visit lycos.  Am not asked to login, apparently because I already had a login 
cookie.  Instead I am automatically logged in.  At this point I still have the 
same three cookies -- LOGIN, P, and USER.

4. Log out of lycos.  At this point I have the following cookies set to expire 
at end of session

   lycos.com/lycos_sso, lycosnetwork.com/lycos_sso 

and the following one which will live longer

   tripod.com/CookieStatus

5. Close browser and reenter.  Do not visit lycos yet.  At this point I still 
have the tripod cookie and I also notice a double-click cookie named id.  NOW 
HOW IS THAT POSSIBLE????  Both cookies are set to live beyond the end of the 
session, the doubleclick one going way out to the year 2030.

6. Go to lycos site.  I am asked to login.  No further cookies are set yet.

7. I log in.  Get the same flock of cookies that I observed in step 1.  Also the 
tripod and doubleclick cookies are there as well (they were not there in step 
1).

So I got the same results as in my previous test.  Can't explain where that 
doubleclick cookie is coming from since it wasn't present when I shut down the 
browser in step 5 but suddenly appeared when I reentered browser later in that 
same step.  That shouldn't be possible!
I just deleted all my terra lycos cookies
lycos.com
quote.com
terra.com (and various national versions)
tripod.com

and tried to go to my.lycos.com
It bounces me to http://my.lycos.com/h/infosimple.asp?c=1

When I try to log in by entering my username and password, it hangs.

HOWEVER when I create a new user (bananafood1234 password banana)
it works fine.
This suggests that John Anguish is correct when he says that the problem is
cookie size.

My account, which is highly personalised, presumably has more cookie information
than the fresh bananafood1234 account.

Personalising Morse's account, perhaps using another browser, may get it to
exhibit the hang. Or perhaps John has a synthetic account that hangs that Morse
could use.

 
Here is an account on the verge of not working:

Username: johnny_net6
P@$$W0RD: holliston

You should currently be able to log in with this account.  Once you're logged 
in, Click the "Add" link in the My Page Setup box.  This will take you to a 
page that will allow you to add components to your page.  Add one or two 
components to your page by clicking the checkbox and then click the "I'm Done" 
button.  This will add the components to your page and take you back to My 
Lycos.

Now click the "Log Out" link next to where it says "Welcome, Johnny".

Now attempt to log in again.  It should hang.

Is this going to be a one-time test -- i.e., once I use the above account and 
get the hang, I can't use it again to re-reproduce the problem?  If so, then 
you'll need to create a bunch of these accounts for me to play with because I'm 
certainly going to have to make several iterations before seeing what the 
problem is.
Oops. forgot the undo process.

The easiest way to undo this is to log in to that account with a different 
browser (Netscape 4.7 or IE -- egads!) and remove however many components you 
added.

Just to be sure, log in and out with the other browser every time you do this.
Well it's not a cookie problem.  I set a breakpoint in the cookie module and it 
is never getting called.

Furthermore, I put a print statement at the point where the response headers are 
getting set (in nsHTTPResponse.cpp).  The last response header is getting 
butchered.  In particular, here are the last few headers that are being reported 
(the first entry is the header name and the second is the value):

  <set-cookie>
  <P=LYSHOP=6%2C8%2C12%2C15%2C16&FRAC=1&COOKIEUPDATE=11%2F20%2F2000+5%3A14%3A14
  +PM&STOCKSFIELDS=b%2Cc%2Cl%2Cd&C=EM%2CCN1%2CSRH%2CCO%2CSP%2CWG%2CAN%2CCH%2CMB
  %2CERIC%2CTR%2CLO%2CLD%2CQCIPO%2CWS%2CRM%2CSPNE%2CABBY%2CCC%2CAG%2CLT%2CAU%2C
  FL%2CSH%2C%3AWE%2CNE%2CHO%2CSE%2CGM%2CRR%2CTV%2CQCNC%2CPF%2CYP%2CCG%2CTM%2C%7
  C%3A; expires=Tue, 31-Dec-2002 05:00:00 GMT; domain=.lycos.com;path=/>

  <set-cookie>
  <SSO=1; expires=Tue, 31-Dec-2002 05:00:00 GMT; doHTTP/1.1 100 Continue>

  <http/1.1>
  <100 Continue>

  <http/1.1>
  <100 Continue>

It appears that some characters got lost in the final set-cookie header (right 
after the "do" of "domain") and after that the parser was unable to recover.  I 
looked up the stack to see how far back the error existed and it was in the 
ParseHTTPHeader method of nsHTTPResponseListener.cpp -- the value being set for 
mHeaderBuffer in that routine is already bad.

Furthermore, the example login of johnny_net6 fails for me, even if I remove 
several entries.  Yet that was working for the reporter.  So the point of 
failure is not dependent on an absolute cookie size (as reported in this bug 
report) but rather that critical cookie size will differ on different machines.  
This is further evidence that this is a transmission problem.
This bug might be a dup of bug 58261.
Here's what is going on.  The ReadSegment method in nsPipe2.cpp reads in a 
buffer full of traffic and then dispenses it one line at a time until there are 
no more lines left in the buffer.  Normally all the http response headers fits 
into the buffer, so all the headers (one header is one line) can be extracted 
without the need for refilling the buffer.

However if the response headers are too large to all fit into the buffer at the 
same time, the buffer will eventually need to be refilled.  The refill operation 
occurs when there is nothing left in the buffer.  However it should also occur 
if what is left in the buffer is only a partial line (i.e., does not contain a 
line-feed).

The attached patch rectifies this problem.  This might not be the best way to 
fix the problem so I welcome suggestions for a better patch.
Gagan, could you please code review the above patch.  Thanks.
You are adding special cases in the pipe code for a |writer|.  A great amount of 
work was done to abstract this out.  There should be no check in ReadSegment for 
either the length of the closure or some "magic" char at some n location.  The 
length of the closure should be returned in the writeCount parameter from the 
writer().  Wrt the check at length-1 of the closure: given the context now 
about opaquing the data that is actually flowing, it is obvious your check 
is just evil.

Kind of a moot point now, but why call str->Length() twice?  Cache the 
return val and save yourself a few instructions.  

I could not reproduce this problem with a build from a few weeks ago.  If I 
could I would have dug deeper into what the cause could have been.  I was 
guessing that a problem could be in the nsHTTPServerListener.  Maybe it is not 
watching the LF stuff as well as it should (Just a guess)?  
Doug, have you tested using my.lycos.com and login of (johnny_net6, holliston)?  
This is definitely failing.

As I mentioned, my patch might not be the best way to go because I don't have as 
deep a knowledge of this section of code as you do.  But my analysis of the 
problem is accurate -- we are getting out of synch when the http response 
doesn't fit into the buffer such that the last line in the buffer is incomplete.
Rick Potts needs to look into this as he's closest to pipes now. The special
check in the pipe code doesn't seem right though.
Assignee: gagan → rpotts
Let me repeat -- I'm not saying that my patch is the right thing to do.  But I 
am saying that my analysis of why it is failing is correct and that the fact 
that my patch stops the failure proves it.

So if my patch is not the right way to fix it, then what is?
Steve, we need to figure out why you get into this condition first.  When we
know that, the fix should be obvious.  Did you take a look at the
nsHTTPServerListener?

If you want to stop by today and show me how to reproduce this, I can help you
investigate this further.
Yes, I did look at the nsHTTPServerListener.  See my comment of 11-20 in which I 
said:

     I looked up the stack to see how far back the error existed and it was
     in the ParseHTTPHeader method of nsHTTPResponseListener.cpp -- the
     value being set for mHeaderBuffer in that routine is already bad.

Then on 11-21 I figured out why that was happening and explained that in my next 
posting.  The thing that got me into this bad condition was an http response 
that was too large to fit in the buffer that had been allocated for it.

I'll be available to work with you on this anytime today.
Oops, I see that you were referring to nsHTTPServerListener and I was talking 
about nsHTTPResponseListener.  I got confused for a second.

No, I didn't look at nsHTTPServerListener.
This is a pretty serious bug so I'm surprised that it's marked "future".  
Affects all websites that return a large http response.  Rick Potts, have you 
had a chance to look at this yet?
Looks like gagan futured it on 9-12, no doubt because it wasn't going to make
Netscape 6.  Setting mozilla0.8 milestone keyword, clearing TM (since rpotts has
not had a chance to target it).

/be
Keywords: mozilla0.8
Target Milestone: Future → ---
reassigning to dougt at his request.
Assignee: rpotts → dougt
A while back the stream (pipe) code was smacked pretty good.  A lot of the logic 
was moved from the pipe caller to the |writer| functions.  

The http server response listener used to just read all of the available data on 
the stream.  Now it reads only complete lines ending with a LF.  There are times, 
as this bug points out, that the |writer| function cannot complete a write 
without further data.  (The response from the server is so long (> 4k) that the 
|writer| function runs out of data on the input stream.)  The trailing LF is in 
the next 4k window.  To fix this problem without bastardizing the abstraction, 
the |writer| needs to return some error which tells the pipe implementation that 
more data is required on the input stream in order to complete the write.   
Attached will be a fix for this bug.  Morse et al., please verify that it does 
indeed fix things.  Gagan, please review.
Where is NS_BASE_STREAM_INCOMPLETE_OPERATION defined, and what exactly does it
mean?  If it's new, its definition is not included in the patch.

But really, if it is new, can we avoid it altogether, by using
NS_BASE_STREAM_WOULD_BLOCK for the early return you added to nsWriteLineToString
and returning NS_OK; at the end?  From nsPipe2.cpp, I do not understand why any
writer implementation should return NS_BASE_STREAM_WOULD_BLOCK after consuming
the data handed to it in the readBuffer.

/be
By adding the following line to xpcom/base/nsError.h, I was able to get the 
patch to compile:

     #define NS_BASE_STREAM_INCOMPLETE_OPERATION \
         NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_BASE, 8)

Having done so, I ran the test and was able to log on to my.lycos.com using the 
indicated username/password.  But after being at the site for about half a 
minute, the system suddenly crashed.  Below is the stack trace.  I don't know 
yet if this is related to the patch or not.

NTDLL! 77f76274()
il_delete_container(il_container_struct * 0x052df950) line 647 + 42 bytes
IL_NetRequestDone(il_container_struct * 0x052df950, ilIURL * 0x052d7030, int 0) 
line 1179 + 9 bytes
NetReaderImpl::NetRequestDone(NetReaderImpl * const 0x044df550, ilIURL * 
0x052d7030, int 0) line 139 + 20 bytes
ImageConsumer::OnStopRequest(ImageConsumer * const 0x04477480, nsIChannel * 
0x050500b0, nsISupports * 0x00000000, unsigned int 0, const unsigned short * 
0x100a9bc0 gCommonEmptyBuffer) line 551
nsDocumentOpenInfo::OnStopRequest(nsDocumentOpenInfo * const 0x05053cd0, 
nsIChannel * 0x050500b0, nsISupports * 0x00000000, unsigned int 0, const 
unsigned short * 0x100a9bc0 gCommonEmptyBuffer) line 277
nsHTTPFinalListener::OnStopRequest(nsHTTPFinalListener * const 0x05053c70, 
nsIChannel * 0x050500b0, nsISupports * 0x00000000, unsigned int 0, const 
unsigned short * 0x100a9bc0 gCommonEmptyBuffer) line 1166 + 42 bytes
InterceptStreamListener::OnStopRequest(InterceptStreamListener * const 
0x05770da0, nsIChannel * 0x050500b0, nsISupports * 0x00000000, unsigned int 0, 
const unsigned short * 0x100a9bc0 gCommonEmptyBuffer) line 1212
nsHTTPChannel::ResponseCompleted(nsIStreamListener * 0x05770da0, unsigned int 0, 
const unsigned short * 0x100a9bc0 gCommonEmptyBuffer) line 1923 + 42 bytes
nsHTTPServerListener::OnStopRequest(nsHTTPServerListener * const 0x0576d350, 
nsIChannel * 0x0576c614, nsISupports * 0x050500b0, unsigned int 0, const 
unsigned short * 0x100a9bc0 gCommonEmptyBuffer) line 730
nsOnStopRequestEvent::HandleEvent(nsOnStopRequestEvent * const 0x052db550) line 
300 + 46 bytes
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x052db560) line 92 + 12 bytes
PL_HandleEvent(PLEvent * 0x052db560) line 576 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00ada390) line 509 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x002c0880, unsigned int 49376, unsigned int 0, 
long 11379600) line 1054 + 9 bytes
USER32! 77e71268()
00ada390()
The crash occurred on the following PR_ASSERT statement at line 647 in 
ilclient.cpp:

   PR_ASSERT(ic->clients == NULL);

and indeed ic->clients is not null at this point.
Crash is reproducible.  Furthermore, the page never finishes loading (at least 
the throbber doesn't quiet down).
Looking one level up the stack (into IL_NetRequestDone) at the time of the 
crash, I see that ic->state is equal to IC_ABORT_PENDING.  That doesn't seem 
right.  Is it?
Morse, weird! I am seeing that assert too.  Although I only hit my changes in my 
diffs once.  I will have to investigate it further tomorrow.  

Brendan, |writer|s only return NS_OK if they consume all the data.  This writer 
does not, and only does work line-by-line.  Since the writer now controls 
processing (e.g. it does not just consume everything on the pipe), it needs some 
way to indicate the condition when all data is not consumed because the writer 
was expecting more.

The two current states:

1. All data consumed. Status code: NS_OK
2. Not all data consumed. Status code: NS_BASE_STREAM_WOULD_BLOCK

My proposed state

3. Not all data consumed because more data is needed:  
NS_BASE_STREAM_INCOMPLETE_OPERATION.  

I am not sure if it is the right name for such a status code.  Basically what it 
allows you to do is shift the window in the pipe so that you can write data and 
immediately refill.  Would NS_BASE_STREAM_REFILL_NEEDED be better?

Sorry that I did not include the xpcom diff.  It is basically what Steve 
suggested plus a comment.  I will attach it for completeness.
hmm, I'm confused here. Why should HTTP's special handling of data (requiring 
LFs before continuing) change the pipe impl? If it needs more data to continue, 
more data should be on its way (otherwise we should treat the response as 
incomplete and fail the HTTP transaction).
Because there is a state that the pipe logic did not handle, which required me to
add this new error/status code.  The problem is that there is no way that the
|writer| method can write data (which pushes the window in the pipe forward) and
force a (re-)FILL.

Why do we need to do this?  Say that you want to read until you receive a
complete line (like the http stuff does).  Everything works until a line crosses
over the 4k pipe buffer.  Now, in order for your writer function to complete a
line, the pipe needs to be refilled (because the LF is in the next 4k blob).

Ok, what can you do?  Returning zero bytes written causes the pipe window not to
advance.  There are only two error codes that you can return: NS_OK, which will
spit you out of ReadSegments, and NS_...BLOCK, which will make you continue to
re-read the same section.  There is no combo of bytes written and error code that
will allow you to write bytes and force a refill on the pipe.

If you look at the flow in the pipe code, you will see that if any successful
write happens, control will return to the caller of ReadSegments().


I'm not buying this yet.  Please bear with me here:

First, the nsWriteLineToString function should return NS_OK if it consumed all
of the input readBuffer (fromRawSegment parameter).  The final return could look
like this:

  return (count < 0) ? NS_OK : NS_BASE_STREAM_WOULD_BLOCK;

I hope we can agree on this much.

But there's something wrong with nsPipe::nsPipeInputStream::ReadSegments, IMHO:
it blows past both a zero writeCount result (which should be asserted against,
because any writer that can't write any bytes should return ...WOULD_BLOCK); and
it expects out params to be valid after a failure return code, ...WOULD_BLOCK. 
This is unnecessary and violates the rules of XPCOM.

What should happen: a "short write" should not return ...WOULD_BLOCK, it should
return NS_OK and a writeCount < *readCount.  That condition, not ...WOULD_BLOCK,
should imply the current Fill logic.  Then, ...WOULD_BLOCK can be used as HTTP
needs to use it in this LF-out-of-window case, and we don't need a new error
code.  I'm attaching a patch.

/be
Whoops!  I didn't notice that nsWriteLineToString's count argument is PRUint32. 
The amended return statement I wrote assumed it was PRInt32, but that was a
straw man to show how NS_BASE_STREAM_WOULD_BLOCK is being abused as a success
code (implying valid out params).  Anyway, patch soon.

/be
I think that I understand what you are saying, but how does your solution handle
the case when all data was written but more is needed before ReadSegments returns?  
I think I'm agreeing with Jud's last comment (2000-12-06 09:25).

In the case where HTTP is accumulating an nsCString till LF, but has not seen an
LF yet, why won't the pipe fill up eventually with the TCP segment that contains
that LF?  Is the problem that the "writer" function in ReadSegments doesn't free
pipe buffer space for a subsequent TCP segment to fill, by invoking another
ReadSegments via nsPipe::nsPipeOutputStream::WriteFrom => ...::WriteSegments =>
nsReadFromInputStream => Read => ReadSegments?

Can anyone debug the failing situation and say where the later TCP segment, the
one with the LF, is in memory, and why it didn't get passed into ReadSegments
again?

/be
HTTP headers can span multiple lines.  nsHTTPServerListener::ParseHTTPHeader
handles this case.  It should therefore be very close to being able to handle
the case when a header spans multiple pipe segments.  Because of this, I would
prefer to try to fix this in HTTP.  In fact, as I look at the code, it seems
like HTTP does try to handle the case of a header spanning multiple pipe
segments.  Perhaps, it did at one time work.

Also, ReadSegments is a low-level API.  It gives you access to the pipe's data
segments.  Currently, we require a consumer to empty a segment before moving on
to the next segment.  Why change this?

Doug suggested that we give the consumer some way of forcing a re-FILL of
the pipe segment, but this would require an extra memcpy.  That is, the
pipe implementation would have to copy (or move) the memory from the
existing segment to the start of the new segment.  HTTP is already doing
a memcpy (when it gets data out of the segment) just to be able to parse
the HTTP header, so why add the additional memcpy?

At first, I was not happy with nsWriteLineToString returning WOULD_BLOCK.
This seemed wrong to me because there is no i/o that would block.  But, I
understand that it is used to force the pipe to only call nsWriteLineToString
once.  Whereas, if it returned NS_OK, the pipe would repeatedly call it until
all data in the segment (as well as all data in the remaining segments) was
consumed.  Since HTTP does not want to buffer up all of the headers before
parsing them, it can't have nsWriteLineToString return NS_OK.

When I get a chance, I want to try to put together a patch to fix HTTP.  Like
I said above, the code is in place, but it is buggy... let's just fix it.
I do not think that we ever allowed a consumer to not empty a segment before
moving on to the next segment.  A consumer needs to finish all data that is
handed to him before a refill occurs.  This can be thought of as an optimization
to avoid many memcpy's.

Well, if you think that the proper fix should be in http, then you will have to
maintain a buffer and state for incomplete reads.  I tend to think that this is
worse than the additional memcpy that you mentioned.


My point was that HTTP already maintains an additional buffer.

HTTP has to maintain a separate buffer in order to handle long
headers.  That is, it needs to have the entire header in a single
buffer.  Currently, HTTP gives up if that buffer grows above 60k
(assumes this to correspond to an evil server).  The problem with
using the pipe's segment as this buffer is that the pipe segment
is not large enough (4k right?), so I don't see how HTTP can avoid
maintaining a separate header buffer.  Keep in mind that in most
cases, the HTTP header buffer is going to be small... < 80 bytes.

My patch for this makes nsWriteLineToString return NS_OK if it has
not found a LF in the remaining portion of the segment.  I haven't
tested it yet, but I believe this should work.  After all, if it
returns NS_OK, then the pipe implementation will call me back with
the next segment.  If there is not a next segment, then ParseHTTPHeader
will have to wait until the next OnDataAvailable, which it already
knows how to do.
I'm with darin as far as http needing a buffer in which to accumulate very long
integral protocol strings goes, but nsPipe::nsPipeInputStream::ReadSegments is
still busted.  I'll cough up a patch as soon as I can.

/be
The patch I just attached seems to fix the HTTP headers problem.  The
assertion in imagelib is still showing up though.

Assertion failure: ic->clients == NULL, at ilclient.cpp:647

I think that this might be a different problem all together.  Pam may
know best what is going on here.

If it is a different problem then we should really open up a new bug.
I wouldn't conclude so quickly that it is a different problem.  Although my 
patch violates the abstraction that we want to preserve for pipes, it does 
allow the site to come up without any assertion.  That suggests that there might 
be something in the way you are fixing the problem that is causing the 
assertion.
I would agree... except that the assertion was not reached until I closed
the window.  It had already finished loading by this time.  I will give
it another try though just to make sure that it never happens while the
page is still loading.  Then I think we could conclude that these are
separate problems.
But that can be explained by the fact that what was loaded was incorrect due to 
the way the patch is handling crossing over buffer boundaries.
I have to retract something that I said.  I just tested out my patch again and 
it turns out that it asserts and crashes same as the other patches presented 
here.  It takes a while for the page to finish loading and the crash to occur, 
so I guess I just didn't wait long enough when I tried it before.
I get the crash in NS6 as well, provided the cookies are already set using
a patched mozilla.  This is not conclusive, but it does suggest that this
crash might be another bug altogether.  So, I'd like to get my patch checked
in, and then open up a new bug for the crash.  Can I get a review on the patch?

I've noticed the crash in two different cases: 1) after resizing the window,
and 2) after closing the window.
I would now agree that the crash sounds like a different bug altogether.

As far as when you get it, I get it consistently when the page finishes loading 
(as indicated by the throbber), which takes a long time.  Could it be that your 
page didn't finish loading at the time you tried your other activities (resizing 
the window, closing the window), in which case you simply aborted the loading, 
causing the loading-done handlers to fire?
Sounds likely... I also noticed that all HTTP network activity had long since
stopped.  So whatever is keeping the throbber going is most likely not
something having to do with HTTP (although HTTP could still be involved, since
data could very well be coming from the cache instead of the net).
r=dougt (according to him) on the 12/7 patch.
sr=mscott for Darin's 12/7 patch.
Landed 12/7 patch on the trunk.
marking this bug fixed... i opened up bug 62881 for the imagelib fatal
assertion.
Component: Cookies → Networking
I am not sure how this got assigned to me and reopened...
Assignee: dougt → darin
doug, I assigned it to you on 12-5 at your request.  Looks like you never 
reassigned it to darin when he took it over.  Furthermore, looks like darin 
never marked it fixed although he said he was doing so on 12-14.
my mistake... the changes did land on the trunk.  marking FIXED (for real this
time).
Status: NEW → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
Keywords: mozilla0.8
QA Contact: tever → benc
Component: Networking → Networking: HTTP
QA Contact: benc → tever
