Closed Bug 242393 Opened 17 years ago Closed 17 years ago

M17 FF09x crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]

Categories

(Core :: Networking: HTTP, defect)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla1.8alpha5

People

(Reporter: spider, Assigned: Biesinger)

References

()

Details

(4 keywords)

Crash Data

Attachments

(6 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040502
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040502

After oppening page "http://themes.kde.org/index.php?page=1", browser show that
page (text, images, ...) and after it crash.

Reproducible: Always
Steps to Reproduce:
1.Open Mozilla
2.Try to open "http://themes.kde.org/index.php?page=1"


Actual Results:  
Mozilla crash.
can you post a Talkback ID for this crash ?
Keywords: crash, stackwanted
Reporter, as well as giving the talkback ID, can you also please try a newer
build.  Your build is a bit old.
I'm downloading this build of Mozilla today (03/05/2004), where I can find new
build ? 
I can't send Talkback ID becouse I behind proxy.
WFM Using build 040502...
(In reply to comment #3)
> I can't send Talkback ID becouse I behind proxy.
See instructions here to setup a proxy:
http://www.mozilla.org/quality/qfa.html
I'm afraid it cannot go through an authenticated proxy (bug 36545).

WFM build 20040502, also on Windows 2000
I'm using 
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040502
Same problem occur when I try to open 
http://www.sunfreeware.com/
did you succeed in setting up proxy as per comment 5 ?

Yes I sent feedback for "themes.kde.org"
can you now mention the Talkback ID associated to this crash by running
mozilla/components/talkback/talkback ? There should be a TB ID in the form
"TB12345H" for the report you submitted.
(In reply to comment #11)
TalcBack IDs
TB37715E
TB37713Q
TB36870M 
Assignee: general → darin
Component: Browser-General → Networking: HTTP
Keywords: stackwanted
QA Contact: general → core.networking.http
Summary: crash after dowloading page → crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]
Attached file TB36870M talkback data
Similar crashes have been topcrashers for the past few M17 beta release and
continue to show up in Mozilla 1.7 final.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: topcrash
Summary: crash after dowloading page [@ nsHttpChannel::OnDataAvailable ] → M17 crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]
I'll try to get to this for 1.8, but my plate is already overflowing :-(
Keywords: helpwanted
Target Milestone: --- → mozilla1.8alpha2
hmm, that stack trace indicates:
        nsresult rv =  mListener->OnDataAvailable(this,
                                                  mListenerContext,
                                                  input,
                                                  mLogicalOffset,
                                                  count);
*** Bug 245557 has been marked as a duplicate of this bug. ***
oh, and there's a nullcheck of mListener above that, so it's unlikely that it's
null...
*** Bug 249664 has been marked as a duplicate of this bug. ***
Target Milestone: mozilla1.8alpha2 → mozilla1.8beta
*** Bug 252625 has been marked as a duplicate of this bug. ***
*** Bug 253355 has been marked as a duplicate of this bug. ***
I cannot reproduce the problem on themes.kde.org, but this fixes paypal with
privoxy, the problem is on netwerk/protocol/http/src/nsHttpChannel.cpp line
3611,
mResponseHead->ContentLength(); sometimes mResponseHead is null so
de-referencing it cause a crash. My patch workaround the problem but it can be
in another place. I don't known too much of firefox code to find it, I'll add
some additional debug.
From the looks of the suggested patch, it would appear that this crash was
caused by the patch for bug 240053.
(In reply to comment #24)
> From the looks of the suggested patch, it would appear that this crash was
> caused by the patch for bug 240053.

Yes, and looking on the patch for bug 240053, my patch can be "usable" ;)
*** Bug 250884 has been marked as a duplicate of this bug. ***
Gianluigi: I'm having a hard time understanding how mResponseHead could be null
at this point in the code.  I'm not able to reproduce the crash either.  If you
can reproduce it, can you try capturing a HTTP log?  Instructions are here:

  http://www.mozilla.org/projects/netlib/http/http-debugging.html

Please attach the log file to this bug report or if it is too large, feel free
to send it to me directly.  Thanks!
Attached file Debug Report
I've isolated the crash to this link:
https://paypalssl.doubleclick.net/adj/paypal.us/

The crash can be reproduced using squid + privoxy, I can provide sample
configurations for this.
I'll try to get the code produced by squid + privoxy (the out is ssl encrypted
;() but I think the problem is mainly related to the connection
Ops missed the link in my previous post:

https://paypalssl.doubleclick.net/adj/paypal.us/
The link is in debug log, I dunno why this form strips links, anyway I've found
some intresting stuffs:
nsInputStreamPump::mStreamLength is an unsigned type initializated to
PR_UINT32_MAX, but

nsInputStreamPump::Init
has such assignment:

mStreamLength = (PRInt32) streamLen;

and since in this case streamLen is -1 a cast will transform it in max unsigned
value

and in:
nsInputStreamPump::OnStateTransfer()
 there is a check:
      if (avail + mStreamOffset > mStreamLength)
            avail = mStreamLength - mStreamOffset;

that is always false if mStreamLength is -1 or max unsigned since the type is
unsigned.

I've not yet discovered how this interact for the bug, but the first time I get
the link firefox says document contain no data, the second time, it follows the
way to start the transfer, I think it can be related the the cache.
My opinion is apply my first patch is always more secure and rewiew the use of
Unsigned/Signed ints and the incompatibility for checks
thank you so much for the log file!

from the log file it almost looks like the response to a SSL proxy CONNECT is
the 200 OK response from the origin server!  there should be an intermediate 200
OK response from the proxy server itself.  this might indicate a bug in squid +
privoxy.  clearly, mozilla is not handling the malformed response correctly.  it
looks like the error code, corresponding to reading premature EOF, is not making
its way back to the nsHttpChannel, which causes it to go down the path of
assuming it got a successful response from the server even though it did not.

i should be able to simulate this situation and trigger the crash on my end.
Status: NEW → ASSIGNED
ok, i can repro this crash by constructing a simple web server that just spits
out a response like this:

  "HTTP/1.0 200 OK\r\n"
  "Content-Type: text/plain\r\n"
  "Content-Length: 11\r\n"
  "\r\n"
  "hello world"

if i tell the browser to use this test web server as a SSL proxy, then mozilla
will crash when loading any HTTPS web site.
(In reply to comment #30)
> that is always false if mStreamLength is -1 or max unsigned since the type is
> unsigned.

This seems to be working as designed - if no explicit stream length was given
(i.e. -1 / PR_UINT32_MAX), we should read as much as possible, i.e. not limit avail.


> This seems to be working as designed - if no explicit stream length was given

indeed :-)
Adding FF09x to summary for tracking since this is also a topcrasher with
Firefox 0.9.x and to make sure to take care of this on the Aviary when we have a
fix.
OS: Windows 2000 → All
Summary: M17 crash after dowloading page [@ nsHttpChannel::OnDataAvailable ] → M17 FF09x crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]
*** Bug 257477 has been marked as a duplicate of this bug. ***
Flags: blocking1.8a4?
Flags: blocking1.8a4? → blocking1.8a4+
nsHttpResponseHead *
nsHttpTransaction::TakeResponseHead()
{
    if (!mHaveAllHeaders)
        return nsnull;

this condition is true...
which means that ProcessResponse is never called

since mResponseHead is null, a lot of stuff in CallOnStartRequest is skipped too
(this is probably ok, since onstartr is always called)

So anyway... This means that onstartrequest returns success, even though it has
no response head! obviously, this means that oda will be called.
Attached patch patch (obsolete) — Splinter Review
Assignee: darin → cbiesinger
Comment on attachment 159995 [details] [diff] [review]
patch

- I suspect ProxyStartSSL failure needs to be handled, not asserted

- I'm unsure whether the warning in TakeResponseHead should stay; I certainly
don't hit it in normal browsing
Attachment #159995 - Flags: review?(darin)
Comment on attachment 159995 [details] [diff] [review]
patch

>Index: src/nsHttpChannel.cpp

>         if (mResponseHead)
>             return ProcessResponse();
>+        else if (NS_SUCCEEDED(mStatus)) {
>+            LOG(("Cancelling channel due to lack of response head"));
>+            NS_WARNING("No response head in OnStartRequest");
>+            Cancel(NS_ERROR_UNEXPECTED);
>+        }

A couple nits here:

  1. no need for |else| after |return|.

  2. no need to check NS_SUCCEEDED(mStatus) since we wouldn't be here
     if that weren't the case.

Also, you mentioned over IRC that the transaction's status is a failure
code in this case.  So, wouldn't it make sense to "cancel" the channel
with that status code?	Or, notice at the top of OnStartRequest we get
the status of the request if the channel appears to have a successful
status code and has not been canceled.	How about we get the status of
the transaction if the request's status is likewise successful?

The real bug here is that we are not setting the status of the request
(the nsInputStreamPump) properly when we detect an error.  Somehow we
have already caused the nsInputStreamPump to think that it has data 
even though we don't have all of the response headers.

If we take a patch like this I still think we should look for a better
solution to the underlying problem.


>+    NS_ASSERTION(mResponseHead, "No response head in ODA!!");

perhaps this assertion should be moved down to where mResponseHead
is actually used.


>Index: src/nsHttpConnection.cpp

>+    if (!responseHead) { // XXX can this ever be null?
>         LOG(("nothing to do\n"));
>         return NS_OK;
>     }

It would appear that |responseHead| can never be null.	How about
changing this to an NS_ASSERTION instead.


>+            nsresult rv = ProxyStartSSL();
>+            NS_ASSERTION(NS_SUCCEEDED(rv), "ProxyStartSSL FAILED!");
>+            LOG(("ProxyStartSSL returned 0x%x\n", rv));

The fact that ProxyStartSSL failed is not a programming error, so the
assertion isn't really appropriate.  How about this instead:

  nsresult rv = ProxyStartSSL();
  if (NS_FAILED(rv))
    LOG(("ProxyStartSSL failed [rv=%x]\n", rv));



>Index: src/nsHttpTransaction.cpp

>+    if (!mHaveAllHeaders) {
>+        NS_WARNING("Trying to get response head, but lack headers");

	  NS_WARNING("response headers not available or incomplete");


So, I'm fine with the gist of this patch as I said on IRC.  Just revise
per my comments here and I think we should be good to go.  Thanks for
creating the patch!
Attachment #159995 - Flags: review?(darin) → review-
not going to hold alpha for this.
Flags: blocking1.8a4+ → blocking1.8a4-
(In reply to comment #41)
> The real bug here is that we are not setting the status of the request
> (the nsInputStreamPump) properly when we detect an error.  Somehow we
> have already caused the nsInputStreamPump to think that it has data 
> even though we don't have all of the response headers.

well, it does have data. using your testcase from above, it has the "hello
world" data. hm... I need to investigate why NSS did not read it. maybe it used
ReadSegments, discovered it was no SSL handshake, and indicated it read zero
bytes. so maybe the solution is to discard all data once we discover that nss
failed. I'll try that...

> >+    NS_ASSERTION(mResponseHead, "No response head in ODA!!");
> 
> perhaps this assertion should be moved down to where mResponseHead
> is actually used.

I left it here because I was thinking that it is in any case an error if we
don't have a responsehead by now.

> It would appear that |responseHead| can never be null.	How about
> changing this to an NS_ASSERTION instead.

yeah, I should have checked that. will make that change.

> The fact that ProxyStartSSL failed is not a programming error, so the
> assertion isn't really appropriate.  How about this instead:

Indeed, I did what the code a bit down did:
            // XXX what if this fails -- need to handle this error
            NS_ASSERTION(NS_SUCCEEDED(rv), "mSocketOut->AsyncWait failed");
but I suppose that's a bad excuse :)

darin: see last comment for my response to your review comments
>How about we get the status of
>the transaction if the request's status is likewise successful?

this does not actually work to fix the crash - oda just checks mCancelled to
return early; in other cases it continues and still references the responsehead.
Attached patch patch v2 (obsolete) — Splinter Review
Attachment #159995 - Attachment is obsolete: true
Comment on attachment 160016 [details] [diff] [review]
patch v2

I have a better patch...
Attachment #160016 - Attachment is obsolete: true
Attachment #160016 - Flags: review?(darin)
Attached patch patch v3Splinter Review
this fixes the real problem here. with this patch, we don't write any data into
the pipe if there is no content yet.
Comment on attachment 160255 [details] [diff] [review]
patch v3

this part is the real bugfix:

	 rv = HandleContentStart();
	 if (NS_FAILED(rv)) return rv;
+	 // Do not write content to the pipe if we don't have any content yet
+	 if (!mDidContentStart)
+	   return NS_OK;
Attachment #160255 - Flags: review?(darin)
Comment on attachment 160255 [details] [diff] [review]
patch v3

>Index: nsHttpTransaction.cpp

>+        // Do not write content to the pipe if we don't have any content yet
>+        if (!mDidContentStart)
>+          return NS_OK;

this comment sounds a little odd to me.  we have content, we just
want to discard it because we aren't ready to start streaming yet.

maybe this:

  // Do not write content to the pipe if we haven't started streaming yet


r=darin
Attachment #160255 - Flags: review?(darin) → review+
Attachment #160255 - Flags: superreview?(bzbarsky)
Target Milestone: mozilla1.8beta → mozilla1.8alpha5
Comment on attachment 160255 [details] [diff] [review]
patch v3

sr=bzbarsky
Attachment #160255 - Flags: superreview?(bzbarsky) → superreview+
Checking in netwerk/protocol/http/src/nsHttpChannel.cpp;
/cvsroot/mozilla/netwerk/protocol/http/src/nsHttpChannel.cpp,v  <-- 
nsHttpChannel.cpp
new revision: 1.221; previous revision: 1.220
done
Checking in netwerk/protocol/http/src/nsHttpConnection.cpp;
/cvsroot/mozilla/netwerk/protocol/http/src/nsHttpConnection.cpp,v  <-- 
nsHttpConnection.cpp
new revision: 1.53; previous revision: 1.52
done
Checking in netwerk/protocol/http/src/nsHttpTransaction.cpp;
/cvsroot/mozilla/netwerk/protocol/http/src/nsHttpTransaction.cpp,v  <-- 
nsHttpTransaction.cpp
new revision: 1.82; previous revision: 1.81
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Keywords: helpwanted
Resolution: --- → FIXED
The bug says "M17 FF09x" - does it mean it's branch safe? If yes, it would
certainly be nice to have it landed on branches.

I'm asking because I wouldn't like it to be forgotten, now that it's marked
fixed and noone mentioned the branches yet.
Attachment #160255 - Flags: approval1.7.x?
Attachment #160255 - Flags: approval-aviary?
Comment on attachment 160255 [details] [diff] [review]
patch v3

a=asa for branches checkins.
Attachment #160255 - Flags: approval1.7.x?
Attachment #160255 - Flags: approval1.7.x+
Attachment #160255 - Flags: approval-aviary?
Attachment #160255 - Flags: approval-aviary+
Attached patch branch patchSplinter Review
patch that applies cleanly to the branches. there was a small conflict there
fixed on aviary branch and 1.7 branch.
*** Bug 274437 has been marked as a duplicate of this bug. ***
*** Bug 260906 has been marked as a duplicate of this bug. ***
Crash Signature: [@ nsHttpChannel::OnDataAvailable ]
You need to log in before you can comment on or make changes to this bug.