Closed Bug 242393 Opened 21 years ago Closed 20 years ago

M17 FF09x crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]

Categories

(Core :: Networking: HTTP, defect)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla1.8alpha5

People

(Reporter: spider, Assigned: Biesinger)

References

()

Details

(4 keywords)

Crash Data

Attachments

(6 files, 2 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040502 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040502 After oppening page "http://themes.kde.org/index.php?page=1", browser show that page (text, images, ...) and after it crash. Reproducible: Always Steps to Reproduce: 1.Open Mozilla 2.Try to open "http://themes.kde.org/index.php?page=1" Actual Results: Mozilla crash.
can you post a Talkback ID for this crash ?
Keywords: crash, stackwanted
Reporter, as well as giving the talkback ID, can you also please try a newer build. Your build is a bit old.
I'm downloading this build of Mozilla today (03/05/2004), where I can find new build ? I can't send Talkback ID becouse I behind proxy.
WFM Using build 040502...
(In reply to comment #3) > I can't send Talkback ID becouse I behind proxy. See instructions here to setup a proxy: http://www.mozilla.org/quality/qfa.html I'm afraid it cannot go through an authenticated proxy (bug 36545).
WFM build 20040502, also on Windows 2000
I'm using Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040502
Same problem occur when I try to open http://www.sunfreeware.com/
did you succeed in setting up proxy as per comment 5 ?
Yes I sent feedback for "themes.kde.org"
can you now mention the Talkback ID associated to this crash by running mozilla/components/talkback/talkback ? There should be a TB ID in the form "TB12345H" for the report you submitted.
(In reply to comment #11) TalcBack IDs TB37715E TB37713Q TB36870M
Assignee: general → darin
Component: Browser-General → Networking: HTTP
Keywords: stackwanted
QA Contact: general → core.networking.http
Summary: crash after dowloading page → crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]
Attached file TB36870M talkback data
Similar crashes have been topcrashers for the past few M17 beta release and continue to show up in Mozilla 1.7 final.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: topcrash
Summary: crash after dowloading page [@ nsHttpChannel::OnDataAvailable ] → M17 crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]
I'll try to get to this for 1.8, but my plate is already overflowing :-(
Keywords: helpwanted
Target Milestone: --- → mozilla1.8alpha2
hmm, that stack trace indicates: nsresult rv = mListener->OnDataAvailable(this, mListenerContext, input, mLogicalOffset, count);
*** Bug 245557 has been marked as a duplicate of this bug. ***
oh, and there's a nullcheck of mListener above that, so it's unlikely that it's null...
*** Bug 249664 has been marked as a duplicate of this bug. ***
Target Milestone: mozilla1.8alpha2 → mozilla1.8beta
*** Bug 252625 has been marked as a duplicate of this bug. ***
*** Bug 253355 has been marked as a duplicate of this bug. ***
I cannot reproduce the problem on themes.kde.org, but this fixes paypal with privoxy, the problem is on netwerk/protocol/http/src/nsHttpChannel.cpp line 3611, mResponseHead->ContentLength(); sometimes mResponseHead is null so de-referencing it cause a crash. My patch workaround the problem but it can be in another place. I don't known too much of firefox code to find it, I'll add some additional debug.
From the looks of the suggested patch, it would appear that this crash was caused by the patch for bug 240053.
(In reply to comment #24) > From the looks of the suggested patch, it would appear that this crash was > caused by the patch for bug 240053. Yes, and looking on the patch for bug 240053, my patch can be "usable" ;)
*** Bug 250884 has been marked as a duplicate of this bug. ***
Gianluigi: I'm having a hard time understanding how mResponseHead could be null at this point in the code. I'm not able to reproduce the crash either. If you can reproduce it, can you try capturing a HTTP log? Instructions are here: http://www.mozilla.org/projects/netlib/http/http-debugging.html Please attach the log file to this bug report or if it is too large, feel free to send it to me directly. Thanks!
Attached file Debug Report
I've isolated the crash to this link: https://paypalssl.doubleclick.net/adj/paypal.us/ The crash can be reproduced using squid + privoxy, I can provide sample configurations for this. I'll try to get the code produced by squid + privoxy (the out is ssl encrypted ;() but I think the problem is mainly related to the connection
Ops missed the link in my previous post: https://paypalssl.doubleclick.net/adj/paypal.us/
The link is in debug log, I dunno why this form strips links, anyway I've found some intresting stuffs: nsInputStreamPump::mStreamLength is an unsigned type initializated to PR_UINT32_MAX, but nsInputStreamPump::Init has such assignment: mStreamLength = (PRInt32) streamLen; and since in this case streamLen is -1 a cast will transform it in max unsigned value and in: nsInputStreamPump::OnStateTransfer() there is a check: if (avail + mStreamOffset > mStreamLength) avail = mStreamLength - mStreamOffset; that is always false if mStreamLength is -1 or max unsigned since the type is unsigned. I've not yet discovered how this interact for the bug, but the first time I get the link firefox says document contain no data, the second time, it follows the way to start the transfer, I think it can be related the the cache. My opinion is apply my first patch is always more secure and rewiew the use of Unsigned/Signed ints and the incompatibility for checks
thank you so much for the log file! from the log file it almost looks like the response to a SSL proxy CONNECT is the 200 OK response from the origin server! there should be an intermediate 200 OK response from the proxy server itself. this might indicate a bug in squid + privoxy. clearly, mozilla is not handling the malformed response correctly. it looks like the error code, corresponding to reading premature EOF, is not making its way back to the nsHttpChannel, which causes it to go down the path of assuming it got a successful response from the server even though it did not. i should be able to simulate this situation and trigger the crash on my end.
Status: NEW → ASSIGNED
ok, i can repro this crash by constructing a simple web server that just spits out a response like this: "HTTP/1.0 200 OK\r\n" "Content-Type: text/plain\r\n" "Content-Length: 11\r\n" "\r\n" "hello world" if i tell the browser to use this test web server as a SSL proxy, then mozilla will crash when loading any HTTPS web site.
(In reply to comment #30) > that is always false if mStreamLength is -1 or max unsigned since the type is > unsigned. This seems to be working as designed - if no explicit stream length was given (i.e. -1 / PR_UINT32_MAX), we should read as much as possible, i.e. not limit avail.
> This seems to be working as designed - if no explicit stream length was given indeed :-)
Adding FF09x to summary for tracking since this is also a topcrasher with Firefox 0.9.x and to make sure to take care of this on the Aviary when we have a fix.
OS: Windows 2000 → All
Summary: M17 crash after dowloading page [@ nsHttpChannel::OnDataAvailable ] → M17 FF09x crash after dowloading page [@ nsHttpChannel::OnDataAvailable ]
*** Bug 257477 has been marked as a duplicate of this bug. ***
Flags: blocking1.8a4? → blocking1.8a4+
nsHttpResponseHead * nsHttpTransaction::TakeResponseHead() { if (!mHaveAllHeaders) return nsnull; this condition is true...
which means that ProcessResponse is never called since mResponseHead is null, a lot of stuff in CallOnStartRequest is skipped too (this is probably ok, since onstartr is always called) So anyway... This means that onstartrequest returns success, even though it has no response head! obviously, this means that oda will be called.
Attached patch patch (obsolete) — Splinter Review
Assignee: darin → cbiesinger
Comment on attachment 159995 [details] [diff] [review] patch - I suspect ProxyStartSSL failure needs to be handled, not asserted - I'm unsure whether the warning in TakeResponseHead should stay; I certainly don't hit it in normal browsing
Attachment #159995 - Flags: review?(darin)
Comment on attachment 159995 [details] [diff] [review] patch >Index: src/nsHttpChannel.cpp > if (mResponseHead) > return ProcessResponse(); >+ else if (NS_SUCCEEDED(mStatus)) { >+ LOG(("Cancelling channel due to lack of response head")); >+ NS_WARNING("No response head in OnStartRequest"); >+ Cancel(NS_ERROR_UNEXPECTED); >+ } A couple nits here: 1. no need for |else| after |return|. 2. no need to check NS_SUCCEEDED(mStatus) since we wouldn't be here if that weren't the case. Also, you mentioned over IRC that the transaction's status is a failure code in this case. So, wouldn't it make sense to "cancel" the channel with that status code? Or, notice at the top of OnStartRequest we get the status of the request if the channel appears to have a successful status code and has not been canceled. How about we get the status of the transaction if the request's status is likewise successful? The real bug here is that we are not setting the status of the request (the nsInputStreamPump) properly when we detect an error. Somehow we have already caused the nsInputStreamPump to think that it has data even though we don't have all of the response headers. If we take a patch like this I still think we should look for a better solution to the underlying problem. >+ NS_ASSERTION(mResponseHead, "No response head in ODA!!"); perhaps this assertion should be moved down to where mResponseHead is actually used. >Index: src/nsHttpConnection.cpp >+ if (!responseHead) { // XXX can this ever be null? > LOG(("nothing to do\n")); > return NS_OK; > } It would appear that |responseHead| can never be null. How about changing this to an NS_ASSERTION instead. >+ nsresult rv = ProxyStartSSL(); >+ NS_ASSERTION(NS_SUCCEEDED(rv), "ProxyStartSSL FAILED!"); >+ LOG(("ProxyStartSSL returned 0x%x\n", rv)); The fact that ProxyStartSSL failed is not a programming error, so the assertion isn't really appropriate. How about this instead: nsresult rv = ProxyStartSSL(); if (NS_FAILED(rv)) LOG(("ProxyStartSSL failed [rv=%x]\n", rv)); >Index: src/nsHttpTransaction.cpp >+ if (!mHaveAllHeaders) { >+ NS_WARNING("Trying to get response head, but lack headers"); NS_WARNING("response headers not available or incomplete"); So, I'm fine with the gist of this patch as I said on IRC. Just revise per my comments here and I think we should be good to go. Thanks for creating the patch!
Attachment #159995 - Flags: review?(darin) → review-
not going to hold alpha for this.
Flags: blocking1.8a4+ → blocking1.8a4-
(In reply to comment #41) > The real bug here is that we are not setting the status of the request > (the nsInputStreamPump) properly when we detect an error. Somehow we > have already caused the nsInputStreamPump to think that it has data > even though we don't have all of the response headers. well, it does have data. using your testcase from above, it has the "hello world" data. hm... I need to investigate why NSS did not read it. maybe it used ReadSegments, discovered it was no SSL handshake, and indicated it read zero bytes. so maybe the solution is to discard all data once we discover that nss failed. I'll try that... > >+ NS_ASSERTION(mResponseHead, "No response head in ODA!!"); > > perhaps this assertion should be moved down to where mResponseHead > is actually used. I left it here because I was thinking that it is in any case an error if we don't have a responsehead by now. > It would appear that |responseHead| can never be null. How about > changing this to an NS_ASSERTION instead. yeah, I should have checked that. will make that change. > The fact that ProxyStartSSL failed is not a programming error, so the > assertion isn't really appropriate. How about this instead: Indeed, I did what the code a bit down did: // XXX what if this fails -- need to handle this error NS_ASSERTION(NS_SUCCEEDED(rv), "mSocketOut->AsyncWait failed"); but I suppose that's a bad excuse :)
darin: see last comment for my response to your review comments
>How about we get the status of >the transaction if the request's status is likewise successful? this does not actually work to fix the crash - oda just checks mCancelled to return early; in other cases it continues and still references the responsehead.
Attached patch patch v2 (obsolete) — Splinter Review
Attachment #159995 - Attachment is obsolete: true
Comment on attachment 160016 [details] [diff] [review] patch v2 I have a better patch...
Attachment #160016 - Attachment is obsolete: true
Attachment #160016 - Flags: review?(darin)
Attached patch patch v3Splinter Review
this fixes the real problem here. with this patch, we don't write any data into the pipe if there is no content yet.
Comment on attachment 160255 [details] [diff] [review] patch v3 this part is the real bugfix: rv = HandleContentStart(); if (NS_FAILED(rv)) return rv; + // Do not write content to the pipe if we don't have any content yet + if (!mDidContentStart) + return NS_OK;
Attachment #160255 - Flags: review?(darin)
Comment on attachment 160255 [details] [diff] [review] patch v3 >Index: nsHttpTransaction.cpp >+ // Do not write content to the pipe if we don't have any content yet >+ if (!mDidContentStart) >+ return NS_OK; this comment sounds a little odd to me. we have content, we just want to discard it because we aren't ready to start streaming yet. maybe this: // Do not write content to the pipe if we haven't started streaming yet r=darin
Attachment #160255 - Flags: review?(darin) → review+
Attachment #160255 - Flags: superreview?(bzbarsky)
Target Milestone: mozilla1.8beta → mozilla1.8alpha5
Comment on attachment 160255 [details] [diff] [review] patch v3 sr=bzbarsky
Attachment #160255 - Flags: superreview?(bzbarsky) → superreview+
Checking in netwerk/protocol/http/src/nsHttpChannel.cpp; /cvsroot/mozilla/netwerk/protocol/http/src/nsHttpChannel.cpp,v <-- nsHttpChannel.cpp new revision: 1.221; previous revision: 1.220 done Checking in netwerk/protocol/http/src/nsHttpConnection.cpp; /cvsroot/mozilla/netwerk/protocol/http/src/nsHttpConnection.cpp,v <-- nsHttpConnection.cpp new revision: 1.53; previous revision: 1.52 done Checking in netwerk/protocol/http/src/nsHttpTransaction.cpp; /cvsroot/mozilla/netwerk/protocol/http/src/nsHttpTransaction.cpp,v <-- nsHttpTransaction.cpp new revision: 1.82; previous revision: 1.81 done
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Keywords: helpwanted
Resolution: --- → FIXED
The bug says "M17 FF09x" - does it mean it's branch safe? If yes, it would certainly be nice to have it landed on branches. I'm asking because I wouldn't like it to be forgotten, now that it's marked fixed and noone mentioned the branches yet.
Attachment #160255 - Flags: approval1.7.x?
Attachment #160255 - Flags: approval-aviary?
Comment on attachment 160255 [details] [diff] [review] patch v3 a=asa for branches checkins.
Attachment #160255 - Flags: approval1.7.x?
Attachment #160255 - Flags: approval1.7.x+
Attachment #160255 - Flags: approval-aviary?
Attachment #160255 - Flags: approval-aviary+
Attached patch branch patchSplinter Review
patch that applies cleanly to the branches. there was a small conflict there
fixed on aviary branch and 1.7 branch.
*** Bug 274437 has been marked as a duplicate of this bug. ***
*** Bug 260906 has been marked as a duplicate of this bug. ***
Crash Signature: [@ nsHttpChannel::OnDataAvailable ]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: