Closed Bug 480619 Opened 16 years ago Closed 15 years ago

Firefox does not load CSS and images when loading a secure (SSL, https) page for the first time (or after ‘clear private data’) and requires refresh to load properly.

Categories

(Core :: Networking, defect)

defect
Not set
major

Tracking


VERIFIED FIXED
mozilla1.9.2a1
Tracking Status
status1.9.2 --- beta1-fixed
blocking1.9.1 --- .4+
status1.9.1 --- .4-fixed

People

(Reporter: bclock, Assigned: bjarne)

References


Details

(Keywords: verified1.9.0.15, verified1.9.1, verified1.9.2)

Attachments

(5 files, 8 obsolete files)

99.45 KB, application/zip
Details
11.11 KB, text/plain
Details
5.91 KB, patch
Details | Diff | Splinter Review
1.69 KB, patch
beltzner
: approval1.9.1-
Details | Diff | Splinter Review
3.35 KB, patch
Details | Diff | Splinter Review
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5

After you navigate away from the secure site you are attempting to view, 'clear private data' (including Cache, Cookies, Offline Website Data, and Authenticated Sessions), and then visit that secure page again (possibly for the first time), roughly 50% of the time the page loads without CSS and images (you may need to repeat this process a few times to see the issue). Initially, I thought this was an internal server problem related to calling an external CSS file. However, everyone who has tested this on home machines (outside of our office) gets the same results. I have also tried several workarounds, including: embedding the CSS (which ended with more of the page loading, but still not all of it), moving images to a different secure server (no change), and calling the external CSS twice (no change). The only way we have found to ensure the page loads properly is to detect the browser and have the page load twice (essentially a refresh) for visitors using Firefox.

Also, I have found that when first attempting to load specific secure images, nothing happens upon hitting Enter or clicking the green Go arrow to start loading the page. For example, if after you navigate away from the secure site (in this case Google) you 'clear private data' (including Cache, Cookies, Offline Website Data, and Authenticated Sessions), put https://www.google.com/accounts/adwords/images/adwords_logo.gif into the address bar and then hit Enter or the green Go arrow, every so often the browser will just sit idle (you may need to repeat this process a few times to see the issue). After placing your cursor in the address bar and hitting Enter again, or hitting the green Go button again, the image will then load.
I think this is somehow related to the errors that occur during the load of a full page. I've also seen this happen to entire webpages such as https://hitchingpost.readyhosting.com/secure/contact.htm.

Reproducible: Sometimes

Steps to Reproduce:
1. In Firefox, navigate away from the site on which you will be testing a secure page.
2. "Clear Private Data…" (including Cache, Cookies, Offline Website Data, and Authenticated Sessions).
3. Visit a secure page (especially one that relies heavily on CSS design).

Actual Results: Webpage loads, but without CSS and images.
Expected Results: Full webpage load, with CSS and images.
Version: unspecified → 3.0 Branch
Do you get this in Firefox Safe Mode? http://support.mozilla.com/en-US/kb/Safe+Mode
Yes, I get the same behavior.
Component: General → Networking
Product: Firefox → Core
QA Contact: general → networking
Version: 3.0 Branch → 1.9.0 Branch
(In reply to comment #0)
> Gecko/2008120122 Firefox/3.0.5
When (i.e. from which Fx version) did your problem start to occur frequently? After an upgrade from a previous 3.0.x to the current 3.0.5? Or after an upgrade from Fx 2 to Fx 3? If the latter, reduce the following value to 2. Does the frequency of the problem decline?
> network.http.max-persistent-connections-per-server (default: Fx 2=2, Fx 3=6)
P.S. – Can anyone duplicate what my peers and I are experiencing here?

Matti – I've attached a networking log. Note, I took the following steps to get the error.
1. Opened Firefox (my homepage is google.com).
2. Went to the secure URL noted in the bug info (https://signup.live.com/signup.aspx). The error did not occur.
3. Went back to my homepage.
4. Cleared my private data (including Cache, Cookies, Offline Website Data, and Authenticated Sessions).
5. (ERROR) Went back to the secure URL noted. The error occurred here (the browser did not load CSS and images).
6. Exited the browser.

WADA – I found the error in 3.0.5 toward the end of January. I'm not exactly sure when this work machine was upgraded to 3.0.5 (I'll check with our guys here to see if we have a log of that upgrade). I also have a laptop at home with Firefox 2 on it, and I am able to reproduce the error in version 2 as well.
WADA – I just realized what you were saying about reducing that value to 2 in the config. So I reduced that value to 2 in Fx 3.0.5 and now the error seems to be happening 100% of the time.
(In reply to comment #7)
> So I reduced that value to 2 in Fx 3.0.5 and now the error seems to be happening 100% of the time.
The IIS server looks tolerant of persistent connections=6. Please change back to the default.

(In reply to comment #7)
>(snip)
> 3. Went back to my homepage.
> 4. Cleared my private data (including Cache, Cookies, Offline Website Data, and Authenticated Sessions).
> 5. (ERROR) Went back to the secure URL noted. The error occurred here (browser did not load CSS and images).
>(snip)
> https://secure.wlxrs.com/yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css

There is no log of "200 OK" for the second HTTP GET of the CSS file (Step 5, back just after clearing the cache). What's wrong?

(A) First HTTP GET in the log (Step 2, before clearing the cache). A log entry for "200 OK" exists.
> Line-6262 : [328140]: GET /yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css HTTP/1.1
> Line-6263 : [328140]: Host: secure.wlxrs.com
>(snip)
> Line-7366 :HTTP/1.x 200 OK
> Line-7367 :Content-Length: 25570
> Line-7368 :Content-Type: text/css
>(snip)
> Line-7371 :Server: Microsoft-IIS/6.0
>(snip)

(B) Second HTTP GET in the log (Step 5, back just after clearing the cache). No log entry for "200 OK" exists.
> https://secure.wlxrs.com/yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css
> Line-18951 : GET /yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css HTTP/1.1
> Line-18952 : Host: secure.wlxrs.com
>(snip)
> No log of "200 OK"

The phenomenon (CSS is not applied) is reproduced by Fx 3.0.6 on MS Windows, with a single tab only, with the following two pages.
(1) Mozilla Firefox Central: http://en-us.www.mozilla.com/en-US/firefox/central/
(2) The Sign Up page of Windows Live. Go to http://home.live.com/ (Windows Live Home), click "Sign Up" => jumps to the next page. I used this URL (with search string).
> https://signup.live.com/signup.aspx?id=251248&ru=http%3a%2f%2fhome.live.com%2fdefault.aspx%3fnew%3dtrue&cp=2&mkt=en-US&rollrs=12&lic=1

The Sign Up page uses the following CSS file (no other CSS is used):
> https://secure.wlxrs.com/yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css
And the HTTP log for some images has the following Referer (the images are loaded by the CSS):
> Referer: https://secure.wlxrs.com/yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css

If the cookie is cleared, the client can't send a cookie. The server (IIS) / server application (PHP?) / JScript behaviour seems to be: when a cookie is not sent by the client,
(a) JScript doesn't generate <link rel="stylesheet"> in some situations, or
(b) JScript fails to generate <link rel> in some situations, or
(c) <link rel="stylesheet"> is generated and an HTTP GET is issued by Fx,
  (c-1) but the server application doesn't return the ".css" file (the server application ignores the HTTP GET for the .css file), or
  (c-2) but a delay occurs in returning the ".css" file data (timeout).
So, if the cache is cleared and (c) occurs, CSS can't be used. In your log, it looks like (c) occurs, although a Firefox CSS-caching issue is still suspected. ("No log of 200 OK for .css" doesn't always mean "no response from the server really".)

By the way, an image/jpeg file is sent with "Content-Encoding: gzip" by the server. This kind of header sometimes indicates a careless site (or cache server) configuration.
> Line-14013 : 480[3283c0]: HTTP/1.1 200 OK
> Line-14014 : 480[3283c0]: Cache-Control: no-cache
> Line-14015 : 480[3283c0]: Connection: close
> Line-14016 : 480[3283c0]: Date: Tue, 03 Mar 2009 05:04:19 GMT
> Line-14017 : 480[3283c0]: Pragma: no-cache
> Line-14018 : 480[3283c0]: Content-Type: image/jpeg
> Line-14019 : 480[3283c0]: Expires: Tue, 03 Mar 2009 05:03:19 GMT
> Line-14020 : 480[3283c0]: Server: Microsoft-IIS/6.0
> Line-14021 : 480[3283c0]: PPServer: PPV: 30 H: BAYIDSHIPV1C06 V: 0
> Line-14022 : 480[3283c0]: P3P: CP="DSP CUR OTPi IND OTRi ONL FIN"
> Line-14023 : 480[3283c0]: Set-Cookie: HIPChallenge=44fXR7y0nCeTSLMMU62AWoJ!ybeLq0SSqNZU*PF8BpU8QYybSfIUaswdLeGHMdrLkn3dO7qD0ydf0hDvvR2xU8VDKTumqkYODkYoViyVwK4SIWEA7ckYkal84F5CvslylANTYBH0lo354uoCUelrPRuQ6HAaSOLzC7KyeuZWrO3GgOGNeJGShkyzRRa!LdldjFHK2jVr*BVnA7rgUwfESkzx3npwHBsb6USAmlG5Bg6mJe5Tlu96Tzg791FmmvmQ!*RtAsaUnE9zrkPb*zipR7H5fVmMGIJx54BngYaF2kY*23zo4sdYRjH0v6L0fYaIaxkuwnd8zDy8CDyu!ensWdj2p7rbZzSeOR; domain=.live.com;path=/;version=1
> Line-14024 : 480[3283c0]: Content-Encoding: gzip
> Line-14025 : 480[3283c0]: Vary: Accept-Encoding
> Line-14026 : 480[3283c0]: Transfer-Encoding: chunked
Correction. The <link rel> was hard-coded in the HTML source. Sorry for the wrong comments.
> ...</script><link rel="StyleSheet" href="https://secure.wlxrs.com/yDLmcF2FzGTerKJw2Us!q0tI0qMxoipUtbqnHdX2IWd-W!MqS7tFT1jrjehtb5LNlvZlJHl-Mrc/Base/14.3.8033/hig+signup.css" type="text/css"/>
> </head>
(... Why is there no line break before "<link rel"? ...)

The phenomenon was observed with Fx latest-trunk (2008/3/03 build) too. When the problem (CSS not used) occurs after "Clear Recent History", no entry for the ".css" was found in about:cache (Memory Cache Device), even though "Done" is displayed in the status bar. An entry for the ".css" was created after Reload, and the CSS was applied to the page. Was loading canceled/terminated in the middle of the document load process due to an internal error? Is "Content-Encoding: gzip" of the HTML relevant? Is "Content-Type: text/html" for an XHTML document relevant? ( http://www.w3.org/TR/xhtml-media-types/#media-types )
> Line-212: 1480[3283c0]: HTTP/1.1 200 OK
> Line-213: 1480[3283c0]: Cache-Control: private, max-age=0
>(snip)
> Line-216: 1480[3283c0]: Content-Type: text/html; charset=UTF-8
> Line-217: 1480[3283c0]: Content-Encoding: gzip
>(snip)
> Note: This server uses "on-the-fly compression" for HTML,
> or mis-uses "Content-Encoding: gzip".
> (uses it as if it were "Content-Transfer-Encoding", which is not defined by HTTP 1.1)
This site returns a JavaScript file with "Content-Encoding: gzip" as well, in addition to the text/html.
> Line-20169 : 0[328140]: uri=https://secure.shared.live.com/8Cvn5p7UVQ6YTKP9Tiy5Gw/js/omnitureH2.js
>(snip)
> Line-20184 : 0[328140]: GET /8Cvn5p7UVQ6YTKP9Tiy5Gw/js/omnitureH2.js HTTP/1.1
> Line-20184 : 0[328140]: Host: secure.shared.live.com
>(snip)
> Line-20995 : 1480[3283c0]: HTTP/1.1 200 OK
> Line-20996 : 1480[3283c0]: Content-Type: application/x-javascript
>(snip)
> Line-20999 : 1480[3283c0]: Server: Microsoft-IIS/6.0
>(snip)
> Line-21002 : 1480[3283c0]: Content-Encoding: gzip
>(snip)
> Line-21005 : 1480[3283c0]: Content-Length: 7196
>(snip)

Is Bug 363109 relevant?
> Bug 363109 gzipped CSS and JS over SSL fails when reloading
A "script loader" (if such a function exists) version of Bug 314689?
> Bug 314689 nsCSSLoader::Stop should cancel the necko channels
WADA – I really appreciate your analysis! I must admit I’m a little out of my league with your findings. I’ll continue to follow along, as I am highly interested in the outcome/resolution on this, but I’m not sure what else I can do to help. Of course if there is something I can do, I will. Where do we go from here? Does anyone else have input? Is this 'Ben's Bug'?
"Content-Encoding: gzip" seems to be irrelevant. Quick test result with Fx latest-trunk on MS Win-XP SP3.

(0) Open the Error Console.
(1) Load this page (Bug 480619). text/html, no Content-Encoding: gzip.
  (1-a) Execute Reload, Shift+Reload several times.
  (1-b) Access the pages linked in step (4).
(2) Clear Recent History, Remove: my entire history, select all items.
(3)
  (3-a) Shift+Reload
  (3-b) Move to the URL bar, Enter
  (3-c) Move to the URL bar, insert a blank and remove it, click the arrow icon (Go to the address in the location bar)
  => Nothing is done.
After several retries of (3-a)/(3-b)/(3-c), this page was reloaded, and the following errors were displayed in the Error Console.
> Warning: Unknown property '-moz-opacity'. Declaration dropped.
> Source File: https://bugzilla.mozilla.org/skins/standard/yui/autocomplete.css Line: 7
>(Some warnings on CSS)
> Error: keywordAutoComplete.textboxFocusEvent is null
> Source File: https://bugzilla.mozilla.org/show_bug.cgi?id=480619 Line: 1503
(4) "Save Link As" or "Ctrl+Click" of https links. HTTPS links in this bug:
(A) Two attachment links of this bug
  https://bugzilla.mozilla.org/attachment.cgi?id=nnnnnn => 302 moved => https://bug480619.bugzilla.mozilla.org/attachment.cgi?id=nnnnnn
  (A-1) First attachment, application/zip, no Content-Encoding: gzip
  (A-2) Second attachment, text/plain, Content-Encoding: gzip
(B) Two links to MS's http site
  (B-1) Link to ".js", application/x-javascript, Content-Encoding: gzip
  (B-2) Link to ".css", text/css, no Content-Encoding: gzip
  (4-a) "Save Link As" => The following dialog was displayed.
> Download Error
> The download cannot be saved because an unknown error occurred.
> Please try again.
  (4-b) Ctrl+Click => Nothing is done (a blank page is displayed)
=> After several retries of (4-a)/(4-b), the download or display succeeded.
The phenomena in step (4) could not be observed for the http links in this bug.

Confirming, based on my test result in Comment #8.
Status: UNCONFIRMED → NEW
Ever confirmed: true
s/(B) Two links to MS's http site/(B) Two links to MS's https site/
Is this flaw limited to 1.9.0? I don't see any comments saying anyone's tried it on 1.9.1
Flags: wanted1.9.0.x+
Flags: blocking1.9.1?
So for the second GET mentioned in comment 8, here's the issue:

1480[3283c0]: PR_Write returned [n=-1]
1480[3283c0]: ErrorAccordingToNSPR [in=-5929 out=80004005]
1480[3283c0]: nsSocketTransport::OnMsgOutputClosed [this=3dce8e0 reason=80004005]
1480[3283c0]: ReadSegments returned [rv=0 read=0 sock-cond=80004005]
1480[3283c0]: nsHttpConnection::CloseTransaction[this=2e38f40 trans=3a6b080 reason=80004005]
1480[3283c0]: nsHttpTransaction::Close [this=3a6b080 reason=80004005]

0x80004005 is NS_ERROR_FAILURE. Not helpful... But -5929 is PR_SOCKET_SHUTDOWN_ERROR. I thought we had an existing bug on this sort of thing, where already-closed sockets got reused.... Ehsan ran into it with private browsing, no?
(In reply to comment #15)
> Is this flaw limited to 1.9.0? I don't see any comments saying anyone's tried it on 1.9.1
Sorry for my unclear comments on the Fx version. "Fx latest-trunk" in my comments means the nightly trunk build downloaded from:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/
Example: "Fx latest-trunk" 2009/3/19 build
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090319 Minefield/3.6a1pre
I've changed Version: to Trunk.
Version: 1.9.0 Branch → Trunk
The following are open bugs with "HTTP && Socket" in the summary (changed within one year). Boris, Bug 470055? Is Bug 448575 relevant too?
> Bug 255141 PSM attempts to access socket notificationCallbacks on main thread ...(snip)
> Bug 448575 nsHttpChannel silently eats socket errors that occur after the response body is being received
> Bug 450078 HTTP1.1 Keep-Alive mechanism and having a small WEB-Server with only 2 open sockets. -> Not all elements are shown.
> Bug 470055 pkix_HttpCertStore_FindSocketConnection reuses closed socket, OCSP fails
> Bug 481301 ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread' [@ nsHttpConnection::ResumeRecv]
The bug I'm thinking of had a summary more like "SSL sites not loading"...
(In reply to comment #19) > The bug I'm thinking of had a summary more like "SSL sites not loading"... Yeah, it's bug 463256. See bug 463256 comment 21 for my analysis on the problem. This is caused by the call to nsISecretDecoderRing::LogoutAndTeardown which happens in the clear private data dialog as well. Bug 463256 comment 26 mentions the reason for this behavior. We worked around it in private browsing by using the offline mode to shut down Necko sockets, but I'm not sure what would be a proper fix in this case.
It's not a very common user action in Firefox, so I don't think this needs to block (especially if the fix is difficult) but would be willing to reconsider if someone comes up with a compelling use case. I'll let jst make a call about blocking. It's not great.
I'd really like to see this fixed for 1.9.1, but I can't say I'd hold the release for it. Bjarne, can you spend some time looking into this? If this is fixable in a safe way we'd consider taking the fix for 1.9.1 still. If you don't have time, please let me know.
Assignee: nobody → bjarne
Flags: wanted1.9.1+
Flags: blocking1.9.1?
Flags: blocking1.9.1-
Keywords: relnote
Attached patch Initial, brute force approach (obsolete) — Splinter Review
Applying the same trick as for bug #463256 seems to make this work. Mochitests also run fine (except for the usual ones on my test-box). Question is whether this approach is too heavy. It is probably possible to be more light-handed by e.g. re-initializing nsSocketTransportService in the "net:clear-active-logins" case in nsHttpHandler::Observe(). I haven't tried that yet but I'll try it out on Monday unless the current fix is acceptable.
Attachment #374473 - Flags: review?(jst)
Oh - forgot to mention that the patch is for trunk...
(In reply to comment #24)
> Created an attachment (id=374473) [details]
> Initial, brute force approach
>
> Applying the same trick as for bug #463256 seems to make this work. Mochitests
> also run fine (except for the usual ones on my test-box).

This is less than ideal. I still feel bad about bug 463256... :-) For a sample of a problem caused by this approach, see bug 488772. We're really looking for a way around this, if possible. Come to think of it, it's actually worse than the private browsing case, because this would cause the download manager / add-on manager, etc. to prompt if there are any downloads in progress.

> Question is whether this approach is too heavy. It is probably possible to be
> more light-handed by e.g. re-initializing nsSocketTransportService in the
> "net:clear-active-logins" case in nsHttpHandler::Observe(). I haven't tried
> that yet but I'll try it out on Monday unless the current fix is acceptable.

I'd very much like to see how this turns out. Please let us know if that solves the problem in a clean way, so that I can move the private browsing service to do a similar thing instead of the current hack. Thanks!
Comment on attachment 374473 [details] [diff] [review] Initial, brute force approach Clearing review request while waiting to see if the smaller hammer approach is acceptable here :)
Attachment #374473 - Flags: review?(jst)
Attached patch Lighter version (obsolete) — Splinter Review
This version is "lighter", but it still shuts down and restarts the STS thread.

(In reply to comment #26)
> Come to think of it, it's actually worse than the private browsing case,
> because this would cause the download manager / add-on manager, etc. to prompt
> if there are any downloads in progress.

The user requested to clear private data, including "authenticated sessions", and should IMO not be shell-shocked if active connections are also terminated. :) Maybe we could even make the GUI indicate this by e.g. changing the text in the dialogue from "Authenticated sessions" to "Authenticated sessions and active connections"?

However, I agree with bug #488772 in that it would be more elegant to shut down SSL connections only. In order to implement this we need something like the two loops in nsSocketTransportService::Run() at lines 588++ in nsSocketTransportService2.cpp, but we also need to figure out from the SocketContext whether a socket is SSL. Then we must put this into a method DetachSSLSockets() or similar and either expose it via a suitable API or implement an extra action in nsSocketTransportService::Observe() (I prefer the latter). No black magic, but it might be a little tricky to deduce SSL from the SocketContext object. I'll give this a whack on Monday unless the current patch, with the above arguments about users being held responsible for their actions, is acceptable. :)
Attachment #374473 - Attachment is obsolete: true
Attachment #374616 - Flags: review?(jst)
See also bug 474364 where a similar thing happens on restart. I do not clear private data on shutdown, so those pages are still in the cache.
Status: NEW → ASSIGNED
Attached patch Even lighter version... (obsolete) — Splinter Review
I think this is about as light as we can get... Upon receiving the "net:clear-active-logins" msg, nsHttpHandler dispatches a new msg with the topic "net:detach-nss-sockets". nsSocketTransportService reacts to this by dispatching an event which will detach sockets in the NSS layer (running on the socket thread). We could make this simpler by letting nsSocketTransportService listen directly to "net:clear-active-logins", but if someone else wants to use this mechanism to close down NSS sockets, we need to keep the two topics separated in order to avoid undesired side-effects. A few things about this implementation should be mentioned (input is appreciated):
- The string "net:detach-nss-sockets" is hardcoded - is there a natural place for constants like this? It is shared between nsHttpHandler and nsSocketTransportService...
- I think it's safe for nsDetachNssSocketsEvent to keep the reference to nsSocketTransportService as a raw pointer, but someone with stronger xpcom-fu may not share this opinion... :]
- There should be a better way to filter the sockets than comparing the layer name with "NSS layer" - ideas are welcome here
Any ideas about how to set up a testcase for this are also appreciated...
Attachment #374616 - Attachment is obsolete: true
Attachment #374817 - Flags: review?(jst)
Attachment #374616 - Flags: review?(jst)
Attachment #374817 - Flags: review?(jst) → review?(jduell.mcbugs)
Comment on attachment 374817 [details] [diff] [review] Even lighter version... I'm afraid I'm not a good reviewer for this change. Jason, wanna dig into this one?
Thanks for the patch, it looks really great! :-) (In reply to comment #30) > Any ideas about how to set up a testcase for this is also appreciated... Hmm, I would assume that you would open two channels, one to a http:// location and one to a https:// location, and send out "net:clear-active-logins" and verify that the http channel lives on but the https one aborts. Does that make sense?
> Hmm, I would assume that you would open two channels, one to a http:// location > and one to a https:// location, and send out "net:clear-active-logins" and > verify that the http channel lives on but the https one aborts. Does that make > sense? Makes sense, but how do I check that a channel "lives"? Persistent connections?
(In reply to comment #36)
> > Hmm, I would assume that you would open two channels, one to a http:// location
> > and one to a https:// location, and send out "net:clear-active-logins" and
> > verify that the http channel lives on but the https one aborts. Does that make
> > sense?
>
> Makes sense, but how do I check that a channel "lives"? Persistent connections?

I'm not really sure. A dumb trick would be to use the stuff added in bug 396226 and feed some different data through the connection after observing the notification, and on the HTTP client part expect (or don't expect) that data. Of course there may be a better way to do that using nsIChannel wizardry. :-)
Not really clear to me how to test this... :) I'll think about it some more - maybe the server written for bug #486769 could be used, but I need to figure out how to support https on that one first. Any chance to get a review on the patch while we're thinking about testing?
Comment on attachment 374817 [details] [diff] [review] Even lighter version... Looks good. A much lighter touch than the previous fixes :) Boris, this seems fairly straightforward, but if you're not comfortable reviewing this code, let me know and I can ping biesi.
Attachment #374817 - Flags: superreview?(bzbarsky)
Attachment #374817 - Flags: review?(jduell.mcbugs)
Attachment #374817 - Flags: review+
This seems ok, with the following nits:
1) Use nsRunnableMethod and just add a method that calls the detaching function. That will incidentally fix the fact that you're not holding a strong reference, which I do think is a bug.
2) Space after ',' when calling strcmp.
3) I assume that things work ok if the event fires after xpcom shutdown? If not, you presumably need to revoke it then.
> 1) Use nsRunnableMethod and just add a method that calls the detaching
> function. That will incidentally fix the fact that you're not holding a
> strong reference, which I do think is a bug.
Ok. Using nsRefPtr - is that ok in this situation?
> 2) Space after ',' when calling strcmp.
Right...
> 3) I assume that things work ok if the event fires after xpcom shutdown? If
> not, you presumably need to revoke it then.
Uhh... what? Is that a potential scenario? Any hints or pointers to code/tests which check for this?

There are two things I would like to change in this patch, but I'm depending on input on how to achieve them:
1) I would like to identify sockets without using the name of the socket layer (last issue in comment #30)
2) If it's possible to parametrize the call to an nsRunnableMethod, the function nsSocketTransportService::DetachNssSockets() could be dropped (making the code size smaller)
And, of course, ideas for how to write a test for this would be nice. Unfortunately, I don't think the approach from comment #37 will do the trick.
Attachment #374817 - Attachment is obsolete: true
Attachment #379101 - Flags: review?(bzbarsky)
Attachment #374817 - Flags: superreview?(bzbarsky)
> Ok. Using nsRefPtr - is that ok in this situation?
I'd use nsCOMPtr, since the type is nsIRunnable. And probably NS_NEW_RUNNABLE_METHOD.
> Uhh... what? Is that a potential scenario?
Well.. what prevents it? You put an event into an event queue. What guarantees that it will fire before we fire the xpcom-shutdown notification, if there is already an event in the queue that will try to shut down? I'm not saying it's definitely possible. I'm saying we either need to prove it's impossible or handle it. CCing Benjamin for his shutdown fu.

I have no idea how to deal with your issue (1). There is no way to parametrize nsRunnableMethod, but I think the code size is still smaller than having a separate class...
https://wiki.mozilla.org/XPCOM_Shutdown runnables can fire after xpcom-shutdown and xpcom-shutdown-threads until the event queue is cleared. The component manager should be available during this time, but various modules will have begun to shut themself down.
Why do we need special code for this, especially hardcoding NSS in the STS? In particular: we handle servers closing the connection to an idle socket, connection resets, etc. So the specific question is, I guess: why do we attempt to reuse this socket? The HTTP code attempts to make sure that the socket is still alive before reusing it:

PRBool
nsHttpConnection::CanReuse()
{
    return IsKeepAlive() &&
           (NowInSeconds() - mLastReadTime < mIdleTimeout) &&
           IsAlive();
}

And IsAlive() basically calls nsSocketTransport::IsAlive, which basically calls PR_Recv. Shouldn't that be returning an error here, making us consider the socket to be dead? Similarly, shouldn't the STS loop polling the sockets already notice that the socket is in an error state?
So, the question may be why PR_Read() called from IsAlive() succeeds while PR_Write() fails. PR_Recv() from IsAlive() calls nsSSLThread::requestRecvMsgPeek(), which does not take the PK11LoggedOut / AlreadyShutDown conditions into account (as opposed to nsSSLThread::requestWrite() and nsSSLThread::requestRead()). I'll try to follow this track for a little while and see if it resolves the original issue...
I must admit that this approach feels more elegant... :) The observation is that the call to PR_Recv() from IsAlive() is made with the parameter PR_MSG_PEEK, which translates into a call to nsSSLThread::requestRecvMsgPeek(). This patch adds logic to nsSSLThread::requestRecvMsgPeek() for certain states and sets an appropriate error code on the socket. It works nicely with the test URL for this defect, but I haven't checked the duplicated cases. Probably kaie or MisterSSL would be more appropriate reviewers, but I'll ask bz initially.
Attachment #379598 - Flags: review?(bzbarsky)
Does the poll code require fixing too? It seems like it should report an EXCEPT flag for these sockets, and if it does that then the socket transport service would already notice that the socket has an error, instead of waiting for HTTP to detect it. (and it would improve things for stuff like Chatzilla too)
Assuming the previous comment was referring to nsSSLThread::requestPoll(). This patch is a "little more aggressive" than the previous patch, since it now triggers and returns on bad conditions very early in the method call. Not sure about this, but it's worth a try... Tested with the test URL for this defect and also by browsing some bank sites which use https.
Attachment #379598 - Attachment is obsolete: true
Attachment #379680 - Flags: review?(cbiesinger)
Attachment #379598 - Flags: review?(bzbarsky)
Comment on attachment 379680 [details] [diff] [review] Added similar logic to poll-method I'm not qualified to review PSM code. But yeah, it looks like I did refer to nsSSLThread::requestPoll (via nsNSSIOLayer.cpp:nsSSLIOLayerPoll). I'm not sure the LOG in nsSocketTransport is worth keeping...
Do you still want my review on the " Updated according to comments from reviewer" diff?
Attachment #379101 - Flags: review?(bzbarsky)
Comment on attachment 379101 [details] [diff] [review] Updated according to comments from reviewer Cleared request for review
Comment on attachment 379680 [details] [diff] [review] Added similar logic to poll-method Trying new reviewer for PSM code...
Attachment #379680 - Flags: review?(kaie)
From my glance over how this bug has evolved, it sounds like the latest method proposed here might also fix the necko problems in bug 463256. I.e., we might be able to avoid shutting down and restarting the whole necko layer. Ehsan, does that sounds like a possibility?
(In reply to comment #53) > From my glance over how this bug has evolved, it sounds like the latest method > proposed here might also fix the necko problems in bug 463256. I.e., we might > be able to avoid shutting down and restarting the whole necko layer. Ehsan, > does that sounds like a possibility? Yes, it almost certainly does.
Looks like kaie is busy. Feel free to suggest reviewers for this patch...
Comment on attachment 379680 [details] [diff] [review] Added similar logic to poll-method Maybe dveditz could help out with the review here?
Attachment #379680 - Flags: review?(dveditz)
Have you e-mailed Kai?
Comment on attachment 379680 [details] [diff] [review] Added similar logic to poll-method Thanks to everyone for your research, thoughts and this patch. It looks reasonable. r=kaie
Attachment #379680 - Flags: review?(kaie) → review+
Removed LOG as suggested by biesi, added meta-information
Attachment #379680 - Attachment is obsolete: true
Attachment #379680 - Flags: review?(dveditz)
Requesting check-in to bake it on trunk. No test available - ideas for how to make one are appreciated.
Keywords: relnote, checkin-needed
Comment on attachment 381395 [details] [diff] [review] Cleaned-up version of patch ready for check-in Need trunk approval....
Attachment #381395 - Flags: approval1.9.1?
Not fixed yet. Please leave relnote as it is.
Keywords: relnote
Blocks: 496335
(In reply to comment #60)
> Requesting check-in to bake it on trunk. No test available - ideas for how to
> make one are appreciated.

Well, we support HTTPS now in the framework, right? So make an HTTPS request, do whatever clear private data does that breaks sockets, and make another request and see if that works.

Could do this on the socket transport level too - open a socket transport stream to the SSL port, do whatever clear private data does, and see if the stream got notified (e.g. the asyncWait callback got called, etc.)
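For illustration, here is a rough sketch of the socket-transport-level flow suggested above, written with plain stand-in objects because the real test needs XPCOM (the actual interfaces are nsISocketTransportService and nsISecretDecoderRing). FakeSocketTransport and its teardown method are hypothetical stand-ins for the real transport and for sdr.logoutAndTeardown():

```javascript
// Hypothetical model of the suggested test: register an asyncWait reader,
// tear the connection down, and verify the reader got notified.
function FakeSocketTransport() {
  this.readers = [];
  this.closed = false;
}
// Stand-in for nsIAsyncInputStream.asyncWait: remember the reader callback.
FakeSocketTransport.prototype.asyncWait = function (reader) {
  this.readers.push(reader);
};
// Stand-in for what "clear private data" does: sdr.logoutAndTeardown()
// kills the socket, which should notify every waiting reader.
FakeSocketTransport.prototype.teardown = function () {
  this.closed = true;
  this.readers.forEach((r) => r.onInputStreamReady(this));
};

let notified = false;
const transport = new FakeSocketTransport();
transport.asyncWait({ onInputStreamReady() { notified = true; } });
transport.teardown();
// notified is now true: the waiting reader saw the teardown
```

The test passes exactly when the reader callback fires after teardown; the bug here was that real sockets sat idle instead.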
Flags: wanted1.9.1.x?
Whiteboard: [3.5.1?]
(no longer need approval for trunk)
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Keywords: checkin-needed
Verified fixed on trunk with builds on all platforms like Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090613 Minefield/3.6a1pre (.NET CLR 3.5.30729) ID:20090613042114 Boris, can we completely automate those tests or do we also need a Litmus or Mozmill test?
Status: RESOLVED → VERIFIED
OS: Windows XP → Other
Hardware: x86 → All
Target Milestone: --- → mozilla1.9.2a1
The idea would be to have automated tests.
I described in comment 63 how an automated test could be written.
Attachment #381395 - Flags: approval1.9.1? → approval1.9.1-
Comment on attachment 381395 [details] [diff] [review] Cleaned-up version of patch ready for check-in (Not going to take this for 1.9.1, minusing the approval flag)
Flags: blocking1.9.1.1?
Whiteboard: [3.5.1?]
(In reply to comment #63)
> (In reply to comment #60)
> > Requesting check-in to bake it on trunk. No test available - ideas for how to
> > make one are appreciated.
>
> Well, we support HTTPS now in the framework, right? So make an HTTPS request,
> do whatever clear private data does that breaks sockets, and make another
> request and see if that works.
>
> Could do this on the socket transport level too - open a socket transport
> stream to the SSL port, do whatever clear private data does, and see if the
> stream got notified (e.g. the asyncWait callback got called, etc)

Would a test like this help here? <http://mxr.mozilla.org/mozilla-central/source/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_sslsite_transition.js>
CSS is also missing after reloading tabs when updating to 3.5. I opened the "Firefox 3.5 for developers" page ( https://developer.mozilla.org/en/Firefox_3.5_for_developers ), then I upgraded to 3.5. After upgrading, the opened page was reloaded without CSS. A second load in another tab was correct with CSS; a refresh in the first tab also helped to load the CSS.
Bernhard, this is more likely bug 474364.
OS: Other → All
Attachment #381395 - Flags: approval1.9.1.1?
Comment on attachment 381395 [details] [diff] [review] Cleaned-up version of patch ready for check-in See bug 496335 comment 7.
(In reply to comment #74)
> (In reply to comment #70)
> > Would a test like this help here?
> >
> > <http://mxr.mozilla.org/mozilla-central/source/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_sslsite_transition.js>
>
> Not really. What was wrong with the approach I suggested?

Hmm, I think it implements your first suggestion:

> Well, we support HTTPS now in the framework, right? So make an HTTPS request,
> do whatever clear private data does that breaks sockets, and make another
> request and see if that works.

In this case, toggling the private browsing mode is the action which breaks sockets.
That test is at a much higher level than a necko test should be. Hm... though I guess this shouldn't be a Necko test anyway, as it was a bug in PSM, so this should actually be in security/manager somewhere. My point is: this should be a test using channels/sockets and nsISecretDecoderRing, not a test that uses docshells and nsIPrivateBrowsingService. It also should probably be an xpcshell test.

>> Well, we support HTTPS now in the framework, right? So make an HTTPS request,
>> do whatever clear private data does that breaks sockets, and make another
>> request and see if that works.

> In this case, toggling the private browsing mode is the action which breaks
> sockets.

Kinda. What actually breaks sockets is logging out of/tearing down the secret decoder ring, i.e.:

let sdr = Cc["@mozilla.org/security/sdr;1"].
          getService(Ci.nsISecretDecoderRing);
sdr.logoutAndTeardown();
Attached patch On my way to a test (obsolete) — Splinter Review
Here is what I have so far. Based on comment 63, I expected at least one of the reader or writer callbacks to get called, but neither did. What am I doing wrong?
blocking1.9.1: --- → needed
Flags: wanted1.9.1.x?
Flags: wanted1.9.1.x+
Flags: blocking1.9.1.1?
Flags: blocking1.9.1.1-
Hm... I think that code should work. If you remove the logoutAndTeardown call, does that make a difference?
Attached patch Test (obsolete) — Splinter Review
OK, my mistake was to use example.org:443. That only becomes meaningful through the proxy set up by the test harness via a PAC file. For this test we should use the SSL server directly, that is, localhost:4443. Asking review from biesi to confirm that this test is testing the correct thing, and from kaie as a security peer.
Attachment #388146 - Attachment is obsolete: true
Attachment #388452 - Flags: review?(cbiesinger)
Attachment #388452 - Flags: review?(kaie)
The test server isn't guaranteed to be running on port 4443, is it?
(In reply to comment #80)
> The test server isn't guaranteed to be running on port 4443, is it?

I'm not sure if this is currently guaranteed or not, but it is hardcoded: <http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#321>
I thought we had support for our test stuff running on a different port if the one it wants is in use. Did that not land?
That's reftest-only, and also much easier there since the server is in-process. ssltunnel would need to communicate the port it chose back to the Python script to be splatched into a user.js, and that's not quite as easy. I'm confused as to why the raw port must be used, but do note that https://example.org:443 fails because it's not in the set of mappings, so naturally it wouldn't work for that choice: http://mxr.mozilla.org/mozilla-central/source/build/pgo/server-locations.txt
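For context, mappings in server-locations.txt are whitespace-separated origin/options lines; a hypothetical excerpt (illustrative, not the actual contents of the file) might look like:

```
http://mochi.test:8888   primary,privileged
https://example.com:443  privileged
```

An origin absent from this list, such as https://example.org:443 here, is never routed through ssltunnel by the harness proxy, which is why that choice fails.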
Attached patch Test (v2) (obsolete) — Splinter Review
(In reply to comment #83)
> I'm confused as to why the raw port must be used, but do note that
> https://example.org:443 fails because it's not in the set of mappings, so
> naturally it wouldn't work for that choice.

Thanks for the note. I'm using the raw port because the socket transport layer does not use the proxy PAC file. I tested and made sure that the example.com:443 combination can't be used in this test either. This version of the patch is a bit better because it makes sure that the socket gets dropped after the teardown operation.
Attachment #388452 - Attachment is obsolete: true
Attachment #389185 - Flags: review?(kaie)
Attachment #389185 - Flags: review?(cbiesinger)
Attachment #388452 - Flags: review?(kaie)
Attachment #388452 - Flags: review?(cbiesinger)
Comment on attachment 381395 [details] [diff] [review] Cleaned-up version of patch ready for check-in Please renominate when you have the tests approved.
Attachment #381395 - Flags: approval1.9.1.1? → approval1.9.1.1-
Pinging biesi and kaie for the test review...
Comment on attachment 389185 [details] [diff] [review] Test (v2) moa=kaie
Attachment #389185 - Flags: review?(kaie) → review+
biesi: ping?
Comment on attachment 389185 [details] [diff] [review] Test (v2)

Sorry for the delay.

+ onInputStreamReady: function(stream) {
+   SimpleTest.ok(tearDown, "The stream should be closed after a teardown of secure decoder ring");

You should probably check that the stream has an error state. I.e. try reading from it and see if it throws the desired exception (or just call available() and see if it throws the expected exception).

+var writer = {
+  onOutputStreamReady: function(stream) {
+  }
+};

You don't need this, do you?

+ getService(Ci.nsISecretDecoderRing).
+ logoutAndTeardown();

Can you indent these two lines, so that it's obvious that they are continuation lines?

+outStream.asyncWait(writer, 0, 1, currentThread);

No need for this. Actually, you don't even need to open the output stream.
Attachment #389185 - Flags: review?(cbiesinger) → review+
Addressed all of the review comments on the test. I have submitted this to the try server, and after a green unit test run, I'll land it on m-c.
Attachment #389185 - Attachment is obsolete: true
+ try {
+   stream.available();
+ }

Shouldn't you add a:

SimpleTest.ok(!tearDown, "Stream should be in an error state");

after this call?
(In reply to comment #92)
> + try {
> +   stream.available();
> + }
>
> Shouldn't you add a:
>
> SimpleTest.ok(!tearDown, "Stream should be in an error state");
>
> after this call?

I make sure an NS_ERROR_FAILURE is thrown in the catch block, otherwise the test will time out. Isn't that enough?
I think it's better to be explicit about the fact that an unthrown exception is a failure.
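The check being discussed can be sketched in plain JS, with stub objects standing in for the real nsIAsyncInputStream (checkTornDownStream and both stubs are hypothetical, for illustration only): available() should throw once the connection is torn down, and a call that does not throw is reported explicitly as a failure rather than left to time out.

```javascript
// Returns true when the stream is in an error state (available() throws),
// false otherwise -- the caller treats false as an explicit test failure.
function checkTornDownStream(stream) {
  try {
    stream.available();   // should throw once the connection is torn down
  } catch (e) {
    return true;          // expected; NS_ERROR_FAILURE in the real test
  }
  return false;           // no exception: stream is not in an error state
}

// Stubs: a torn-down stream throws, a live stream reports 0 bytes available.
const deadStream = { available() { throw new Error("NS_ERROR_FAILURE"); } };
const liveStream = { available() { return 0; } };
```

The explicit false branch is the point of the review comment: a silently non-throwing available() should fail the test immediately instead of hanging until the harness timeout.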
The test is failing on Linux. Here is what happens: on Linux, the sink's onTransportStatus method is never called, so the secure decoder ring is never torn down. After an amount of time which is longer than on Windows, the reader's onInputStreamReady is called, and calling available on the stream passed to it throws an NS_ERROR_NET_RESET error. Interestingly, the connection on port 4443 is actually established during the test:

ehsan@ehsan-laptop:~/moz/src$ sudo netstat -p | grep ^tcp
tcp 0 0 ehsan-laptop:8888  ehsan-laptop:37426 TIME_WAIT   -
tcp 0 0 ehsan-laptop:8888  ehsan-laptop:37424 TIME_WAIT   -
tcp 0 0 ehsan-laptop:37423 ehsan-laptop:8888  TIME_WAIT   -
tcp 0 0 ehsan-laptop:4443  ehsan-laptop:48014 ESTABLISHED 25018/ssltunnel
tcp 0 0 ehsan-laptop:48014 ehsan-laptop:4443  ESTABLISHED 25019/firefox-bin
tcp 0 0 ehsan-laptop:8888  ehsan-laptop:37422 TIME_WAIT   -
tcp 0 0 ehsan-laptop:8888  ehsan-laptop:37427 TIME_WAIT   -

Do you have any idea why the socket transport service doesn't get notified about the connection being established on Linux? This happens both on the try server and locally.
Can you try setting the transport event sink before opening the input stream? That might possibly fix the problem (and is probably what should be done anyway)
Thanks, that did fix the problem! I'll submit another try server run to make sure that the test passes on all platforms before pushing.
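To illustrate why the ordering fix above works, here is a hedged sketch with stub objects (StubTransport is hypothetical; the real calls are nsISocketTransport's setEventSink and openInputStream): a status event that fires before any sink is attached is simply never delivered.

```javascript
// Hypothetical model: the transport emits a status event when the stream
// is opened, and the event is lost if no sink has been attached yet.
function StubTransport() {
  this.sink = null;
}
StubTransport.prototype.setEventSink = function (sink) {
  this.sink = sink;
};
StubTransport.prototype.openInputStream = function () {
  // Models the connection being established as a side effect of opening
  // the stream; without a sink, the notification is silently dropped.
  if (this.sink) this.sink.onTransportStatus("STATUS_CONNECTED_TO");
};

// Wrong order: the event fires before the sink exists and is dropped.
let seenLate = false;
const t1 = new StubTransport();
t1.openInputStream();
t1.setEventSink({ onTransportStatus() { seenLate = true; } });

// Right order: sink first, then open the stream; the event is delivered.
let seenEarly = false;
const t2 = new StubTransport();
t2.setEventSink({ onTransportStatus() { seenEarly = true; } });
t2.openInputStream();
```

In the real test this race was timing-dependent (it happened to pass on Windows), which is why setting the event sink before opening the input stream is the robust order.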
Flags: in-testsuite? → in-testsuite+
Whiteboard: [test needs landing on 1.9.2]
Attachment #381395 - Flags: approval1.9.1.3?
Whiteboard: [test needs landing on 1.9.2]
Comment on attachment 381395 [details] [diff] [review] Cleaned-up version of patch ready for check-in Approved for 1.9.1.4, a=dveditz for release-drivers
Attachment #381395 - Flags: approval1.9.1.3? → approval1.9.1.4+
Attachment #381395 - Flags: approval1.9.0.15?
Comment on attachment 381395 [details] [diff] [review] Cleaned-up version of patch ready for check-in Approved for 1.9.0.15, a=dveditz for release-drivers
Attachment #381395 - Flags: approval1.9.0.15? → approval1.9.0.15+
The SSL tunnel is not available on the 1.9.0.x branch, so it is not possible to test this on that branch. Is it OK to land it there without the test?
Yes, that's fine.
Attachment 381395 [details] [diff] landed on 1.9:

Checking in security/manager/ssl/src/nsSSLThread.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsSSLThread.cpp,v <-- nsSSLThread.cpp
new revision: 1.17; previous revision: 1.16
Keywords: fixed1.9.0.15
Verified fixed for 1.9.1.4 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4pre) Gecko/20090915 Shiretoko/3.5.4pre (.NET CLR 3.5.30729).
Keywords: verified1.9.1
Verified fixed for 1.9.0.15 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.15pre) Gecko/2009091606 GranParadiso/3.0.15pre (.NET CLR 3.5.30729).
(In reply to comment #103)
> The SSL tunnel is not available on the 1.9.0.x branch, so it is not possible to
> test this on that branch. Is it OK to land it there without the test?

Is it possible to have a test for this on the trunk? In security/manager/ssl/tests/ or in content/something/tests/...
(In reply to comment #108)
> (In reply to comment #103)
> > The SSL tunnel is not available on the 1.9.0.x branch, so it is not possible to
> > test this on that branch. Is it OK to land it there without the test?
>
> Is it possible to have a test for this on the trunk? In
> security/manager/ssl/tests/ or in content/something/tests/...

I have already written and landed a test for it: <http://mxr.mozilla.org/mozilla-central/source/security/manager/ssl/tests/mochitest/bugs/test_bug480619.html>. It also exists in 1.9.1 and 1.9.2.
blocking1.9.1: needed → .4+
Already verified fixed on 1.9.2 with an alpha build (comment 66). Adding verified1.9.2 keyword.
Keywords: verified1.9.2

Wow, I was shocked when I saw that this was created 11 years ago... because I just encountered this issue here. Most of my webpages are loading twice, the first time without CSS! I've been using Firefox Quantum 68.0.1 (64-bit) on Windows 10 since yesterday, and I've imported all settings from Chrome. Please reopen the issue.

Sorry, I wasn't completely honest: Adblocker Ultimate was still enabled, even though it was disabled for the webpage :-(
When disabling that plugin, the page loaded correctly.
