Closed Bug 936685 Opened 11 years ago Closed 11 years ago

nsHalfOpenSockets do not fully release nsSocketTransport objects when abandoned

Categories

(Core :: Networking: HTTP, defect)

25 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla28
Tracking Status
firefox25 --- wontfix
firefox26 --- wontfix
firefox27 --- fixed
firefox28 --- fixed
firefox-esr17 --- wontfix
firefox-esr24 --- wontfix

People

(Reporter: sworkman, Assigned: sworkman)

References

Details

(Whiteboard: [qa-])

Attachments

(1 file)

Bug 853423 uncovered a leak of several objects surrouding nsHttpConnectionMgr::nsHalfOpenSocket (see bug 853423 comment 25). Local observations suggest that not releasing the nsSocketTransport object and associated input and output streams can keep the socket alive even after it has been abandoned.

This was able to be reproduced using netwerk/test/unit/test_seer.js (xpcshell) and toolkit/mozapps/extensions/test/browser/browser_list.js (mochitest-browser).

In test_seer.js, there was no localhost http server, so connections were refused. Even after the nsHalfOpenSocket was abandoned by nsHttpConnectionMgr, the nsSocketTransport tried to recover from the failed connection attempts by retrying with the next address in the DNS host record. In addition, the nsHalfOpenSocket was kept alive by the nsSocketTransport as its event sink and security callback object. Other objects were similarly kept from deleting. So, when test_seer.js finished up, the leak was detected.

In browser_list.js, the leak was observed because of the backup transport objects created in nsHalfOpenSocket.

So, the solution is to carry out some more cleanup in nsHttpConnectionMgr::nsHalfOpenSocket::Abandon; specifically, cleaning up the mutual references of nsHalfOpenSocket, nsSocketTransport and the input and output streams.
Added some cleanup to nsHalfOpenSocket::Abandon.

-- Remove mutual references for nsHalfOpenSocket and nsSocketTransport for main transport and backup.
-- Remove mutual references for nsHalfOpenSocket and its output stream and backup.
-- Null out references to input stream and backup (input stream should have no references to the half open socket - these are only set when the connection is established and socket and stream ownership moves to nsHttpConnection).

Also added a few comments for quick reading in the future.
Attachment #829569 - Flags: review?(mcmanus)
Comment on attachment 829569 [details] [diff] [review]
v1.0 Cleanup mutual references held between nsHalfOpenSocket and nsSocketTransport in nsHalfOpenSocket::Abandon

Review of attachment 829569 [details] [diff] [review]:
-----------------------------------------------------------------

steve this is really great. This is potentially a socket resource leak - not just a memory one. Theoretically it could lead to networking lockups, but it would seem pretty hard to get to that extreme. nonetheless, maybe after a week or so we should move this up at least to aurora it has potential for fixing a serious problem even if we can't show it.
Attachment #829569 - Flags: review?(mcmanus) → review+
Thanks Pat. Yeah, I was wondering about that too, especially after thinking about lame-network stuff recently.

On inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/78cd3e278127

I'll land bug 853423 later - I'd like the patches to be isolated in case one of them needs to be backed out.
https://hg.mozilla.org/mozilla-central/rev/78cd3e278127
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Comment on attachment 829569 [details] [diff] [review]
v1.0 Cleanup mutual references held between nsHalfOpenSocket and nsSocketTransport in nsHalfOpenSocket::Abandon

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
-- Server is unreachable (network connection down, bad DNS record etc.) and load is canceled (e.g. user cancels page load).
 
User impact if declined:
-- Potential memory leaks
-- File descriptors could be held unnecessarily which may lead to connection issues.

Testing completed (on m-c, etc.): 
-- On m-c for a while now.

Risk to taking this patch (and alternatives if risky):
-- No known risk.
-- No known alternative.

String or IDL/UUID changes made by this patch:
-- None.
Attachment #829569 - Flags: approval-mozilla-aurora?
Steve, given 853423  which is blocking this big is only on Firefox 28, unclear is this patch in needed on aurora. Can you help ? Anything I am missing here ?
Flags: needinfo?(sworkman)
:bajaj - this is a long standing resource leak (~20 releases) of a networking socket on an error condition. it blocked another feature which made the tests show it more clearly but it has always been there. I've updated the status fields to reflect that.

Leaks like this can lead to networking requiring a restart ("I gotta restart firefox to reach anything") which we don't hear about often but do occasionally hear about.. so imo this is worth backporting at least one release.
Bhavana: what Pat said :) This patch was blocking bug 853423, not the other way round.
Flags: needinfo?(sworkman)
Comment on attachment 829569 [details] [diff] [review]
v1.0 Cleanup mutual references held between nsHalfOpenSocket and nsSocketTransport in nsHalfOpenSocket::Abandon

low risk win, looks good to land.
Attachment #829569 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Patrick McManus [:mcmanus] from comment #8)
> :bajaj - this is a long standing resource leak (~20 releases) of a
> networking socket on an error condition. it blocked another feature which
> made the tests show it more clearly but it has always been there. I've
> updated the status fields to reflect that.
> 
> Leaks like this can lead to networking requiring a restart ("I gotta restart
> firefox to reach anything") which we don't hear about often but do
> occasionally hear about.. so imo this is worth backporting at least one
> release.

(In reply to Steve Workman [:sworkman] from comment #9)
> Bhavana: what Pat said :) This patch was blocking bug 853423, not the other
> way round.

Thanks for the clarification. Just approved :) Please add verifyme if their is any QA help needed here.
Thanks Bhavana! I think it should be fine without QA.
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: