Closed
Bug 97833
Opened 24 years ago
Closed 24 years ago
http should periodically prune it's idle connection list of dead connections
Categories
(Core :: Networking: HTTP, defect, P4)
Core
Networking: HTTP
Tracking
()
RESOLVED
FIXED
mozilla0.9.7
People
(Reporter: darin.moz, Assigned: darin.moz)
References
Details
(Whiteboard: [patch needs r=])
Attachments
(5 files)
4.99 KB,
patch
|
Details | Diff | Splinter Review | |
6.53 KB,
patch
|
Details | Diff | Splinter Review | |
6.56 KB,
patch
|
Details | Diff | Splinter Review | |
6.72 KB,
patch
|
Details | Diff | Splinter Review | |
6.43 KB,
patch
|
dougt
:
review+
darin.moz
:
superreview+
|
Details | Diff | Splinter Review |
http should periodically prune it's idle connection list of dead connections
Assignee | ||
Updated•24 years ago
|
Priority: -- → P4
Target Milestone: --- → mozilla0.9.5
Comment 2•24 years ago
|
||
*** Bug 104138 has been marked as a duplicate of this bug. ***
Assignee | ||
Updated•24 years ago
|
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Comment 3•24 years ago
|
||
Someone who has the ability: please add "mozilla1.0" to the Keywords section.
IMO, it's absolutely critical that this bug be fixed before mozilla1.0 is
released; see Bug 104138 (the complete comments) for an explanation as to why.
Assignee | ||
Comment 4•24 years ago
|
||
the fact that this bug is targeted for mozilla 0.9.7 means that it should be
fixed by mozilla 1.0, so no reason to nominate it.
Status: NEW → ASSIGNED
Comment 5•24 years ago
|
||
The reason why I wanted to nominate this bug for mozilla1.0 is because it has
already been "pushed back" multiple releases. I didn't want to see it pushed
back beyond the 1.0 release.
Regardless, as long as the bug is fixed before 1.0 is released, then I (and the
rest of the world, which has had to suffer years of Netscape's completely
brain-damaged handling of TCP sockets) will be happy.
Comment 6•24 years ago
|
||
Create a timer that will trigger once a minute and walk the available
connection list to get rid of dead connections.
Assignee | ||
Comment 7•24 years ago
|
||
Comment on attachment 59536 [details] [diff] [review]
Attaching a quick fix patch to this problem based on a timer.
thanks for putting together this patch... here's some comments:
>Index: nsHttpConnection.h
>@@ -82,6 +82,7 @@
> void DontReuse() { mKeepAliveMask = PR_FALSE;
> mKeepAlive = PR_FALSE;
> mIdleTimeout = 0; }
>+ PRBool IsAlive();
>
> nsAHttpTransaction *Transaction() { return mTransaction; }
> nsHttpConnectionInfo *ConnectionInfo() { return mConnectionInfo; }
>@@ -102,7 +103,6 @@
>
> nsresult SetupSSLProxyConnect();
>
>- PRBool IsAlive();
you want nsHttpConnection::CanReuse, not nsHttpConnection::IsAlive.
>Index: nsHttpHandler.cpp
>+ , terminating(PR_FALSE)
>+ , mTimer(nsnull)
terminating should be mTerminating
>@@ -256,6 +260,13 @@
> observerSvc->AddObserver(this, "profile-before-change", PR_TRUE);
> observerSvc->AddObserver(this, "session-logout", PR_TRUE);
> }
>+
>+ mTimer = do_CreateInstance("@mozilla.org/timer;1", &rv);
>+ if (NS_FAILED(rv))
>+ return rv;
failure to create this timer should not be a fatal error.
>+ mTimer->Init(DeadConnectionCleanupCB, this, 60*1000,
>+ NS_PRIORITY_NORMAL, NS_TYPE_REPEATING_SLACK);
>+
>+nsresult
>+nsHttpHandler::PurgeDeadConnections()
>+{
>+ if (terminating == PR_FALSE) {
>+ nsAutoLock lock(mConnectionLock);
>+ for (PRInt32 i = 0; i < mIdleConnections.Count(); i++) {
>+ nsHttpConnection* conn = (nsHttpConnection*) mIdleConnections[i];
>+ if (conn && (PR_FALSE == conn->IsAlive())) {
>+ // Dead and idle connection; purge it
>+ mIdleConnections.RemoveElement(conn);
>+ }
>+ }
>+ }
>+ return NS_OK;
>+}
how about:
if (conn && !conn->CanReuse()) {
you must also NS_RELEASE(conn) otherwise it will not actually be destroyed.
>+void
>+nsHttpHandler::DeadConnectionCleanupCB(nsITimer* aTimer, void* aClosure)
>+{
>+ nsHttpHandler* me = NS_STATIC_CAST(nsHttpHandler*, aClosure);
>+ me->PurgeDeadConnections();
>+}
>+
>+
looks like you have an extra newline here.
>Index: nsHttpHandler.h
>===================================================================
>RCS file: /cvsroot/mozilla/netwerk/protocol/http/src/nsHttpHandler.h,v
>retrieving revision 1.20
>diff -u -r1.20 nsHttpHandler.h
>--- nsHttpHandler.h 2001/11/04 05:46:06 1.20
>+++ nsHttpHandler.h 2001/11/28 16:30:57
>@@ -42,6 +42,7 @@
> #include "nsWeakReference.h"
> #include "nsVoidArray.h"
> #include "nsIIDNService.h"
>+#include "timer/nsITimer.h"
should be
#include "nsITimer.h"
add "timer" to the REQUIRES variable in Makefile.in and makefile.win
you may also need to edit the netwerk/macbuild/netwerk.mcp to get this to work
on the mac.
>@@ -76,6 +77,9 @@
> // Returns a pointer to the one and only HTTP handler
> static nsHttpHandler *get() { return mGlobalInstance; }
>
>+ // CB function for timer to clean up dead http connections
>+ static void nsHttpHandler::DeadConnectionCleanupCB(nsITimer* aTimer, void* aClosure);
>+
indentation problem here.
>@@ -123,6 +127,9 @@
> // longer than mMaxRequestDelay.
> nsresult ProcessTransactionQ();
>
>+ // Purge dead connections
>+ nsresult PurgeDeadConnections();
>+
indentation problem here.
>@@ -237,6 +245,7 @@
> nsCOMPtr<nsICacheSession> mCacheSession_MEM;
> PRUint32 mLastUniqueID;
> PRUint32 mSessionStartTime;
>+ PRBool terminating;
how about making this a PRPackedBool and placing it near the end with the other
PRPackedBool's.
on some platforms these are packed together in one word to reduce the size of
the data structure.
Attachment #59536 -
Flags: needs-work+
Comment 8•24 years ago
|
||
Darrin
Have incorporated ur comments. Hope I got them all.
I was also forced to make changes to mozilla/netwerk/build/Makefile.in and
mozilla/netwerk/build/Makefile.win once I moved from #include
"timer/nsITimer.h" to #include "nsITimer.h".
can u help out on the MAC part of it please.
Assignee | ||
Comment 9•24 years ago
|
||
Comment on attachment 59672 [details] [diff] [review]
comments incorporated into patch
> LOG(("Deleting nsHttpHandler [this=%x]\n", this));
>+ mterminating = PR_TRUE;
prefer |mTerminating| over |mterminating|
>
>+ if (mTimer != nsnull) {
>+ mTimer->Cancel();
>+ }
and just a minor nit: please do away with the extra braces if possible.
these two points are really just style nits, but as they say: "when
in rome, stay in rome." ;-)
>+ mTimer = do_CreateInstance("@mozilla.org/timer;1", &rv);
>+ // If we were unable to create mTimer, we will still work.
>+ // But connections are not aggressively reaped.
>+ if (mTimer) {
>+ // Reap once a minute
>+ mTimer->Init(DeadConnectionCleanupCB, this, 60*1000,
>+ NS_PRIORITY_NORMAL, NS_TYPE_REPEATING_SLACK);
>+ }
>+
no need to capture |rv| if you're not going to use it.
> // Returns a pointer to the one and only HTTP handler
> static nsHttpHandler *get() { return mGlobalInstance; }
>
>+ // CB function for timer to clean up dead http connections
>+ static void nsHttpHandler::DeadConnectionCleanupCB(nsITimer* aTimer, void* aClosure);
>+
this should be moved into the private section of the nsHttpHandler class.
>
>+ // Purge dead connections
>+ nsresult PurgeDeadConnections();
>+
this comment seems a bit redundant. how about leaving it out since it doesn't
add anything.
Attachment #59672 -
Flags: needs-work+
Assignee | ||
Comment 10•24 years ago
|
||
once we have a good patch, i'll give it a try on my mac, and see what project
changes might need to be made.
Comment 11•24 years ago
|
||
Darin
done.
also added a fix for a possible race condition.
Assignee | ||
Comment 12•24 years ago
|
||
Comment on attachment 59679 [details] [diff] [review]
darin's comments + fix for a race condition
> nsHttpHandler::~nsHttpHandler()
> {
> LOG(("Deleting nsHttpHandler [this=%x]\n", this));
>+ mTerminating = PR_TRUE;
>
>+ // scope for the nsAutoLock
>+ {
>+ nsAutoLock lock(mConnectionLock);
>+ if (mTimer != nsnull)
>+ mTimer->Cancel();
>+ }
>+
i don't understand why this lock is necessary. please explain.
>+nsresult
>+nsHttpHandler::PurgeDeadConnections()
>+{
>+ if (mTerminating == PR_FALSE) {
>+ nsAutoLock lock(mConnectionLock);
>+ if (mTerminating == PR_FALSE) {
>+ for (PRInt32 i = 0; i < mIdleConnections.Count(); i++) {
>+ nsHttpConnection* conn = (nsHttpConnection*) mIdleConnections[i];
>+ if (conn && !conn->CanReuse()) {
>+ // Dead and idle connection; purge it
>+ mIdleConnections.RemoveElement(conn);
>+ NS_RELEASE(conn);
>+ }
>+ }
>+ }
>+ }
>+ return NS_OK;
>+}
indentation seems all screwed up here.
Attachment #59679 -
Flags: needs-work+
Comment 13•24 years ago
|
||
If we have both the destructor and the nsHttpHandler::PurgeDeadConnections()
then there can exist a race. I'am assuming (given my limited knowledge) of this
code base, that the destructor for HttpHandler would get invoked only on
shutdown. So this race can exist only at shutdown.
Imagine the following scenario.
Timer has fired and nsHttpHandler::PurgeDeadConnections() is getting rid of the
dead connections in the list. In this time, the destructor has also fired as a
result of shutdown. Then the desturctor can also be touching the same data
structures.
Hence it is required to mark terminating in the destructor. Lock and cancel the
timer. The destructor will not get the lock if the timer has fired and
PurgeDeadConnections is executing. Also notice the usual double check pattern
for checks on mTerminating in PurgeDeadConnections.
With this we ensure correctness of nsHttpHandler irrespective of the underlying
architecture.
ex Was the underlying architecture designed to always cancel all timers on
shutdown so that such races do not exist etc etc.
Assignee | ||
Comment 14•24 years ago
|
||
we have bigger problems if the destructor is running while the timer function is
still live... afterall it has a weak reference to the handler.
we should instead make the http handler observe the xpcom-shutdown event. see
nsIObserverService in xpcom/ds. on xpcom-shutdown we can cancel the timer.
since the handler is a service (hence owned by the service manager) we can be
sure that the handler won't be destroyed prior to xpcom-shutdown.
Comment 15•24 years ago
|
||
Assignee | ||
Comment 16•24 years ago
|
||
Comment on attachment 59828 [details] [diff] [review]
with changes for xpcom-shutdown event handling
>Index: protocol/http/src/nsHttpHandler.cpp
>+ , mTerminating(PR_FALSE)
>+ , mTimer(nsnull)
> {
> NS_INIT_ISUPPORTS();
looks like we don't need mTerminating anymore.
> nsHttpHandler::~nsHttpHandler()
> {
>+ // We do not deal with the timer cancellation in the destructor since
>+ // it is taken care of in xpcom shutdown event in the Observe method.
>+
> LOG(("Deleting nsHttpHandler [this=%x]\n", this));
>+ mTerminating = PR_TRUE;
please fix indentation here.
>+ else if (!nsCRT::strcmp(topic, "xpcom-shutdown")) {
>+ if (mTimer != nsnull) {
>+ mTimer->Cancel();
>+ mTimer = nsnull;
>+ }
>+ }
should probably use NS_XPCOM_SHUTDOWN_OBSERVER_ID instead of "xpcom-shutdown"
just clean up these few things, and it looks like this patch'll do the trick.
Attachment #59828 -
Flags: needs-work+
Comment 17•24 years ago
|
||
removed mTerminating since it is not required.
indentation changes
Assignee | ||
Comment 18•24 years ago
|
||
Comment on attachment 60119 [details] [diff] [review]
incorporating darins latest comments
looks good, sr=darin
Attachment #60119 -
Flags: superreview+
Assignee | ||
Comment 19•24 years ago
|
||
actually, i thought a bit more about this patch, and i think that it might be
better to use a 15 second timeout instead of a 60 second timeout. this is
because many web servers frequently send keep-alive timeouts of 15 seconds.
moreover the cost of the timer callback is small enough that running it once
every 15 seconds should not in any degrade user perceived performance.
Assignee | ||
Updated•24 years ago
|
Whiteboard: [patch needs r=]
Assignee | ||
Comment 20•24 years ago
|
||
spoke with vinay about the timeout, and we decided to change it to 15 seconds...
will do that when i check this in.
Comment 21•24 years ago
|
||
Comment on attachment 60119 [details] [diff] [review]
incorporating darins latest comments
looks fine
Attachment #60119 -
Flags: review+
Assignee | ||
Comment 22•24 years ago
|
||
patch checked in... vinay: thx :)
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 23•24 years ago
|
||
Ideally, Mozilla should be checking for closed connections on the order of once
every 1 second, or maybe once every 5 seconds.
However, if the only way to discover closed connections is to walk a tree
structure and poll each connection individually, then the performance penalty
associated with those actions is probably too great to walk the tree every 1 second.
On unix systems, the poll() and/or select() system calls can be used to check an
arbitrary number of file descriptors at once. Do these system calls exist on
Windows and Macintosh? If so, it might be worth it to, for example, maintain
(in addition to the tree structure) an array of fds suitable for passing to
poll(). That way, instead of walking the tree and checking each fd
individually, you can simply call poll() with a zero timeout, see what fds have
data, and then walk the tree to find each fd in question and process it.
If, given a fd, there's an index of some type that allows you to jump directly
to that fd's spot in the tree, then using poll() as per the above example would
*really* be worth it, because you wouldn't have to walk the tree at all. (But I
haven't looked closely at the code; I don't know if the tree is indexed in that
way.)
At any rate, the current solution--walking the tree every 15 seconds and closing
any connection that was closed by the server--is probably good enough for the
time being. This will prevent servers around the world from having FIN_WAIT_2
connections stick around forever. Thanks, all.
You need to log in
before you can comment on or make changes to this bug.
Description
•