Closed Bug 591881 Opened 14 years ago Closed 14 years ago

Pruning of Idle connections needs to be implemented with one shot timer

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
flash10
Tracking Status
fennec 2.0+ ---

People

(Reporter: jon.hemming, Unassigned)

References

Details

Attachments

(2 files, 3 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.11) Gecko/2009060309 Ubuntu/8.04 (hardy) Firefox/3.0.11
Build Identifier: 

Pruning of idle connections is handled with 15s timer that keeps going on when we have connection. With mobile devices we need to get rid off all the timers when idle so we should suspend pruning of idle connections when display goes off. Other thing worth considering is if pruning of idle connections needs to be done in mobile devices between 15s or should we take a longer interval

Reproducible: Always
Attachment #470389 - Flags: feedback?(azakai)
OS: Linux → MeeGo
Comment on attachment 470389 [details] [diff] [review]
Suspends pruning of idle connection when display-off

I like the idea. You should get direct feedback from someone in the networking team to see what they think about such changes in their code though.

From my perspective, I think that we shouldn't have platform-specific checks in there. Also we shouldn't check for display-off. Instead, the connection manager might know when the last network activity (in the sense needed for pruning idle connections) was. When the timer fires, it could then stop itself from firing again if the network has been idle for more than 15 seconds. And when network activity resumes, the timer could be resumed.

That would make this work in a better way, since it could halt the timer even before the display goes off (if no networking is going on). And also without any potential problems with the display going off, but lots of network activity going on in the meanwhile.
Attachment #470389 - Flags: feedback?(azakai) → feedback+
Blocks: 583135
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #2)
> From my perspective, I think that we shouldn't have platform-specific checks in
> there. Also we shouldn't check for display-off. Instead, the connection manager
> might know when the last network activity (in the sense needed for pruning idle
> connections) was. When the timer fires, it could then stop itself from firing
> again if the network has been idle for more than 15 seconds. And when network
> activity resumes, the timer could be resumed.
> 

Yes, using display-off and platform specific checks here is a bit confusing, but since this is not optimal solution it was narrowed for Maemo 6 and at the moment display-off notification is only thing that is close to idle state for which this patch aims for. By the way how is the progress for system state notifications?

> That would make this work in a better way, since it could halt the timer even
> before the display goes off (if no networking is going on). And also without
> any potential problems with the display going off, but lots of network activity
> going on in the meanwhile.

Of course it is better to fix the real problem instead of making these kind of fixes, so better solution needs to be done, but this kind of solution can be used for idle state if we don't get better solution for it soon.
we probably want something for all platforms, not just MOZ_PLATFORM_MAEMO.
tracking-fennec: --- → ?
How about this kind of approach? Would it be enough to do pruning always when any of the idle connections will reach the maximum age. At that point we can check if any of the other idle connections in not reusable anymore. This way we would not be polling between 15s to check that. If we need react more quickly to idle connection being not usable anymore, I guess we can try to add more precise timeout to react to those. This patch also removes checking if we don't have any idle connections existing.
 We still would need to handle suspending of every timer when "browser goes idle", once this "browser goes idle" is defined somewhere.
Attachment #470389 - Attachment is obsolete: true
Attachment #473031 - Flags: feedback?(cbiesinger)
tracking-fennec: ? → 2.0+
Attachment #473031 - Flags: feedback?(cbiesinger) → feedback?(jduell.mcbugs)
Attachment #473031 - Flags: feedback?(dwitte)
Comment on attachment 473031 [details] [diff] [review]
xulrunner patch to remove polling from pruning of connections

>diff --git a/netwerk/protocol/http/nsHttpConnection.cpp b/netwerk/protocol/http/nsHttpConnection.cpp

>+nsresult nsHttpConnection::TimeToLive(PRUint32 *timeToLive)
>+{
>+    if(!timeToLive)
>+        return NS_ERROR_NULL_POINTER;

No need, since you control all the callers. Which means you can just return timeToLive directly.

>diff --git a/netwerk/protocol/http/nsHttpConnectionMgr.cpp b/netwerk/protocol/http/nsHttpConnectionMgr.cpp

>+void
>+nsHttpConnectionMgr::PruneDeadConnectionsAfter(PRUint32 time_in_seconds)

Prefer 'timeInSeconds'.

>+{
>+    LOG(("nsHttpConnectionMgr::StartPruneDeadConnectionsTimer\n"));

Wrong name.

>+
>+    if(!mTimer)
>+        mTimer = do_CreateInstance("@mozilla.org/timer;1");
>+    NS_ASSERTION(mTimer, "no timer");

This should be a warning if it's nonfatal.

>+void
>+nsHttpConnectionMgr::StopPruneDeadConnectionsTimer()
>+{
>+    LOG(("nsHttpConnectionMgr::StopPruneDeadConnectionsTimer\n"));
>+
>+    if (mTimer) {
>+        mTimer->Cancel();
>+        mTimer = 0;

'= NULL' please.

>+NS_IMETHODIMP
>+nsHttpConnectionMgr::Observe(nsISupports *subject,
>+                       const char *topic,
>+                       const PRUnichar *data)

Indentation is off.

Do we not need to check, when we add a new connection, whether the idle time for the new connection is less than the timeout for the existing idle timer, if one is set? For instance, we previously have an idle timer that fires in 300 seconds, but we add a new connection with a timeout of 10 seconds.

If we discover more places we need to prune connections, we can add them, but otherwise this patch looks fine in terms of where we're adding/removing timers (with the above exception). I don't think we currently prune a connection until its idle timeout has been satisfied right? In which case your patch does not regress anything.

r=dwitte with above comments addressed, and satisfactory answers to the above.
Attachment #473031 - Flags: feedback?(jduell.mcbugs)
Attachment #473031 - Flags: feedback?(dwitte)
Attachment #473031 - Flags: feedback+
Bug 508518 will remove the UITimer, and as a result, user-interaction-active/user-interaction-inactive notifications will only be sent every 5 seconds *IF* the device is not entirely idle.

So, perhaps we could prune idle connections using those notifications. Then we would automatically not do it when the device is completely idle. Upside is we suspend when inactive 'for free', without writing specific code.

Downside is that if we are not completely idle, we will still get those notifications. So even one other unneeded timer will keep this problem alive. But, once we fix all the other timers... (not that many left: https://wiki.mozilla.org/Mobile/Powersaving/Wakeups#Some_Wakeups_of_Note )
Attached patch Fixes to previous patch (obsolete) — Splinter Review
Requested changes added. Also fixed too hasty stopping for pruning of dead connections that was done in previous patch. Pruning was stopped on some places when connection entry had 0 idle connections instead of checking that count in connection manager for idle connections was 0. In addition added more comments.
Attachment #473031 - Attachment is obsolete: true
Attachment #481473 - Flags: review?(dwitte)
(In reply to comment #6)
> Do we not need to check, when we add a new connection, whether the idle time
> for the new connection is less than the timeout for the existing idle timer, if
> one is set? For instance, we previously have an idle timer that fires in 300
> seconds, but we add a new connection with a timeout of 10 seconds.
> 
This is now added.

> If we discover more places we need to prune connections, we can add them, but
> otherwise this patch looks fine in terms of where we're adding/removing timers
> (with the above exception). I don't think we currently prune a connection until
> its idle timeout has been satisfied right? In which case your patch does not
> regress anything.
> 
Closing of idle connection depends on result of CanReuse() which this case depends on expire time and isAlive(), which depends on isAlive() of socket transport. So connections goes to dead also when this isAlive() returns false. I guess this then means for this patch that if socket transport is not connected anymore and we have no activity with connections then we don't notice those dead connections before any of the connections will expire. I don't know does it matter.
(In reply to comment #7)
> So, perhaps we could prune idle connections using those notifications. Then we
> would automatically not do it when the device is completely idle. Upside is we
> suspend when inactive 'for free', without writing specific code.

Perhaps we can do both: prune connections iff we have both a prune timer and we're not idle. This would require canceling all the prune timers when we get user-interaction-inactive.

Not really sure if that's worth it. We can file a followup to try that if we determine it's worth it. Can we get data here once this patch lands?
Comment on attachment 481473 [details] [diff] [review]
Fixes to previous patch

Reviewing differences between previous and current patch.

>diff -urp mozilla-central/netwerk/protocol/http/nsHttpConnectionMgr.cpp mozilla-central-2/netwerk/protocol/http/nsHttpConnectionMgr.cpp

>@@ -86,6 +87,7 @@ nsHttpConnectionMgr::nsHttpConnectionMgr
>     , mMaxPersistConnsPerProxy(0)
>     , mNumActiveConns(0)
>     , mNumIdleConns(0)
>+    , mTimeOfNextWakeUp(PR_UINT32_MAX)

LL_MAXUINT.

> void
>-nsHttpConnectionMgr::PruneDeadConnectionsAfter(PRUint32 time_in_seconds)
>+nsHttpConnectionMgr::PruneDeadConnectionsAfter(PRUint32 timeInSeconds)
> {

>     // failure to create a timer is not a fatal error, but idle connections
>     // will not be cleaned up until we try to use them.

Is this comment still accurate?

>@@ -386,11 +392,11 @@ nsHttpConnectionMgr::PruneDeadConnection

>+    // Find out how long it will take for next idle connections to not be reusable
>+    // anymore.

'connection'.

>@@ -399,16 +405,22 @@ nsHttpConnectionMgr::PruneDeadConnection

>+    // If time to next expire found is shorter than time to next wake-up, we need to
>+    // change the time for next wake-up.
>+    if (0 < ent->mIdleConns.Length()) {
>+        PRUint32 now = NowInSeconds();
>+        PRUint64 timeOfNextExpire = now + timeToNextExpire;
>+        if (!self->mTimer || self->mTimeOfNextWakeUp < now || timeOfNextExpire < self->mTimeOfNextWakeUp) {
>+            self->PruneDeadConnectionsAfter(timeToNextExpire);

So, under what conditions would 'self->mTimeOfNextWakeUp < now' be true? Presumably only if the timer has already fired but we haven't got to processing it in the event loop yet, right? In which case we're going to be pruning dead connections as soon as that happens, so there's no need to explicitly do it here. I think you can drop that and the '!self->mTimer' check here.

>+        }
>+    } else if (0 == self->mNumIdleConns) {

Can you document the difference between mNumIdleConns and ent->mIdleConns.Length(), perhaps in the .h where mNumIdleConns is declared?

>     if (!conn) {
>+        // No reusable idle connection found for this entry. If there is no idle connections left at all,
>+        // we need to make sure that we are not pruning dead connections anymore.

'If there are', and keep comments to 80 columns per line?

>@@ -625,8 +642,11 @@ nsHttpConnectionMgr::GetConnection(nsCon

>+        // This only needs to be done if there is a idle connection.

'an'.

>+        if (0 < mNumIdleConns) {
>+            if (mNumIdleConns + mNumActiveConns + 1 > mMaxConns)

&& these together instead of nesting?

>@@ -938,18 +958,18 @@ nsHttpConnectionMgr::OnMsgReclaimConnect

>+            // If added connections was first idle connection or has shortest

'the added connection'

>+            // time to live among the idle connections, pruning dead
>+            // connections needs to be done when it can't be reused anymore.
>+            PRUint32 timeToLive = conn->TimeToLive();
>+            PRUint32 now = NowInSeconds();
>+            if(!mTimer || mTimeOfNextWakeUp < now || now + timeToLive < mTimeOfNextWakeUp)

Same comment about the first two tests.

diff -urp mozilla-central/netwerk/protocol/http/nsHttpConnectionMgr.h mozilla-central-2/netwerk/protocol/http/nsHttpConnectionMgr.h

>@@ -91,7 +91,11 @@ public:

>+    // Chedules next pruning of dead connection to happen after

'Schedules', 'connections'.

>+    // Stops timer cheduled for next pruning of dead connections.
>     void StopPruneDeadConnectionsTimer();

Same.

r=dwitte with fixes.
Attachment #481473 - Flags: review?(dwitte) → review+
Attached patch patch v3Splinter Review
Putting this still to review since all proposed fixed can't be done.
Attachment #481473 - Attachment is obsolete: true
Attachment #481779 - Flags: review?(dwitte)
(In reply to comment #11)
> > void
> >-nsHttpConnectionMgr::PruneDeadConnectionsAfter(PRUint32 time_in_seconds)
> >+nsHttpConnectionMgr::PruneDeadConnectionsAfter(PRUint32 timeInSeconds)
> > {
> 
> >     // failure to create a timer is not a fatal error, but idle connections
> >     // will not be cleaned up until we try to use them.
> 
> Is this comment still accurate?
Yes, I think it is. Connections are still not going to be cleaned if we
don't have the timer.

> >@@ -399,16 +405,22 @@ nsHttpConnectionMgr::PruneDeadConnection
> 
> >+    // If time to next expire found is shorter than time to next wake-up, we need to
> >+    // change the time for next wake-up.
> >+    if (0 < ent->mIdleConns.Length()) {
> >+        PRUint32 now = NowInSeconds();
> >+        PRUint64 timeOfNextExpire = now + timeToNextExpire;
> >+        if (!self->mTimer || self->mTimeOfNextWakeUp < now || timeOfNextExpire < self->mTimeOfNextWakeUp) {
> >+            self->PruneDeadConnectionsAfter(timeToNextExpire);
> 
> So, under what conditions would 'self->mTimeOfNextWakeUp < now' be true?
> Presumably only if the timer has already fired but we haven't got to processing
> it in the event loop yet, right? In which case we're going to be pruning dead
> connections as soon as that happens, so there's no need to explicitly do it
> here. I think you can drop that and the '!self->mTimer' check here.
Yes 'self->mTimeOfNextWakeUp < now' was just being over carefull, but '!self->mTimer' is needed for case when we have stopped pruning of dead connections timer and it will need to be started again.

> 
> >+        }
> >+    } else if (0 == self->mNumIdleConns) {
> 
> Can you document the difference between mNumIdleConns and
> ent->mIdleConns.Length(), perhaps in the .h where mNumIdleConns is declared?
mNumIdleConns now documented.

> >@@ -625,8 +642,11 @@ nsHttpConnectionMgr::GetConnection(nsCon
> 
> >+        // This only needs to be done if there is a idle connection.
> 
> 'an'.
> 
> >+        if (0 < mNumIdleConns) {
> >+            if (mNumIdleConns + mNumActiveConns + 1 > mMaxConns)
> 
> && these together instead of nesting?
done

> 
> >+            // time to live among the idle connections, pruning dead
> >+            // connections needs to be done when it can't be reused anymore.
> >+            PRUint32 timeToLive = conn->TimeToLive();
> >+            PRUint32 now = NowInSeconds();
> >+            if(!mTimer || mTimeOfNextWakeUp < now || now + timeToLive < mTimeOfNextWakeUp)
> 
> Same comment about the first two tests.
Same as for previous, !mTimer check still kept.
Comment on attachment 481779 [details] [diff] [review]
patch v3

Great! r=dwitte
Attachment #481779 - Flags: review?(dwitte) → review+
Keywords: checkin-needed
Sent to try. If all is good I'll push this later today or tomorrow.
Keywords: checkin-needed
I pushed to try earlier thismorning, and saw a bunch of oranges: http://tests.themasta.com/tinderboxpushlog/?tree=MozillaTry&pusher=dwitte@mozilla.com

They *are* listed as random, but there's a bunch of instances. So we'll see if you see the same thing...

(Also, lul, that's a lot of B)
Looks like bug 517584 has appeared in almost all the mochitest-other runs in my and dwitte's pushes. This needs to be fixed.

Example output:

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1286563025.1286563873.4859.gz

ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/security/ssl/bugs/test_bug480619.html | The stream should be inside an error state - got 2152398868, expected 2147500037
Target Milestone: --- → flash10
Note that this could be a bug in the test -- it was intermittent before; it may be sensitive to timing or some other non-guaranteed detail. So the fix here could either be in your code or the test itself.
Sadly, forcing the pruning of idle connections doesn't seem to fix that test failure on tryserver,

http://hg.mozilla.org/try/rev/ecfae0d3481b

but it works locally on my laptop :(
Patch to fix that intermittent test is in bug 517584.
Depends on: 517584
http://hg.mozilla.org/mozilla-central/rev/c66b0f9777bb
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Thanks guys for getting this pushed.
I like the change from a fixed poll to the delta list, but I wish it had a Network component on the bugzilla entry.. I got blindsided this morning on a whole stack of merge conflicts (primarily 592284). When I update that patch to resolve those conflicts you might want to take a look and ensure this is still how you want it.
Time needs to be reseted so that we can find a new value for it during the pruning of dead connections.
Attachment #485732 - Flags: feedback?(dwitte)
Component: General → Networking: HTTP
OS: MeeGo → All
Summary: Pruning of Idle connections needs to be suspended for Maemo 6 platform when display goes off → Pruning of Idle connections needs to be implemented with one shot timer
Comment on attachment 485732 [details] [diff] [review]
Resets the time of next wake-up after notification is observed from the timer.

Looks good, but I don't think this is sufficient. Let's do this over in bug 606719.
Attachment #485732 - Flags: feedback?(dwitte) → feedback-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: