Closed Bug 596373 Opened 14 years ago Closed 14 years ago

Timer thread sleeping for 0 time causes lots of wakeups

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0b2+ ---

People

(Reporter: azakai, Assigned: azakai)

Details

Attachments

(2 files, 2 obsolete files)

The TimerThread will sleep for 0 time if the next timeout is in less than 1 millisecond. This is actually the cause of a lot of CPU wakeups, and therefore of wasted power (even sleeping for 0 time requires the overhead of a few synchronization-related system calls, and this is noticeable in powertop, etc.). Instead, it can just continue directly to process the next timeout (see tiny patch).

Data: With the patch, while idle we have ~1.15 wakeups/second in Firefox. Without it (running normally), we have ~3.25 wakeups/second. (Fennec gives similar results.) So, this would remove about 2/3 of the wakeups we have while idle.

Running with the patch appears to work fine. In theory things could break, for example if every 1/2 millisecond we add a timeout in 1/2 of a millisecond, then we would never sleep at all in the timer thread. But it seems nothing else depends on the condition var we sleep or do not sleep on, so it doesn't matter. Anyhow note that this sort of issue is already possible (just less likely to occur), since the TimerThread does

  if (now >= timeout)
    goto next;

:next is where it runs the timer, so it bypasses the sleep to do that. (So a case of never sleeping could occur if in one timer we schedule a wakeup for 1/2 of a millisecond, then spend that amount of time working in the current timer.)

If we are concerned about such cases, a more sophisticated patch could take note of how much time the timer thread has run without sleeping, and sleep if too much time has passed. Seems to be needless complication though.
Attached patch Tiny Patch (obsolete) — Splinter Review
Attachment #475221 - Flags: review?(doug.turner)
tracking-fennec: --- → ?
Assignee: nobody → azakai
Comment on attachment 475221 [details] [diff] [review]
Tiny Patch

nice find.

you are going to want to add a comment as why this check is here.  when waitFor is 0, it really means PR_WaitCondVar(var, PR_INTERVAL_NO_WAIT) which "causes the thread to release the lock, possibly causing a rescheduling within the runtime, then immediately attempt to reacquire the lock and resume."

I really have no real ideas of what is going to happen if we flip this switch.  When you ran it through try, did you notice any perf deltas?

I am fine with this change given no perf regression and you add a comment.  You should also run this by shaver and/or bsmedberg as your next review.
Attachment #475221 - Flags: review?(doug.turner) → review-
Or brendan, who sorta owns this code....
Attached patch Tiny Commented Patch (obsolete) — Splinter Review
Added a comment to the patch.

The patch passes tryserver ok, and Talos shows no regressions. In fact I would expect a tiny perf *improvement*, since we save the locking code if the timeout is 0, and also there is a smaller chance that outside code will need to wake up the timer thread (which also costs some locking). But the improvement would likely be too small to be detectable. The main benefit here is in terms of removing unnecessary CPU wakeups and saving power, which as mentioned in a previous comment is measurable&significant (for Fennec, Firefox on laptops, etc.).
Attachment #475221 - Attachment is obsolete: true
Attachment #475645 - Flags: feedback?(brendan)
tracking-fennec: ? → 2.0+
Comment on attachment 475645 [details] [diff] [review]
Tiny Commented Patch

Can't see why it would hurt, modulo latent bugs (like those don't exist! :-P). Give it a try.

/be
Attachment #475645 - Flags: feedback?(brendan) → feedback+
Comment on attachment 475645 [details] [diff] [review]
Tiny Commented Patch

Requesting approval.

This patch would bring a substantial (~66%) improvement in the # of cpu wakeups, which is directly related to power usage, which would be useful for Firefox on laptops and Fennec on mobile devices.
Attachment #475645 - Flags: approval2.0?
I looked into this some more, since something didn't feel right. So, the issue is rounding: When we wait for a very small amount of time, it gets rounded down to 0 in PR_MillisecondsToInterval. We then wait for 0 time and perform the same check again - in effect doing a busy-wait for the small amount of time left :(

The previous patch did not fully fix the issue. It avoided the CPU sleep&wakeup during that busy-wait, but the busy-wait remained in effect... Hard to say if it helped or hurt (helped, since it removed the sleep&wakeup, so the wait is potentially shorter; hurt, since before we were at least spending some time sleeping during the busy-wait).

I believe this new patch fixes it properly. First we calculate the critical value for rounding - the (platform-dependent) value of microseconds at which PR_MicrosecondsToInterval rounds down to 0. We then round as correctly as we can: At 1/2 the critical value or less, we just go right into executing the next timer (round down to 0), at more, we wait at least 1 PRIntervalTime unit (round up to 1, if less than 1). So we try to keep the current timing as much as possible, while avoiding busy-waits entirely. (The patch works in microseconds and not milliseconds, to get the best possible resolution in these calculations.)

Patch passes try ok, and the busy-wait (and cpu sleeps&wakeups) are gone. Haven't seen any latent bugs in some manual testing either ;) so hoping all is well.
Attachment #475645 - Attachment is obsolete: true
Attachment #479494 - Flags: review?
Attachment #475645 - Flags: approval2.0?
Attachment #479494 - Flags: review? → review?(brendan)
Comment on attachment 479494 [details] [diff] [review]
Less Tiny But Still Small Patch

>+  // We need to know how many microseconds give a positive PRIntervalTime. This
>+  // is platform-dependent, we calculate it at runtime now.
>+  // First we find a value such that PR_MicrosecondsToInterval(high) = 1
>+  PRInt32 low = 0, high = 1, mid;
>+  while (PR_MicrosecondsToInterval(high) == 0)
>+    high <<= 1;
>+  // We now have
>+  //    PR_MicrosecondsToInterval(low)  = 0
>+  //    PR_MicrosecondsToInterval(high) = 1
>+  // and we can proceed to find the critical value using binary search
>+  while (high-low > 1) {
>+    mid = (high+low) >> 1;

Could declare mid here -- better C++ style perhaps (duplicates PRInt32 of course).

>+    if (PR_MicrosecondsToInterval(mid) == 0)
>+      low = mid;
>+    else
>+      high = mid;
>+  }

Nit: blank line here.

>+  // Half of the amount of microseconds needed to get positive PRIntervalTime.
>+  // We use this to decide how to round our wait times later
>+  PRInt32 halfMicrosecondsIntervalResolution = high >> 1;
>+
>   while (!mShutdown) {
>     // Have to use PRIntervalTime here, since PR_WaitCondVar takes it
>     PRIntervalTime waitFor;
> 
>     if (mSleeping) {
>       // Sleep for 0.1 seconds while not firing timers.
>       waitFor = PR_MillisecondsToInterval(100);
>     } else {
>@@ -319,19 +340,27 @@ NS_IMETHODIMP TimerThread::Run()
> 
>       if (!mTimers.IsEmpty()) {
>         timer = mTimers[0];
> 
>         TimeStamp timeout = timer->mTimeout + mTimeoutAdjustment;
> 
>         // Don't wait at all (even for PR_INTERVAL_NO_WAIT) if the next timer
>         // is due now or overdue.
>-        if (now >= timeout)
>-          goto next;
>-        waitFor = PR_MillisecondsToInterval((timeout - now).ToMilliseconds());
>+        //
>+        // Note that we can only sleep for integer values of a certain
>+        // resolution. We use halfMicrosecondsIntervalResolution, calculated
>+        // before, to do the optimal rounding (i.e., of how to decide what
>+        // interval is so small we should not wait at all).
>+        float microSeconds = (timeout - now).ToMilliseconds()*1000;

Nit: microseconds, whether capitalized or not, is spelled everywhere else with lowercase-s.

Non-nit: is float precise enough? I would use double and not worry.

Looks good otherwise. Have to try it out to be sure.

/be
Attachment #479494 - Flags: review?(brendan) → review+
tracking-fennec: 2.0+ → 2.0b2+
Patch for checkin, with the requested fixes.

Will run through try+talos once more before pushing.
http://hg.mozilla.org/mozilla-central/rev/13ac3bf13d90
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: