If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

correct condition variable and timeout logic in nsIOThreadPool::ThreadFunc

RESOLVED FIXED in mozilla1.6beta

Status

()

Core
Networking
--
minor
RESOLVED FIXED
14 years ago
14 years ago

People

(Reporter: Darin Fisher, Assigned: Darin Fisher)

Tracking

Trunk
mozilla1.6beta
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [ready-to-land])

Attachments

(1 attachment, 1 obsolete attachment)

3.23 KB, patch
Wan-Teh Chang
: review+
Details | Diff | Splinter Review
(Assignee)

Description

14 years ago
spurious wakups on nsIOThreadPool::mIdleThreadCV could hurt performance.  per
bug 222588 comment #14, we should be calling PR_WaitCondVar from within a while
loop.
(Assignee)

Updated

14 years ago
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.6beta

Comment 1

14 years ago
The while loop defends against not only spurious wakeups
but also the "another thread beating it to the lock" issue.
I explained this issue in bug 222588 comment 17.

In addition to calling PR_WaitCondVar from within a while
loop, you should also call PR_IntervalNow before and after
PR_WaitCondVar to properly determine whether the wait
timeout has expired, and reduce the timeout if you need
to wait again.  Although PR_WaitCondVar takes a timeout
argument, it does not tell you whether it returns because
the condition variable was notified or the timeout expired.
(This is different from pthread_cond_timedwait, which may
return ETIMEDOUT.) Offhand I can think of two reasons:
1. If the thread returns because the condition variable was
notified, by the time it actually reaquired the lock, the
elapsed time may have exceeded the timeout.  So it is hard
to report the timeout status reliably.
2. Our API hopefully forces the programmer to think about
how he should handle the possibility that the condition is
true but the timeout also expired when PR_WaitCondVar returns.
In some cases, you want to work on the shared data in spite
of the timeout.  In other cases, you want to bail out even
though the shared data are in the right condition.
(Assignee)

Comment 2

14 years ago
Created attachment 134074 [details] [diff] [review]
v1 patch
(Assignee)

Comment 3

14 years ago
Comment on attachment 134074 [details] [diff] [review]
v1 patch

wan-teh: can you please review this simple patch.  thx!
Attachment #134074 - Flags: review?(wchang0222)

Comment 4

14 years ago
You should reduce the timeout if you need to wait
again.

    PRIntervalTime ts = PR_IntervalNow();
    PRIntervalTime elapsed;
 
    while (PR_CLIST_IS_EMPTY(&pool->mEventQ) && !pool->mShutdown &&
            ((elapsed = PR_IntervalNow() - ts) < THREAD_IDLE_TIMEOUT)) {
        pool->mNumIdleThreads++;
        PR_WaitCondVar(pool->mIdleThreadCV, THREAD_IDLE_TIMEOUT - elapsed);
        pool->mNumIdleThreads--;
    }

A more complicated version is as follows.  It doesn't call
PR_IntervalNow() unless we need to wait.

    PRIntervalTime ts; /* init to 0 if gcc complains "might be used
uninitialized" */
    PRIntervalTime elapsed = 0;
 
    while (PR_CLIST_IS_EMPTY(&pool->mEventQ) && !pool->mShutdown) {
        if (elapsed == 0) {
            ts = PR_IntervalNow();
        } else {
            elapsed = PR_IntervalNow() - ts;
            if (elapsed >= THREAD_IDLE_TIMEOUT)
                break;
        }
        pool->mNumIdleThreads++;
        PR_WaitCondVar(pool->mIdleThreadCV, THREAD_IDLE_TIMEOUT - elapsed);
        pool->mNumIdleThreads--;
    }
(Assignee)

Comment 5

14 years ago
Created attachment 134092 [details] [diff] [review]
v1.1 patch

thanks for keeping me honest wan-teh... here's a better patch.
(Assignee)

Updated

14 years ago
Attachment #134074 - Attachment is obsolete: true
(Assignee)

Updated

14 years ago
Attachment #134074 - Flags: review?(wchang0222)
(Assignee)

Updated

14 years ago
Attachment #134092 - Flags: review?(wchang0222)

Comment 6

14 years ago
Comment on attachment 134092 [details] [diff] [review]
v1.1 patch

r=wtc.	The while loop is about more than being
tolerant of spurious wakeups.  Another thread may
get the lock before the thread returning from the
wait and empty the event queue.
Attachment #134092 - Flags: review?(wchang0222) → review+

Comment 7

14 years ago
Comment on attachment 134092 [details] [diff] [review]
v1.1 patch

OK, my point is that the following comment could be deleted.
One should always wait on a condition variable in a loop.
You don't need to explain that here, and the explanation is
not that clear (it doesn't make the connection to the while
loop) nor complete anyway.

>+            // however, we must be tolerant of spurious wakeups, which the
>+            // operating system may legitimately cause.
(Assignee)

Comment 8

14 years ago
Comment on attachment 134092 [details] [diff] [review]
v1.1 patch

ok, thanks wtc.. i'll remove that sentence from the comment before checking in
this patch.
Attachment #134092 - Flags: superreview?(bzbarsky)
Comment on attachment 134092 [details] [diff] [review]
v1.1 patch

sr=bzbarsky
Attachment #134092 - Flags: superreview?(bzbarsky) → superreview+
(Assignee)

Updated

14 years ago
Summary: spurious wakups on nsIOThreadPool::mIdleThreadCV could hurt performance → fix timeout calculation in nsIOThreadPool::ThreadFunc
Whiteboard: [ready-to-land]
(Assignee)

Updated

14 years ago
Summary: fix timeout calculation in nsIOThreadPool::ThreadFunc → correct condition variable and timeout logic in nsIOThreadPool::ThreadFunc
(Assignee)

Comment 10

14 years ago
fixed-on-trunk (for 1.6 beta)

i took the liberty of increasing IDLE_TIMEOUT from 10 seconds to 60 seconds.  i
think this makes much more sense.  i also noticed quite a bit of thread
creation/destruction churn from this threadpool in some test data running
through jrgm's page loader.
Status: ASSIGNED → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.