IonMonkey: WorkerThreadState::wait() is not threadsafe

RESOLVED FIXED

Status

()

Core
JavaScript Engine
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: sstangl, Assigned: bhackett)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
There are two bugs in WorkerThreadState::wait() that break thread safety.

1) wait() takes a parameter 'millis', which signals to PR_WaitCondVar() that locking should abort after a certain interval has expired. Aborting returns an error, not success, but the function erroneously presumes that the lock is always held after PR_WaitCondVar() returns. Fortunately it seems that "millis" is never a used argument, so we also have the option of removing it instead of fixing it.

The proposed solution from MDN is: "To detect the expiration of the specified interval, call PR_IntervalNow() before and after the call to PR_WaitCondVar and compare the elapsed time with the specified interval."

2) The return value of PR_WaitCondVar() is ignored. Threads can be forcibly jostled out of blocking on cvars without acquiring the lock via PR_Interrupt; it is unsafe to ignore the return value.
(Assignee)

Comment 1

5 years ago
(In reply to Sean Stangl from comment #0)
> 1) wait() takes a parameter 'millis', which signals to PR_WaitCondVar() that
> locking should abort after a certain interval has expired. Aborting returns
> an error, not success, but the function erroneously presumes that the lock
> is always held after PR_WaitCondVar() returns.

I don't think the 'attempt to reaqcuire the lock' part of the docs is very clear.  When the thread is rescheduled, either due to being notified or due to an interval timeout, it will reacquire the lock.  This is (well, had better be) infallible, as per PR_Lock; otherwise we'd need to test the return value on every PR_WaitCondVar, whether there is a timeout specified or not.

> 2) The return value of PR_WaitCondVar() is ignored. Threads can be forcibly
> jostled out of blocking on cvars without acquiring the lock via
> PR_Interrupt; it is unsafe to ignore the return value.

Eh, we don't use the PR_Interrupt API and have no plans to, but asserting on the PR_WaitCondVar's return value would be good, along with the places we use PR_WaitCondVar for other threads (e.g. the GC background thread).
(Reporter)

Comment 2

5 years ago
(In reply to Brian Hackett (:bhackett) from comment #1)
> (In reply to Sean Stangl from comment #0)
> > 1) wait() takes a parameter 'millis', which signals to PR_WaitCondVar() that
> > locking should abort after a certain interval has expired. Aborting returns
> > an error, not success, but the function erroneously presumes that the lock
> > is always held after PR_WaitCondVar() returns.
> 
> I don't think the 'attempt to reaqcuire the lock' part of the docs is very
> clear.  When the thread is rescheduled, either due to being notified or due
> to an interval timeout, it will reacquire the lock.  This is (well, had
> better be) infallible, as per PR_Lock; otherwise we'd need to test the
> return value on every PR_WaitCondVar, whether there is a timeout specified
> or not.

That makes sense; re-reading the documentation with the above text in mind, it's clear that interval expiry just causes the thread to unblock on the cvar and block on the external mutex.

Adding the assert sounds fine.
(Assignee)

Comment 3

5 years ago
Created attachment 654838 [details] [diff] [review]
patch

Add assert for PR_WaitCondVar.
Assignee: general → bhackett1024
Attachment #654838 - Flags: review?(sstangl)
(Reporter)

Updated

5 years ago
Attachment #654838 - Flags: review?(sstangl) → review+
(Assignee)

Comment 4

5 years ago
https://hg.mozilla.org/projects/ionmonkey/rev/e58be9409de7
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.