526805 - Users of multiple PRRWLock reader locks may deadlock

Reporter

Description

•

16 years ago

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20090909 Fedora/3.5.3-1.fc11 Firefox/3.5.3 Build Identifier: 4.8.1 Details from NSPR development mailing list: > Sorry to bring up such an old message thread, but an issue has come up > > related > > to re-entrant use of PR_RWLock_Rlock(). > > > > The way PR_RWLock works is that a waiting writer will block any threads > > attempting to get a new read lock. If you use read locks in a re-entrant > > manor, a request for a write lock between the two read lock calls will cause > > a > > deadlock (the writer is waiting for the reader to exit, and the reader can't > > get the re-entrant lock since the writer is waiting). I am running into > > this > > deadlock in my application. > > > > I'd like to propose that we modify PR_RWLock to behave differently when a > > re-entrant read lock is made. If a thread already holds a read lock and > > tries > > to get another readlock, this should be allowed, even if a writer is waiting > > on > > the write lock. Any other threads attempting to get a read lock will have > > to > > wait on the writer since it is given priority. This approach would prevent > > the > > writer from being starved due to many active readers, yet it would also > > allow > > for safe re-entrant use of read locks without chance of a deadlock. > > > > Does the above proposal sound feasible? Hi Nathan, We have three implementations of PRRWLock. Two of them (HAVE_UNIX98_RWLOCK and HAVE_UI_RWLOCK) are based on native thread libraries. Unless the behavior you proposed is documented in the Unix 98 reader-writer locks, if we make this change, we won't be able to use Unix 98 reader-writer locks. It seems that native thread libraries must solve this problem if they allow a thread to hold multiple concurrent read locks, and it seems that your proposal is the obvious solution. I'm worried that native thread libraries actually implement your solution but fail to document it. Wan-Teh Reproducible: Always

nkinder

Reporter

Comment 1

•

16 years ago

Attached file Test Program Code — Details

This is the code for a test program that demonstrates the deadlock. The program creates 2 threads (a reader and a writer). The PR_Sleep function is used to guarantee the following order of events: - Reader thread obtains a reader lock. - Writer threads attempts to get writer lock (and is blocked by the reader). - Reader thread attempts to get a second reader lock (and is blocked by the waiting writer).

nkinder

Reporter

Comment 2

•

16 years ago

I have run the test program on a HP-UX 11.23 ia64 system (where HAVE_UNIX98_RWLOCK is defined), and the deadlock still occurs. Does this mean that it is out of the question to modify the way that NSPR's portable reader-writer locks work?

Wan-Teh Chang

Comment 3

•

16 years ago

Nathan, thanks for the bug report. Could you also run your test program on Solaris? That's the other platform where NSPR uses the native reader writer locks. I thought about this issue a bit, and I believe that unless we remember all the threads that own a reader lock, we will need to give up giving preference to waiting writers (which means writers may be starved) to allow readers to lock recursively.

Status: UNCONFIRMED → ASSIGNED

Ever confirmed: true

Nelson Bolyard (seldom reads bugmail)

Updated

•

16 years ago

Summary: Multiple reader locks are not safe when using PRRWLock → Users of multiple PRRWLock reader locks may deadlock

nkinder

Reporter

Comment 4

•

16 years ago

I have run my test program on a Solaris 9 sparc system and it does not deadlock. Here is the stdout output from my test program on this system: Reader running. Attempting to get first reader lock. Reader obtained first reader lock. Reader sleeping for 10 seconds. Writer running. Sleeping for 5 seconds. Writer attempting to get writer lock. Reader attempting to get second reader lock. Reader obtained second reader lock. Reader releasing second reader lock. Reader releasing first reader lock. Writer obtained writer lock. Writer releasing writer lock.

nkinder

Reporter

Comment 5

•

16 years ago

(In reply to comment #3) > I thought about this issue a bit, and I believe that unless we remember > all the threads that own a reader lock, we will need to give up giving > preference to waiting writers (which means writers may be starved) > to allow readers to lock recursively. I believe that you are correct that we need to keep track of all of the threads who own a reader lock as opposed to simply keeping a count of the number of readers. Is it legal for a thread that does not own a reader or writer lock to call PR_RWLock_Unlock()? It seems like nothing is preventing this currently. If we keep track of readers, it seems as if this should not be allowed since there would be no way of knowing which thread to remove from the owners list.

nkinder

Reporter

Comment 6

•

16 years ago

In light of my test findings, do my suggestions from comment#5 seem like a viable and acceptable approach for fixing this issue?

Wan-Teh Chang

Comment 7

•

16 years ago

Nathan, keeping track of all the threads who own a reader lock may slow down PRRWLock. It also means we won't be able to use pthread_rwlock_t on HP-UX. Since Solaris is open source now, we should study its implementation of reader-writer locks. I did a quick search for pthread_rwlock_rdlock and rw_rdlock at http://src.opensolaris.org/source/ and found: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/threads/rwlock.c Could you also study the code to find out what its algorithm is? Thanks.

nkinder

Reporter

Comment 8

•

16 years ago

(In reply to comment #7) > Nathan, keeping track of all the threads who own a reader lock > may slow down PRRWLock. It also means we won't be able to use > pthread_rwlock_t on HP-UX. Since Solaris is open source now, > we should study its implementation of reader-writer locks. I > did a quick search for pthread_rwlock_rdlock and rw_rdlock at > http://src.opensolaris.org/source/ and found: > > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/threads/rwlock.c > > Could you also study the code to find out what its algorithm > is? Thanks. The Solaris implementation of reader-writer locks keeps a list of readers. It also keeps a reference count for each reader to keep track of how many locks it holds. When a thread that already holds a reader lock attempts to get another reader lock, the list of readers is checked to see if a reader lock is already held by the thread. If so, the reference count for that thread is simply increased. The list of reader threads is also used by the unlock function to ensure that the calling thread actually holds a reader lock.

Wan-Teh Chang

Comment 9

•

16 years ago

Nathan: thanks for studying the Solaris code. This is a tough call. On the one hand, it is reasonable to expect that a thread should be able to acquire a reader lock recursively because the thread should be able to share read-only access to the resource with itself. On the other hand, to support this the reader-writer lock implementation will become very heavyweight, making it even less appealing than regular locks. If you really can't restructure your code to avoid recursive reader locks, we'll have to implement the Solaris algorithm and also stop using the pthread reader-writer locks on HP-UX.

nkinder

Reporter

Comment 10

•

15 years ago

(In reply to comment #9) > If you really can't restructure your code to avoid recursive > reader locks, we'll have to implement the Solaris algorithm > and also stop using the pthread reader-writer locks on HP-UX. I believe that it is going to be very difficult to restructure our code to avoid re-entrant reader locks. We could switch to using PRMonitor in our code, but that will also be very intrusive and doesn't allow for multiple readers at the same time. Is there any way to get NSPR to implement re-entrant reader locks for PR_RWLock?

Wan-Teh Chang

Comment 11

•

15 years ago

Nathan: it's certainly possible. Someone will need to implement an algorithm similar to the one used in Solaris (see comment 7 and comment 8), and NSPR won't be able to use a native reader-writer lock implementation unless it is known to allow re-entrant reader locks.

BugBot [:suhaib / :marco/ :calixte]

Comment 12

•

3 years ago

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: wtc → nobody

Status: ASSIGNED → NEW

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

Bugzilla

Users of multiple PRRWLock reader locks may deadlock

Categories

(NSPR :: NSPR, defect)

Tracking

(Not tracked)

People

(Reporter: nkinder, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Updated

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Updated

Attachment

General

Description

File Name

Content Type