486606 - Rewrite: Convert thread synchronization to new API

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Description

•

16 years ago

Essentially, the rewrite needs to do the following:

  PRLock*     -> nsLock*
  PRMonitor*  -> nsMonitor*
  PRCMonitor* -> nsMonitor*
  PRCondVar*  -> nsCondVar*

  nsAutoLock::NewLock       -> nsLock::NewLock
  nsAutoMonitor::NewMonitor -> nsMonitor::NewMonitor

There is a one-to-one match between PR_XXX functions and nsXXX methods.  PRCMonitor is an oddity, but probably ought to be converted by hand.

If this proves to be easy, it would also be marvelous to rewrite the code in such a way that we can make these methods unnecessary:

  nsAutoLock::lock/unlock
  nsAutoMonitor::Enter/Exit

I think these methods are (1) an impediment to reasoning about code using AutoLock/Monitor; and (2) place an undue implementation burden on the AutoLock/Monitor implementation (it should be a dirt-simple wrapper, and not have to track any state.)

The former rewrite seems almost easy enough to do with sed.  The latter is tricker.  One way to do it is (i) find all decls that make calls to Auto::lock/unlock or Auto::Enter/Exit; (ii) change their type to nsLock*/nsMonitor*.  I'm open to other ideas, though.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Blocks: 456272

Brendan Eich [:brendan]

Comment 1

•

16 years ago

Chris, thanks again for taking this on. A few comments:

PRCMonitor: just say no. We should stop using it.

unlock/lock and Exit/Enter nesting avoidance shows overlong outer critical sections, agree these are impediments and burdens. If you are game to revisit the nesting badness and shorten/reorder/iterate-to-recheck or whatever the right fix is to relieve the ancient, exiguous need for these bad boys, great.

/be

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 2

•

16 years ago

Why do we need NewLock/NewMonitor? Can't we get non-heap-allocated locks here?

Removing Enter/Exit is a problem. How else do you write code like

  mon.Enter();
  while (...) {
    if (...) {
      mon.Exit();
      ...
      mon.Enter();
    }
    ...
  }
  mon.Exit();

in a sane way? Our video decoders have a lot of this. If you can demonstrate how to get rid of it in say nsOggDecoder.cpp, I'll be convinced.

Creating new subclasses, say nsDroppableLock and nsDroppableMonitor, and only using them where necessary, would be fine.

(dormant account)

Comment 3

•

16 years ago

(In reply to comment #2)
> Why do we need NewLock/NewMonitor? Can't we get non-heap-allocated locks here?
> 
> Removing Enter/Exit is a problem. How else do you write code like
> 
>   mon.Enter();
>   while (...) {
>     if (...) {
>       mon.Exit();
>       ...
>       mon.Enter();
>     }
>     ...
>   }
>   mon.Exit();
> 
> in a sane way? Our video decoders have a lot of this. If you can demonstrate
> how to get rid of it in say nsOggDecoder.cpp, I'll be convinced.

Sounds like need another RAII class that reverses the Enter/Exit pattern.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 4

•

16 years ago

We do have an nsAutoUnlock class for locks, just not for monitors.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 5

•

16 years ago

(In reply to comment #2)
> Why do we need NewLock/NewMonitor? Can't we get non-heap-allocated locks here?
> 

It's on my todo list.  I think the migration is easier if we switch to the new, heap-allocated API first.

> Removing Enter/Exit is a problem. How else do you write code like
> 
>   mon.Enter();
>   while (...) {
>     if (...) {
>       mon.Exit();
>       ...
>       mon.Enter();
>     }
>     ...
>   }
>   mon.Exit();
> 
> in a sane way? Our video decoders have a lot of this. If you can demonstrate
> how to get rid of it in say nsOggDecoder.cpp, I'll be convinced.
> 

The glib answer is s/nsAutoMonitor/nsMonitor/.  But creating nsAutoExitMonitor suits me as well, and appears to handle your case.

Brendan Eich [:brendan]

Comment 6

•

16 years ago

roc: why are the decoders using monitors over long critical sections? Don't make me blame Java now :-/.

Canonical synch structure for queueing uses a lock/condvar pair, not a monitor.

/be

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 7

•

16 years ago

(In reply to comment #6)
> roc: why are the decoders using monitors over long critical sections? Don't
> make me blame Java now :-/.
> 
> Canonical synch structure for queueing uses a lock/condvar pair, not a monitor.

There's state shared beween the main thread and the decoder thread. The decoder thread holds the monitor continuously except where we explicitly drop it (which is, roughly, whenever we might block or do something CPU-intensive, or in a Wait()). This is good because it reduces the number of interleavings.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 8

•

16 years ago

(In reply to comment #7)
> (In reply to comment #6)
> > roc: why are the decoders using monitors over long critical sections? Don't
> > make me blame Java now :-/.
> > 
> > Canonical synch structure for queueing uses a lock/condvar pair, not a monitor.
> 
> There's state shared beween the main thread and the decoder thread. The decoder
> thread holds the monitor continuously except where we explicitly drop it (which
> is, roughly, whenever we might block or do something CPU-intensive, or in a
> Wait()). This is good because it reduces the number of interleavings.

This sounds like a job for nsMonitor.  Presumably the points where the monitor is released don't correspond to any kind of lexical scope.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 9

•

16 years ago

Yeah.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 10

•

16 years ago

(In reply to comment #8)
> This sounds like a job for nsMonitor.  Presumably the points where the monitor
> is released don't correspond to any kind of lexical scope.

Maybe not, but using non-scope-based objects can lead to errors that are difficult to find when code gets modified later (by including early returns, for instance). Having the scope enforce the duration of the lock (or, unlock) provides nice protection.

Could (should?) we even go so far as to remove the Lock/Unlock and Enter/Exit functions from the nsLock/nsMonitor?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 11

•

16 years ago

Sure, there's always placement new and explicit destructor calls to fall back on! :-)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 12

•

16 years ago

Don't make me make that constructor private! ;)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 13

•

16 years ago

er, operator!

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 14

•

16 years ago

(In reply to comment #10)
> (In reply to comment #8)
> > This sounds like a job for nsMonitor.  Presumably the points where the monitor
> > is released don't correspond to any kind of lexical scope.
> 
> Maybe not, but using non-scope-based objects can lead to errors that are
> difficult to find when code gets modified later (by including early returns,
> for instance). Having the scope enforce the duration of the lock (or, unlock)
> provides nice protection.
> 

Except that the scope can't enforce that when AutoMonitor also has explicit Enter/Exit, which the decoder seems to require/want.  I think it's even more confusing to mix the two.  I want to keep "AutoMonitor == simple: you're in it when it's in scope", and "Monitor == not simple: you're on your own, be careful."

> Could (should?) we even go so far as to remove the Lock/Unlock and Enter/Exit
> functions from the nsLock/nsMonitor?

This would be equivalent to only providing AutoLock/AutoMonitor.  That'd be nice, but the amount of refactoring it would require is more than I want to tackle.  Plus there are probably cases where it's not possible, like code that needs to interrupt a task and drop back into an event loop, while still having the task hold a lock.

BTW, this entire discussion is probably better suited to bug 58904, where we're ostensibly deciding on the API :).  Rewriting the existing code to the new API (whatever it may be) is not nearly as interesting.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 15

•

16 years ago

(In reply to comment #14)
> Except that the scope can't enforce that when AutoMonitor also has explicit
> Enter/Exit, which the decoder seems to require/want.

Oh, sorry, I was assuming we had agreed to make a nsAutoExitMonitor or somesuch.

(dormant account)

Comment 16

•

16 years ago

(In reply to comment #14)
> 
> Except that the scope can't enforce that when AutoMonitor also has explicit
> Enter/Exit, which the decoder seems to require/want.  I think it's even more
> confusing to mix the two.  I want to keep "AutoMonitor == simple: you're in it
> when it's in scope", and "Monitor == not simple: you're on your own, be
> careful."

I'd replace all of the "ninja" constructs possible. If something can cause trouble, it will cause trouble.

I think it's most beneficial to discuss api in terms of rewriting, because as you rewrite you realize what is and isn't possible with existing code.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 17

•

16 years ago

(In reply to comment #15)
> (In reply to comment #14)
> > Except that the scope can't enforce that when AutoMonitor also has explicit
> > Enter/Exit, which the decoder seems to require/want.
> 
> Oh, sorry, I was assuming we had agreed to make a nsAutoExitMonitor or
> somesuch.

That could work, if the places it's released/reacquired correspond to some
lexical scope.  But see above.

Also, AutoExitMonitor is weirder than AutoUnlock, since Monitors are reentrant.
 Either AutoExitMonitor (i) Exits the most recent Enter; or (ii) exits all
previous Enters.  Option (i) makes sense (to me) by analogy to AutoUnlock, but
doesn't guarantee that succeeding code is out of the Monitor.  Option (ii)
makes sense when you want to use it before Block() or Compute(), but doesn't
provide the same analogy to AutoUnlock (which feels funny).  I don't like
either by itself.

So it seems that we can either: 
  (1) provide bare nsMonitor and let performance-critical code take care of its
own problems; or
  (2) provide AutoEnterMonitor, autoExitMonitor ((i) above), and
AutoReallyExitMonitor ((ii) above), and refactor existing code to only use
AutoXXXMonitor on lexical boundaries.
It's probably obvious that I lean towards (1) :).

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 18

•

16 years ago

(In reply to comment #16)
> (In reply to comment #14)
> > 
> > Except that the scope can't enforce that when AutoMonitor also has explicit
> > Enter/Exit, which the decoder seems to require/want.  I think it's even more
> > confusing to mix the two.  I want to keep "AutoMonitor == simple: you're in it
> > when it's in scope", and "Monitor == not simple: you're on your own, be
> > careful."
> 
> I'd replace all of the "ninja" constructs possible. If something can cause
> trouble, it will cause trouble.
> 
> I think it's most beneficial to discuss api in terms of rewriting, because as
> you rewrite you realize what is and isn't possible with existing code.

I wholly agree.  But beyond a fancy analysis/refactoring, the best I can do is provide the guidance comment "Use AutoMonitor if at all possible" (which is in the proposed patch).

But maybe the analysis/refactoring is tractable ...

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 19

•

16 years ago

(In reply to comment #17)
>   (1) provide bare nsMonitor and let performance-critical code take care of its
> own problems; or

I don't think it's about performance-critical code. I think it's good for safety too to avoid unnecessary potential interleavings.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 20

•

16 years ago

(In reply to comment #19)
> (In reply to comment #17)
> >   (1) provide bare nsMonitor and let performance-critical code take care of its
> > own problems; or
> 
> I don't think it's about performance-critical code. I think it's good for
> safety too to avoid unnecessary potential interleavings.

Hmm.  If you need to Exit the monitor at arbitrary Block() and Compute() points for better performance, then it seems like the main thread and decoding thread in your example are maybe sharing too much data.  A potentially simpler and better-performing way to reduce non-commutable interleavings would be with director classes and annotations that make data ownership (and its transfer) explicit.  But I'm really talking out of my arse here; I haven't looked at the code.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 21

•

16 years ago

(In reply to comment #20)
> But I'm really talking out of my arse here; I haven't looked at the
> code.

Skimmed the code.  Wow, it's very nice.  A couple of small, practical points:

(1) these can be uncommented:
  //  NS_ASSERTION(PR_InMonitor(mDecoder->GetMonitor()), ...);
and updated to the new NSPR macro.  (It's not quite what you want, but close enough.)
(2) you might want a Mutex+CondVar rather than a Monitor.  Code like:
  mon.Exit();
  DecodeFrame();
  mon.Enter();
will cause weird performance bugs if the monitor is reentered anywhere in the context of the call to that function (meaning that the Exit doesn't fully exit the monitor).  This impl is a great choice for safety, and I'm sure this really nice code maintains implicit invariants guarding against reentry, but I'd rather debug a Mutex deadlock.

The rest of this is BS-y philosophical stuff.  I don't mean to single out nsOggDecode, because given its model, it's amazing code: safe and moderately easy to understand.  But, I don't like the model.  Maybe it's time to start having these kinds of discussions.

nsOggDecode uses a share-everything, protect-against-race model.  But all it really needs to share is nsOggDecode<-->(command queue)<-->nsOggDecodeStateMachine, network-->(buffer)-->nsOggDecodeStateMachine, and nsOggDecode-->(buffer)-->HTML media element sink.  Here, (command queue), (buffer), and (buffer) are the only things accessed concurrently, and can be more easily protected.  These things can also be implemented very efficiently on a shared-memory machine, and more easily ported to non-shared machines (like the Cell, but more may be coming in the not-too-distant future).  The advantage is simplicity: everything in nsOggDecodeStateMachine can run without caring about concurrency.  And similarly for nsOggDecode, "network" and the "media element".  This also can yield higher performance, because each of these actors can run concurrently if need be (and compilers can generate better code).

Whereas with the current implementation, each pair above has access to each other's entire public API.  Certainly it can be argued that much of that concurrent public API is what one would get if one started with (command queue)s and (buffer)s, and optimized them by hand.  But I can't think of any case in this example where that should be necessary.

Maybe an analogy to pointer safety would illustrate what I mean.  It'd be weird to write code like:

  Foo(Ptr* const p) {
    if (p == 0)  error;
    else         p->Bar();
    if (p == 0)  error;
    else         p->Baz();
    ...
    if (p == 0)  error;
    else         p->Boo();
  }

when

  Foo(Ptr* const p) {
    assert(p);
    p->Bar();
    p->Baz();
    ...
    p->Boo();
  }

is the same thing, but simpler and more efficient.  The analogy is that the second version compartmentalizes the potential weird behavior (concurrency :: garbage pointer), instead of sprinkling the safety concerns throughout the implementation.

Anyway, pie in the sky.  Let the flames begin.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 22

•

16 years ago

New (soon to executable), rewrite specification:

rewrite SyncPrimitiveUpgrade {
  type PRLock => mozilla::Mutex
  call PR_NewLock => mozilla::Mutex::NewMutex
  call PR_DestroyLock => mozilla::Mutex::DestroyMutex
  call PR_Lock(lock) => lock->Lock()
  call PR_Unlock(lock) => lock->Unlock()

  type PRMonitor => mozilla::Monitor
  call PR_NewMonitor => mozilla::Monitor::NewMonitor
  call PR_DestroyLock => mozilla::Monitor::DestroyMonitor
  call PR_EnterMonitor(mon) => mon->Enter()
  call PR_ExitMonitor(mon) => mon->Exit()
  call PR_Wait(mon, ticks) => mon->Wait(ticks)
  call PR_Notify(mon) => mon->Notify()
  call PR_NotifyAll(mon) => mon->NotifyAll()

  type PRCondVar => mozilla::CondVar
  call PR_NewCondVar => mozilla::CondVar::NewCondVar
  call PR_DestroyCondVar => mozilla::CondVar::DestroyCondVar
  call PR_WaitCondVar(cvar, ticks) => cvar->Wait(ticks)
  call PR_NotifyCondVar(cvar) => cvar->Notify()
  call PR_NotifyAllCondVar(cvar) => cvar->NotifyAll()
}

Brendan Eich [:brendan]

Comment 23

•

16 years ago

I completely agree with everything in comment #21 -- I said something in another bug about bogo-Java-monitors being replaced by mutex+condvar. So who disagrees?

/be

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 24

•

16 years ago

Replacing the monitor with a mutex and conditional variable is completely reasonable.

Changing the decoder model to be purely message-passing, I'm not sure. We started it that way but it didn't work out very well. One problem is that for A/V sync you really do want the main thread peeking into shared state to see what the "current" video frame is. It would be interesting to have a broader discussion about that in some better forum.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 25

•

16 years ago

(In reply to comment #24)
> One problem is that for
> A/V sync you really do want the main thread peeking into shared state to see
> what the "current" video frame is.

Do you mean the actual frame contents, or some small frame identifier (e.g., sequence number/address)?  If the latter, I had imagined the decoder pushing this information to the main thread's control queue.  For an integer or similar, that should be cheap.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 26

•

16 years ago

For that to work, you'd have to flush the control queue every time the main thread paints the video, and in general the decoder will be sending events that require the main thread to run script, which we don't want to do during painting.

Maybe you could do it with multiple control channels or something. There might be some advantage over the current design, but we'd have to think about it.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 487918

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

No longer blocks: 456272

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 493080

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 493112

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 493116

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 493132

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 493138

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

16 years ago

Depends on: 493146

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 27

•

16 years ago

Round 1 of this upgrade has been completed.  For the record, all the mozilla code could have been rewritten in one shot, this afternoon and evening; however, I stopped short of that for reasons related to the tryserver and reviewer latency.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Comment 28

•

16 years ago

(In reply to comment #24)
> Replacing the monitor with a mutex and conditional variable is completely
> reasonable.
> 
> Changing the decoder model to be purely message-passing, I'm not sure. We
> started it that way but it didn't work out very well. One problem is that for
> A/V sync you really do want the main thread peeking into shared state to see
> what the "current" video frame is. It would be interesting to have a broader
> discussion about that in some better forum.

Update: we (the Electrolysis team + media team) will attempt to use our IPC language (https://wiki.mozilla.org/IPC_Protocols) to make this happen.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Assignee

Updated

•

14 years ago

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → DUPLICATE

BMO Automation

Updated

•

7 years ago

Product: Core → Firefox Build System

BMO Automation

Updated

•

2 years ago

Product: Firefox Build System → Developer Infrastructure