Closed Bug 1347963 Opened 7 years ago Closed 7 years ago

add a recursive mutex type and replace uses of ReentrantMonitor where possible

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox55 --- wontfix
firefox56 --- fixed

People

(Reporter: froydnj, Assigned: froydnj)

References

Details

Attachments

(8 files, 3 obsolete files)

12.47 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
22.98 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
14.98 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
2.69 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
5.88 KB, patch
mozbugz
: review+
Details | Diff | Splinter Review
16.54 KB, patch
kats
: review+
Details | Diff | Splinter Review
2.31 KB, patch
erahm
: review+
Details | Diff | Splinter Review
16.11 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
ReentrantMonitor gets (ab)used in our codebase for cases where recursive mutexes are needed.  But ReentrantMonitor is more expensive than simple recursive mutexes because of its need to support notifications.  Furthermore, all the platforms we care about support "native" recursive mutexes:

* Unices: PTHREAD_MUTEX_RECURSIVE mutex types;
* Windows: CRITICAL_SECTION is acquirable recursively.

This bug would have several steps:

* Create a RecursiveMutex class that simply wraps ReentrantMonitor, but exposes a mutex-like interface;
* Replace all ReentrantMonitor instances that don't use Wait/Notify/etc. with RecursiveMutex.  This is a bit tedious due to the need to modify the RAII lockers as well.
* Add recursive mutex variants to mozglue.  We *could* put these locks solely in XPCOM, since I doubt JS will want recursive locks anytime soon (?), but all the infrastructure to support these lives in mozglue.
* Modify mozilla::RecursiveMutex to use the mozglue locks.
* Profit.

Tiny benchmarks on my Linux machine suggest that a lock/unlock pair for true recursive mutexes is 3-5x faster than enter/exit for ReentrantMonitor.  I haven't checked other platforms, but it wouldn't surprise me if the performance numbers were roughly similar there.  (If you just read the NSPR implementation for PRMonitor, each enter and exit requires a separate lock/unlock for the underlying mutex in PRMonitor, so it stands to reason that a "native" mutex would be at least 2x faster...)
Do you have a list of these things that currently need recursive {mutex,monitor}? I have found very few situations where this isn't a design flaw, and it might be worth just getting rid of them entirely if there are only a few.
Flags: needinfo?(nfroyd)
I don't have anything more sophisticated than:

https://dxr.mozilla.org/mozilla-central/search?q=type-ref%3AReentrantMonitor&redirect=true

Scanning through those, it looks like the media stack uses them fairly heavily, there's some GFX layers bits, image encoders, various networking things...

I'm not a huge fan of them either, but last time I tried chasing a couple down in the media stack, it looked like we really could enter them recursively for valid reasons...I might not have really understood the code, though.  When I was poking around yesterday, the dom/flyweb/ one was definitely unnecessary AFAICT, but I haven't tried understanding the other ones.
Flags: needinfo?(nfroyd)
I have recently done -- and am in the middle of doing some more -- refactoring in the profiler to avoid deadlocks where recursive mutexes would have made things a lot simpler. The profiler has a mutex for a bunch of global state, and most profiler operations lock the mutex, but if any of these operations call back outside the profiler there's a chance that the outside code will then call back into the profiler. I don't know if this counts as a design flaw.
Having a proper recursively-acquirable mutex type makes intent clearer,
and RecursiveMutex also happens to be somewhat faster than
ReentrantMonitor.
Attachment #8883308 - Flags: review?(erahm)
Making nsHttpRequestHead slightly faster with RecursiveMutex is a good thing.
Attachment #8883309 - Flags: review?(mcmanus)
Making nsHttpResponseHead slightly faster with RecursiveMutex is a good thing.
Attachment #8883310 - Flags: review?(mcmanus)
Making nsInputStreamPump slightly faster with RecursiveMutex is a good thing.
Attachment #8883311 - Flags: review?(mcmanus)
Assignee: nobody → nfroyd
Making ChannelEventQueue slightly faster with RecursiveMutex is a good thing.
Attachment #8883368 - Flags: review?(mcmanus)
Making MediaQueue slightly faster with RecursiveMutex is a good thing.
Attachment #8883369 - Flags: review?(gsquelart)
Making ImageContainer slightly faster with RecursiveMutex is a good thing.
We need to fix up some cargo-culting of includes along the way, though.
Attachment #8883370 - Flags: review?(bugmail)
Remember to add <pthread.h> so we're not cargo-culting it.
Attachment #8883371 - Flags: review?(erahm)
Attachment #8883308 - Attachment is obsolete: true
Attachment #8883308 - Flags: review?(erahm)
Attachment #8883370 - Flags: review?(bugmail) → review+
Comment on attachment 8883369 [details] [diff] [review]
part 6 - make MediaQueue use RecursiveMutex

Review of attachment 8883369 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent, thanks for that.

A future exercise would be to find places where a recursive/reentrant mutex/monitor is used, when it could fairly easily replaced with a non-recursive one. (E.g., MediaCache, which I may fix one day...)
Attachment #8883369 - Flags: review?(gsquelart) → review+
(In reply to Gerald Squelart [:gerald] from comment #12)
> A future exercise would be to find places where a recursive/reentrant
> mutex/monitor is used, when it could fairly easily replaced with a
> non-recursive one. (E.g., MediaCache, which I may fix one day...)

Oooh, MediaCache can be non-recursive?  I was going to convert it to RecursiveMutex tomorrow, but if it could be plain Mutex, so much the better!
Flags: needinfo?(gsquelart)
(In reply to Nathan Froyd [:froydnj] from comment #13)
> Oooh, MediaCache can be non-recursive?  I was going to convert it to
> RecursiveMutex tomorrow, but if it could be plain Mutex, so much the better!

MediaCache uses Wait and NotifyAll (so that readers can block, waiting for data from the network), so I doubt you can use a Mutex there.

But I think at least a non-reentrant Monitor should be possible, though my previous attempt at precursory refactoring ended in tears (see bug 1374173). I'll get back to it, some day...
Flags: needinfo?(gsquelart)
Comment on attachment 8883371 [details] [diff] [review]
part 1 - introduce mozilla::RecursiveMutex

Review of attachment 8883371 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just a few questions so f+ for now.

::: xpcom/tests/gtest/TestRecursiveMutex.cpp
@@ +13,5 @@
> +
> +// Basic test to make sure the underlying implementation of RecursiveMutex is,
> +// well, actually recursively acquirable.
> +
> +TEST(RecursiveMutex, SmokeTest)

Thanks for adding this! Do you think we should add a DeadlockDetector test as well?

@@ +19,5 @@
> +  RecursiveMutex mutex("testing mutex");
> +
> +  RecursiveMutexAutoLock lock1(mutex);
> +  RecursiveMutexAutoLock lock2(mutex);
> +  

nit: trailing space

::: xpcom/threads/BlockingResourceBase.cpp
@@ +527,5 @@
>  }
>  
>  
>  //
> +// Debug implementation of RecursiveMutex

It's kind of a shame we're copy and pasting this, I wonder if we could make ReentrantMonitor share with RecursiveMutex?

::: xpcom/threads/RecursiveMutex.cpp
@@ +25,5 @@
> +  // This number was adapted from NSPR.
> +  static const DWORD sLockSpinCount = 100;
> +  BOOL r = InitializeCriticalSectionEx(NativeHandle(mMutex),
> +                                       sLockSpinCount,
> +                                       CRITICAL_SECTION_NO_DEBUG_INFO);

Won't this make aklotz sad?

::: xpcom/threads/RecursiveMutex.h
@@ +62,5 @@
> +  void* mMutex[6];
> +#endif
> +};
> +
> +class MOZ_STACK_CLASS RecursiveMutexAutoLock

Do we want MOZ_RAII instead?

@@ +65,5 @@
> +
> +class MOZ_STACK_CLASS RecursiveMutexAutoLock
> +{
> +public:
> +  explicit RecursiveMutexAutoLock(RecursiveMutex& aRecursiveMutex)

Shouldn't we do the whole GAURD_OBJECT_NOTIFIER dance here? Or is that obsolete now?

@@ +86,5 @@
> +
> +  mozilla::RecursiveMutex* mRecursiveMutex;
> +};
> +
> +class MOZ_STACK_CLASS RecursiveMutexAutoUnlock

Same comments.
Attachment #8883371 - Flags: review?(erahm) → feedback+
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #15)
> ::: xpcom/threads/BlockingResourceBase.cpp
> @@ +527,5 @@
> >  }
> >  
> >  
> >  //
> > +// Debug implementation of RecursiveMutex
> 
> It's kind of a shame we're copy and pasting this, I wonder if we could make
> ReentrantMonitor share with RecursiveMutex?

We could do it with lambda and macros (so strings for error messages had the correct names), I guess, but I'm not super motivated to do this.
Addressed review comments.
Attachment #8884010 - Flags: review?(erahm)
Attachment #8883371 - Attachment is obsolete: true
Comment on attachment 8884010 [details] [diff] [review]
part 1 - introduce mozilla::RecursiveMutex

Review of attachment 8884010 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for adding the test!
Attachment #8884010 - Flags: review?(erahm) → review+
Attachment #8883310 - Flags: review?(mcmanus) → review+
Attachment #8883311 - Flags: review?(mcmanus) → review+
Attachment #8883368 - Flags: review?(mcmanus) → review+
Attachment #8883309 - Flags: review?(mcmanus) → review+
These macros can be used in cases where the `mozilla` namespace might
not refer to the toplevel `mozilla` namespace that was intended.  To
ensure that the macros always refer to the `mozilla` namespace in the
global namespace, use the appropriate qualification.
Attachment #8889613 - Flags: review?(erahm)
Minor modifications for tests, since apparently I wasn't compiling with
a debug build locally.
Attachment #8889614 - Flags: review+
Attachment #8884010 - Attachment is obsolete: true
Attachment #8889613 - Flags: review?(erahm) → review+
Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cec8d814aa83
part 0 - use global qualification for GuardObjects.h macros; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/6c1c1d869619
part 1 - introduce mozilla::RecursiveMutex; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/7232578cc2b1
part 2 - make nsHttpRequestHead use RecursiveMutex; r=mcmanus
https://hg.mozilla.org/integration/mozilla-inbound/rev/a9be4c2f9c2a
part 3 - make nsHttpResponseHead use RecursiveMutex; r=mcmanus
https://hg.mozilla.org/integration/mozilla-inbound/rev/b9bd9b93f786
part 4 - make nsInputStreamPump use RecursiveMutex; r=mcmanus
https://hg.mozilla.org/integration/mozilla-inbound/rev/069e7b402e3d
part 5 - make ChannelEventQueue use RecursiveMutex; r=mcmanus
https://hg.mozilla.org/integration/mozilla-inbound/rev/6c1336c071cc
part 6 - make MediaQueue use RecursiveMutex; r=gerald
https://hg.mozilla.org/integration/mozilla-inbound/rev/12e33b9d6f91
part 7 - make ImageContainer use RecursiveMutex; r=kats
Too late for 55. Mark 55 won't fix.
Blocks: 1385335
Depends on: 1396307
Depends on: 1397037
You need to log in before you can comment on or make changes to this bug.