Closed Bug 913653 Opened 6 years ago Closed 6 years ago

IOInterposer should not call observers while holding mutex

Categories

(Core :: General, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla31

People

(Reporter: jonasfj, Assigned: aklotz)

References

Details

Attachments

(1 file, 4 obsolete files)

Following p1-io-interposer-refactoring.patch (pending review) attached to bug 902587,
we'll be locking the list of IOInterposeObservers when we:
  A) Report an IO observation,
  B) Register an observer, and
  C) Unregister an observer.

This will prevent IO operations from being reported in parallel.

We should investigate the performance impact, especially, since later patches in
bug 902587 are looking to make an IOInterposerObserver that queries the OS for
filenames on each IO operation.
Though, that particular case, may be mitigated by a filename cache.

Nevertheless, as aklotz pointed out, future IOInterposeObservers are bound do even
more expensive things.

**To measure performance impact** I suggest that we count the accumulated time
threads wait to acquire the lock in question. But we'll probably need to measure
this on multiples types of hardware.

I have following work in progress ideas for concurrence control:
  1. Single mutex locking the entire list in (A), (B) and (C) [Current Approach]
  2. Shared mutex that allows for multiple readers and exclusive writes,
     (See: http://stackoverflow.com/a/244376/68333, this is not without overhead)
  3. List locked at entry level or similar structure.
     (Only effective if we have multiple observers).
  4. ThreadLocal observer lists, that can be locked and updated by functions that
     modify the observer lists.

In any event, we need to consider the case where an IOInterposerObserver decides
to lock in order to update a hash table with filenames.
(I mention this case as Part 3 of bug 902587, will do exactly this).

So at the end of the day we might also consider a dedicated thread for IO reporting,
then using a concurrent producer-consumer pattern where IO reporters produce
observations and append them to a queue, for a consumer thread to invoke observers
with. Though, this would conflict with lazy filename querying, but then again that
could be mitigated by filehandle/filename caching.

Thoughts? - suggestions are most welcome :)
So just to get some numbers I ran crashtest and crashtest-ipc with a hack that counts the number
of seconds we wait to acquire the lock in IOInterposer::Report().

Suite:          Accumulated Wait:     Number of IO Reports:
crashtest       0.289 s               247863
crashtest-ipc   0.194 s               248656

I think the crashtests does a lot of I/O, though perhaps not so much concurrent I/O.
Anyways, please suggest more appropriate test suites or a better approach to testing this.
(Maybe I should also record longest wait, and accumulated I/O time). 

Note. These tests were run with patches Part 1, 2A, 2B, 2C and 3 from bug 902587.
So the TelemetryIOInterposeObserver does query the OS for a filename on each IO observation.

Again, this is just some preliminary numbers. But we have to keep in mind that compared to
reading/writing from/to disk waiting for access to a data structure might be a non-issue.
Attached patch Experiment (obsolete) — Splinter Review
Assignee: nobody → aklotz
Attachment #8341409 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Component: Gecko Profiler → General
Summary: Mutex vs shared lock performance for the IOInterposers observer lists → IOInterposer should not call observers while holding mutex
The idea with this patch is that there is a generation count associated with the observer list.

Every time an observer is registered or unregistered, the lock is taken and the generation count is incremented. Furthermore, during (un)registration the existing master list isn't modified; instead a new copy is created and the modifications are made to this new copy prior to anointing it as the new master list. Finally the lock is released.

Threads that are monitored by IOInterposer must register and unregister themselves. Registration allocates some thread-local data that includes a private copy of the generation count and a RefPtr to the observer lists.

If a thread's generation count equals the master generation count, it uses its local reference to the lists. This is the common case.

If the two counts differ, the thread takes the lock, modifies its RefPtr to point to the current master list, and releases the lock. This path is triggered rarely, since we only (un)register observers a handful of times over the lifetime of the process.

Note that I declared the Atomics as having ReleaseAcquire semantics. I don't think that those variables need sequential consistency with respect to other variables.
Attachment #8398078 - Attachment is obsolete: true
Attachment #8400850 - Flags: review?(nfroyd)
Whoops, I had to fix a mistake. Please see comment 4 for patch details.
Attachment #8400850 - Attachment is obsolete: true
Attachment #8400850 - Flags: review?(nfroyd)
Attachment #8400908 - Flags: review?(nfroyd)
Comment on attachment 8400908 [details] [diff] [review]
Eliminate need to take lock when calling IOInterposer observers

Review of attachment 8400908 [details] [diff] [review]:
-----------------------------------------------------------------

Apologies for the review latency.  I think I buy all this, r=me with the comments below addressed.

::: ipc/chromium/src/base/thread.cc
@@ +139,5 @@
>  
>  void Thread::ThreadMain() {
>    char aLocal;
>    profiler_register_thread(name_.c_str(), &aLocal);
> +  mozilla::IOInterposer::RegisterCurrentThread();

This sort of profiler + IOInterposer registration makes me think we should have some sort of "hey, there's a new system thread" observer thingy.  Can you file a new bug for that and CC BenWa to see what he thinks about that?

@@ +168,5 @@
>  
>    // Assert that MessageLoop::Quit was called by ThreadQuitTask.
>    DCHECK(GetThreadWasQuitProperly());
>  
> +  mozilla::IOInterposer::UnregisterCurrentThread();

Filing a new bug on AutoThreadRegistrar or something like that would be good too.

::: xpcom/build/IOInterposer.cpp
@@ +14,5 @@
> +// does not clean up its data before shutdown.
> +#undef MOZILLA_INTERNAL_API
> +#include "mozilla/RefPtr.h"
> +#define MOZILLA_INTERNAL_API
> +#endif // defined(MOZILLA_INTERNAL_API)

Do you want to:

#else
#include "mozilla/RefPtr.h"
#endif

?  Or maybe:

#else
#error blah blah blah
#endif

there?

@@ +40,5 @@
> +void VectorRemove(std::vector<T>& vector, const T& element)
> +{
> +  typename std::vector<T>::iterator newEnd = std::remove(vector.begin(),
> +                                                         vector.end(), element);
> +  vector.erase(newEnd, vector.end());

Erasing from after the removed element to the end of the vector seems wrong.

@@ +146,5 @@
> +        {
> +          // Invalid IO operation, see documentation comment for
> +          // IOInterposer::Report()
> +          MOZ_ASSERT(false);
> +          // Just ignore is in non-debug builds.

Nit: "ignore it"

@@ +191,5 @@
> +class MasterList
> +{
> +public:
> +  MasterList()
> +    : mLock(PR_NewLock())

Is there a reason that you are using PR_Locks directly rather than going through mozilla::Mutex?  I guess we were already using raw PR_Locks, so a followup to use mozilla::Mutex would be good...unless there's some rationale here that I don't understand and/or have forgotten.

@@ +213,5 @@
> +  void
> +  Register(IOInterposeObserver::Operation aOp, IOInterposeObserver* aObserver)
> +  {
> +    AutoPRLock lock(mLock);
> +    

Nit: trailing whitespace.

@@ +311,5 @@
> +    }
> +    mObserverLists = newLists;
> +    mCurrentGeneration++;
> +  }
> +  

Nit: trailing whitespace.

@@ +348,5 @@
> +  IOInterposeObserver::Operation    mObservedOperations;
> +  // Used for quickly disabling everything by IOInterposer::Disable()
> +  Atomic<bool, ReleaseAcquire>      mIsEnabled;
> +  // Used to inform threads that the master observer list has changed
> +  Atomic<uint32_t, ReleaseAcquire>  mCurrentGeneration;

I don't think you can justify the ReleaseAcquire ordering of these members.  Please use the default or provide a detailed defense of ReleaseAcquire ordering.

@@ +509,4 @@
>      return;
>    }
> +  MOZ_ASSERT(!sThreadLocalData.get());
> +  PerThreadData* curThreadData = new PerThreadData(aIsMainThread);

Since you are removing the Metro-specific stuff here, can we do something like:

static bool everSawMainThread = false;
if (aIsMainThread) {
  MOZ_ASSERT(!everSawMainThread);
  everSawMainThread = true;
}

?

@@ +518,5 @@
> +{
> +  if (!sThreadLocalData.initialized()) {
> +    return;
> +  }
> +  PerThreadData* curThreadData = sThreadLocalData.get();

Probably want to MOZ_ASSERT this so calling UnregisterCurrentThread twice gets appropriately flagged.
Attachment #8400908 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd (:froydnj) from comment #6)
> Comment on attachment 8400908 [details] [diff] [review]
> Eliminate need to take lock when calling IOInterposer observers
> 
> Review of attachment 8400908 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Apologies for the review latency.  I think I buy all this, r=me with the
> comments below addressed.
> 
> ::: ipc/chromium/src/base/thread.cc
> @@ +139,5 @@
> >  
> >  void Thread::ThreadMain() {
> >    char aLocal;
> >    profiler_register_thread(name_.c_str(), &aLocal);
> > +  mozilla::IOInterposer::RegisterCurrentThread();
> 
> This sort of profiler + IOInterposer registration makes me think we should
> have some sort of "hey, there's a new system thread" observer thingy.  Can
> you file a new bug for that and CC BenWa to see what he thinks about that?

Filed bug 993291.

> 
> @@ +168,5 @@
> >  
> >    // Assert that MessageLoop::Quit was called by ThreadQuitTask.
> >    DCHECK(GetThreadWasQuitProperly());
> >  
> > +  mozilla::IOInterposer::UnregisterCurrentThread();
> 
> Filing a new bug on AutoThreadRegistrar or something like that would be good
> too.

I commented on that in the new bug.

> 
> ::: xpcom/build/IOInterposer.cpp
> @@ +14,5 @@
> > +// does not clean up its data before shutdown.
> > +#undef MOZILLA_INTERNAL_API
> > +#include "mozilla/RefPtr.h"
> > +#define MOZILLA_INTERNAL_API
> > +#endif // defined(MOZILLA_INTERNAL_API)
> 
> Do you want to:
> 
> #else
> #include "mozilla/RefPtr.h"
> #endif
> 
> ?  Or maybe:
> 
> #else
> #error blah blah blah
> #endif
> 
> there?

Good catch. Fixed.

> 
> @@ +40,5 @@
> > +void VectorRemove(std::vector<T>& vector, const T& element)
> > +{
> > +  typename std::vector<T>::iterator newEnd = std::remove(vector.begin(),
> > +                                                         vector.end(), element);
> > +  vector.erase(newEnd, vector.end());
> 
> Erasing from after the removed element to the end of the vector seems wrong.

Actually it's correct - std::remove just shifts the remaining elements to the front of the vector, so newEnd points to the first "removed" element at the back -- the range form of erase still needs to be called to clear out the extra elements.

> 
> @@ +146,5 @@
> > +        {
> > +          // Invalid IO operation, see documentation comment for
> > +          // IOInterposer::Report()
> > +          MOZ_ASSERT(false);
> > +          // Just ignore is in non-debug builds.
> 
> Nit: "ignore it"

Fixed.

> 
> @@ +191,5 @@
> > +class MasterList
> > +{
> > +public:
> > +  MasterList()
> > +    : mLock(PR_NewLock())
> 
> Is there a reason that you are using PR_Locks directly rather than going
> through mozilla::Mutex?  I guess we were already using raw PR_Locks, so a
> followup to use mozilla::Mutex would be good...unless there's some rationale
> here that I don't understand and/or have forgotten.

The concern here was that IOInterposer is intentionally leaked until the process terminates. Jonas considered using OffTheBooksMutex here but even with that we could crash in debug builds due to BlockingResourceBase::Shutdown clearing its static members.

> 
> @@ +213,5 @@
> > +  void
> > +  Register(IOInterposeObserver::Operation aOp, IOInterposeObserver* aObserver)
> > +  {
> > +    AutoPRLock lock(mLock);
> > +    
> 
> Nit: trailing whitespace.

Fixed.

> 
> @@ +311,5 @@
> > +    }
> > +    mObserverLists = newLists;
> > +    mCurrentGeneration++;
> > +  }
> > +  
> 
> Nit: trailing whitespace.

Fixed.

> 
> @@ +348,5 @@
> > +  IOInterposeObserver::Operation    mObservedOperations;
> > +  // Used for quickly disabling everything by IOInterposer::Disable()
> > +  Atomic<bool, ReleaseAcquire>      mIsEnabled;
> > +  // Used to inform threads that the master observer list has changed
> > +  Atomic<uint32_t, ReleaseAcquire>  mCurrentGeneration;
> 
> I don't think you can justify the ReleaseAcquire ordering of these members. 
> Please use the default or provide a detailed defense of ReleaseAcquire
> ordering.

Switched to default.

> 
> @@ +509,4 @@
> >      return;
> >    }
> > +  MOZ_ASSERT(!sThreadLocalData.get());
> > +  PerThreadData* curThreadData = new PerThreadData(aIsMainThread);
> 
> Since you are removing the Metro-specific stuff here, can we do something
> like:
> 
> static bool everSawMainThread = false;
> if (aIsMainThread) {
>   MOZ_ASSERT(!everSawMainThread);
>   everSawMainThread = true;
> }
> 
> ?

There's still some Metro-specific code that calls into this function. Until all of that is purged I don't think that it is safe to make the "first caller is always the main thread" assumption.

> 
> @@ +518,5 @@
> > +{
> > +  if (!sThreadLocalData.initialized()) {
> > +    return;
> > +  }
> > +  PerThreadData* curThreadData = sThreadLocalData.get();
> 
> Probably want to MOZ_ASSERT this so calling UnregisterCurrentThread twice
> gets appropriately flagged.

Done.
Updated patch. Carrying forward r+.
Attachment #8400908 - Attachment is obsolete: true
Attachment #8403537 - Flags: review+
(In reply to Aaron Klotz [:aklotz] from comment #7)
> > @@ +40,5 @@
> > > +void VectorRemove(std::vector<T>& vector, const T& element)
> > > +{
> > > +  typename std::vector<T>::iterator newEnd = std::remove(vector.begin(),
> > > +                                                         vector.end(), element);
> > > +  vector.erase(newEnd, vector.end());
> > 
> > Erasing from after the removed element to the end of the vector seems wrong.
> 
> Actually it's correct - std::remove just shifts the remaining elements to
> the front of the vector, so newEnd points to the first "removed" element at
> the back -- the range form of erase still needs to be called to clear out
> the extra elements.

I read the docs more carefully and you are indeed correct.  Hooray STL.

> > @@ +191,5 @@
> > > +class MasterList
> > > +{
> > > +public:
> > > +  MasterList()
> > > +    : mLock(PR_NewLock())
> > 
> > Is there a reason that you are using PR_Locks directly rather than going
> > through mozilla::Mutex?  I guess we were already using raw PR_Locks, so a
> > followup to use mozilla::Mutex would be good...unless there's some rationale
> > here that I don't understand and/or have forgotten.
> 
> The concern here was that IOInterposer is intentionally leaked until the
> process terminates. Jonas considered using OffTheBooksMutex here but even
> with that we could crash in debug builds due to
> BlockingResourceBase::Shutdown clearing its static members.

OK.  Can you provide a comment about the intentional leaking here?

> > @@ +509,4 @@
> > >      return;
> > >    }
> > > +  MOZ_ASSERT(!sThreadLocalData.get());
> > > +  PerThreadData* curThreadData = new PerThreadData(aIsMainThread);
> > 
> > Since you are removing the Metro-specific stuff here, can we do something
> > like:
> > 
> > static bool everSawMainThread = false;
> > if (aIsMainThread) {
> >   MOZ_ASSERT(!everSawMainThread);
> >   everSawMainThread = true;
> > }
> > 
> > ?
> 
> There's still some Metro-specific code that calls into this function. Until
> all of that is purged I don't think that it is safe to make the "first
> caller is always the main thread" assumption.

Hm.  Then is it safe to remove the XP_WIN code?  (Writing this in haste, or I'd have a longer look at it.)
(In reply to Nathan Froyd (:froydnj) from comment #9)
> > > @@ +191,5 @@
> > > > +class MasterList
> > > > +{
> > > > +public:
> > > > +  MasterList()
> > > > +    : mLock(PR_NewLock())
> > > 
> > > Is there a reason that you are using PR_Locks directly rather than going
> > > through mozilla::Mutex?  I guess we were already using raw PR_Locks, so a
> > > followup to use mozilla::Mutex would be good...unless there's some rationale
> > > here that I don't understand and/or have forgotten.
> > 
> > The concern here was that IOInterposer is intentionally leaked until the
> > process terminates. Jonas considered using OffTheBooksMutex here but even
> > with that we could crash in debug builds due to
> > BlockingResourceBase::Shutdown clearing its static members.
> 
> OK.  Can you provide a comment about the intentional leaking here?

We already have one at IOInterposer.cpp:343.

> 
> > > @@ +509,4 @@
> > > >      return;
> > > >    }
> > > > +  MOZ_ASSERT(!sThreadLocalData.get());
> > > > +  PerThreadData* curThreadData = new PerThreadData(aIsMainThread);
> > > 
> > > Since you are removing the Metro-specific stuff here, can we do something
> > > like:
> > > 
> > > static bool everSawMainThread = false;
> > > if (aIsMainThread) {
> > >   MOZ_ASSERT(!everSawMainThread);
> > >   everSawMainThread = true;
> > > }
> > > 
> > > ?
> > 
> > There's still some Metro-specific code that calls into this function. Until
> > all of that is purged I don't think that it is safe to make the "first
> > caller is always the main thread" assumption.
> 
> Hm.  Then is it safe to remove the XP_WIN code?  (Writing this in haste, or
> I'd have a longer look at it.)

It is. There is XP_WIN stuff in IOInterposer::Init that does what the removed XP_WIN stuff did.
https://hg.mozilla.org/mozilla-central/rev/179b22b687c1
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Depends on: 999888
You need to log in before you can comment on or make changes to this bug.