Don't allow a leaking memory reporter to hang workers

RESOLVED FIXED in Firefox 20

Status

()

defect
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: bent.mozilla, Assigned: bent.mozilla)

Tracking

unspecified
mozilla21
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox18 wontfix, firefox19 affected, firefox20 fixed, firefox21 fixed, firefox-esr17 wontfix, b2g18 fixed, b2g18-v1.0.1 fixed)

Details

(Whiteboard: [MemShrink])

Attachments

(1 attachment, 1 obsolete attachment)

Posted patch Patch, v1 (obsolete) — Splinter Review
Worked this out with jlebar today, our memory reporter tests actually cause workers to hang by re-registering memory reporters after workers are trying to shut down. Since this is possible we should prevent it from hanging the worker.
Attachment #711382 - Flags: review?(justin.lebar+bug)
Whiteboard: [MemShrink]
> Worked this out with jlebar today, our memory reporter tests actually cause
> workers to hang by re-registering memory reporters after workers are trying
> to shut down.

Which test does this?
All the ones that unregister and then re-register reporters... http://mxr.mozilla.org/mozilla-central/search?string=realReporters seems to capture the current list.
Most of our about:memory tests do this.

The bad sequence is:

 * Test grabs the list of memory reporters and unregisters them all.
 * Worker finishes.  It tries to unregister its reporter, but that fails, because it was already unregistered.
 * Test finishes and re-registers all the reporters.
 * Worker waits forever, because memory reporter is never freed (until XPCOM shutdown, which is too late)
r=me with some nits

>diff --git a/dom/workers/WorkerPrivate.h b/dom/workers/WorkerPrivate.h
>--- a/dom/workers/WorkerPrivate.h
>+++ b/dom/workers/WorkerPrivate.h

>+class SharedMutex

Would you mind adding a brief comment explaining what SharedMutex is?

>+{
>+  typedef mozilla::Mutex Mutex;
>+
>+  class RefCountedMutex : public Mutex
>+  {
>+  public:
>+    RefCountedMutex(const char* aName)
>+    : Mutex(aName)
>+    { }
>+
>+    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(RefCountedMutex)
>+
>+  private:
>+    ~RefCountedMutex()
>+    { }
>+  };
>+
>+  nsRefPtr<RefCountedMutex> mMutex;
>+
>+public:
>+  SharedMutex(const char* aName)
>+  : mMutex(new RefCountedMutex(aName))
>+  { }
>+
>+  SharedMutex(SharedMutex& aOther)
>+  : mMutex(aOther.mMutex)
>+  { }
>+
>+  operator Mutex&()
>+  {
>+    MOZ_ASSERT(mMutex);
>+    return *mMutex;
>+  }
>+
>+  operator const Mutex&() const
>+  {
>+    MOZ_ASSERT(mMutex);
>+    return *mMutex;
>+  }
>+
>+  void
>+  AssertCurrentThreadOwns() const
>+  {
>+    MOZ_ASSERT(mMutex);
>+    mMutex->AssertCurrentThreadOwns();
>+  }

I don't think these MOZ_ASSERTS add anything; we'll crash immediately after
each one of them if mMutex is null.  But you don't have to remove them if you
have grown attached.  :)

>diff --git a/dom/workers/WorkerPrivate.cpp b/dom/workers/WorkerPrivate.cpp
>--- a/dom/workers/WorkerPrivate.cpp
>+++ b/dom/workers/WorkerPrivate.cpp

>@@ -1430,26 +1430,27 @@ public:
>     // This should never be used when reporting with workers (hence the "?!").
>     static const char bogusMemoryReporterPath[] = "explicit/workers/?!/";
>     aCompartmentStats->extra2 = const_cast<char*>(bogusMemoryReporterPath);
>   }
> };
> 
> class MemoryReporter MOZ_FINAL : public nsIMemoryMultiReporter
> {
>-  friend class WorkerPrivate;
>-
>+  friend class mozilla::dom::workers::WorkerPrivate;
>+
>+  SharedMutex mMutex;
>   WorkerPrivate* mWorkerPrivate;
>   nsCString mRtPath;
> 
> public:
>   NS_DECL_ISUPPORTS
> 
>-  MemoryReporter(WorkerPrivate* aWorkerPrivate)
>-  : mWorkerPrivate(aWorkerPrivate)
>+  MemoryReporter(SharedMutex& aMutex, WorkerPrivate* aWorkerPrivate)
>+  : mMutex(aMutex), mWorkerPrivate(aWorkerPrivate)

If aMutex is supposed to be aWorkerPrivate's mutex, perhaps you don't need the
aMutex argument?  Otherwise, please add a comment indicating where aMutex is
supposed to come from.

>@@ -2830,161 +2849,136 @@ WorkerPrivate::ScheduleDeletion(bool aWa
> void
> WorkerPrivate::DisableMemoryReporter()
> {
> [...]
>+    // Next disable the memory reporter so that the main thread stops trying to
>+    // signal us.
>+    static_cast<MemoryReporter*>(memoryReporter.get())->Disable();

Could you just make mMemoryReporter an nsRefPtr?
Attachment #711382 - Flags: review?(justin.lebar+bug) → review+
(In reply to Justin Lebar [:jlebar] from comment #4)
> Would you mind adding a brief comment explaining what SharedMutex is?

Done.

> I don't think these MOZ_ASSERTS add anything; we'll crash immediately after
> each one of them if mMutex is null.  But you don't have to remove them if you
> have grown attached.

Yeah, I left them.

> If aMutex is supposed to be aWorkerPrivate's mutex, perhaps you don't need
> the
> aMutex argument?  Otherwise, please add a comment indicating where aMutex is
> supposed to come from.

I made it an internal class so that I don't have to expose some Mutex() getter.

> Could you just make mMemoryReporter an nsRefPtr?

Yeah, since I made it internal it's easy now.
Posted patch Patch, v1.1Splinter Review
[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 813867
User impact if declined: It's not clear... In our default configuration it's unlikely that users would ever see this shutdown hang. We see it on tinderbox because of some tricks that our about:memory tests use, but users can't trigger the same problems by actually using about:memory. The only way that users might see this bug is with some addon that messes with memory reporting somehow.
Testing completed (on m-c, etc.): Fixes bug 835106 locally, patch is on m-i now.
Risk to taking this patch (and alternatives if risky): It's touching some very racy code so there's always some risk it could cause crashes or deadlocks but I expect tinderbox will tell us quickly.
String or UUID changes made by this patch: None
Attachment #711382 - Attachment is obsolete: true
Attachment #711769 - Flags: review+
Attachment #711769 - Flags: approval-mozilla-release?
Attachment #711769 - Flags: approval-mozilla-esr17?
Attachment #711769 - Flags: approval-mozilla-beta?
Attachment #711769 - Flags: approval-mozilla-b2g18?
Attachment #711769 - Flags: approval-mozilla-aurora?
Oh, this really needs to be approved for b2g18 since I somehow can't get that tinderbox to be non-orange without this (all our other tinderboxes are mostly ok).
Comment on attachment 711769 [details] [diff] [review]
Patch, v1.1

This is blocking B2G memory work so we should take it on the b2g18 branch.
Attachment #711769 - Flags: approval-mozilla-b2g18? → approval-mozilla-b2g18+
https://hg.mozilla.org/mozilla-central/rev/845e50f1ca76
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Comment on attachment 711769 [details] [diff] [review]
Patch, v1.1

Bug 813867 only landed on b2g18, and FF20 and up. Why would we want this on Beta 19, Release 18, or ESR 17?
Attachment #711769 - Flags: approval-mozilla-release?
Attachment #711769 - Flags: approval-mozilla-release-
Attachment #711769 - Flags: approval-mozilla-esr17?
Attachment #711769 - Flags: approval-mozilla-esr17-
Attachment #711769 - Flags: approval-mozilla-aurora?
Attachment #711769 - Flags: approval-mozilla-aurora+
Attachment #711769 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Batch edit: Bugs fixed on b2g18 after 1/25 merge to v1.0 branch are fixed on v1.0.1 branch.
You need to log in before you can comment on or make changes to this bug.