Closed Bug 1410209 Opened 2 years ago Closed 2 years ago

Consider including runnable name in leak logs

Categories

(Core :: XPCOM, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: Nika, Assigned: mccr8)

References

(Blocks 2 open bugs)

Details

Attachments

(4 files, 5 obsolete files)

I've had a few cases where leaks which I notice in Gecko are caused by a runnable which is never destroyed holding the universe alive. It could be really useful if we could use the new runnable name system to print out the name of the runnable which leaked when a runnable leak is detected.
This is non-trivial, since we only have a pointer and a class name for leak logging when we print all that output.  We could try introducing class definitions (violating all sorts of layering assumptions) in there to print out more useful information.  I could see this being sort of useful in other cases, e.g. printing out URLs for nsGlobalWindow instance or somesuch.
Priority: -- → P3
Theoretically we could implement this for a small set of specific types, e.g. nsGlobalWindow and Runnable. We'd do a string comparison against the name provided & then perform a pointer cast to the concrete type we're interested in.

That may be way too fragile though. Not sure.
(from bug 1410768)

(In reply to Nathan Froyd [:froydnj] from comment #4)
> Perhaps we should have something like:
> 
> NS_IMPL_ADDREF_WITH_DESCRIPTION(klassname, lambda)

The thing I don't like about this is that we currently don't track the set of live objects in leak checking, under the default settings. I worry about what sort of slowdown we'd create if all of that machinery has to be enabled.

Another approach would be to templatize the implementation used for DEBUG_DUMP_URLS_AT_SHUTDOWN. That's already very well tested, and would not add much overhead.
(In reply to Andrew McCreight [:mccr8] from comment #3)
> (from bug 1410768)
> 
> (In reply to Nathan Froyd [:froydnj] from comment #4)
> > Perhaps we should have something like:
> > 
> > NS_IMPL_ADDREF_WITH_DESCRIPTION(klassname, lambda)
> 
> The thing I don't like about this is that we currently don't track the set
> of live objects in leak checking, under the default settings. I worry about
> what sort of slowdown we'd create if all of that machinery has to be enabled.

Mmm, that's a good point, I forgot about that.

> Another approach would be to templatize the implementation used for
> DEBUG_DUMP_URLS_AT_SHUTDOWN. That's already very well tested, and would not
> add much overhead.

I'd be a little worried about:

a) lock contention for tracking something like Runnables if we made it thread-safe; or
b) missing things created off main thread if we eschewed thread-safetyness.

But for something like BackstagePass (bug 1410768) or nsGlobalWindow (this bug), neither of those concerns would apply.  I have an idea, I'll try that.
(In reply to Nathan Froyd [:froydnj] from comment #4)
> b) missing things created off main thread if we eschewed thread-safetyness.
Ah, right, I forgot that runnables aren't main thread only. For URLs, they simply don't track them off the main thread, and I guess it works well enough. (We used to get some weird crashes due to this before it was fixed...)
If we only care about the `Runnable` type (which are the ones which won't have good names, as they don't implement ADDREF or RELEASE themselves), the name of the runnable is stored in the mName field as a const char* which IIRC is always a string literal (we don't attempt to copy it / allocate or free memory for it). Could we just pass that information into the first nsLogAddRef call and print it out alongside the classname?
nsTraceRefcnt doesn't track individual objects, when we normally run it. It just tracks the number of objects that have been created or destroyed with a particular name, so mName would need to be passed in to every call to nsLogAddRef and nsLogRelease. I think that could be implemented by adding a second level of hash table for mName into nsTraceRefcnt.

I think the "more correct" fix to the particular issue of Runnables would be to make all of the subclasses of Runnable do NS_IMPL_ADDREF_INHERITED and NS_IMPL_RELEASE_INHERITED (a macro to package those up would be nice, if one does not exist already). That's what it is for. It looks like they have to overload Run() anyways. Presumably this could be enforced through some static analysis wizardry. The drawback here is that it will likely require a gigantic patch. I can try looking into that.
I guess what we'd really want for this is some new macros that override AddRef/Release but not QI. Maybe only in debug builds? I kind of fear how much bloat this may cause (though PGO is presumably smart enough to deal with it).
Something like that might do the trick. The drawback is that it requires auditing all 851 subclasses of Runnable. Note in the patch that I had to declare a private dtor. This is needed because the static asserts in refcounting check that the object does not have a public destructor, so the current code is technically not ok, but this will make any patch to fix all of the runnables even longer.
The other option which I thought of was passing mName instead of "Runnable" as aClassName for Runnable, which would (I think) actually work fine, except that runnable names often conflict with actual class names.

We could prepend "Runnable:" to the beginning of mName, but if we do that then these strings are no longer statically allocated, so we run into memory management problems, as aClassName is I'm pretty sure expected to be statically allocated.

Theoretically we could find the places where names for runnables clash with class names & change those, or we could perhaps do some template/macro trickery to add the "Runnable:" prefix to a statically allocated string?
(In reply to Nika Layzell [:mystor] (At RBR Oct.26-28) from comment #11)
> The other option which I thought of was passing mName instead of "Runnable"
> as aClassName for Runnable, which would (I think) actually work fine, except
> that runnable names often conflict with actual class names.

Is that really all that common? I spot checked a few places and they seem to simply pass in the class name. Like you suggested, maybe we should just fix up the places that don't do this. I don't know what the proper style is for the string that is passed in. So maybe that's the most practical option?

> We could prepend "Runnable:" to the beginning of mName, but if we do that
> then these strings are no longer statically allocated, so we run into memory
> management problems, as aClassName is I'm pretty sure expected to be
> statically allocated.

nsTraceRefCnt.cpp does a PL_strdup() on the strings that are passed in, so that shouldn't matter. The main problem is that we have to create the string every time we call addref and release, which is going to add a lot of overhead to debug builds, I'm guessing.
(In reply to Nika Layzell [:mystor] (At RBR Oct.26-28) from comment #11)
> The other option which I thought of was passing mName instead of "Runnable"
> as aClassName for Runnable, which would (I think) actually work fine, except
> that runnable names often conflict with actual class names.

I spent some time manually converting a dozen or so classes using my overload macro approach, and I've come to think that your approach here is a much better idea. ;) Sure, there are some places where people don't pass in the class name, but it would be easier to just fix those.
It turns out the problem with that approach is that it causes assertions in nsTraceRefCnt for classes that already overload AddRef/Release, because they are telling it that the subclass has two sizes: its own size, plus the size of Runnable.
Blocks: 1411644
ErrorEvent is also the name of a DOM event class, which causes an
assertion in the XPCOM leak checking system with the second patch.
Attachment #8921554 - Attachment is obsolete: true
Most subclasses of Runnable don't bother to override AddRef and Release, so XPCOM leak checking ends up reporting Runnable, which makes it impossible to know what is actually leaking.

Each subclass of Runnable is already required to pass in the name of the class, which is stored in the field mName. This patch changes Runnable to use mName as the class name for XPCOM leak checking, thus giving each subclass a specific name without needing to change the hundreds of existing subclasses of Runnable.

The limitation of this approach is the classes that DO use NS_IMPL_ADDREF/RELEASE_INHERITED end up using the same class name that is used by the superclass AddRef/Release, but with a different size, which causes assertions in the leak checker. To work around this, I change NS_IMPL_ADDREF/RELEASE_INHERITED to not call into NS_LOG_ADDREF/RELEASE for classes that are a subclass of Runnable.

The problem with that is that IsBaseOf<C,D> only works when C and D are defined, but classes that include nsISupportsImpl.h do not necessarily include nsThreadUtils.h. To fix this, I take the ugly approach of doing #undef of the AddRef and Release macros in nsThreadUtils.h. Any subclass of Runnable must include that header file, and thus will get the updated macros.

I added an explicit include of nsISupportsImpl.h in nsThreadUtils.h to avoid the possibility of it not including it, but then some other file includes nsThreadUtils.h before it includes nsISupportsImpl.h, causing a redefinition.
Attachment #8922500 - Attachment is obsolete: true
Attachment #8922501 - Attachment is obsolete: true
Let me know if either of you have any feedback on this patch. I know the implementation is a little ugly.
Flags: needinfo?(nika)
Flags: needinfo?(nfroyd)
Comment on attachment 8922960 [details]
Bug 1410209, part 3 - Use Runnable::mName for the class name with XPCOM leak checking.

https://reviewboard.mozilla.org/r/194134/#review199260

::: commit-message-13f51:20
(Diff revision 1)
> +The problem with that is that IsBaseOf<C,D> only works when C and D
> +are defined, but classes that include nsISupportsImpl.h do not
> +necessarily include nsThreadUtils.h. To fix this, I take the ugly
> +approach of doing #undef of the AddRef and Release macros in
> +nsThreadUtils.h. Any subclass of Runnable must include that header
> +file, and thus will get the updated macros.

Can you use `IsConvertible<_class*, mozilla::Runnable*>` along with a forward declaration? I think that would let you get around the IsBaseOf defined requirement, and make this patch somewhat less gross.

You'd still want a bunch of comments explaining this decision.

::: xpcom/threads/nsThreadUtils.h:1795
(Diff revision 1)
>  GetMainThreadSerialEventTarget();
>  
>  } // namespace mozilla
>  
> +
> +// Oh no....

This definitely needs a better comment than this. You should at the very least include a reference to this bug number and comment, and should also include an inline explanation of what's going on here, along with why we can't do it in the main refcount file.

::: xpcom/threads/nsThreadUtils.cpp:48
(Diff revision 1)
>  {
>    *aIdleDeadline = TimeStamp();
>    return NS_OK;
>  }
>  
> -NS_IMPL_ISUPPORTS(Runnable, nsIRunnable, nsINamed)
> +#define NS_IMPL_ADDREFX(_class)                                                \

I think that having these macros and then invoking obscures intent here - I think we should have a comment noting that this is a custom implementation of AddRef and Release, and then just write the implementation literally.

You can also probably remove the mRefCnt.isThreadSafe check, as we know that Runnable has a threadsafe refcount.

Again, like above, this should have a comment explaining the difference between these implementations and the macro ones, as well as a reference back to this bug.
Flags: needinfo?(nika)
Oh, sorry, I should have emphasized was just looking for some high-level feedback. I know a lot of the low level stuff is terrible, and I agree with all of what you said on comments etc. Thanks for the suggestion on IsConvertible<>. I'll check that out. It would be great if I could get rid of the macro undef stuff. The INHERITED macros aren't too complicated, but they are complicated enough.
New piece of feedback: I was using this patch to try to diagnose a bug, and I ran into the issue that it's hard to tell which "objects" in the leak log are actually runnables, and which ones are normal objects.

I don't think I know of any way to mark them more clearly as runnables without dynamic allocations though :-/.
Attached patch Fix a windows compiler error (obsolete) — Splinter Review
I tried to run this locally and ran into a win64 compiler error. This patch just adds the string to the call to NewRunnableFunction to make it build.
Assignee: nobody → nika
Ah, good catch. I used SearchFox to find all of the call sites, and it misses this Windows-only code. I'll have to do a second pass of raw text search.
Flags: needinfo?(nfroyd)
Blocks: 1375369
Blocks: 1294873
Assignee: nika → continuation
IsConvertible worked just fine. (In case it isn't clear, this whole patch is an implementation of Nika's idea from comment 11. Fortunately runnable names don't seem to conflict with class names too often.)
Attachment #8922958 - Attachment is obsolete: true
Attachment #8923576 - Attachment is obsolete: true
Suggestions are welcome on any of these runnable names. I was not incredibly inspired when I came up with them.
Comment on attachment 8922959 [details]
Bug 1410209, part 2 - Disambiguate some service worker runnable names.

https://reviewboard.mozilla.org/r/194132/#review200668
Attachment #8922959 - Flags: review?(bkelly) → review+
Comment on attachment 8922961 [details]
Bug 1410209, part 4 - Add names to some IPC runnables.

https://reviewboard.mozilla.org/r/194136/#review200862

How did you generate all these names? o_O

I found two names that don't have the Runnable suffix, maybe you want to add it to them?

::: gfx/layers/ipc/CompositorBridgeChild.cpp:499
(Diff revision 2)
>    nsIWidget::CaptureRegisteredPlugins(aParentWidget);
>  
>    // Bounce the call to SendAllPluginsCaptured off the ImageBridgeChild loop,
>    // to make sure that the image updates on that thread have been processed.
>    ImageBridgeChild::GetSingleton()->GetMessageLoop()->PostTask(
> -    NewRunnableFunction(&ScheduleSendAllPluginsCaptured, this,
> +    NewRunnableFunction("ScheduleSendAllPluginsCaptured",

ScheduleSendAllPluginsCapturedRunnable?

::: gfx/vr/ipc/VRManagerChild.cpp:164
(Diff revision 2)
>    RefPtr<VRManagerChild> selfRef = this;
>  
>    // The DeferredDestroyVRManager task takes ownership of
>    // the VRManagerChild and will release it when it runs.
>    MessageLoop::current()->PostTask(
> -             NewRunnableFunction(DeferredDestroy, selfRef));
> +             NewRunnableFunction("VRManagerChildDestroy",

VRManagerChildDestroyRunnable?
Attachment #8922961 - Flags: review?(kchen) → review+
Comment on attachment 8924204 [details]
Bug 1410209, part 1 - Rename OSFile's ErrorEvent to avoid name conflicts.

https://reviewboard.mozilla.org/r/195422/#review201086

This is fine.
Attachment #8924204 - Flags: review?(nfroyd) → review+
Comment on attachment 8922960 [details]
Bug 1410209, part 3 - Use Runnable::mName for the class name with XPCOM leak checking.

https://reviewboard.mozilla.org/r/194134/#review201090

The hacks in `nsISupportsImpl.h` are a little less gross than I had imagined.  I think I'm OK with this, but I would like to have a discussion about whether some slightly more general facility is worth it before we commit to this.

::: xpcom/base/nsISupportsImpl.h:656
(Diff revision 2)
> + * @param _name The class name to be passed to XPCOM leak checking
>   */
> -#define NS_IMPL_ADDREF(_class)                                                \
> +#define NS_IMPL_ADDREF_NAME(_class, _name)                                    \
>  NS_IMETHODIMP_(MozExternalRefCountType) _class::AddRef(void)                  \
>  {                                                                             \
>    MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING(_class)                                  \

Maybe we should `MOZ_ASSERT` that `_name` is non-null here?  I could easily see some other class trying to use this facility and injecting nullptr into the `nsTraceRefcnt` code, which is probably not prepared for such a thing...

::: xpcom/base/nsISupportsImpl.h:710
(Diff revision 2)
>   * of object allocated with placement new).
>   */
> -#define NS_IMPL_RELEASE_WITH_DESTROY(_class, _destroy)                        \
> +#define NS_IMPL_RELEASE_NAME_WITH_DESTROY(_class, _name, _destroy)            \
>  NS_IMETHODIMP_(MozExternalRefCountType) _class::Release(void)                 \
>  {                                                                             \
>    MOZ_ASSERT(int32_t(mRefCnt) > 0, "dup release");                            \

Same thing here for asserting non-nullptrness of `_name`.

::: xpcom/base/nsISupportsImpl.h:1119
(Diff revision 2)
> -  NS_LOG_ADDREF(this, r, #Class, sizeof(*this));                              \
> +  if (!mozilla::IsConvertible<Class*, mozilla::Runnable*>::value) {             \
> +    NS_LOG_ADDREF(this, r, #Class, sizeof(*this));                            \
> +  }                                                                           \

WDYT about making this more general right off the bat?  Say if you're going to have a class that handles its own logging, require it to `typedef TrueType RefCntLoggingHandled;` or something like that, and then check that property here.  I'm not super-excited about potentially adding a bunch of class checks here, but maybe the above suggestion is too heavyweight?  (A solution that involved less modification of `nsISupportsImpl.h` would also do wonders for build times...)

::: xpcom/base/nsISupportsImpl.h:1129
(Diff revision 2)
> -  NS_LOG_RELEASE(this, r, #Class);                                            \
> +  if (!mozilla::IsConvertible<Class*, mozilla::Runnable*>::value) {             \
> +    NS_LOG_RELEASE(this, r, #Class);                                          \
> +  }                                                                           \

For avoidance of doubt, we'd check the same property here, if we did something more general.
Attachment #8922960 - Flags: review?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #37)
> WDYT about making this more general right off the bat?  Say if you're going
> to have a class that handles its own logging, require it to `typedef
> TrueType RefCntLoggingHandled;` or something like that, and then check that
> property here.  I'm not super-excited about potentially adding a bunch of
> class checks here, but maybe the above suggestion is too heavyweight? 

That sounds better than my current approach. How do I check for the property?

> (A solution that involved less modification of `nsISupportsImpl.h` would also
> do wonders for build times...)

What do you mean by this? That my current approach would require modifying the ISupports header for any other classes that might need it?
ni? for comment 38.
Flags: needinfo?(nfroyd)
(In reply to Andrew McCreight [:mccr8] from comment #38)
> (In reply to Nathan Froyd [:froydnj] from comment #37)
> > WDYT about making this more general right off the bat?  Say if you're going
> > to have a class that handles its own logging, require it to `typedef
> > TrueType RefCntLoggingHandled;` or something like that, and then check that
> > property here.  I'm not super-excited about potentially adding a bunch of
> > class checks here, but maybe the above suggestion is too heavyweight? 
> 
> That sounds better than my current approach. How do I check for the property?

Something like:

static_assert(Class::RefCntLoggingHandled::value, "...");

and I think you'd have to add a RefCntLoggingHandled to nsISupports as well?  Ugh, if people use NS_IMPL_ADDREF_INHERITED for non-nsISupports classes, you have to get into template SFINAE hacks...

Maybe this is not so simple?

> > (A solution that involved less modification of `nsISupportsImpl.h` would also
> > do wonders for build times...)
> 
> What do you mean by this? That my current approach would require modifying
> the ISupports header for any other classes that might need it?

Right.
Flags: needinfo?(nfroyd)
Blocks: 1416927
It looks like Nathan is off until next year (and you are off until next week...). I was going to ask him this, but since he's not around, what do you think about landing this without the generalized form Nathan suggested? I think there's a strong argument for not prematurely generalizing something, and it feels like it isn't entirely clear how this would be generalized, and I'd like to get this landed at some point to help with intermittent leaks.
Flags: needinfo?(erahm)
froydnj, over IRC said:
mccr8: if the more general solution is really a hassle, I guess we can just use the current patch and refine things later
Flags: needinfo?(erahm)
Looks like nothing broke, in terms of runnable name collisions: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cad0cd22e97848f9d02a955b573fd2c6dc0526d2
Comment on attachment 8922960 [details]
Bug 1410209, part 3 - Use Runnable::mName for the class name with XPCOM leak checking.

https://reviewboard.mozilla.org/r/194134/#review214652

::: xpcom/base/nsISupportsImpl.h:653
(Diff revision 2)
>  /**
>   * Use this macro to implement the AddRef method for a given <i>_class</i>
>   * @param _class The name of the class implementing the method
> + * @param _name The class name to be passed to XPCOM leak checking
>   */
> -#define NS_IMPL_ADDREF(_class)                                                \
> +#define NS_IMPL_ADDREF_NAME(_class, _name)                                    \

shouldn't this be called something like
NS_IMPL_NAMED_ADDREF or NS_IMPL_ADDREF_WITH_NAME.
That way the macro name still somehow hints that it is implementing *addref* + something else.
Perhaps _NAMED_ would work then better with
_WITH_DESTROY in release case.

::: xpcom/base/nsISupportsImpl.h:1119
(Diff revision 2)
>  #define NS_IMPL_ADDREF_INHERITED(Class, Super)                                \
>  NS_IMETHODIMP_(MozExternalRefCountType) Class::AddRef(void)                   \
>  {                                                                             \
>    MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING(Class)                                   \
>    nsrefcnt r = Super::AddRef();                                               \
> -  NS_LOG_ADDREF(this, r, #Class, sizeof(*this));                              \
> +  if (!mozilla::IsConvertible<Class*, mozilla::Runnable*>::value) {             \

extra space before \ ?

::: xpcom/base/nsISupportsImpl.h:1119
(Diff revision 2)
>  #define NS_IMPL_ADDREF_INHERITED(Class, Super)                                \
>  NS_IMETHODIMP_(MozExternalRefCountType) Class::AddRef(void)                   \
>  {                                                                             \
>    MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING(Class)                                   \
>    nsrefcnt r = Super::AddRef();                                               \
> -  NS_LOG_ADDREF(this, r, #Class, sizeof(*this));                              \
> +  if (!mozilla::IsConvertible<Class*, mozilla::Runnable*>::value) {             \

If we want to do what froydnj suggests, that can be done in a followup.
Attachment #8922960 - Flags: review?(bugs) → review+
Blocks: 1426696
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7b1a39e7a036
part 1 - Rename OSFile's ErrorEvent to avoid name conflicts. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/4b7fb8a3e422
part 2 - Disambiguate some service worker runnable names. r=bkelly
https://hg.mozilla.org/integration/autoland/rev/cc72b46a44f1
part 3 - Use Runnable::mName for the class name with XPCOM leak checking. r=smaug
https://hg.mozilla.org/integration/autoland/rev/649873f16998
part 4 - Add names to some IPC runnables. r=kanru
Depends on: 1426779
You need to log in before you can comment on or make changes to this bug.