Closed Bug 335120 Opened 18 years ago Closed 18 years ago

shutdown crash

Categories

(Toolkit Graveyard :: Data Collection/Metrics, defect)

defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bryner, Assigned: bryner)

Details

Attachments

(2 files)

I sometimes get the following crash on shutdown, which I believe is due to extensions/metrics (the nsObserverList subject is "webnavigation-create").  The nsCOMPtr seems to point to freed memory:

>	xpcom_core.dll!nsCOMPtr_base::~nsCOMPtr_base()  Line 81 + 0xd	C++
 	xpcom_core.dll!nsCOMPtr<nsISupports>::~nsCOMPtr<nsISupports>()  + 0xf	C++
 	xpcom_core.dll!ObserverRef::~ObserverRef()  + 0x12	C++
 	xpcom_core.dll!ObserverRef::`scalar deleting destructor'()  + 0xf	C++
 	xpcom_core.dll!nsTArrayElementTraits<ObserverRef>::Destruct(ObserverRef * e=0x02271fa0)  Line 146	C++
 	xpcom_core.dll!nsTArray<ObserverRef>::DestructRange(unsigned int start=0, unsigned int count=1)  Line 574 + 0x9	C++
 	xpcom_core.dll!nsTArray<ObserverRef>::RemoveElementsAt(unsigned int start=0, unsigned int count=1)  Line 468	C++
 	xpcom_core.dll!nsTArray<ObserverRef>::Clear()  Line 479	C++
 	xpcom_core.dll!nsTArray<ObserverRef>::~nsTArray<ObserverRef>()  Line 215 + 0xf	C++
 	xpcom_core.dll!nsObserverList::~nsObserverList()  Line 80 + 0x29	C++
 	xpcom_core.dll!nsObserverList::`scalar deleting destructor'()  + 0xf	C++
 	xpcom_core.dll!nsTHashtable<nsObserverList>::s_ClearEntry(PLDHashTable * table=0x010d6d58, PLDHashEntryHdr * entry=0x02221398)  Line 404	C++
 	xpcom_core.dll!PL_DHashTableRawRemove(PLDHashTable * table=0x010d6d58, PLDHashEntryHdr * entry=0x02221398)  Line 653 + 0x10	C
 	xpcom_core.dll!PL_DHashTableEnumerate(PLDHashTable * table=0x010d6d58, PLDHashOperator (PLDHashTable *, PLDHashEntryHdr *, unsigned int, void *)* etor=0x00271190, void * arg=0x00000000)  Line 687 + 0xd	C
 	xpcom_core.dll!nsTHashtable<nsObserverList>::Clear()  Line 248 + 0x10	C++
 	xpcom_core.dll!nsObserverService::Shutdown()  Line 93	C++
 	xpcom_core.dll!NS_ShutdownXPCOM_P(nsIServiceManager * servMgr=0x00acb5dc)  Line 733	C++
 	xul.dll!ScopedXPCOMStartup::~ScopedXPCOMStartup()  Line 556 + 0xc	C++
 	xul.dll!XRE_main(int argc=3, char * * argv=0x00ac7ea8, const nsXREAppData * aAppData=0x00403090)  Line 2403	C++
 	firefox.exe!main(int argc=3, char * * argv=0x00ac7ea8)  Line 61 + 0x13	C++
 	firefox.exe!mainCRTStartup()  Line 398 + 0x11	C
 	kernel32.dll!7c816d4f() 	
 	kernel32.dll!7c8399f3()
I think the problem is that nsWindowCollector::SetEnabled(PR_FALSE) allows the window collector to be double-released.  Take the following series of events:

1. The window collector is enabled
2. A config is received that disables the window collector (this could be an error response, as well).
3. nsMetricsService calls nsWindowCollector::SetEnabled(false)
4. nsWindowCollector::SetEnabled releases sInstance but doesn't null the sInstance pointer, because of the commented-on dependency on logging window destruction during shutdown.  The ObserverService's references keep nsWindowCollector around.
5. On shutdown, nsMetricsCollector calls SetEnabled(false) again, releasing the window collector again and freeing it
6. An observer notification comes in, and the observer service still has a pointer to the window collector.  --> crash

We should only allow the reference that's managed by SetEnabled() to be released if it has an outstanding reference.  I also think it would be more straightforward if these references were managed by the MetricsService, like:

class nsMetricsService
{
...
  nsCOMArray<nsISupports> mCollectors;
};

We could then merge SetEnabled() and GetInstance() into a single method:

static already_AddRefed<nsFooCollector> GetInstance(PRBool create);

Enabling a collector would use GetInstance(PR_TRUE), which would create a new instance if one doesn't exist.  It would be held in nsMetricsService::mCollectors.

The notifications that nsMetricsService forwards to certain collectors would use GetInstance(PR_FALSE), which would return null if the collector doesn't exist.

There's one loose end here: we need a way to tear down collectors that add themselves to the observer service.  Releasing the MetricsService's reference isn't good enough; the ObserverService is still holding at least one reference to the collector.  We have a couple of options:

- Use weak references with the observer service, but this will cause us pain in logging window closes during exit because they happen after quit-application.  We could fix that by not explicitly releasing the references on exit (instead relying on the MetricsService teardown to free all of the collectors).

- Add an out-of-band mechanism (Detach()) which is called on the collectors when we want to turn them off.  The collectors use this to unhook themselves from the observer service.  We don't explicitly Detach() on quit-application, again, to allow logging of events during shutdown.

I think I prefer the Detach() approach because, for example, the UIElement collector will hook up as an event handler which isn't something that supports weak references.
Attached patch patchSplinter Review
This is really a change I've wanted to make for awhile -- dealing with collectors generically in the metrics service, by instantiating them by contract id.  To make this work, I introduced a new interface, nsIMetricsCollector, and some new notifications.  That allows for the following changes:

- The collectors are managed as singleton objects through the service manager, instead of with SetEnabled().  That gives us more stable refcounting to fix this crash.

- There are no more hardcoded ordering dependencies in the metrics service.  There are some new notifications that collectors must/should use to ensure that everything plays well together.

- The process for disabling collectors has changed.  First, I added a notification, detach(), which is called on each collector if it's disabled after startup.  This allows the collector to unhook itself as an observer or event listener, so that when the metrics service releases it, it can actually go away (presuming no one else leaked it).  Secondly, collectors aren't explicitly disabled at application shutdown.  Instead, they'll continue to log until the MetricsService is destroyed (and won't leak, because the MetricsService will detach them at that point).

- It's now actually possible to drop in new collectors, implemented in JS or C++.

I'm mostly just looking for one review here, so whichever of you has time.
Attachment #219546 - Flags: second-review?(darin)
Attachment #219546 - Flags: first-review?(marria)
Comment on attachment 219546 [details] [diff] [review]
patch

>Index: extensions/metrics/public/nsIMetricsCollector.idl

>+ */
>+
>+[scriptable, uuid(b6cba4ad-363a-4780-90fa-9d7cc8115854)]

Please kill the newline.


>+interface nsIMetricsCollector : nsISupports
>+{
>+  /**
>+   * Notification that this collector is no longer enabled.  The collector
>+   * should unregister itself from observer and event notifications so that
>+   * the object can be freed.
>+   */
>+  void detach();

I recommend calling this "onDetach" since it's a callback.  Should there
be an onAttach event to compliment this?  I'm not sure what a collector
would use it for, but it might be convenient since object construction
time might occur before the object is fully hooked up to the metrics
service.


>Index: extensions/metrics/public/nsIMetricsService.idl

>+ * metrics-webnavigation-destroy
>+ * metrics-chrome-webnavigation-destroy
>+ *   These work like NS[_CHROME]_WEBNAVIGATION_DESTROY, except that
>+ *   the MetricsService is guaranteed to still know about the window
>+ *   being destoryed (via getWindowID).  Collectors should use these
>+ *   notifications instead of the generic ones.

It's common practice to declare these as #defines in a %{C++ block after
the interface definition.  See nsIIOService, nsINetworkLinkService, and
nsIHttpProtocolHandler for samples.


>Index: extensions/metrics/src/nsLoadCollector.cpp

>+nsLoadCollector::nsLoadCollector()
>+{
>+}
> 
>+nsLoadCollector::~nsLoadCollector()
>+{
>+}

I'm a fan of putting these in the header file so that they can be
inlined at the callsites.  There are very few ctor/dtor callsites
for XPCOM objects typically, so there isn't much concern about
codesize bloat.


>Index: extensions/metrics/src/nsMetricsConfig.h

>   /**
>+   * Call this method to get a list of all events that are enabled.
>+   * The event names are prefixed with the namespace, separated by a colon.
>+   */
>+  void GetEvents(nsTArray<nsString> *events);

Pass the nsTArray by reference?


>Index: extensions/metrics/src/nsProfileCollector.cpp

>+nsProfileCollector::~nsProfileCollector()
>+{
>+}

Move to header?


r=darin
Attachment #219546 - Flags: second-review?(darin) → second-review+
> I recommend calling this "onDetach" since it's a callback.  Should there
> be an onAttach event to compliment this?  I'm not sure what a collector
> would use it for, but it might be convenient since object construction
> time might occur before the object is fully hooked up to the metrics
> service.

There's not really anything in the way of hookup that needs to happen... the insertion into the collector hashtable doesn't really affect anything from the collector's point of view.  I think I'll leave this off for now.
> I'm a fan of putting these in the header file so that they can be
> inlined at the callsites.  There are very few ctor/dtor callsites
> for XPCOM objects typically, so there isn't much concern about
> codesize bloat.

Actually I've found a reason not to inline the ctor/dtor when the functon call overhead of constructing the object isn't important.  Inlining the ctor/dtor requires the definition for any nsCOMPtr-member interfaces to be visible.  So I've gotten in the habit of not inlining the ctor/dtor unless it's a trivial class or struct... these don't really fall into that category in my opinion.

I'll go ahead and check in with them non-inlined, we can discuss further if you think they really should be.
checked in
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Attachment #219546 - Flags: first-review?(marria)
Attached patch fix MSVC6 buildSplinter Review
MSVC 6 complains with this error:

c:/builds/mozilla/ff2/obj/fx-vc6-opt/extensions/metrics/src/../../../../../mozilla/extensions/metrics/src/nsWindowCollector.cpp(129) : error C2248: 'LogWindowOpen' : cannot access private member declared in class 'nsWindowCollector'
        c:/builds/mozilla/ff2/obj/fx-vc6-opt/extensions/metrics/src/../../../../../mozilla/extensions/metrics/src/nsWindowCollector.h(94) : see declaration of 'LogWindowOpen'

I think the best we can do is probably to just make it public.
Attachment #219843 - Flags: first-review?(marria)
Attachment #219843 - Flags: first-review?(marria) → first-review+
Comment on attachment 219843 [details] [diff] [review]
fix MSVC6 build

checked in
Flags: second-review+
Flags: first-review+
Product: Toolkit → Toolkit Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: