crash in nsPerformanceGroup::Dispose

RESOLVED FIXED in Firefox 48

Status

()

--
critical
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: philipp, Assigned: Yoric)

Tracking

({crash, regression})

45 Branch
mozilla50
x86
Windows NT
crash, regression
Points:
---

Firefox Tracking Flags

(firefox45 wontfix, firefox46- wontfix, firefox47- wontfix, firefox48+ fixed, firefox49 fixed, firefox50 fixed)

Details

(crash signature)

Attachments

(2 attachments)

(Reporter)

Description

3 years ago
This bug was filed from the Socorro interface and is 
report bp-f8551137-082c-435a-8d29-a13952151107.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	nsPerformanceGroup::Dispose() 	toolkit/components/perfmonitoring/nsPerformanceStats.cpp
1 	xul.dll 	nsPerformanceStatsService::Dispose() 	toolkit/components/perfmonitoring/nsPerformanceStats.cpp
2 	xul.dll 	nsPerformanceStatsService::Observe(nsISupports*, char const*, wchar_t const*) 	toolkit/components/perfmonitoring/nsPerformanceStats.cpp
3 	xul.dll 	nsObserverList::NotifyObservers(nsISupports*, char const*, wchar_t const*) 	xpcom/ds/nsObserverList.cpp
4 	xul.dll 	nsObserverService::NotifyObservers(nsISupports*, char const*, wchar_t const*) 	xpcom/ds/nsObserverService.cpp
5 	xul.dll 	nsXREDirProvider::DoShutdown() 	toolkit/xre/nsXREDirProvider.cpp
6 	xul.dll 	ScopedXPCOMStartup::~ScopedXPCOMStartup() 	toolkit/xre/nsAppRunner.cpp
7 	xul.dll 	mozilla::UniquePtr<ScopedXPCOMStartup, mozilla::DefaultDelete<ScopedXPCOMStartup> >::reset(ScopedXPCOMStartup*) 	mfbt/UniquePtr.h
8 	xul.dll 	XREMain::XRE_main(int, char** const, nsXREAppData const*) 	toolkit/xre/nsAppRunner.cpp
9 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp
10 	firefox.exe 	do_main 	browser/app/nsBrowserApp.cpp
11 	firefox.exe 	NS_internal_main(int, char**) 	browser/app/nsBrowserApp.cpp
12 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp
13 	firefox.exe 	__tmainCRTStartup 	f:/dd/vctools/crt/crtw32/startup/crt0.c:255
14 	kernel32.dll 	BaseThreadInitThunk 	
15 	ntdll.dll 	__RtlUserThreadStart 	
16 	ntdll.dll 	_RtlUserThreadStart

this crash signature is occurring in pre-release builds since 45 and may be caused by the work in bug 1208747.
(Reporter)

Comment 1

3 years ago
sorry, that first flag was set by accident...
status-firefox45: unaffected → affected
Quick remarks:
- the crash takes place during shutdown phase "profile-before-change", so XPCOM should still be available;
- there are two paths that call nsPerformanceGroup::Dispose() from nsPerformanceStatsService():
  - mTopGroup->Dispose() at https://hg.mozilla.org/mozilla-central/annotate/cc48981c026c/toolkit/components/perfmonitoring/nsPerformanceStats.cpp#l451;
  - group->Dispose() at https://hg.mozilla.org/mozilla-central/annotate/cc48981c026c/toolkit/components/perfmonitoring/nsPerformanceStats.cpp#l462.

I just noticed that we do not check whether mTopGroup is null before calling `mTopGroup->Dispose(); mTopGroup = nullptr`. This should not be able to cause the issue, though, as we shouldn't be able to call this method more than once.

Other dereferences in or around Dispose():
- https://hg.mozilla.org/mozilla-central/annotate/cc48981c026c/toolkit/components/perfmonitoring/nsPerformanceStats.cpp#l1015, we have a `service.mGroups`, however, this shouldn't be the problem, as mGroups is deleted during destruction of `service`, never set to null;
- a few lines below, `service->mAddonIdToGroup`, but `mAddonIdToGroup` is deleted during destruction of `service`, never set to null;
- a few lines below, `service->mWindowIdToGroup`, but `mWindowIdToGroup` is deleted during destruction, of `service`, never set to null;
- at https://hg.mozilla.org/mozilla-central/annotate/cc48981c026c/toolkit/components/perfmonitoring/nsPerformanceStats.cpp#l458, we take a raw pointer and dereference it. However, these raw pointers are always added to `this` and always removed from the hashtable by the destructor of the corresponding class, so that should also not cause the problem.

None of these lines seem to be actually able to cause a segfault. If I'm right, the segfault comes from some memory corruption, somewhere else, which won't make it easy to find.

As a side-note, it would seem useful for peace of mind to rewrite `mGroups` to take as key a WeakPtr instead of a raw pointer.
tracking-firefox46: --- → -
tracking-firefox47: --- → -
tracking-firefox48: --- → -

Comment 3

2 years ago
Hello David, this was mentioned as a concern for Fx47 release. This is consistently showing up on all 47 Beta builds. Can you investigate the latest crash reports? Or help find an alternate owner. Thanks!
Flags: needinfo?(dteller)
Created attachment 8757847 [details]
Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

Although I haven't been able to pinpoint why, it looks like
nsPerformanceStatsService::Dispose() may be called twice, which in
turn causes crashes. This patch makes sure that calling the method
twice is idempotent.

Also, just in case this was due to a typo in
AddObserver/RemoveObserver, this patch replaces the literal strings
used in both with constants.

Review commit: https://reviewboard.mozilla.org/r/56230/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/56230/
Attachment #8757847 - Flags: review?(nfroyd)
Flags: needinfo?(dteller)
Comment on attachment 8757847 [details]
Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

https://reviewboard.mozilla.org/r/56230/#review53178

The only way I can think of this happening is that profile-before-change somehow triggers (the profile manager?) early on in the program lifecycle, and then when it's called again, bad things happen.  Or maybe this is some weirdness with multiple content processes?  Regardless, r=me
Attachment #8757847 - Flags: review?(nfroyd) → review+
status-firefox45: affected → wontfix
status-firefox46: affected → wontfix
status-firefox48: --- → affected
status-firefox49: --- → affected
Assignee: nobody → dteller
Comment on attachment 8757847 [details]
Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56230/diff/1-2/
Attachment #8757847 - Attachment description: MozReview Request: Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,r=froydnj → Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

Comment 7

2 years ago
Pushed by dteller@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/66e0240f8c06
Make nsPerformanceStatsService::Dispose() idempotent,r=froydnj
Ah, conflict with bug 1270628. This should fix it.
Flags: needinfo?(dteller)
Comment on attachment 8757847 [details]
Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56230/diff/2-3/

Comment 11

2 years ago
Pushed by dteller@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e4c8b7d0abed
Make nsPerformanceStatsService::Dispose() idempotent,r=froydnj

Comment 12

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/e4c8b7d0abed
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox50: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Comment on attachment 8757847 [details]
Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

Approval Request Comment
[Feature/regressing bug #]: Unleashed by bug 1243706. Actual cause is still unclear.
[User impact if declined]: A number of shutdown crashes.
[Describe test coverage new/current, TreeHerder]: Currently on m-c.
[Risks and why]: I don't think that there is any risk. Worst case scenario, we'll exchange one shutdown crash for another one.
[String/UUID change made/needed]:
Attachment #8757847 - Flags: approval-mozilla-beta?
Attachment #8757847 - Flags: approval-mozilla-aurora?
status-firefox47: affected → wontfix
Comment on attachment 8757847 [details]
Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,

Fix a shutdown issue, taking it.
Should be in 48 beta 2
Attachment #8757847 - Flags: approval-mozilla-beta?
Attachment #8757847 - Flags: approval-mozilla-beta+
Attachment #8757847 - Flags: approval-mozilla-aurora?
Attachment #8757847 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/2e5e0318f6e5f7db9f026dc4c9b656c64722ced6

but has problems to apply to beta:

grafting 350360:2e5e0318f6e5 "Bug 1261702 - Make nsPerformanceStatsService::Dispose() idempotent,r=froydnj, a=sylvestre"
merging toolkit/components/perfmonitoring/nsPerformanceStats.cpp
warning: conflicts while merging toolkit/components/perfmonitoring/nsPerformanceStats.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
status-firefox49: affected → fixed
Flags: needinfo?(dteller)
Created attachment 8763958 [details] [diff] [review]
Backport to beta
Flags: needinfo?(dteller)
Attachment #8763958 - Flags: review+
Restoring the tracking flag to + since it was set to "-"
tracking-firefox48: - → +
(Reporter)

Updated

2 years ago
Flags: needinfo?(cbook)
Keywords: checkin-needed
thanks!

https://hg.mozilla.org/releases/mozilla-beta/rev/982871ad599a
status-firefox48: affected → fixed
Flags: needinfo?(cbook)
Keywords: checkin-needed
I'm here testing a new script which could potentially update bugs automatically when we uplift a crash fix, but the crash signature still is being reported on that channel. There are still a couple of crash reports with this signature, coming in on aurora 49, since the uplift. Do you want to open a followup bug?
Flags: needinfo?(dteller)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #19)
> I'm here testing a new script which could potentially update bugs
> automatically when we uplift a crash fix, but the crash signature still is
> being reported on that channel. There are still a couple of crash reports
> with this signature, coming in on aurora 49, since the uplift. Do you want
> to open a followup bug?

Yes, please, that would be great.
Flags: needinfo?(dteller) → needinfo?(lhenry)
Hmmm, actually just one from builds this week. https://crash-stats.mozilla.com/report/index/2c7e8b17-c9cd-4912-bb37-3849c2160711
Flags: needinfo?(lhenry)
You need to log in before you can comment on or make changes to this bug.