Closed
Bug 1223445
Opened 9 years ago
Closed 9 years ago
KeyframeEffectReadOnly objects end up keeping lots of other objects alive too long
Categories
(Core :: DOM: Animation, defect)
Core
DOM: Animation
Tracking
()
RESOLVED
FIXED
mozilla45
People
(Reporter: wlach, Assigned: smaug)
References
Details
(Keywords: memory-leak, Whiteboard: [MemShrink])
Attachments
(2 files)
12.15 KB,
patch
|
birtles
:
review+
|
Details | Diff | Splinter Review |
12.28 KB,
patch
|
ritu
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
So I'd noticed that over the past while, Nightly would be pretty janky after a day of usage. I called :mconley over (:vladan also joined) and we pinned it down to the fact that I had multiple treeherder tabs opened consuming over 3G of memory, thus causing the cycle collector to aggressively collect memory and cause periodic pauses to the browser. Here's a profile:
http://people.mozilla.org/~bgirard/cleopatra/#report=ddbcb94237c91cc0ff36766775b6d15f299a9ee6
I know there's a bug in treeherder somewhere (which I'll try and investigate seperately) and there's only so much we can do about badly performing websites, but perhaps we should at least warn about this sort of case somehow, similar to how we warn about other types of things that make the browser perform slowly?
CC'ing various people I'm told would have insight into this (and/or the ability to fix it)
Comment 1•9 years ago
|
||
There's a few things here:
- the treeherder web page likely keeps references to DOM elements longer than it should
- could CC somehow do its walks in smaller increments?
- about:performance should be extended to cover GC & CC (but there's no time for this in Perf-team in Q4 or Q1)
- once that is done, the Slow Addon Watcher should be extended to cover web pages (Page Watcher), but again, no time in Q4/Q1
- I don't think BHR captured all the janks
- it would be neat to have about:telemetry smart enough to point out performance problems using the collected histogram data (e.g. in this case CYCLE_COLLECTOR_MAX_PAUSE)
Assignee | ||
Comment 2•9 years ago
|
||
So the STR for this is "keep treeherder open in several tabs"?
Assignee | ||
Comment 3•9 years ago
|
||
CC is incremental (default slice is 5ms) and we do try to optimize out certainly-alive objects from the graph.
Looking the profile now...
Assignee | ||
Comment 4•9 years ago
|
||
Hmm, looks like the profile is missing most of the symbols.
Comment 5•9 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #1)
> - could CC somehow do its walks in smaller increments?
The cycle collector is already mostly incremental. I'll have to try to reproduce this to see what exactly is taking a long time. Unfortunately unlike the GC we don't have detailed profiling for the CC. I guess the cleopatra profile might indicate that, but I find the UI incomprehensible.
The big CC optimization we do is marking objects as being definitely alive. This may be failing here in a way that could be improved. To figure that out, I'd need a CC log (about:memory -> save concise) to see what all is in the log.
Assignee | ||
Updated•9 years ago
|
Component: JavaScript: GC → XPCOM
Assignee | ||
Comment 6•9 years ago
|
||
Oh, maybe JS: GC is right after all, since the long pause times are GC, not CC.
Component: XPCOM → JavaScript: GC
Assignee | ||
Comment 7•9 years ago
|
||
(maybe. That profile is hard to interpret)
Assignee | ||
Comment 8•9 years ago
|
||
Something is keeping lots of
"root FragmentOrElement (xhtml) button class='btn group-btn btn-xs job-group-count btn-dkgray-count filter-shown' (orphan)" elements alive. They are not in any document (they are orphan), and most
of them are the root objects of a cycle of 384 objects.
Assignee | ||
Comment 9•9 years ago
|
||
Ok, this was a bit painful to debug since treeherder is so heavy page that using debug build was hopeless.
anyhow, those html:button elements have 5 references of which cycle collector normally knows about 4:
- JS wrapper owns the native object
- element's parent owns its child node
- elements child owns its parent node
- nsDOMTokenList (classList) owns its element (and element owns its classList)
The missing 5th reference is coming from KeyframeEffectReadOnly.
Either the page is keeping KeyframeEffectReadOnly objects alive too long, or we have a runtime leak.
Since the page doesn't seem to create any JS wrappers for KeyframeEffectReadOnly objects, I'd say this is a
leak somewhere in AnimationManager.
Assignee | ||
Updated•9 years ago
|
Component: JavaScript: GC → DOM: Animation
Assignee | ||
Comment 10•9 years ago
|
||
AnimationCollection looks very suspicious here.
Assignee | ||
Updated•9 years ago
|
Summary: Badly performing web pages can cause cycle collector pauses that are hard to diagnose → KeyframeEffectReadOnly objects end up keeping lots of other objects alive too long
Comment 11•9 years ago
|
||
Possibly introduced by bug 1208385? Specifically: https://hg.mozilla.org/mozilla-central/rev/8770acc7fd9c
Assignee | ||
Comment 12•9 years ago
|
||
So, if I'm reading the code right, elements have a "property" (SetProperty) which points to AnimationCollection, that collection has array of
RefPtr<dom::Animation> objects which then have strong refs to KeyframeEffectReadOnly objects which have
nsCOMPtr<Element> mTarget. So, nice little cycle there.
It is broken manually when CommonAnimationManager::RemoveAllElementCollections() is called, since
that goes through all the collections and calls Destroy on them.
That happens when null PresShell is passed to nsPresContext::SetShell.
Assignee | ||
Comment 13•9 years ago
|
||
I guess we could just traverse/unlink those properties. a patch coming.
Assignee | ||
Comment 14•9 years ago
|
||
Hmm, no, that isn't enough.
I noticed that at some point also animation timeline seemed to keep objects alive too long.
Assignee | ||
Comment 15•9 years ago
|
||
Looks like DocumentTimeline removes animations from its hashtable only when refresh driver is about to tick. And on background tabs that may not ever happen.
Comment 16•9 years ago
|
||
Thanks for looking into this.
With regards to the timeline, current AnimationTimeline never removes animations from its list. The DocumentTimeline subclass, however, will check on each tick which animations it needs to keep and drop the rest. So perhaps if we're not getting ticks I guess that will never happen. (Perhaps NotifyRefreshDriverDestroying could take care of that.)
Assignee | ||
Comment 17•9 years ago
|
||
[Tracking Requested - why for this release]:
Rather bad runtime leak causing a lot higher cc and gc times than normally - in other words causing jank.
(Not yet sure if this is happening also in 43.)
status-firefox44:
--- → affected
status-firefox45:
--- → affected
tracking-firefox44:
--- → ?
tracking-firefox45:
--- → ?
Comment 18•9 years ago
|
||
(In reply to Brian Birtles (:birtles, busy 6-17 Nov) from comment #16)
> With regards to the timeline, current AnimationTimeline never removes
> animations from its list. The DocumentTimeline subclass, however, will check
> on each tick which animations it needs to keep and drop the rest. So perhaps
> if we're not getting ticks I guess that will never happen. (Perhaps
> NotifyRefreshDriverDestroying could take care of that.)
It looks like NotifyRefreshDriverDestroying is only called when a presshell is destroyed. But this seems to be about background tabs that have a presshell (and a refresh driver), but aren't getting ticks. Or am I missing something?
Comment 19•9 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #18)
> (In reply to Brian Birtles (:birtles, busy 6-17 Nov) from comment #16)
> > With regards to the timeline, current AnimationTimeline never removes
> > animations from its list. The DocumentTimeline subclass, however, will check
> > on each tick which animations it needs to keep and drop the rest. So perhaps
> > if we're not getting ticks I guess that will never happen. (Perhaps
> > NotifyRefreshDriverDestroying could take care of that.)
>
> It looks like NotifyRefreshDriverDestroying is only called when a presshell
> is destroyed. But this seems to be about background tabs that have a
> presshell (and a refresh driver), but aren't getting ticks. Or am I missing
> something?
Sorry, I posted that comment at the same time as Olli so I didn't realise this was about background tabs.
We discussed this on IRC and I think we need to eagerly remove Animations from their AnimationTimeline in Animation::Cancel()--and that probably requires making AnimationTimelines store a linked-list of Animations instead of an array.
Assignee | ||
Comment 20•9 years ago
|
||
That ::DoCancel change still isn't enough, since there is after all the strong ref from timeline to animation, so nothing ends up calling cancel (even though I would assume styling code to call CancelFromStyle).
Comment 21•9 years ago
|
||
We should be calling CancelFromStyle when we run UnbindFromTree here:
https://dxr.mozilla.org/mozilla-central/rev/84a7cf29f4f14c9b359db2f7f19c0abd6a8e178e/dom/base/Element.cpp#1801
Since calling DeleteProperty should trigger AnimationCommon::PropertyDtor.
Assignee | ||
Comment 22•9 years ago
|
||
So there is yet another leak. We never remove Animations in background tabs
from PendingAnimationTracker (unless the page is closed or becomes foreground).
The reason is that when we try to remove animation from tracker,
GetComposedDoc already returns null. So Element::UnbindFromTree needs to remove properties earlier.
The other leak fix is the mTimeline->RemoveAnimation(this); call in Animation::DoCancel()
LinkedList handling is a bit ugly, but I blame our LinkedList API.
In order to still keep a strong reference to the relevant Animation objects, AnimationSet is now
nsTHashtable<nsRefPtrHashKey<dom::Animation>>
https://treeherder.mozilla.org/#/jobs?repo=try&revision=453433462a2b
I realized we don't need to traverse the properties atm after all, since the remove them anyway
when closing the relevant page or unbinding element.
But it is not clear to me why Element::UnbindFromTree handling is actually right, and why we shouldn't just
rely on cycle collector. But that is about some other bug.
Assignee: nobody → bugs
Attachment #8686839 -
Flags: review?(bbirtles)
Comment 23•9 years ago
|
||
Comment on attachment 8686839 [details] [diff] [review]
v2
Sorry for the delay on this one (we had a big Firefox DevCon here over the weekend), and thanks for doing this.
>diff --git a/dom/animation/AnimationTimeline.h b/dom/animation/AnimationTimeline.h
>--- a/dom/animation/AnimationTimeline.h
>+++ b/dom/animation/AnimationTimeline.h
>@@ -86,29 +89,30 @@ public:
> /**
> * Inform this timeline that |aAnimation| which is or was observing the
> * timeline, has been updated. This serves as both the means to associate
> * AND disassociate animations with a timeline. The timeline itself will
> * determine if it needs to begin, continue or stop tracking this animation.
> */
> virtual void NotifyAnimationUpdated(Animation& aAnimation);
>
>+ void RemoveAnimation(Animation* aAnimation);
>+
Perhaps we should add a comment explaining why this isn't called more often. e.g. "Typically animations that don't currently require time updates (e.g. because they have finished or are paused) are automatically removed from the timeline on each tick. However, we eagerly remove animations that have been cancelled since the next tick might never come if we are in a background tab, but that shouldn't prevent us from freeing up references to these animations."
Actually, I've now noticed we use this method in a few other places so perhaps this comment isn't necessary. I'll leave it up to you.
> // Animations observing this timeline
> //
> // We store them in (a) a hashset for quick lookup, and (b) an array
> // to maintain a fixed sampling order.
> //
> // The array keeps a strong reference to each animation in order
> // to save some addref/release traffic and because we never dereference
> // the pointers in the hashset.
>- typedef nsTHashtable<nsPtrHashKey<dom::Animation>> AnimationSet;
>- typedef nsTArray<RefPtr<dom::Animation>> AnimationArray;
>- AnimationSet mAnimations;
>- AnimationArray mAnimationOrder;
>+ typedef nsTHashtable<nsRefPtrHashKey<dom::Animation>> AnimationSet;
>+ AnimationSet mAnimations;
>+ LinkedList<dom::Animation> mAnimationOrder;
This comment needs to be updated to say that it's the hashset that keeps the strong ref. (The original comment seems to be missing some words in the middle saying that we store raw pointers in the array "... to save some addref/release traffic ..." so feel free to completely rewrite that comment!)
Thanks again.
Attachment #8686839 -
Flags: review?(bbirtles) → review+
Assignee | ||
Comment 24•9 years ago
|
||
Comment 25•9 years ago
|
||
Comment 26•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Assignee | ||
Comment 27•9 years ago
|
||
Comment on attachment 8688008 [details] [diff] [review]
comment tweaked
Approval Request Comment
[Feature/regressing bug #]: I don't know yet. The Element::UnbindFromTree part is really old
[User impact if declined]: runtime leak until background tabs are closed
[Describe test coverage new/current, TreeHerder]: NA, this is quite hard to test
and needs one to create/analyze cycle collector logs
[Risks and why]: low risk, releasing to-be-deleted objects earlier
[String/UUID change made/needed]: NA
I need to check if we have the bug also in beta.
Attachment #8688008 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 28•9 years ago
|
||
(this could be a regression from bug 1208385, which is nightly/aurora only)
Assignee | ||
Comment 29•9 years ago
|
||
Haven't managed to reproduce on beta.
Comment on attachment 8688008 [details] [diff] [review]
comment tweaked
This code has been in Nightly for a few days so should we be safe to uplift to Aurora44.
Attachment #8688008 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Memory leaks are bad, tracked for FF44.
Comment 32•9 years ago
|
||
bugherder uplift |
Comment 33•9 years ago
|
||
bugherder uplift |
status-b2g-v2.5:
--- → fixed
Updated•9 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•