Closed
Bug 795186
Opened 12 years ago
Closed 12 years ago
Declare refptrs between WebGL objects to the CC even if there is no cycle at C++ level, because JS could create cycles
Categories
(Core :: Graphics: CanvasWebGL, defect)
Core
Graphics: CanvasWebGL
Tracking
()
RESOLVED
FIXED
mozilla19
People
(Reporter: bjacob, Assigned: bjacob)
References
Details
(Whiteboard: webgl-next [MemShrink:P1])
Attachments
(3 files)
9.50 KB,
patch
|
smaug
:
review+
bajaj
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
2.05 KB,
patch
|
smaug
:
review-
|
Details | Diff | Splinter Review |
6.70 KB,
patch
|
smaug
:
review+
bajaj
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
For example, WebGLUniformLocation has a nsRefPtr to WebGLProgram. While there is no cycle at the c++ level, js code could do
program.foo = uniformloc;
thus creating a cycle.
Thus we need to declare all such refptrs to the cc.
Cycles that are entirely created by JS code do not need to be declared to the CC as the JS GC takes care of them, but if part of the cycle is a reference created by c++ code, then the CC needs to know about it.
Assignee | ||
Comment 1•12 years ago
|
||
Smaug tells me that we have a leak in http://maps.nokia.com/webgl/ that looks like it might be caused by this bug. Prioritizing.
Updated•12 years ago
|
Whiteboard: webgl-next → webgl-next [MemShrink]
Assignee | ||
Comment 2•12 years ago
|
||
To be clear, the only nontrivial thing in this bug is that WebGL inter-object strong references use a custom refptr class, WebGLRefPtr, so it's not clear beforehand that it's just a matter of using the usual CC macros -- have to look into these macros to see what they are doing.
WebGLRefPtr is here:
http://hg.mozilla.org/mozilla-central/file/39d70a2e0dea/content/canvas/src/WebGLContext.h#l300
Comment 3•12 years ago
|
||
The macros are fairly simple: grab out the underlying pointer, do a callback. Assuming your underlying classes are the usual nsISupports CCed classes, and the WebGLRefPtr supports .get(), then it looks like the usual NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR should just work.
Assignee | ||
Comment 4•12 years ago
|
||
Attachment #672984 -
Flags: review?(bugs)
Assignee | ||
Comment 5•12 years ago
|
||
(Hit enter before I could write the explanatory comment)
Unfortunately, while this does all I can see needs doing here, the issue with Nokia Maps seems to persists: if I view that page and then go to about:memory in the same tab, then no matter how long I wait and click GC/CC buttons, the webgl metrics stay. Note: in about:memory, do NOT click 'minimize memory usage' as this would cause the WebGL context to destroy all its resources, artificially making the problem disappear.
Explanation on this patch:
The header file where all those classes are declared is WebGLContext.h.
Aside from usual refptr classes, the main custom refptr class here is WebGLRefPtr. There is a comment there explaining why it exists. Also, WebGLVertexAttribData stores a WebGLRefPtr member, so we had to add a .get() method to it to make CC macros work with it.
A nontrivial change in this patch is that it adds CC to WebGLUniformLocation, which wasn't CC'd so far. The reason why it wasn't CC'd was that contrary to most other classes there, WebGLUniformLocation objects can only ever be gotten by scripts by a [Creator] method, webgl.getUniformLocation, so there is no need for wrappercaching. That's why WebGLUniformLocation still doesn't inherit nsWrapperCache.
Comment 6•12 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #5)
> Note: in about:memory, do NOT click
> 'minimize memory usage' as this would cause the WebGL context to destroy all
> its resources, artificially making the problem disappear.
Ah, this is interesting. Add a breakpoint to the context release and check what is releasing it when
minimizing memory.
Comment 7•12 years ago
|
||
Comment on attachment 672984 [details] [diff] [review]
fix cycle collection in the WebGL implementation
The LinkedList member variables don't need traversing/unlinking, right?
They are just list of objects which have raw refs to each other?
Can there be lots of WebGL related objects alive?
If so, we should probably make them skippable. Something like
http://mxr.mozilla.org/mozilla-central/source/content/events/src/nsDOMEventTargetHelper.h?rev=d8798b6331be&mark=31-31#24
and
http://mxr.mozilla.org/mozilla-central/source/content/events/src/nsDOMEventTargetHelper.cpp?rev=5a6a36d109d7&mark=47-62#42
There are 3 different skip-phases, documented here
http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsCycleCollectionParticipant.h?rev=1e5bf398fc88&mark=256-260,265-268,273-276#244
So, file a followup to add support for skippability.
Attachment #672984 -
Flags: review?(bugs) → review+
Updated•12 years ago
|
Assignee: nobody → bjacob
Assignee | ||
Comment 9•12 years ago
|
||
At a point, I thought that this fixed all the bugs (nokia maps and ro.me); but now I can't reproduce anymore. There are poltergeists around here. Anyway, this timer _is_ causing a call to WebGLContext::AddRef() so this seems like another cycle that we failed to declare.
Attachment #673510 -
Flags: review?(bugs)
Assignee | ||
Comment 10•12 years ago
|
||
My reference analysis tool (bug 704240) finds only one persistent (not broken by the CC) cycle involving WebGL, and that appears to be the cycle between the WebGLContext and the HTMLCanvasElement:
Reference cycle 39:
Contains 2 blocks, listed below:
block 592393
has references to these blocks: 415899, 592399
allocation call stack:
at 0x402B1FE: malloc (vg_replace_malloc.c:271)
by 0x403BEE4: moz_xmalloc (mozalloc.cpp:57)
by 0x709865F: NS_NewHTMLCanvasElement(already_AddRefed<nsINodeInfo>, mozilla::dom::FromParser) (mozalloc.h:200)
by 0x70F2476: CreateHTMLElement(unsigned int, already_AddRefed<nsINodeInfo>, mozilla::dom::FromParser) (nsHTMLContentSink.cpp:497)
by 0x70F3DCA: NS_NewHTMLElement(nsIContent**, already_AddRefed<nsINodeInfo>, mozilla::dom::FromParser) (nsHTMLContentSink.cpp:480)
by 0x6F94058: NS_NewElement(nsIContent**, already_AddRefed<nsINodeInfo>, mozilla::dom::FromParser) (nsNameSpaceManager.cpp:201)
by 0x6F54C75: nsDocument::CreateElem(nsAString_internal const&, nsIAtom*, int, nsIContent**) (nsDocument.cpp:6943)
by 0x6F478EC: nsDocument::CreateElement(nsAString_internal const&, nsIContent**) (nsDocument.cpp:4561)
by 0x762AEF0: nsIDOMDocument_CreateElement(JSContext*, unsigned int, JS::Value*) (dom_quickstubs.cpp:1273)
by 0x821A89B: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) (jscntxtinlines.h:364)
by 0x822638E: js::InvokeKernel(JSContext*, JS::CallArgs, js::MaybeConstruct) (jsinterp.cpp:367)
by 0x82213D7: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2370)
block 592399
has references to these blocks: 592393, 592400, 592871, 593172
allocation call stack:
at 0x402B1FE: malloc (vg_replace_malloc.c:271)
by 0x403BEE4: moz_xmalloc (mozalloc.cpp:57)
by 0x7011F1E: NS_NewCanvasRenderingContextWebGL(nsIDOMWebGLRenderingContext**) (mozalloc.h:200)
by 0x6C7BFFC: CreateCanvasRenderingContextWebGL(nsISupports*, nsID const&, void**) (in /hack/mozilla-central/obj-firefox-valgrind/toolkit/library/libxul.so)
by 0x7BFC628: nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) (nsComponentManager.cpp:1006)
by 0x7BB5E8C: nsCreateInstanceByContractID::operator()(nsID const&, void**) const (nsComponentManagerUtils.cpp:178)
by 0x7098830: nsCOMPtr<nsICanvasRenderingContextInternal>::assign_from_helper(nsCOMPtr_helper const&, nsID const&) (nsCOMPtr.h:1286)
by 0x7098D46: nsCOMPtr<nsICanvasRenderingContextInternal>::nsCOMPtr(nsCOMPtr_helper const&) (nsCOMPtr.h:685)
by 0x7098EBB: nsHTMLCanvasElement::GetContextHelper(nsAString_internal const&, nsICanvasRenderingContextInternal**) (nsHTMLCanvasElement.cpp:628)
by 0x70991A3: nsHTMLCanvasElement::GetContext(nsAString_internal const&, JS::Value const&, nsISupports**) (nsHTMLCanvasElement.cpp:652)
by 0x76210D0: nsIDOMHTMLCanvasElement_GetContext(JSContext*, unsigned int, JS::Value*) (dom_quickstubs.cpp:16548)
by 0x821A89B: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) (jscntxtinlines.h:364)
Comment 11•12 years ago
|
||
Comment on attachment 673510 [details] [diff] [review]
cycle collect the mContextRestorer pointer
Hmm, this shouldn't help.
nsTimerImpl isn't cycle collected.
Need to change timer handling, so that it doesn't keep context alive.
Use InitWithFuncCallback and make sure to cancel the timer when
context is deleted.
Attachment #673510 -
Flags: review?(bugs) → review-
Assignee | ||
Comment 13•12 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #8)
> Can you write a test for this?
How would you write a test, checking that things are correctly cycle collected?
A compiled-code test maybe?
Comment 14•12 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #13)
> How would you write a test, checking that things are correctly cycle
> collected?
Well, these leaks are apparently bad enough to cause shutdown leaks, so you can just set up one of these leaky cycles, and the shutdown leak detection will catch it if it breaks.
Assignee | ||
Comment 15•12 years ago
|
||
Whiteboard: webgl-next [MemShrink] → webgl-next [MemShrink] [leave open]
Target Milestone: --- → mozilla19
Assignee | ||
Comment 16•12 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #14)
> (In reply to Benoit Jacob [:bjacob] from comment #13)
> > How would you write a test, checking that things are correctly cycle
> > collected?
>
> Well, these leaks are apparently bad enough to cause shutdown leaks, so you
> can just set up one of these leaky cycles, and the shutdown leak detection
> will catch it if it breaks.
OK.
It seems like the two patches there together fix the leaks, and I was just using the wrong way to check if the leak was fixed -- I was navigating to about:memory in the SAME tab as the testcase and so the testcase was purposefully kept in memory.
I'll try to figured _what_ exactly caused the leak and then maybe I'll be able to make a test.
Comment 17•12 years ago
|
||
Comment 18•12 years ago
|
||
Comment on attachment 674011 [details] [diff] [review]
fix the robustness timer, don't inherit the timer callback base class
>+ void RobustnessTimerCallback(nsITimer* timer);
>+
>+ static void RobustnessTimerCallbackStatic(nsITimer* timer, void *thisPointer) {
>+ static_cast<WebGLContext*>(thisPointer)->RobustnessTimerCallback(timer);
>+ }
Looks like the file doesn't use Gecko coding style :(
But ok, let's not change that now.
(Parameters should be aParameterName, and indentation should be 2 spaces)
Attachment #674011 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 19•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/15b7770dcfcc
Regarding the testcases, after some testing, it seems that Nokia Maps is fixed by the first patch (48502b61a63e) and RO.ME is fixed by the second patch (15b7770dcfcc).
I will try to think of a test for this, but as it's not clear yet to me why these particular pages trigger these bugs while others don't, it would take some more time to isolate a test case, and I'm sorry, I have too many things to do this week.
Can close the bug now.
Assignee | ||
Updated•12 years ago
|
Whiteboard: webgl-next [MemShrink] [leave open] → webgl-next [MemShrink]
Comment 20•12 years ago
|
||
What's the effect on RO.ME? Will bug 657115 be affected by this?
Assignee | ||
Comment 21•12 years ago
|
||
The effect on RO.ME was that the WebGL context and its resources were staying around forever after closing the RO.ME tab ; now they are properly going away several seconds after the tab is closed. I haven't tested if it has any effect on bug 657115 which seems to be more about memory usage _during_ playing RO.ME.
Comment 22•12 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #19)
> I will try to think of a test for this, but as it's not clear yet to me why
> these particular pages trigger these bugs while others don't, it would take
> some more time to isolate a test case, and I'm sorry, I have too many things
> to do this week.
No problem, I just was wondering if it might be easy to throw together a little test, but I guess not!
Nick: To reiterate what Benoit said, for RO.ME, there was a recent regression (bug 800488) from the new DOM bindings for WebGL stuff that caused a shutdown leak, which is what this fixes. It is possible that the old bindings for WebGL had some kind of temporary leak issue that will be fixed by the new DOM bindings, but I'd be surprised if that was the case.
Comment 23•12 years ago
|
||
This needs to land on Aurora, too, right? But not really earlier?
status-firefox18:
--- → affected
status-firefox19:
--- → affected
tracking-firefox18:
--- → ?
tracking-firefox19:
--- → ?
Whiteboard: webgl-next [MemShrink] → webgl-next [MemShrink:P1]
Assignee | ||
Comment 24•12 years ago
|
||
Yes, let's backport to Aurora and not earlier. Backporting to Beta would be more intrusive as these classes were not CC'd before the WebIDL port (Aurora).
Assignee | ||
Comment 25•12 years ago
|
||
Comment on attachment 672984 [details] [diff] [review]
fix cycle collection in the WebGL implementation
[Approval Request Comment]
Bug caused by (feature/regressing bug #): don't know. The RO.ME leak is claimed to be from the WebIDL port (bug 779611) but I don't see how this can have not leaked before. The cycles existed before.
User impact if declined: shutdown memory leaks on certain heavy WebGL applications
Testing completed (on m-c, etc.): m-c since yesterday
Risk to taking this patch (and alternatives if risky): seems low risk: rather standard CC stuff
String or UUID changes made by this patch: none
Attachment #672984 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 26•12 years ago
|
||
Comment on attachment 674011 [details] [diff] [review]
fix the robustness timer, don't inherit the timer callback base class
[Approval Request Comment]
See previous request comment.
Attachment #674011 -
Flags: approval-mozilla-aurora?
Comment 27•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
Updated•12 years ago
|
Attachment #672984 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•12 years ago
|
Attachment #674011 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•12 years ago
|
Assignee | ||
Comment 28•12 years ago
|
||
Updated•12 years ago
|
Flags: in-testsuite? → in-testsuite-
You need to log in
before you can comment on or make changes to this bug.
Description
•