Closed Bug 64819 Opened 24 years ago Closed 2 years ago

nsDOMEvent objects held too long in mozilla, not in viewer, causes bloat

Categories

(Core :: DOM: UI Events & Focus Handling, defect, P3)

x86
Windows NT
defect

Tracking

()

RESOLVED WORKSFORME
Future

People

(Reporter: buster, Unassigned)

References

()

Details

(Keywords: memory-leak, perf)

1. use your favorite technique for detecting leaks. In playing with the idea of an event recycler, I put printf's in the constructor and destructor of nsDOMEvent 2. load the page in viewer 3. use mouse-moves to generate events. you should see a few hundred within a very short time of waving the mouse around in the page. none of these leak. 4. load the page in mozilla (if it matters, I'm using classic skin.) 5. put your mouse in the air, and wave it around like you just don't care 6. every event leaks. this includes mouse moves over the html content area, or just over the chrome (menus, buttons, etc.) 7. load a different page 8. you'll notice all the events get free'd at the page transition 64 bytes per event (more really....) times hundreds or thousands of events for a realistic interaction with a page....that's a bunch of memory!
although the memory is eventually reclaimed, I think we have to consider this a leak because I can't imagine we actually need those events to persist for the life of a document. They're just bloat.
Keywords: mlk, perf
Priority: -- → P2
This bug blocks bug 64696. A recycler is useless if the objects it's trying to recycle are leaked.
Blocks: 64696
Wow, that's a doozy.
> and wave it around like you just don't care come on, we could release not this... "all mouse movements should be made with thought and care" ;-) joki, any ideas on how we could fix this?
In mozilla, I think we examine events in javascript for tooltips (right?) -- this would create JS objects for the events that wouldn't be collected until the next GC. What happens if you trigger a JS garbage collection? Do they go away?
buster: have you verified that these really are leaks? i.e., set XPCOM_MEM_REFCOUNT_LOG=c:\tmp\refcnt.log run waggle your mouse shutdown see ns*Event objects in refcnt.log
they aren't "leaks" in the sense that eventually they are reclaimed, but they are pure bloat. the lifetime of an event is very short, and JS is holding onto these events for a relatively long time.
Summary: nsDOMEvent objects leaked in mozilla, not in viewer → nsDOMEvent objects held too long in mozilla, not in viewer, causes bloat
GC is bloaty compared to ref-counting -- no news there. This bug is really asking for better GC scheduling, if necessary (see bug 64696). Hard to say what necessary is without looking at all intra-GC bloat. Here's an idea: - Extend the JS API with a JS_AdviseGC call that allows one to make claims to the JS GC such as: - JS_ADVISE_GC_LIKELY_GARBAGE: An object is likely to be garbage now. - JS_ADVISE_GC_LIKELY_THRESHOLD: Run a GC after N such likely garbage pieces of advice, or perhaps N bytes containing such likely garbage objects. - Additional per-object size information (private data, entrained out of line allocations owned by the private data) for JS_ADVISE_GC_LIKELY_GARBAGE. - Additional per-object size threshold at which to run the GC. This could be combined with JS_ADVISE_GC_LIKELY_THRESHOLD so that the single threshold is a byte count, and the engine sums intrinsic (JS engine data structure) object size with additional (private data) size, and compares to the threshold. - Call JS_AdviseGC(cx, JS_ADVISE_GC_LIKELY_THRESHOLD, 10*1024) or whatever bloat limit is justified by measurement of all intra-GC bloat, when creating a DOM script context. - Call JS_AdviseGC(cx, JS_ADVISE_GC_LIKELY_GARBAGE, sizeof(nsDOMEvent)) after each event handler returns. Having sketched the above, I'm now loath to add anything to the JS API. All of the above could be implemented in dom or layout code, using the existing JS API. The byte counter and byte threshold could be stored in the nsJSContext. The intrinsic size of a JS object can be estimated or hardwired, or we could add a JS API entry point JS_SizeofObject. Before any of this happens, though, it seems necessary to get a total measure of intra-GC bloat, and to compare it to the worst case from mousing around. The current code will run a GC after enough mouse motion and 50% growth in nominal live object count. /be
Cc'ing jst for comments. /be
s/intra-GC/inter-GC/ -- I'm tired. Another good to optimize is garbage collected / time spent in GC. That will take some inside-the-JS-engine work, and I'll look into that as part of bug 66381 (just filed). The goal is to make JS_MaybeGC fast enough to call whenever a JS embedding believes it has created garbage (no matter how little). I am not sure how this will pan out, and without a JS_AdviseGC hook there is no way for the engine to size private data subtrees hanging off of JS objects. So don't wait for me: if the inter-GC bloat due to nsDOMEvents is too great, force a GC from dom or layout code every N event handler dispatches. /be
I'd like to see some numbers here too before spending any time on this, currently the DOM code calls JS_MaybeGC() after every 20:th script evaluation (i.e. after every 20:th or so event handler is run) so we should be calling JS_MaybeGC() often enough IMO but it seems like the events and event handlers don't generate enough garbage for JS_MaybeGC() to actually run the GC.
be: I appreciate your efforts here, but I really hope we don't need to pull out such a big jackhammer to solve this iddy biddy problem.
Call me a fat lover, but I don't think two new member variables per DOM context, and some logic to add to one and check against the other before calling JS_GC, is bigger than "iddy biddy". OTOH, I'm not sure that you have swallowed the whole garbage collection pill, practice as well as theory. GC means more bloat, until the next GC runs (incremental of not), than an ideal system, or than a ref-counted system. That bloat need not be large compared to the working set, and well-tuned GC scheduling will keep it down. But the problem is not as small as its smallest ref-counted analogue. Ref-counting will always be less bloaty than GC, but it has its own downside (leaks, cycles, code bloat, dangling dumb pointers), with which we are well acquainted. /be
well, I'm going to move forward with the patch in 64696 as is, because it does help quite a bit in many situations (embedded gecko, applications other than browser), and has almost no additional cost in the case of the browser chrome. We can continue this thread here, or move it to a newsgroup. My main point remains: in the case where we know the lifetime of an object, there is a significant cost to having that object migrate into javascript, where the GC memory model renders recyclers much less effective.
What would we say in the newsgroup? Argue about who should measure the bloat and compare it to other concurrent bloat in order to back up the claims about this bug's severity? I'll save the newsgroups the thread and try some trace-malloc fu, as soon as I upgrade to a version of GCC that can handle the latest xpcom/base/nsTypeInfo.cpp. Clearly, recyclers are a means to an end. The one you added saves cycles in viewer, but cycles are not bloat, and don't even correlate to bloat in many cases. You have never quantified the transient bloat problem that this bug reports, and compared it to other inter-GC or intra-page-load bloats, so I think your use of "significant" is unjustified. Again, if the problem of Mozilla chrome using JS to capture events, leading to event bloat between GC runs, *is* significant (and I agree, obviously it could be, with enough mousing around between GCs -- but how bad is it in the real world?), then we have a GC scheduling bug, and not an excuse or pretext to badmouth JS, or to rewrite the UI in C++. The former is just wrong, the latter would add a ton of C++ to the system (source brainprint and compiled footprint), raising the bar for programmers, introducing leak and bad pointer bugs, etc. etc. More repeating of what I wrote elsewhere: say we do a hybrid C++/JS scheme that rewrites event listeners in JS to take only the public members of the nsDOMEvent instance, including methods, but never to get a JS object wrapping the C++ instance. This approach still adds otherwise unnecessary C++ to the system (a fixed code footprint cost). It also obfuscates the JS code, so much so that it will probably lead to a JS object being created to hold all the related members and methods. Then for each nsDOMEvent, which on the upside would be recycled after the handler returns, you'd still get a JS object that would not be recycled until the next GC. But here I'm guessing at exactly what you propose as a practical improvement over the current state of affairs where JS can implement the same listener interfaces that C++ viewer, layout, and DOM code implement and call through. Did you have something other than the hybrid approach, or rewrite-it-all-in-C++, in mind? JS and languages like it have win over C++ at the higher layers, and we'll see more of them, including more garbage-collected languages, in Mozilla. Their embeddings will have bugs that need to be fixed, such as GC scheduling (if it's not handled magically by the language's engine). By all means, let's call a spade a spade, but please don't try to puff up the significance of GC's downside without any apples-to-apples quantification. It's not as if refcounting has no downside, in different (oranges to apples, and usually worse, judging from Mozilla experience) ways than transient bloat. Can we let this bug stand as the GC scheduling one? Should joki own it? /be
Reassigning QA Contact for all open and unverified bugs previously under Lorca's care to Gerardo as per phone conversation this morning.
QA Contact: lorca → gerardok
Setting milestone
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9
I think the title of this bug is misleading, I get permanent threadsafety assertions for the nsDOMEvent when leaving the viewer. As this is under Win98 it usually indicates that an object is held past xpcom shutdown. It makes table regression tests a really funny exercise.
I'll continue to hold onto this bug but I don't know that we've decided on a course of action to do anything about it. For the moment I'm moving it to 0.9.1 since nothing is going to be done on it before tomorrow but we should agree what needs to be done.
Target Milestone: mozilla0.9 → mozilla0.9.1
Moving this to 1.0 as I don't think we have a plan of action to do anything about it for 0.9.1
Target Milestone: mozilla0.9.1 → mozilla1.0
Shaver may prefer to dup this against his incremental/generational GC bug. /be
QA contact updated
QA Contact: gerardok → madhur
Blocks: 92580
No longer blocks: 92580
Bugs targeted at mozilla1.0 without the mozilla1.0 keyword moved to mozilla1.0.1 (you can query for this string to delete spam or retrieve the list of bugs I've moved)
Target Milestone: mozilla1.0 → mozilla1.0.1
QA Contact: madhur → rakeshmishra
QA Contact: rakeshmishra → trix
.
Assignee: joki → saari
Status: ASSIGNED → NEW
QA Contact: trix → ian
retargeting
Target Milestone: mozilla1.0.1 → Future
Assignee: saari → nobody
QA Contact: ian → events
Moving to p3 because no activity for at least 1 year(s). See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3
Component: Event Handling → User events and focus handling

Hi Brendan, do you believe this issue is still relevant today or it can be closed?

Flags: needinfo?(brendan)

Redirect a needinfo that is pending on an inactive user to the triage owner.
:peterv, since the bug has high severity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(brendan) → needinfo?(peterv)

This is some ancient bug from pre-cycle-collector era.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(peterv)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.