Closed Bug 64696 Opened 24 years ago Closed 24 years ago

DOM UI events should be allocated from an recycler

Categories

(Core :: DOM: UI Events & Focus Handling, defect, P2)

x86
Windows NT
defect

Tracking

()

VERIFIED FIXED
mozilla0.9

People

(Reporter: buster, Assigned: buster)

References

Details

(Keywords: perf)

Attachments

(1 file)

Is there any reason why nsDOMEvent's are not allocated from an arena?  We
allocate thousands of nsDOMEvent's just by interacting with simple documents
(mouse moves, selection operations, form elements, etc.)  As far as I can tell,
we usually only have one in memory at a time, and the peak I saw with simple
testing in Viewer on Windows was 4.  Seems like this is begging for a tiny arena.
Wow, this is cool. Could you measure the saving time if/when you fix this?
buster: do you mean recycler rather than arena?  Arenas are for LIFO allocation,
but it isn't clear to me that the last-allocated event gets deleted before the
next-to-last, etc. -- and any instance's lifetime could be extended arbitrarily
via refcounting, no?

OTOH, recyclers tend to fragment the heap and create private hoards.  Is malloc
really not a good enough recycler for this case, if we usually have only one
live nsDOMEvent at a time?  Is there quantify data showing too much time in
malloc and/or free from nsDOMEvent ctor/dtor code?

/be
Malloc shows up prominently in usual cases (I have not tested this case,
though). Based on buster's comment of thousands of events being allocated I
assumed a simple mouse move would result in this many events. If that were true,
I am pretty sure malloc would show up. But if that were not true (i.e. you'd
have to interact for a while to get thousands of events) then it probably would
not matter. But I think this is worth checking out.
My first attempt at this was a simple minded use of the existing PresShell
arena, used to allocate frame objects.  This arena has a layer of recycling on
top of the arena.
As expected, it dramatically cut down allocations of events, and slightly
improved mouse event handling performance (as measured by quantify.)  Since
we're measuring interactive events, reliable comparisons between allocations and
arena speed are tough. The real win, though, would hopefully be in less
fragmentation and thrashing on the allocator, which is tougher to measure (on
Windows, at least.)

Now the bad news:  events can be created before the PresShell is instantiated.
So I can't simply piggy-back on the existing PresShell arena-recycler code, so I
can't claim "virtually no additional overhead" like I wanted to.

I'll look into what it would take to put similar (or possibly simpler) code on
the event state manager instead.

Suggestions for measuring the effectiveness and tradeoffs of the recycler are
welcome.
Status: NEW → ASSIGNED
Is there a patch to try out?

I'm surprised an arena-based recycler can work, even if the lifetime of the
arena is the same as that of the document or pres-shell -- can't an event bubble
across a document container boundary, and survive a release of the arena in
which the event was allocated?

/be
that's what I'm trying to figure out.  I can't find any documentation on the 
lifetime of an event.

One idea is to have a static pool on the nsDOMEvent class itself.  But then who 
cleans it up?  For a very big benefit, it need only be size=1, since the vast 
majority of the time there's only a single event live in the system at a time.
Depends on: 64819
Documentation? HAHAHAHAHAHA, heheh, uh, sorry.

I'm not sure about anything here, but if nsDOMEvent is an XPCOM class (as it
appears to be), and even if it's private to layout, it still seems possible for
someone to hold a strong ref across a document unload, and that scares me.  If
you can rule out that possibility in the current codebase, just due to careful
coding or even blind luck, then perhaps it shouldn't be a ref-counted object at
all -- which precludes the possibility of a dangling would-be-strong ref.

/be
I really do mean a recycler.  An arena is overkill.
Summary: DOM UI events should be allocated from an arena → DOM UI events should be allocated from an recycler
Yes, events cross documents. There are also places where events get translated
between types (native, DOM, etc.) by allocating a new event type and filling it
in, so tracking the lifetime is a little hard. We're also going to have to
consider crossing everything up to top level window objects. That said, I think
this is a great idea if we can work out the details.
Thanks, Chris, great feedback.  To remove complications from event lifetime 
issues, I've gone with a tiny little static recycler on the nsDOMEvent class, 
and it seems to work great.  No leaks, huge reduction in event allocations (in 
viewer on windows), still investigating the event "leak" issue in mozilla that 
prevents the recycler from being useful in the browser.
getting this right could help decrease memory fragmentation and measuably speed
up the UI.  Setting to P2/moz0.9
Keywords: perf
Priority: -- → P2
Target Milestone: --- → mozilla0.9
with a static recycler of size 1, my testing shows a 35-50% decrease in mouse
event handling time for a small number of mouse events (500) over the profile
manager UI.

jst had suggested hacking in a call to GC after every event to see if in fact
something in javascript was holding onto nsDOMEvents in the browser UI.  He was
right on the money:  the GC call frees up the event every time, which says to me
the event was propagated into the script world, used very briefly, and the
reference to it was dropped but the script world is still holding a ref count
for it.  So the question is, how do we tell javascript that an object is
temporary and should be reclaimed asap?
Buster: the recycler is saving time otherwise spent in malloc and free, which is
great, but it may not be reducing fragmentation at all.  A recycler that holds
memory otherwise reusable by malloc may increase fragmentation.  I want to make
this point clear, because it suggests that we don't have data on fragmentation,
and probably shouldn't worry about it too much -- but we do have your good data
on time savings. 35-50% is awesome.

Which leads to: running the GC after every event can be a big time sink if only
one or two garbage objects are collected.  Most of the time will be spent
scanning live objects, fruitlessly.  Did you measure time savings/loss again
when GC'ing every time?  I don't want to guess, but I wouldn't be surprised if
GC-after-every-event takes away the gain of the recycler.

There's no need to tell the JS engine to recycle an object right away, and no
safe way to do it without scanning the live object graph to be sure the event
handler didn't store a reference to the event object somewhere.  JS uses GC, not
ref-counting.  If you're worried about too many garbage event objects piling up,
maybe we should run JS_MaybeGC every several scripts -- but isn't that exactly
what ScriptEvaluated does (every 20 scripts)?

For sure, we can tune when the GC runs, but I don't think we want to force a
full GC after every event.

/be
Re: GC
The point of running GC after every event was to convince ourselves about what
was going on, and not intended as a solution.  It does show that JS is holding
onto the events, and that the events are not referenced by anyone.  Pure bloat.
The fact is, these events are transient objects, and it's painful to think that
JS is holding onto them for no good reason.  The behavior greatly reduces the
useful of the recycler.  The recycler is still useful because only the browser
seems to have this behavior.  Mail, for example, does not.  I would guess that
some object implemented in JS has set itself up as some sort of event listener
in the browser chrome, and that is what causes the event to cross over into
script-land.  Maybe the answer is to identify this object and implement it in
C++, so the release that it does has immediate impact as intended.  What do you
think?

Re: fragmentation
I have no proof, but I have to believe that on average, a single small chunk of
static memory has to lead to less fragmentation that repeated allocations and
deletes of the same size.  This is pure conjecture, and the performance gain is
sufficient reason to pursue the recycler.  But I was hoping to be helpful on the
memory usage front as well.
Re: fragmentation
If your recycler is a static object, then it will not be in the heap anyway, so
it should not contribute to heap fragmentation at all, eh?
buster: it looks like JS event objects get created fairly eagerly due to chrome
code under xpfe.  I'm not sure this is the only cause of JS event objects.

Even if such a listener were the only cause of a JS event object being created,
that wouldn't by itself indicate rewriting the chrome code in C++ -- we have a
lot of chrome JS code, and it wins on multiple counts over C++, one of those
counts being its automatic-via-GC storage management.  Rewriting the chrome code
that listens for all events in C++ would entail either rewriting all chrome JS
code in C++, or else obfuscating the event object by passing only its members
when crossing the C++=>JS boundary.

The cost of an event object in dynamic memory is fairly small, on the order of 8
bytes from a JS GC arena and 20 bytes (net) from the malloc heap for the
object's initial slots vector.  That can pile up, for sure, if thousands of
garbage event objects are created.

Does that cost times the number of event objects between GCs equal "too much
bloat"?  I don't know; running a tool that ranks JS event object bloat between
calls to the GC (every 20th ScriptEvaluated calls JS_MaybeGC, which does a full
GC if the number of live objects appears to be 1.5x the number after the last
GC) against other bloat would help decide.  xpcom/base/bloatblame.c can help;
I'll give it a whirl and report back here.

If garbage JS event objects constitute too much bloat, we can GC more often,
provided the time spent in GC doesn't become painful.  This may be a GC tuning
bug, but I don't think it's a rewrite-chrome-event-listeners-in-C++ bug.  If it
is (if there's no good trade-off between GC'ing and letting some garbage pile
up), I'll cry.

attinasi: without a patch attached to the bug, I'm ass-u-ming that by "static",
buster meant "one-off per nsDOMEvent class, but allocated using malloc (new)"
and not "storage class static".  So the malloc heap is hit, either way.  Buster,
pls. set me straight.

/be
Is this leftover cruft from an earlier attempt? Doesn't seem like we use `shell' 
anywhere.

@@ -124,6 +192,11 @@
 }
 
 nsDOMEvent::~nsDOMEvent() {
+  nsCOMPtr<nsIPresShell> shell;
+  if (mPresContext)
+  { // we were arena-allocated, prepare to recycle myself    
+    mPresContext->GetShell(getter_AddRefs(shell));
+  }
   NS_IF_RELEASE(mPresContext);
   NS_IF_RELEASE(mTarget);
   NS_IF_RELEASE(mCurrentTarget);
Oh, one other thing. We're only ever allocating events from one thread, right? 
(Might want to add some debug-only thread-assertion stuff -- a la 
nsISupportsUtils -- just to be safe.)
chris:  yes, that stuff about the pres shell in the destructor was left-over
cruft.  thanks for finding it, it's gone.
and yes, I'll add the thread assertion stuff.  that's a great idea.
Do the thread-safety assertion stuff (and verify that you don't assert), and 
sr=waterson.
Reassigning QA Contact for all open and unverified bugs previously under Lorca's
care to Gerardo as per phone conversation this morning.
QA Contact: lorca → gerardok
checked in
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
QA contact updated
QA Contact: gerardok → madhur
verified on build 2001-07-30-10-trunk
Status: RESOLVED → VERIFIED
Component: Event Handling → User events and focus handling
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: