Open Bug 409933 Opened 17 years ago Updated 2 years ago

nsSlots::mMutationObservers is slow

Tracking

()

Status:

NEW

People

(Reporter: sayrer, Unassigned)

References

(
URL
)

Details

(Keywords: perf)

Attachments

(4 files)

benchmark 17 years ago Robert Sayre 5.67 KB, text/html		Details
demo patch: use a fast hashset instead of an array 17 years ago Robert Sayre 20.67 KB, patch		Details \| Diff \| Splinter Review
shark profile -- functions above 1%, system calls charged to caller 17 years ago Robert Sayre 3.05 KB, text/plain		Details
FastAddMutationObserver 17 years ago Olli Pettay [:smaug][bugs@pettay.fi] 4.36 KB, patch		Details \| Diff \| Splinter Review

Robert Sayre

Reporter

Description

•

17 years ago

While running a simple getElementsByClassName benchmark in Shark, 20-30% of time is spent in the constructor of nsContentList, in mRootNode->AddMutationObserver(this); (see URL for lxr link)

The destructor is also expensive.

Judging by the AppendUnlessExists method, it looks like an array is bad a choice for this job. There are iterator mutation and order guarantees provided by nsTObserverArray. Aside from the guarantee that documents are called first, I'm not sure which are important for this field.

I have a quick and dirty patch that uses google::dense_hash_set with a separate pointer for the first observer notification (instead of calling Prepend). It shows a substantial improvement in getElementsByClassName performance--with the matching code at the top of the profile, instead of the nsContentList constructor.

Robert Sayre

Reporter

Comment 1

•

17 years ago

Attached file benchmark — Details

Robert Sayre

Reporter

Comment 2

•

17 years ago

Attached patch demo patch: use a fast hashset instead of an array — Details — Splinter Review

Nightly trunk times: 635 826 716 620 953 658 616 939 652 617
Patched, symbols, optimized: 609 560 665 555 655 555 554 553 648 554

Boris Zbarsky [:bzbarsky]

Comment 3

•

17 years ago

Is the issue with the benchmark just that we have lots of nsContentLists in flight all at once (waiting for GC)?

Looking at the patch, it doesn't look like it gives the right behavior.  In particular, it can notify an observer that has already been removed, and won't notify newly-added observers.  The latter probably doesn't matter; I think we might actually depend on the former.

I'm also not that happy with the allocation that has to happen in the notification method.  We call those a _lot_ when we mutate the DOM (something this testcase is not doing, I assume).

It might be a better idea to change the AppendUnlessExists caller to not call it.  For example, we could have a "safe" way to add observers (which does the check) and a "fast" way (which doesn't).  With a very few exceptions, I'm pretty sure observers can take the "fast" way; certainly nsContentList can, since it's being added when it's constructed.  There will still be a cost when _destroying_ all those content lists.  But in practice I doubt that we have that many content lists in flight all at once...

Another thing we could do, perhaps, is have a hashtable similar to the one for tagname-based lists for the function-based ones.  Looking at that benchmark, we really just want to create a single nsContentList and share it, right?

Boris Zbarsky [:bzbarsky]

Comment 4

•

17 years ago

> But in practice I doubt that we have that many content lists in flight all at
> once...

In other words, I don't think this benchmark necessarily represents a realistic workload.  If I'm wrong, then we should think about the hashtable solution, possibly even customized for getElementsByClassName.  But it'd be good to have data on actual getElementsByClassName usage before doing that sort of thing.

Robert Sayre

Reporter

Comment 5

•

17 years ago

(In reply to comment #3)
> Is the issue with the benchmark just that we have lots of nsContentLists in
> flight all at once (waiting for GC)?

Yep, could be.

> Looking at the patch, it doesn't look like it gives the right behavior.  In
> particular, it can notify an observer that has already been removed, and won't
> notify newly-added observers.  The latter probably doesn't matter; I think we
> might actually depend on the former.

dense_hash_set iterators are stable allow removals. For additions, we could keep a table of things to add at the end of iteration.

> I'm also not that happy with the allocation that has to happen in the
> notification method.  We call those a _lot_ when we mutate the DOM (something
> this testcase is not doing, I assume).

Yeah, but that can be avoided as above.

> 
> It might be a better idea to change the AppendUnlessExists caller to not call
> it.  For example, we could have a "safe" way to add observers (which does the
> check) and a "fast" way (which doesn't).

That might be better.

Robert Sayre

Reporter

Comment 6

•

17 years ago

>
> dense_hash_set iterators are stable allow removals. 

er, dense_hash_set iterators allow removals (but not inserts).

Robert Sayre

Reporter

Comment 7

•

17 years ago

Attached file shark profile -- functions above 1%, system calls charged to caller — Details

Here's the profile of the benchmark, flawed though it may be.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 8

•

17 years ago

Attached patch FastAddMutationObserver — Details — Splinter Review

This should help a bit.
Do we have any other (maybe even real world) testcases where Add/RemoveMutationObserver shows up in profiles?

Boris Zbarsky [:bzbarsky]

Comment 9

•

17 years ago

It might come up with spellcheck in a very large textarea with many mis-spellings, perhaps (one observer per mis-spelled word, I would think).  Not sure whether the observer-adding cost would be noticeable there.

You could probably get a similar effect to this testcase with GetElementsByTagName and such, of course.

Past that, look at callsites of AddMutationObserver.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 10

•

17 years ago

Yeah, I think getting rid of the Contains() call here is the way to go. I've always found it odd that it's there, but I've carried it forward. I'd check all callsites and check if there really are any that aren't already safe, and if there really really are callers out there that needs the check, add a special slow path.

Jesse Ruderman

Updated

•

16 years ago

Keywords: perf

Nobody; OK to take it and work on it

Assignee

Updated

•

5 years ago

Component: DOM → DOM: Core & HTML

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

nsSlots::mMutationObservers is slow

Categories

(Core :: DOM: Core & HTML, defect)

Tracking

()

People

(Reporter: sayrer, Unassigned)

References

(
URL
)

Details

(Keywords: perf)

Crash Data

Security

(public)

User Story

Attachments

(4 files)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Updated

Updated

Attachment

General

Description

File Name

Content Type