Closed Bug 513169 Opened 16 years ago Closed 1 year ago

Investigate DOM child list storage strategies

Tracking

()

Status:

RESOLVED WONTFIX

Tracking Flags:

Tracking

Status

blocking2.0

---

People

(Reporter: bzbarsky, Assigned: peterv)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(3 files, 1 obsolete file)

Patch to do the double thing in our current world 16 years ago Boris Zbarsky [:bzbarsky] 16.47 KB, patch		Details \| Diff \| Splinter Review
Try to renumber the smallest range possible 15 years ago (no longer active) 4.39 KB, patch		Details \| Diff \| Splinter Review
Renumber of GetIndex (WIP 1) 15 years ago (no longer active) 19.53 KB, patch		Details \| Diff \| Splinter Review
Renumber on GetIndex (WIP 2) 15 years ago (no longer active) 23.84 KB, patch		Details \| Diff \| Splinter Review

Boris Zbarsky [:bzbarsky]

Reporter

Description

•

16 years ago

We have several issues that show up as performance problems: 1) Position comparisons can be slow if comparing the position of two nodes far apart in a long child list, because we lose the locality of reference on the IndexOf cache. 2) Mutations of the DOM need to realloc the child pointer array (and memmove, too). A possible approach to this is to switch to a setup like webkit's with a doubly-linked list and some sort of cache for handling [n] on the childNodes (not sure what else they cache). We could include something like the setup described http://groups.google.com/group/mozilla.dev.tech.dom/browse_thread/thread/cf57c56ba10e5c80# to make position comparisons fast (needed for layout). We'd need to eliminate our use of indexOf and change the document observer ContentAppended to pass in the first new node, not the first index. We'd need to change all the iterations over kids to not use GetChildArray or GetChildAt. We should get numbers on this.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 1

•

16 years ago

Attached patch Patch to do the double thing in our current world — Details — Splinter Review

One other note. roc suggested that the right way to deal with running out of precision is to not renumber on insert but rather allow the indices to be nondecreasing (rather than increasing). Then renumber when two indices are equal on position compare.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 2

•

16 years ago

couldn't we use float instead of double to save a little memory?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

16 years ago

Sure. If you look at the diff, it's actually just using a typedef to define the type; switching it from double to float is a one-liner. It might be worth measuring both and seeing what things look like.

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

16 years ago

Blocks: 481131

Jeff Walden [:Waldo]

Comment 4

•

16 years ago

(In reply to comment #2) > couldn't we use float instead of double to save a little memory? float doesn't preserve PRUint32's precision, unless we enforce an arbitrary limit that an element may have no more than 2**(24ish) children.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 5

•

16 years ago

In our current implementation, the child count integer is 22 bits unsigned (see nsAttrAndChildArray::ChildCount). So if we limit number of kids to the mantissa of a float (which is 23 bits), we actually allow more kids.

Peter Van der Beken [:peterv]

Assignee

Updated

•

15 years ago

Assignee: nobody → peterv

Status: NEW → ASSIGNED

(no longer active)

Comment 6

•

15 years ago

(In reply to comment #1) > Created an attachment (id=397193) [details] > Patch to do the double thing in our current world I had an idea about making the reindexing more efficient by using 2d indices instead of doubles, but in practice it didn't turn out to be useful. I also wrote a patch on top of this one to make the renumbering more efficient by renumbering a smaller range (actually, the minimal range that we can get away with), but that didn't help much as well because it turns out that by making the renumbering range smaller, we would have to do more renumbering phases because at any given execution of the renumbering phase, the indices are not separated by enough space apart, so this didn't turn out to be a performance win over your patch. > One other note. roc suggested that the right way to deal with running out of > precision is to not renumber on insert but rather allow the indices to be > nondecreasing (rather than increasing). Then renumber when two indices are > equal on position compare. I'm not sure if this only needs to be done on position compare, or on every GetIndex call (because otherwise, we probably need to remember to call a renumbering function wherever we use GetIndex for position comparison at least). And furthermore, it seems to me that this only optimizes the case where we insert a lot of nodes in a batch without needing to do any position comparison. Is that the only case we want to optimize against?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 7

•

15 years ago

> I'm not sure if this only needs to be done on position compare, or on every > GetIndex call I'm happy to nix GetIndex in favor of an explicit position compare function on nodes if that makes things simpler. > And furthermore, it seems to me that this only optimizes the case where we > insert a lot of nodes in a batch without needing to do any position comparison. I believe this is the common case, esp once we have lazy frame construction.

(no longer active)

Comment 8

•

15 years ago

Attached patch Try to renumber the smallest range possible — Details — Splinter Review

This is the patch I was talking about in comment 6. This should be applied over attachment 397193 [details] [diff] [review]. The code needs some cleanup, but I didn't spend time on that because I don't think it's actually worth taking.

(no longer active)

Comment 9

•

15 years ago

(In reply to comment #7) > > I'm not sure if this only needs to be done on position compare, or on every > > GetIndex call > > I'm happy to nix GetIndex in favor of an explicit position compare function on > nodes if that makes things simpler. > > > And furthermore, it seems to me that this only optimizes the case where we > > insert a lot of nodes in a batch without needing to do any position comparison. > > I believe this is the common case, esp once we have lazy frame construction. Hmm, in that case I think we're better off with a linear renumbering of the entire child list in order to avoid having to do too many renumbering passes over small sets of children. What do you think?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 10

•

15 years ago

(In reply to comment #9) > Hmm, in that case I think we're better off with a linear renumbering of the > entire child list in order to avoid having to do too many renumbering passes > over small sets of children. What do you think? That sounds like it could easily fall into pathological behavior.... It also seems like we shouldn't end up with many renumbering passes in the common case.

(no longer active)

Comment 11

•

15 years ago

(In reply to comment #10) > (In reply to comment #9) > > Hmm, in that case I think we're better off with a linear renumbering of the > > entire child list in order to avoid having to do too many renumbering passes > > over small sets of children. What do you think? > > That sounds like it could easily fall into pathological behavior.... It also > seems like we shouldn't end up with many renumbering passes in the common case. So, are you suggesting a localized renumbering pass, such as attachment 417542 [details]?

(no longer active)

Comment 12

•

15 years ago

Comment on attachment 397193 [details] [diff] [review] Patch to do the double thing in our current world >+ NS_ASSERTION(kidIndex != targetIndex, "Wrong kid at our index?"); >+ NS_WARNING("Inefficient IndexOf call: child doesn't seem to have the right " >+ "parent"); BTW, is it expected with your patch to get this assertion and warning?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 13

•

15 years ago

> So, are you suggesting a localized renumbering pass, such as attachment 417542 [details]? I think so, yes. > BTW, is it expected with your patch to get this assertion and warning? The latter is possible if a consumer screws up. The former... I suppose one could get it if the latter happens. We should fix that caller!

(no longer active)

Comment 14

•

15 years ago

Attached patch Renumber of GetIndex (WIP 1) (obsolete) — Details — Splinter Review

This applies on top of the previous patch. The performance is not so great yet, but at least now it doesn't crash! :-) I'll look at improving the performance of this patch tomorrow to see if something useful can be made out of it.

(no longer active)

Comment 15

•

15 years ago

Attached patch Renumber on GetIndex (WIP 2) — Details — Splinter Review

This patch doesn't assert or crash, but I'm still seeing huge (correctness, not performance) problems with it when I play with the browser. I'm posting this here in case someone is interested to pick up on this work, because I don't think that this approach is going to pay off on performance much. It improves some aspects of our performance in some testcases, but degrades the performance in other testcases. I think we need to explore the doubly-linked list approach as an alternative here, as stated in previous comments.

Attachment #417848 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Reporter

Comment 16

•

15 years ago

I was just looking at some of our style resolution code and discovered that a significant performance issue there is nsAttrAndChildArray::NonMappedAttrCount. Then I looked at the L2 cache miss stats, and discovered that all the cost is due to L2 cache misses on this line: while (count > 0 && !mImpl->mBuffer[(count - 1) * ATTRSIZE]) Our typical "fast" DOM traversal would have a node, do some stuff with it, then look up the address mImpl points to (probably a cache hit, since the whole DOM node fits in a cache line), then load address pointed to by mImpl plus the offset to mBuffer (possibly a cache miss), then load the address pointed to by that plus the offset of the node we want (another possible cache miss). That gives us the node pointer. Then we may or may not hit on the actual access to the node. In contrast, the doubly-linked list case is a cache hit on the firstchild address lookup and possibly on the nextsibling address lookup (depending on the size of the traversed subtree and whether our node got evicted from the cache during it). Actually dereferencing the pointer is a maybe-miss just like in our code. So while an array in theory gives better cache behavior, the particular way we do it happens to actually suck more cache-wise... To not suck more the array buffer would need to be directly pointed to from the node. To be better the buffer would need to be part of the node's memory or the array would need to store nodes instead of addresses. Or something.

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

15 years ago

blocking1.9.2: --- → ?

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

15 years ago

blocking1.9.2: ? → ---

blocking2.0: --- → ?

Johnny Stenback (:jst)

Comment 17

•

15 years ago

Not blocking the release on this, but we'll be working on figuring out what to do here independent of our release schedules.

blocking2.0: ? → -

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

15 years ago

Depends on: 564432

Wayne Mery (:wsmwk)

Comment 18

•

15 years ago

has peterv really taken this on? or the assignment a mistake and it's ehsan? => major to match severity of Bug 440641 - Sending longish plaintext message hangs Thunderbird

Severity: normal → major

Keywords: perf

(no longer active)

Comment 19

•

15 years ago

I'm not working on this again any time soon...

Mounir Lamouri (:mounir)

Updated

•

14 years ago

OS: Mac OS X → All

Hardware: x86 → All

Johnny Stenback (:jst)

Updated

•

14 years ago

Depends on: 661064

Mounir Lamouri (:mounir)

Updated

•

14 years ago

Depends on: 651120

Mounir Lamouri (:mounir)

Updated

•

14 years ago

No longer depends on: 661064

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

RaresB

Updated

•

4 years ago

QA Whiteboard: qa-not-actionable

BugBot [:suhaib / :marco/ :calixte]

Comment 20

•

3 years ago

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 21

•

3 years ago

Peter, is this bug still valid?

Severity: -- → S3

Flags: needinfo?(peterv)

Peter Van der Beken [:peterv]

Assignee

Comment 22

•

1 year ago

Bug 651120 made this obsolete.

Flags: needinfo?(peterv)

Peter Van der Beken [:peterv]

Assignee

Updated

•

1 year ago

Status: ASSIGNED → RESOLVED

Closed: 1 year ago

Resolution: --- → WONTFIX

You need to log in before you can comment on or make changes to this bug.