Closed Bug 327969 Opened 19 years ago Closed 19 years ago

Smarter indexofchild cache

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sicking, Assigned: sicking)

Details

Attachments

(1 file)

The current cache implementation uses a most-recently-used implementation. This limits the size of the cache since the amount of data shuffeling needed is proportional to the cache size. Additionally it limits how small arrays we want to stick in the cache since the price for caching is non-trivial. A better implementation is to use something similar to what CPUs use in caching. Where the address of a resource determines where in the cache it will appear. This way we can increase the cache size without increasing read or write times in the cache. It also allows us to chache smaller arrays since the search time is next to non-existent. Patch coming up.
Attached patch Patch to fixSplinter Review
Come to think of it. I could even inline these functions since they are now really small. Let me know what you think. I'll still keep them as functions of course, just marked as inline.
Attachment #212533 - Flags: superreview?(bzbarsky)
Attachment #212533 - Flags: review?(bzbarsky)
Comment on attachment 212533 [details] [diff] [review] Patch to fix >Index: nsAttrAndChildArray.cpp >+// NUM_INDEX_CACHE_INDEX_SHIFT indicates how many steps to downshift >+// the |this| pointer. It should be large enough to not cause collisions >+// between adjecent arrays, and small enough to not cause too many gaps. I'm not sure I follow this comment. If NUM_INDEX_CACHE_INDEX_SHIFT is 0, then we'd just be masking out the low seven bits of the address, right? So in that case "adjacent arrays" would not collide, no? It's as we make NUM_INDEX_CACHE_INDEX_SHIFT bigger that they start to collide... In any case, I'd like to see the reasoning for picking 4 documented here. >+ PRUint32 ix = NS_PTR_TO_INT32(aArray) >> NUM_INDEX_CACHE_INDEX_SHIFT; >+ ix &= NUM_INDEX_CACHE_SLOTS - 1; That whole thing should be a macro taking one arg (aArray), imo, used in the two places where we need it. As for inlining the methods, if they're not called from too many places, sure. r+sr=bzbarsky with nits picked and the shift thing explained. That said, I've been considering a similar cache for nsContentList. Now that it's just based on the pointer, would it make sense to abstract it all away into a class or something and use it both place? Not sure... Maybe followup bug.
Attachment #212533 - Flags: superreview?(bzbarsky)
Attachment #212533 - Flags: superreview+
Attachment #212533 - Flags: review?(bzbarsky)
Attachment #212533 - Flags: review+
> >+// NUM_INDEX_CACHE_INDEX_SHIFT indicates how many steps to downshift > >+// the |this| pointer. It should be large enough to not cause collisions > >+// between adjecent arrays, and small enough to not cause too many gaps. > > I'm not sure I follow this comment. If NUM_INDEX_CACHE_INDEX_SHIFT is 0, then > we'd just be masking out the low seven bits of the address, right? So in that > case "adjacent arrays" would not collide, no? It's as we make > NUM_INDEX_CACHE_INDEX_SHIFT bigger that they start to collide... Right, I got 'large' and 'small' reversed. > In any case, I'd like to see the reasoning for picking 4 documented here. Actually, now that I think about it it should probably be 5. New text below: NUM_INDEX_CACHE_INDEX_SHIFT indicates how many steps to downshift the |this| pointer. It should be small enough to not cause collisions between adjecent arrays, and large enough to make sure that all indexes are used. The size below is based on the size of the smallest possible element (currently 24[*] bytes) which is the smallest distance between two nsAttrAndChildArray. 24/(2^_5_) is 0.75. This means that two adjacent nsAttrAndChildArrays will overlap one in 4 times. However not all elements will have enough children to get cached. And any allocator that doesn't return addresses aligned to 64 bytes will ensure that any index will get used. [*] sizeof(nsGenericElement) + 4 bytes for nsIDOMElement vtable pointer. > >+ PRUint32 ix = NS_PTR_TO_INT32(aArray) >> NUM_INDEX_CACHE_INDEX_SHIFT; > >+ ix &= NUM_INDEX_CACHE_SLOTS - 1; > > That whole thing should be a macro taking one arg (aArray), imo, used in the > two places where we need it. Sounds good. > As for inlining the methods, if they're not called from too many places, sure. Set is called in 3 and Get in 1. Especially the Get would be useful since we hit it much more often. > That said, I've been considering a similar cache for nsContentList. Now that > it's just based on the pointer, would it make sense to abstract it all away > into a class or something and use it both place? Not sure... Maybe followup > bug. Hmm.. might be a good idea if we hit IndexOf there a lot. Followup for sure.
Checked in.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: