Closed Bug 1225941 Opened 9 years ago Closed 9 years ago

Add a method for getting the set of nodes immediately dominated by another node

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla45
Tracking Status
firefox45 --- fixed

People

(Reporter: fitzgen, Assigned: fitzgen)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

This commit adds the `JS::ubi::DominatorTree::getDominatedSet` method. It returns a range that can be used to safely iterate over all the nodes immediately dominated by the node used to get the range. The dominated sets are eagerly computed when creating a `JS::ubi::DominatorTree` and stored in one big contiguous array, with a side-table that keeps track of the start indices of each individual dominated set within that contiguous array.
Assignee: nobody → nfitzgerald
Status: NEW → ASSIGNED
Comment on attachment 8689129 [details] [diff] [review] Add a method for getting the set of nodes immediately dominated by another node Review of attachment 8689129 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/public/UbiNodeDominatorTree.h @@ +84,5 @@ > + > + /** > + * A pointer to an immediately dominated node. > + * > + * Don't use this type directly; it has all the safety of regular "it is no safer than..." @@ +216,5 @@ > + return mozilla::Nothing(); > + memset(indices.begin(), 0, length); > + > + // Determine the size of each dominated set and place it in > + // `indices`. So I guess what would have made me understand this more easily is something like: Create a vector 'dominated' holding a flattened set of buckets of immediately dominated children nodes, with a lookup table 'indices' mapping from each node to the beginning of its bucket. In a first pass, iterate over the full set of nodes and count up the size of each bucket. Then convert the vector to store the cumulative sum of the sizes of all buckets before each index, resulting in a mapping from node index to just past the end of that node's bucket. Walk through the full set of nodes, filling in bucket entries going from the end of each bucket to the beginning. Though now that I understand it, that doesn't seem any more or less clear than what you already have written. Maybe merge the two? Or maybe just add in to your original one the part about pointing just past the end of each bucket (as an explanation of "each index + the size of its dominated set".) My first read of this made me think it was way more complicated than it actually is. @@ +238,5 @@ > + for (uint32_t i = 0; i < length; i++) { > + auto& idx = indices[doms[i]]; > + idx--; > + dominated[idx] = i; > + } Right now, the child nodes come in arbitrary order. You sort of want them sorted by decreasing retained size. But perhaps you haven't computed that yet? Or maybe it's just cleaner to do that during display. @@ +264,5 @@ > + ? dominated.end() > + : &dominated[indices[nodeIndex + 1]]; > + return DominatedSetRange(postOrder, &dominated[indices[nodeIndex]], end); > + } > + }; This talks a lot about dominated sets and things, but really what you're computing here is very general: you have a tree represented by parent links. You want to construct a child link array (an array mapping nodes to a quick lookup of their immediate children.) So the whole thing could be generalized and put into some generic graph algorithm header or something. ...and yet, until we have a second user, that seems like useless busywork.
Attachment #8689129 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink, :s:] from comment #3) > @@ +216,5 @@ > > + return mozilla::Nothing(); > > + memset(indices.begin(), 0, length); > > + > > + // Determine the size of each dominated set and place it in > > + // `indices`. > > So I guess what would have made me understand this more easily is something > like: > > Create a vector 'dominated' holding a flattened set of buckets of > immediately dominated children nodes, with a lookup table 'indices' mapping > from each node to the beginning of its bucket. > > In a first pass, iterate over the full set of nodes and count up the size of > each bucket. Then convert the vector to store the cumulative sum of the > sizes of all buckets before each index, resulting in a mapping from node > index to just past the end of that node's bucket. Walk through the full set > of nodes, filling in bucket entries going from the end of each bucket to the > beginning. > > Though now that I understand it, that doesn't seem any more or less clear > than what you already have written. Maybe merge the two? Or maybe just add > in to your original one the part about pointing just past the end of each > bucket (as an explanation of "each index + the size of its dominated set".) > > My first read of this made me think it was way more complicated than it > actually is. How does this sound? // Create a vector `dominated` holding a flattened set of buckets of // immediately dominated children nodes, with a lookup table // `indices` mapping from each node to the beginning of its bucket. // // This has three phases: // // 1. Iterate over the full set of nodes and count up the size of // each bucket. These bucket sizes are temporarily stored in the // `indices` vector. // // 2. Convert the `indices` vector to store the cumulative sum of // the sizes of all buckets before each index, resulting in a // mapping from node index to one past the end of that node's // bucket. // // 3. Iterate over the full set of nodes again, filling in bucket // entries from the end of the bucket's range to its // beginning. This decrements each index as a bucket entry is // filled in. After having filled in all of a bucket's entries, // the index points to the start of the bucket. And then there is a tiny comment above each phase in the code like: // 1 memset(indices.begin(), 0, length); for (uint32_t i = 0; i < length; i++) indices[doms[i]]++; And copying that bit out just made me realize that the memset was wrong. It should read: memset(indices.begin(), 0, length * sizeof(uint32_t)); That saved me an rr session :P > @@ +238,5 @@ > > + for (uint32_t i = 0; i < length; i++) { > > + auto& idx = indices[doms[i]]; > > + idx--; > > + dominated[idx] = i; > > + } > > Right now, the child nodes come in arbitrary order. You sort of want them > sorted by decreasing retained size. But perhaps you haven't computed that > yet? Or maybe it's just cleaner to do that during display. Yeah, I'm not 100% sure when the right time to do the sorting is. We kind of want the immediately dominated sets built up already before computing retained sizes; the alternative being O(n^2) behavior of walking up all dominators and incrementing their retained size while iterating over every node. The easy thing to do is to sort right before giving the array of immediately dominated children to JS, but then we end up potentially sorting the same things over and over again. This was what I was planning on initially implementing. I suppose we could also sort after computing retained sizes. I think that can be a follow up for if we see sorting taking up a bunch of time in practice. > @@ +264,5 @@ > > + ? dominated.end() > > + : &dominated[indices[nodeIndex + 1]]; > > + return DominatedSetRange(postOrder, &dominated[indices[nodeIndex]], end); > > + } > > + }; > > This talks a lot about dominated sets and things, but really what you're > computing here is very general: you have a tree represented by parent links. > You want to construct a child link array (an array mapping nodes to a quick > lookup of their immediate children.) So the whole thing could be generalized > and put into some generic graph algorithm header or something. > > ...and yet, until we have a second user, that seems like useless busywork. Yeah, it is generic but also fairly niche. Maybe a "good" CS-gotcha interview whiteboard question :P
(In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #4) > How does this sound? That sounds great! Thanks! > And then there is a tiny comment above each phase in the code like: > > // 1 I'm fine with having the comments split out and sitting directly above the relevant chunk of code, like you have it now. I only merged them together for ease of writing the review comment. Actually, I'd prefer they stuck close to the relevant code. > memset(indices.begin(), 0, length); > for (uint32_t i = 0; i < length; i++) > indices[doms[i]]++; > > And copying that bit out just made me realize that the memset was wrong. It > should read: > > memset(indices.begin(), 0, length * sizeof(uint32_t)); > > That saved me an rr session :P Doh! Sorry. I should have caught that.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: