Closed
Bug 686327
Opened 14 years ago
Closed 13 years ago
shared red-black tree nil
Categories
(Core :: Memory Allocator, defect)
Core
Memory Allocator
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: igor, Unassigned)
Details
Attachments
(6 files, 3 obsolete files)
|
40.25 KB,
patch
|
Details | Diff | Splinter Review | |
|
53.79 KB,
patch
|
Details | Diff | Splinter Review | |
|
38.01 KB,
patch
|
Details | Diff | Splinter Review | |
|
57.00 KB,
patch
|
Details | Diff | Splinter Review | |
|
4.66 KB,
patch
|
Details | Diff | Splinter Review | |
|
76.91 KB,
patch
|
Details | Diff | Splinter Review |
jemalloc uses few red-black trees to manage its data structures. The implementation of the tree comes from http://hg.mozilla.org/mozilla-central/file/da2f5b63ba1e/memory/jemalloc/rb.h .
To be completely generic the implementation does not define its own tree node type. Rather it provides a macro, rb_node, that allows to embed the tree linkage fields into a user-defined structure and then use that structure as a node type avoiding any casts.
This has a drawback. The implementation needs a special Nil sentinel node that it uses to represent tree leafs. For this Nil node its left and right pointers refer to itself. Thus Nil can be shared between all tree instances and put in read-only memory. However, as there is no common node structure, the code puts Nil into the structure that holds the tree root. This mean not only that a pointer to the Nil cannot be a load-time constant, but also that the tree structure is bloated with all the fields that the user-defined structure contains.
It would be nice to fix that and provide a shared Nill using explicit node type that just stores left/right pointers and the color. Then to cast between the user-defined nodes and this internal node type the code can use offsetof() to get the offset from the start of the user-defined struct to the field holding the internal node type.
Comment 1•14 years ago
|
||
/* Root structure. */
#define rb_tree(a_type)
struct {
a_type *rbt_root;
a_type rbt_nil;
}
This whole file is super, super scary, so I think a goal should be minimizing code changes. If we compiled jemalloc as C++, could we make rbt_nil a static field? I guess we'd have to declare storage for those somewhere, but that shouldn't be so hard.
| Reporter | ||
Comment 2•14 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #1)
> /* Root structure. */
> #define rb_tree(a_type)
> struct {
> a_type *rbt_root;
> a_type rbt_nil;
> }
>
> This whole file is super, super scary, so I think a goal should be
> minimizing code changes.
It is possible to do the proposed changes using automated scripts with pattern matching.
> If we compiled jemalloc as C++, could we make
> rbt_nil a static field? I guess we'd have to declare storage for those
> somewhere, but that shouldn't be so hard.
I do not see why one cannot declare a global variable using C. If that would not work across library boundaries, then at least it can be declared file-static with users of the file defining it as necessary.
But it would be nice to replace that with C++ code indeed. I need RBT for the JS GC allocation work to implement allocation of variable-sized GC things.
| Reporter | ||
Comment 3•14 years ago
|
||
| Reporter | ||
Comment 4•14 years ago
|
||
| Reporter | ||
Comment 5•14 years ago
|
||
| Reporter | ||
Comment 6•14 years ago
|
||
The files above implements the shared nil proposal. The idea is to replace in rb.h
#define rb_node(a_type) \
struct { \
a_type *rbn_left; \
a_type *rbn_right_red; \
}
and
#define rb_tree(a_type) \
struct { \
a_type *rbt_root; \
a_type rbt_nil; \
}
with the following public structs:
struct rb_node {
rb_node *rbn_left;
rb_node *rbn_right_red;
};
and
struct rb_tree {
rb_node *rbt_root; \
};
The rbt_nil field is replaced with a suitably prefixed global constant.
This makes unnecessary to pass a_type, a_field pair into macros that work on tree. Instead all such macros assumes that their arguments are rb_node pointers. In few cases this also allows to remove the a_tree argument as currently it is only used to access the nil from the tree root.
Also the macros that take the compare function also assume that it takes the node pointer delegating the responsibility to do the conversion to the comparator.
This way all the converssions is moved to rb_wrap macro that defines type-tailored tree functions. To convert from node to user-defined type the macro uses helpers based on offsetof:
#define rb_cast(a_type, a_field, a_node) \
((a_type *) ((uintptr_t) (a_node) - offsetof(a_type, a_field)))
#define rb_cast_or_null(a_type, a_field, a_node) \
((a_node) == rb_nil ? NULL : rb_cast(a_type, a_field, (a_node)))
The changes are done in 2 stages. First the python script from the comment 3 does the bulk of patching to eliminate unused arguments and tailor all the macros to assume that all node pointers are rb_node *. This generates rb2.h from rb.h, see the comment 4. Then the patch from the comment 5 finalize patching of rb2.h with non-repetitive substantial changes and switches jemalloc.c to use the new functionality.
When the changes will be finalized, I rename rb2.h back to rb.h.
With this changes it is straightforward to implement a C++ wrapper on top of rbp_ macros that uses templates for compare and search operations without touching the scary code in rbp_remove and similar.
| Reporter | ||
Comment 7•14 years ago
|
||
I implemented this as a series of patches where each patch is small and self contained and allowed to test it on the try server. The parts are:
1. Replacing the nil node in the tree root by a shared constant. The macros that accesses left, right and color fields are modified to explicitly check for this shared nil constant so it never dereferenced. This allowed to simple set a constant to (node_type *)small_number. For small_number I chosen 2000, not 0, for better test coverage so accidental NULL would not be mistaken for the constant. The patch is implemented as a python script and an extra patch.
2. Introducing rb_node and rb_tree shared structures that the tree macros uses. The pointer to structures are converted to user-supplied types using offsetof only when calling the node compare operations. As with part 1 this is implemented as a python script and a patch.
3. Replacing rb_nil defined as (rb_node *)2000 with proper shared const to avoid checks for nil in the getter macros.
4. Finalizing the changes like removing the python scripts from the tree.
| Reporter | ||
Comment 8•14 years ago
|
||
The patch contains python script that replaces rbt_nil field in the tree root with shared rbt_nil constant. The script also replaces assert calls with rb_assert so it is easier to overwrite that. Yet another change is using rbt_ macros, not rb_ones, in rb_wrap implementation, to minimize changes later.
For convenience the patch contains both the script and its result in the form of rb2.h file.
Attachment #560127 -
Attachment is obsolete: true
Attachment #560129 -
Attachment is obsolete: true
Attachment #560130 -
Attachment is obsolete: true
Attachment #561428 -
Flags: review?(jasone)
| Reporter | ||
Comment 9•14 years ago
|
||
The attached patch finalizes conversion to the shared nil and switches jemalloc to use the result of the conversion. The patch also removes the need to define SIZEOF_PTR_2POW and SIZEOF_PTR macros to simplify rb.h usage.
Attachment #561428 -
Attachment is obsolete: true
Attachment #561428 -
Flags: review?(jasone)
Attachment #561429 -
Flags: review?(jasone)
| Reporter | ||
Updated•14 years ago
|
Attachment #561428 -
Attachment is obsolete: false
Attachment #561428 -
Flags: review?(jasone)
| Reporter | ||
Comment 10•14 years ago
|
||
This is the python converter and its result to switch rb.h implementation macros to use rb_tree and rb_node common structures instead of using user-supplied one. The conversion from-to user structures happens either in rb_wrap or when calling the node compare macros.
| Reporter | ||
Comment 11•14 years ago
|
||
The patch switches jemalloc to use rb_node and rb_tree when working with rb implementation.
| Reporter | ||
Comment 12•14 years ago
|
||
The patch replaces that (rb_node *) small_constant hack with a proper shared const structure to avoid checking for rb_nil in each and every branch. As it may not be possible to share the definition of the const across libraries, the patch requires that the user of rb.h defines a prefix for the constant so it can be defined several times.
| Reporter | ||
Comment 13•14 years ago
|
||
The patch just removes the conversion scripts and renames rb3.h back to rb.h
| Reporter | ||
Comment 14•14 years ago
|
||
Justin: do you have time to review this?
Comment 15•14 years ago
|
||
(In reply to Igor Bukanov from comment #14)
> Justin: do you have time to review this?
Sure, I'm game. But can you help me understand the motivation for these changes a bit more? Do you want a generic rbt for the purposes of some other patch? Do these changes affect the speed of jemalloc?
My guess would be that a generic rbt for use outside jemalloc should be a templated C++ class. Defining that class in terms of these macros might be kind of ugly, but I guess the upside is that it would have a chance of being correct?
| Reporter | ||
Comment 16•14 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #15)
> But can you help me understand the motivation for these
> changes a bit more? Do you want a generic rbt for the purposes of some
> other patch?
Yes - I want to adapt rbt implementation for JS GC where I would like to build a templated class on top of the macros. It is possible to do that with the current implementation, but doing that is awkward especially when implementing sharing of generated code between different templates. Another motivation is that the current nil-per-tree does not allow to pack nicely the tree with another GC structures and get a archive a particularly tight layout. In addition, on platforms where shared const nil can be put into read-only memory, it will be easier to debug bugs related to node overwrites.
> Do these changes affect the speed of jemalloc?
I will try to micro-benchmark this.
> Defining that class in terms of these macros might be
> kind of ugly, but I guess the upside is that it would have a chance of being
> correct?
Yes, I want to piggy-back on all the efforts that went into rb.h.
Comment 17•14 years ago
|
||
Is there any reason that you need to use the same RBT for the JS GC and jemalloc? Why modify the allocator if we don't need to?
For the sake of YAGNI, I'd kind of like to see this templated RBT class being used in a patch you want to land before diving deep into this patch. Is that possible?
Comment 18•14 years ago
|
||
Igor, you may want to have a look at jemalloc/include/jemalloc/internal/rb.h in the upstream jemalloc implementation. If you're planning to use this rbtree outside of jemalloc, it might make sense to match upstream, although that could mean redoing a lot of work. :(
| Reporter | ||
Updated•13 years ago
|
Assignee: igor → nobody
We can reopen this if it's needed in the future.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INCOMPLETE
| Reporter | ||
Updated•13 years ago
|
Attachment #561428 -
Flags: review?(jasone)
| Reporter | ||
Updated•13 years ago
|
Attachment #561429 -
Flags: review?(jasone)
You need to log in
before you can comment on or make changes to this bug.
Description
•