Bug 665354 - alternative bump-allocation for GC arenas
Opened 13 years ago; Closed 13 years ago
Categories: Core :: JavaScript Engine, defect
Status: RESOLVED FIXED
Target Milestone: mozilla8
People: Reporter: igor; Assignee: igor
References: Blocks 1 open bug
Attachments (2 files, 2 obsolete files):
30.30 KB, patch | igor: review+ | Details | Diff | Splinter Review
1.02 KB, patch | jorendorff: review+ | Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #601075 +++
The implementation of the list of segments of free things in the GC arena from bug 601075 may not be optimal. We should investigate alternative representations. See bug 601075 comment 41 and later for a discussion of alternatives.
Assignee | Updated 13 years ago
Summary: bump-allocator for GC arenas → alternative bump-allocation for GC arenas
Assignee | Comment 1•13 years ago
The current bump-allocation implementation, which uses the FreeSpan::end & ArenaMask check to distinguish the end of the last span from the linkage cell of a non-last span, does not work with the changes from bug 600234.
In that bug I move the arena header out of the arena, so GC things whose size is a power of 2 fill the whole arena tightly. Now suppose that after a GC the first GC thing in some arena is free but the second is not. With the current setup, the free span pointing to this thing has FreeSpan::start and FreeSpan::end both pointing to the arena start. But then FreeSpan::end & ArenaMask is zero, and this free span would be mistaken for the empty span terminating the list.
This problem can be avoided if, for the last free span, FreeSpan::end points to the last byte in the arena rather than to the arena end (lastByte + 1). With this setup the empty span at the end of the free-span list has FreeSpan::start > FreeSpan::end, and the last cell of a non-last span has FreeSpan::start == FreeSpan::end. The allocation code changes from the current
thing = listHead->start;
if (thing == listHead->end) {
// We either have at least one more cell in the last span or
// two more in a non-last span
listHead->start += thingSize;
} else if (listHead->end & ArenaMask) {
// We reached the last cell of the non-last span,
// read new list head
*listHead = *(FreeSpan *) thing;
} else {
// We have reached the end of the span list
thing = NULL;
}
into:
thing = listHead->start;
if (thing < listHead->end) {
// We either have at least one more cell in the last span or
// two more in a non-last span
listHead->start += thingSize;
} else if (thing == listHead->end) {
// We reached the last cell of the non-last span,
// read new list head
*listHead = *(FreeSpan *) thing;
} else {
// We have reached the end of the span list
thing = NULL;
}
that is, the first and the second if conditions change from
thing != listHead->end
listHead->end & ArenaMask
to
thing < listHead->end
thing == listHead->end
At least on x86-32 and x86-64 this generates more compact code: in the latter case the compiler issues a single cmp instruction followed by a couple of jb/je jumps, while the former requires an extra and/testb instruction and a register load.
Another advantage of the new setup is that FreeSpan::end now always stays within the arena, so FreeSpan::end & ~ArenaMask can be used to access the arena even for the empty span, saving extra ifs.
Assignee | Comment 2•13 years ago
The patch implements the above proposal. With it I see a small win in v8 on Linux:
TEST COMPARISON FROM TO DETAILS
=============================================================================
** TOTAL **: 1.005x as fast 1573.4ms +/- 0.1% 1566.1ms +/- 0.1% significant
=============================================================================
v8: 1.005x as fast 1573.4ms +/- 0.1% 1566.1ms +/- 0.1% significant
crypto: 1.005x as fast 206.0ms +/- 0.2% 204.9ms +/- 0.3% significant
deltablue: - 286.0ms +/- 0.3% 285.4ms +/- 0.3%
earley-boyer: 1.011x as fast 259.0ms +/- 0.2% 256.3ms +/- 0.2% significant
raytrace: - 200.4ms +/- 0.1% 199.8ms +/- 0.2%
regexp: *1.004x as slow* 195.6ms +/- 0.3% 196.4ms +/- 0.3% significant
richards: - 216.9ms +/- 0.6% 215.8ms +/- 0.3%
splay: 1.010x as fast 209.6ms +/- 0.3% 207.5ms +/- 0.2% significant
Attachment #550997 - Flags: review?(wmccloskey)
Comment 3•13 years ago
I should be able to review this tomorrow. But I have one question after a brief once-over. Why did you change all the uintptr_t types to uint8*? It seems like it just adds a lot more casting. Unless it's necessary, I definitely liked the old way better.
Assignee | Comment 4•13 years ago
(In reply to Bill McCloskey (:billm) from comment #3)
> I should be able to review this tomorrow. But I have one question after a
> brief once-over. Why did you change all the uintptr_t types to uint8*?
Currently the fast allocation path and a few other places use code like end & (ArenaSize - 1). That makes uintptr_t a much better fit than a pointer type. With the patch, "&" is used when calculating the arena from the end pointer. The casts at that point make the conversion between pointers to GC things explicit, which helped to find bugs in the patch.
> It
> seems like it just adds a lot more casting. Unless it's necessary, I
> definitely liked the old way better.
With a fresh look at the patch I see that the number of casts is indeed big. I will try to hide them behind accessor functions.
Assignee | Comment 5•13 years ago
In the new version I switched to using uintptr_t. It is the simplest way to reduce the number of casts.
Attachment #550997 - Attachment is obsolete: true
Attachment #550997 - Flags: review?(wmccloskey)
Attachment #552036 - Flags: review?(wmccloskey)
Comment 6•13 years ago
Comment on attachment 552036 [details] [diff] [review]
v2
Review of attachment 552036 [details] [diff] [review]:
-----------------------------------------------------------------
I like that you abstracted the encoding into separate methods. However, I thought that the original justification for using the encoding was that, for certain thingSizes, the extra word in the ArenaHeader reduces the number of things we can store in the arena by one. But once we move the ArenaHeader out of the Arena, the extra word is just an extra word. So why don't we get rid of the encoding entirely?
Also, I don't see the reason for changing the return types of allocate, getNext, and RefillTypedFreeList. It seems a bit cleaner to me for them to return cells. On the other hand, if you really want to leave them as uintptr_t, I think you missed a |return NULL| in RefillTypedFreeList.
Otherwise this looks good.
::: js/src/jsgc.h
@@ +127,5 @@
> /*
> * A FreeSpan represents a contiguous sequence of free cells in an Arena.
> + * |first| is the address of the first free cell in the span. |last| is the
> + * address of the last free cell in the span. This last cell holds a FreeSpan
> + * data structure for the next span unless this is the last span span on the
"span span" -> "span"
@@ +141,2 @@
> *
> + * |first| == |last| implies that we have one element span that records the
-> "we have a one element span"
@@ +141,4 @@
> *
> + * |first| == |last| implies that we have one element span that records the
> + * next span. So to allocate from it we need to update the span list head
> + * with a copy of span stored at |last| address so the following allocations
-> "a copy of the span"
@@ +251,5 @@
> + /*
> + * We must use first - arenaStart, not first & ArenaMask as
> + * first == ArenaMask + 1 for an empty span.
> + */
> + uintptr_t arenaAddr = last & ~ArenaMask;
Use arenaAddress() here?
@@ +283,5 @@
> + JS_ASSERT(last != uintptr_t(-1));
> + JS_ASSERT(first);
> + JS_ASSERT(last);
> + JS_ASSERT(first - 1 <= last);
> + uintptr_t arenaAddr = last & ~ArenaMask;
You do the arenaAddress() computation 3 times in checkSpan, which I think is enough to warrant an arenaAddressUnchecked method.
Attachment #552036 - Flags: review?(wmccloskey) → review+
Assignee | Comment 7•13 years ago
The new version addresses the nits and changes the allocation functions' return type to void*. I have kept the offset encoding as is, to keep the arena header at 4 words for now and have more options in other bugs.
One scheme I am considering for variable-length C things uses one byte per 4 words of allocation memory to record the mark flags and the type of the GC thing. But with 64K chunks on 32 bits, with 15 arenas we would need 15 * (4096 / (4 * 4)), or 4096 - 256 = 3840, bytes for that type map, leaving just 256 bytes for arena headers. If the arena header stays within 4 words, the headers need 16 * 15, or 256 - 16 = 240, bytes, so they and the chunk header fit in the remaining space, and we still have 4 words for the chunk info structure without reducing the number of useful arenas to 14.
Storing offsets is also useful because the representation does not depend on the arena address. So an empty span, or a span that covers the whole arena, can be represented as a compile-time constant. That simplifies a few things for now.
But I plan to reconsider this when things settle.
Attachment #552036 - Attachment is obsolete: true
Attachment #552317 - Flags: review+
Assignee | Comment 8•13 years ago
Whiteboard: [inbound]
Comment 9•13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [inbound]
Target Milestone: --- → mozilla8
Comment 10•13 years ago
Comment on attachment 552317 [details] [diff] [review]
v3
This checkin introduced a 52-line dump of build-warning spam when building jsgc.cpp with GCC.
The full list of warnings is here:
http://pastebin.mozilla.org/1296775
Sample:
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp: In member function 'bool js::gc::Arena::finalize(JSContext*) [with T = JSObject, JSContext = JSContext]':
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp:286:60: instantiated from 'void js::gc::FinalizeArenas(JSContext*, js::gc::ArenaHeader**) [with T = JSObject, JSContext = JSContext]'
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp:1271:47: instantiated from here
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp:238:21: warning: converting to non-pointer type 'uintptr_t' from NULL
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp: In member function 'bool js::gc::Arena::finalize(JSContext*) [with T = JSObject_Slots2, JSContext = JSContext]':
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp:286:60: instantiated from 'void js::gc::FinalizeArenas(JSContext*, js::gc::ArenaHeader**) [with T = JSObject_Slots2, JSContext = JSContext]'
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp:1274:54: instantiated from here
/builds/slave/m-cen-lnx64-dbg/build/js/src/jsgc.cpp:238:21: warning: converting to non-pointer type 'uintptr_t' from NULL
It's due to the assignment to NULL at the end of this block:
> if (newFreeSpanStart) {
> JS_ASSERT(thing >= thingsStart(sizeof(T)) + sizeof(T));
>- newListTail->start = newFreeSpanStart;
>- newListTail->end = thing - sizeof(T);
>- newListTail = newListTail->nextSpanUnchecked();
>- newFreeSpanStart = 0;
>+ newListTail->first = newFreeSpanStart;
>+ newListTail->last = thing - sizeof(T);
>+ newListTail = newListTail->nextSpanUnchecked(sizeof(T));
>+ newFreeSpanStart = NULL;
It looks like this used to be an assignment to 0. Can we just change it back to 0?
Comment 11•13 years ago
Here's a followup to fix the warning described above. FWIW, it looks like we initialize this same variable to 0 when it's declared, so assigning it to 0 appears to be kosher.
Attachment #552439 - Flags: review?(igor)
Comment 12•13 years ago
Comment on attachment 552439 [details] [diff] [review]
followup to fix build warning
Stealing review. Let's go ahead and get this in.
Attachment #552439 - Flags: review?(igor) → review+
Comment 13•13 years ago
Thanks! Landed followup on inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/fbbf306c6cb5
Assignee | Comment 14•13 years ago
(In reply to Daniel Holbert [:dholbert] from comment #11)
> Created attachment 552439 [details] [diff] [review]
> followup to fix build warning
Thanks for fixing this!