Closed
Bug 658041
Opened 15 years ago
Closed 14 years ago
Stack based marking for JSRopes
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: gwagner, Assigned: gwagner)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 8 obsolete files)
|
11.68 KB,
patch
|
gwagner
:
review+
|
Details | Diff | Splinter Review |
For parallel marking we have to change the way rope-strings are marked.
The idea is to use a stack-based marking with some sort of delayed marking for over-recursion.
Comment 1•15 years ago
|
||
One instance of Schorr-Waite is enough :)
| Assignee | ||
Comment 2•15 years ago
|
||
First simple version.
We can avoid the over recursion check if we do it mark-stack based.
The time for marking 5000000 ropes increases from 2.7msec to 2.9msec on my machine.
Assignee: general → anygregor
| Assignee | ||
Comment 3•15 years ago
|
||
Ok this patch is wrong. It's not that easy :)
| Assignee | ||
Comment 4•15 years ago
|
||
Reintroducing the stack limits for rope marking.
This patch is just for regression testing and still needs some correctness work.
I haven't figured out a smart way to set the stackLimit for the gcHelperThread. Igor any ideas about that?
Attachment #534098 -
Attachment is obsolete: true
Comment 5•15 years ago
|
||
(In reply to comment #4)
> I haven't figured out a smart way to set the stackLimit for the
> gcHelperThread. Igor any ideas about that?
I do not see why we need to worry about the stack limit. With the tail recursion eliminated when marking a rope the stack depth cannot exceed log2(max_string_count) so at worst we would have (32 or 64)/log2(string-size) frames. Also, handling the stack explicitly should not add too much complexity AFICS.
Comment 6•15 years ago
|
||
(In reply to comment #5)
> I do not see why we need to worry about the stack limit. With the tail
> recursion eliminated when marking a rope the stack depth cannot exceed
> log2(max_string_count) so at worst we would have (32 or
> 64)/log2(string-size) frames. Also, handling the stack explicitly should not
> add too much complexity AFICS.
This is wrong - the rope tree is not balanced and can have arbitrary depth.
So back to the original question about the helper thread. When the thread is initialized we can calculate the limit based on the address of a variable in the first native frame of the thread.
But it looks like navigating the tree with an explicit stack can be implemented in a quite straightforward way. It could even less source lines than the native stack management.
| Assignee | ||
Comment 7•15 years ago
|
||
testBug614653.js is a good testcase that blows away the stack if we don't check for overflow.
Igor do you mean an explicit stack for each GCMarker or for each Rope? So something similar to the objStack?
Comment 8•15 years ago
|
||
(In reply to comment #7)
> testBug614653.js is a good testcase that blows away the stack if we don't
> check for overflow.
What makes the ropes special compared with objects so that using native stack is better for them rather then explicit stacks. It does not look intuitive to have so different approaches.
> Igor do you mean an explicit stack for each GCMarker or for each Rope? So
> something similar to the objStack?
(In reply to comment #7)
> Igor do you mean an explicit stack for each GCMarker or for each Rope? So
> something similar to the objStack?
I was thinking about fixed-size buffer per GCMarker used as an explicit stack when marking a rope. It could be similar to objStack, but since rope has at most two children it can be coded differently.
| Assignee | ||
Comment 9•15 years ago
|
||
Explicit mark stack for ropes.
Maybe I am missing something but I get a speedup from 622ms to 557ms for the corrected version of testBug614653.js in the shell.
Attachment #534583 -
Attachment is obsolete: true
Comment 10•15 years ago
|
||
Comment on attachment 534874 [details] [diff] [review]
patch
> static inline void
> PushMarkStack(GCMarker *gcmarker, JSString *str)
> {
> JS_ASSERT_IF(gcmarker->context->runtime->gcCurrentCompartment,
> str->compartment() == gcmarker->context->runtime->gcCurrentCompartment
> || str->compartment() == gcmarker->context->runtime->atomsCompartment);
>
>- str->mark(gcmarker);
>+ if (str->isLinear()) {
>+ str->mark(gcmarker);
>+ } else {
>+ JSRope *rope = &str->asRope();
>+ if (!str->markIfUnmarked()) {
>+ gcmarker->pushString(rope->leftChild());
>+ gcmarker->pushString(rope->rightChild());
A tail recursion can be eliminated here at least for one of the children string to avoid going to the stack and back.
>@@ -719,16 +727,19 @@ MarkChildren(JSTracer *trc, JSXML *xml)
>
> void
> GCMarker::drainMarkStack()
> {
> while (!isMarkStackEmpty()) {
> while (!objStack.isEmpty())
> ScanObject(this, objStack.pop());
>
>+ while (!ropeStack.isEmpty())
>+ PushMarkStack(this, ropeStack.pop());
>+
As objects may push strings to the stack while strings can only push strings, it is better to drain ropes first to minimize their stack usage.
Could you check cache misses with the patch? I suppose it can explain the speedup. Mutating the string in place to implement O(1) traversal implies writing a word twice per string. The patch replaces that with coping the word into the stack. Our strings takes 4 words so that minimizes the write bandwidth to the cache and memory by factor of 4.
| Assignee | ||
Comment 11•15 years ago
|
||
new version with depth-first-seach like traversal.
I don't see any real performance difference to the other approach but this might be different on smaller devices.
Attachment #534874 -
Attachment is obsolete: true
| Assignee | ||
Comment 12•15 years ago
|
||
Hm I guess now we can end up in an endless loop where we start delayed marking with the first string we wanna mark.
Comment 13•15 years ago
|
||
Comment on attachment 534905 [details] [diff] [review]
patch
> static inline void
> PushMarkStack(GCMarker *gcmarker, JSString *str)
> {
> JS_ASSERT_IF(gcmarker->context->runtime->gcCurrentCompartment,
> str->compartment() == gcmarker->context->runtime->gcCurrentCompartment
> || str->compartment() == gcmarker->context->runtime->atomsCompartment);
>
>- str->mark(gcmarker);
>+ if (str->isLinear()) {
>+ str->mark(gcmarker);
>+ } else {
>+ if (!str->markIfUnmarked()) {
>+ JSRope *rope = &str->asRope();
>+ gcmarker->pushString(rope->rightChild());
>+ JSString *leftStr = rope->leftChild();
>+ while (leftStr->isRope()) {
>+ rope = &leftStr->asRope();
>+ gcmarker->pushString(rope->rightChild());
>+ leftStr = rope->leftChild();
>+ }
Shouldn't this include some markIfUnmarked() calls for leftStr? Also this can be written in a simpler form like:
while (!str->isLinear()) {
if (!str->markIfUnmarked())
return;
JSRope *rope = &str->asRope();
gcmarker->pushString(rope->rightChild());
str = rope->leftChild();
}
str->mark(trx);
Comment 14•15 years ago
|
||
(In reply to comment #13)
> Shouldn't this include some markIfUnmarked() calls for leftStr? Also this
> can be written in a simpler form like:
>
> while (!str->isLinear()) {
> if (!str->markIfUnmarked())
> return;
> JSRope *rope = &str->asRope();
> gcmarker->pushString(rope->rightChild());
> str = rope->leftChild();
> }
> str->mark(trx);
This is wrong as gcmarker->pushSomething should be called only for already marked GC thing. So that code becomes:
while (!str->isLinear()) {
if (!str->markIfUnmarked())
return;
JSRope *rope = &str->asRope();
JSString *right = rope->rightChild();
if (right->markIfUnmarked()) {
if (right->isLinear())
right->mark(gcmarker);
else
gcmarker->pushString(right);
}
str = rope->leftChild();
}
str->mark(gcmarker);
| Assignee | ||
Comment 15•15 years ago
|
||
(In reply to comment #14)
> (In reply to comment #13)
> while (!str->isLinear()) {
> if (!str->markIfUnmarked())
> return;
> JSRope *rope = &str->asRope();
> JSString *right = rope->rightChild();
> if (right->markIfUnmarked()) {
> if (right->isLinear())
> right->mark(gcmarker);
> else
> gcmarker->pushString(right);
> }
> str = rope->leftChild();
> }
> str->mark(gcmarker);
Thx for looking at it.
We are getting closer but still not there. "right" might be a static string so we need another isStatic check. I will change it.
| Assignee | ||
Comment 16•15 years ago
|
||
More complicated than I expected...
So the problem is that we have to be very careful when we can mark the string and we have to consider that the Rope is already marked once it comes from the ropeStack. I created a ScanString function to handle all strings from the ropestack.
Attachment #534905 -
Attachment is obsolete: true
| Assignee | ||
Comment 17•15 years ago
|
||
Now with correct isMarkStackEmpty handling.
Attachment #534988 -
Attachment is obsolete: true
| Assignee | ||
Comment 18•15 years ago
|
||
Finally, try-server seems to like it.
Attachment #535065 -
Attachment is obsolete: true
Attachment #535119 -
Flags: review?(igor)
Comment 19•15 years ago
|
||
Comment on attachment 535119 [details] [diff] [review]
patch
Review of attachment 535119 [details] [diff] [review]:
-----------------------------------------------------------------
r+ with this fixed.
::: js/src/jscntxt.h
@@ +426,5 @@
> JSObject *gcWeakMapList;
>
> /* Pre-allocated space for the GC mark stacks. Pointer type ensures alignment. */
> void *gcMarkStackObjs[js::OBJECT_MARK_STACK_SIZE / sizeof(void *)];
> + void *gcMarkStackRopes[js::ROPES_MARK_STACK_SIZE / sizeof(void *)];
We should change the constants into NAME_MARK_STACK_LENGTH to avoid multiplication by the size at the definition point just to divide by the size here.
::: js/src/jsgc.h
@@ +1236,5 @@
> void markDelayedChildren();
>
> bool isMarkStackEmpty() {
> + return objStack.isEmpty() && ropeStack.isEmpty() &&
> + xmlStack.isEmpty() && largeStack.isEmpty();
Nit: use one isEmpty() per line if all do not fit one line.
@@ +1247,5 @@
> delayMarkingChildren(obj);
> }
>
> + void pushString(JSString *str) {
> + JS_ASSERT(str->isRope());
Rename this into pushRope, take JSRope as a parameter and remove the assert.
::: js/src/jsgcmark.cpp
@@ +542,5 @@
> +
> + if (rightChild->isLinear()) {
> + rightChild->mark(gcmarker);
> + } else {
> + if (rightChild->isRope() && rightChild->markIfUnmarked())
The current code in JSString::mark assumes that a non-linear string means rope. So either the code there has a bug (and the comments in jsstr.h are wrong) or we can drop the rightChild->isRope() check here. In the latter case we should replace !str->isLinear() with str->isRope() in the loop condition.
@@ +551,5 @@
> str->mark(gcmarker);
> }
>
> +static inline void
> +ScanString(GCMarker *gcmarker, JSString *str)
Replace that with ScanRope and take JSRope * as a parameter.
@@ +567,5 @@
> + if (rightChild->isLinear()) {
> + rightChild->mark(gcmarker);
> + } else {
> + if (rightChild->isRope() && rightChild->markIfUnmarked())
> + gcmarker->pushString(rightChild);
The same comment as in PushString. Also couldn't we just replace the PushString body with:
if (str->isLinear()) {
str->asLinear().mark(gcmarker);
} else {
JS_ASSERT(str->isRope());
if (str->markIfUnmarked())
ScanRope(gcmarker, str->asRope());
}
? If not, comment why this is not done.
::: js/src/jsstr.cpp
@@ +121,5 @@
>
> void
> JSString::mark(JSTracer *trc)
> {
> + JS_ASSERT(isLinear());
Remove JSString::mark alltogether and just call asLinear().mark() instead.
@@ -193,5 @@
> - parent->d.s.u2.right = str;
> - str = parent;
> - parent = nextParent;
> - goto finish_node;
> - }
It is interesting that this is the second time Deutsch-Schorr-Waite is removed in SM. The first time was when the delayed marking replaced it for object marking. The reason was that it was not possible to generalize to non-standard objects. Now the patch does that for ropes to make marking thread-safe. What makes DSW so bad I wonder?..
Attachment #535119 -
Flags: review?(igor) → review+
| Assignee | ||
Comment 20•15 years ago
|
||
(In reply to comment #19)
> Comment on attachment 535119 [details] [diff] [review] [review]
> patch
>
> Review of attachment 535119 [details] [diff] [review] [review]:
> -----------------------------------------------------------------
>
> r+ with this fixed.
>
> ::: js/src/jscntxt.h
> @@ +426,5 @@
> > JSObject *gcWeakMapList;
> >
> > /* Pre-allocated space for the GC mark stacks. Pointer type ensures alignment. */
> > void *gcMarkStackObjs[js::OBJECT_MARK_STACK_SIZE / sizeof(void *)];
> > + void *gcMarkStackRopes[js::ROPES_MARK_STACK_SIZE / sizeof(void *)];
>
> We should change the constants into NAME_MARK_STACK_LENGTH to avoid
> multiplication by the size at the definition point just to divide by the
> size here.
I already changed this in my parallel-marking patch.
>
> ::: js/src/jsgcmark.cpp
> @@ +542,5 @@
> > +
> > + if (rightChild->isLinear()) {
> > + rightChild->mark(gcmarker);
> > + } else {
> > + if (rightChild->isRope() && rightChild->markIfUnmarked())
>
> The current code in JSString::mark assumes that a non-linear string means
> rope. So either the code there has a bug (and the comments in jsstr.h are
> wrong) or we can drop the rightChild->isRope() check here. In the latter
> case we should replace !str->isLinear() with str->isRope() in the loop
> condition.
I got confused with static atoms and there we can't call markIfUnmarked() on them. But it works out now with the new control flow.
>
> The same comment as in PushString. Also couldn't we just replace the
> PushString body with:
>
> if (str->isLinear()) {
> str->asLinear().mark(gcmarker);
> } else {
> JS_ASSERT(str->isRope());
> if (str->markIfUnmarked())
> ScanRope(gcmarker, str->asRope());
> }
>
> ? If not, comment why this is not done.
Yeah that's better! Great!
>
> It is interesting that this is the second time Deutsch-Schorr-Waite is
> removed in SM. The first time was when the delayed marking replaced it for
> object marking. The reason was that it was not possible to generalize to
> non-standard objects. Now the patch does that for ropes to make marking
> thread-safe. What makes DSW so bad I wonder?..
Most of the papers that report results for the DSW are fairly old. They measure the GC time in minutes... I guess it's time for academia to revisit old algorithms and re-measure the impact with modern architectures.
BTW: This patch is a 50% marking time speedup for ropes! We reduce the marking time from 5.5ms to around 2.5ms for each GC in testBug614653.js
| Assignee | ||
Comment 21•15 years ago
|
||
Attachment #535119 -
Attachment is obsolete: true
Comment 23•15 years ago
|
||
Sweet! I wish I had known that there was all this existing delayed-marking machinery when I dag-ified ropes :)
| Assignee | ||
Comment 24•15 years ago
|
||
Comment on attachment 535237 [details] [diff] [review]
patch
re-applying Igors r+
Attachment #535237 -
Flags: review+
| Assignee | ||
Comment 25•15 years ago
|
||
Whiteboard: fixed-in-tracemonkey
Comment 26•14 years ago
|
||
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•