Truncate the stack more aggressively in adoptAsyncStack

RESOLVED FIXED in Firefox 44

Status

()

Core
JavaScript Engine
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: fitzgen, Assigned: fitzgen)

Tracking

(Blocks: 2 bugs)

unspecified
mozilla44
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox44 fixed)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment, 1 obsolete attachment)

where N = ~10.

Adopting is what is expensive for async stacks. 10 async frames seems like more than plenty async context.
Assignee: nobody → nfitzgerald
See Also: → bug 1152893
Created attachment 8630090 [details] [diff] [review]
Speed up SavedStacks::adoptAsyncStack by limiting the number async frames we capture
Attachment #8630090 - Flags: review?(paolo.mozmail)
Blocks: 1152893

Comment 2

3 years ago
Comment on attachment 8630090 [details] [diff] [review]
Speed up SavedStacks::adoptAsyncStack by limiting the number async frames we capture

Review of attachment 8630090 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/SavedStacks.cpp
@@ +42,5 @@
>  
>  /**
>   * Maximum number of saved frames returned for an async stack.
>   */
> +const unsigned ASYNC_STACK_MAX_FRAME_COUNT = 10;

So this is doing what I suggested in bug 1152893 comment 5, where I was actually referring only to limiting the total number of async stack frames, and I see that in your reply in bug 1152893 comment 6 you though I was referring to the total number of captured frames.

Since performance should be inversely proportional to the square of this number, what if we try 20 frames as a start?

@@ +1007,5 @@
>  
> +    // The maximum number of async frames we save is bounded by
> +    // ASYNC_STACK_MAX_FRAME_COUNT because copying stack frames and allocating a
> +    // bunch of JSObjects is relatively slow.
> +    maxFrameCount = std::min(maxFrameCount, ASYNC_STACK_MAX_FRAME_COUNT);

This would apply only when maxFrameCount is not zero. I guess this isn't the case for the instances where we've observed the regressions, so this is not needed?
Attachment #8630090 - Flags: review?(paolo.mozmail)
(In reply to :Paolo Amadini from comment #2)
> ::: js/src/vm/SavedStacks.cpp
> @@ +42,5 @@
> >  
> >  /**
> >   * Maximum number of saved frames returned for an async stack.
> >   */
> > +const unsigned ASYNC_STACK_MAX_FRAME_COUNT = 10;
> 
> So this is doing what I suggested in bug 1152893 comment 5, where I was
> actually referring only to limiting the total number of async stack frames,
> and I see that in your reply in bug 1152893 comment 6 you though I was
> referring to the total number of captured frames.
> 
> Since performance should be inversely proportional to the square of this
> number, what if we try 20 frames as a start?

Where are you getting O(n^2) from? This should be O(n) as we aren't doing any nested iterations or anything: one O(n) pass with FrameIter to create the lookups, and then after that a second O(n) pass to get or create the objects.

Am I missing something?

> @@ +1007,5 @@
> >  
> > +    // The maximum number of async frames we save is bounded by
> > +    // ASYNC_STACK_MAX_FRAME_COUNT because copying stack frames and allocating a
> > +    // bunch of JSObjects is relatively slow.
> > +    maxFrameCount = std::min(maxFrameCount, ASYNC_STACK_MAX_FRAME_COUNT);
> 
> This would apply only when maxFrameCount is not zero. I guess this isn't the
> case for the instances where we've observed the regressions, so this is not
> needed?

Yes this would only apply when maxFrameCount != 0 because the maxFrameCount == 0 case is handled above this line. I can make this addition an else branch if you think that is more legible.

Some callers do provide maxFrameCount and others do not. Errors do, but promises do not. It may be worth changing the default maxFrameCount or making more callers provide an upper limit.

Comment 4

3 years ago
(In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #3)
> Where are you getting O(n^2) from? This should be O(n) as we aren't doing
> any nested iterations or anything: one O(n) pass with FrameIter to create
> the lookups, and then after that a second O(n) pass to get or create the
> objects.

Hm, no, you're right, that's actually O(n). In the iterative case under scrutiny, that amounts to N*M frame copies where M is the number of iterations we do past the first N (where we don't have to rewrite the stack).

So, my proposal would be, what if we had two thresholds, N1 and N2=N1*2, and we truncated to N1 async frames every time there were N2 or more on the async stack? This should be roughly only M frame copies (independently from the value of N1) and can be most effective against the performance issues we've noticed.

We could try N1=20 and N2=40 as a start. (For consumers specifying a maxFrameCount, we'll want that to be higher than that value plus the number of sync frames on the stack.)

The optimization of storing the number of parent frames will come in handy then (although I think copying rather than walking is the expensive part).

> Some callers do provide maxFrameCount and others do not. Errors do, but
> promises do not. It may be worth changing the default maxFrameCount or
> making more callers provide an upper limit.

Is there a reason why some callers specify their own custom limit and others don't?

Comment 5

3 years ago
Nick, I've implemented the optimization from comment 4, will post a patch soon.

This increases the performance of the microbenchmark in bug 1152893 significantly.

If we had bug 1180980, we could optimize this much more. Do you think we should do that?

And now that I think about it, all the SavedFrame objects in a given chain should be same compartment even if they have different principals (it's the FrameIter that may cross compartments) so the compartment check can be moved out of the loop.

Comment 6

3 years ago
Created attachment 8657181 [details]
MozReview Request: Bug 1177508 - Truncate the stack more aggressively in adoptAsyncStack. r=fitzgen

Bug 1177508 - Truncate the stack more aggressively in adoptAsyncStack. r=fitzgen
Attachment #8657181 - Flags: review?(nfitzgerald)
Comment on attachment 8657181 [details]
MozReview Request: Bug 1177508 - Truncate the stack more aggressively in adoptAsyncStack. r=fitzgen

https://reviewboard.mozilla.org/r/18329/#review16439

LGTM, do you have benchmark numbers?

::: js/src/vm/SavedStacks.cpp:208
(Diff revision 1)
> -    typedef Vector<Lookup, 20> LookupVector;
> +    typedef Vector<Lookup, 60> LookupVector;

Instead of using the magic number 60, lets use ASYNC_STACK_MAX_FRAME_COUNT.

::: js/src/vm/SavedStacks.cpp:1144
(Diff revision 1)
> -        maxFrameCount = ASYNC_STACK_MAX_FRAME_COUNT;
> +                                           : ASYNC_STACK_MAX_FRAME_COUNT;

Style nit: This should all live on the same line.

::: js/src/vm/SavedStacks.cpp:1155
(Diff revision 1)
> +        }

Style nit: no {} on single line ifs in spidermonkey

::: js/src/vm/SavedStacks.cpp:1172
(Diff revision 1)
> +    size_t oldestFrameIndex = stackChain->length();

This is not the oldest frame's index, but the oldest frame's index + 1. Let's remove the + 1, keep the variable name, and make the look that assigns parent pointers below check `i >= 0` rather than `i != 0` and use `i` rather than `i-1` in that loop body.
Attachment #8657181 - Flags: review?(nfitzgerald) → review+

Comment 8

2 years ago
(In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #7)
> LGTM, do you have benchmark numbers?

Nothing formal, we may ask Ben in bug 1152893 to test more when this lands.

> ::: js/src/vm/SavedStacks.cpp:1172
> (Diff revision 1)
> > +    size_t oldestFrameIndex = stackChain->length();
> 
> This is not the oldest frame's index, but the oldest frame's index + 1.
> Let's remove the + 1, keep the variable name, and make the look that assigns
> parent pointers below check `i >= 0` rather than `i != 0` and use `i` rather
> than `i-1` in that loop body.

I just noticed size_t is unsigned while trying to do this. I've simply renamed the variable to "oldestFramePosition" and added a comment saying it's a 1-based index, because this looks much better than all the forced type conversions and +1/-1 I would have had to do otherwise.

I've also simplified the compartment logic as previously noted. I'll post an updated patch, let me know if you have any notes. If nothing comes up I'll land it soon.

Updated

2 years ago
Summary: Only adopt min(N, maxFrameCount) in adoptAsyncStack → Truncate the stack more aggressively in adoptAsyncStack

Updated

2 years ago
Attachment #8630090 - Attachment is obsolete: true

Comment 9

2 years ago
Comment on attachment 8657181 [details]
MozReview Request: Bug 1177508 - Truncate the stack more aggressively in adoptAsyncStack. r=fitzgen

Bug 1177508 - Truncate the stack more aggressively in adoptAsyncStack. r=fitzgen
https://hg.mozilla.org/mozilla-central/rev/516efde7cfb0
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox44: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
You need to log in before you can comment on or make changes to this bug.