Closed Bug 1601179 Opened 6 years ago Closed 5 years ago

[jsdbg2] Always capture async stacks when DevTools is open

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla79
Tracking Status
firefox79 --- fixed

People

(Reporter: loganfsmyth, Assigned: loganfsmyth)

References

(Blocks 1 open bug, Regressed 1 open bug)

Details

Attachments

(1 file)

Right now we have a few different places that need to decide whether the stack information will be captured. For example:

Right now it's a bit of a weird state because SpiderMonkey has cx->options().asyncStacks() which is set by the environment, but it also internally allows cx->realm()->isDebuggee() to also enable capturing (in the case of promises). This is inconsistent however because

  1. Gecko doesn't check this so behaves inconsistently compared to promises.
  2. SpiderMonkey's own AutoSetAsyncStackForNewCalls bails out inconsistently vs ShouldCaptureDebugInfo

Possibilities:

  • Declare a public API to check whether capturing a stack is recommended
  • Make JS::CaptureCurrentStack(aCx, &stack) itself bail out if capturing is disabled and the realm isn't a debuggee, with a flag to force-capture even when disabled, or something
  • Make deciding whether to capture a proper hook for the embedding to implement

I also think the temporality of things isn't well-defined. Is asyncStacks() defining whether a stack should be captured by default, or is it defining whether it is possible for SpiderMonkey to include an async stack at in output. As in, does it apply at the time the stack is captured, the time the stack is inspected, or both, and how do those cases interact with the case where a realm is a debuggee?

For instance, if you enable the debugger and it causes a promise or callback to have an associated captured stack, and then the debugger is removed from the realm, should the async data still be included in the trace? If an embedding explicitly calls JS::CaptureCurrentStack(aCx, &stack) and AutoSetAsyncStackForNewCalls, is it SpiderMonkey's responsibility to still exclude the async stack data?

Clarifying the expected behavior for these flags seems important.

Component: Internationalization → JavaScript Engine
Priority: -- → P2

(As context, there's a bunch of discussion in https://bugzilla.mozilla.org/show_bug.cgi?id=1280819 that applies to this issue as well)

Alright, here's an initial patch for this. I expect there to be thoughts on all this, nothing is set in stone, but I figured it would be a good place to start talking through the details.

In exploring this issue, there are two separate but related questions, when is a stack captured, and when is a captured stack accessed.

  1. When an async operation is triggered, should an async stack be captured? There is active code creating the callback, so we can use the executing code to decide whether to generate a stack trace.
  2. When some logic later runs to capture the current execution location (new Error().stack for instance, or the Debugger pausing and asking how we reached the place we're paused), should that include async frames as well as the normal sync ones? There is also the complicating factor that there is no guarantee that the associated async stack was put there by code that called JS::IsStackCaptureRecommended(cx), so if there is an arbitrary async stack, do we want to ignore that some of the time? Also, if the code that captured the stack did check that, the condition might have changed, like if the DevTools were closed before the operation completed.

Currently without this patch, Firefox does the following:

  1. Capture a stack for DOM callbacks only if the pref is set. Capture a stack for Promise callbacks if the pref is set or the promise is created by a debuggee.
  2. Include async entries in the trace if the pref is enabled, otherwise include no async entries in the trace.

Part 2 still means that traces never include an async part, but part 1 means that promises have async data attached to them that we could technically still show in the debugger if you inspected the object, though I don't believe we show that now.

With my current patch, this expands to:

  1. DOM and Promise operations do prefEnabled || isDebuggee via JS::IsStackCaptureRecommended(cx).
  2. Include async entries in the trace if the pref is enabled, or if the frame at the top of the synchronous frame list is a debuggee (which is maybe weird?).

Personally, I think the (1) makes plenty of sense, but I'm still curious about the second part. Another answer to that question would be

  1. Always include async entries, which would mean:
  • If you open devtools and then close it, errors thrown/stacks manually captured might still include the async entries for that async operations started when it was open.
  • If someone manually associated a stack with a call (with Cu.callFunctionWithAsyncStack for example), it will always show up.

That avoids the weirdness of the debuggee check, but it then makes the asyncStacks pref more of a alwaysCaptureStacks pref, rather than the feature toggle that it is right now. We could also have 2 prefs with asyncStacks on by default as a cutoff switch and alwaysCaptureStacks off by default.

Another topic that we'll want to validate is performance of the gecko profiler. There are comments around about not wanting to capture stack traces while the profiler is running. As far as I can tell, there isn't much to actually make that happen right now, but it does seem like something that is on people minds. The question is, if you open the devtools pane, the page becomes a debuggee. With the changes discussed above, having the devtools open will now be slightly more expensive because it stores the extra stack data, and that extra time/memory will show up in the profiler (how much I don't know). So the question for me is: Should I adjust IsStackCaptureRecommended to not recommend capturing stack data when cx->runtime()->geckoProfiler().enabled()? I'm leaning toward adding that.

Please don't change the meaning of the current pref. People have it off to disable all async stack capturing so that reasonable performance measurements can be done.
Or is the idea here to implement https://bugzilla.mozilla.org/show_bug.cgi?id=1280819#c1 so that all this async stack stuff is enabled only if devtools are enabled? That would be ok. And yes Gecko profiler wouldn't count as devtools.
But if devtools are open, then whether or not gecko profiler is running shouldn't affect to the feature. We do want to be able to profile devtools themselves.

Or is the idea here to implement https://bugzilla.mozilla.org/show_bug.cgi?id=1280819#c1 so that all this async stack stuff is enabled only if devtools are enabled? That would be ok. And yes Gecko profiler wouldn't count as devtools.

Yep, that's right. With this patch, if the devtools have loaded and attached one of SpiderMonkey's Debugger objects to the realm having its performance measured, this will cause async stacks to be captured and the time spent capturing those traces will show up in the perf charts.

But if devtools are open, then whether or not gecko profiler is running shouldn't affect to the feature. We do want to be able to profile devtools themselves.

This aligns with my current patch, but I wanted to make sure that was alright since it means that you could potentially get different perf results with the devtools closed vs open (and depending on which devtools tabs you've opened).

One gets different profiles anyhow if devtools are open, since devtools tend to slow down various things quite significantly.

(Reflecting the latest decision in the title)

Summary: [jsdbg2] We should clarify the implementation of the 'asyncStacks()' option and its interaction with debuggee realms → [jsdbg2] Always capture async stacks when DevTools is open
Assignee: nobody → loganfsmyth
Status: NEW → ASSIGNED
Attachment #9136217 - Attachment description: Bug 1601179 - Expose async stacks for debuggee realms, overriding the option. → Bug 1601179 - Enable async stacks but limit captured async stacks to debuggees.

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:loganfsmyth, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(loganfsmyth)
Pushed by loganfsmyth@gmail.com: https://hg.mozilla.org/integration/autoland/rev/550164313c4f Enable async stacks but limit captured async stacks to debuggees. r=jorendorff,smaug
Flags: needinfo?(loganfsmyth)
Pushed by loganfsmyth@gmail.com: https://hg.mozilla.org/integration/autoland/rev/ede1c973aa85 Enable async stacks but limit captured async stacks to debuggees. r=jorendorff,smaug
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79

== Change summary for alert #26211 (as of Mon, 15 Jun 2020 02:37:55 GMT) ==

Improvements:

10% tp5o_webext responsiveness windows10-64-shippable opt e10s stylo 2.05 -> 1.85
5% tp5o_webext linux64-shippable-qr opt e10s stylo 283.08 -> 268.86
4% tabswitch linux64-shippable-qr opt e10s stylo 7.67 -> 7.34
3% startup_about_home_paint linux64-shippable opt e10s stylo 745.50 -> 724.67
2% startup_about_home_paint_realworld_webextensions linux64-shippable opt e10s stylo 773.79 -> 757.67
2% startup_about_home_paint_realworld_webextensions linux64-shippable opt e10s stylo 774.21 -> 758.42

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26211

Regressions: 1646017
No longer blocks: 1642034

Hi Logan!

I have a question related to this change. Two weeks ago, the async stacks available in the regular logs on try allowed me to easily identify the root cause of an unhandled promise rejection: https://bugzilla.mozilla.org/show_bug.cgi?id=1637793#c17

I noticed today that more recent failures no longer contained the async part and I guess this is related to this change?
Do you think it could make sense to enable async stacks when running tests? Given that the pref is read from C++ code, I'm not sure how to get the information however.

Flags: needinfo?(loganfsmyth)

I disabled it because we wanted to make all of the environments consistent, which is why we turned off the always-capture behavior for Nightly. I think having it on for non-perf tests makes sense to me.

Given that the pref is read from C++ code, I'm not sure how to get the information however.

I'm not very familiar with the pref system, does toggling the pref in the usual way for tests not work? I've been able to toggle it in about:config and I think I've seen the behavior change properly.

Flags: needinfo?(loganfsmyth)
Regressions: 1672834
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: