Closed Bug 1576776 Opened 3 months ago Closed 3 months ago

Add Debugger.onNativeCall hook

Categories

(Core :: JavaScript Engine, task, P1)

task

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: bhackett, Assigned: bhackett)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [debugger-mvp])

Attachments

(3 files)

In order to support evaluating scripts with guarantees that no externally-visible side effects can occur, we need a Debugger trap which is called when a debuggee script make a native call. See bug 1561424 comment 4 for more about the rationale behind this.

Priority: -- → P1

This is an interesting project, but I'm worried about the following:

  1. I've been fighting code adding overhead for all of our users for a feature used by 0.001% of them. It looks like this bug will add another instance on the native function call path. I know it's "just" a test + branch per native call, but still. These things do tend to add up over time (death by a thousand cuts is a real concern), will slowly add complexity and will constrain JIT optimizations we can do in the future.

  2. In debuggee frames we will now no longer optimize any native calls/setters?

Do we know how V8 implements this?

Flags: needinfo?(jimb)
Flags: needinfo?(bhackett1024)

One suggestion is to run these "eager evaluations" always in the C++ interpreter (with a new Debugger API for that) so we don't need any JIT changes. There could be a "stay out of JITs" JSContext flag if needed. For simple expressions typed in the console that should be perfectly fine. WDYT?

(In reply to Jan de Mooij [:jandem] from comment #4)

One suggestion is to run these "eager evaluations" always in the C++ interpreter (with a new Debugger API for that) so we don't need any JIT changes. There could be a "stay out of JITs" JSContext flag if needed. For simple expressions typed in the console that should be perfectly fine. WDYT?

Sure, that sounds good. We definitely don't need these expressions to run in the JITs.

It also seemed to me like we could optimize away the isDebuggee() tests in caches attached to baseline scripts compiled without debug instrumentation, but I don't know the architecture of the JITs well enough to know if that's feasible.

Flags: needinfo?(jimb)
Flags: needinfo?(bhackett1024)
Attached patch patchSplinter Review

Here is a complete patch which avoids changes to the JITs per the above suggestion. Instead of a disableJITs flag or something on the context, I went with the clunkier cx->insideDebuggerEvaluationWithOnNativeCallHook. This allows the onNativeCall hook to still be stored on the Debugger (where all the other hooks are stored) and provide well defined semantics when we are dealing with mixtures of different VM evaluation modes and debuggers.

Actually, this patch needs some more revision... We can't use dbg.onEnterFrame to trap when entering self-hosted functions, so we won't catch direct side effects in the self hosted function, and any C++ natives called by self hosted code are internal to the self-hosting system and aren't useful when the debugger is doing comparisons for bug 1577007. The best way to resolve this is I think to call the onNativeCall hook when calling self-hosted functions (which is a similar treatment to how self hosted functions are handled in other parts of the VM), and suppress the onNativeCall hook when a native calls another native without entering content JS first.

(In reply to Brian Hackett (:bhackett) from comment #7)

Actually, this patch needs some more revision... We can't use dbg.onEnterFrame to trap when entering self-hosted functions, so we won't catch direct side effects in the self hosted function, and any C++ natives called by self hosted code are internal to the self-hosting system and aren't useful when the debugger is doing comparisons for bug 1577007. The best way to resolve this is I think to call the onNativeCall hook when calling self-hosted functions (which is a similar treatment to how self hosted functions are handled in other parts of the VM), and suppress the onNativeCall hook when a native calls another native without entering content JS first.

OK, I've updated the patches to call the onNativeCall hook when self hosted JS functions are called. I didn't include the latter suppressing as getting this right is complicated. A couple options going forward are either to be able to distinguish user-facing self hosted functions from internal ones (maybe there's a way to do this already but I didn't see it), or just let the devtools do its own whitelisting (which would have to be name-based rather than use the bug 1577007 stuff) when seeing onNativeCall calls for the builtins and intrinsics which can be invoked by a self hosted function.

Blocks: dbg-71
Status: NEW → ASSIGNED
Whiteboard: [debugger-mvp]
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b00926690f17
Part 1 - VM support for DebugAPI::onNativeCall, r=jandem.
https://hg.mozilla.org/integration/autoland/rev/da3d3d8b83c8
Part 2 - Add Debugger.onNativeCall hook, r=jimb.
Status: ASSIGNED → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71

Hi Brian.

Your push seems to have highly increased the failure rate in Bug 1573281.

https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&searchStr=windows%2C10%2Cx64%2Casan%2Creftests%2Ctest-windows10-64-asan%2Fopt-crashtest-e10s%2Cr%28c%29&fromchange=7a3378adb34084d5d19352d45abbaa1bee681ac8&tochange=d8c98c89a57d169e321c55d55730132bcc178301

Please take a look. I initially wanted to backout because ever since your push landed, this job is failing on every run, however i got conflicts because you had other bugs pushed after, that seem to have changes on the same files as this bug.

Flags: needinfo?(bhackett1024)

(In reply to Andreea Pavel [:apavel] from comment #11)

Hi Brian.

Your push seems to have highly increased the failure rate in Bug 1573281.

https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&searchStr=windows%2C10%2Cx64%2Casan%2Creftests%2Ctest-windows10-64-asan%2Fopt-crashtest-e10s%2Cr%28c%29&fromchange=7a3378adb34084d5d19352d45abbaa1bee681ac8&tochange=d8c98c89a57d169e321c55d55730132bcc178301

Please take a look. I initially wanted to backout because ever since your push landed, this job is failing on every run, however i got conflicts because you had other bugs pushed after, that seem to have changes on the same files as this bug.

I'll look at this later today, but the functionality added by this bug isn't even used yet so I have no idea what would be causing this failure to happen more often. The log for that failure says that something crashed but doesn't give any stack or other indication of what the problem is, which doesn't leave much to go on but I'll try selectively backing things out on try and seeing if that helps anything.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla71 → ---

So, I've done a bunch of try runs this week to figure out what's going on here, and from what I can tell this patch is blameless. Here are a couple try runs with modified versions of this patch. The first one fails, and the second one succeeds, and the only difference is that the second one has inlined the contents of a slowPathOnNativeCall() call.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4f00d8e39a41da1e890f2854ab9fde3992bca9bf
https://treeherder.mozilla.org/#/jobs?repo=try&revision=416f1a213a14fb43b32358b4165c213aef81a2b4

/* static */
ResumeMode DebugAPI::onNativeCall(JSContext* cx, const CallArgs& args,
                                  CallReason reason) {
  if (!cx->realm() || !cx->realm()->isDebuggee()) {
    return ResumeMode::Continue;
  }
  //return ResumeMode::Continue; // PASSES
  //return slowPathOnNativeCall(cx, args, reason); // FAILS
}

/* static */
ResumeMode DebugAPI::slowPathOnNativeCall(JSContext* cx, const CallArgs& args,
                                          CallReason reason) {
  return ResumeMode::Continue;
}

I don't know how to proceed here, though. This patch makes an existing intermittent failure happen more often, and the log is completely useless as to what led to the failure in the first place.

Flags: needinfo?(bhackett1024)

(In reply to Brian Hackett (:bhackett) from comment #14)

I don't know how to proceed here, though. This patch makes an existing intermittent failure happen more often, and the log is completely useless as to what led to the failure in the first place.

As an update here, I don't know how to read these logs and the failure was associated with a specific test, and apparently due to a stack overflow (the stack limit must be lower on asan builds, and I guess the extra slowPathOnNativeCall frame was enough to push it over the edge). In bug 1573281 I've disabled the test on asan builds, so this should be able to reland after I've done another try run.

Depends on: 1573281
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5a61ac1dbb00
Part 1 - VM support for DebugAPI::onNativeCall, r=jandem.
https://hg.mozilla.org/integration/autoland/rev/0765fd605a48
Part 2 - Add Debugger.onNativeCall hook, r=jimb.
Status: REOPENED → RESOLVED
Closed: 3 months ago3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
You need to log in before you can comment on or make changes to this bug.