Closed Bug 1554524 Opened 1 year ago Closed 1 year ago

Add Debugger interface for instrumenting new scripts

Categories: Core :: JavaScript Engine, enhancement, P3

Status: RESOLVED FIXED in mozilla70 (firefox70 fixed)

People: (Reporter: bhackett1024, Assigned: bhackett1024)

References: (Blocks 1 open bug)

Attachments: (8 files, 12 obsolete files)

Right now, determining information about the scripts executing in a runtime can be done in two ways:

  • Modifying the VM and JITs to keep track of the information. This is done for code coverage and incrementing the record/replay execution progress counter (and maybe other things). While this technique is arbitrarily powerful, it is also hard to get right given the huge complexity and ever-changing nature of the VM and JITs (for example, the code coverage analysis ignores hits that occur in Ion code AFAICT), and each new analysis only increases this complexity.

  • Using the debugger API to set breakpoints at each site of interest in the running scripts, and using the breakpoint handler to inspect runtime state. This is very powerful and requires no changes to the VM or JITs, but it's also extremely slow. Web replay currently uses this technique to tabulate the hits on breakpoint sites so that they can be searched for later, but the resulting slowdown is pretty huge. In a simple script (see example code below), setting a breakpoint at a statement which is never hit causes the code to run 45 times slower, or 5 times slower if Ion is disabled. When the breakpoints are hit, a lot of additional overhead is incurred constructing Debugger.Frames and so forth so that the execution state can be inspected.

It would be nice to have a mechanism that can be used to perform a wide variety of analyses on the scripts executing in a runtime, without needing to modify the VM or JITs and without incurring much overhead, i.e. baseline and ion should still run normally.

One way of accomplishing this is to embed the analysis logic in the script's bytecode itself. The attached patch adds a Debugger API for rewriting a script with new instructions inserted at specific places in its bytecode. Only a small set of instructions is supported --- GETGNAME, INT32, CALL, POP, some others (more will be needed in the future, probably). This API can be called in the debugger's onNewScript hook, which is invoked before the script has started executing. The intent here is to not change the overall semantics of the script --- the same original instructions should still execute in the same order --- but the presence of the new instructions does change the layout of the script's bytecode, and a bunch of grubby code is needed to replace references to offsets in the original script with the corresponding offsets in the instrumented script. This is all pretty experimental, and it is possible for users of the Debugger API to do unsupported things like modifying a script after it has started executing, or creating arbitrary edges in the CFG, in which case the process will probably crash.

The three analyses referred to above could be implemented using this technique as follows:

  • For code coverage, instrumentation can be inserted at the start of each basic block to increment a counter stored in a normal JS array associated with the script. Trickier bits like dealing with thrown exceptions could be handled using the (low-overhead, I think) onExceptionUnwind debugger hook.

  • For the execution progress counter, a call to a native function that bumps the counter could be inserted at the start of each tracked script and loop head within the script. Maybe a dedicated JSOp for this would be better, but that would still end up being simpler to implement in the VM/JITs than the current progress counter logic.

  • Tabulating breakpoint hits can be done by inserting a call to a native function at each breakpoint site that performs the logic to save information about the hit.
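The first of these ideas can be illustrated with hand-inserted instrumentation in plain JS. This is only a sketch of the intended effect; the counters array and block numbering are illustrative, and the real patch would insert equivalent bytecode (GETGNAME, INT32, etc.) rather than source-level statements:

```javascript
// Hypothetical per-script counter array, with an increment at the
// start of each basic block.
const counters = [0, 0, 0];

function instrumentedFoo(n) {
  counters[0]++;            // block 0: function entry
  let j = 0;
  for (let i = 0; i < 3; i++) {
    counters[1]++;          // block 1: loop body
    j++;
  }
  if (n) {
    counters[2]++;          // block 2: taken branch
  }
  return j;
}

instrumentedFoo(0);
// counters is now [1, 3, 0]: block 2 is never hit when n is falsy,
// which is exactly the per-block hit information coverage needs.
```

As noted above, thrown exceptions would skip some counter updates, which is why the onExceptionUnwind hook would be needed to patch up the counts.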

For now I'm only really interested in using this API for the last one of these, though, along with some future as-yet-unwritten analyses. It would be nice to fix the first two as well, but since those analyses already exist and work well, rewriting them might not be a great use of time.

This patch is mostly done: it works on a simple example and should work on more complex stuff, but a lot of testing (and test writing) is needed.

Example script referenced above, which is much slower when a breakpoint is set on the (never-executed) print statement.

function foo(n) {
  var j = 0;
  for (var i = 0; i < 10000; i++) {
    j++;
  }
  if (n) {
    print('hello');
  }
}
for (var i = 1; i < 10000; i++) {
  foo();
}
Attached patch WIP (obsolete) — Splinter Review
Assignee: nobody → bhackett1024

We should discuss this in Whistler. JimB is already organizing a meeting to discuss debugger API simplification and we probably should defer this proposal to at least that point.

Unfortunately the timing for landing something like this isn't great. Several spidermonkey projects in flight would likely be derailed by this as we continue to evolve the script representation (Bug 1535138); merge LazyScripts (Bug 1529456); pin bytecode data across process for fission (Bug ???); support js field proposal (Bug 1499448); decouple parser/BCE from the GC (Bug ???). New surprises are going to be difficult to manage in the FF69 time frame.

Let's brainstorm at Whistler. This is an interesting approach to handling instrumentation, but I'm concerned the "grubby code" may have an out-sized impact on the spidermonkey team's work and this will need some planning ahead.

(In reply to Ted Campbell [:tcampbell] from comment #2)

> We should discuss this in Whistler. JimB is already organizing a meeting to discuss debugger API simplification and we probably should defer this proposal to at least that point.
>
> Unfortunately the timing for landing something like this isn't great. Several spidermonkey projects in flight would likely be derailed by this as we continue to evolve the script representation (Bug 1535138); merge LazyScripts (Bug 1529456); pin bytecode data across process for fission (Bug ???); support js field proposal (Bug 1499448); decouple parser/BCE from the GC (Bug ???). New surprises are going to be difficult to manage in the FF69 time frame.
>
> Let's brainstorm at Whistler. This is an interesting approach to handling instrumentation, but I'm concerned the "grubby code" may have an out-sized impact on the spidermonkey team's work and this will need some planning ahead.

I don't need to land this right away, but it would be nice if it wasn't delayed indefinitely. Efficient software development requires letting people work on things in parallel. To minimize impacts on other projects, would it be OK to land this without any tier 1 tests? If other changes break it, they don't get backed out, and I go in and fix things. This is the model that's in use for Web Replay itself, which is not on any roadmap and has no tier 1 tests.

Otherwise, I don't expect this to have much impact on other projects. What I like about this patch is that its logic is self contained, with a new interface in Debugger / JSScript but no changes elsewhere besides some minor refactoring to avoid duplicating logic.

I won't be in Whistler.

Also, for anyone concerned, this work was done while I was on PTO / weekends, and is separate from my 'official' work at Mozilla. If there is a problem with me working on this in the first place, please talk to me directly.

Attached patch WIP (obsolete) — Splinter Review

This patch works pretty well when testing in the shell and has some tests. I'm going to table this until it's been integrated with Web Replay and working well in that context, since more changes to the interface here will be needed (i.e. providing a mapping from offsets in the instrumented script to offsets in the original script).

Attachment #9067643 - Attachment is obsolete: true
Type: task → enhancement
Priority: -- → P3

> I don't need to land this right away, but it would be nice if it wasn't delayed indefinitely. Efficient software development requires letting people work on things in parallel.

Of course, but enabling parallel work doesn't imply landing that work in Mozilla Central. Maintaining changes in a separate repo and merging changes from the main line is a perfectly legitimate form of parallel work. That would be the preferred way to manage changes that are not part of the product, and not covered by tests.

So, maybe we just need to find a way to make this part of the product. Several (all?) of the use cases involve calling native handler functions at selected points in the code in a way that is friendly to Ion. Would it be possible to generalize Debugger's breakpoint mechanism into something that is usable from C++, can call functions other than the usual breakpoint handler to serve things like Web Replay, coverage, etc., and is supported even by Ion, using bailouts for difficult cases?

Then this could provide fast Debugger breakpoints (breakpoints set on statements never executed wouldn't keep code out of Ion), and the only part that would not be tested and used in the product would be the non-Debugger-related handler functions for the other applications.

(In reply to Jim Blandy :jimb from comment #6)

> Of course, but enabling parallel work doesn't imply landing that work in Mozilla Central. Maintaining changes in a separate repo and merging changes from the main line is a perfectly legitimate form of parallel work. That would be the preferred way to manage changes that are not part of the product, and not covered by tests.

Well, in this case the change is supporting a feature that is part of the product, albeit an experimental nightly-only one. Not landing it would leave downstream web replay changes in the lurch; we would have an old and slow version in the tree, and a newer and faster one somewhere else that is only occasionally synced with the tree. This would be a weird and unfortunate situation, not least because web replay (and this feature) does have tests; they just aren't tier 1. Making the tests tier 2 has worked out well, as other work isn't slowed down by web replay's presence in the tree, and changesets that break it can be immediately identified. The latter benefit is lost when working on things out of the tree.

> So, maybe we just need to find a way to make this part of the product. Several (all?) of the use cases involve calling native handler functions at selected points in the code in a way that is friendly to Ion. Would it be possible to generalize Debugger's breakpoint mechanism into something that is usable from C++, can call functions other than the usual breakpoint handler to serve things like Web Replay, coverage, etc., and is supported even by Ion, using bailouts for difficult cases?

> Then this could provide fast Debugger breakpoints (breakpoints set on statements never executed wouldn't keep code out of Ion), and the only part that would not be tested and used in the product would be the non-Debugger-related handler functions for the other applications.

I think making things Ion friendly is only part of the performance problem which this patch is trying to address. The example in comment 0 is not representative of real websites, which can have tons of code and spend much (most?) of their time in the interpreter or baseline. Baseline scripts are only compiled once, so they need to be able to dynamically detect whether there are breakpoint handlers installed on each visited PC, syncing the stack all the while, and leading to a 5x slowdown (which could maybe be improved, but I don't think it would be all that easy).

For the analyses described in comment 0, the instrumentation needed is fixed, so the script bytecode can be modified when it is created and left that way forever. This allows the interpreter, baseline, and Ion to all work as is without having to worry about the script's behavior dynamically changing.

Additionally, I don't think that breakpoint handlers will be a sufficiently powerful tool, either for the analyses needed immediately or for the ones planned for the future. Instrumentation is extremely versatile, and the analyses planned for web replay (see the dynamic analysis section in https://developer.mozilla.org/en-US/docs/Mozilla/Projects/WebReplayRoadmap) can, I think, all be implemented using it.

In general, I think pulling these instrumentation techniques earlier, out of the JITs and into the bytecode, is a step in the right direction. Being able to observe the behavior of code in a flexible way without adding even more complexity to the JITs is a great idea.

I'm much less enthusiastic about adding unsound APIs to the Debugger API. And it doesn't seem like that degree of flexibility is actually needed. Surely there's a design that is more limited than patching bytecode at arbitrary locations while still being able to accomplish the sorts of instrumentation needed.

(In reply to Jim Blandy :jimb from comment #8)

> I'm much less enthusiastic about adding unsound APIs to the Debugger API. And it doesn't seem like that degree of flexibility is actually needed. Surely there's a design that is more limited than patching bytecode at arbitrary locations while still being able to accomplish the sorts of instrumentation needed.

I agree, letting users of the debugger API break the debuggee, cause crashes, etc. would be really bad --- all debugger methods (and everything else, really) should be able to stand up to the fuzzers. All current and future use cases can be supported by a design that adds instrumentation but constrains it so that the resulting bytecode is guaranteed to be valid. This could be done either by specifying bytecode instructions in the instrumentation and validating them in the API, or by specifying higher level operations in the instrumentation and converting/validating them to bytecode in the debugger. I kind of prefer the former but it's not a strong preference.

I'm also concerned about having the debugger rely on JS VM internals. For example, right now splitting JSOP_FOO into JSOP_FOO1 and JSOP_FOO2 is a matter of changing the emitter and Opcodes.h, followed by fixing all compilation errors, and then it should Just Work. If we have to worry about updating random devtools code/tests, that will add complexity for us. I'm sure this will be easy to handle initially, but over time there will be layers of abstraction on top of it that could make it a real pain.

I think adding extra opcodes when the bytecode is generated is the best way to do any instrumentation. I previously suggested the same approach to do taint-analysis offline.

While I do not think publicly exposing this instrumentation through the Debugger API is wise, I think this is a good approach for implementing many debugger features, such as breakpoints, tracepoints, and maybe watchpoints.

Nicolas, could you expand on how you could see breakpoints being implemented with this approach?

What do you mean by tracepoint? I think we refer to these in the UI as logpoints. I'd like to make these as fast as possible.

(In reply to Jan de Mooij [:jandem] from comment #10)

> I'm also concerned about having the debugger rely on JS VM internals. For example, right now splitting JSOP_FOO into JSOP_FOO1 and JSOP_FOO2 is a matter of changing the emitter and Opcodes.h, followed by fixing all compilation errors, and then it should Just Work. If we have to worry about updating random devtools code/tests, that will add complexity for us. I'm sure this will be easy to handle initially, but over time there will be layers of abstraction on top of it that could make it a real pain.

OK, that's fair. Right now the patch preserves this property, because only a limited set of opcodes is supported, and these are all filtered through the tests in DebuggerScript_addInstrumentation in Debugger.cpp. Down the line things could get awkward with an opcode based approach for specifying instrumentation, though. For example, right now when specifying instrumentation, users can supply a value like '{ opcode: "GETGNAME", name: "foobar" }' to generate a bytecode instruction that loads the "foobar" global name and pushes it onto the stack. If the bytecode emitted to load a global name changes in the future so that, say, multiple instructions are required, Debugger.cpp would be changed to emit that new bytecode sequence and the devtools code using the API wouldn't need to be changed. This would leave things so that one '{ opcode }' value maps to multiple bytecode instructions, which would be confusing.

Alternatively, if the instrumentation was specified in a slightly different way, like '{ type: "loadGlobalName", name: "foobar" }' then there is no reason to expect it to map to a single bytecode, and the VM internals are hidden away by the interface. Instrumenting a call to a global with the value '42' would look something like:

[
  { type: "loadGlobalName", name: "foobar" },
  { type: "value", value: undefined },
  { type: "value", value: 42 },
  { type: "call", numArguments: 1 },
  { type: "pop" },
]
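To make the intended semantics of such an operation list concrete, here is a small stand-alone interpreter for specs of this shape, using a plain object as a stand-in for the real global. This is a sketch of the idea only, not code from the patch:

```javascript
// Evaluate a flat, stack-based instrumentation spec against a mock global.
function runOps(ops, mockGlobal) {
  const stack = [];
  for (const op of ops) {
    switch (op.type) {
      case "loadGlobalName":
        stack.push(mockGlobal[op.name]);
        break;
      case "value":
        stack.push(op.value);
        break;
      case "call": {
        // The callee, thisv, and arguments were pushed in that order,
        // so pop them off in reverse.
        const args = stack.splice(stack.length - op.numArguments);
        const thisv = stack.pop();
        const callee = stack.pop();
        stack.push(callee.apply(thisv, args));
        break;
      }
      case "pop":
        stack.pop();
        break;
    }
  }
  return stack; // must be empty for balanced instrumentation
}

const seen = [];
const leftover = runOps(
  [
    { type: "loadGlobalName", name: "foobar" },
    { type: "value", value: undefined },
    { type: "value", value: 42 },
    { type: "call", numArguments: 1 },
    { type: "pop" },
  ],
  { foobar: x => seen.push(x) },
);
// seen is [42] and leftover is [] (the final "pop" rebalances the stack)
```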

This still exposes the concept of an operand stack to users of the Debugger API, though. That's a pretty fundamental part of the VM in comparison with the bytecode layout, but it could also be hidden away if the instrumentation was specified as a compound object like:

{
  type: "call",
  callee: { type: "loadGlobalName", name: "foobar" },
  thisv: { type: "value", value: undefined },
  arguments: [ { type: "value", value: 42 } ],
}

This would be a little trickier to convert to bytecode in Debugger.cpp, but it would hide all details of what's happening in the VM.

Blocks: 1556813

These patches implement the first idea in comment 13. Instrumentation is specified as a series of operations that are separate from the underlying opcodes (but usually map directly to a bytecode instruction). These are expressed in terms of the operand stack, but constraints are in place to keep the possible behaviors in check:

  • instrumentation can't modify the contents of the existing operand stack, and any values it pushes will be popped before going back to the original script's instructions.

  • only certain offsets can be instrumented. For now this is entry/exit offsets and breakpoint offsets, though this will probably need to expand in the future. The main thing here is to be careful when dealing with bytecodes that are involved in control flow.

  • instrumentation can't change the script's control flow, except by introducing branches at the main offset that go to cloned copies of the script's contents that can be instrumented differently. These are designed so that they will look like the bytecode for an 'if' statement when doing analysis of the script's control flow (e.g. for IonControlFlow.cpp).

  • scripts can't be instrumented after they've been used (executed or JIT compiled), though I'm sure the fuzzers will turn up more cases like this; there is a script->setUninstrumentable() method to make these easy to deal with.

This constrained API, I'm much more comfortable with. And the general goal, of turning instrumentation into something that doesn't complicate the JITs and indeed benefits from JIT improvements in the compilation of ordinary code, seems really good.

However, I think the concerns Ted raised in comment 2 about coordinating this with other work do need to be addressed. Putting it off "indefinitely" is not what's being discussed. It's just that landing things blindly is also bad for parallel development. I'll get that started (in other channels).

If I had my druthers, the stackless, more expression-like API from comment 13 would be my first choice.

(In reply to Jim Blandy :jimb from comment #20)

> If I had my druthers, the stackless, more expression-like API from comment 13 would be my first choice.

Yeah, looking again this does seem better. It's easier to specify and read the instrumentation, and it eliminates several problems that the debugger needs to check for, like instrumentation that leaves the stack unbalanced. I just updated the patch to use this strategy to specify the instrumentation. To keep the grammar simple we distinguish between top-level operations and the operands to those operations, so for making a call we end up with, for example:

{
  kind: "expression",
  value: {
    kind: "call",
    callee: { kind: "globalName", name: "callback" },
    this: { kind: "value", value: undefined },
    arguments: [{ kind: "value", value: 42 }],
  },
}
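A tiny recursive evaluator shows how this stackless grammar maps onto behavior. The evaluator itself is illustrative (not part of the patch), and the only node kinds handled are the ones appearing in the example above:

```javascript
// Evaluate the expression-style instrumentation grammar against a mock
// global object (sketch only): operands are evaluated recursively, so
// there is no operand stack for the spec author to balance.
function evalNode(node, mockGlobal) {
  switch (node.kind) {
    case "expression":
      return evalNode(node.value, mockGlobal);
    case "call": {
      const callee = evalNode(node.callee, mockGlobal);
      const thisv = evalNode(node.this, mockGlobal);
      const args = node.arguments.map(a => evalNode(a, mockGlobal));
      return callee.apply(thisv, args);
    }
    case "globalName":
      return mockGlobal[node.name];
    case "value":
      return node.value;
  }
}

const calls = [];
evalNode(
  {
    kind: "expression",
    value: {
      kind: "call",
      callee: { kind: "globalName", name: "callback" },
      this: { kind: "value", value: undefined },
      arguments: [{ kind: "value", value: 42 }],
    },
  },
  { callback: x => calls.push(x) },
);
// calls is now [42]
```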

My general feeling would be a preference for instrumentation to be more of a pipeline step at the end of the frontend, before allocating the JSScript. It would be nice if we could indicate with a flag on the CompileOptions that a certain instrumentation transform has been applied.

The replaceScriptData type functions add yet another copy of behaviour to track on top of the BCE/XDR/CopyScript paths and we have been trying to standardize this to fix weird bugs. If we could update the bytecode before the JSScript is allocated, I'd be much more comfortable. Unfortunately these patches have bit-rotted due to the current and ongoing work of Bug 1535138 ([meta] Revamp JSScript). I'd like to avoid the guts of a JSScript changing after being created. There have been some issues with weird indeterminate states being observed due to the various things that still use CellIter<JSScript>.

One question for my understanding is: Does this toggle at runtime (at least for a minimal WebRR MVP)? Even making the decision of whether something is instrumented a realm-creation option would go a long way toward being able to reason about things and would make fuzzing easier.

(I think various future planned projects around revisiting relazification and making changes to the frontend will make this type of instrumentation easier to support and more consistently understood by different parts of the engine :)

The main problem with performing instrumentation before the script has been created is that users installing instrumentation will not know what instrumentation to install and where to do it. The script has to be inspected to find the offsets that need to be instrumented or to see what instrumentation is needed (such as whether the script is async/generator), and the best way of doing that is via the existing Debugger.Script interface, which operates on completed scripts. Anything else would require a substantial duplication of logic.

An alternative strategy would be for the addInstrumentation() API to generate a new JSScript with the instrumented contents, and replace references to the old script with the new script. This is more complicated to get right, as references in various places (LazyScript, JSFunction, debugger state) would all need to be updated, and the old JSScript would be left in a state where its function does not refer back to it. This seems like it would be more challenging for users of CellIter to deal with. When modifying the script in place, there is nothing weird left around for other code to find, and modifying the script in place does not allow intermediate states to be observed: after the script starts being modified remaining operations are infallible (or crash-on-OOM).

This seems a separate issue from needing a bunch of unpleasant logic to assemble the bytecode and source notes for the instrumented script, though. I can try factoring more logic out of BytecodeEmitter and into a ScriptAssembler class, which can be used to keep track of everything which feeds into the final script and encapsulates the raw bytecode and source note operations.

I don't know what you mean by 'toggle at runtime', but for current and all future use cases the presence of instrumentation will be invariant across a realm, and we'll know when the realm is created whether instrumentation might be present. I do not understand why this would make reasoning or fuzzing easier, though. The idea here is that instrumented scripts should be indistinguishable from uninstrumented scripts throughout the VM, so that the rest of the VM doesn't need to know about instrumentation.

Attached patch patch (obsolete) — Splinter Review

Updated rolled-up patch. This factors out the data and logic from BytecodeEmitter which is needed to emit the final script, and puts this in a new ScriptAssembler class which BytecodeEmitter inherits from. When instrumenting a script, another ScriptAssembler is used, so that no special handling is needed to construct the instrumented script's private or shared data and so that the code emitting instructions and source notes is tidier. The net result is a reduction of the script instrumentation logic from ~600 LOC to ~400 LOC, with little extra complexity added elsewhere and a nicer division of functionality in the frontend. On the latter front, more logic could be moved from BytecodeEmitter into ScriptAssembler; I just did the minimum necessary for the job here. I also moved the script instrumentation logic in JSScript.cpp into its own file to keep things better separated.

Attachment #9067867 - Attachment is obsolete: true
Attachment #9072045 - Attachment is obsolete: true
Attachment #9072046 - Attachment is obsolete: true

I should probably say a little more here about why instrumentation is such an exciting feature for me. It transforms spidermonkey into an extremely powerful and (relatively) easy to use engine for doing dynamic analysis of a web page's JS behavior. One example that I started looking into yesterday is analyzing the level of polymorphism of property accesses on a page. The difference between a monomorphic and a megamorphic property access can be a 15x performance difference with Ion-compiled code, and yet the JS can look exactly the same; something innocuous like the code below can really mess up performance.

function thing() {
  this.a = 0;
  if (...) {
    this.b = 1;
  }
  this.c = ...;
   ...
}

If we had a way to surface information like this for developers it would be good, and it would be especially good if we could provide a detailed analysis of the site and help developers with fixing these performance faults and making their pages faster. Doing such a detailed analysis --- accumulating all the shapes that feed into the site --- currently requires a lot of engine instrumentation, but with this technique it could be done almost entirely in JS, with a little C++ to expose the right information in the Debugger interface. This sort of analysis would have a ton of overhead and make the page pretty unusable though, which is where web replay comes in. We can record a page's behavior with minimal overhead, and then use instrumentation to run analyses while replaying that were not in place while recording, without having to worry about overhead. We can also keep track of the execution points where different shapes appear so that we can warp users to the exact points where accesses with different shapes occur. Once we are able to keep track of an object's history over time (future work) we could show a timeline for each of these accesses, with the points in the execution where the object's shape changed.
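The kind of analysis described above can be approximated in plain JS by treating an object's ordered own property keys as a stand-in for its hidden shape, and counting the distinct key-orders seen at one access site. This is illustrative only; real shapes live inside the engine, which is why the Debugger interface would need to expose them:

```javascript
// Objects built on different paths through a constructor end up with
// different property layouts ("shapes"), which is what makes the
// access site polymorphic.
function thing(flag) {
  this.a = 0;
  if (flag) {
    this.b = 1;
  }
  this.c = 2;
}

// Approximate a shape by the ordered list of own property keys, and
// accumulate the shapes that feed into one access site.
const shapesAtSite = new Set();
function readC(obj) {
  shapesAtSite.add(Object.keys(obj).join(","));
  return obj.c;
}

for (let i = 0; i < 100; i++) {
  readC(new thing(i % 2 === 0));
}
// shapesAtSite has 2 entries ("a,c" and "a,b,c"): the site is polymorphic
```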

The initial intent of this modification is to speed up WebRR by introducing fast recording of seek-able points.
One option, which would not be as generic as what is suggested here, would be to add a simple recording of seek-able points at function entries and loop back-edges. This would not block WebRR, and would be a smaller modification to get accepted.

I do agree that there is value in having a generic instrumentation mechanism, as this gives us the ability to delegate the instrumentation to people who do not necessarily have knowledge of the internals of SpiderMonkey. This also takes the maintenance burden off the JavaScript team. However, we should avoid blocking WebRR on such a project.

A more technical point reported during the meeting: We are planning to move the call to Debugger::newScript to well after the allocation of a JSScript; thus, such instrumentation should be done when JSScripts are compiled, not when a Debugger sees the compiled JSScript.

Ted and I were talking this over, and we've come up with another approach that might be worth considering.

I'll start by listing out the design constraints/preferences as we understand them:

  1. WebRR needs to be able to insert "breakpoints" into scripts.
  2. The existing debugger breakpoints are too slow, because they have to create frames, etc...
  3. The "sites of interest" should be selectable in JS code (ideally via the onNewScript callback).
  4. We don't want to over-complicate the life-cycle of scripts.
  5. Performance just has to be "good enough".

Given those constraints, here's the new proposal.

  1. We add a new JSOP_INSTRUMENTATION opcode.
  2. When building an instrumented script, we insert that opcode at every possible breakpoint (or take a set of flags that specify the kinds of breakpoints we care about, and insert it there).
  3. The effect of JSOP_INSTRUMENTATION is to look up a pointer in a new side-table. That pointer will either be null (no instrumentation at this pc) or point to a native function that handles the instrumentation. If that function exists, we would call it with the pc (or other suitable value, like table index) as an argument. (Alternatively, that callback could bounce through a native trampoline to a scripted callback, if we want to trade performance for flexibility.)
  4. The onNewScript callback can populate the side-table however it likes.
  5. If we require that all instrumentation be specified in onNewScript and never change, then we can bake the instrumentation directly into compiled code, and non-instrumented breakpoints would have zero overhead in baseline or Ion.
  6. If we want to make instrumentation mutable, then we can hardcode an address to check in Baseline, and invalidate Ion scripts when instrumentation changes. Baseline overhead would be a load and a branch; Ion overhead would still be zero.
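The side-table lookup at the heart of this proposal can be sketched in plain JS as follows. The table layout and names here are illustrative; the real opcode would do this lookup inside the VM and JITs:

```javascript
// Per-script side-table mapping bytecode offsets to handlers; a null
// entry means "no instrumentation at this pc", so the common case is a
// cheap load-and-branch (or nothing at all, if baked into JIT code).
const hits = [];
const sideTable = new Map([
  [10, pc => hits.push(pc)], // instrumented breakpoint site
  [20, null],                // possible site, not instrumented
]);

// Stand-in for executing JSOP_INSTRUMENTATION at a given pc.
function instrumentationOp(pc) {
  const handler = sideTable.get(pc);
  if (handler) {
    handler(pc);
  }
}

instrumentationOp(10);
instrumentationOp(20);
instrumentationOp(10);
// hits is now [10, 10]: only the instrumented pc reached the handler
```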

This covers all of the cases where we just want to know that we hit a certain point, without needing access to arbitrary data. Reading back through previous comments, that includes progress counters, code coverage, and (I think) breakpoint tabulation. Once we have the infrastructure in place, the existence of low-overhead breakpoints is probably useful in other cases too.

Does that work for WebRR's immediate needs? It's not quite as flexible as the grand vision of inserting arbitrary functionality, but it's simpler to implement and maintain.

(In reply to Iain Ireland [:iain] from comment #32)

> Ted and I were talking this over, and we've come up with another approach that might be worth considering.
> ...
> Does that work for WebRR's immediate needs? It's not quite as flexible as the grand vision of inserting arbitrary functionality, but it's simpler to implement and maintain.

Thanks for working your way through this. This approach looks like it will mostly satisfy web replay's immediate needs. The main thing I'm concerned about is having a plan in place for future instrumentation needs, not least because per comment 30 I've started working on those. A smaller thing is that we would need multiple instrumentation callbacks in place for a realm --- instrumentation behavior at, say, script entry points is different from that at exit points or breakpoint sites.

The description below modifies your proposal to address these. I think it will handle all of web replay's immediate and future needs. The Debugger based interface would look like:

dbg.onNewGlobal = g => {
  g.setInstrumentation(
    ["breakpoint", "entryPoint", "exitPoint"],
    ["binaryArithmetic"],
    ["getProperty", "setProperty", "getElement", "setElement"],
  );
};

g.setInstrumentationCallback(0, (scriptID, offset) => { ... });
g.setInstrumentationCallback(1, (scriptID, offset, lhs, rhs) => { ... });
g.setInstrumentationCallback(2, (scriptID, offset, object, property) => { ... });

What is going on here is that setInstrumentation() defines the instrumentation used for all scripts within a global's realm. setInstrumentation() can be called at most once per global, and JSOP_INSTRUMENTATION opcodes will be added for all future scripts compiled in the global's realm. The number of arguments to setInstrumentation is the number of instrumentation callbacks the realm has, and each argument is an array describing where to add JSOP_INSTRUMENTATION opcodes for that callback. setInstrumentationCallback() sets one of the instrumentation callbacks for a global's realm, with an index that is in the range of the original setInstrumentation() call. Callbacks can be changed, and can be null if instrumentation is inactive.
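To make the per-realm bookkeeping concrete, here is a plain-JavaScript sketch of the state this implies; the RealmInstrumentation class and its field names are illustrative stand-ins, not actual SpiderMonkey types:

```javascript
// Hypothetical model of a realm's instrumentation state. setInstrumentation()
// fixes the kind lists once; callbacks can change (or be null) afterwards.
class RealmInstrumentation {
  constructor(kindLists) {
    // One entry per callback index; each entry lists the site kinds it covers.
    this.kindLists = kindLists;
    this.callbacks = new Array(kindLists.length).fill(null);
  }
  setInstrumentationCallback(index, fn) {
    // The index must be in the range of the original setInstrumentation call.
    if (index < 0 || index >= this.kindLists.length)
      throw new RangeError("callback index out of range");
    this.callbacks[index] = fn; // fn may be null when instrumentation is inactive
  }
}

const inst = new RealmInstrumentation([
  ["breakpoint", "entryPoint", "exitPoint"],
  ["binaryArithmetic"],
  ["getProperty", "setProperty", "getElement", "setElement"],
]);
inst.setInstrumentationCallback(0, (scriptID, offset) => { /* record hit */ });
```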

JSOP_INSTRUMENTATION itself would have two operands: a count of stack values to pop (like JSOP_CALL), and an index into the realm's callbacks to call. The bytecode emitter would emit JSOP_INSTRUMENTATION ops at the appropriate places, similar to how source notes are currently emitted for breakpoint sites. The stack values consumed depend on the operation being instrumented ("getProperty", "breakpoint", etc.). If we are instrumenting an 'x.f' getProperty access, say, the uninstrumented code would look like:

GETGNAME x
GETPROP f

While with instrumentation the code would look like:

GETGNAME x
DUP
STRING f
INSTRUMENTATION 2 CallbackIndex
GETPROP f

The semantics of JSOP_INSTRUMENTATION NumPopped CallbackIndex are:

  1. Pop NumPopped values off the stack.
  2. Get the CallbackIndex-numbered callback from the global, which is either null or a JSFunction.
  3. If the callback is non-null, call it with an identifier for the script (see below), the bytecode offset of the JSOP_INSTRUMENTATION, and all values which were just popped.
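The three steps above can be modeled over an explicit value stack; this sketch is just a simulation of the proposed semantics, with `callbacks`, `scriptId`, and `offset` standing in for the realm and script state:

```javascript
// Simulated semantics of JSOP_INSTRUMENTATION NumPopped CallbackIndex.
function executeInstrumentationOp(stack, numPopped, callbackIndex,
                                  callbacks, scriptId, offset) {
  // 1. Pop NumPopped values off the stack.
  const popped = stack.splice(stack.length - numPopped, numPopped);
  // 2. Get the CallbackIndex-numbered callback (null or a function).
  const cb = callbacks[callbackIndex];
  // 3. If non-null, call it with the script id, the bytecode offset of the
  //    JSOP_INSTRUMENTATION, and the values that were just popped.
  if (cb !== null)
    cb(scriptId, offset, ...popped);
}
```

Note that a null callback still pops its operands, so the stack depth at the following op is the same whether or not instrumentation is active.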

This makes some nice improvements to the current implementation:

  • Because tests for the instrumentation callbacks are built in to JSOP_INSTRUMENTATION, this should still behave efficiently when instrumentation is inactive, without any need to clone the script contents and introduce branches at the start of the script.

  • This has a better story for instrumenting sites other than breakpoints. If we want to instrument, say, all get-property accesses in new scripts, with the current patch we would either need new source notes in the script, or just scan for JSOP_GETPROP opcodes, which could pick up random things we don't want like GETPROPs emitted as part of iteration or destructuring operations. With the JSOP_INSTRUMENTATION approach, we check in the bytecode emitter whether to emit the instrumentation when we still have the parse tree and know whether the operation should be associated with instrumentation or not. (This is a similar story I think to why the bytecode emitter now emits source notes for breakpoints and debugger step targets.)

The main new wrinkle is determining the identifier to use for the script when calling an instrumentation callback. Web replay maintains an integer identifier for each script, and with the current patch we can pass this on to instrumentation functions easily by incorporating that identifier into the instrumentation instructions. With a JSOP_INSTRUMENTATION approach we won't know that ID until after the script has been emitted, so we need another way of finding it. Adding a side table on the global which maps JSScripts/LazyScripts to IDs, and adding a script.setInstrumentationId(N) debugger method, should take care of this, though.
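The side table could look something like the following sketch; the class and method names are illustrative, chosen to mirror the proposed script.setInstrumentationId(N) debugger method:

```javascript
// Hypothetical per-global side table mapping scripts to integer ids,
// with a reverse map so callbacks can look a script back up from its id.
class ScriptIdTable {
  constructor() {
    this.ids = new WeakMap();  // script object -> id (doesn't pin scripts alive)
    this.byId = new Map();     // id -> script object
  }
  setInstrumentationId(script, id) {
    this.ids.set(script, id);
    this.byId.set(id, script);
  }
  idFor(script) { return this.ids.get(script); }
  scriptFor(id) { return this.byId.get(id); }
}
```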

If these modifications look reasonable, I should be able to implement this stuff over the weekend.

Most of your suggestions seem reasonable to me. A couple questions:

  1. I was previously under the impression that you wanted to selectively annotate a subset of events within the script: for example, that you wanted to support a single watchpoint within a function, which would be otherwise uninstrumented. That's why I proposed a side-table of per-site callback pointers. Your proposal seems to be more interested in classes of event: "breakpoint", "entrypoint", "exitpoint", and so on. If I am reading you correctly, all breakpoints in a script would be instrumented in unison: either you've added a callback for breakpoints, or you haven't. Is that correct? I expect that this would improve memory consumption and hurt performance vs my proposal, since there would be no inactive sites for Baseline/Ion to compile down to a no-op. If it works for your use cases, though, then I don't think I object.

  2. Your proposal for how to pass arguments to callbacks looks very similar to the extension I had been thinking about. From the bytecode emitter's perspective, it receives a set of flags telling it which kinds of instrumentation to generate, and then mechanically inserts JSOP_INSTRUMENTATION ops in the correct places. (It seems like we should be able to obtain or generate a script id at this point and bake it into the opcode, but that's an implementation detail.) The only constraints are a) we have to be able to identify the instrumentation sites without adding too much complexity to the existing bytecode emitter, and b) the overhead of the instrumentation has to be low enough to be useful for WebRR. Those decisions can be made on a case-by-case basis. As an MVP, though, it seems like we should be able to handle all of WebRR's short-term requirements with the no-argument version, keeping the more complicated version in our back pocket until we need it. Is that correct?

Assuming I've understood you correctly: I think it would be reasonable to try writing up an MVP. Probably just the debugger interface, the minimal underlying infrastructure, and a single instrumentation kind (say, entry points). That will help us figure out whether the design is sound, and give us a better idea of the performance overhead.

(In reply to Iain Ireland [:iain] from comment #34)

Most of your suggestions seem reasonable to me. A couple questions:

  1. I was previously under the impression that you wanted to selectively annotate a subset of events within the script: for example, that you wanted to support a single watchpoint within a function, which would be otherwise uninstrumented. That's why I proposed a side-table of per-site callback pointers. Your proposal seems to be more interested in classes of event: "breakpoint", "entrypoint", "exitpoint", and so on. If I am reading you correctly, all breakpoints in a script would be instrumented in unison: either you've added a callback for breakpoints, or you haven't. Is that correct? I expect that this would improve memory consumption and hurt performance vs my proposal, since there would be no inactive sites for Baseline/Ion to compile down to a no-op. If it works for your use cases, though, then I don't think I object.

The intent of this instrumentation is to capture information uniformly about the execution; e.g. we aren't interested in capturing the hits on a particular breakpoint site (the existing Debugger APIs are fine for this), but in capturing the hits on all breakpoint sites. So, yeah, instrumentation can be applied at the level of classes of operations and works uniformly on scripts. The current approach needs information about the script being instrumented for adding that instrumentation, but only in certain ways --- obtaining the offsets of breakpoint/entry/exit sites in the script, baking the script's ID into the instrumentation, and cloning the script's contents to have an instrumentation-less copy when the script is not async/generator. The JSOP_INSTRUMENTATION approach needs to deal with these in other ways; the latter point evaporates but the others add VM complexity.

  2. Your proposal for how to pass arguments to callbacks looks very similar to the extension I had been thinking about. From the bytecode emitter's perspective, it receives a set of flags telling it which kinds of instrumentation to generate, and then mechanically inserts JSOP_INSTRUMENTATION ops in the correct places. (It seems like we should be able to obtain or generate a script id at this point and bake it into the opcode, but that's an implementation detail.) The only constraints are a) we have to be able to identify the instrumentation sites without adding too much complexity to the existing bytecode emitter, and b) the overhead of the instrumentation has to be low enough to be useful for WebRR. Those decisions can be made on a case-by-case basis. As an MVP, though, it seems like we should be able to handle all of WebRR's short-term requirements with the no-argument version, keeping the more complicated version in our back pocket until we need it. Is that correct?

Assuming I've understood you correctly: I think it would be reasonable to try writing up an MVP. Probably just the debugger interface, the minimal underlying infrastructure, and a single instrumentation kind (say, entry points). That will help us figure out whether the design is sound, and give us a better idea of the performance overhead.

Well, I'd just as soon write up something where JSOP_INSTRUMENTATION can pop stack values, to give a better sense of what is involved in the design, to avoid churn down the road, and because I need it for what I'm working on now with web replay (comment 30) and would like something to base that project off of.

I'm not worried about the performance overhead of the JSOP_INSTRUMENTATION ops, and when not instrumenting they aren't emitted so won't affect normal code. In the non-instrumentation case the only overhead will be easily predicted load+branches in BytecodeEmitter to see if JSOP_INSTRUMENTATION ops are needed. I guess we can see/measure what that overhead is, though historically branches like this haven't been much of a problem.

Attached patch patchSplinter Review

Updated patch. This works in the shell and I've adapted the bug 1556813 stuff to use it (though I won't be updating those patches until this is further along). I've made some changes from the previously discussed design to simplify the implementation. Mainly, as described earlier, JSOP_INSTRUMENTATION is complicated to implement in the JITs and has too many moving parts. This patch changes how instrumentation is specified, so that the instrumentation callbacks in a realm are fixed, but instrumentation can be either active or inactive, with the callbacks only being invoked in the former case. JSOP_INSTRUMENTATION is replaced by three other new opcodes, with much simpler semantics:

JSOP_INSTRUMENTATION_ACTIVE: Pushes a boolean for whether instrumentation is active.
JSOP_INSTRUMENTATION_CALLBACK: Push one of the fixed callbacks for the realm. It would be nice to use JSOP_OBJECT here but that opcode has some tricky semantics tied to the singleton objects it is normally used with.
JSOP_INSTRUMENTATION_SCRIPT_ID: Pushes the int32 associated with the script via Debugger.Script.setInstrumentationId. If the id were allocated in the frontend and attached to the script/lazyscript then this could be a JSOP_INT32, but I didn't want to bloat JSScript or introduce a new weak map to associate the scripts with ids when instrumentation is in use.

The bytecode emitter uses these opcodes along with standard opcodes (JSOP_IFEQ, JSOP_CALL, etc.) to add instrumentation to the normal bytecode stream.
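Written out as plain JavaScript, the sequence the emitter produces at (say) a getProperty site is morally equivalent to the following guarded call; `realm`, `scriptId`, and the callback lookup are stand-ins for the state the three opcodes actually consult:

```javascript
// What INSTRUMENTATION_ACTIVE / IFEQ / INSTRUMENTATION_CALLBACK /
// INSTRUMENTATION_SCRIPT_ID / CALL amount to at an instrumented
// getProperty site (a sketch, not the real emitted code).
function instrumentedGetProperty(realm, scriptId, offset, obj, key) {
  if (realm.instrumentationActive) {          // JSOP_INSTRUMENTATION_ACTIVE + JSOP_IFEQ
    const cb = realm.callbacks.getProperty;   // JSOP_INSTRUMENTATION_CALLBACK
    if (cb)
      cb(scriptId, offset, obj, key);         // JSOP_INSTRUMENTATION_SCRIPT_ID + JSOP_CALL
  }
  return obj[key];                            // the original JSOP_GETPROP
}
```

When instrumentation is inactive the guard fails and the site behaves exactly like the uninstrumented access.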

Attachment #9075033 - Attachment is obsolete: true
Attachment #9072043 - Attachment is obsolete: true
Attachment #9075036 - Attachment is obsolete: true
Attachment #9075037 - Attachment is obsolete: true
Attachment #9075038 - Attachment is obsolete: true
Attachment #9075040 - Attachment is obsolete: true
Attachment #9075041 - Attachment is obsolete: true
Attachment #9072044 - Attachment is obsolete: true

Iain and I have had a chance to review the current proposed patches and I think we are happy enough with this approach of instrumentation in the bytecode emitter. I'll include the overall feedback here and will put some smaller comments on Phabricator on parts 4 and 5. This will need a rebase, in particular because the debugger source has had major reorganizations recently.

I notice that you disable off-thread parse, but I think that is the wrong way to look at this. We are currently moving away from the frontend having access to cx->realm (among other sources of state) in favour of reading that state upfront while constructing CompileOptions. See https://searchfox.org/mozilla-central/rev/96403eac4ff6f96f89a0b120b29bd46540d5705e/js/src/jsapi.cpp#3582-3588 . With this approach, off-thread parse should be functional and we aren't smuggling state. It is also likely in the future that that will be encoded in the produced script. For callbacks, something like a UniquePtr<EnumeratedArray<cb_index>> might make sense.

For simplicity at this point, I'd like to avoid supporting multiple callbacks. You already restrict to a single registration call, so that point should already be able to combine various callback behaviours.

The design objectives of the script-id system are still unclear to me. Could we do something like have the CompileOptions constructor fetch an ID from the instrumentation object (keyed off SSO + sourceStart or something)? This would let us eliminate an opcode, vmcalls, and the tls access in IonBuilder.

I think we are getting pretty close to consensus. Thanks for your patience.

Thanks for the feedback, I'll update the patches this weekend. The purpose of the script ID stuff is to allow uniquely identifying the running script in a way that the identifier can be embedded in the script and passed to the instrumentation callback. Downstream we need to be able to associate the ID passed with the corresponding Debugger.Script. I think we can get rid of the special script ID system by passing the identifying information in two parts: the script's source->id (a process wide unique uint32) and the script's sourceStart, which I think will uniquely identify it within the source.
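For illustration, the two-part identification could be packed into a single map key like this (a sketch under the assumption that the (source id, sourceStart) pair is unique; the function name is made up):

```javascript
// Compose a lookup key from the script's process-wide source id and its
// starting offset within that source.
function scriptKey(sourceId, sourceStart) {
  return `${sourceId}:${sourceStart}`;
}

const scriptsByKey = new Map();
scriptsByKey.set(scriptKey(7, 120), "Debugger.Script for foo()");
```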

(In reply to Brian Hackett (:bhackett) from comment #45)

... I think we can get rid of the special script ID system by passing the identifying information in two parts: the script's source->id (a process wide unique uint32) and the script's sourceStart, which I think will uniquely identify it within the source.

That makes a lot of sense and is the sort of keying we'd like to keep trustworthy as we look at more sophisticated caching stories. One wrinkle to watch for is JSScript::setDefaultClassConstructorSpan. It is used as we clone scripts to create default class constructors, but should be resolved before the Debugger::onNewScript hook is called.

Out of curiosity, do IDs need to be unique for script clones (eg CloneFunctionAndScript)?

(In reply to Ted Campbell [:tcampbell] from comment #46)

Out of curiosity, do IDs need to be unique for script clones (eg CloneFunctionAndScript)?

Hmm, yes they do. We need to be able to map the ID back to the Debugger.Script it is associated with, and if there are multiple Debugger.Scripts which it could map to then we don't know what code is actually executing. If the script's identifying information is directly embedded in the bytecode, in a way that the value doesn't change when the script is cloned, then it seems impossible for downstream code to associate the ID with a particular Debugger.Script. Moreover, if the identifying information includes a script source ID, and the source of the cloned script is different from the source of the original script, the identifying information in the cloned script will be wrong.

So, given the existence of script cloning (which I'd totally forgotten about earlier) I don't think we can get the property of providing a unique ID for the executing script without a new opcode. I know JSOP_INSTRUMENTATION_SCRIPT_ID is a little awkward, but it's designed to avoid bloating JSScript, to minimize the extra VM complexity (i.e. no new weak maps), and to provide a simple interface for Debugger users whose implementation is straightforwardly correct. I can remove the TLS access in IonBuilder and just abort the builder if there isn't an instrumentation ID set on the script, but it also seems kind of nice to use a JSContext and behave uniformly across the different execution engines.

FWIW the TLS access in IonBuilder is necessary because IonBuilder doesn't normally have a JSContext, but the main reason for that is historical --- several years ago (bug 785905) I moved MIR construction off thread, a change which got backed out pretty quickly due to its complexity, introduction of new race conditions, and lack of meaningful speedup on benchmarks. While not having a JSContext makes it easy to avoid calling methods that have side effects, it also leads to cumbersome code like having to store template objects in baseline ICs instead of allocating them in IonBuilder. It might be worth letting IonBuilder always have a JSContext again, so that things like this could be streamlined.

Thanks for looking in to that. Let's stick with the SCRIPTID opcode at this time. As I continue working on Bug 1535138 I'll keep in mind this use-case of script identifiers and it should be easy to tweak down the road. Once I successfully pull off Bug 1529456 to merge JSScript and LazyScript, the DebuggerScript identity logic should become a lot more robust.

FYI: We currently have Bug 1529991 which might affect you in edge cases.

(In reply to Ted Campbell [:tcampbell] from comment #48)

Thanks for looking in to that. Let's stick with the SCRIPTID opcode at this time. As I continue working on Bug 1535138 I'll keep in mind this use-case of script identifiers and it should be easy to tweak down the road. Once I successfully pull off Bug 1529456 to merge JSScript and LazyScript, the DebuggerScript identity logic should become a lot more robust.

FYI: We currently have Bug 1529991 which might affect you in edge cases.

OK, yeah, I noticed that the Debugger.Script identity logic is broken and have a fix waiting for review in https://phabricator.services.mozilla.com/D34962. I didn't see the problem was filed already though, I'll make a note in bug 1529991.

Attachment #9076917 - Attachment description: Bug 1554524 Part 6 - Disallow XDR and off thread compilation in realms using instrumentation, r=tcampbell. → Bug 1554524 Part 6 - Disallow XDR in realms using instrumentation, r=tcampbell.

Current design looks good to me! I'll get to final review early this week. I think I'm just down to comment and format nits.

Cool, thanks for the feedback and help in getting this in shape for the tree.

Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4388debbb9fd
Part 1 - Allow emitDupAt to dupe a range of stack values, r=tcampbell.
https://hg.mozilla.org/integration/mozilla-inbound/rev/fc204daba6f3
Part 3 - Emit JSOP_EXCEPTION for catch blocks in the same place, r=tcampbell.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e802face8e04
Part 4 - Add Debugger interface for instrumenting scripts, r=jimb,tcampbell.
https://hg.mozilla.org/integration/mozilla-inbound/rev/5ff79e46fcf4
Part 5 - Emit instrumentation opcodes when they have been set in a realm, r=tcampbell.
https://hg.mozilla.org/integration/mozilla-inbound/rev/b0817ca77a5a
Part 6 - Disallow XDR in realms using instrumentation, r=tcampbell.
https://hg.mozilla.org/integration/mozilla-inbound/rev/51161275a65e
Part 7 - Add interpreter and JIT support for instrumentation opcodes, r=jandem,tcampbell.
https://hg.mozilla.org/integration/mozilla-inbound/rev/35189f59108f
Part 8 - Add instrumentation tests, r=tcampbell.
Regressions: 1569608
Depends on: 1570091
Regressions: 1604530