Open Bug 944260 Opened 6 years ago Updated 1 year ago

[jsdbg2] Debugger should provide a GC-insensitive replacement for findScripts

Categories

(Core :: JavaScript Engine, task, P2)

task

Tracking

()

People

(Reporter: jimb, Unassigned)

References

(Blocks 3 open bugs)

Details

The contents of the array returned by findScripts depends on which scripts have been GC'd and which are still alive. This means that the result can depend on timing details. These non-deterministic APIs are a big source of misery and confusion: tests are written to depend on things they should not, and unrelated changes elsewhere in the browser can affect the behavior of the tools. It's horrible.

One alternative would be to replace findScripts and onNewScript with an API that promises that the Debugger API's user can be notified whenever a script matching a given query is entered. This notification could be one-shot per matching script: once it reports entry into a particular script, that notification would be "settled" for that script, and the debugger wouldn't interfere with it any more.

This, along with a stack walk to deal with scripts already executing, would be enough to reliably set breakpoints. It's completely deterministic. And it should actually be *more* efficient than findScripts and onNewScript, because it would only ever touch scripts that run. As a bonus, the lazy script optimization would be completely invisible to it.
Blocks: 944261
Imagine an API like:

  dbg.onNextExecution({ url: "foo.js", global: thisWindow, etc. },
                      function (frame) ...)

such that the given handler function gets called the next time each matching JSScript is entered, or returned to. The handler would fire only once per JSScript; once the Debugger customer has been given a chance to set breakpoints, there's no reason to interfere with that script's subsequent execution. (Care would be required to ensure that lazy compilation has no effect on onNextExecution's visible behavior.)

This would only need to apply to extant scripts; the Debugger customer could use the onNewScript hook to set breakpoints in subsequently created scripts.

Even if the internal implementation entails something wildly GC-sensitive like walking the heap looking for matching JSScripts, setting onNextExecution hooks on garbage scripts will have no effect, because they'll never be run. So the GC-sensitivity isn't visible from the outside.

Based on an IRC conversation with shu, here's what would be entailed:

1) Add a flag to JSScript saying "you have onNextExecution hooks".

2) When setting an onNextExecution hook, set that bit on all relevant JSScripts.

3) In the debugger prologue, check that flag on the script we're entering.

4) In the debugger epilogue, check that flag on the script we're returning to.

5) Similarly for 'catch' and 'finally' clauses. (I don't think the debugger epilogue handles those.)

6) Similarly on the path that uses rematerialized frames to synthesize baseline frames from Ion frames when they've been touched by the debugger.



Shu, you said that stack frames themselves would need a bit, but it seems to me that we can always just check the frame's JSScript's bit instead; did I misunderstand?
Flags: needinfo?(shu)
(In reply to Jim Blandy :jimb from comment #1)
 
> Shu, you said that stack frames themselves would need a bit, but it seems to
> me that we can always just check the frame's JSScript's bit instead; did I
> misunderstand?

No you're right, you can and should only check the script, since this is a once-per-script thing, not a once-per-frame thing. That should make the patch even easier to implement: you don't need to touch any JIT frame structures at all, just some frame iters to check if this flag is set on the caller's JSScript.

So that means 6) above doesn't apply.

I *think* the debugger epilogue already handles 5). I'm not entirely sure though.
Flags: needinfo?(shu)
Duplicate of this bug: 1083291
Duplicate of this bug: 1108070
Is it possible to use a simplified test case similar to this:

https://bugzilla.mozilla.org/show_bug.cgi?id=1191188

=== debugger-controller.js (onNewSource/onSourcesAdded) ===
After setting a breakpoint in (onSourcesAdded) and doing a page refresh is there a reason why we would see source duplicates and sometimes missing sources?

This maybe a useful simplified test case.
Duplicate of this bug: 1550440
Blocks: dbg-69, dbg-api
Priority: -- → P2
Type: defect → task

In bug 1550440, Brian said:

[F]inding the reachable scripts without involving the GC would be really expensive, even slower than a full non-incremental GC.

The idea is that this new API could continue to use Zone::cellIter, just as findScripts does now. But, since it doesn't actually return the set of affected scripts to JS, and Debugger only gets notified when some matching script actually runs, the visible behavior is entirely deterministic.

Whiteboard: [debugger-mvp]

(In reply to Jim Blandy :jimb from comment #7)

In bug 1550440, Brian said:

[F]inding the reachable scripts without involving the GC would be really expensive, even slower than a full non-incremental GC.

The idea is that this new API could continue to use Zone::cellIter, just as findScripts does now. But, since it doesn't actually return the set of affected scripts to JS, and Debugger only gets notified when some matching script actually runs, the visible behavior is entirely deterministic.

Hmm, so I have some concerns:

  • The bugs I've seen related to unpredictable debugger behavior wrt GC have been concerned with sources not being shown in the debugger, rather than the scripts themselves. dbg.findSources() is implemented separately from dbg.findScripts() and also uses cell iteration / unpredictable GC behavior. Does this bug encapsulate findSources() as well?

  • If so, right now the sources which findSources finds are used to determine the sources which the client will ultimately show in the sources pane. What we do right now is show all the sources on the page that haven't been GC'ed, and with this change it seems like we will only show the sources whose scripts have run since the debugger was attached. Is this change in behavior expected / desired? The current behavior seems more intuitive to me.

  • Similarly, for findScripts() itself the server uses the scripts found to populate the breakpoint positions which the client uses to decide where the user can set breakpoints. If the debugger only knows about scripts that have executed since it attached, users will only be able to install breakpoints on those scripts, which seems especially non-intuitive.

The last point could be worked around by reparsing a source to determine its breakpoint positions, which would make the breakpoint positions GC insensitive but would entail more work for the server to do. That would be an improvement I think but still doesn't address the larger user-facing problem of showing an incomplete set of sources when attaching to existing tabs.

No longer blocks: dbg-69
Whiteboard: [debugger-mvp]

After talking with Jim, I'm going to work on a related approach here, where we have some GC-sensitive Debugger APIs but modify the devtools server to try to hide GC non-determinism from users. There are two main aspects to this:

  • Keep the URLs for the sources in a realm around as long as possible. Heuristics here are TBD; we can't keep the URLs around forever, but if they are able to live longer and have predictable/controllable lifetimes then things should be less confusing for users. If the server finds a URL that has been loaded into a realm but has no sources, it can fetch that URL and show it in the client, as it has to do now when attaching to an existing tab and showing the HTML for inline scripts.

  • Obtain breakpoint positions in URLs where scripts have been GC'ed. To do this we need to ask spidermonkey to reparse the source (which is rather tricky when it's an inline script) and fetch the breakpoint positions in the resulting scripts so that the client can allow the user to set breakpoints in all scripts in the source, regardless of whether they have been GC'ed.

Assignee: nobody → bhackett1024

Yeah, I agree with this. Brian's comment handles the forward-looking ideas, so for my own benefit, here's the retrospective:

In an ideal world, the Debugger API would expose no GC-sensitive behavior, and thus the server built on top of it would be GC-insensitive by construction. But this isn't achievable, even in principle:

  • When the developer opens the devtools, we must show the full set of sources, so the user can set breakpoints.

  • But we must permit code to be GC'd in tabs that have not yet had the debugger opened, to save memory.

  • Yet, when code has not yet been GC'd, we must use the source code provided by SpiderMonkey. If we re-fetch sources that SpiderMonkey might still run and pause in, or backtrace through, and the sources we fetch don't match what SpiderMonkey is running, then they'll be out of sync, which is chaos, completely unnecessary because we have the sources right there.

Ergo, when the devtools are opened on a running page, there is no way to avoid refetching only those sources that have not been GC'd. The Debugger API must exhibit GC-sensitive behavior. So my ideal "correct by construction" approach simply isn't available to us.

This makes "in principle" changes like this bug less attractive, since we're not going to obey the principle anyway. It wouldn't be harmful to make this change, but it's not clear what purpose it serves.

This work is being moved to bug 1571592, since it will need to be spread out over several bugs.

Assignee: bhackett1024 → nobody
You need to log in before you can comment on or make changes to this bug.