Open Bug 1589906 Opened 5 years ago Updated 2 years ago

[jsdbg2] Breakpoint handlers can fire when they should not, in rare situations

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

People

(Reporter: jimb, Unassigned)

References

Details

If several breakpoints are set at the same location, and one's handler removes the other, and then a third new breakpoint is set at that location, and the third breakpoint's underlying js::Breakpoint structure happens to get allocated at the same address as the deleted breakpoint's, then we will execute undefined behavior.

How it happens:

When we hit a breakpoint, there may be several breakpoints set at the same location. The location has a single BreakpointSite, and that has a doubly-linked list of Breakpoints set at that site, each with its own handler. So the first thing we do when a breakpoint is hit is to gather up all the breakpoints at that location, as a vector of Breakpoint* pointers. Then, we iterate over that vector, and call each breakpoint's handler.

But handlers are arbitrary JS, and one handler can remove another. To avoid firing handlers for breakpoints that have been removed, each iteration of the loop checks that the Breakpoint* it got from the vector is still present in the BreakpointSite's linked list; if it's not found there, it's ignored.

Unfortunately, removing a breakpoint from a site frees its Breakpoint structure, so the pointer in the vector becomes a dangling pointer. Comparing the dangling pointer to the elements of the BreakpointSite's linked list is not UB (since C++ 14), but then using it if a match is found is UB: just because a dead pointer compares equal to a live pointer doesn't mean that dereferencing the dead pointer is okay. So if another Breakpoint happens to get allocated at the same address and inserted at the same site, we get UB.

Priority: -- → P2

Instead of building a list of Breakpoint* values, we could instead build a list of Debugger* and handler pairs. Those we can root ourselves until we're done using them, so they can't be invalidated. We could still re-check the location to make sure the breakpoint hadn't been removed; the script/instance and offset are available from the stack frame, and are rooted.

Can we do this by changing the invariants around Debugger::breakpoints and BreakpointSite::breakpoints? For instance, we could set a flag on the debugger while we're iterating through a set of breakpoints, and if a breakpoint is removed from a site, but its debugger is iterating over breakpoints, we could remove it from the breakpoint site list, but not remove it from the debugger's breakpoint list. Then we can remove it from the debugger's list after the list has been fully iterated. Since that keeps the breakpoint alive during iteration, we wouldn't have the same concerns around lifetime checking.

That sounds really delicate.

  • Breakpoints point to their sites; we could null that pointer out for safety, but to be crash-free we'd have to be sure that pointer can never be used after the breakpoint is deleted once.

  • Breakpoints are owned by BreakpointSites; this would change the rules to say that Breakpoints are owned by BreakpointSites or their Debugger lists under certain circumstances.

The nice thing about copying is that the ownership story stays simple.

If we make sure the copy array preserves the order in which the breakpoints occur, and kick each breakpoint to the end of its site's linked list after we run it, then checking whether the next breakpoint we're about to run is still present in the list will be constant time in the common case.

Fair enough. I see where you're coming from.

For what you're describing, wouldn't we still need a triplet instead of a pair so we have the Breakpoint to compare the handler/debugger values against?

The Breakpoint might get freed and a fresh one allocated at the same address, as described in comment 0. You just can't use pointers that way.

I think we have enough information to look up the site, and then search the site's list for a breakpoint whose Debugger and handler match.

Won't that searching be expensive though since we'd have to traverse the site's breakpoint list just before calling every handler? I'm trying to figure out some psuedocode for what you have in mind is all.

The algorithm I had in mind is:

hit trap (script, pc):
    look up breakpoint site
    build a rooted list of (Debugger, handler) pairs from the site's breakpoint list
    for each (Debugger, handler) pair in our copy:
        verify that this breakpoint handler is still set:
            re-look up the breakpoint site (constant time: a hash or array lookup)
            if not found:
                quit, there are no more breakpoints set here any more
            search the site's breakpoint list for a breakpoint with this (Debugger, handler) pair
                (usually constant time: unless handlers modify the breakpoints
                set at the location (assumed rare), the next breakpoint will be
                at the front of the site's list, and the search will succeed
                after one iteration)
        if site and (Debugger, handler) pair found:
            rotate this breakpoint's entry to the back of the site's list
                (constant time: doubly-linked list)
            call the breakpoint handler
        else:
            breakpoint was deleted, do nothing

I'm assuming that handlers that modify the set of breakpoints at a site are not too common. Even if they do, there are usually few breakpoints set at a given location.

If the quadratic behavior shows up in practice, then we could do something more drastic, like have DebugAPI::onTrap put its list of breakpoints in a MOZ_STACK structure that adds itself to a linked list whose head is stored on the JSContext, and have breakpoint removal scan the list for live iterations and delete the breakpoint from the lists being iterated over. But that seems like too much work for a case that seems like it should be really rare.

Severity: minor → S4
You need to log in before you can comment on or make changes to this bug.