Closed Bug 1515290 Opened 8 months ago Closed 7 months ago

DebuggerServer should be loaded in dedicated loader with invisibleToDebugger flag set to true when debugging system compartments

Categories

(DevTools :: Framework, enhancement, P2)

enhancement

Tracking

(firefox67 fixed)

RESOLVED FIXED
Firefox 67
Tracking Status
firefox67 --- fixed

People

(Reporter: ochameau, Assigned: ochameau)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

Bug 1512029 is going to require the DebuggerServer to be loaded in Sandboxes with invisibleToDebugger flag set to true when debugging system compartments.
Debugger API can't be used from the same compartment than its debuggee.
And this bug is going to make it so that all system compartment sandboxes and documents start using the same compartment.
invisibleToDebugger flag allows to force it to be loaded in a distinct compartment, still system one, but distinct.

As it wasn't strictly mandatory flag for now, we don't ensure setting this flag when debugging privileged context. But we should.

Some tests already do that, like this one:
https://searchfox.org/mozilla-central/source/devtools/client/debugger/new/test/mochitest/browser_dbg-chrome-debugging.js#17-19

But some others don't as they are using TargetFactory and uses production codepath:
https://searchfox.org/mozilla-central/source/devtools/server/tests/mochitest/test_memory_allocations_01.html#22

Ideally, TargetFactory should handle this as that's the main code that instantiate the debugger server.
When debugging contexts running from the system compartment, the debugger has
to be loaded in a dedicated Loader, with invisibleToDebugger flag turned on.
This ensures that the Debugger API is going to be used from a distinct system
compartment. Otherwise it may be used from the same compartment than the page
we are debugging.
This patch broke tests using DebuggerServer.searchAllConnectionsForActor while testing against a chrome document.
That's because the DebuggerServer is now a custom one and not the same than the main loader's one.

There is two things to do:
* Stop using chrome documents unless we really want to assert chrome's behavior, which is rarely/never the case in these tests
* Stop using searchAllConnectionsForActor from the parent process. We have to use it through ContentTask as the actor will be in the content process when testing against a regular content document.
* Stop using NodeFront.rawNode() as it depends on searchAllConnectionsForActor and will fail for the same reasons.
* Stop using chrome mochitest and use browser mochitest opening a tab. The only reason we are using chrome mochitest is historical. We should only be using browser mochitests or xpcshells... We end up having duplicated helpers. Different ways to write the exact same tests.

In the new patch I'm trying to address that. I hope I went through most of the tests that required some tweaks, but some others may be hidding behind the first failures.
Assignee: nobody → poirot.alex
Severity: normal → enhancement
Priority: -- → P2
In bug 1514210 I'll add freshCompartment: true for the devtools sandbox. Then we can revert that when you're done fixing these tests :)
Blocks: 1517210

I'm making some progress but there is a significant amount of tests to refactor and their refactoring isn't trivial:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8549cd7ce4e244179afd323c539e4f090e3fc5c
But such refactoring is really valuable. I'm converting many from mochitest chrome to mochitest browser. It better align with the vast majority of devtools tests which are mochitest browser.

Depends on: 1520782
Depends on: 1521052

I move all the test refactoring to bug 1520782 where I'm done with refactoring tests from chrome to regular tabs. (Still pending review there)

But then I hit an issue with DebuggerServer in the content process that is never cleaned up.

I thought that bug 1521052 was the reason behind the leak introduced by the new test I'm introducing in this patch:
devtools/client/framework/test/browser_target_server_compartment.js
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bb745819c992b08cd0738cffdcc948d4077f5f22
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=223466448&repo=try&lineNumber=15442
TEST-UNEXPECTED-FAIL | leakcheck | default 2882167 bytes leaked (AbstractThread, AbstractWatcher, AnimationTimeline, AsyncFreeSnowWhite, AtomSet, ...)

But it looks like it is unrelated as the leak still reproduces.
Unfortunately that's a C++ leak, which is really hard to track down.

I think that I finally found the root issue behind this leak:
the thread actor is instantiating a new Debugger, after it has been detached/destroyed, letting it alive after the test finishes and "leaking the world".

This happens very easily when we call threadClient.attachThread, that, without calling threadClient.resume right after that.
In the tests I'm adding in this revision, I'm not calling resume.
It looks like calling resume do fix the leak.
Here is a try run, with the revision already on phabricator + the call to resume:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=097e861a8d19ca9b15feb433c33945dadc40554c

browser_toolbox_meatball.js

Julian helped me debug that, the leak was reproducible locally by running:
./mach mochitest --headless browser_target_server_compartment.js browser_toolbox_meatball.js
And adding an explicit call to await toolbox.destroy() at the end of meatball test fix the leak locally.
Unfortunately, this doesn't seem to fix the leak on try. I just pushed the phabricator revision + call to toolbox.destroy:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=84b60cddf5c89d0f26c47a4d61d0dca8457ab56d

I think that there is two issues colliding here. meatball test is probably having an intermittent leak or is very sensible to races during its termination. And there was a clear issue with the test I added and its leaking Debugger instance.

Longer story about the thread actor

When calling ThreadActor.attach (ie. when calling targetFront.attachThread), the actor immediately returns the response manually here:
https://searchfox.org/mozilla-central/rev/5c8ea961d04767db723a0a15e3a8f7fbca154129/devtools/server/actors/thread.js#298-304

      // Send the response to the attach request now (rather than
      // returning it), because we're going to start a nested event loop
      // here.
      this.conn.send(packet);

      // Start a nested event loop.
      this._pushThreadPause();

and starts a nested event loop (I'm not sure that the nested event loop is important here?).
Then, if we immediately destroy the actor (which I was doing in the test), we end up trying to resume here:
https://searchfox.org/mozilla-central/rev/5c8ea961d04767db723a0a15e3a8f7fbca154129/devtools/server/actors/thread.js#217-221

  destroy: function() {
    dumpn("in ThreadActor.prototype.destroy");
    if (this._state == "paused") {
      this.onResume();
    }

While destroying the Debugger instance synchronously right after:
https://searchfox.org/mozilla-central/rev/5c8ea961d04767db723a0a15e3a8f7fbca154129/devtools/server/actors/thread.js#241-242

    this._dbg.enabled = false;
    this._dbg = null;

But onResume is asynchronous, and is accessing the dbg attribute, after destroy is called, from here:
https://searchfox.org/mozilla-central/rev/5c8ea961d04767db723a0a15e3a8f7fbca154129/devtools/server/actors/thread.js#940-942

      if (this.dbg.replaying) {

And forces the instantiation of a new Debugger instance:
https://searchfox.org/mozilla-central/rev/5c8ea961d04767db723a0a15e3a8f7fbca154129/devtools/server/actors/thread.js#113-115

  get dbg() {
    if (!this._dbg) {
      this._dbg = this._parent.makeDebugger();

I'll merge this chance into the base revision if you find it valuable.

Depends on D14957

BrowsingContextTargetActor, which is calling _detach expects it to return true
if everything worked as expected. But the overloaded method in ParentProcessTargetActor
was always returning undefined.

Depends on D17924

This test seems fragile and report leaks when run sequentially with other tests.
Destroying its toolbox and more importantly waiting for it to complete seems to
stop the leaks.

Depends on D17925

Attachment #9039736 - Attachment is obsolete: true
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/820e7e98a4f7
Instantiate DebuggerServer in dedicated loader when debugging chrome tabs. r=yulia,jdescottes
https://hg.mozilla.org/integration/autoland/rev/35df2ad0974a
Prevent ParentProcessTargetActor.detach from failing with 'wrongState' error. r=jdescottes
https://hg.mozilla.org/integration/autoland/rev/4fbe9b31f4ee
Prevent intermittent leaks in browser_toolbox_meatball.js r=jdescottes
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 67
Blocks: 1525652
You need to log in before you can comment on or make changes to this bug.