Closed Bug 1102301 Opened 10 years ago Closed 9 years ago

browser_canvas-frontend-open.js leaks toolbox when running the mochitests as a standalone directory

Categories

(DevTools Graveyard :: Canvas Debugger, defect)

All
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jmaher, Assigned: jmaher)

References

Details

Attachments

(1 file)

I am trying to land --run-by-dir for devtools, it is landed for regular runs, but with e10s I had to turn it off due to: https://treeherder.mozilla.org/logviewer.html#?job_id=4076967&repo=mozilla-inbound I went to reproduce this locally on my linux64 development machine: ./mach mochitest-devtools --e10s browser/devtools/canvasdebugger/test it reproduces easily and reliably: 1019 INFO TEST-PASS | chrome://mochitests/content/browser/browser/devtools/canvasdebugger/test/browser_canvas-frontend-stepping.js | The last context call should now be selected. 1020 INFO Destroying the specified canvas debugger. 1021 INFO Waiting for event: 'destroyed' on [object Object]. 1022 INFO Removing tab. 1023 INFO Tab removed and finished closing. 1024 INFO finish() was called, cleaning up... 1025 INFO Forcing GC after canvas debugger test. 1026 INFO MEMORY STAT vsize after test: 896876544 1027 INFO MEMORY STAT residentFast after test: 198012928 1028 INFO MEMORY STAT heapAllocated after test: 77889600 1029 INFO TEST-OK | chrome://mochitests/content/browser/browser/devtools/canvasdebugger/test/browser_canvas-frontend-stepping.js | took 751ms 1030 INFO checking window state WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":"resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js","lineNumber":1664,"stack":["resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js:Toolbox.prototype.destroy/leakCheckObserver:1664","chrome://mochikit/content/browser-test.js:Tester.prototype.nextTest</<:548","chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/maybeRunTests/<:683","null:null:0"]}] Barrier: ShutdownLeaks: Wait for cleanup to be finished before checking for leaks 1031 INFO Console message: 1416499426861 Services.HealthReport.HealthReporter WARN Saved state file does not exist. 1032 INFO Console message: 1416499426861 Services.HealthReport.HealthReporter WARN No prefs data found. FATAL ERROR: AsyncShutdown timeout in ShutdownLeaks: Wait for cleanup to be finished before checking for leaks Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":"resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js","lineNumber":1664,"stack":["resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js:Toolbox.prototype.destroy/leakCheckObserver:1664","chrome://mochikit/content/browser-test.js:Tester.prototype.nextTest</<:548","chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/maybeRunTests/<:683","null:null:0"]}] At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources. [Parent 10632] ###!!! ABORT: file resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js, line 1664 [Parent 10632] ###!!! ABORT: file resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js, line 1664 ###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv TEST-INFO | Main app process: killed by SIGSEGV 1033 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/devtools/canvasdebugger/test/browser_canvas-frontend-stepping.js | application terminated with exit code 11 runtests.py | Application ran for: 0:01:27.008439 zombiecheck | Reading PID log: /tmp/tmp1HtxkFpidlog ==> process 10632 launched child process 10675 zombiecheck | Checking for orphan process with PID: 10675 PROCESS-CRASH | chrome://mochitests/content/browser/browser/devtools/canvasdebugger/test/browser_canvas-frontend-stepping.js | application crashed [None] Crash dump filename: /tmp/tmp3K2nz4.mozrunner/minidumps/6f25ba62-4232-318f-4b99789b-34dae9d2.dmp MINIDUMP_STACKWALK not set, can't process dump. Stopping web server Stopping web socket server Stopping ssltunnel WARNING | leakcheck | refcount logging is off, so leaks can't be detected! runtests.py | Running tests: end. SUITE-END | took 88s I did a bit of test debugging and removing and found that doing this in the browser.ini will let it pass locally: [browser_canvas-frontend-open.js] skip-if = e10s of course I would rather see the root cause fixed if that is easy.
Victor, Can you comment on why we would be failing to close the toolbox in e10s mode? Is this something we should be disabling on e10s or be fixing?
Flags: needinfo?(vporof)
I don't know why this is failing. It most definitely shouldn't, especially since the same logic is used in most (if not all) canvasdebugger tests. I'd rather not have this test disabled, but find the reason for it's breakage.
Flags: needinfo?(vporof)
is there a possibility that in e10s some code silently fails causing this to fail in cleanup? What can I do to debug this further? In the past we have gone down the road of waiting weeks for a fix, then when the test is fixed something else has broken, that is not how we are proceeding anymore. With that said, fixing this in a few days time is worth waiting for. Giving me some direction of what to do to help would be an acceptable option as well :)
Victor, What can we do to debug this further?
Flags: needinfo?(vporof)
If it were to fail silently in e10s, I would expect all canvasdebugger tests to fail, as highlighted in comment 2. Just search for `initCanvasDebuggerFrontend` and `teardown` in the folder. That being said, I see that comment 0 shows browser_canvas-frontend-stepping.js failing, and not browser_canvas-frontend-open.js as this bug's description says. So indeed this might be a generic problem, and not tied to one particular test. Looks like "Wait until toolbox is destroyed" is taking too long, which happens inside `Toolbox.prototype.destory`, in toolbox.js. This was added in bug 1060840 (in a patch which I indeed review). First thing I'd do is verify that the toolbox's `_destroyer` promise is getting resolved. If not, then it's very likely that `CanvasDebuggerPanel.prototype.destroy` inside canvasdebugger/panel.js returns a promise that's never destroyed, in which case we're likely throwing an error somewhere there.
Flags: needinfo?(vporof)
ok, more hacking - here is what I did: * verified this failed on recent code from inbound - it still does * hacked in the manifest - if I removed browser_canvas-frontend-open.js from the manifest (browser.ini), all the other tests pass just fine * hacked in the manifest - if I removed everything BUT browser_canvas-frontend-open.js from the manifest (browser.ini), it passes just fine * determined that all browser_canvas-frontend*.js call initCanvasDebuggerFrontend, including this 'open' test. * removed all browser_canvas-frontend*.js tests from browser.ini manifest and let the browser_canvas-actor* tests run along with "open", all pass * It appears that any single frontend test alongside the "open" test will hang! * Investigating the differences between the frontend tests, "open" is the only one that just initializes, checks default state, then destroys, the others reload(target), do a few things, etc... * add a reload the end, all the things work Probably not the right solution, but this fixes it without disabling the test. I would like to resolve this one way or another- if there is a better fix, a reason why reload is required, or just to disable the test on e10s
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8540334 - Flags: review?(vporof)
Comment on attachment 8540334 [details] [diff] [review] fix browser_canvas-frontend_open.js to run in e10s mode (1.0) Review of attachment 8540334 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/devtools/canvasdebugger/test/browser_canvas-frontend-open.js @@ +36,5 @@ > is($("#debugging-pane-contents").getAttribute("hidden"), "true", > "The rest of the UI should initially be hidden."); > > + // Bug 1102301 - e10s mode seems to hang on shutdown when we call initCanvasDebuggerFrontend >1 and we don't reload() > + yield reload(target); I don't understand how or why this fixes things. It's probably better to dig deeper into this, and only land this patch as a last resort and if there's other bugs blocked by this.
Attachment #8540334 - Flags: review?(vporof) → feedback+
I have no expertise in this area. Is there somebody that can look into this? Quite possibly there is some code in the devtools that upon onload completes initialization. As this only happens on e10s, that should give us a hint that there is something odd going on with either initialization or teardown which isn't e10s compatible. Any real pointers to code will be helpful. If we get down to this being the last blocker to enable --run-by-dir for mochitest devtools e10s, would you rather I land this hacky reload patch, or just disable the "open" test for e10s mode?
Personally, I would rather have this test disabled than land a fix that we don't really understand. The only toolbox initialization/destruction code that could affect this and that I know of is described in comment 5, [0] and [1] to be more specific. Patrick Brosset has been messing around with the canvasdebugger's teardown lately in bug 1099370, maybe you can sync with him? Optimistically, maybe his recent changes fix this leak? [0] http://dxr.mozilla.org/mozilla-central/source/browser/devtools/framework/toolbox.js#1593 [1] http://dxr.mozilla.org/mozilla-central/source/browser/devtools/canvasdebugger/panel.js#62
Thanks for the info. I do see this failing still (fresh push today) for linux opt m-e10s only: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6d627e9ee98c while we don't have all the data in, I suspect this is the only issue preventing us from landing --run-by-dir for devtools. Let me hack around a bit more in the destroy function for the toolbox, maybe I can figure out the root cause! Thanks for the links.
I see this as the problem: http://dxr.mozilla.org/mozilla-central/source/browser/devtools/framework/toolbox.js#1693 In both e10s and non-e10s we are calling this. In the non-e10s mode, we pop into this then block: http://dxr.mozilla.org/mozilla-central/source/browser/devtools/framework/toolbox.js#1675 So somehow in e10s mode we do not get into the then block. Any tips on what I should look into next?
victor, do you have any further thoughts on this bug?
Flags: needinfo?(vporof)
(In reply to Joel Maher (:jmaher) from comment #12) > victor, do you have any further thoughts on this bug? Disable this test for now if you can't track down the problem and it's blocking something more important. However, I still don't understand how only one test is failing when the same destruction logic is used in all tests!
Flags: needinfo?(vporof)
Works without issue these days.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Product: Firefox → DevTools
Product: DevTools → DevTools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: