Intermittent browser_net_timing-division.js | FATAL ERROR: AsyncShutdown timeout in ShutdownLeaks: Wait for cleanup to be finished before checking for leaks Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":" ...

RESOLVED WORKSFORME

Status

P5
normal
RESOLVED WORKSFORME
3 years ago
4 months ago

People

(Reporter: RyanVM, Unassigned)

Tracking

(Blocks: 1 bug, {crash, intermittent-failure})

Trunk
x86
Linux
crash, intermittent-failure

Firefox Tracking Flags

(e10s+, firefox42 affected)

Details

(Reporter)

Description

3 years ago
08:33:44 INFO - 4383 INFO TEST-START | browser/devtools/netmonitor/test/browser_net_timing-division.js
08:33:50 INFO - MEMORY STAT | vsize 712MB | residentFast 210MB | heapAllocated 98MB
08:33:50 INFO - 4384 INFO TEST-OK | browser/devtools/netmonitor/test/browser_net_timing-division.js | took 5307ms
08:33:50 INFO - console.error:
08:33:50 INFO - Protocol error (noSuchActor): No such actor for ID: server1.conn80.child1/profilerActor20
08:34:00 INFO - WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":"resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js","lineNumber":1927,"stack":["resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js:Toolbox.prototype.destroy/leakCheckObserver:1927","chrome://mochikit/content/browser-test.js:Tester.prototype.nextTest</<:632","chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<:736","null:null:0"]}] Barrier: ShutdownLeaks: Wait for cleanup to be finished before checking for leaks
08:34:51 INFO - FATAL ERROR: AsyncShutdown timeout in ShutdownLeaks: Wait for cleanup to be finished before checking for leaks Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":"resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js","lineNumber":1927,"stack":["resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js:Toolbox.prototype.destroy/leakCheckObserver:1927","chrome://mochikit/content/browser-test.js:Tester.prototype.nextTest</<:632","chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<:736","null:null:0"]}] At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources.
08:34:51 INFO - [Parent 3382] ###!!! ABORT: file resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js, line 1927
08:34:51 INFO - [Parent 3382] ###!!! ABORT: file resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js, line 1927
08:34:52 INFO - [Child 3443] ###!!! ABORT: Aborting on channel error.: file /builds/slave/fx-team-lx-pgo-000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1753
08:34:52 INFO - [Child 3443] ###!!! ABORT: Aborting on channel error.: file /builds/slave/fx-team-lx-pgo-000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1753
08:34:52 INFO - TEST-INFO | Main app process: killed by SIGSEGV
08:34:52 INFO - 4385 INFO checking window state
08:34:52 WARNING - TEST-UNEXPECTED-FAIL | browser/devtools/netmonitor/test/browser_net_timing-division.js | application terminated with exit code 11
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
9 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 4
* b2g-inbound: 4
* fx-team: 1

Platform breakdown:
* linux64: 4
* windows7-32: 3
* linux32: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2015-12-14&endday=2015-12-20&tree=all
6 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 4
* fx-team: 1
* b2g-inbound: 1

Platform breakdown:
* windows7-32: 4
* linux64: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2015-12-21&endday=2015-12-27&tree=all
7 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 7

Platform breakdown:
* windows7-32: 7

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2015-12-28&endday=2016-01-03&tree=all
17 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 14
* fx-team: 2
* b2g-inbound: 1

Platform breakdown:
* windows7-32: 17

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-01-04&endday=2016-01-10&tree=all
9 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 6
* try: 1
* mozilla-central: 1
* fx-team: 1

Platform breakdown:
* windows7-32: 9

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-01-11&endday=2016-01-17&tree=all
8 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 3
* fx-team: 3
* try: 1
* mozilla-central: 1

Platform breakdown:
* windows7-32: 7
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-01-18&endday=2016-01-24&tree=all
(Reporter)

Updated

3 years ago
See Also: → bug 1240580
(Reporter)

Comment 20

3 years ago
This and related bugs all appear to be Windows e10s failures. Sami, any chance you'd be up for taking a look? :)
Blocks: 984139
tracking-e10s: --- → ?
See Also: → bug 1235148

Updated

3 years ago
tracking-e10s: ? → +
(Reporter)

Comment 21

3 years ago
This is pretty easy to hit on Try on OSX 10.6 with e10s enabled too.

Comment 22

3 years ago
I looked at the logs from older failures and based on the created and destroyed DOMWindows I was able to determine two tests that created DOMWindows which were not destroyed before the suite ended:
* browser_net_pane-collapse.js for some failures
* browser_net_prefs-reload.js for other failures

However, I don't see anything clearly wrong with the tests other than the fact that ALL netmonitor tests directly access the content window via CPOWs and teardown the test by 1) removing the tab synchronously and 2) waiting for the netmonitor to be destroyed. Other devtools tests do this the other way around: destroy the toolbox first and then remove the tab.

Since the netmonitor tests wait for the "destroyed" event [1], it's not the netmonitor that is blocking the devtools shutdown but something else in the toolbox destroyer promise mess [2] that just pushes bunch of promises to an array and hopes them to either resolve or reject at some point in the future.

So in this case one of these promises (not the netmonitor one) hangs somewhere failing to resolve.

[1] https://dxr.mozilla.org/mozilla-central/source/devtools/client/netmonitor/test/head.js#178
[2] https://dxr.mozilla.org/mozilla-central/source/devtools/client/framework/toolbox.js#2033

Comment 23

3 years ago
With some extra debugging I've been able to determine that the cause for the never finishing cleanup is not any particular test; the issue is a bit more generic.

The most likely culprit here is that all the tests forcibly remove the tab without closing the toolbox first. This causes starts a race between the server and the client since both start destruction from their respective ends. The toolbox detects that the tab it lived in was removed and starts the destruction process. Similarly, the tab actor at the server side notices that the tab it's representing has closed and starts to destroy itself. If the stars are perfectly aligned and the timings are spot on, the toolbox destruction blocks infinitely for some undetermined reason.

What I know is that the toolbox destroyer promise never resolves since target.destroy() at [1] never resolves. This in turn is caused by the fact that DebuggerClient.close() never calls the callback function at [2]. The callback is supposed to be called when the transport is closed but it never seem to happen. There's two possible causes for that:
1) The transport is closed but the client misses the notification somehow.
2) The transport is never closed since DC.close() doesn't call cleanup() at [3] which would close the transport. This means that one of the actor clients the debugger client was managing never responds to the detach request and the call to cleanup() never happens since it only happens once all clients have been detached.

Here's a couple of suggestions that might fix this:
1) Find the root cause and fix it. This will also make the duplicates go away but since this code is so fragile something is bound to break in the process.
2) Change the way netmonitor tests teardown themselves and deal with the fallout. Instead of removing the tab directly, destroy the toolbox first and only after that remove he tab. This might break tests that assume the teardown happens the way it does at the moment but at least the possible bustage is localized to netmonitor tests.

Anyway, this is such a huge can of worms that I'll leave this to the engineers that are paid to look into this stuff.

[1] https://dxr.mozilla.org/mozilla-central/source/devtools/client/framework/toolbox.js#2029
[2] https://dxr.mozilla.org/mozilla-central/source/devtools/client/framework/target.js#567
[3] https://dxr.mozilla.org/mozilla-central/source/devtools/shared/client/main.js#408
16 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* try: 16

Platform breakdown:
* osx-10-6: 10
* osx-10-10: 5
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-02-29&endday=2016-03-06&tree=all
13 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 11
* mozilla-central: 1
* fx-team: 1

Platform breakdown:
* osx-10-10: 11
* windowsxp: 1
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-03-07&endday=2016-03-13&tree=all
21 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 14
* fx-team: 6
* mozilla-aurora: 1

Platform breakdown:
* osx-10-10: 16
* linux64: 3
* windows7-32: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-03-14&endday=2016-03-20&tree=all
21 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 10
* fx-team: 8
* mozilla-aurora: 2
* try: 1

Platform breakdown:
* osx-10-10: 15
* windows7-32: 6

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-03-21&endday=2016-03-27&tree=all
15 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 9
* try: 4
* fx-team: 2

Platform breakdown:
* osx-10-10: 13
* windows7-32: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-03-28&endday=2016-04-03&tree=all
13 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 6
* fx-team: 3
* mozilla-central: 2
* try: 1
* mozilla-aurora: 1

Platform breakdown:
* osx-10-10: 8
* windows7-32: 3
* linux64: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-04-04&endday=2016-04-10&tree=all
19 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 14
* fx-team: 3
* try: 1
* mozilla-aurora: 1

Platform breakdown:
* osx-10-10: 11
* windows7-32: 7
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-04-11&endday=2016-04-17&tree=all
17 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 10
* fx-team: 4
* try: 1
* mozilla-central: 1
* mozilla-aurora: 1

Platform breakdown:
* osx-10-10: 14
* linux64: 2
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-04-18&endday=2016-04-24&tree=all
10 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 7
* fx-team: 2
* try: 1

Platform breakdown:
* windows7-32: 5
* osx-10-10: 5

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-05-16&endday=2016-05-22&tree=all
7 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 4
* mozilla-aurora: 2
* fx-team: 1

Platform breakdown:
* osx-10-10: 4
* windows7-32: 2
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-05-23&endday=2016-05-29&tree=all
11 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 9
* mozilla-central: 1
* fx-team: 1

Platform breakdown:
* osx-10-10: 7
* windows7-32: 4

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-05-30&endday=2016-06-05&tree=all
6 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 2
* mozilla-central: 2
* try: 1
* fx-team: 1

Platform breakdown:
* osx-10-10: 4
* windows7-32: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-06-06&endday=2016-06-12&tree=all
Blocks: 1280289
Blocks: 1283725
Intermittent e10s test failure
Priority: -- → P5
(Reporter)

Comment 37

2 years ago
Nicolas, is this something you'd be interested in looking at? Seems to be a fairly common issue with toolbox-using tests (see deps).
No longer blocks: 1280289, 1283725
Component: Developer Tools: Netmonitor → Developer Tools: Framework
Flags: needinfo?(chevobbe.nicolas)
9 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 3
* autoland: 3
* mozilla-central: 1
* mozilla-aurora: 1
* fx-team: 1

Platform breakdown:
* osx-10-10: 4
* windows7-32-vm: 3
* linux64: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1185045&startday=2016-07-04&endday=2016-07-10&tree=all

Comment 39

2 years ago
I'll have a look in the next days
Assignee: nobody → chevobbe.nicolas
Flags: needinfo?(chevobbe.nicolas)

Updated

2 years ago
Assignee: chevobbe.nicolas → nobody
(Reporter)

Comment 40

2 years ago
Looks like this went away around mid-July.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME

Updated

4 months ago
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.