Closed Bug 1567448 Opened 1 year ago Closed 1 year ago

Intermittent leakcheck | tab 509676 bytes leaked (APZEventState, AbstractThread, ActiveElementManager, AsyncFreeSnowWhite, AtomSet, ...)

Categories

(Core :: Panning and Zooming, defect, P5)

defect

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: botond)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [comment 17][stockwell unknown])

Attachments

(2 files)

Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=257360204&repo=mozilla-central
Full log: https://queue.taskcluster.net/v1/task/EwWhNpHQQ9mtJTgogfEszQ/runs/0/artifacts/public/logs/live_backing.log


[task 2019-07-19T10:57:08.570Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsStyleUIReset
[task 2019-07-19T10:57:08.571Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsStyleVisibility
[task 2019-07-19T10:57:08.572Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsStyleXUL
[task 2019-07-19T10:57:08.572Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 2 nsSupportsCString
[task 2019-07-19T10:57:08.573Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1231 nsTArray_base
[task 2019-07-19T10:57:08.574Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsTextEditorState
[task 2019-07-19T10:57:08.574Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 83 nsTextFragment
[task 2019-07-19T10:57:08.575Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 81 nsTextNode
[task 2019-07-19T10:57:08.575Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 2 nsThread
[task 2019-07-19T10:57:08.576Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsThreadPool
[task 2019-07-19T10:57:08.576Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 7 nsTimer
[task 2019-07-19T10:57:08.576Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 7 nsTimerImpl
[task 2019-07-19T10:57:08.577Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 11 nsURIHashKey
[task 2019-07-19T10:57:08.578Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsUUIDGenerator
[task 2019-07-19T10:57:08.578Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 5 nsVariant
[task 2019-07-19T10:57:08.579Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 38 nsWeakReference
[task 2019-07-19T10:57:08.580Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 2 nsWebBrowser
[task 2019-07-19T10:57:08.580Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsWindowWatcher
[task 2019-07-19T10:57:08.581Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXBLService
[task 2019-07-19T10:57:08.581Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsXPCComponents
[task 2019-07-19T10:57:08.582Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsXPCComponentsBase
[task 2019-07-19T10:57:08.582Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsXPCComponents_Classes
[task 2019-07-19T10:57:08.582Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 2 nsXPCComponents_Constructor
[task 2019-07-19T10:57:08.583Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXPCComponents_ID
[task 2019-07-19T10:57:08.584Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsXPCComponents_Interfaces
[task 2019-07-19T10:57:08.584Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsXPCComponents_Results
[task 2019-07-19T10:57:08.584Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 4 nsXPCComponents_Utils
[task 2019-07-19T10:57:08.585Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 11 nsXPCWrappedJS
[task 2019-07-19T10:57:08.586Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXPConnect
[task 2019-07-19T10:57:08.586Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXULPopupManager
[task 2019-07-19T10:57:08.587Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXULPrototypeCache
[task 2019-07-19T10:57:08.587Z] 10:57:08 INFO - TEST-INFO | leakcheck | tab leaked 7 xpc::CompartmentPrivate
[task 2019-07-19T10:57:08.588Z] 10:57:08 INFO - TEST-UNEXPECTED-FAIL | leakcheck | tab 509676 bytes leaked (APZEventState, AbstractThread, ActiveElementManager, AsyncFreeSnowWhite, AtomSet, ...)
[task 2019-07-19T10:57:08.588Z] 10:57:08 INFO -
[task 2019-07-19T10:57:08.589Z] 10:57:08 INFO - runtests.py | Running tests: end.

last test run: gfx/layers/apz/test/mochitest/test_wheel_transactions.html

Removing "regression" keyword from APZ intermittents for reasons laid out in bug 1571054 comment 0.

Keywords: regression

Botond, could you please take a look over this bug or redirect to someone? It has failed 191 times in the last 30 days and it's now on disable-recommended bugs list: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-07-20&endday=2019-08-19&tree=trunk&bug=1567448
Most often the leaks happen after gfx/layers/apz/test/mochitest/test_wheel_transactions.html
Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=262319202&repo=autoland&lineNumber=28012

Flags: needinfo?(botond)

The symptoms point to a window being leaked.

(In reply to Cosmin Sabou [:CosminS] from comment #9)

Most often the leaks happen after gfx/layers/apz/test/mochitest/test_wheel_transactions.html

I think that's just because the leak is reported at the end of the test chunk, and test_wheel_transactions.html is the last test to run in the chunk.

Andrew, do you have a suggestion for how to investigate this? Typically, I would look at the regressing patch and try to find the leak by inspection, but this is a low-frequency intermittent and we don't know the regressing patch.

Flags: needinfo?(botond) → needinfo?(continuation)

The typical way to begin investigating these kinds of leaks is to get a shutdown cycle collector log. Unfortunately, if the frequency is too low to be able to reproduce, it will be difficult to fix.

Flags: needinfo?(continuation)
Whiteboard: [stockwell disable-recommended] → [stockwell needswork]

Here is a Try push to try to collect a shutdown cycle collector log: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d4c31c5fac21ec75cebf352b7cd964a34b118d3

(In reply to Botond Ballo [:botond] from comment #12)

Here is a Try push to try to collect a shutdown cycle collector log: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d4c31c5fac21ec75cebf352b7cd964a34b118d3

That seems to successfully produce cycle collector log artifacts. Now, we just need to see if we can trigger the failure given enough retriggers...

Hi guys, is there an update here?

There are 178 failures in the last 30 days, mostly on linux64-qr debug, but also on macosx1014-64 debug and windows7-32.

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=263959541&repo=autoland&lineNumber=26998

[task 2019-08-28T23:05:30.511Z] 23:05:30 INFO - TEST-START | gfx/layers/apz/test/mochitest/test_wheel_transactions.html
[task 2019-08-28T23:05:30.527Z] 23:05:30 INFO - GECKO(2764) | [Parent 2764, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4457
[task 2019-08-28T23:05:30.607Z] 23:05:30 INFO - GECKO(2764) | [Parent 2764, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4457
[task 2019-08-28T23:05:30.623Z] 23:05:30 INFO - GECKO(2764) | ++DOMWINDOW == 22 (0x7f14b580c800) [pid = 2857] [serial = 328] [outer = 0x7f14b6775f20]
[task 2019-08-28T23:05:30.785Z] 23:05:30 INFO - GECKO(2764) | got MozAfterPaint: 0,0,0,0
[task 2019-08-28T23:05:30.785Z] 23:05:30 INFO - GECKO(2764) | Dispatching 0 onpaint listeners
[task 2019-08-28T23:05:30.833Z] 23:05:30 INFO - GECKO(2764) | WaitUntilApzStable: flushed APZ repaints in parent proc, waiting for callback...
[task 2019-08-28T23:05:30.833Z] 23:05:30 INFO - GECKO(2764) | WaitUntilApzStable: APZ flush done in parent proc
[task 2019-08-28T23:05:30.840Z] 23:05:30 INFO - GECKO(2764) | WaitUntilApzStable: got apz-flush-done in child proc
[task 2019-08-28T23:05:30.848Z] 23:05:30 INFO - GECKO(2764) | WaitUntilApzStable: done promiseFocus
[task 2019-08-28T23:05:30.848Z] 23:05:30 INFO - GECKO(2764) | done...
[task 2019-08-28T23:05:30.849Z] 23:05:30 INFO - GECKO(2764) | WaitUntilApzStable: done promiseAllPaintsDone
[task 2019-08-28T23:05:30.853Z] 23:05:30 INFO - GECKO(2764) | PromiseApzRepaintsFlushed: Flushed APZ repaints, waiting for callback...
[task 2019-08-28T23:05:30.853Z] 23:05:30 INFO - GECKO(2764) | PromiseApzRepaintsFlushed: APZ flush done
[task 2019-08-28T23:05:30.860Z] 23:05:30 INFO - GECKO(2764) | WaitUntilApzStable: all done
[task 2019-08-28T23:05:30.896Z] 23:05:30 INFO - GECKO(2764) | got MozAfterPaint: 0,0,0,0
[task 2019-08-28T23:05:30.897Z] 23:05:30 INFO - GECKO(2764) | Dispatching 0 onpaint listeners
[task 2019-08-28T23:05:30.985Z] 23:05:30 INFO - GECKO(2764) | got MozAfterPaint: 0,0,0,0
[task 2019-08-28T23:05:30.985Z] 23:05:30 INFO - GECKO(2764) | Dispatching 0 onpaint listeners
[task 2019-08-28T23:05:31.186Z] 23:05:31 INFO - GECKO(2764) | --DOMWINDOW == 21 (0x7f14b5b91400) [pid = 2857] [serial = 323] [outer = (nil)] [url = http://mochi.test:8888/tests/SimpleTest/iframe-between-tests.html]
[task 2019-08-28T23:05:32.356Z] 23:05:32 INFO - GECKO(2764) | got MozAfterPaint: 0,0,0,0
[task 2019-08-28T23:05:32.356Z] 23:05:32 INFO - GECKO(2764) | Dispatching 0 onpaint listeners
[task 2019-08-28T23:05:33.755Z] 23:05:33 INFO - GECKO(2764) | MEMORY STAT | vsize 2562MB | residentFast 179MB | heapAllocated 25MB
[task 2019-08-28T23:05:33.779Z] 23:05:33 INFO - TEST-OK | gfx/layers/apz/test/mochitest/test_wheel_transactions.html | took 3269ms
[task 2019-08-28T23:05:33.840Z] 23:05:33 INFO - GECKO(2764) | [Parent 2764, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4457
[task 2019-08-28T23:05:33.855Z] 23:05:33 INFO - GECKO(2764) | ++DOMWINDOW == 22 (0x7f14b586e400) [pid = 2857] [serial = 329] [outer = 0x7f14b6775f20]
[task 2019-08-28T23:05:33.930Z] 23:05:33 INFO - TEST-START | Shutdown
[task 2019-08-28T23:05:33.930Z] 23:05:33 INFO - Passed: 1131
[task 2019-08-28T23:05:33.937Z] 23:05:33 INFO - Failed: 0
[task 2019-08-28T23:05:33.938Z] 23:05:33 INFO - Todo: 8
[task 2019-08-28T23:05:33.938Z] 23:05:33 INFO - Mode: e10s

[task 2019-08-28T23:05:37.810Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 38 nsWeakReference
[task 2019-08-28T23:05:37.811Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 2 nsWebBrowser
[task 2019-08-28T23:05:37.811Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 1 nsWindowWatcher
[task 2019-08-28T23:05:37.812Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXBLService
[task 2019-08-28T23:05:37.812Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 3 nsXPCComponents
[task 2019-08-28T23:05:37.812Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 3 nsXPCComponentsBase
[task 2019-08-28T23:05:37.813Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 3 nsXPCComponents_Classes
[task 2019-08-28T23:05:37.813Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 2 nsXPCComponents_Constructor
[task 2019-08-28T23:05:37.814Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXPCComponents_ID
[task 2019-08-28T23:05:37.814Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 3 nsXPCComponents_Interfaces
[task 2019-08-28T23:05:37.815Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 3 nsXPCComponents_Results
[task 2019-08-28T23:05:37.815Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 3 nsXPCComponents_Utils
[task 2019-08-28T23:05:37.816Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 10 nsXPCWrappedJS
[task 2019-08-28T23:05:37.816Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXPConnect
[task 2019-08-28T23:05:37.817Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXULPopupManager
[task 2019-08-28T23:05:37.817Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 1 nsXULPrototypeCache
[task 2019-08-28T23:05:37.818Z] 23:05:37 INFO - TEST-INFO | leakcheck | tab leaked 6 xpc::CompartmentPrivate
[task 2019-08-28T23:05:37.818Z] 23:05:37 INFO - TEST-UNEXPECTED-FAIL | leakcheck | tab 484811 bytes leaked (APZEventState, AbstractThread, ActiveElementManager, AsyncFreeSnowWhite, AtomSet, ...)
[task 2019-08-28T23:05:37.819Z] 23:05:37 INFO -
[task 2019-08-28T23:05:37.819Z] 23:05:37 INFO - leakcheck | Processing leak log file /tmp/tmpvqYh0Y.mozrunner/runtests_leaks_gpu_pid2798.log

Should we disable test_wheel_transactions.html on linux64-qr to see if the failure rate goes down?

Flags: needinfo?(continuation)
Flags: needinfo?(botond)

I don't know if Botond's investigations have turned up anything, sorry.

Flags: needinfo?(continuation)

I did manage to trigger the failure on Try with cycle collector logging enabled:

https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=262770018&revision=4d4c31c5fac21ec75cebf352b7cd964a34b118d3

The logs are available as build artifacts in the job's "Job Details" pane.

Andrew also pointed me to instructions for how to analyze the logs. I just haven't had a chance to do so yet (a bit overloaded at the moment...)

I'll assign the bug to me and try to get to it as soon as I can.

(In reply to Andreea Pavel [:apavel] from comment #15)

Should we disable test_wheel_transactions.html on linux64-qr to see if the failure rate goes down?

Unfortunately, I don't think that will help. As mentioned, test_wheel_transactions.html shows up in the log right before the error because it's the last test to run in this test chunk and the leak is reported at the end of the test chunk. However, the leak could be in any test in the chunk, or even in utility code used by multiple tests.

Assignee: nobody → botond
Flags: needinfo?(botond)
Whiteboard: [stockwell disable-recommended] → [comment 17][stockwell needswork:owner]
Whiteboard: [comment 17][stockwell disable-recommended] → [comment 17][stockwell needswork:owner]

(In reply to Botond Ballo [:botond] from comment #17)

I did manage to trigger the failure on Try with cycle collector logging enabled:

https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=262770018&revision=4d4c31c5fac21ec75cebf352b7cd964a34b118d3

That doesn't appear to have cc/gc logs for the pid that leaks. Talked to Andrew, he suspects its because of the content process sandbox, so the process doesn't have privileges to write the file. There is the env var MOZ_DISABLE_CONTENT_SANDBOX to disable the sandbox, so I pushed with that to see if we get the right logs.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=70ddbd475ca10201a8dd77b45671aca373ff20e8

I ran find roots on the two cc logs (not sure the difference between the two) with nsGlobalWindow as the target. If I'm reading this correctly the window helper_overscroll_behavior_bug1425573.html is being leaked? Nothing really stands out about that test, it's pretty simple.

Am I reading that right? What are the next steps to take to figure this out?

Flags: needinfo?(continuation)

Seems like it is definitely helper_overscroll_behavior_bug1425573.html as fault. Disabling just that and I don't get any leaks.

Flags: needinfo?(continuation)

Added some logging to the test and apz utils functions it calls, we take the same path in both a failing and successful run.

The CC log output shows that the leak is happening via documents. Now that you've narrowed it down to a single test, you could try using command line option to mach to have it run the one test 10 times and see if that helps to reproduce it.

Ran the test 150 times ("open browser, run test, exit" being one time), and then 90 iterations of "open browser, run test 10x, exit". Never reproduced.

And running the entire directory fails locally, so I can't try that.

Realized this only happens on webrender and I wasn't enabling it. I'll try again.

Still no luck with webrender enabled.

One trick I've had luck with before is to make like 5 or more copies of the leaking test, add them all to the mochitest.ini file or whatever, comment out all of the other tests in the directory, then run the whole directory. I think some leaks probably don't trigger if the test runs too close to startup or shutdown.

Tried that, on mac, and then also on linux in a debug build that doesn't have opt enabled. No luck.

And tried with chaos mode too, no luck.

Do you have in mind a way to investigate the leak if you are able to reproduce it locally?

Is it something you could do in automation instead?

We had 40 failures in the past 7 days, all on linux64-qr debug.
Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=267689615&repo=autoland&lineNumber=27299

[task 2019-09-20T17:06:15.036Z] 17:06:15 INFO - TEST-UNEXPECTED-FAIL | leakcheck | tab 484987 bytes leaked (APZEventState, AbstractThread, ActiveElementManager, AsyncFreeSnowWhite, AtomSet, ...)
[task 2019-09-20T17:06:15.037Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.037Z] 17:06:15 INFO - leakcheck | Processing leak log file /tmp/tmpnlzOvi.mozrunner/runtests_leaks_gpu_pid3090.log
[task 2019-09-20T17:06:15.038Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.038Z] 17:06:15 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, gpu process 3090
[task 2019-09-20T17:06:15.039Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.039Z] 17:06:15 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
[task 2019-09-20T17:06:15.039Z] 17:06:15 INFO - | | Per-Inst Leaked| Total Rem|
[task 2019-09-20T17:06:15.040Z] 17:06:15 INFO - 0 |TOTAL | 34 0| 722602 0|
[task 2019-09-20T17:06:15.040Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.041Z] 17:06:15 INFO - nsTraceRefcnt::DumpStatistics: 325 entries
[task 2019-09-20T17:06:15.041Z] 17:06:15 INFO - TEST-PASS | leakcheck | gpu no leaks detected!
[task 2019-09-20T17:06:15.042Z] 17:06:15 INFO - leakcheck | Processing leak log file /tmp/tmpnlzOvi.mozrunner/runtests_leaks_tab_pid3126.log
[task 2019-09-20T17:06:15.042Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.042Z] 17:06:15 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, tab process 3126
[task 2019-09-20T17:06:15.043Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.043Z] 17:06:15 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
[task 2019-09-20T17:06:15.043Z] 17:06:15 INFO - | | Per-Inst Leaked| Total Rem|
[task 2019-09-20T17:06:15.044Z] 17:06:15 INFO - 0 |TOTAL | 43 0| 75662 0|
[task 2019-09-20T17:06:15.044Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.045Z] 17:06:15 INFO - nsTraceRefcnt::DumpStatistics: 789 entries
[task 2019-09-20T17:06:15.045Z] 17:06:15 INFO - TEST-PASS | leakcheck | tab no leaks detected!
[task 2019-09-20T17:06:15.045Z] 17:06:15 INFO - leakcheck | Processing leak log file /tmp/tmpnlzOvi.mozrunner/runtests_leaks_tab_pid3200.log
[task 2019-09-20T17:06:15.046Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.046Z] 17:06:15 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, tab process 3200
[task 2019-09-20T17:06:15.047Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.047Z] 17:06:15 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
[task 2019-09-20T17:06:15.048Z] 17:06:15 INFO - | | Per-Inst Leaked| Total Rem|
[task 2019-09-20T17:06:15.048Z] 17:06:15 INFO - 0 |TOTAL | 47 0| 23818 0|
[task 2019-09-20T17:06:15.048Z] 17:06:15 INFO -
[task 2019-09-20T17:06:15.049Z] 17:06:15 INFO - nsTraceRefcnt::DumpStatistics: 415 entries
[task 2019-09-20T17:06:15.049Z] 17:06:15 INFO - TEST-PASS | leakcheck | tab no leaks detected!
[task 2019-09-20T17:06:15.050Z] 17:06:15 INFO - runtests.py | Running tests: end.
[task 2019-09-20T17:06:15.050Z] 17:06:15 INFO - Buffered messages finished
[task 2019-09-20T17:06:15.051Z] 17:06:15 INFO - 0 INFO TEST-START | Shutdown
[task 2019-09-20T17:06:15.051Z] 17:06:15 INFO - 1 INFO Passed: 16813
[task 2019-09-20T17:06:15.051Z] 17:06:15 INFO - 2 INFO Failed: 0
[task 2019-09-20T17:06:15.052Z] 17:06:15 INFO - 3 INFO Todo: 1996
[task 2019-09-20T17:06:15.052Z] 17:06:15 INFO - 4 INFO Mode: e10s
[task 2019-09-20T17:06:15.053Z] 17:06:15 INFO - 5 INFO SimpleTest FINISHED
[task 2019-09-20T17:06:15.053Z] 17:06:15 INFO - Buffered messages finished
[task 2019-09-20T17:06:15.053Z] 17:06:15 INFO - SUITE-END | took 632s
[task 2019-09-20T17:06:18.500Z] 17:06:18 INFO - Return code: 0
[task 2019-09-20T17:06:18.502Z] 17:06:18 ERROR - Got 1 unexpected statuses
[task 2019-09-20T17:06:18.502Z] 17:06:18 INFO - TinderboxPrint: mochitest-mochitest-plain-chunked<br/>607/<em class="testfail">1</em>/1
[task 2019-09-20T17:06:18.503Z] 17:06:18 WARNING - # TBPL WARNING #
[task 2019-09-20T17:06:18.503Z] 17:06:18 WARNING - setting return code to 1
[task 2019-09-20T17:06:18.503Z] 17:06:18 WARNING - The mochitest suite: mochitest-plain-chunked ran with return status: WARNING
[task 2019-09-20T17:06:18.504Z] 17:06:18 INFO - Running post-action listener: _package_coverage_data
[task 2019-09-20T17:06:18.504Z] 17:06:18 INFO - Running post-action listener: _resource_record_post_action
[task 2019-09-20T17:06:18.504Z] 17:06:18 INFO - Running post-action listener: process_java_coverage_data
[task 2019-09-20T17:06:18.505Z] 17:06:18 INFO - [mozharness: 2019-09-20 17:06:18.504711Z] Finished run-tests step (success)
[task 2019-09-20T17:06:18.505Z] 17:06:18 INFO - Running post-run listener: _resource_record_post_run
[task 2019-09-20T17:06:18.625Z] 17:06:18 INFO - Validating Perfherder data against /builds/worker/workspace/mozharness/external_tools/performance-artifact-schema.json
[task 2019-09-20T17:06:18.629Z] 17:06:18 INFO - PERFHERDER_DATA: {"framework": {"name": "job_resource_usage"}, "suites": [{"subtests": [{"name": "cpu_percent", "value": 82.41427469135809}, {"name": "io_write_bytes", "value": 1461362688}, {"name": "io.read_bytes", "value": 44761088}, {"name": "io_write_time", "value": 218688}, {"name": "io_read_time", "value": 1160}], "extraOptions": ["e10s", "taskcluster-m5.large"], "name": "mochitest.mochitest-plain-chunked.12.overall"}, {"subtests": [{"name": "time", "value": 11.687956094741821}, {"name": "cpu_percent", "value": 50.25454545454546}], "name": "mochitest.mochitest-plain-chunked.12.install"}, {"subtests": [{"name": "time", "value": 0.00025081634521484375}], "name": "mochitest.mochitest-plain-chunked.12.stage-files"}, {"subtests": [{"name": "time", "value": 637.204668045044}, {"name": "cpu_percent", "value": 83.02036163522018}], "name": "mochitest.mochitest-plain-chunked.12.run-tests"}]}
[task 2019-09-20T17:06:18.629Z] 17:06:18 INFO - Total resource usage - Wall time: 649s; CPU: 83.0%; Read bytes: 44761088; Write bytes: 1461362688; Read time: 1160; Write time: 218688
[task 2019-09-20T17:06:18.630Z] 17:06:18 INFO - TinderboxPrint: CPU usage<br/>83.0%
[task 2019-09-20T17:06:18.630Z] 17:06:18 INFO - TinderboxPrint: I/O read bytes / time<br/>44,761,088 / 1,160
[task 2019-09-20T17:06:18.630Z] 17:06:18 INFO - TinderboxPrint: I/O write bytes / time<br/>1,461,362,688 / 218,688
[task 2019-09-20T17:06:18.630Z] 17:06:18 INFO - TinderboxPrint: CPU idle<br/>218.4 (16.9%)
[task 2019-09-20T17:06:18.631Z] 17:06:18 INFO - TinderboxPrint: CPU system<br/>114.7 (8.9%)
[task 2019-09-20T17:06:18.631Z] 17:06:18 INFO - TinderboxPrint: CPU user<br/>947.1 (73.5%)
[task 2019-09-20T17:06:18.631Z] 17:06:18 INFO - TinderboxPrint: Swap in / out<br/>0 / 0
[task 2019-09-20T17:06:18.632Z] 17:06:18 INFO - install - Wall time: 12s; CPU: 50.0%; Read bytes: 0; Write bytes: 21741568; Read time: 0; Write time: 5396
[task 2019-09-20T17:06:18.633Z] 17:06:18 INFO - stage-files - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2019-09-20T17:06:18.636Z] 17:06:18 INFO - run-tests - Wall time: 637s; CPU: 83.0%; Read bytes: 43024384; Write bytes: 1439621120; Read time: 1144; Write time: 213292
[task 2019-09-20T17:06:18.730Z] 17:06:18 WARNING - returning nonzero exit status 1
[task 2019-09-20T17:06:18.753Z] cleanup
[task 2019-09-20T17:06:18.753Z] + cleanup
[task 2019-09-20T17:06:18.753Z] + local rv=1
[task 2019-09-20T17:06:18.753Z] + [[ -s /builds/worker/.xsession-errors ]]
[task 2019-09-20T17:06:18.754Z] + cp /builds/worker/.xsession-errors /builds/worker/artifacts/public/xsession-errors.log
[task 2019-09-20T17:06:18.755Z] + true
[task 2019-09-20T17:06:18.756Z] + cleanup_xvfb
[task 2019-09-20T17:06:18.756Z] pidof Xvfb
[task 2019-09-20T17:06:18.756Z] ++ pidof Xvfb
[task 2019-09-20T17:06:18.759Z] + local xvfb_pid=24
[task 2019-09-20T17:06:18.759Z] + local vnc=false
[task 2019-09-20T17:06:18.759Z] + local interactive=false
[task 2019-09-20T17:06:18.759Z] + '[' -n 24 ']'
[task 2019-09-20T17:06:18.760Z] + [[ false == false ]]
[task 2019-09-20T17:06:18.760Z] + [[ false == false ]]
[task 2019-09-20T17:06:18.760Z] + kill 24
[task 2019-09-20T17:06:18.775Z] XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
[task 2019-09-20T17:06:18.775Z] after 5281 requests (5281 known processed) with 0 events remaining.
[task 2019-09-20T17:06:18.775Z] compizconfig - Info: Backend : ini
[task 2019-09-20T17:06:18.775Z] compizconfig - Info: Integration : true
[task 2019-09-20T17:06:18.775Z] compizconfig - Info: Profile : default
[task 2019-09-20T17:06:18.775Z] + screen -XS xvfb quit
[task 2019-09-20T17:06:18.918Z] No screen session found.
[task 2019-09-20T17:06:18.919Z] + true
[task 2019-09-20T17:06:18.919Z] + exit 1
[taskcluster 2019-09-20 17:06:19.182Z] === Task Finished ===
[taskcluster 2019-09-20 17:06:23.973Z] Unsuccessful task run with exit code: 1 completed in 942.426 seconds

Flags: needinfo?(botond)

(In reply to Andrew McCreight [:mccr8] (PTO 9/16-9/23) from comment #23)

The CC log output shows that the leak is happening via documents. Now that you've narrowed it down to a single test, you could try using command line option to mach to have it run the one test 10 times and see if that helps to reproduce it.

Andrew, what would be your suggested way to track down the issue if we are able to reproduce it locally?

(We haven't been able to, I'm just trying to see if it's possible to adapt the strategy to automation where we can repro it.)

Flags: needinfo?(botond) → needinfo?(continuation)

Anyways, as Timothy has narrowed down the issue to one test, we can disable it until we resolve the leak, to spare the sheriffs from having to star a frequent intermittent.

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3b859414dc3b
Disable helper_overscroll_behavior_bug1425573.html until we track down the memory leak it's causing. r=tnikkel
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71

(In reply to Botond Ballo [:botond] from comment #37)

Andrew, what would be your suggested way to track down the issue if we are able to reproduce it locally?

This will let you find the missing references from the cycle collector log:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD/Heap_Scan_Mode

Note that this is actually just all based on logging (the basic idea is you dump the full contents of memory plus allocation stacks for every block of memory, late in shutdown, and correlate that with the CC logs), so it should be possible to get it going in automation, and then download the logs locally and investigate them. Symbolification of stacks will presumably require some extra step of downloading build artifacts.

Flags: needinfo?(continuation)
Blocks: 1583955
You need to log in before you can comment on or make changes to this bug.