Closed Bug 1323812 Opened 8 years ago Closed 8 years ago

leak in gamepad mochitests when running in taskcluster win7 environment

Categories

(Core :: General, defect)

defect
Not set
normal

RESOLVED DUPLICATE of bug 1324592

People

(Reporter: jmaher, Assigned: gbrown)

18:37:58 INFO - TEST-INFO | leakcheck | plugin process: leak threshold set at 0 bytes
18:37:58 INFO - TEST-INFO | leakcheck | tab process: leak threshold set at 10000 bytes
18:37:58 INFO - TEST-INFO | leakcheck | geckomediaplugin process: leak threshold set at 20000 bytes
18:37:58 INFO - TEST-INFO | leakcheck | gpu process: leak threshold set at 0 bytes
18:37:58 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 2988
18:37:58 INFO -      |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
18:37:58 INFO -      |                                      | Per-Inst   Leaked|   Total      Rem|
18:37:58 INFO -    0 |TOTAL                                 |       20     3572| 1626053       86|
18:37:58 INFO -   84 |CancelableRunnable                    |       24      144|    2639        6|
18:37:58 INFO -  122 |CondVar                               |       24      312|     391       13|
18:37:58 INFO -  195 |DelayedRunnable                       |       72      432|      20        6|
18:37:58 INFO -  318 |IdlePeriod                            |       12       72|      64        6|
18:37:58 INFO -  410 |Mutex                                 |       20      140|    1589        7|
18:37:58 INFO -  552 |Runnable                              |       20      360|   10821       18|
18:37:58 INFO - 1275 |nsTArray_base                         |        4       48|  386797       12|
18:37:58 INFO - 1283 |nsThread                              |      200     1200|      63        6|
18:37:58 INFO - 1287 |nsTimer                               |       16       96|     879        6|
18:37:58 INFO - 1288 |nsTimerImpl                           |      128      768|     879        6|
18:37:58 INFO - nsTraceRefcnt::DumpStatistics: 1389 entries
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 CancelableRunnable
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 13 CondVar
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 DelayedRunnable
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 IdlePeriod
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 7 Mutex
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 18 Runnable
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 12 nsTArray_base
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsThread
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsTimer
18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsTimerImpl
18:37:58 WARNING - TEST-UNEXPECTED-FAIL | leakcheck | default process: 3572 bytes leaked (CancelableRunnable, CondVar, DelayedRunnable, IdlePeriod, Mutex, ...)

This comes from the /tests/dom/tests/mochitest/gamepad directory. You can see this leak on both e10s and non-e10s:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=30155ac5b04f8a6e597eed5260d670387b288524&filter-tier=1&filter-tier=2&filter-tier=3&group_state=expanded&selectedJob=32759345&filter-searchStr=mochitest%203

This environment is different from the one we run in for buildbot win7: we have a fresh configuration of the machine, which means there are slightly different versions of tools, system libraries, etc.

Looking at the manifest, there are 6 tests; one or more of those could be the culprit:
https://dxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/gamepad/mochitest.ini
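As a sanity check on the numbers in the log, here is a minimal Python sketch that recomputes the leaked bytes from the BloatView rows. The per-instance sizes and remaining object counts are copied from the table above; the names `BLOAT_ROWS` and `leaked_bytes` are made up for illustration and are not part of any harness.

```python
# Per-instance size in bytes and remaining (leaked) object count for each
# class, copied from the BloatView table in the log above.
BLOAT_ROWS = {
    "CancelableRunnable": (24, 6),
    "CondVar": (24, 13),
    "DelayedRunnable": (72, 6),
    "IdlePeriod": (12, 6),
    "Mutex": (20, 7),
    "Runnable": (20, 18),
    "nsTArray_base": (4, 12),
    "nsThread": (200, 6),
    "nsTimer": (16, 6),
    "nsTimerImpl": (128, 6),
}


def leaked_bytes(rows):
    """Return leaked bytes per class: per-instance size times remaining objects."""
    return {name: size * count for name, (size, count) in rows.items()}


per_class = leaked_bytes(BLOAT_ROWS)
total_bytes = sum(per_class.values())
total_objects = sum(count for _, count in BLOAT_ROWS.values())

for name, leaked in sorted(per_class.items(), key=lambda item: -item[1]):
    print(f"{name:<20} {leaked:>5}")
print(f"{'TOTAL':<20} {total_bytes:>5} bytes in {total_objects} objects")
```

The per-class products match the Leaked column, and the sums match the TOTAL row (3572 bytes, 86 objects), so the ten listed classes account for the whole reported leak.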
Looking at the logs for the gamepad API on buildbot vs. taskcluster, I don't see anything that jumps out. Narrowing this down to the test case that is causing problems, and then looking at the code or doing other debugging, will probably help.
Assignee: nobody → gbrown
It seems that each of the gamepad tests produces leaks; definitely test_check_timestamp.html, test_gamepad_hidden_frame.html, and test_navigator_gamepads.html. For example, https://treeherder.mozilla.org/#/jobs?repo=try&revision=5b8d370ac7247b93e6c402bca40ce83e22648d62.
I took a look at test_check_timestamp.html out of curiosity:
https://dxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/gamepad/test_check_timestamp.html?q=path%3Atest_check_timestamp.html&redirect_type=single

I see that we add a listener but never remove it. It is also a bit unclear whether we add more than one gamepad and whether we are really removing the proper gamepad. There is also mock_gamepad.js:
https://dxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/gamepad/mock_gamepad.js?q=path%3Amock_gamepad.js&redirect_type=single

I suspect there is more to investigate before coming to conclusions. I do wonder if this reproduces locally. Glad to see each test checked individually, and sad to see so many failing; that hints at a common flaw in either the gamepad API or the way the tests are written.
Thanks Joel! The listener is part of the problem. Once that's resolved, there is still a leak of 1 GamepadPlatformService (based on a simplified test_check_timestamp.html).
`mock_gamepad.js` is used because we don't have real gamepad devices on the testers. I'm not sure why the event listener would cause a leak; surely it'd just get GCed after the test page is unloaded? It's not surprising that this would reproduce with any of the gamepad tests, since they're all doing fundamentally the same stuff. One possible cause here is if the Taskcluster machines have an xinput dll available and the buildbot ones don't. We spin up an extra thread to poll XInput when we can load an xinput dll.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #5)
> One thing that's a possible
> cause here is if the Taskcluster machines have an xinput dll available and
> the buildbot ones don't. We spin up an extra thread to poll XInput when we
> can load an xinput dll.

Thanks Ted, good idea, but I found 'xinput9_1_0.dll' is loaded on both taskcluster and buildbot. Looking closer, I could find virtually no difference between taskcluster and buildbot...and then I noticed, these tests are failing with the same leaks on buildbot -- they just don't turn the job orange!! Grrr.
Is there a reason this leak is not reported as orange on buildbot? Just 2 weeks ago I realized that crashtests were perma-orange on buildbot but not on taskcluster; this reduces my confidence in our reporting :( At the very least, let's get a bug on file, ideally one that has information about why this is happening, so we can get it fixed.
Depends on: 1324592
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b4835e3edd9121d832023280396ad4b1dcd02ff9, with https://hg.mozilla.org/try/rev/be6d942382bea7a50bf395d6bcc1111f301a4bb0, demonstrates the difference between the bb and tc cases: the mozharness configurations have diverged, and when the divergence is rectified, jobs are green (despite the leak) on both bb and tc. It looks like leak failures go undetected when structured logging is used.
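To spot jobs that stay green despite a leak, one option is to scan the raw log for the leakcheck failure line independently of the reported job result. The following is only a rough sketch: the regular expression is modelled on the TEST-UNEXPECTED-FAIL line quoted in this bug, and `find_leak_failures`, `is_green_but_leaky`, and the "success" result string are hypothetical names, not mozharness or treeherder APIs.

```python
import re

# Modelled on the failure line quoted in this bug, e.g.
# "TEST-UNEXPECTED-FAIL | leakcheck | default process: 3572 bytes leaked (...)"
LEAK_FAIL = re.compile(
    r"TEST-UNEXPECTED-FAIL \| leakcheck \| (?P<process>.+?) process: "
    r"(?P<bytes>\d+) bytes leaked"
)


def find_leak_failures(log_text):
    """Return a list of (process name, leaked bytes) found in the log text."""
    return [
        (match.group("process"), int(match.group("bytes")))
        for match in LEAK_FAIL.finditer(log_text)
    ]


def is_green_but_leaky(job_result, log_text):
    """True if the job reported success although the log has a leak failure.

    job_result is a hypothetical status string; "success" stands for green.
    """
    return job_result == "success" and bool(find_leak_failures(log_text))


sample_log = (
    "18:37:58 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsThread\n"
    "18:37:58 WARNING - TEST-UNEXPECTED-FAIL | leakcheck | default process: "
    "3572 bytes leaked (CancelableRunnable, CondVar, DelayedRunnable, "
    "IdlePeriod, Mutex, ...)\n"
)

failures = find_leak_failures(sample_log)
print(failures)
print(is_green_but_leaky("success", sample_log))
```

A check like this only reads the text log, so it does not depend on whichever code path decides the job colour.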
Depends on: 1325136
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE