Closed
Bug 1324592
Opened 8 years ago
Closed 8 years ago
Windows gamepad mochitests perma-fail with leaks, but do not turn the job orange
Categories
(Core :: DOM: Device Interfaces, defect)
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox51 | --- | unaffected |
firefox52 | + | fixed |
firefox53 | + | fixed |
People
(Reporter: gbrown, Assigned: cleu)
References
Details
Attachments
(2 files)
3.43 KB,
patch
|
cleu
:
review+
jcristau
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
1.61 KB,
patch
|
ahal
:
review+
|
Details | Diff | Splinter Review |
https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-win32-debug/1482159878/mozilla-central_win7_vm-debug_test-mochitest-3-bm129-tests1-windows-build81.txt.gz
09:27:00 INFO - TEST-OK | dom/tests/mochitest/gamepad/test_navigator_gamepads.html | took 192ms
09:27:00 INFO - ++DOMWINDOW == 50 (12002800) [pid = 3512] [serial = 50] [outer = 1B3D2C00]
09:27:00 INFO - [3512] WARNING: An event was posted to a thread that will never run it (rejected): file c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/xpcom/threads/nsThread.cpp, line 695
09:27:00 INFO - TEST-START | Shutdown
09:27:00 INFO - Passed: 46
09:27:00 INFO - Failed: 0
09:27:00 INFO - Todo: 0
09:27:00 INFO - Mode: non-e10s
09:27:00 INFO - Slowest: 1628ms - /tests/dom/tests/mochitest/gamepad/test_check_timestamp.html
09:27:00 INFO - SimpleTest FINISHED
09:27:00 INFO - TEST-INFO | Ran 1 Loops
09:27:00 INFO - SimpleTest FINISHED
...
09:27:06 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 3512
09:27:06 INFO -
09:27:06 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
09:27:06 INFO - | | Per-Inst Leaked| Total Rem|
09:27:06 INFO - 0 |TOTAL | 20 3572| 1782633 86|
09:27:06 INFO - 84 |CancelableRunnable | 24 144| 3362 6|
09:27:06 INFO - 122 |CondVar | 24 312| 398 13|
09:27:06 INFO - 195 |DelayedRunnable | 72 432| 33 6|
09:27:06 INFO - 318 |IdlePeriod | 12 72| 66 6|
09:27:06 INFO - 409 |Mutex | 20 140| 1614 7|
09:27:06 INFO - 552 |Runnable | 20 360| 11984 18|
09:27:06 INFO - 1281 |nsTArray_base | 4 48| 419425 12|
09:27:06 INFO - 1289 |nsThread | 200 1200| 65 6|
09:27:06 INFO - 1293 |nsTimer | 16 96| 989 6|
09:27:06 INFO - 1294 |nsTimerImpl | 128 768| 989 6|
09:27:06 INFO -
09:27:06 INFO - nsTraceRefcnt::DumpStatistics: 1396 entries
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 6 CancelableRunnable
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 13 CondVar
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 6 DelayedRunnable
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 6 IdlePeriod
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 7 Mutex
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 18 Runnable
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 12 nsTArray_base
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsThread
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsTimer
09:27:06 INFO - TEST-INFO | leakcheck | default process: leaked 6 nsTimerImpl
09:27:06 WARNING - TEST-UNEXPECTED-FAIL | leakcheck | default process: 3572 bytes leaked (CancelableRunnable, CondVar, DelayedRunnable, IdlePeriod, Mutex, ...)
...
09:33:41 ERROR - Return code: 1
09:33:41 INFO - TinderboxPrint: mochitest-plain-chunked<br/>1964/0/0
09:33:41 WARNING - # TBPL SUCCESS #
09:33:41 WARNING - The mochitest suite: plain-chunked ran with return status: SUCCESS
Comment 1•8 years ago
|
||
odd, we get a return code=1 and still get tbpl success- this might be in mozharness or some regex in treeherder?
![]() |
Reporter | |
Comment 2•8 years ago
|
||
See https://bugzilla.mozilla.org/show_bug.cgi?id=1323812#c8.
:jgraham -- You seem to be able to sort out these structured logging issues better than most. Can you take a look?
Flags: needinfo?(james)
![]() |
Reporter | |
Comment 3•8 years ago
|
||
As for the actual gamepad leak, it looks like it was introduced on central in https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=8d8846f63b74eb930e48b410730ae088e9bdbee8 and on aurora in https://treeherder.mozilla.org/#/jobs?repo=mozilla-aurora&revision=a065f231fb79a48683b37c4cdb5136d7f14fd560; it is not on beta.
Comment 4•8 years ago
|
||
We had a similar problem (bug 1321127). Is there anything we can do to prevent problems like this from happen in the future? Otherwise we should backout bug 1261199. Unreliable test results will bring a serious quality concern into our products.
![]() |
Reporter | |
Comment 5•8 years ago
|
||
The gamepad leak started with https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=313bd77f09009341a35b45bc75866a57f02907b9, making it a regression from bug 1318839 (or, less likely, bug 1312236).
:lenzak800 -- It looks like bug 1318839 caused a leak in gamepad tests, which has been overlooked for all this time because of a problem in our test infrastructure. We'll need those leaks fixed or your patch backed out before we fix the infrastructure problem.
Flags: needinfo?(cleu)
![]() |
Reporter | |
Comment 6•8 years ago
|
||
(You can see the leaks in logs for Windows Debug mochitest-3, following the gamepad tests, but the jobs are unexpectedly green.)
Assignee | ||
Comment 7•8 years ago
|
||
OK, backout them if needed, I will try to figure out how to fix it.
Flags: needinfo?(cleu)
Comment 8•8 years ago
|
||
So the problem here is that mochitest is still logging leaks using an warning message with the string TEST-UNEXPECTED-FAIL, but mozharness no longer parses that out of the log. So we need the same fix as in the reftest assert bug where we add this to the structured logs somehow.
I think we should add a "leak" action to the structured logs with fields like type/count/url/docshellIds (with the latter being a (pid, id) tuple). Laternatively if we don't want all the information we could just add a generic "message" field. There is also some logging that should be upgraded from "warning" to "error" about failing to parse the line.
Testing this stuff is a little complex. We could perhaps write some tests that are known to leak/assert/etc. and then ensure that mozharness signals treeherder in the correct way. But we don't have any harness for writing that kind of test so it's not a trivial project. Might be a good outreachy idea, or similar.
Flags: needinfo?(james)
Comment 9•8 years ago
|
||
Then we should audit the string TEST-UNEXPECTED-FAIL in the tree?
https://dxr.mozilla.org/mozilla-central/search?q=TEST-UNEXPECTED-FAIL&redirect=false
Comment 10•8 years ago
|
||
Yes, but it's not as simple as "that string should no longer appear in the source tree (except in mozlog formatters)" since aiui not every test harness has moved over to structured logging. But it shouldn't appear in reftests or mochitests anymore.
Comment 11•8 years ago
|
||
Oh and anything logged as error or above should turn the job orange regardless of the string, so TEST-UNEXPECTED-FAIL is OK in those cases too.
Comment 12•8 years ago
|
||
(In reply to Masatoshi Kimura [:emk] from comment #4)
> We had a similar problem (bug 1321127). Is there anything we can do to
> prevent problems like this from happen in the future? Otherwise we should
> backout bug 1261199. Unreliable test results will bring a serious quality
> concern into our products.
This is a mochitest issue, bug 1261199 is for reftest so please don't back that out :). Also, you are assuming that there was not a "serious quality concern" prior to structured logging. When we switched to using the StructuredOutputParser, we probably discovered and fixed around 5-6 issues that were failing but not turning jobs orange. While there have been some unfortunate growing pains, structured logging has actually been a large improvement over its unstructured counterpart in terms of properly flagging errors.
That being said, yes, we should have better test suites for the test harnesses themselves to catch stuff like this. As jgraham mentioned, testing this kind of stuff can be tricky.
Comment 13•8 years ago
|
||
(In reply to Michael Leu[:lenzak800](UTC+8) from comment #7)
> OK, backout them if needed, I will try to figure out how to fix it.
You have a bit of time, no need to backout until there is a patch here ready to land which likely won't happen until sometime after the holidays.
Comment 14•8 years ago
|
||
(In reply to James Graham [:jgraham] from comment #8)
> I think we should add a "leak" action to the structured logs with fields
> like type/count/url/docshellIds (with the latter being a (pid, id) tuple).
> Laternatively if we don't want all the information we could just add a
> generic "message" field. There is also some logging that should be upgraded
> from "warning" to "error" about failing to parse the line.
I agree that this probably warrants a new action. But due to the holidays and the fact that getting said action right might take a bit of time, I propose we file a follow-up and close out this bug by turning that WARNING into an ERROR. This should at least make sure we don't get any new leaks while we figure out a more proper solution.
Assignee | ||
Comment 15•8 years ago
|
||
Hi Geoff, I have submitted a new patch for bug1318839 and pushed it to try server, I didn't see the warning anymore, is that mean it mitigates the leak?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e9e42191c9ac26f3fd8b7379c303932a9924d75&selectedJob=33304458
Flags: needinfo?(gbrown)
![]() |
||
Comment 17•8 years ago
|
||
[Tracking Requested - why for this release]: Leaks are bad
Are we sure this is not a problem on beta? Bug 1318839 claims it's fixed on 51.
tracking-firefox51:
--- → ?
tracking-firefox52:
--- → ?
tracking-firefox53:
--- → ?
Flags: needinfo?(cleu)
Comment 18•8 years ago
|
||
This leak exists on aurora:
https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-aurora-win32-debug/1483024040/mozilla-aurora_win7_vm-debug_test-mochitest-3-bm129-tests1-windows-build1.txt.gz
But I don't see it on beta. The leak detection in mochitest regressed Oct. 26th, which is when 52 was central. So there shouldn't be any mochitest leaks on 51.
status-firefox51:
--- → unaffected
status-firefox52:
--- → affected
status-firefox53:
--- → affected
tracking-firefox51:
? → ---
Assignee | ||
Comment 19•8 years ago
|
||
Assignee: nobody → cleu
Flags: needinfo?(cleu)
Assignee | ||
Updated•8 years ago
|
Attachment #8822534 -
Flags: review+
Assignee | ||
Comment 20•8 years ago
|
||
Here is try
https://treeherder.mozilla.org/#/jobs?repo=try&revision=81bdf744578c382f62b91da5795ffa672a6ae7e8
Want to ensure nothing is wrong, so run it again with the rebased patch.
Assignee | ||
Updated•8 years ago
|
Keywords: checkin-needed
Updated•8 years ago
|
Component: General → DOM: Device Interfaces
Product: Testing → Core
Version: Version 3 → 52 Branch
Comment 21•8 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ee66c0cbd058
Use nsITimer to trigger DirectInput polling instead of DelayedDispatch. r=qdot
Keywords: checkin-needed
Comment 22•8 years ago
|
||
Don't forget to remove this bug from the whitelist:
https://dxr.mozilla.org/mozilla-central/rev/31ffcb82ced81bb75faa800d2b7e883a3761a03b/testing/mochitest/runtests.py#2222
Comment 23•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 25•8 years ago
|
||
Yes, this needs to land on 52 as well.
Updated•8 years ago
|
Comment 27•8 years ago
|
||
Michael, please fill in the template for uplift to 52. Thanks!
Flags: needinfo?(cleu)
Comment 28•8 years ago
|
||
This also still needs the whitelisting removed, as comment 22 said.
Assignee | ||
Comment 29•8 years ago
|
||
Comment on attachment 8822534 [details] [diff] [review]
Bug1324592 - Using nsITimer to trigger DirectInput polling instead of DelayedDispatch r=qdot
Review of attachment 8822534 [details] [diff] [review]:
-----------------------------------------------------------------
[Feature/regressing bug #]: Gamepad mochitest has memory leak upon gecko shutdown under Windows backend
[User impact if declined]: Leaks are bad
[Describe test coverage new/current, TBPL]: Hand test with physical gamepad and tryserver for checking memory leak
[Risks and why]: The risk is low, it uses nsITimer to substitude nsIThread->DelayedDispatch to do same behavior
[String/UUID change made/needed]: none
Attachment #8822534 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(cleu)
Comment 30•8 years ago
|
||
and here is a try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=359ac2890f7841c6c497131d508c69046443373b
Attachment #8824392 -
Flags: review?(ahalberstadt)
Comment 31•8 years ago
|
||
Comment on attachment 8824392 [details] [diff] [review]
gamepad_leak.patch
Review of attachment 8824392 [details] [diff] [review]:
-----------------------------------------------------------------
Lgtm, this should also land on 52 after the fix does.
Attachment #8824392 -
Flags: review?(ahalberstadt) → review+
Comment 32•8 years ago
|
||
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1e28310fb3e5
enable leak checking for gamepad mochitests. r=ahal
Comment 33•8 years ago
|
||
bugherder |
Comment 34•8 years ago
|
||
Comment on attachment 8822534 [details] [diff] [review]
Bug1324592 - Using nsITimer to trigger DirectInput polling instead of DelayedDispatch r=qdot
gamepad memory leak fix, aurora52+
Attachment #8822534 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 35•8 years ago
|
||
bugherder uplift |
This needed a lint fix because of some rebasing between m-c and aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/9a50a9d02d35491c28b60cca9ede974e7e4da200
Comment 37•8 years ago
|
||
Backed out in https://hg.mozilla.org/releases/mozilla-aurora/rev/27fec05e14c908974d37eb9529e2abb6057c3057 for Windows leaking in the gamepad directory, https://treeherder.mozilla.org/logviewer.html#?job_id=67475301&repo=mozilla-aurora
Comment 38•8 years ago
|
||
(In reply to Phil Ringnalda (:philor) from comment #37)
> Backed out in
> https://hg.mozilla.org/releases/mozilla-aurora/rev/
> 27fec05e14c908974d37eb9529e2abb6057c3057 for Windows leaking in the gamepad
> directory,
> https://treeherder.mozilla.org/logviewer.html#?job_id=67475301&repo=mozilla-
> aurora
Flags: needinfo?(cleu)
Assignee | ||
Comment 39•8 years ago
|
||
That's quite strange, why same patch leak in only 52 instead of both 52 and 53....
I'll try to figure it out.
Flags: needinfo?(cleu)
Comment 40•8 years ago
|
||
(In reply to Michael Leu[:lenzak800](UTC+8) from comment #39)
> That's quite strange, why same patch leak in only 52 instead of both 52 and 53....
The problem is that the leak fix did not land on Aurora, only the un-whitelisting:
https://hg.mozilla.org/releases/mozilla-aurora/rev/d1dd912cac5b
Both patches need to land there.
Keywords: checkin-needed
Whiteboard: [checkin-needed Aurora][land both patches]
Comment 41•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/e7bab2d3d88a
https://hg.mozilla.org/releases/mozilla-aurora/rev/1099916bc52b
Comment 42•8 years ago
|
||
bugherder uplift |
You need to log in
before you can comment on or make changes to this bug.
Description
•