Closed
Bug 1251893
Opened 10 years ago
Closed 9 years ago
Leak stacks don't work on Taskcluster test runs
Categories
(Taskcluster :: Services, defect)
Taskcluster
Services
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla48
People
(Reporter: RyanVM, Assigned: armenzg)
References
Details
Attachments
(1 file)
https://treeherder.mozilla.org/logviewer.html#?job_id=17319342&repo=try
20:20:43 INFO - allocation stack:
20:20:43 INFO - #00: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #01: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #02: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #03: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #04: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #05: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #06: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #07: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #08: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #09: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
20:20:43 INFO - #10: start_thread (/build/eglibc-rrybNj/eglibc-2.15/nptl/pthread_create.c:308)
20:20:43 INFO - #11: clone (/build/eglibc-rrybNj/eglibc-2.15/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:114)
20:20:43 INFO - 14 @0x7f2e0bc67670 (0 references; 0 from COMPtrs)
| Reporter | ||
Comment 1•10 years ago
|
||
To be clear, these stacks are typically generated when adding |"--setenv=XPCOM_MEM_LOG_CLASSES=<the leaking class you want a stack for>"| to the appropriate mozharness test config in the tree.
An example of this in action - https://hg.mozilla.org/try/rev/4689074c7b22 - which gives a log like https://treeherder.mozilla.org/logviewer.html#?job_id=17324551&repo=try#L74993 when working correctly.
| Reporter | ||
Updated•10 years ago
|
| Assignee | ||
Comment 2•10 years ago
|
||
Where can I learn about how these stacks work? or make them work?
Flags: needinfo?(ted)
Comment 3•10 years ago
|
||
I don't think the leak log bit is super relevant (despite the way the bug is filed). I think the bug is that the stack hasn't been symbolized, which is supposed to happen inside the Mochitest harness. It should be using the same code we use to get symbols for assertion stacks:
https://dxr.mozilla.org/mozilla-central/rev/5e0140b6d11821e0c2a2de25bc5431783f03380a/testing/mochitest/runtests.py#2517
The actual implementation of that is in mozrunner:
https://dxr.mozilla.org/mozilla-central/rev/5e0140b6d11821e0c2a2de25bc5431783f03380a/testing/mozbase/mozrunner/mozrunner/utils.py#246
Flags: needinfo?(ted)
Comment 4•10 years ago
|
||
This should be simpler to test by making a test cause an assertion. You can do that by calling a method on the nsIDebug interface, like:
https://dxr.mozilla.org/mozilla-central/rev/5e0140b6d11821e0c2a2de25bc5431783f03380a/testing/xpcshell/selftest.py#539
Comment 5•10 years ago
|
||
Moving to TaskCluster component...
Component: TaskCluster → Integration
Product: Testing → Taskcluster
Version: Version 3 → unspecified
Comment 6•9 years ago
|
||
:armenzg, this one issue seemed to have fallen through the cracks, can you pick this up and help get this solved?
Flags: needinfo?(armenzg)
| Assignee | ||
Comment 8•9 years ago
|
||
Let me try to rephrase this, in this push [1] we can see logs of jobs that should have a meaningful allocation stacks but don't [2]
In one type of jobs we download the symbols on deman while on the Buildbot ones we don't.
I see that get_stack_fixer() takes a symbolsPath. This could be related [3]
Should we move away from using ondemand for downloading symbols?
stackFixer() is called in the __init__ of the OutputHandler.
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=96b972ba996b
[2]
https://public-artifacts.taskcluster.net/HYqHzLTjR3yOkSR4NVWqkA/0/public/logs/live_backing.log
19:20:51 INFO - allocation stack:
19:20:52 INFO - #00: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
19:20:52 INFO - #01: ??? (/home/worker/workspace/build/application/firefox/libxul.so)
[3]
https://dxr.mozilla.org/mozilla-central/rev/5e0140b6d11821e0c2a2de25bc5431783f03380a/testing/mozbase/mozrunner/mozrunner/utils.py#266
Comment 9•9 years ago
|
||
I believe that buildbot always downloads symbols for debug tests ahead of time to handle cases like this.
Comment 10•9 years ago
|
||
(In reply to Chris AtLee [:catlee] from comment #9)
> I believe that buildbot always downloads symbols for debug tests ahead of
> time to handle cases like this.
Yes, this is correct. For non-debug builds having the symbol path be a URL works great because mozcrash will download the symbols on-demand to produce a stack for a crash. For debug builds, we will almost always hit an assertion which will print a stack like this and needs symbols to make it useful, so we should download them before running tests, unpack them, and set the symbol path to the local directory.
| Assignee | ||
Comment 11•9 years ago
|
||
Do we need the change for every debug job? or only for mochitests?
Comment 12•9 years ago
|
||
Every debug test job.
| Assignee | ||
Comment 13•9 years ago
|
||
Debug test jobs are expected to cause asserts and we need to have symbols on disk to
get proper allocation stacks. The OutputParser requires the symbols to be on disk upon
contrustcion of the object [1][2].
[1] https://dxr.mozilla.org/mozilla-central/rev/5e0140b6d11821e0c2a2de25bc5431783f03380a/testing/mochitest/runtests.py#2484
[2] https://dxr.mozilla.org/mozilla-central/rev/5e0140b6d11821e0c2a2de25bc5431783f03380a/testing/mozbase/mozrunner/mozrunner/utils.py#266
Review commit: https://reviewboard.mozilla.org/r/46439/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/46439/
Attachment #8741378 -
Flags: review?(ted)
| Assignee | ||
Comment 14•9 years ago
|
||
Comment on attachment 8741378 [details]
MozReview Request: Bug 1251893 - Change TC Linux64 debug jobs to always download symbols. r=ted
Not ready for review. Sorry.
Attachment #8741378 -
Flags: review?(ted)
| Assignee | ||
Comment 15•9 years ago
|
||
Comment on attachment 8741378 [details]
MozReview Request: Bug 1251893 - Change TC Linux64 debug jobs to always download symbols. r=ted
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/46439/diff/1-2/
Attachment #8741378 -
Flags: review?(ted)
| Assignee | ||
Comment 16•9 years ago
|
||
hrmm my next review push raised the flag back.
Anyways, here's the push that shows that nothing breaks [1]
Here's a command showing --download-symbols=true instead of ondemand [2]
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=637e5197199933ea5ad2f931fcf13ba4e54199d3&filter-searchStr=tc&group_state=expanded&selectedJob=19497053
[2] /home/worker/workspace/mozharness/scripts/desktop_unittest.py --config-file mozharness/configs/unittests/linux_unittest.py --config-file mozharness/configs/remove_executables.py --no-read-buildbot-config --installer-url=https://queue.taskcluster.net/v1/task/M_n8wx4wR9aaBC_mEwj-Cw/artifacts/public/build/target.tar.bz2 --test-packages-url=https://queue.taskcluster.net/v1/task/M_n8wx4wR9aaBC_mEwj-Cw/artifacts/public/build/target.test_packages.json --download-symbols=true --gtest-suite=gtest
Updated•9 years ago
|
Attachment #8741378 -
Flags: review?(ted)
Comment 17•9 years ago
|
||
Comment on attachment 8741378 [details]
MozReview Request: Bug 1251893 - Change TC Linux64 debug jobs to always download symbols. r=ted
https://reviewboard.mozilla.org/r/46439/#review43875
I don't fully understand the current state of the in-tree task definitions, but it looks like these same tasks would be used for opt tests, is that right? If so then changing these to always download symbols isn't what we want. We want opt tests to continue to download ondemand.
| Assignee | ||
Comment 18•9 years ago
|
||
At the moment there are no opt tests.
gbrown is likely to tackle this and we will need to split the yml files into opt/debug.
We can have opt jobs at that point use ondemand.
Can we get this landed and ask gbrown to make sure to use ondemand for opt jobs?
Comment 20•9 years ago
|
||
(In reply to Armen Zambrano [:armenzg] - Engineering productivity from comment #18)
> At the moment there are no opt tests.
> gbrown is likely to tackle this and we will need to split the yml files into
> opt/debug.
> We can have opt jobs at that point use ondemand.
>
> Can we get this landed and ask gbrown to make sure to use ondemand for opt
> jobs?
OK. I'm not wild about leaving footguns in the tree for the next person to deal with, but if this is something we've already scoped, maybe you can just put a comment in the task definitions to note that those should be `ondemand` for opt jobs?
Flags: needinfo?(ted)
Updated•9 years ago
|
Attachment #8741378 -
Flags: review+
Comment 21•9 years ago
|
||
Comment on attachment 8741378 [details]
MozReview Request: Bug 1251893 - Change TC Linux64 debug jobs to always download symbols. r=ted
https://reviewboard.mozilla.org/r/46439/#review44265
::: testing/taskcluster/tasks/tests/fx_desktop_unittest.yml:12
(Diff revision 2)
> - bash
> - /home/worker/bin/test.sh
> - --no-read-buildbot-config
> - --installer-url={{build_url}}
> - --test-packages-url={{test_packages_url}}
> - - --download-symbols=ondemand
> + - --download-symbols=true
As I said, let's put a comment here to let gbrown (or whoever winds up doing the work) know that opt tests need to use `ondemand`, so that we don't wind up with the reverse of this problem in the near future.
| Assignee | ||
Comment 22•9 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #21)
> > - - --download-symbols=ondemand
> > + - --download-symbols=true
>
> As I said, let's put a comment here to let gbrown (or whoever winds up doing
> the work) know that opt tests need to use `ondemand`, so that we don't wind
> up with the reverse of this problem in the near future.
This is scoped in bug 1265773 by gbrown.
I will land since it is already going to be worked on.
| Assignee | ||
Comment 23•9 years ago
|
||
I will add comments and land it tomorrow.
Comment 24•9 years ago
|
||
There are a few debug test jobs that still use ondemand symbols after this change:
fx_linux64_firefox_ui_functional.yml
fx_linux64_firefox_ui_functional_e10s.yml
and some that don't specify:
fx_linux64_luciddream.yml
fx_linux64_mochitest_vg.yml
Is that intentional?
| Assignee | ||
Comment 25•9 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #24)
> There are a few debug test jobs that still use ondemand symbols after this
> change:
>
> fx_linux64_firefox_ui_functional.yml
> fx_linux64_firefox_ui_functional_e10s.yml
>
These are whimboo's jobs and these don't run on developer's pushes but on nightly jobs.
They're not debug jobs.
> and some that don't specify:
>
> fx_linux64_luciddream.yml
> fx_linux64_mochitest_vg.yml
>
> Is that intentional?
Luciddream (not running) and valgrind are only for opt builds.
| Assignee | ||
Comment 26•9 years ago
|
||
| Assignee | ||
Comment 27•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/cc6534601ad80c04c260493112e9a8d621c304ad
Bug 1251893 - Change TC Linux64 debug jobs to always download symbols. r=ted
Comment 28•9 years ago
|
||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Updated•7 years ago
|
Component: Integration → Services
You need to log in
before you can comment on or make changes to this bug.
Description
•