Perma Linux 18.04 x64 tsan opt crashtest exceptions claim_expired
Categories
(Core :: Sanitizers, defect)
People
(Reporter: nataliaCs, Unassigned)
Attachments
(1 file)
Task: https://firefox-ci-tc.services.mozilla.com/tasks/cskOja4STlqXcOfNbHj8_w/runs/0/
Could this be related to Bug 1692068 in any way?
Comment 1•3 years ago
Crashtests were having issues with some suites, where something triggered a mysterious infinite retry when the machine ran out of resources. I couldn't work out why it was happening, but I did find and turn off the series of tests that seemed to be causing it. After that I didn't run into any of these, but it looks like it's still happening in the wild. I'll examine more of the tests tonight to see if any of them are responsible.
If there are more tests left in the suite doing this at high enough volume, we may need to roll back until we can find the incriminating tests.
Comment 2•3 years ago
I notice there are no logs for any of these. Is there any way to get logs? Or are they discarded on retry?
Comment 3•3 years ago (Reporter)
None of the crashtest exceptions have any logs, or at least they are not visible on our side.
Aryx, do you know what we can do to access them?
Thank you.
Comment 4•3 years ago
The machine terminates before the logs can be uploaded. Requesting an interactive worker, logging in, and running the tests might be the only way to catch this.
Comment 5•3 years ago
Bug 1712198, which has the first failure, added a crashtest. Emilio, can this be fixed soon?
Comment 6•3 years ago
Yeah, I'll try to repro. Worst case we can disable the crashtest on tsan or something.
Comment 7•3 years ago
Not ideal, but not worse than having the jobs not terminate. I can try
to spend some time digging, but I have other stuff on my plate so this
should at least go back to the previous state.
Comment 8•3 years ago
Let's disable for now on tsan; I don't have a lot of cycles this month.
Comment 10•3 years ago
bugherder
Comment 18•3 years ago
This started to fail as an exception on Linux WebRender asan opt.
https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception%2Cpending%2Crunning%2Cretry&fromchange=55fa1d670b119502b16f8795990f9456e1bfcd8c&searchStr=linux%2C18.04%2Cx64%2Cwebrender%2Casan%2Copt%2Cmochitests%2Cwith%2Csoftware%2Cwebrender%2Cand%2Cfission%2Cenabled%2Ctest-linux1804-64-asan-qr%2Fopt-mochitest-browser-chrome-swr-fis-e10s%2Cbc13&tochange=728ead24c5e80f438419b406678c567ed1215c50&selectedTaskRun=MlF6ilf-Q_itqcygvcLqQA.5
Comment 20•3 years ago
(In reply to Cristian Tuns from comment #18)
> This started to fail as an exception on Linux WebRender asan opt.
> https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception%2Cpending%2Crunning%2Cretry&fromchange=55fa1d670b119502b16f8795990f9456e1bfcd8c&searchStr=linux%2C18.04%2Cx64%2Cwebrender%2Casan%2Copt%2Cmochitests%2Cwith%2Csoftware%2Cwebrender%2Cand%2Cfission%2Cenabled%2Ctest-linux1804-64-asan-qr%2Fopt-mochitest-browser-chrome-swr-fis-e10s%2Cbc13&tochange=728ead24c5e80f438419b406678c567ed1215c50&selectedTaskRun=MlF6ilf-Q_itqcygvcLqQA.5
In the past when I've seen this, it was often related to resource exhaustion. If something has recently caused asan to run out of memory, then we may run into this issue. From my understanding, the machine crashes before it can generate any usable artifacts.
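As a rough illustration of the symptom described in this comment (a worker dying before it can upload artifacts), the sketch below scans a task status payload for runs that were resolved as claim-expired. It assumes the Taskcluster Queue task status shape, with a `runs` list whose entries carry `runId`, `state`, and `reasonResolved`; the helper name `claim_expired_runs` and the sample payload are hypothetical and not taken from this bug.

```python
from typing import Any, Dict, List


def claim_expired_runs(status: Dict[str, Any]) -> List[int]:
    """Return the runIds of runs that ended as exception/claim-expired.

    Assumes the payload looks like {"status": {"runs": [...]}}; missing
    keys are treated as "no runs" rather than raising.
    """
    runs = status.get("status", {}).get("runs", [])
    return [
        run["runId"]
        for run in runs
        if run.get("state") == "exception"
        and run.get("reasonResolved") == "claim-expired"
    ]


# Hypothetical payload for illustration only; the taskId is a placeholder.
sample_status = {
    "status": {
        "taskId": "<task-id>",
        "runs": [
            {"runId": 0, "state": "exception", "reasonResolved": "claim-expired"},
            {"runId": 1, "state": "exception", "reasonResolved": "claim-expired"},
            {"runId": 2, "state": "completed", "reasonResolved": "completed"},
        ],
    }
}

print(claim_expired_runs(sample_status))
```

A check like this only counts how many runs were lost when the worker stopped responding; it says nothing about why the worker died, which still needs an interactive worker or a local reproduction.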
Comment 21•3 years ago
Recent cases are being handled in bug 1731580 (and regressor seems to have been identified).
Comment 23•3 years ago
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #21)
> Recent cases are being handled in bug 1731580 (and regressor seems to have been identified).
Can this be closed?
Comment 24•3 years ago
Usually we keep the bug open until the issue has been fixed and don't close it when the test gets skipped. Developers can still choose to close it and work on the fix in a different bug.
Comment 37•7 months ago
The leave-open keyword is set and there has been no activity for 6 months.
:emilio, maybe it's time to close this bug?
For more information, please visit BugBot documentation.
Comment 38•7 months ago
I'm not actively working on it, but it seems this hasn't happened in quite a while. Let's file a new bug if it ever happens again.