Closed Bug 1621323 Opened 4 years ago Closed 4 years ago

Linux x64 tsan mochitest 20 M(20) frequently retried and failing

Categories

(Core :: Sanitizers, defect, P1)

RESOLVED WORKSFORME

People

(Reporter: aryx, Assigned: decoder)

Details

(Whiteboard: [retriggered])

Attachments

(1 file)

Since yesterday evening (UTC), the Linux x64 tsan M(20) task frequently gets retried, up to the limit of 5 retries, and then fails with an exception:

https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&resultStatus=success%2Cpending%2Crunning%2Cretry%2Cusercancel%2Ctestfailed%2Cbusted%2Cexception&fromchange=dc30c7d0df8392adf5c90262ce563ff9f6bbe180&searchStr=07846be0250651c0251e1b10f61192ba4dbbd754&tochange=28fa31759b1000b73f9cf8b054e6c53a7a927615&selectedJob=292427013

Many of the tasks which complete fail with bug 1619072.

The log for the Treeherder link above mentions:

SUMMARY: ThreadSanitizer: out-of-memory (/builds/worker/workspace/build/tests/bin/xpcshell+0x55864)

Backfill is inconclusive: failures start with a change that should only affect mochitest browser-chrome, not mochitest-plain: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&selectedJob=292471838&resultStatus=success%2Cpending%2Crunning%2Cretry%2Cusercancel%2Ctestfailed%2Cbusted%2Cexception&fromchange=00a5c2ac8f8466c4b5c3a5e5eeb95f815768cca7&searchStr=linux%2Ctsan%2Cmochitest&tochange=1692426a9549efd97f3241b8fc314eac0786a750

Christian, can you take a look or redirect if necessary, please?

Flags: needinfo?(choller)
Assignee: nobody → choller
Status: NEW → ASSIGNED

This should disable the test that OOMs (in some runs, it also simply times out). It also fixes TSan's OOM options so that TSan behaves like ASan: an allocation that fails due to OOM returns NULL rather than aborting.
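For illustration only, here is a minimal sketch of how a test harness might set such an option. The bug does not name the exact flag or show the patch; this sketch assumes the relevant setting is the common sanitizer runtime flag `allocator_may_return_null`, passed through the `TSAN_OPTIONS` environment variable. The helper name `with_sanitizer_option` is hypothetical and is not taken from the Mozilla harness.

```python
def with_sanitizer_option(env, var, key, value):
    """Return a copy of env with key=value set in a colon-separated options variable.

    Hypothetical helper: an existing entry for the same key is replaced,
    all other options are preserved in their original order.
    """
    prefix = key + "="
    # Split the current value into individual options, dropping empty pieces.
    options = [opt for opt in env.get(var, "").split(":") if opt]
    # Drop any previous setting of this key so the new value wins.
    options = [opt for opt in options if not opt.startswith(prefix)]
    options.append(prefix + value)
    new_env = dict(env)
    new_env[var] = ":".join(options)
    return new_env


# Assumption: allocator_may_return_null=1 is the flag that makes a failed
# allocation return NULL instead of aborting the process.
base_env = {"TSAN_OPTIONS": "halt_on_error=1"}
tsan_env = with_sanitizer_option(
    base_env, "TSAN_OPTIONS", "allocator_may_return_null", "1"
)
print(tsan_env["TSAN_OPTIONS"])
```

The helper returns a new dictionary rather than mutating its input, so the same base environment can be reused for runs with different sanitizer settings.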

Flags: needinfo?(choller)
Pushed by choller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a7a6063bd5a4
Disable an OOMing test for TSan and fix TSan OOM options. r=froydnj

Tasks are still frequently retried, but those that complete now mostly do so without errors.

Keywords: leave-open
Whiteboard: [retriggered]

The priority flag is not set for this bug.
:decoder, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(choller)

:aryx, this is gone, right?

Flags: needinfo?(choller) → needinfo?(aryx.bugmail)
Priority: -- → P1

Yes. M(19) and M(20) sometimes hit 1-2 retries, but in general it's gone.

Flags: needinfo?(aryx.bugmail)

The leave-open keyword is set and there has been no activity for 6 months.
:decoder, maybe it's time to close this bug?

Flags: needinfo?(choller)
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(choller)
Keywords: leave-open
Resolution: --- → WORKSFORME