Closed Bug 1745116 Opened 10 months ago Closed 2 months ago

Intermittent ThreadMetrics.CollectMetrics | Expected: (duration) >= (50000u), actual: 48922 vs 50000 @ /builds/worker/checkouts/gecko/xpcom/tests/gtest/TestThreadMetrics.cpp:186

Categories

(Core :: XPCOM, defect, P5)

defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox-esr91 --- disabled
firefox-esr102 --- fixed
firefox95 --- wontfix
firefox96 --- wontfix
firefox97 --- wontfix
firefox105 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: nika)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disabled])

Attachments

(2 files)

Filed by: abutkovits [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=360560748&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Ru1mw6q3QhiohaUWHwCqzQ/runs/0/artifacts/public/logs/live_backing.log


[task 2021-12-09T03:25:02.131Z] 03:25:02     INFO -  TEST-START | ThreadMetrics.CollectMetrics
[task 2021-12-09T03:25:02.628Z] 03:25:02  WARNING -  TEST-UNEXPECTED-FAIL | ThreadMetrics.CollectMetrics | Expected: (duration) >= (50000u), actual: 48922 vs 50000 @ /builds/worker/checkouts/gecko/xpcom/tests/gtest/TestThreadMetrics.cpp:186
[task 2021-12-09T03:25:02.631Z] 03:25:02  WARNING -  TEST-UNEXPECTED-FAIL | ThreadMetrics.CollectMetrics | test completed (time: 499ms)

Update:
There have been 42 failures within the last 7 days:
• 13 failures on Windows 10 x64 2004 asan WebRender opt
• 3 failures on Windows 10 x64 2004 CCov WebRender opt
• 14 failures on Windows 10 x64 2004 WebRender debug
• 11 failures on Windows 10 x64 2004 WebRender opt
• 1 failures on Windows 10 x64 2004 WebRender Shippable opt
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=361823931&repo=mozilla-central

[task 2021-12-20T00:16:57.152Z] 00:16:57     INFO -  TEST-START | ThreadMetrics.CollectMetrics
[task 2021-12-20T00:16:57.244Z] 00:16:57  WARNING -  TEST-UNEXPECTED-FAIL | ThreadMetrics.CollectMetrics | Expected: (duration) >= (50000u), actual: 47275 vs 50000 @ Z:/task_163995083551877/build/src/xpcom/tests/gtest/TestThreadMetrics.cpp:186
[task 2021-12-20T00:16:57.257Z] 00:16:57  WARNING -  TEST-UNEXPECTED-FAIL | ThreadMetrics.CollectMetrics | test completed (time: 105ms)
[task 2021-12-20T00:16:57.257Z] 00:16:57     INFO -  TEST-START | ThreadMetrics.CollectRecursiveMetrics
Whiteboard: [stockwell needswork:owner]

Hi Nika! Could you please help us assign this to someone?
Thank you!

Flags: needinfo?(nika)

:bwc, could you perhaps look into this? I'm thinking this is probably caused by PR_Sleep waking us up slightly early and the sleeping code in TimedRunnable not accommodating for it. (https://searchfox.org/mozilla-central/rev/fac07284a9a996ddf968ea53adaf25c2a8b7c520/xpcom/tests/gtest/TestThreadMetrics.cpp#53-62)

Flags: needinfo?(nika) → needinfo?(docfaraday)

There have been 40 failures in the last 7 days: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2021-12-29&endday=2022-01-05&tree=trunk&bug=1745116

Happens on:
-windows10-64-2004-asan-qr opt
-windows10-64-2004-qr opt and debug.

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=363053512&repo=autoland&lineNumber=4624

Sorry, I have my plate full at the moment.

Flags: needinfo?(docfaraday)
Assignee: nobody → ctuns
Status: NEW → ASSIGNED
Assignee: ctuns → nobody
Status: ASSIGNED → NEW
Keywords: leave-open
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Pushed by ctuns@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/73f018a3325f
disable ThreadMetrics.CollectMetrics on windows x64 r=intermittent-reviewers,MasterWayZ DONTBUILD

This failure started to appear again after bug 1780614 landed
At the moment fails only on win10 x86 and looks high frequency

@Nika, can you take a look?

Flags: needinfo?(nika)

The test was already disabled on 64-bit windows previously in this bug due to high intermittent rates. It's somewhat unsurprising to me that it would start occurring on x86 as well after some changes nearby this code.

That being said, the particular bug is pretty evident here (sleep calls can end early, and the check doesn't take that into account. unlike all other tests in the file), so I'll post a patch.

Flags: needinfo?(nika)
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cf15683f9c21
Make TestThreadMetrics check against how much time was actually spent, r=smaug
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 105 Branch
Assignee: nobody → nika
You need to log in before you can comment on or make changes to this bug.