Intermittent FOG.TestCppTimingDistWorks | Expected: (data.sum) > ((uint64_t)(15 * NANOS_IN_MILLIS) - EPSILON), actual: 14813300 vs 14960000 @ /builds/worker/checkouts/gecko/toolkit/components/glean/tests/gtest/TestFog.cpp:294
Categories
(Toolkit :: Telemetry, defect, P5)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox112 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: rzvncj, Mentored)
References
Details
(Keywords: intermittent-failure, Whiteboard: [good second bug][lang=C++])
Attachments
(1 file)
Filed by: nfay [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=403807246&repo=mozilla-esr102
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bpu5xbUJSSyfTS7lR_SFoA/runs/0/artifacts/public/logs/live_backing.log
[task 2023-01-27T20:05:48.787Z] 20:05:48 INFO - TEST-START | FOG.TestCppTimingDistWorks
[task 2023-01-27T20:05:48.797Z] 20:05:48 WARNING - TEST-UNEXPECTED-FAIL | FOG.TestCppTimingDistWorks | Expected: (data.sum) > ((uint64_t)(15 * NANOS_IN_MILLIS) - EPSILON), actual: 14813300 vs 14960000 @ /builds/worker/checkouts/gecko/toolkit/components/glean/tests/gtest/TestFog.cpp:294
[task 2023-01-27T20:05:48.797Z] 20:05:48 WARNING - TEST-UNEXPECTED-FAIL | FOG.TestCppTimingDistWorks | test completed (time: 10ms)
[task 2023-01-27T20:05:48.797Z] 20:05:48 INFO - TEST-START | FOG.TestLabeledBooleanWorks
Comment 1•2 years ago
The 15 * NANOS_IN_MILLIS is the assumed, reasonable sum of one timing sample that slept for two 5ms intervals plus one timing sample that slept for one 5ms interval.
HOWEVER, things are rarely (see the bugs in the See Also list for just how rare) that reasonable. Those 5ms intervals are only "roughly" 5ms long, and if each of them is shaved juuuuust enough, we can find ourselves at a sum that is 15ms - (3 * SHAVE). This is why we put in EPSILON: we hope that EPSILON > (3 * SHAVE) and simultaneously keep it as small as possible.
Alas. An EPSILON of 0.04ms is apparently not always enough. We need one of at least 0.1867ms.
(( 1.2% seems like an awful lot for a timer to be off... but remember, this is three compounding errors, so it averages to 0.41% each. Still not great, but. ))
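For anyone who wants to check the arithmetic above, here is a minimal sketch that recomputes it from the two numbers in the failure log. Only NANOS_IN_MILLIS comes from the test; every other name (kExpectedSum, kObservedSum, kThreshold, ShortfallBasisPoints, and so on) is made up for illustration and does not exist in TestFog.cpp.

```cpp
#include <cstdint>

// Nanoseconds per millisecond (same name as the constant used by the test).
constexpr uint64_t NANOS_IN_MILLIS = 1000000;

// Hypothetical names; the values come from the failure log and the assertion.
constexpr uint64_t kExpectedSum = 15 * NANOS_IN_MILLIS;  // 10ms sample + 5ms sample
constexpr uint64_t kObservedSum = 14813300;  // "actual" data.sum in the log
constexpr uint64_t kThreshold = 14960000;    // expected sum minus the old EPSILON

// The EPSILON implied by the logged threshold, and the shortfall actually seen.
constexpr uint64_t kOldEpsilon = kExpectedSum - kThreshold;   // 0.04ms
constexpr uint64_t kShortfall = kExpectedSum - kObservedSum;  // 0.1867ms

// Shortfall relative to the expected sum, in hundredths of a percent
// (integer division, so the result is truncated).
constexpr uint64_t ShortfallBasisPoints(uint64_t aExpected, uint64_t aObserved) {
  return (aExpected - aObserved) * 10000 / aExpected;
}

static_assert(kOldEpsilon == 40000, "old EPSILON is 0.04ms");
static_assert(kShortfall == 186700, "observed shortfall is 0.1867ms");
static_assert(kShortfall > kOldEpsilon, "which is why the assertion failed");
```

Under these assumptions the total shortfall comes to about 1.24% of the expected 15ms, or roughly 0.41% per 5ms sleep when split evenly across the three intervals.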
(( Also, we straight up use an EPSILON of 2ms in the JS versions of this test (most of that's due to rounding, but not all of it) ))
A decent fix for this would be to up EPSILON to 200000 or (uint64_t)(0.2 * NANOS_IN_MILLIS).
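A rough sketch of what that change amounts to, assuming the lower-bound check has the same shape as the expression in the failure message. SumAboveLowerBound and kOldEpsilon are hypothetical names used only here; the real test uses a gtest assertion on data.sum, and this is not the landed patch.

```cpp
#include <cstdint>

// Nanoseconds per millisecond (same name as the constant used by the test).
constexpr uint64_t NANOS_IN_MILLIS = 1000000;

// Tolerance before the change (0.04ms) and the proposed one (0.2ms).
constexpr uint64_t kOldEpsilon = 40000;  // hypothetical name for the old value
constexpr uint64_t EPSILON = (uint64_t)(0.2 * NANOS_IN_MILLIS);

// Hypothetical helper mirroring the failing expectation:
//   data.sum > (uint64_t)(15 * NANOS_IN_MILLIS) - EPSILON
constexpr bool SumAboveLowerBound(uint64_t aSum, uint64_t aEpsilon) {
  return aSum > (uint64_t)(15 * NANOS_IN_MILLIS) - aEpsilon;
}

// Both spellings suggested above give the same tolerance.
static_assert(EPSILON == 200000, "0.2ms expressed in nanoseconds");
```

With the sum from the log, SumAboveLowerBound(14813300, kOldEpsilon) is false while SumAboveLowerBound(14813300, EPSILON) is true, leaving about 0.013ms of headroom over the worst case seen so far.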