Closed Bug 1813167 Opened 2 years ago Closed 2 years ago

Intermittent FOG.TestCppTimingDistWorks | Expected: (data.sum) > ((uint64_t)(15 * NANOS_IN_MILLIS) - EPSILON), actual: 14813300 vs 14960000 @ /builds/worker/checkouts/gecko/toolkit/components/glean/tests/gtest/TestFog.cpp:294

Categories

(Toolkit :: Telemetry, defect, P5)

Tracking

RESOLVED FIXED
112 Branch
Tracking Status
firefox112 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: rzvncj, Mentored)

References

Details

(Keywords: intermittent-failure, Whiteboard: [good second bug][lang=C++])

Attachments

(1 file)

Filed by: nfay [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=403807246&repo=mozilla-esr102
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bpu5xbUJSSyfTS7lR_SFoA/runs/0/artifacts/public/logs/live_backing.log


[task 2023-01-27T20:05:48.787Z] 20:05:48     INFO -  TEST-START | FOG.TestCppTimingDistWorks
[task 2023-01-27T20:05:48.797Z] 20:05:48  WARNING -  TEST-UNEXPECTED-FAIL | FOG.TestCppTimingDistWorks | Expected: (data.sum) > ((uint64_t)(15 * NANOS_IN_MILLIS) - EPSILON), actual: 14813300 vs 14960000 @ /builds/worker/checkouts/gecko/toolkit/components/glean/tests/gtest/TestFog.cpp:294
[task 2023-01-27T20:05:48.797Z] 20:05:48  WARNING -  TEST-UNEXPECTED-FAIL | FOG.TestCppTimingDistWorks | test completed (time: 10ms)
[task 2023-01-27T20:05:48.797Z] 20:05:48     INFO -  TEST-START | FOG.TestLabeledBooleanWorks

The 15 * NANOS_IN_MILLIS is the assumed, reasonable sum of one timing sample that slept for two 5ms intervals plus one timing sample that slept for one 5ms interval.

HOWEVER, things are rarely (see the bugs in the See Also list for just how rare) that reasonable. Those 5ms intervals are only "roughly" 5ms long, and if each of them is shaved juuuuust enough, we can find ourselves at a sum that is 15ms - (3 * SHAVE). This is why we put in EPSILON: we hope that EPSILON > (3 * SHAVE) and simultaneously keep it as small as possible.

Alas. An EPSILON of 0.04ms is apparently not always enough. We need one of at least 0.1867ms.

(( A shortfall of about 1.24% (0.1867ms out of 15ms) seems like an awful lot for a timer to be off... but remember, this is three sleep errors added together, so it averages to about 0.41% per 5ms sleep. Still not great, but. ))
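The numbers above can be rechecked from the two values in the failure log. This is only a sketch of the arithmetic: every name except NANOS_IN_MILLIS is hypothetical, and NANOS_IN_MILLIS is assumed here to be 1000000 (its real definition in TestFog.cpp is not shown in this bug).

```cpp
#include <cstdint>

// Assumption: one millisecond is 1000000 nanoseconds; the actual
// definition of NANOS_IN_MILLIS in TestFog.cpp is not quoted in this bug.
constexpr uint64_t NANOS_IN_MILLIS = 1000000;

// Values copied from the log line; the k* names are hypothetical.
constexpr uint64_t kExpectedSum = 15 * NANOS_IN_MILLIS;  // three 5ms sleeps
constexpr uint64_t kObservedSum = 14813300;              // "actual" data.sum
constexpr uint64_t kThreshold = 14960000;                // 15ms - EPSILON

// EPSILON that was in effect when the test failed (0.04ms).
constexpr uint64_t kOldEpsilon = kExpectedSum - kThreshold;
// How far short of 15ms the observed sum fell (0.1867ms).
constexpr uint64_t kShortfall = kExpectedSum - kObservedSum;

static_assert(kOldEpsilon == 40000, "old EPSILON is 0.04ms");
static_assert(kShortfall == 186700, "shortfall is 0.1867ms");

// Shortfall as a percentage of the expected 15ms sum.
constexpr double ShortfallPercent() {
  return 100.0 * static_cast<double>(kShortfall) /
         static_cast<double>(kExpectedSum);
}

// The same shortfall averaged over the three 5ms sleeps.
constexpr double PerSleepPercent() { return ShortfallPercent() / 3.0; }
```

Since the assertion is a strict greater-than, an EPSILON of exactly 186700 would still have failed this run; it has to be strictly larger.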

(( Also, we straight up use an EPSILON of 2ms in the JS versions of this test (most of that's due to rounding, but not all of it) ))

A decent fix for this would be to raise EPSILON to 200000 (nanoseconds), or equivalently (uint64_t)(0.2 * NANOS_IN_MILLIS).
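As a sanity check on the suggested value, here is a rough sketch that replays the failing comparison with both the old and the proposed EPSILON. It is not the patch that landed: SumIsAcceptable and the k* constants are hypothetical names, and NANOS_IN_MILLIS is again assumed to be 1000000.

```cpp
#include <cstdint>

// Assumption: the real NANOS_IN_MILLIS definition is not quoted in this bug.
constexpr uint64_t NANOS_IN_MILLIS = 1000000;

// Hypothetical names for the old and proposed tolerances, in nanoseconds.
constexpr uint64_t kOldEpsilon = 40000;
constexpr uint64_t kProposedEpsilon = (uint64_t)(0.2 * NANOS_IN_MILLIS);

// Hypothetical helper mirroring the failing comparison:
//   data.sum > (uint64_t)(15 * NANOS_IN_MILLIS) - EPSILON
constexpr bool SumIsAcceptable(uint64_t aSum, uint64_t aEpsilon) {
  return aSum > (uint64_t)(15 * NANOS_IN_MILLIS) - aEpsilon;
}

// data.sum from the failure log.
constexpr uint64_t kObservedSum = 14813300;

static_assert(kProposedEpsilon == 200000, "0.2ms expressed in nanoseconds");
static_assert(!SumIsAcceptable(kObservedSum, kOldEpsilon),
              "the logged run fails with the old EPSILON");
static_assert(SumIsAcceptable(kObservedSum, kProposedEpsilon),
              "the logged run passes with the proposed EPSILON");
```

With 200000 the logged run clears the threshold by only 13300ns, so this value covers the observed failure but leaves little margin for a slightly worse run.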

Mentor: chutten
Whiteboard: [good second bug][lang=C++]
Assignee: nobody → rzvncj
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 112 Branch
See Also: → 1849415