Intermittent and perma mingw debug Assertion failure: mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp < aChunks->ChunkHeader().mDoneTimeStamp (Chunks must be released in increasing timestamps), at /builds/worker/workspace/obj-build/dist/include/
Categories
(Core :: Gecko Profiler, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox82 | --- | wontfix |
firefox83 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: mozbugz)
Details
(Keywords: assertion, intermittent-failure)
Attachments
(1 file)
Filed by: rmaries [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=315804609&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RE-mD_oLQC2e__upjg7iDA/runs/0/artifacts/public/logs/live_backing.log
[task 2020-09-16T05:20:32.572Z] 05:20:32 INFO - TEST-START | TestSafeThreadLocal.exe
[task 2020-09-16T05:20:32.611Z] 05:20:32 INFO - mozcrash checking c:\users\task_1600232880\appdata\local\temp\tmpqod5bm for minidumps...
[task 2020-09-16T05:20:32.611Z] 05:20:32 INFO - TEST-PASS | TestSafeThreadLocal.exe | took 38ms
[task 2020-09-16T05:20:32.611Z] 05:20:32 INFO - TEST-START | TestBaseProfiler.exe
[task 2020-09-16T05:20:32.651Z] 05:20:32 INFO - PID 5016 |
[task 2020-09-16T05:20:32.651Z] 05:20:32 INFO - Assertion failure: mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp < aChunks->ChunkHeader().mDoneTimeStamp (Chunks must be released in increasing timestamps), at /builds/worker/workspace/obj-build/dist/include/mozilla/ProfileBufferChunkManagerWithLocalLimit.h:156
[task 2020-09-16T05:20:32.651Z] 05:20:32 INFO - mozcrash checking c:\users\task_1600232880\appdata\local\temp\tmpza3ntk for minidumps...
[task 2020-09-16T05:20:32.651Z] 05:20:32 WARNING - TEST-UNEXPECTED-FAIL | TestBaseProfiler.exe | test failed with return code 2147483651
[task 2020-09-16T05:20:32.651Z] 05:20:32 INFO - TEST-INFO took 38ms```
Assignee | ||
Comment 1•5 years ago
|
||
It looks rare, but I feel I should look at it ASAP.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
![]() |
||
Comment 4•5 years ago
|
||
This is a permanent failure for Windows x64 mingw debug cppunit.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
![]() |
||
Updated•5 years ago
|
This was starting to annoy me on try runs so I attempted to take a look, but unfortunately I can't reproduce it locally even if I take the binary from a failing push on treeherder.
I then threw some printf at it, and got:
[task 2020-10-13T20:59:49.823Z] 20:59:49 INFO - mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp.mValue.mGTC = 2919070870125
[task 2020-10-13T20:59:49.823Z] 20:59:49 INFO - mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp.mValue.mQPC = 2919573268000
[task 2020-10-13T20:59:49.823Z] 20:59:49 INFO - aChunks->ChunkHeader().mDoneTimeStamp.mValue.mGTC = 2919070870125
[task 2020-10-13T20:59:49.823Z] 20:59:49 INFO - aChunks->ChunkHeader().mDoneTimeStamp.mValue.mQPC = 2919573268000
So the timestamps are equal.
Gerald, does that offer any clues?
Assignee | ||
Comment 8•5 years ago
|
||
Sorry for the annoyance, I haven't had a chance to look into this yet, but I really want to, hopefully in the next few days...
Thank you :dmajor for the printfs. I did not expect equal timestamps, so that's certainly a good clue! Are timestamps less precise on mingw?
These failures happen in tests, and tests typically use small chunks, so I guess they could theoretically be filled in such a short time that they could have equal "done" timestamps!? If that's correct, a quick fix would be to add some delays during testing.
(In real life, chunks are 1MB, which should take a significant amount of time to fill.)
Assignee | ||
Comment 9•5 years ago
|
||
So the problem is not what I initially feared when I wrote comment 1, but it is indeed a timing issue due to tests going unreasonably quickly through chunks, so they end up with the same timestamps.
I will add some delays between chunks MarkDone()
calls, which is closer to what really happens in Firefox.
Thank you again :dmajor for pushing me to take a look at it now.
Assignee | ||
Comment 10•5 years ago
|
||
These unit tests were going through chunks very quickly in tight loops, which on some platforms could cause successive chunks to be marked "done" with the same timestamps.
Distinct timestamps are needed to uniquely identify chunks (in each process), and related assertions caught that. In real Firefox code, chunks are much bigger (around 1MB), and fill at a slower rate, so timestamps are always different as expected there.
To fix this artificial issue in our unit tests, delays are introduced before each MarkDone()
, to ensure that they'll get a different timestamp every time.
Comment 11•5 years ago
|
||
Comment 12•5 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Updated•5 years ago
|
Description
•