Closed Bug 1665265 Opened 5 years ago Closed 5 years ago

Intermittent and perma mingw debug Assertion failure: mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp < aChunks->ChunkHeader().mDoneTimeStamp (Chunks must be released in increasing timestamps), at /builds/worker/workspace/obj-build/dist/include/

Categories

(Core :: Gecko Profiler, defect, P2)

defect

Tracking

()

RESOLVED FIXED
83 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox82 --- wontfix
firefox83 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: mozbugz)

Details

(Keywords: assertion, intermittent-failure)

Attachments

(1 file)

Filed by: rmaries [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=315804609&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RE-mD_oLQC2e__upjg7iDA/runs/0/artifacts/public/logs/live_backing.log


[task 2020-09-16T05:20:32.572Z] 05:20:32     INFO -  TEST-START | TestSafeThreadLocal.exe
[task 2020-09-16T05:20:32.611Z] 05:20:32     INFO -  mozcrash checking c:\users\task_1600232880\appdata\local\temp\tmpqod5bm for minidumps...
[task 2020-09-16T05:20:32.611Z] 05:20:32     INFO -  TEST-PASS | TestSafeThreadLocal.exe | took 38ms
[task 2020-09-16T05:20:32.611Z] 05:20:32     INFO -  TEST-START | TestBaseProfiler.exe
[task 2020-09-16T05:20:32.651Z] 05:20:32     INFO -  PID 5016 |
[task 2020-09-16T05:20:32.651Z] 05:20:32     INFO -  Assertion failure: mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp < aChunks->ChunkHeader().mDoneTimeStamp (Chunks must be released in increasing timestamps), at /builds/worker/workspace/obj-build/dist/include/mozilla/ProfileBufferChunkManagerWithLocalLimit.h:156
[task 2020-09-16T05:20:32.651Z] 05:20:32     INFO -  mozcrash checking c:\users\task_1600232880\appdata\local\temp\tmpza3ntk for minidumps...
[task 2020-09-16T05:20:32.651Z] 05:20:32  WARNING -  TEST-UNEXPECTED-FAIL | TestBaseProfiler.exe | test failed with return code 2147483651
[task 2020-09-16T05:20:32.651Z] 05:20:32     INFO -  TEST-INFO took 38ms```

It looks rare, but I feel I should look at it ASAP.

Assignee: nobody → gsquelart
Severity: normal → S3
Flags: needinfo?(gsquelart)
Priority: P5 → P2

This is a permanent failure for Windows x64 mingw debug cppunit.

Summary: Intermittent Assertion failure: mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp < aChunks->ChunkHeader().mDoneTimeStamp (Chunks must be released in increasing timestamps), at /builds/worker/workspace/obj-build/dist/include/mozilla/ProfileBufferChun → Intermittent and perma mingw debug Assertion failure: mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp < aChunks->ChunkHeader().mDoneTimeStamp (Chunks must be released in increasing timestamps), at /builds/worker/workspace/obj-build/dist/include/

This was starting to annoy me on try runs so I attempted to take a look, but unfortunately I can't reproduce it locally even if I take the binary from a failing push on treeherder.

I then threw some printf at it, and got:

[task 2020-10-13T20:59:49.823Z] 20:59:49     INFO -  mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp.mValue.mGTC = 2919070870125
[task 2020-10-13T20:59:49.823Z] 20:59:49     INFO -  mReleasedChunks->Last()->ChunkHeader().mDoneTimeStamp.mValue.mQPC = 2919573268000
[task 2020-10-13T20:59:49.823Z] 20:59:49     INFO -  aChunks->ChunkHeader().mDoneTimeStamp.mValue.mGTC = 2919070870125
[task 2020-10-13T20:59:49.823Z] 20:59:49     INFO -  aChunks->ChunkHeader().mDoneTimeStamp.mValue.mQPC = 2919573268000

So the timestamps are equal.

Gerald, does that offer any clues?

Sorry for the annoyance, I haven't had a chance to look into this yet, but I really want to, hopefully in the next few days...

Thank you :dmajor for the printfs. I did not expect equal timestamps, so that's certainly a good clue! Are timestamps less precise on mingw?

These failures happen in tests, and tests typically use small chunks, so I guess they could theoretically be filled in such a short time that they could have equal "done" timestamps!? If that's correct, a quick fix would be to add some delays during testing.
(In real life, chunks are 1MB, which should take a significant amount of time to fill.)

So the problem is not what I initially feared when I wrote comment 1, but it is indeed a timing issue due to tests going unreasonably quickly through chunks, so they end up with the same timestamps.
I will add some delays between chunks MarkDone() calls, which is closer to what really happens in Firefox.

Thank you again :dmajor for pushing me to take a look at it now.

Flags: needinfo?(gsquelart)

These unit tests were going through chunks very quickly in tight loops, which on some platforms could cause successive chunks to be marked "done" with the same timestamps.
Distinct timestamps are needed to uniquely identify chunks (in each process), and related assertions caught that. In real Firefox code, chunks are much bigger (around 1MB), and fill at a slower rate, so timestamps are always different as expected there.
To fix this artificial issue in our unit tests, delays are introduced before each MarkDone(), to ensure that they'll get a different timestamp every time.

Pushed by gsquelart@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9150b0886ac9 Prevent chunks from being marked done with the same timestamps in quick tests - r=gregtatum
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: