Investigate performance impact of FlushFileBuffers()
Categories
(MailNews Core :: Database, task)
Tracking
(Not tracked)
People
(Reporter: betterbird.project+17, Unassigned, NeedInfo)
References
(Blocks 1 open bug)
Details
(Keywords: perf)
Attachments
(3 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36 Edg/126.0.0.0
Steps to reproduce:
We compared the new folder compaction in TB 128 with TB 115 on Windows. Result:
Smallish folder: 17.000 messages, 700 MB with .msf file 10 MB.
115: 5 seconds, 128: 10 seconds.
Large folder: 190.000 messages, 860 MB with .msf file 85 MB.
115: 40-60 seconds, 128: Between 5 minutes and 20 seconds and 5 minutes and 35 seconds.
Time vary of course depending on OS disk access, etc.
So the compaction takes between 2 and 5.5 times the time it took in 115. The progress bar takes about the same time to completion (100%), the rest of the time appears without activity. For the large folder the system becomes unresponsive during the last 4.5 minutes after the progress bar has run to completion.
This will be a severe problem for users with large mailboxes where auto-compact is triggered.
In general: Please test all new development on larger datasets to avoid this sort of problem, see for example bug 1864510 or bug 1839063.
| Reporter | ||
Updated•1 year ago
|
| Reporter | ||
Comment 1•1 year ago
|
||
Adding to the list: Bug 1854035.
| Reporter | ||
Comment 2•1 year ago
|
||
Here's a screenshot from compaction the smaller folder. You can see that the system is working for 5 seconds (2.5s to 7.5s) and then becomes unresponsive for four seconds (7.5s to 11.5s). Looks like the culprit is nsSafeFileOutputStream::Finish() and the code it calls, in particular:
https://searchfox.org/mozilla-central/rev/a18d79db1695fc6019cac47874113fd3bb65a821/nsprpub/pr/src/md/windows/w95io.c#407 (Windows 95??). Looks like this Netscape Portable Runtime (NSPR) code was not intended for larger files.
BTW, profiling the large folder compaction which took over 6 minutes failed, no profile was captured.
Comment 3•1 year ago
|
||
Pinging Ben and adding Brendan to help investigate this.
| Reporter | ||
Comment 4•1 year ago
|
||
The Win32 API FlushFileBuffers() is known to be slow, so the only fix is not to use it and hence not to use nsSafeFileOutputStream, which pulls out the rug under the new compact implementation. One of the articles that shows up in the search suggests to use a memory-mapped file. Now it's too late for any improvements of M-C code, so a backout is likely the only way forward in the short term (which would return bug 1878541).
Comment 5•1 year ago
|
||
Blimey. What on earth is windows doing in it's caching that means FlushFileBuffers() takes so long to write out 700MB?
In any case, I've got a couple of possible solutions. One is to just use nsAtomicFileOutputStream, which is exactly the same as nsSafeFileOutputStream, but omits the Flush() in it's Finish() (no other differences as far as I can see).
This does leave the issue of non-flushed data being lost if something catastrophic happens at exactly the wrong instant, but:
- it's not as bad as the old leftover nstmp files (bug 1878541) - they were mostly due to bad handling of other errors rather than power-loss catastrophes.
FlushFileBuffers()doesn't completely remove the power-loss issue anyway. It's ensures the data is written out to the storage device, but there's no guarantee when the data gets on to physical media (rather than internal device caching). It's a much smaller window, sure, but still there.
The second possible solution is a bit little cargo-cult. It involves making sure the windows file is created with FILE_FLAG_WRITE_THROUGH (exposed in NSPR PR_SYNC). I would actually expect that to perform worse than just doing a single Flush() at the end... but... well, who can tell.
I'll upload the two patches here - want to give them a go and see if either help?
Comment 6•1 year ago
|
||
This adds the PR_SYNC flag to the NS_NewSafeLocalFileOutputStream() call.
The other flags are the usual defaults. On windows, this causes the
underlying file to be created with FILE_FLAG_WRITE_THROUGH, which forces
writes to go directly to disk rather than just into the system cache.
See:
https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea#caching_behavior
The writes may also be buffered in the system cache, but that should
quick as it's just RAM.
Comment 7•1 year ago
|
||
This changes the mbox compaction to use nsAtomicFileOutputStream instead of
nsSafeFileOutputStream for the destination mbox. The only difference between
them is that nsSafeFileOutputStream does an explicit Flush() at the end, which
forces the OS to flush it's FS caching for the file, which can be slow.
| Reporter | ||
Comment 8•1 year ago
|
||
want to give them a go and see if either help?
Who are you addressing? The project whose contributions aren't accepted and whose accounts are regularly blocked?
D215304 doesn't appear to work, the progress bar doesn't advance, it remains in state "start" (static animation), after a while it disappears as if the compaction had completed. A XXX-1 file of about 90 KB is created and left behind, and on a subsequent compact "The folder 'XXX on Local Folders' cannot be compacted because another operation is in progress. Please try again later." is received. Isn't the trick that you're writing to the same file you're reading? So you can't write to it while you're still reading?
With D215305 the progress bar moves, runs to completion after one minute, and then the system is unresponsive for a further several minutes (we killed it in the end). So no change with that patch. Looks like nsAtomicFileOutputStream is a nsSafeFileOutputStream (see here), so why would the behaviour be different?
| Reporter | ||
Comment 9•1 year ago
|
||
Another observation: Flushing buffers for 17K messages with 700 MB of data takes 5 seconds whereas flushing buffers for 190K messages with 860 MB of data takes 5 minutes or more. So the less messages/writes/buffers, the faster the flushing at the end. You may consider writing it all to memory in one buffer and then flushing it at the end (which sounds a lot like the memory-mapped file mentioned in comment #4).
| Reporter | ||
Comment 10•1 year ago
|
||
While reporting bug 1905786, we restored our test folder to a state that was never touched by version 128. We deleted a few messages and compacted and much to our surprise the compaction ran in the time needed by TB 115. After the progress bar had run to the end, the compaction was complete. We sorted the folder, by subject (takes a long time with 190K messages) and then be date again. Then compacting started to become unresponsive again, but nowhere near 5-6 minutes, this time we measured about 20 seconds. So it looks like repeated compaction, also when the sort order has changed, causes the unresponsiveness and also bug 1905786. It all sounds weird, but those are the observations.
| Reporter | ||
Comment 11•1 year ago
|
||
Here's some more investigation:
FlushFileBuffers()is slow, and it's slower to write an entire file after going through the input rather than writing incrementally while going to the input.- We've done some more profiling and recognised that not the compaction is slow, but the view creation at the end. That restores the secondary and primary sort order. If one of these was "unfortunate", like sort by Correspondent on a huge bugmail folder where all the senders are the same, view creation makes the system unresponsive, just as sorting by Correspondent would when done manually. The same issue exists in 115 already.
So the bug is mostly invalid.
| Reporter | ||
Comment 12•1 year ago
|
||
Since we now know that the slowness has two components, FlushFileBuffers() and the view restoral, D215305 should be tested again in a situation where the view restoral is fast to see whether it brings an improvement.
Updated•1 year ago
|
Updated•1 year ago
|
Comment 13•1 year ago
|
||
(In reply to betterbird.project+17@gmail.com from comment #11)
Here's some more investigation:
... omission ...
If one of these was "unfortunate", like sort by Correspondent on a huge bugmail folder where all the senders are the same, view creation makes the system unresponsive
... omission...
Sounds to me that we need to improve the sorting routine.
Description
•