In bug 1176192, git1.dmz.scl3.mozilla.com was experiencing high load. When bacula-fd started running its backup, it started sucking a lot of I/O away from the primary service on the machine. At times, it was doing 100MB/s of disk reads! Meanwhile, the machine was going in and out of swap, and the primary service was performing more disk reads of its own, likely because of page cache eviction triggered by bacula-fd reading infrequently-read files. bacula-fd currently runs with the default values for nice and ionice. I think bacula-fd should run at a lower, less-urgent priority so it doesn't contend with other processes on the machine during times of high load. I argue this should be a universal change, but I'll settle for changing just git1.dmz.scl3.mozilla.com.
I propose starting with ionice -c3 on the agent, without altering the nice level, since we do need bacula to retain some amount of CPU priority to coordinate backups. That would address the git1 I/O starvation case (with the understanding that the host was already experiencing I/O issues at the time), and in general keep bacula from being a concern for I/O-sensitive services.
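For illustration, here is a minimal sketch of what that proposal could look like as a wrapper. It only builds the command lines; it does not claim to be how the agent is actually launched on these hosts. The helper names (with_idle_io, reclass_running) are hypothetical, and the bacula-fd binary and config paths are assumptions. Only the ionice options -c (class) and -p (PID) come from ionice itself. Note that the idle class is honored only by I/O schedulers that implement I/O priorities, such as CFQ.

```python
import shlex

IDLE_CLASS = 3  # ionice scheduling class 3 is "idle"


def with_idle_io(argv):
    """Prefix a command with ionice so it starts in the idle I/O class.

    The nice level is deliberately left alone, per the proposal above.
    (Hypothetical helper, for illustration only.)
    """
    return ["ionice", "-c", str(IDLE_CLASS)] + list(argv)


def reclass_running(pid):
    """Build the ionice call that moves an already-running PID to the idle class.

    (Hypothetical helper, for illustration only.)
    """
    return ["ionice", "-c", str(IDLE_CLASS), "-p", str(pid)]


# Assumed install and config paths; adjust for the actual host.
cmd = with_idle_io(["/usr/sbin/bacula-fd", "-c", "/etc/bacula/bacula-fd.conf"])

# Print the command line as it would be typed into a shell or init script.
print(shlex.join(cmd))
```

The list returned by with_idle_io could be passed to subprocess.run to actually start the daemon, and reclass_running covers the case where bacula-fd is already running and should not be restarted.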
Comment #1 is acceptable to me. Something is better than nothing :)
I'm currently working on getting bacula-fd to run with ionice -c3, per comment #1.
Found during work on bug 1176192, but setting this to block our perf tracker, bug 1126345.
Closing this out; I haven't seen any issues with this, or any further activity.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED