Open Bug 791340 Opened 12 years ago Updated 2 years ago

Add disk queue limits to gloda indexing, to adjust to OS disk activity affecting Thunderbird, and mitigate impact of indexing on OS

Categories

(MailNews Core :: Search, defect)

x86_64
All
defect

Tracking

(Not tracked)

People

(Reporter: rkent, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

In bug 465353, the rate of gloda indexing was adjusted based on CPU load. But at least on my system, it is disk access, not CPU, that really causes things to bog down. On my Windows system I can view the disk queue length, and things get really sluggish when it grows large (5 to 10). There must be a way to get the disk queue length (or some other disk-related performance measurement) on each OS. If we had that, we could also back off the gloda indexing rate when the disk is busy.
Yes, high disk activity is a common complaint. On Linux, you can set the I/O priority to IDLE (which is separate from the process priority, i.e. the nice value). But that would probably mean running gloda on a separate thread or process.
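For illustration only, here is a minimal Python sketch of what setting the idle I/O class looks like at the syscall level on Linux. The constants follow the kernel's ioprio interface; the helper names (`ioprio_value`, `set_idle_io_priority`) and the small syscall-number table are assumptions made for this sketch, not anything that exists in Thunderbird or mozStorage.

```python
import ctypes
import platform

# Constants from the Linux ioprio interface.
IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASS_IDLE = 3
IOPRIO_WHO_PROCESS = 1  # "who" is a process or thread id; 0 means the caller

# Assumption: only these architectures are covered; the syscall number
# for ioprio_set differs per architecture.
_SYS_IOPRIO_SET = {"x86_64": 251, "i386": 289, "i686": 289, "aarch64": 30}


def ioprio_value(io_class, data=0):
    """Pack an I/O class and per-class priority data into one ioprio value."""
    return (io_class << IOPRIO_CLASS_SHIFT) | data


def set_idle_io_priority(tid=0):
    """Try to move the given thread (0 = calling thread) to the idle I/O class.

    Returns True on success, False if unsupported or if the call failed.
    """
    if platform.system() != "Linux":
        return False
    syscall_nr = _SYS_IOPRIO_SET.get(platform.machine())
    if syscall_nr is None:
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.syscall(syscall_nr, IOPRIO_WHO_PROCESS, tid,
                      ioprio_value(IOPRIO_CLASS_IDLE))
    return rc == 0
```

Whether the idle class has any visible effect depends on the I/O scheduler in use, so this is only a starting point for experiments.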
Gloda runs its queries as asynchronous calls to mozstorage (sqlite). IIRC, these are run in a separate thread, so that means we need the underlying mozstorage subsystem to expose a way to throttle down its IO priority. Andrew should be able to tell whether this is feasible or not.
The patch that I mentioned in bug 465353 was not directed at gloda queries, but rather at the base rate of message indexing that is done on initial setup of gloda. Gloda itself has throttles that control its rate of indexing, and that bug used those throttles to slow down indexing when CPU usage was high. All that I am asking here is that we add some measure of disk load to those as well. I don't think we need changes to mozstorage to do that, but we do need some new OS hooks.
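To make the request concrete, here is a rough Python sketch of a throttle that considers disk queue length alongside CPU load. It is not gloda's actual throttle: the function name, parameters, and thresholds are all hypothetical, with the queue threshold of 5 taken from the "5 to 10" observation in the original report.

```python
def indexing_delay_ms(base_delay_ms, cpu_busy_fraction, disk_queue_length,
                      cpu_threshold=0.8, queue_threshold=5.0,
                      max_delay_ms=5000.0):
    """Return how long to wait before indexing the next batch.

    The delay stays at base_delay_ms while both measurements are under
    their thresholds, and grows in proportion to how far either one
    exceeds its threshold, capped at max_delay_ms.
    """
    cpu_factor = max(1.0, cpu_busy_fraction / cpu_threshold)
    disk_factor = max(1.0, disk_queue_length / queue_threshold)
    return min(max_delay_ms, base_delay_ms * cpu_factor * disk_factor)


# Idle system: no extra delay.
quiet = indexing_delay_ms(100.0, cpu_busy_fraction=0.4, disk_queue_length=0.0)
# Disk queue at twice the threshold: delay doubles.
busy_disk = indexing_delay_ms(100.0, cpu_busy_fraction=0.4, disk_queue_length=10.0)
```

The point of multiplying the two factors is that either resource alone can slow indexing down, which matches the case described here where the CPU is fine but the disk is saturated.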
The ioprio_set man page only mentions processes, but http://permalink.gmane.org/gmane.linux.man/2980 is a patch to the man page that suggests it actually does work on threads. So, yes, with an enhancement to mozStorage we could lower the I/O priority of gloda's database connection while indexing is in progress. As for monitoring disk load on Linux, /proc/diskstats (which is basically the concatenation of all /sys/block/*/stat files) is what iostat uses. One would want to map the mountpoint to the block device in use. Because of the limitations of detecting I/O, especially when virus checkers muddy the picture, I think deprioritizing the I/O thread may be more beneficial than detecting I/O, but both could work well together. For Windows, it's been a while, but perhaps the performance counter API would work? http://msdn.microsoft.com/en-us/library/windows/desktop/aa372013%28v=vs.85%29.aspx It's not clear whether js-ctypes would let us get at the performance counters. I don't see any use of "Pdh" APIs in-tree at this time.
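As a rough sketch of the /proc/diskstats approach, the Python below parses the file's text and estimates an average queue length from two samples, in the way iostat derives its queue-size figure from the weighted I/O time counter. The function names and the sample data are made up for illustration, and mapping a mountpoint to its block device is left out.

```python
def parse_diskstats(text):
    """Parse /proc/diskstats content into a dict keyed by device name.

    Each line holds: major, minor, device name, then the per-device
    counters. Only three counters are kept here.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip malformed or unexpectedly short lines
        stats[fields[2]] = {
            "in_flight": int(fields[11]),       # I/Os currently in progress
            "io_ms": int(fields[12]),           # ms spent doing I/O
            "weighted_io_ms": int(fields[13]),  # weighted ms spent doing I/O
        }
    return stats


def average_queue_length(before, after, interval_ms):
    """Average number of queued requests between two samples of one device."""
    return (after["weighted_io_ms"] - before["weighted_io_ms"]) / interval_ms


# Made-up samples taken 1000 ms apart for a hypothetical device "sda".
sample_1 = "   8       0 sda 1000 10 20000 500 2000 20 40000 1500 3 1800 2100"
sample_2 = "   8       0 sda 1100 10 22000 900 2300 20 46000 3100 5 2700 4100"
first = parse_diskstats(sample_1)["sda"]
second = parse_diskstats(sample_2)["sda"]
queue = average_queue_length(first, second, interval_ms=1000)
```

On a real system the text would come from reading /proc/diskstats twice with a known interval between the reads.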
Severity: normal → major
Keywords: perf
Blocks: 1023000
Severity: major → normal
OS: Windows 7 → All
Summary: Add disk queue limits to gloda indexing → Add disk queue limits to gloda indexing, to adjust to OS disk activity affecting Thunderbird, and mitigate impact of indexing on OS
Severity: normal → S3