Closed Bug 840224 Opened 11 years ago Closed 10 years ago

Change read-ahead buffer and disable atime for socorro collector systems

Categories

(Infrastructure & Operations Graveyard :: WebOps: Socorro, task, P4)

x86_64
Linux

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: selenamarie, Unassigned)

Details

Actions to take: 

* Set read-ahead buffer to 2MB
* Disable atimes on filesystems

From a detailed email from Jake Maul: 

On the Ops side, there are three things we're doing to mitigate this.

1) "blockdev --setra 4096 /dev/sda" raises the read-ahead cache from
128KB to 2MB (blockdev --setra takes 512-byte sectors, so "4096" is 2MB).
This helps with Step 2 above, and thus improves the overall throughput of
the workflow. Crash
dumps range from a few KB to a few MB, averaging around 250KB. A bigger
read-ahead makes logical sense, and in practice does have a decent
impact on speed. It's hard to be precise without a clean-room benchmark,
but I estimate around 15-25% improved throughput in terms of number of
crashes written to HBase per second from this.
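
For reference, applying and verifying the read-ahead change is just a couple
of commands (a sketch; /dev/sda is assumed to be the data disk on all the
collector nodes):

  # Raise read-ahead to 2MB; --setra takes 512-byte sectors (4096 * 512B = 2MB)
  blockdev --setra 4096 /dev/sda

  # Verify: --getra reports sectors (expect 4096), sysfs reports KB (expect 2048)
  blockdev --getra /dev/sda
  cat /sys/block/sda/queue/read_ahead_kb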

2) Disable atimes on the local data store volume (/). This eliminates the
atime-update writes that reads trigger in Step 2, freeing up yet more IOPS.
This gave us another 15% or so on top of the gain from --setra above.
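
Disabling atime is likewise a one-liner per host, effective immediately (a
sketch; the fstab line shown is only illustrative, the real device and
filesystem type for / on these boxes may differ):

  # Remount the root filesystem with noatime; no downtime needed
  mount -o remount,noatime /

  # To persist across reboots, add noatime to the / entry in /etc/fstab, e.g.:
  #   /dev/sda1  /  ext4  defaults,noatime  1 1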

In the end, these 2 tweaks netted us approximately a 35% increase in
crash dump performance in testing. In raw numbers, we went from around
1900 dumps/min to around 2600.

These 2 tweaks are live on sp-collector06.phx1, where we were testing,
and the readahead is also live on 05. The only challenge to rolling this
out to all the nodes is how to apply these 2 things via puppet. We don't
currently manage mount options for / (as far as I know). The other
setting can be checked in /sys/block/sda/queue/read_ahead_kb (although
note that "blockdev" works in 512-byte units, not 1KB units... so 2048
in /sys/ is 4096 in blockdev, and 2MB in reality), or with "blockdev
--getra /dev/sda"... it might take an "exec", but it should be doable.

We played with schedulers too... "cfq" (the default) with default
settings proved to be faster than either "noop" or "deadline", so
there's no change to make here. If there's any room for improvement on
this front, it's probably not much.
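
If anyone wants to re-test schedulers later, the knob is in sysfs (a sketch;
the change reverts on reboot, and the device name is assumed):

  # The active scheduler is shown in brackets, e.g. "noop deadline [cfq]"
  cat /sys/block/sda/queue/scheduler

  # Switch temporarily to re-run the comparison
  echo deadline > /sys/block/sda/queue/scheduler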
Is this a production change? If so, this needs to be approved by the CAB. 

Jake?
Flags: needinfo?(nmaul)
(In reply to Shyam Mani [:fox2mike] from comment #1)
> Is this a production change? If so, this needs to be approved by the CAB. 

Really? Turning off atime and setting the read-ahead buffer to something non-trivial are both non-destructive and well-documented stability enhancements for high-IO Linux systems.
It's a pretty trivial set of changes and was already tested in prod.  What's the scope of things that need to go to CAB?

In addition, we're getting SSDs on these machines tomorrow, so that would be a good time to make the changes IMHO since they have to come out of the pool for that.
Had a chat with Corey, punting to webops.
Assignee: server-ops → server-ops-webops
Component: Server Operations → Server Operations: Web Operations
QA Contact: shyam → nmaul
(In reply to Selena Deckelmann :selenamarie :selena from comment #2)
 
> Really? Turning off atime and setting the read-ahead buffer to something
> non-trivial are both non-destructive and well-documented stability
> enhancements for high-IO Linux systems.

Yeah. I'm not happy going ahead and changing values on our collector machines. It's always nice to do it when people are aware vs "we're going to make this change now" IMHO :) 

Anyway, seems like this is too far ahead to block on. I'm happy for someone from webops to make these changes.
Flags: needinfo?(nmaul)
Sorry for the confusion on this. Yes, webops will take this since we manage the socorro nodes. These are trivial changes and can easily be done with no service downtime. The hardest part is writing the puppet config to implement them properly... manually they are each a single command. :)

I believe it would be worthwhile for us to consider one or both of these tweaks on an infra-wide scale, but that needn't be this bug. Neither is hugely important in most situations, but local disk I/O does come up now and then.
Assignee: server-ops-webops → nmaul
Assignee: nmaul → server-ops-webops
Component: Server Operations: Web Operations → WebOps: Socorro
Priority: -- → P4
Product: mozilla.org → Infrastructure & Operations
No longer a problem with SSDs.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard