Closed Bug 1305174 Opened 8 years ago Closed 6 years ago

EBS initialization makes I/O absurdly slow on freshly provisioned instances

Categories

(Infrastructure & Operations :: RelOps: General, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: markco)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Windows])

I was investigating an apparent Mercurial 3.9 performance regression in bug 1304791. :markco gave me access to a Windows Server 2008 instance in use1. Not only did I notice that Mercurial performance was horrible, but I/O was just generally bad. Downloading a 1.5 GB Mercurial bundle from S3, I noticed that data was effectively streaming to memory and being written back out to disk at only 2-3 MB/s. Profiling system calls using Process Monitor confirmed that low-level file write function calls were taking 0.5-1.2s to write 2 MB. Ouch.

arr found an AWS support thread which led her to the following:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html
http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ebs-initialize.html

From these docs:

> New EBS volumes receive their maximum performance the moment that they
> are available and do not require initialization (formerly known as
> pre-warming). However, storage blocks on volumes that were restored
> from snapshots must be initialized (pulled down from Amazon S3 and
> written to the volume) before you can access the block. This preliminary
> action takes time and can cause a significant increase in the latency
> of an I/O operation the first time each block is accessed.

So basically, if you create an EBS volume from a snapshot, the first I/O to each block will be slow. The first access to a given block, whether a read or a write, pays the initialization penalty; subsequent accesses run at full speed.

I used a Windows port of dd to trigger reads from all blocks on the C:\ drive of the instance I was given access to. It was painfully slow: 2-3 MB/s. I let it run for a few minutes and then killed it. When I started the process again, it read very quickly up to the point where I had previously killed it. This seemingly confirms that EBS volumes initialized from an AMI count as a "snapshot" and suffer from the slow first block access issue.
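For reference, the AWS initialization docs linked above describe the same technique: read every block once so it gets pulled down from S3, after which access runs at normal speed. A minimal sketch (the Linux line is essentially the form in the AWS docs; the Windows line is a rough equivalent for a dd port, and the device path is illustrative since exact syntax varies between ports):

  # Linux form from the AWS docs: read every block once to initialize it
  dd if=/dev/xvdf of=/dev/null bs=1M
  # rough Windows-port equivalent (device name illustrative)
  dd if=\\.\PhysicalDrive0 of=/dev/null bs=1M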

Further reinforcing this suspicion, markco mounted a fresh EBS volume on this same instance. I formatted it as NTFS and conducted similar tests. Disk writes during wget were at least 4x faster. Mercurial operations were faster. It didn't seem to suffer from the slow first block access problem. And this makes sense: Amazon says new EBS volumes have maximum performance from the beginning.

Since EBS volumes initialized from snapshots/AMIs are absurdly slow, I strongly recommend changing our EC2 instance management strategy to do as many operations as possible on a fresh EBS volume that is separate from the instance/AMI volume.

This means builds and tests should have their workspace on a separate EBS volume.

Ideally, any additional programs we install (like mozilla-build) should also be put on a fresh EBS volume. This does mean we lose the advantage of AMIs and their bake-once, reuse-everywhere approach. However, if the "reuse everywhere" bit means 2 MB/s I/O on first use, I dare say installing things separately on N instances may end up faster overall once you factor in the first block access penalty.
Depending on how Firefox automation provisions EC2 instances, this issue could potentially be costing us $100,000+. I figure if this is slowing instances down by 10% on average (possibly more, since some spot instances aren't alive very long, especially in TaskCluster), that equates to needing to run ~10% more instances to provide equal capacity. Then factor in lower developer productivity from builds and tests taking several minutes longer than they could.

Greg and Chris: can you please assess the exposure of Firefox CI automation to this issue? Basically, I want to know which instance types we provision that, like our Windows 2008 buildbot builders, would exhibit the EBS initialization slowdown. Then we need to know how many new instances we create. If instances live for weeks at a time, the severity of this issue is far less than if impacted instances only live for a few hours.
Flags: needinfo?(garndt)
Flags: needinfo?(catlee)
Per #taskcluster, docker-worker attaches a fresh EBS volume on instance startup and doesn't rely on the EBS system volume much, so it doesn't appear to be significantly impacted by this.

But we don't know the state of Windows TaskCluster. Switching needinfo to :grenade.
Flags: needinfo?(garndt) → needinfo?(rthijssen)
I provisioned a fresh Windows Server 2008 c4.2xlarge instance using an official AMI hosted by Amazon. I can confirm the slow I/O on first block access problem is present in that scenario.

So basically anything on the AMI / root EBS volume will be insanely slow on first access. This has drastic implications. For example, the pagefile is on the root EBS volume, so if we start paging (and many operating systems page aggressively by default), we could get hit by this issue. We really need to treat the root EBS volume as effectively read-only and minimize even read access to it.
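As one illustration of getting traffic off the root volume, the pagefile can be relocated to another drive. This is just a hedged sketch of the kind of change involved, not something we've decided to do; it assumes a D: volume exists and leaves the size system-managed (takes effect after a reboot):

  # stop Windows from auto-managing the pagefile on C:
  $cs = Get-WmiObject Win32_ComputerSystem -EnableAllPrivileges
  $cs.AutomaticManagedPagefile = $false
  $cs.Put()
  # drop the C: pagefile setting and create one on D: instead
  Get-WmiObject Win32_PageFileSetting | Where-Object { $_.Name -like 'C:*' } | ForEach-Object { $_.Delete() }
  Set-WmiInstance -Class Win32_PageFileSetting -Arguments @{ Name = 'D:\pagefile.sys'; InitialSize = 0; MaximumSize = 0 }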
In bug 1304791 comment 7, I believe I also identified that our existing EBS volume IOPS are limiting Mercurial performance. Given that the Firefox build system is I/O heavy on all platforms and that slow I/O has been a recurring theme behind slow jobs, I'm starting to question whether we've configured high enough IOPS. That's arguably for a separate bug. But it is definitely related to the theme of "I/O is slow."
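For context (not a concrete proposal), provisioned IOPS are something you request when the volume is created. A hedged AWS CLI sketch, with size, IOPS and availability zone values made up for illustration:

  # request a 200 GiB io1 volume with 4000 provisioned IOPS (numbers illustrative)
  aws ec2 create-volume --volume-type io1 --iops 4000 --size 200 --availability-zone us-west-2a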
I'll still let grenade chime in, but looking at our win2012 and win7 instances, it seems the only EBS volume is the root device.  I believe the instance uses the provided SSDs for things like tasks and generic-worker.
Over the past several months (even years), I've heard "we measured performance of C4 instances and they were slower than C3... we need to stick to ephemeral storage on C3" several times. Each time I was skeptical. Well, it certainly looks like this EBS initialization issue was the root cause. (Context: C3 instances have a root volume attached to the instance ("ephemeral storage"). C4 instances have a root EBS volume with no option for ephemeral storage.) I'm glad we finally got to the bottom of that.
With a sample size of 1 in usw2, the root EBS volume on a Windows Server 2012 instance appears to be 2-3x faster than Windows Server 2008. The I/O on first access is still slow. But compared to ~2MB/s, it definitely feels snappier. It's fast enough for Mercurial operations to not be absurdly slow. But it is still slow - like crappy 5400 RPM laptop disk from 2005 slow.

I/O on new EBS volumes also appears faster under Windows Server 2012.

This reaffirms my observations from several months back where I measured Windows Server 2012 I/O performance to be significantly better than Windows Server 2008. Not sure what's going on here.
Looking at system calls during a Firefox build, cl.exe (the Visual Studio compiler) writes a bunch of temp files in %TEMP%, which is on c:\ and part of the root EBS volume. That almost certainly results in slower Firefox builds even if building from a mounted/fresh EBS volume.

In fact, if I run `dd` to perform a scanning read of the root volume during a Firefox build, I can make the build's CPU usage drop drastically. Presumably that's because the build is blocked on I/O, waiting for something on the root volume.
TaskCluster Windows workers have the c: drive on an EBS volume but perform builds and tests on the z: drive which is mapped to the free SSD ephemeral volume. That doesn't mean they aren't affected though.

1) The AMI creation process uses the c:\builds\hg-shared convention to house an initial clone of mozilla-central which builds use when they run their initial `hg share c:\builds\hg-shared\mozilla-central ...`. If reading from c: is slow on the first run, we're taking a hit there.

2) Additionally, there is an issue in generic-worker which incorrectly sets the APPDATA, LOCALAPPDATA, TMP, TEMP and USERPROFILE environment variables to folders on the c: drive (they should point at the z: drive, but don't). That causes both builds and tests to make heavy use of c: for temporary files. I described this problem here: https://bugzilla.mozilla.org/show_bug.cgi?id=1303455#c10
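As a rough illustration only (the real fix belongs in generic-worker itself, and the z: paths below are made up), the intended mapping is roughly:

  :: point per-task temp and profile locations at the ephemeral z: drive (paths illustrative)
  set TMP=Z:\tmp
  set TEMP=Z:\tmp
  set APPDATA=Z:\Users\task-user\AppData\Roaming
  set LOCALAPPDATA=Z:\Users\task-user\AppData\Local
  set USERPROFILE=Z:\Users\task-user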

Fixing the issue with the local cache in hg-shared is an interesting problem. Either we stop relying on a local cache and always clone (we'd be using clonebundles from regional s3 buckets, so that's probably faster anyway) or we'd have to create (or prime) the local cache on first boot of the spot instance. I'd love to get some advice on which approach is preferred and I'm happy to implement, either way.

The second issue needs to be patched in generic-worker. I've ni'd pmoore for comment on the feasibility of that.
Flags: needinfo?(rthijssen) → needinfo?(pmoore)
I don't have data on how long lived our Windows instances are at the moment. My hunch is that they generally live most of the day after being started up, and are killed off at night when load drops.

On Linux in buildbot, we're only using instances with local SSDs, so we don't suffer from this issue.

Does this problem affect just the root EBS volumes, or any EBS volume that's created from a snapshot?
Flags: needinfo?(catlee)
(In reply to Rob Thijssen (:grenade - GMT) from comment #9) 
> The second issue needs to be patched in generic-worker. I've ni'd pmoore for
> comment on the feasibility of that.

I'm currently investigating how I can make sure the USERPROFILE is set to a custom location for a newly created local user... We'll track the work on that in the other bug (bug 1303455).

I'm attempting first to adapt the underlying system calls, rather than adjust the environment variables directly, in case there are other references to the C:\Users location outside of the env variables that might not get updated by just fixing the env vars.
Flags: needinfo?(pmoore)
It affects any EBS volume created from a snapshot.
(In reply to Rob Thijssen (:grenade - GMT) from comment #9)
> TaskCluster Windows workers have the c: drive on an EBS volume but perform
> builds and tests on the z: drive which is mapped to the free SSD ephemeral
> volume. That doesn't mean they aren't affected though.

Good. It's worth noting that the ephemeral SSDs go away with C4 instances. We're already using C4 instances for TC Linux. (TC Linux mounts a fresh EBS volume for Docker foo though.)
 
> 1) The AMI creation process uses the c:\builds\hg-shared convention to house
> an initial clone of mozilla-central which builds use when they run their
> initial `hg share c:\builds\hg-shared\mozilla-central ...`. If reading from
> c: is slow on the first run, we're taking a hit there.

The initial "checkout" portion of this share will be relatively slow, as Mercurial will have to read data from the EBS snapshot. Fortunately, it doesn't need to read all the data (just the indexes and the portions of files holding the revision being checked out). But I suspect this will still be pretty slow, possibly slower than doing a stream clone to z:\.
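For comparison, a minimal sketch of priming a share base on the ephemeral drive instead (paths are hypothetical, the share extension is assumed to be enabled, and --uncompressed was the stream-clone flag in the Mercurial versions in use at the time):

  hg clone --uncompressed https://hg.mozilla.org/mozilla-central z:\builds\hg-shared\mozilla-central
  hg share z:\builds\hg-shared\mozilla-central z:\task\src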

> Fixing the issue with the local cache in hg-shared is an interesting
> problem. Either we stop relying on a local cache and always clone (we'd be
> using clonebundles from regional s3 buckets, so that's probably faster
> anyway) or we'd have to create (or prime) the local cache on first boot of
> the spot instance. I'd love to get some advice on which approach is
> preferred and I'm happy to implement, either way.

On Linux TC, we perform a clone+checkout on first task run.

In London, I discussed the bootstrap time versus task time approach with Jonas and others. Essentially, both operations take the same amount of time. So it comes down to whether you want this logic living in the TC "platform" layer or in the task layer. We decided it was best to have it in the task layer because:

* More visibility in task logs (also makes it easier to measure overhead)
* More control (easier to change tasks than change the worker)
* VCS interaction is really a task specific action, not a worker one
Depends on: 1305485
Blocks: 450645
No longer blocks: 450645
https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-win32-debug/1475592600/autoland_win7_vm-debug_test-web-platform-tests-7-bm138-tests1-windows-build279.txt.gz is a Windows 7 VM test taking like 30 minutes to perform an `hg update`. That's running on t-w732-spot-283 and the absurdly slow I/O is almost certainly due to this bug.
Depends on: 1307798
Assignee: relops → mcornmesser
Let's initially focus on the builders, since that's where we see so many issues with hg, but we should also split out the working directory for the testers as well. I believe Q mentioned a way we could mount a new drive under C:\; if that proves not to be the case, we'll need to work with releng to change the build locations to a new drive. I suspect that would be very complicated, so keeping the paths the same, if we can, is optimal.
As I mentioned in another bug, we can use an NTFS junction to map a path on c:\ to some other (EBS mounted) drive. These work like symlinks and everything should "just work" (/me knocks on wood).
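A minimal sketch of that approach, assuming the fresh EBS volume is mounted as D: and the paths are illustrative (an existing C:\builds would need to be moved aside first, since mklink requires the link path not to exist):

  :: create the target on the fresh volume, then create the junction at c:\builds
  mkdir D:\builds
  mklink /J C:\builds D:\builds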
Whiteboard: [Windows]
Depends on: 1378381
might be safe to close this bug. tc win implemented fixes and i'm not sure there's enough of a time/volume problem in buildbot to warrant addressing it.
tc win instances have been using fresh ebs mounts since these changes (several months ago now):
https://github.com/mozilla-releng/OpenCloudConfig/commit/bf26a64d67f5e74aa9fe901bbd93c48953cf3c65
https://github.com/mozilla-releng/OpenCloudConfig/commit/28dcf8c9d4370aae5b138382da085f48b7ad8469
I also believe we addressed most of the big offenders here.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED