Bug 1437243 (Closed): Opened 7 years ago, Closed 6 years ago
Further investigate ext4 formatting options
Categories: Taskcluster :: Workers, defect, P5
Tracking: Not tracked
Status: RESOLVED WONTFIX
People: Reporter: gps, Unassigned
References: Blocks 1 open bug
When you `mkfs.ext4`, the default behavior is to defer zeroing the inode tables until first mount, at which point the kernel spawns an ext4lazyinit thread to do the zeroing in the background.
If we format a 120 GB EBS volume with default inode settings (like we do in docker-worker today), we get 7,864,320 inodes of size 256 (as reported by `tune2fs -l`). e.g.:
$ sudo tune2fs -l /dev/nvme1n1
tune2fs 1.42.9 (4-Feb-2014)
Filesystem volume name: <none>
Last mounted on: /mnt
Filesystem UUID: 2642881f-72b9-4aff-829d-21a0c1586795
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 7864320
Block count: 31457280
Reserved block count: 1572864
Free blocks: 30915691
Free inodes: 7864309
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1016
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sat Feb 10 01:32:33 2018
Last mount time: Sat Feb 10 01:33:19 2018
Last write time: Sat Feb 10 01:33:19 2018
Mount count: 1
Maximum mount count: -1
Last checked: Sat Feb 10 01:32:33 2018
Check interval: 0 (<none>)
Lifetime writes: 132 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: ba09703a-c49b-42ec-953e-ac8d9b0eccdd
Journal backup: inode blocks
Assuming that inode size is in bytes, this comes out to 7,864,320 × 256 = 2,013,265,920 bytes (~1.9 GB) of write I/O on first mount of the EBS volume.
I haven't measured exactly, but dstat shows I/O rates of several dozen MB/s for several seconds immediately after the initial mount. So I believe we are writing ~2 GB on volume mount.
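As a sanity check on the estimate above, the write volume is just the inode count times the inode size, both taken from the `tune2fs -l` output:

```python
# Back-of-the-envelope check of the lazy-init write volume,
# using figures from the tune2fs -l output above.
inode_count = 7_864_320   # "Inode count"
inode_size = 256          # "Inode size", in bytes

table_bytes = inode_count * inode_size
print(table_bytes)            # 2013265920 bytes
print(table_bytes / 2**30)    # ~1.875 GiB
```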
This block zeroing is *possibly* hurting our performance during early instance startup. I think we still have plenty of I/O credits to absorb this heavy write I/O. But it may be stealing performance and preventing workers from reaching a ready state sooner.
The mitigation for this is to reduce the size of the inode table. That limits the number of inodes the volume can track, but I'm pretty sure we come nowhere close to inode exhaustion on docker-worker instances. The biggest consumers of inodes are likely VCS clones and checkouts, and at default provisioning ratios you would need >100 clones or checkouts of mozilla-central before inode exhaustion becomes a worry. So I think reducing the inode density (at least on larger volumes) is worth considering.
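A rough sketch of what reducing inode density buys us, assuming the volume geometry from the tune2fs output above. mke2fs's `-i` flag sets the bytes-per-inode ratio (default 16384); the 64 KiB ratio below is an illustrative assumption, not a tested recommendation, and real mke2fs rounds inode counts to per-group boundaries:

```python
# Sketch: effect of raising mke2fs's bytes-per-inode ratio (-i) on the
# inode table of the 120 GB volume described above. Approximate only:
# mke2fs rounds inodes to block-group boundaries.
block_count = 31_457_280   # "Block count" from tune2fs -l
block_size = 4096          # "Block size", bytes
inode_size = 256           # "Inode size", bytes
volume_bytes = block_count * block_size

def inode_count(bytes_per_inode):
    return volume_bytes // bytes_per_inode

default = inode_count(16_384)   # mke2fs default ratio
sparse = inode_count(65_536)    # hypothetical: mkfs.ext4 -i 65536

print(default)                  # 7864320, matching tune2fs above
print(sparse)                   # 1966080 inodes
print(sparse * inode_size)      # 503316480 bytes (~480 MiB to zero, vs ~1.9 GiB)
```

Roughly a 4x reduction in the inode table, so a 4x reduction in the lazy-init write I/O, while still leaving nearly 2 million inodes on the volume.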
Comment 1 • 6 years ago
gps: how much startup time are we likely to gain from experimenting with this change?
Flags: needinfo?(gps)
Priority: -- → P5
Comment 2 (Reporter) • 6 years ago
I'm not sure. A P5 feels like a good triage in the absence of concrete numbers. I think things like tuning the mount options to throw away filesystem consistency/durability protections and moving to c5d instances with non-EBS NVMe storage are much better time investments.
Flags: needinfo?(gps)
Updated (Assignee) • 6 years ago
Component: Docker-Worker → Workers
Updated • 6 years ago
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX