Closed Bug 1398355 Opened 7 years ago Closed 5 years ago

Nerf filesystem consistency defaults

Tracking

(Not tracked)

Status:

RESOLVED WONTFIX

People

(Reporter: gps, Unassigned)

References

(Blocks 1 open bug)

Details

Gregory Szorc [:gps]

Reporter

Description

•

7 years ago

eatmydata (https://github.com/stewartsmith/libeatmydata) is an LD_PRELOAD library and helper command that essentially turns expensive I/O primitives like fsync() into no-ops.

My testing shows that it can drastically speed up various operations. For example, when building the desktop1604-test Docker image, the non-download part of an `apt-get install` which pulls in ~1745 packages goes from ~585s to ~180s (31% of original)! (dpkg will fsync things as part of and in-between package installs as I understand it.) My machine has a reasonably fast SSD and I don't perceive I/O wait to be a problem. So I'm guessing the savings in automation will be even greater.

We should consider using eatmydata where I/O correctness isn't required and we don't care about potential for data loss.

Candidates for eatmydata include:

* Docker image building (the entire task)
* Toolchain tasks
* VCS checkouts (if repo corruption occurs robustcheckout should be able to recover)
* Task setup and teardown. Tooltool downloads. Archive extraction. etc.
* Parts of the Firefox build (compiling, test archive generation, symbols generation, but not `make check`)

Where we shouldn't use eatmydata:

* In tests (unless we're absolutely sure misbehaving I/O patterns won't interfere with test accuracy - and even then)
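To get a rough feel for the cost described above, the following is a minimal sketch that times the same small-file workload with and without fsync(). It is not the measurement used in this bug; the function names, file count, and payload size are illustrative assumptions, and the gap will be small or absent if the temporary directory lives on tmpfs.

```python
import os
import tempfile
import time


def write_files(directory, count, use_fsync):
    """Write `count` small files, optionally calling fsync() on each one.

    Returns the elapsed wall-clock time in seconds.
    """
    payload = b"x" * 4096  # illustrative payload size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"file-{use_fsync}-{i}")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, payload)
            if use_fsync:
                # This is the kind of call that eatmydata turns into a no-op.
                os.fsync(fd)
        finally:
            os.close(fd)
    return time.perf_counter() - start


def compare(count=200):
    """Time the same workload with and without fsync()."""
    with tempfile.TemporaryDirectory() as tmp:
        with_sync = write_files(tmp, count, use_fsync=True)
        without_sync = write_files(tmp, count, use_fsync=False)
    return with_sync, without_sync


if __name__ == "__main__":
    with_sync, without_sync = compare()
    print(f"with fsync:    {with_sync:.3f}s")
    print(f"without fsync: {without_sync:.3f}s")
```

Running the script on the filesystem that backs a worker's task directory, rather than the default temporary directory, would give a more representative comparison.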

Gregory Szorc [:gps]

Reporter

Comment 1

•

7 years ago

I don't think this will matter much for Mercurial clones or checkouts because Mercurial doesn't abuse fsync().

Gregory Szorc [:gps]

Reporter

Comment 2

•

7 years ago

Jonas and I were discussing this IRL. We think adjusting the ext4 mount options in docker-worker to nerf filesystem safety is a better approach because it is global. "nobarrier" might be sufficient. We may also want to use data=writeback to make journaling faster. And we may want to tune the page cache settings so Linux doesn't wait on cached data to flush before allowing more I/O.
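As a sketch of what that configuration could look like, the helpers below assemble an ext4 option string and a sysctl fragment. The device path, mount point, and the numeric values for `vm.dirty_background_ratio` and `vm.dirty_ratio` are assumptions chosen for illustration; nothing in this bug specifies them, and docker-worker's actual provisioning code may look quite different.

```python
VALID_DATA_MODES = ("journal", "ordered", "writeback")

# Illustrative values only: let more dirty data accumulate in the page cache
# before writers are throttled. Real values would need measurement.
PAGE_CACHE_SYSCTLS = {
    "vm.dirty_background_ratio": 50,
    "vm.dirty_ratio": 80,
}


def ext4_mount_options(barriers=False, data_mode="writeback"):
    """Build a comma-separated ext4 mount option string."""
    if data_mode not in VALID_DATA_MODES:
        raise ValueError(f"unknown ext4 data mode: {data_mode!r}")
    options = ["defaults"]
    if not barriers:
        # Disable write barriers; unsafe on power loss.
        options.append("nobarrier")
    options.append(f"data={data_mode}")
    return ",".join(options)


def fstab_line(device, mount_point, options):
    """Render a single /etc/fstab entry for an ext4 filesystem."""
    return f"{device} {mount_point} ext4 {options} 0 0"


def sysctl_conf(settings):
    """Render sysctl settings in sysctl.conf syntax, sorted by key."""
    return "\n".join(f"{key} = {value}" for key, value in sorted(settings.items()))


if __name__ == "__main__":
    # /dev/xvdb and /mnt are hypothetical placeholders.
    print(fstab_line("/dev/xvdb", "/mnt", ext4_mount_options()))
    print(sysctl_conf(PAGE_CACHE_SYSCTLS))
```

Because these settings apply to the whole mount, every task on the worker inherits them, which is the "global" property mentioned above.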

Gregory Szorc [:gps]

Reporter

Comment 3

•

7 years ago

This now appears to be a docker-worker bug. I just touched this docker-worker code for bug 1415725. Once those PRs are accepted, I could look at this.

Another wrinkle here is invalidating test results. But unless we're testing performance or filesystem robustness, I'm not sure how tweaking things would invalidate test results. From the perspective of everything in userland, the kernel preserves POSIX semantics around filesystem state regardless of what various buffers and caches are doing under the hood.

Worst case, we may need a per-worker or per-task setting to control behavior. And per-task is difficult, since multiple tasks could be running simultaneously. Probably better to make it per-worker.
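The point about POSIX semantics can be illustrated with a small sketch: data written without fsync() is still visible to a later read on the same machine, because reads are served from the kernel's view of the file regardless of whether it has reached disk. The helper name is hypothetical.

```python
import os
import tempfile


def write_then_read_without_fsync(data):
    """Write `data` to a new file without fsync(), then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
        try:
            # No fsync() here: the data may only be in the page cache.
            os.write(fd, data)
        finally:
            os.close(fd)
        # A separate open() still observes the written bytes.
        with open(path, "rb") as handle:
            return handle.read()


if __name__ == "__main__":
    print(write_then_read_without_fsync(b"hello"))
```

What skipping fsync() gives up is durability across a crash or power loss, not consistency as seen by running processes.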

Component: Task Configuration → Docker-Worker

Summary: Use eatmydata when I/O correctness isn't required → Nerf filesystem consistency defaults

Gregory Szorc [:gps]

Reporter

Comment 4

•

7 years ago

Thinking about this more, we can implement this as a per-task flag in TaskGraph. `run-task` can manage the use of eatmydata by looking at an environment variable, etc.
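A minimal sketch of that idea follows. It is not the actual `run-task` code: the environment variable name `TASK_USE_EATMYDATA` and the `wrap_command` helper are hypothetical, and the only thing assumed about eatmydata is that it can be used as a command prefix, as its helper command allows.

```python
import os
import shutil

# Hypothetical variable name; a TaskGraph flag would set this per task.
EATMYDATA_ENV_VAR = "TASK_USE_EATMYDATA"


def wrap_command(argv, env=None, which=shutil.which):
    """Prefix `argv` with eatmydata when the task opted in and it is installed.

    `which` is injectable so the lookup can be replaced in tests.
    """
    env = os.environ if env is None else env
    if env.get(EATMYDATA_ENV_VAR) != "1":
        return list(argv)
    if which("eatmydata") is None:
        # Fall back to running the command unmodified.
        return list(argv)
    return ["eatmydata"] + list(argv)


if __name__ == "__main__":
    print(wrap_command(["apt-get", "install", "-y", "example-package"]))
```

Falling back silently when the binary is missing keeps a task runnable on images that do not ship eatmydata, at the cost of hiding a misconfiguration; a stricter variant could raise instead.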

Jonas Finnemann Jensen (:jonasfj)

Comment 5

•

7 years ago

Agreed. eatmydata (which works via LD_PRELOAD) should probably be done in-tree, rather than at worker level. But we could still tweak filesystem parameters to disable journaling, etc.

John Ford [:jhford] CET/CEST Berlin Time

Updated

•

7 years ago

Priority: -- → P5

Nobody; OK to take it and work on it

Assignee

Updated

•

6 years ago

Component: Docker-Worker → Workers

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

5 years ago

Status: NEW → RESOLVED

Closed: 5 years ago

Resolution: --- → WONTFIX

Categories

(Taskcluster :: Workers, enhancement, P5)
