Closed
Bug 1272060
Opened 8 years ago
Closed 7 months ago
Use low I/O, volatile storage for Places when running mochitests
Categories
(Toolkit :: Places, defect, P3)
Toolkit
Places
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: gps, Unassigned)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
Let's use the preference from bug 1272025 to drastically lower Places SQLite related I/O when running mochitests. I'll write up in the commit message why I think this is safe to do.
Reporter | ||
Comment 1•8 years ago
|
||
By default, Places opens a SQLite database using a disk-based WAL journal. When commits are performed, the SQLite files are flushed to the filesystem - effectively waiting on the disk to acknowledge writes. This can be very slow, especially on spinning disks. This commit changes the test environment to put Places/SQLite in low-I/O, volatile storage mode. In this mode, the journal will reside in memory and disk writes are only written to the OS - we don't wait for data to be flushed to disk. This drastically lowers the amount of I/O required for SQLite operations. The I/O required by Places/SQLite when running tests can be significant. Tests like reftests, mochitests, and WPT effectively load hundreds or even thousands of pages in rapid order. Each page load has the potential to generate I/O via Places recording the history visit. In effect, these tests are stress testing Places/SQLite. In bug 1271035, we recorded ~50 GB of requested I/O related to Places when running reftests on Windows. In automation, 10+ GB disk I/O was incurred during job execution. Disabling Places wholesale in reftests elimited most of this I/O, significantly speeding up test execution. While this commit reduces the robustness of Places/SQLite storage, I don't think it is relevant for tests. This is because our tests are verifying that Places works, not that SQLite writes are durable. The change to use more volatile SQLite storage shouldn't make a difference to the behavior of Places for purposes of testing Firefox. So the net effect is we make Places/SQLite significantly faster without sacrificing Places test coverage. TODO document improvements measured in automation. Review commit: https://reviewboard.mozilla.org/r/52009/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/52009/
Attachment #8751400 -
Flags: review?(mak77)
Attachment #8751400 -
Flags: review?(jmaher)
Reporter | ||
Comment 2•8 years ago
|
||
Comment on attachment 8751400 [details] MozReview Request: Bug 1272060 - Put Places in low-I/O, volatile storage mode during tests Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52009/diff/1-2/
Comment 3•8 years ago
|
||
Comment on attachment 8751400 [details] MozReview Request: Bug 1272060 - Put Places in low-I/O, volatile storage mode during tests https://reviewboard.mozilla.org/r/52009/#review48851 while I like this, I do wonder: 1) how much time this actually saves (I suspect a decent amount) 2) will the create lack of coverage on the places database- right now we probably hit a lot of odd edge cases, with this patch I would imagine less Does this make sense for talos? Should we write a unittest that stresses the places db?
Attachment #8751400 -
Flags: review?(jmaher) → review+
Reporter | ||
Comment 4•8 years ago
|
||
https://reviewboard.mozilla.org/r/52009/#review48851 I'll add the job time numbers once I get them. I only have data for Linux so far. It doesn't change much on Linux because our I/O there is pretty decent. It should result in measurable savings on OS X and Windows because spinning disks. I argue this doesn't really cause a lack of coverage on the Places database (see commit message). But I have flagged Marco for review just to be sure. I think we should *not* change Talos (at least the Talos tests measuring page load performance) because Places can have an impact on Talos performance and we want Talos to be testing what we ship to users. To clarify: Talos doesn't inherit prefs_general.js, right? I thought it had its own preferences list. But I could be wrong...
Comment 5•8 years ago
|
||
I don't think this should happen for every mochitest harness, it should only happen for those not explicitly testing Places. That is, I think it should not happen for mochitest-browser, and should not happen for mochitests in places/ folders. The reason is that mochitest-browser is closer to the real UI usage and can sometimes uncover concurrency bugs that only exist when we are in WAL mode (intermittents often uncovered those bugs). I care a less about other mochitests , even if that is still true, cause there are less interesting code paths hit. In general, we should always test what we ship to the users, not something else. The other problem I have is that we should not allow final users to set this pref, it will be very tempting for users or "perf add-ons" to set this pref, and then we'd have to deal with broken dbs reports caused by this non standard setting, and we may even not know the pref is set. Which alternatives do we have to make this a build time options that only applied to tinderbox builds but not to Nightly/Aurora/Beta/Release?
Comment 6•8 years ago
|
||
talos has its own preference definitions, so this change won't affect it. I am fine leaving this alone in talos, thanks for the thoughts here. ::mak, is there a way that we could identify the list of tests to not have this pref? We could set it in the mochitests (ideally in something like head.js for browser-chrome), or since we do run-by-dir, we can find a way to either fix this in the manifest files (.ini files) to set a pref or have the harness determine if we are in this list of directories and change the prefs.
Comment 7•8 years ago
|
||
once we have a way to ignore the pref in user-released builds, surely we could use head files for exclusions. As I said, any tests in toolkit/components/places/ and browser/components/places and any mochitest-browser test in browser/ (and subfolders) should test the real thing we ship
Reporter | ||
Comment 8•8 years ago
|
||
Release engineering is now using "build promotion" which means we take normal builds and ship them to users. So e.g. a PGO build on central becomes Nightly. This means we can't simply have a build time flag control the behavior: we need something that can be set at run time. We have undocumented prefs that enable unwanted behavior. "security.turn_off_all_security_so_that_viruses_can_take_over_this_computer" comes to mind. Perhaps we could rename the pref to "places.allow_database_to_lose_data_and_become_corrupt" or something. And we can have AMO screen for extensions setting that pref as an extra guard against people getting this set. As for per-directory and per-suite options, that makes sense. I'll see what I can cook up.
Reporter | ||
Comment 9•8 years ago
|
||
Comment on attachment 8751400 [details] MozReview Request: Bug 1272060 - Put Places in low-I/O, volatile storage mode during tests Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52009/diff/2-3/
Attachment #8751400 -
Attachment description: MozReview Request: Bug 1272060 - Put Places in low-I/O, volatile storage mode during tests; r?jmaher, mak → MozReview Request: Bug 1272060 - Put Places in low-I/O, volatile storage mode during tests
Reporter | ||
Comment 10•8 years ago
|
||
I /think/ the OS X 10.10 opt bc7 job at https://treeherder.mozilla.org/#/jobs?repo=try&revision=977aef8d14c758edfd3641c7073bcd709bb566c3 increased the frequency of some intermittents of tests touching Places. That Linux ASAN mda intermittent is also kinda wonky. At first glance it doesn't appear related to Places. But the failure rate is pretty high...
Comment 11•8 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #8) > Release engineering is now using "build promotion" which means we take > normal builds and ship them to users. So e.g. a PGO build on central becomes > Nightly. This means we can't simply have a build time flag control the > behavior: we need something that can be set at run time. Could we have something that is set only when tests run instead, like an env variable? So when we read the pref, if the env var is not set we ignore it. I don't care that much about the fact we already have footguns, 2 wrongs don't make a right :)
Reporter | ||
Comment 12•8 years ago
|
||
Yes, making the pref only work if an env variable is set would be fine and easy to do. I don't think it will stop people from footgunning themselves. But it is better than nothing.
Comment 13•8 years ago
|
||
At some point in the past I remember we looked into why xpcshell tests were so slow on Windows (bug 617503), and determined that it was mostly disk I/O. Unfortunately we probably can't turn this off for the xpcshell tests that are actually testing Places. Maybe we could try getting faster disks in our test machines?
Comment 14•8 years ago
|
||
If you can build something like a statistics of the tests that are doing more I/O, it could have 2 advantages: 1. we could look into them and maybe figure out alternatives to reduce it 2. maybe by looking into those, we could find perf improvements that would turn to the final users Something like the Orange Factor for I/O, even a very simple hall of shame.
Comment 15•8 years ago
|
||
Comment on attachment 8751400 [details] MozReview Request: Bug 1272060 - Put Places in low-I/O, volatile storage mode during tests https://reviewboard.mozilla.org/r/52009/#review50302 I'm ok with the idea, but we should exclude some tests harnesses (comment 5) that are critical for Places functionality. plus we need the new ENV var for this to be effective.
Attachment #8751400 -
Flags: review?(mak77)
Reporter | ||
Comment 16•7 years ago
|
||
I'm not actively working on this.
Assignee: gps → nobody
Status: ASSIGNED → NEW
Comment 17•7 years ago
|
||
fwiw, last year perf improvements and now bug 1356220 should have helped here, by increasing the wal autocheckpoint value to 2MB, that is unlikely to be hit by tests. So while we'll write yet, we likely won't fsync that much and some pages may be written less times to the db. We could also expose a pref to further increase the autocheckpoint in tests if it helps, and that pref could indeed be used from prefs_general. We should re-run the `mach reftest` test with Sysinternals Process Monitor and check how bad the current situation is yet, and maybe measure if a 10MB autocheckpoint would make it even better. I'll see if I can find time to do those couple measurements in the next weeks.
Flags: needinfo?(mak77)
Reporter | ||
Comment 18•7 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #17) > We should re-run the `mach reftest` test with Sysinternals Process Monitor > and check how bad the current situation is yet, and maybe measure if a 10MB > autocheckpoint would make it even better. > I'll see if I can find time to do those couple measurements in the next > weeks. Just don't do it with reftests: we disabled places in reftests in bug 1271035 because places isn't relevant to reftests.
Comment 19•7 years ago
|
||
didn't find the time, but it's still a valid concern. Moving to Places backlog. considered the improvementa and the fact one can disable global history in browsers, the issue is workaroundable. Also, if we end up not using places.database.disableDurability (and we are currently not using it) we may evaluate removing it.
Component: Mochitest → Places
Flags: needinfo?(mak77)
Priority: -- → P3
Product: Testing → Toolkit
Version: Version 3 → unspecified
Updated•2 years ago
|
Severity: normal → S3
Comment 20•7 months ago
|
||
I think it's worth investigating disabling history for certain tests subsets, rather than overall.
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•