Closed Bug 1497898 Opened 5 years ago Closed 5 years ago

Optimise the manifest update.

Categories

(Testing :: web-platform-tests, enhancement)

Version 3
enhancement
Not set
normal

Tracking

(firefox65 fixed)

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: jgraham, Assigned: jgraham)

References

Details

Attachments

(15 files)

46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
Now that we run the manifest update every time we start wpt, the delay had become quite noticable. This can be optimised by using file metadata rather than file contents to prefilter the items that will be included in the update. We also see a lot of time is taken processing the gitignore rules, so the results of that can also be cached.
Depends on D8221
We end up with a lot of rules like (?:.*)/.*\.ext which are basically
trying to find the last component in a path and match against
that. These are rather slow to run so the easiest thing tdo is just
pass in the last component of the path when we know that's the only
thing the rule can match.

The changes to surrounding code to use this API will be made in future
commits.

Depends on D8222
This updates the gitignore implemenation to take input like os.walk
but with additional stat data for the files. It also makes several
useful optimistaions:

 * Avoid using regex when just matching a literal
 * Identify patterns that can only match the final component of a path
   and run those against that component rather than the full path.
 * Add the possibility of providing a dictionary of paths to gitignore
   statuses as a cache.

This dramatically reduces the amount of time we spend in gitignore
processing when updating the manifest.

Depends on D8223
When processing the manifest using the worktree, instead of reading
all files to see if the content changed, instead only process files where
the mtime has been updated since the previous run. Also cache the
result of running gitignore, so we can save a couple of seconds
processing the gitignore rules.

Depends on D8225
Compared to the normal os.walk this has a couple of differences:

* It returns lists of (name, stat) for filenames and directories,
  allowing callers to reuse the stat data without going back to the
  system to re-request it.

* Directories are always returned as paths relative to the root, and
  the root itself is returned as the empty string.

* It is non-recursive.

There are also a few features missing that aren't required for our use
cases.

Depends on D8226
The caches weren't being invalidated when the manifest itself
changed. To fix this the manifest itself has to be written before the
cache and the cache has to include data about the manifest that it's
associated with (the mtime and path are used for this purpose).

To make all this work requires a single method that can load the
manifest, update it, write the manifest and write the
caches. Therefore we introduce a single load_and_update method that is
intended to replace all previous use of the load() or update() methods
(and as a bonus handles manifest version mismatches in a single
place).

Depends on D8227
Depends on D8228
Depends on D8230
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/5e3b8ad4c8f4
Use testfile mtimes to pre-filter files iterated over for the manifest update, r=ato
https://hg.mozilla.org/integration/autoland/rev/9afac925aef8
Update the .gitignore file, r=ato
https://hg.mozilla.org/integration/autoland/rev/a6a89509add7
Allow the gitignore filter to work on name components only, r=ato
https://hg.mozilla.org/integration/autoland/rev/c3cb0408498c
Update the gitignore implementation to work as an iterator filter, r=ato
https://hg.mozilla.org/integration/autoland/rev/184bc31c33a6
Update the lint to the new gitignore API, r=ato
https://hg.mozilla.org/integration/autoland/rev/48be184d5377
Add manifest caches for the mtime and gitignore rules, r=ato
https://hg.mozilla.org/integration/autoland/rev/2caa5633dea1
Add a custom implementation of os.walk, r=ato
https://hg.mozilla.org/integration/autoland/rev/c482d18cc050
Fix the cache lifecycle, r=ato
https://hg.mozilla.org/integration/autoland/rev/3a9a7760db5c
Fix the .gitignore rules, r=ato
https://hg.mozilla.org/integration/autoland/rev/7bba4d617db6
Pass the manifest file directly into wpttest.from_manifest, r=ato
https://hg.mozilla.org/integration/autoland/rev/22a06c8c8dc6
Update manifest before tests, r=ato
https://hg.mozilla.org/integration/autoland/rev/57877c614829
Update gecko wpt manifest update to use caches, r=ato
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/13618 for changes under testing/web-platform/tests
Upstream PR was closed without merging
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/61bb9426c2da
Use testfile mtimes to pre-filter files iterated over for the manifest update, r=ato
https://hg.mozilla.org/integration/autoland/rev/60bb960edc7a
Update the .gitignore file, r=ato
https://hg.mozilla.org/integration/autoland/rev/4082767a2fbf
Allow the gitignore filter to work on name components only, r=ato
https://hg.mozilla.org/integration/autoland/rev/37d7cb9fafd3
Update the gitignore implementation to work as an iterator filter, r=ato
https://hg.mozilla.org/integration/autoland/rev/3601361c9ebb
Update the lint to the new gitignore API, r=ato
https://hg.mozilla.org/integration/autoland/rev/b17bb7f9ada2
Add manifest caches for the mtime and gitignore rules, r=ato
https://hg.mozilla.org/integration/autoland/rev/8d205db77cb6
Add a custom implementation of os.walk, r=ato
https://hg.mozilla.org/integration/autoland/rev/cd31e38cc943
Fix the cache lifecycle, r=ato
https://hg.mozilla.org/integration/autoland/rev/4f6a4937a2e2
Fix the .gitignore rules, r=ato
https://hg.mozilla.org/integration/autoland/rev/2d3c2a2938c4
Pass the manifest file directly into wpttest.from_manifest, r=ato
https://hg.mozilla.org/integration/autoland/rev/7144fb498d61
Update manifest before tests, r=ato
https://hg.mozilla.org/integration/autoland/rev/c7e37bb6cf0c
Update gecko wpt manifest update to use caches, r=ato
https://hg.mozilla.org/integration/autoland/rev/8ae5310b6412
Handle WindowsError trying to determine if git exists, r=ato
Upstream PR was closed without merging
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/aa24577bdeae
Use testfile mtimes to pre-filter files iterated over for the manifest update, r=ato
https://hg.mozilla.org/integration/autoland/rev/284749dd9ee9
Update the .gitignore file, r=ato
https://hg.mozilla.org/integration/autoland/rev/46c50835b997
Allow the gitignore filter to work on name components only, r=ato
https://hg.mozilla.org/integration/autoland/rev/215f0ae61298
Update the gitignore implementation to work as an iterator filter, r=ato
https://hg.mozilla.org/integration/autoland/rev/166466f3a5fe
Update the lint to the new gitignore API, r=ato
https://hg.mozilla.org/integration/autoland/rev/9d6171503744
Add manifest caches for the mtime and gitignore rules, r=ato
https://hg.mozilla.org/integration/autoland/rev/10e90d3295ee
Add a custom implementation of os.walk, r=ato
https://hg.mozilla.org/integration/autoland/rev/fe2d962a8ed2
Fix the cache lifecycle, r=ato
https://hg.mozilla.org/integration/autoland/rev/daf29e8abb1e
Fix the .gitignore rules, r=ato
https://hg.mozilla.org/integration/autoland/rev/7f5437635332
Pass the manifest file directly into wpttest.from_manifest, r=ato
https://hg.mozilla.org/integration/autoland/rev/750b40bcdad7
Update manifest before tests, r=ato
https://hg.mozilla.org/integration/autoland/rev/1a0c1f962ae4
Update gecko wpt manifest update to use caches, r=ato
https://hg.mozilla.org/integration/autoland/rev/3dedddeb0c8d
Handle WindowsError trying to determine if git exists, r=ato
https://hg.mozilla.org/integration/autoland/rev/6875c54fd481
Update jstests.py wpt integration for manifest optimisations, r=bbouvier,Ms2ger
Assignee: nobody → james
Pushed by wptsync@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aca5ae992b39
[wpt PR 13618] - [Gecko Bug 1497898] Use testfile mtimes to pre-filter files iterated over for the manifest update, a=testonly
Flags: needinfo?(james)
You need to log in before you can comment on or make changes to this bug.