Closed Bug 1568277 Opened 2 months ago Closed Last month

Support "shadow" optimizers for validating algorithms on autoland in real time

Categories

(Firefox Build System :: Task Configuration, task, P1)

Tracking

(firefox70 fixed)

RESOLVED FIXED
mozilla70

People

(Reporter: ahal, Assigned: ahal)

Bug 1555032 gave us the ability to test scheduling algorithms on historical data by using an environment variable to inject external optimizers into the task generation.

This mechanism is useful for quick prototyping and validation, but it suffers from a few problems:

  1. Some scheduling algorithms use data that is not pinned to the tree. For example, SETA uses the treeherder database. Code coverage uses external coverage data. This means the analysis isn't using the proper "snapshot" of these data sources that existed when they would have run in real life.

  2. Analysis takes a really long time. A month's worth of data is a multi-day computation, mostly due to running the task generation for each push.

To get around these issues, we need a way to run our experimental algorithms on live pushes (autoland, tier 3). The idea is that for each experimental algorithm, there is a "shadow" decision task that generates an optimized taskgraph using it. We can then go back in history and quickly download each artifact without the need for expensive computations. The artifact will also have been generated using, e.g., the SETA database at the proper moment in time, so comparisons with the baseline algorithm will be valid.
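
A minimal sketch of the kind of comparison this would enable, assuming each decision task's artifact has already been downloaded and reduced to a collection of task labels. The function name `compare_task_sets` and the labels are hypothetical; the real artifact format is not described in this bug.

```python
def compare_task_sets(baseline, shadow):
    """Compare the tasks kept by the baseline and by a shadow optimizer.

    Both arguments are iterables of task labels (an assumed input format).
    """
    baseline, shadow = set(baseline), set(shadow)
    return {
        # Tasks the baseline scheduled but the shadow optimizer removed.
        "only_baseline": sorted(baseline - shadow),
        # Tasks only the shadow optimizer scheduled.
        "only_shadow": sorted(shadow - baseline),
        # Tasks both algorithms agreed on.
        "common": sorted(baseline & shadow),
    }


# Made-up labels, for illustration only.
result = compare_task_sets(
    ["build-a", "test-a-1", "test-a-2"],
    ["build-a", "test-a-2"],
)
print(result["only_baseline"])
```

Running this per push over a date range would give a rough picture of how much an experimental algorithm diverges from the baseline, without regenerating any taskgraphs.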

I'll likely do a bit of minor refactoring of optimize.py (i.e., turn it into a directory) as a prerequisite to this work.

Depends on D40206

Attachment #9082239 - Attachment description: Bug 1568277 - [taskgraph] Refactor optimization code → Bug 1568277 - [taskgraph] Use a 'register_strategy' decorator in optimize.py
Attachment #9082240 - Attachment description: Bug 1568277 - [taskgraph.optimize] Split strategies out into a separate file → Bug 1568277 - [taskgraph] Split optimize strategies out into a separate file
Attachment #9082241 - Attachment description: Bug 1568277 - [taskgraph] Create optimize strategy aliases → Bug 1568277 - [taskgraph] Create optimize strategy aliases for the 'test' kind
Attachment #9082244 - Attachment is obsolete: true
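
For readers unfamiliar with the pattern, a 'register_strategy' decorator like the one named in the attachment description above could look roughly like this. This is a sketch under assumptions, not the code from the patch: only the decorator name comes from this bug, while `registry`, `OptimizationStrategy`, `should_remove_task`, and the example strategies are illustrative.

```python
# Hypothetical module-level registry mapping strategy names to instances.
registry = {}


def register_strategy(name, args=()):
    """Class decorator: instantiate the strategy and store it under `name`."""
    def wrap(cls):
        if name in registry:
            raise ValueError("duplicate strategy name: {}".format(name))
        registry[name] = cls(*args)
        return cls
    return wrap


class OptimizationStrategy(object):
    """Assumed base class; by default a task is never removed."""

    def should_remove_task(self, task, params, arg):
        return False


@register_strategy("never")
class Never(OptimizationStrategy):
    pass


@register_strategy("always")
class Always(OptimizationStrategy):
    def should_remove_task(self, task, params, arg):
        return True
```

The appeal of this design is that defining a strategy class is enough to make it available by name, so no central list has to be kept in sync by hand.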

Thanks for all the great comments. I'll think about how to rework them all into a coherent series. Probably won't have it ready until next week though.

See Also: → 1572514
Attachment #9082243 - Attachment description: Bug 1568277 - [taskgraph] Make SETA class reusable → Bug 1568277 - [taskgraph] Pass push and time intervals into SETA.is_low_value_task
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9b89f970d46b
[taskgraph] Remove some dead code in optimize.py r=tomprince
https://hg.mozilla.org/integration/autoland/rev/c8f797a19731
[taskgraph] Use a 'register_strategy' decorator in optimize.py r=tomprince
https://hg.mozilla.org/integration/autoland/rev/cb35fd836621
[taskgraph] Move 'taskgraph.transforms.job.import_all' to a utility function r=tomprince
https://hg.mozilla.org/integration/autoland/rev/d2b1d6c0a732
[taskgraph] Split optimize strategies out into a separate file r=tomprince
https://hg.mozilla.org/integration/autoland/rev/391a90f3f02b
[taskgraph] Create optimize strategy aliases for the 'test' kind r=tomprince
https://hg.mozilla.org/integration/autoland/rev/f8b41cbaaf8e
[taskgraph] Ensure user specified optimization strategies update instead of replace the default ones r=tomprince
https://hg.mozilla.org/integration/autoland/rev/7b59ed5d703d
[taskgraph] Pass push and time intervals into SETA.is_low_value_task r=tomprince
https://hg.mozilla.org/integration/autoland/rev/d7e8f80e2c85
[taskgraph] Merge SETA implementation with optimization strategy r=tomprince
https://hg.mozilla.org/integration/autoland/rev/632d943c947b
[tasgraph] Add ability to redirect |mach taskgraph|'s output to a file, r=tomprince
https://hg.mozilla.org/integration/autoland/rev/056d9515483c
[ci] Add an experimental SETA optimize strategy and task to run it r=tomprince
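
The "update instead of replace" behaviour named in one of the commit titles above can be illustrated with a small sketch. The helper `merge_strategies` and the dictionary contents are assumptions made for illustration; the actual patch may be structured differently.

```python
def merge_strategies(defaults, overrides=None):
    """Return the defaults with user-specified strategies layered on top.

    Strategies the user did not mention keep their default implementation,
    rather than disappearing as they would if `overrides` simply replaced
    `defaults`.
    """
    merged = dict(defaults)        # copy so the defaults are not mutated
    merged.update(overrides or {})
    return merged


# Placeholder values stand in for strategy objects.
defaults = {"never": "default-never", "seta": "default-seta"}
overrides = {"seta": "experimental-seta"}
print(merge_strategies(defaults, overrides))
```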

Backed out for breaking ./mach bootstrap:

https://hg.mozilla.org/mozilla-central/rev/8da8443e0bcb7a6d9766d179332443660c926d8b

Error running mach:

    ['artifact', 'toolchain', '--from-build', 'win64-node']

The details of the failure are as follows:

ImportError: cannot import name IndexSearch

  File "c:\Users\fuzz1\trees\mozilla-central\python/mozbuild/mozbuild/artifact_commands.py", line 295, in artifact_toolchain
    from taskgraph.optimize import IndexSearch
Status: RESOLVED → REOPENED
Flags: needinfo?(ahal)
Resolution: FIXED → ---
Target Milestone: mozilla70 → ---

I ran into this issue in comment 15 and :Aryx helped me back it out.

To test this properly on a clone that is not fresh, I had to remove all *.pyc files in mozilla-central: https://stackoverflow.com/a/925597/445241

Removing all pyc files via find . -name '*.pyc' -delete makes mach bootstrap work on 9229fd85bc05, but not on tip (prior to comment 15's backout).
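
For anyone on a platform without find, a rough Python equivalent of the command above is sketched below. The function name `remove_pyc` is hypothetical, and the sketch is meant to behave like the find invocation, not to replace it.

```python
import os


def remove_pyc(root):
    """Delete every *.pyc file under `root` and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Calling `remove_pyc(".")` from the top of the clone would mirror the find invocation.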

Attachment #9085834 - Attachment description: Bug 1568277 - [tasgraph] Add ability to redirect |mach taskgraph|'s output to a file, r?tomprince → Bug 1568277 - [taskgraph] Add ability to redirect |mach taskgraph|'s output to a file, r?tomprince
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/60e1e6fe37b0
[taskgraph] Remove some dead code in optimize.py r=tomprince
https://hg.mozilla.org/integration/autoland/rev/16abfb2f6a4e
[taskgraph] Use a 'register_strategy' decorator in optimize.py r=tomprince
https://hg.mozilla.org/integration/autoland/rev/b1a9c2bf303f
[taskgraph] Move 'taskgraph.transforms.job.import_all' to a utility function r=tomprince
https://hg.mozilla.org/integration/autoland/rev/856b78de0b75
[taskgraph] Split optimize strategies out into a separate file r=tomprince
https://hg.mozilla.org/integration/autoland/rev/b77bd375f87f
[taskgraph] Create optimize strategy aliases for the 'test' kind r=tomprince
https://hg.mozilla.org/integration/autoland/rev/be3aa560097c
[taskgraph] Ensure user specified optimization strategies update instead of replace the default ones r=tomprince
https://hg.mozilla.org/integration/autoland/rev/1b6dcd65b6e7
[taskgraph] Pass push and time intervals into SETA.is_low_value_task r=tomprince
https://hg.mozilla.org/integration/autoland/rev/ca94d50fd957
[taskgraph] Merge SETA implementation with optimization strategy r=tomprince
https://hg.mozilla.org/integration/autoland/rev/79c6b6bb28be
[taskgraph] Add ability to redirect |mach taskgraph|'s output to a file, r=tomprince
https://hg.mozilla.org/integration/autoland/rev/eaf06f57c250
[ci] Add an experimental SETA optimize strategy and task to run it r=tomprince
Flags: needinfo?(ahal)