Closed Bug 1603467 Opened 6 years ago Closed 5 years ago

Stand up a dashboard measuring scheduler efficiency

Categories

(Firefox Build System :: Task Configuration, task, P2)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ahal, Assigned: ekyle)

References

(Blocks 1 open bug)

Details

(Whiteboard: [smart-sched])

We have a rudimentary metric (which we'll improve over time) that attempts to measure how efficient a scheduling algorithm is. We also have the ability to run so-called shadow-schedulers, and may implement another mechanism to measure scheduling changes in the task generation phase.

We should automate the process of collecting the requisite data and feed it into a dashboard so we can see at a glance how the different scheduling algorithms are performing. Then we can use this information to determine what gets run by default on autoland.

Priority: -- → P2

In such a dashboard, it would also be nice to show the following:

  • graph of the evolution of the number (and total duration, and cost if we can) of total tasks that could be run;
  • graph of the evolution of the number (and total duration, and cost if we can) of tasks scheduled;
  • graph of the evolution of the number of backouts and delay between landings and corresponding backouts.

We could call the dashboard "arewegreenyet", hinting at the quantity of carbon dioxide emissions we will prevent.

Assignee: nobody → klahnakoski
Whiteboard: [smart-sched]

CO2 Emissions Dashboard!

:ahal

I assume "how efficient a scheduling algorithm is" is located here: https://github.com/mozilla/ci-recipes/blob/master/recipes/scheduler_analysis.py#L135

That line assumes mach and hg are installed. What other setup is required?

Flags: needinfo?(ahal)

Yes, that is the one. You need a mozilla-central clone at the moment; I don't think there are any dependencies other than the ones installed by poetry install. It's best if you clone outside of that script and then pass it in via the --gecko-path argument.

Though, I think gecko is only necessary if you want to test a custom scheduling algorithm that you've implemented locally. If we only care about the shadow-schedulers (which we would for this dashboard), then gecko shouldn't be needed. That script was written 6 months ago and is in need of a rewrite on top of modern mozci.

Flags: needinfo?(ahal)

The baseline scheduler output can be found in the decision task, e.g. https://treeherder.mozilla.org/#/jobs?repo=autoland&searchStr=decision&selectedJob=291693194

The task-graph.json is the output with the (currently SETA) decisions applied: https://firefoxci.taskcluster-artifacts.net/fzgPWeChT-mcYbQwduwWzA/0/public/task-graph.json
The task labels are found at data.values().label

mozci has code to pull this information from an artifact: https://github.com/mozilla/mozci/blob/master/mozci/push.py#L277
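As a rough illustration of the data.values().label lookup mentioned above, here is a minimal sketch. It assumes the artifact is a JSON object mapping a task id to a task definition that carries a "label" field; the function name scheduled_labels and the sample data are hypothetical, and this is not the mozci implementation.

```python
from __future__ import annotations

import json


def scheduled_labels(task_graph: dict) -> set[str]:
    """Return the set of task labels in a task-graph.json style mapping.

    Assumes each value is a task definition with a "label" key; entries
    without one are skipped rather than raising.
    """
    return {
        task["label"]
        for task in task_graph.values()
        if isinstance(task, dict) and "label" in task
    }


# Hypothetical, hand-written sample shaped like the artifact described above.
SAMPLE_TASK_GRAPH = json.loads(
    """
    {
        "taskid-1": {"label": "build-linux64/opt"},
        "taskid-2": {"label": "test-linux64/opt-mochitest-1"},
        "taskid-3": {"kind": "no-label-here"}
    }
    """
)

if __name__ == "__main__":
    for label in sorted(scheduled_labels(SAMPLE_TASK_GRAPH)):
        print(label)
```

For real data, the same function could be applied to the parsed contents of a downloaded task-graph.json.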

Steps

  1. ETL the 'optimized_tasks.list' into a database (like BigQuery)
    • the problem is where to run this cron job
    • maybe the task can push the data into BigQuery directly?
  2. ETL the backout information into the same database
    • what's pulled, and how to process it, should be in mozci
  3. Write the analysis logic (not necessarily complicated)
    • the backout rate, and distance to backout (?mozci?)
    • number of tasks requested
    • total test hours requested (will require average run time per task type, from the treeherder data)
  4. Show the dashboard (using DataStudio)
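The analysis in step 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: PushRecord, summarize, and the avg_runtime_hours mapping are hypothetical names, the per-task average run times are assumed to have been computed separately from the treeherder data, and labels with no known average are counted as zero hours.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PushRecord:
    """Hypothetical per-push row, as it might look after the ETL steps."""

    push_id: int
    scheduled_labels: list[str]  # tasks the scheduler requested for this push
    backed_out: bool  # whether the push was later backed out


def summarize(pushes: list[PushRecord], avg_runtime_hours: dict[str, float]) -> dict:
    """Compute the simple measures listed in step 3 over a set of pushes."""
    tasks_requested = sum(len(p.scheduled_labels) for p in pushes)
    # Labels missing from avg_runtime_hours contribute 0.0 hours (an assumption).
    hours_requested = sum(
        avg_runtime_hours.get(label, 0.0)
        for p in pushes
        for label in p.scheduled_labels
    )
    backouts = sum(1 for p in pushes if p.backed_out)
    backout_rate = backouts / len(pushes) if pushes else 0.0
    return {
        "tasks_requested": tasks_requested,
        "hours_requested": hours_requested,
        "backout_rate": backout_rate,
    }


if __name__ == "__main__":
    example = [
        PushRecord(1, ["a", "b"], backed_out=False),
        PushRecord(2, ["a"], backed_out=True),
    ]
    print(summarize(example, {"a": 0.5, "b": 0.25}))
```

Distance to backout is left out here because it needs landing and backout timestamps, which this hypothetical record does not carry.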

Will it be possible to plug Redash into the BigQuery db?
I would also be interested in getting the end-to-end time (decision task to completion).
Thanks

I think end-to-end time is out of scope for this project. I do think this project will make a dent in it, though. Here is a dashboard showing scheduled->completion time:
https://datastudio.google.com/reporting/1Xo4joOq1PzlqF7iwq1SNmwbcQlLPHghV/page/Y5Vx

See Also: → 1620427

:Sylvestre I added the desire for end-to-end time to the MVP measures. We can discuss it when we discuss what exactly the MVP will be. At least it is not lost.

Depends on: 1623101

Should we close this bug out, Kyle? Or did you want to use it to track moving to Armen's repo?

Flags: needinfo?(klahnakoski)

Please keep this open; I am not done.

Flags: needinfo?(klahnakoski)
See Also: → 1634679
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED