Bug 1649701 Comment 4 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

(In reply to Marco Castelluccio [:marco] from comment #3)
> Would this be a problem for Searchfox?

This could potentially be a major regression for the engineers who use Searchfox, depending on how the change affects Searchfox, but I think we can coordinate things to save costs while also aligning searchfox indexing more closely with the completion of coverage runs, potentially reducing searchfox's indexer start latency.  The current coverage data provides an invaluable service: it lets engineers see whether the block of code they're looking at is ever run during tests, as well as the order of magnitude of coverage it gets.  Knowing this information at a glance can save significant amounts of engineer time.

Searchfox on mozilla-central currently runs 2x a day: an AWS EventBridge (CloudWatch) cron job triggers an AWS Lambda function, timed to try to match up with the 2x a day nightly builds.  On the TaskCluster side, we have moved from triggering searchfox jobs based on a TaskCluster cron job to just running them on every push, because we were having problems where the cron job would pick to run searchfox tasks on a job that didn't have coverage run against it.  Originally searchfox only ran 1x a day, but in practice this was heavily biased towards only providing engineers on the West Coast of North America with fresh indexing results.  I added a second run, which helped European time zones have fresh indexing results too.

I think ideally we would run coverage the same number of times we run searchfox, and ideally we could use the same decision task mechanism to only run the "searchfox" jobs (and dependencies like the bugzilla component job and test-all job) on builds for which we're triggering coverage and which we expect to result in a searchfox run.  There's no point running searchfox jobs if we're not running coverage jobs.  We could probably find a way to have the completion of coverage, or the proposed https://github.com/taskcluster/taskgraph/issues/71 mechanism, do something that could poke the AWS Lambda job that triggers searchfox for mozilla-central via a webhook or MQ or something.

Is it possible to quantify how much the full m-c coverage runs are costing so that management could decide if we can afford 2x m-c searchfox + supporting coverage runs a day?  Searchfox's AWS indexing job for m-c runs on an m5d.2xlarge in us-west-2, which costs $0.452 an hour if paying the (highest) on-demand price, and it runs for less than 2 hours, so it costs roughly $0.90 or less a run.  The resulting web server is a t3.large at $0.0832 an hour (on-demand), for a max of $2 a day.  Searchfox's own AWS costs are therefore unlikely to contribute appreciably to the cost decision itself.  If there's some attempt to categorize the CI costs, perhaps the coverage jobs directly corresponding to searchfox runs could be notionally tracked as searchfox costs?
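
As a rough sanity check of the figures above, here is a minimal sketch of the arithmetic.  The hourly prices are the on-demand numbers quoted in this comment; the function name, the 2-hour upper bound per indexer run, and the assumption of a single web server running 24 hours a day are illustrative assumptions, not anything taken from searchfox's actual configuration.

```python
# On-demand hourly prices quoted in the comment above (us-west-2).
INDEXER_HOURLY_USD = 0.452      # m5d.2xlarge
WEB_SERVER_HOURLY_USD = 0.0832  # t3.large


def daily_cost_usd(runs_per_day: int, hours_per_run: float = 2.0) -> float:
    """Upper-bound daily AWS cost for m-c searchfox (hypothetical helper).

    Assumes every indexer run takes the full `hours_per_run` (an upper
    bound, since runs finish in under 2 hours) and that one web server
    is up for all 24 hours of the day.
    """
    indexer = runs_per_day * hours_per_run * INDEXER_HOURLY_USD
    web_server = 24 * WEB_SERVER_HOURLY_USD
    return indexer + web_server


if __name__ == "__main__":
    for runs in (1, 2):
        print(f"{runs} run(s)/day: at most ${daily_cost_usd(runs):.2f}")
```

Under those assumptions, going from 1x to 2x a day adds at most about $0.90 a day on the searchfox side, which is why the coverage runs, not searchfox itself, are the number worth quantifying.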

In terms of skipping on weekends and holidays, again unless the costs are prohibitive, it's nice to run these at least 1x a day (assuming there have been new pushes) because:
- Sometimes searchfox runs fail, leaving us with a stale index.  It's much better for the stale index to be 12 or 24 hours old rather than 48 hours old.
- Searchfox does some house-cleaning on Saturdays.
- Firefox volunteer contributors may be contributing on weekends/holidays.  MoCo employees may also make use of flexible working hours and do work on the weekend.
