Open Bug 1761730 Opened 3 years ago Updated 3 years ago

Investigate reducing the tppagecycles and cycles for DAMP

Categories

(DevTools :: General, task)

Tracking

(Not tracked)

People

(Reporter: jdescottes, Unassigned)

We currently have both cycles and tppagecycles set to 5, which means that each DAMP test (DevTools performance test) runs 5 x 5 = 25 times.

https://searchfox.org/mozilla-central/rev/b671b6390e88672543b9b7c82132be655bd98856/testing/talos/talos/test.py#583-585
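
As a quick check of the arithmetic, here is a minimal Python sketch. It assumes the number of runs per test is simply cycles multiplied by tppagecycles, as described above. The function name `total_runs` is made up for illustration and is not part of Talos.

```python
def total_runs(cycles: int, tppagecycles: int) -> int:
    """Runs per DAMP test, assuming runs = cycles * tppagecycles.

    Hypothetical helper for illustration only; not a Talos API.
    """
    return cycles * tppagecycles


current = total_runs(cycles=5, tppagecycles=5)   # current settings
proposed = total_runs(cycles=3, tppagecycles=3)  # proposed settings

print(f"current: {current} runs, proposed: {proposed} runs")
```

Under that assumption, going from 5/5 to 3/3 drops each test from 25 runs to 9.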

We could review those figures and lower them, especially since we often run the DAMP job several times to get more accurate results.

With the split happening at 1749928, our current jobs take between 10 and 25 minutes to run. Reducing the tppagecycles & cycles to 3 makes them take only between 5 and 10 minutes. The noise seems to be similar to what we currently have, but that's when pushing with --rebuild 6.

:sparky, do you have any advice to pick good values for cycles/tppagecycles?

For instance, even though we say that we often run the jobs several times (pushing with --rebuild 6 for instance), maybe we still need to keep those values quite high for the mozilla-central jobs, where I can see that DAMP by default runs only once.

I imagine that if we lower those too much and make the results more noisy, it might lead to more false alerts?

Flags: needinfo?(gmierz2)

You're correct that if we remove too many of our trials, results could be noisier and we may have issues catching regressions. We currently have a task to use replicates/trials rather than the median/mean value of the subtest in the Perfherder compare view. See bug 1674370 (I've set it as a blocker to this bug).

We're hoping to get this issue resolved this year, so I think we could look into adjusting these two settings after those changes land. They should resolve the main issue we have here: retriggers are needed to get a statistically significant result even though we already have almost all the data we need.
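
To illustrate why retriggers are needed today, here is a rough Python sketch of the difference between comparing on one summary value per job and comparing on every replicate. The data is made up, and `summary_values` / `replicate_values` are hypothetical helpers, not Perfherder code.

```python
from statistics import median


def summary_values(jobs):
    """One value per job: the median of that job's replicates."""
    return [median(job) for job in jobs]


def replicate_values(jobs):
    """Every replicate from every job, flattened into a single sample."""
    return [value for job in jobs for value in job]


# Made-up durations (ms) for a single job with 25 replicates.
one_job = [[100.0 + (i % 5) for i in range(25)]]

print(len(summary_values(one_job)), "data point(s) when using the median")
print(len(replicate_values(one_job)), "data point(s) when using replicates")
```

With summaries, a single job contributes one data point, so several retriggers are needed before a comparison has a usable sample. With replicates, the same job already contributes all of its measurements.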

After those changes, we could actually increase the number of trials being done here so you won't have to do so many retriggers to be confident in your results.

Let me know if this would solve your concerns. Otherwise, we could try reducing to 3, check the variance, and if the difference is minimal, we could make the switch.
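
One possible way to "check the variance" is to compare the coefficient of variation (standard deviation divided by mean) of the two configurations. The sketch below is only an assumption about how that check could look: the sample values are invented, the helper names are hypothetical, and the 1.2 threshold is an arbitrary placeholder, not an agreed-upon limit.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (needs >= 2 samples)."""
    return stdev(samples) / mean(samples)


def noise_is_acceptable(baseline, candidate, max_ratio=1.2):
    """True if the candidate's CV is at most max_ratio times the baseline's.

    max_ratio is an arbitrary placeholder threshold. A baseline with zero
    variance would raise ZeroDivisionError and needs separate handling.
    """
    ratio = coefficient_of_variation(candidate) / coefficient_of_variation(baseline)
    return ratio <= max_ratio


# Invented results (ms) for one subtest under each configuration.
with_5_cycles = [100, 102, 98, 101, 99]
with_3_cycles = [100, 104, 96]

print("CV with 5:", coefficient_of_variation(with_5_cycles))
print("CV with 3:", coefficient_of_variation(with_3_cycles))
print("acceptable:", noise_is_acceptable(with_5_cycles, with_3_cycles))
```

In practice the samples would come from real DAMP runs on both settings, ideally across several subtests.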

Depends on: 1674370
Flags: needinfo?(gmierz2)