Once we have test machines in production, we never recheck them for accuracy until someone reports a problem. How about we set up an idle-time job that runs once a month, on a weekend, to have all unittest/talos machines test the same changeset, and flag any machines that report differently from the rest? There are so many intermittent tests that it's hard to tell which failures are machine problems and which are test problems. To start with, we can re-test any machines that differ to see whether it's a machine problem or a test problem. This could also automatically compare test results of new machines in staging with the most recent idle-time test run before moving the new machines to production.
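A rough sketch of the comparison step, just to make the idea concrete. It assumes each machine's run of the same changeset can be reduced to a mapping of test name to status; the function name, the machine names, and the data layout are all hypothetical and not taken from any existing buildbot or talos code.

```python
from collections import Counter


def flag_outlier_machines(results):
    """Return {machine: [tests whose status differs from the pool majority]}.

    `results` is a hypothetical layout: {machine_name: {test_name: status}},
    where every machine ran the same changeset.
    """
    all_tests = sorted({test for run in results.values() for test in run})
    flagged = {}
    for test in all_tests:
        # Count how many machines reported each status for this test.
        counts = Counter(run[test] for run in results.values() if test in run)
        # On a tie, the status seen first wins; a real job would likely
        # treat ties as "inconclusive" and re-run instead.
        majority_status, _ = counts.most_common(1)[0]
        for machine, run in results.items():
            if test in run and run[test] != majority_status:
                flagged.setdefault(machine, []).append(test)
    return flagged


if __name__ == "__main__":
    sample = {
        "talos-01": {"test_a": "PASS", "test_b": "PASS"},
        "talos-02": {"test_a": "PASS", "test_b": "PASS"},
        "talos-03": {"test_a": "PASS", "test_b": "FAIL"},
    }
    print(flag_outlier_machines(sample))
```

Machines flagged here would only be candidates for the re-test step: a single differing run cannot by itself separate a bad machine from an intermittent test.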
John, back to you for prioritization!
7 years ago
In our pool-of-test-machines model, this isn't really feasible. We already have a pretty good workflow for pulling out bad machines, too.