Closed Bug 1254964 Opened 4 years ago Closed 3 years ago
Enable manual taskcluster scheduling for valgrind-mochitest runs
This bug builds on bug 1245566. That added much of the framework needed to allow taskcluster valgrind/mochitest runs using special try syntax, but stopped short of completely solving the problem, due to concerns described at https://bugzilla.mozilla.org/show_bug.cgi?id=1245566#c8 This bug aims to finish the job, so that, once landed, it is possible to schedule such a run merely by supplying the correct try syntax, for example: -b o -p linux64_tc -u mochitest-valgrind -t none
https://bugzilla.mozilla.org/show_bug.cgi?id=1245566#c8 appears to list two areas of concern for the uncommitted sections:  the "linux64" build got renamed to "linux64_tc"  adding a "mochitest-valgrind" test description would cause the tests to get run for "-u all", which is not desired. This patch attempts to address  by cloning the two "linux64" parts so as to create "linux64_tc" equivalents, rather than simply renaming them. I do not know how to solve . Any suggestions?
I will be away until March 23rd. Please find another reviewer before that, otherwise, I will get to it at that point. My apologies for the inconvenience.
With slight revisions after discussing w/ jmaher and dustin on irc.
Comment on attachment 8730715 [details] [diff] [review] bug1254964-2.cset Review of attachment 8730715 [details] [diff] [review]: ----------------------------------------------------------------- From bug 1245566 I think what you're trying to do here is to define a test job that does not run on try by default, but can be run on-demand. I believe that the test job requires a specific build of Firefox (with Valgrind), and not the normal linux64 build. Again from that bug, it sounds like the not-by-default is because the suite isn't green yet. Is that about right? For the not-green-yet bit, Armen's answer in that bug is good: keep a patch around that adds the job, and include that in your try pushes until the job is green. Another (more expensive) option is to land that patch but hide the job in treeherder. I think it's possible to define a job but not schedule it by default -- I just don't see it as a great option. Aside from that, I'm confused by this patch -- it looks like linux64-mochitest-valgrind is an exact duplicate of linux64. How would the resulting build differ?
(In reply to Dustin J. Mitchell [:dustin] from comment #4) Thank you for looking at this in some detail. > From bug 1245566 I think what you're trying to do here is to define a test > job that does not run on try by default, but can be run on-demand. Yes, correct. > I believe that the test job requires a specific build of Firefox (with > Valgrind), and not the normal linux64 build. No, I don't think that the test job requires a (new) specific build of Firefox. As far as I can see, the standard linux64 opt build is done with the relevant flags (--enable-valgrind, --disable-jemalloc). I think that is because there is already a short on-every-push Valgrind run on try, "Linux x64 opt Valgrind Build (V)" and so the build needs to be configured to make that work anyway. And I believe that this test job just uses the "Linux x64 opt" build. > Again from that bug, it sounds like the not-by-default is because the > suite isn't green yet. Is that about right? Partially, but there are other reasons. The not-by-default is because a complete mochitest-on-valgrind-on-a-slow-EC2-instance run requires about 200 CPU hours. It was never the intention to have this be a default on-every-push job. Rather, the intention was for it to be a Tier-2 job that is run once every 12 or 24 hours. Making it green is difficult to say the least. The runs (it is split into 40 pieces) go orange not only because of Valgrind errors, but also because of any oranges that would have occurred anyway even running not on Valgrind. Also there are timeout issues that occur more frequently than on "normal" runs, because Valgrind slows down execution so much. I have dealt with some of them, but there is more work to do here. > For the not-green-yet bit, Armen's answer in that bug is good: keep a patch > around that adds the job, and include that in your try pushes until the job > is green. I understand that .. but for the reasons above, getting to a completely green status will be difficult. > Aside from that, I'm confused by this patch -- it looks like > linux64-mochitest-valgrind is an exact duplicate of linux64. How would the > resulting build differ? I think it won't differ, and we can just use the linux64 build, per comments above. Does that make sense? I am slightly hazy about this myself, but I _think_ what I say is correct.
OK, I think in that case that you just want the third hunk from your patch. The second hunk is adding a new build job, which you don't need, and the first hunk is adding your job to the default try runs. With that in place, you should be able to test a try push to run your new task. Once you've done that, look at the "D" task for your try push, and you'll see your try syntax injected into a `mach taskcluster-graph` command. You can verify the behavior for "-u all" by copy/pasting that command and adjusting the try syntax. If your job appears in the resulting task graph (grep for it), then it has not been correctly excluded from -u all.
Rebased. Required try syntax: -b o -p linux64 -u mochitest-valgrind -t none
Attachment #8730715 - Attachment is obsolete: true
Attachment #8809782 - Flags: review?(dustin)
Comment on attachment 8809782 [details] [diff] [review] bug1254964-7.cset Review of attachment 8809782 [details] [diff] [review]: ----------------------------------------------------------------- <y-u-no-mozreview obligatory /> ::: taskcluster/ci/desktop-test/test-sets.yml @@ +36,5 @@ > - web-platform-tests-wdspec > - xpcshell > > +# This is the same as all-tests-debug, > +# with the addition of "mochitest-valgrind". I filed bug 1316877 to make this easier to express. Fine to land as-is for now, and I'll clean it up in that bug. ::: taskcluster/ci/desktop-test/tests.yml @@ +539,5 @@ > + - remove_executables.py > + extra-options: > + - --mochitest-suite=valgrind-plain > + # Bug 1281241: migrating to m3.large instances > + instance-size: legacy Removing this would move you from m1.medium to m3.large, which would likely get better performance.
Attachment #8809782 - Flags: review?(dustin) → review+
* switches instance size from m1.medium to m3.large, for extra perf (anti-timeout) headroom and lower latency * adds "run-on-projects: "
Attachment #8809782 - Attachment is obsolete: true
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/4513f4dad100 Enable manual taskcluster scheduling for valgrind-mochitest runs. email@example.com.
You need to log in before you can comment on or make changes to this bug.