Closed Bug 855591 Opened 10 years ago Closed 10 years ago

Support "T push" syntax on trychooser

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: justin.lebar+bug, Assigned: sfink)

Details

(Keywords: sheriffing-untriaged, trychooser)

Attachments

(1 file, 2 obsolete files)

It would be really nice if trychooser.pub.build.mozilla.org/ supported building "T pushes", such as

> try: -b do -p all -u all[x64] -t none

If we made the default |-b do -p all -u all[X]| where X is the platform with the fewest pending tests, we might save /a lot/ of cycles.
The current default is to run nothing at all, so I don't think the default should be changed.  Exposing it would still be worthwhile and might save some cycles.

While I'm nitpicking, this should really be called a "turnstile push":  http://en.wikipedia.org/wiki/Turnstile_%28symbol%29.
> The current default is to run nothing at all, so I don't think the default should be changed.  
> Exposing it would still be worthwhile and might save some cycles.

I meant the default on trychooser.  Presumably if you're there you want to run something.
(In reply to Justin Lebar [:jlebar] from comment #2)
> I meant the default on trychooser.  Presumably if you're there you want to
> run something.

So did njn. The default on trychooser is intentionally nothing, to make people explicitly choose what to run rather than just copy pasta'ing the default (bug 823711, bug 823907, bug 836980).
> So did njn. The default on trychooser is intentionally nothing, to make people explicitly choose 
> what to run rather than just copy pasta'ing the default (bug 823711, bug 823907, bug 836980).

Okay.  Maybe we could nonetheless make it easy to specify one of these pushes.
Yeah your proposal is a good idea, wasn't meaning that we shouldn't try to implement it (just maybe not touch the current default etc).
We should be sure to choose the platform that doesn't have any tests disabled on it.
(In reply to Phil Ringnalda (:philor) from comment #6)
> We should be sure to choose the platform that doesn't have any tests
> disabled on it.

Also the platform that runs all single-platform tests, right?
Attached image Mockup (obsolete) —
I think ideally we should show the TBPL test list and let users select them. Perhaps even have some row and column headers to quickly select an entire platform or run the test on all platforms.

For for now here's a mockup of something trivial to implement. You could select a platform as build only, or two run all the selected tests.
Assignee: nobody → bgirard
Status: NEW → ASSIGNED
Assignee: bgirard → nobody
Attached patch Simple UI for doing T pushes (obsolete) — Splinter Review
This is the sort of basic UI I was thinking of. I don't really know what are good options to offer, though; I've no clue which ones have lots of disabled and things. I'll request feedback, I guess.

BenWa: not sure I follow your mockup. What does it mean to request tests without builds on a platform?
Assignee: nobody → sphink
Attachment #731021 - Flags: feedback?(philringnalda)
Hm, can only do one feedback flag, I guess.
Flags: needinfo?(emorley)
(In reply to Steve Fink [:sfink] from comment #9)
> BenWa: not sure I follow your mockup. What does it mean to request tests
> without builds on a platform?

Looks like we had the same idea just a different UI for it. I had it that Tests implied Builds. We could make the JS check the Build if the Test is selected and unselected Build unselects Tests.
Comment on attachment 731021 [details] [diff] [review]
Simple UI for doing T pushes

You would have been better of with Ed for the feedback, my only feedback is that all this penny-wise crap is going to drive me either the rest of the way crazy, or away. I spend *all* of my *every* evening backing out patch after patch after patch which has an oh-so-hopeful Try URL posted in the bug, which did not run the thing or things that it failed, which caused it to bounce.

How expensive *is* electricty and CPU, anyway?
Attachment #731021 - Flags: feedback?(philringnalda) → feedback?(emorley)
And a followup: who exactly is it that is saying that we cannot afford any more electricity and slaves, and are they actually the people who are supposed to make that decision about binding the mouths of the kine?
> And a followup: who exactly is it that is saying that we cannot afford any more electricity and 
> slaves

I've never heard that argument.  The argument I've heard is: "We're working on adding capacity; it just takes time."  For example,

> http://oduinn.com/blog/2013/03/10/infrastructure-load-for-february-2013/
>
> Our test pool situation continues to improve, but is not yet as great as the situation with our 
> builders. We’re making good progress, but the rate of checkins, the improved capacity of the build 
> machines to generate more builds that need testing, the ever-increasing number of test suites to 
> run on each build and the hardware specific nature of some test suites make this test capacity 
> problem harder to solve. New hardware is still (slowly) coming. Meanwhile, RelEng, ATeam and devs 
> continue the work of finding test suites which should (in theory!) be able to run on AWS

To be clear, I don't mean to defend this position.
(In reply to Phil Ringnalda (:philor) from comment #12)
> Comment on attachment 731021 [details] [diff] [review]
> Simple UI for doing T pushes
> 
> You would have been better of with Ed for the feedback, my only feedback is
> that all this penny-wise crap is going to drive me either the rest of the
> way crazy, or away.

And that would be why I asked for your feedback. :-)

> I spend *all* of my *every* evening backing out patch
> after patch after patch which has an oh-so-hopeful Try URL posted in the
> bug, which did not run the thing or things that it failed, which caused it
> to bounce.

What causes this? Do the set of tests that run substantially differ between platforms? That's what you seemed to be saying in comment 6, but I don't know enough to evaluate what platform(s) to use for these pushes or whether the whole idea should be scrapped.

> How expensive *is* electricty and CPU, anyway?

Not my area, so I won't hazard an opinion. From past history, though, it seems like we never have enough capacity, so the goal is making the best of what we've got.

You say there are many pushes with a link to a try run that didn't run the tests that ended up failing. (Which I find easy to believe, given that several such pushes were mine.) Do you know how many of them actually *did* end up running the failed tests, but the inbound push happened before those results came back? Given the length of time required for a full set of results given the test queues, I wouldn't be surprised if this were common, but it's really a question for the sheriffs since you guys are the ones with actual data.

Similarly, how much of a problem is delay and coalescing on inbound? When I look at inbound tbpl, I don't understand how we can be keeping up as well as we are given the sea of gray. I understand that priority actually works now, but the try server is still half of the overall load, so I assume try load still causes delay and coalescing.

Anyway, the question here is whether the gain from the reduced load resulting from the pushes suggested here is enough to offset the poorer test coverage. And also what platform(s) to pick to avoid as much of that loss as possible. If some platforms are less useful than others, we should either leave them out of the UI or put a comment next to them. If the drawbacks from this outweigh the benefits, then we shouldn't put anything in the UI at all. I cannot answer these questions. (Though if I had more time than I do, I could try to piece it together from historical records of backouts and try pushes mentioned in bugs.)
> Do the set of tests that run substantially differ between platforms? 

Yes, actually.  Especially where b2g is involved....

In the end, developers should understand what the likely impact of their patches is and test on the relevant set of platforms. Unfortunately, what philor is saying is that they don't.  :(
Ok, I'm done messing around with this patch. The disable_filters stuff is probably overkill.
Attachment #732156 - Flags: review?(emorley)
Attachment #731021 - Attachment is obsolete: true
Attachment #731021 - Flags: feedback?(emorley)
Comment on attachment 732156 [details] [diff] [review]
Simple UI for doing T pushes

I like this approach :-)

Works well locally - only thing that might be worth changing is making the checkboxes apply the restriction to talos too (that is, if the trychooser syntax supports that, otherwise don't worry).
Attachment #732156 - Flags: review?(emorley) → review+
Flags: needinfo?(emorley)
Attachment #730765 - Attachment is obsolete: true
Attachment #732156 - Flags: checked-in+
Sadly, it does not yet support talos. I suppose I ought to get around to doing that too.

http://hg.mozilla.org/build/tools/rev/642fa41fca36
This is live now.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: Tools → General
You need to log in before you can comment on or make changes to this bug.