Closed Bug 1287018 Opened 8 years ago Closed 8 years ago

Make TaskCluster use data from SETA to schedule tasks

Categories

(Taskcluster :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla52

People

(Reporter: sabergeass, Assigned: sabergeass)

References

Details

Attachments

(2 files, 19 obsolete files)

5.31 KB, patch
Details | Diff | Splinter Review
5.39 KB, patch
sabergeass
: review+
Details | Diff | Splinter Review
Right now, TaskCluster generates all kinds of tasks for the full task graph; we may want it to use data from SETA and schedule tasks based on that data.
Depends on: 1243123, 1281004
From dustin:
> I don't have a design in mind for in-tree SETA.  It makes sense to consider it a kind of optimisation,
> but it could also be done in a target set method.  In either case, SETA would need to "get along with"
> other, similar methods: if optimization, it will need to cooperate with other optimization methods.  If a
> target set method, it will need to interoperate with the try option syntax.

I'm not sure I understand how optimization determines which tasks should be dropped and which shouldn't. The documentation[1] says the optimizer will try to 'hide' a task and see if its inputs are unchanged. What does 'input' mean there? And I'm still confused about the difference between 'task set' and 'task graph'.

>This is a very complex bit of code, as a very new implementation of decades of accumulated functionality
>and requirements.  Any major change needs to be carefully considered taking into account those
>requirements.  For example, if SETA is implemented as optimization, will SETA eliminate tests from action
>tasks as well?  
hmm, I don't know if I got the point here. I think SETA only acts as a reference for task scheduling: when taskcluster generates a task graph or wants to optimize one, it would look up a list from SETA (maybe) and find out which tests are needed and which jobs are not. So I don't think it would break the whole scheduling process.

>There are other issues to consider as well: how is the metadata as to which tests have and have not been executed "recently" stored, and is that storage subject to race conditions if, for example, a set of decision tasks operate in parallel?  If I do a "test" run of optimization on my laptop, does that affect that stored data?  Are the inputs used to determine which tests to run recorded somewhere so that the task-graph generation process can be repeated for testing or debugging?

I have no idea about this part yet, but I will try to figure it out. Thank you, dustin, for telling me this :)


[1]http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/taskgraph.html#optimization
From me:

>On 14 July 2016 at 03:30, Tiramisu Ling <sabergeass@gmail.com> wrote:
>Hey Joel and Armen, I found myself in the mire when trying to figure out how taskcluster actually schedules tasks. I also cc'd Dustin on this email because he is the expert on taskcluster.
>First question: there are legacy and build (docker_image) kinds in taskcluster -- why do we call one 'legacy'? Does it have any specific meaning? (I know it's not a great question, but I'm strongly curious about it.)

From Armenzg
>In the last few months, dustin has been changing how we schedule tasks with TaskCluster.
>Before, we had task graphs that would get submitted, but we now submit tasks one by one in an ordered
>manner.
>The system creates a complete graph with everything in it and, through various phases, breaks it down into what we actually need rather than everything.
>The new system still supports the old-style tasks; thus, they're named legacy tasks.
>Everything will be moved out of legacy and be implemented in the new style.


From me again:
>The next question: I found that we put all the .yml files in the ci folder
>(https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci), but I also read some code[1] and articles[2][3] saying that we store the .yml files at testing/taskcluster/task. Actually, I haven't found 'taskcluster' under the mozilla-central/testing path. Am I missing something, or has it been moved into the ci folder? What's the difference between these two paths (/ci and /testing/taskcluster/task)?

From armenzg:
>This recently got refactored.
>I would simply ignore those articles and read the official documentation.
>You can also read this:
>https://github.com/djmitche/taskcluster-in-tree-taskgraph#tldr
>Notice that *test* jobs are being moved out of the legacy kind, which matters to SETA.
>After speaking with dustin in London, we agreed that this would unblock you once you were close to looking into this.
>It seems that the timing is working out.

From me:
>The last question is the most important one. I think I've got the idea of how task scheduling actually
>works, after dustin helped me out on IRC yesterday. In my understanding, TaskGraphGenerator in
>generator.py loads the legacy or build kind after the `mach taskgraph decision` command is executed,
>and legacy.py or docker_image.py loads all the .yml files in the corresponding folder. But I'm still
>wondering how to make it decide which tasks to run based only on SETA data rather than running them all.
>Do I need to create a new 'kind' besides legacy and build? I don't think that's a good idea. I also
>found we have an `optimized` mach subcommand; however, I haven't figured out how it works exactly, or
>whether we could use it to optimize task scheduling and make taskcluster run only specific tasks.

From armenzg:
>I don't know how to answer this.
>I assume that in one of the stages of the gecko decision task we would need to grab the SETA data and
>remove tasks that we don't want running.
Summary: Make taskcluster can use data from SETA to schedule task → Make TaskCluster use data from SETA to schedule tasks
Assignee: nobody → sabergeass
Hi dustin, I have some questions about the task set. What's the difference between a task graph and a task set? What is the 'transitive_closure' we compute when we want to get a task graph from a task set? And we know that when tc tries to optimize a full task graph, the optimizer will try to drop a task if none of its inputs have changed. What does 'input' mean here? Could you give me some concrete situations where the inputs would be unchanged, or under what conditions they would change? I still need help with the question in comment 1 [https://bugzilla.mozilla.org/show_bug.cgi?id=1287018#c1]

Thank you very much!
Flags: needinfo?(dustin)
I see you found http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/taskgraph.html which describes these concepts in general.

The taskgraph stuff uses the word "graph" in the sense of https://en.wikipedia.org/wiki/Graph_(discrete_mathematics).  Each task is represented as a node (aka vertex), and the dependencies between tasks are represented as edges.

In that context, a task set is a graph with no edges, just a set of nodes (tasks).  A task graph is a graph with nodes (tasks) and edges (dependencies).

Transitive closure is a fancy name for this sort of operation:

 * start with a set of nodes (without edges)
 * add all edges pointing away from any of those nodes, and any nodes attached to those edges
 * repeat until nothing changes

The effect is this: imagine you start with a linux32 test job and a linux64 test job.  So start with those two nodes.  Each one depends on a docker image building task, so add the edge pointing to that docker task and the docker task itself.  And each test task depends on a build task, so add the edges pointing to those build tasks, and the build tasks themselves.  Then repeat: we already have everything that the test tasks point to.  The docker tasks don't point to anything.  But the build tasks point to different docker image build tasks (they run in a different image), so we add those edges and nodes.  And repeat again.  This time, there are no new edges or nodes to add, so we are done.  And as you can see, the graph we've built now includes everything we wanted (the test jobs) plus everything required to do that (docker images, builds).
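A minimal sketch of that loop (hypothetical names; the real implementation lives in the in-tree taskgraph code):

    # Transitive closure sketch: `edges` maps each task label to the set of
    # labels it depends on; `nodes` is the starting target task set.
    def transitive_closure(nodes, edges):
        nodes = set(nodes)
        while True:
            new_nodes = set()
            for node in nodes:
                new_nodes.update(edges.get(node, ()))
            if new_nodes <= nodes:
                return nodes  # nothing changed; we are done
            nodes |= new_nodes

    # Example mirroring the description above (labels invented):
    edges = {
        'linux64-test': {'linux64-build', 'desktop-test-image'},
        'linux64-build': {'build-image'},
    }
    transitive_closure({'linux64-test'}, edges)
    # -> {'linux64-test', 'linux64-build', 'desktop-test-image', 'build-image'}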

The inputs to a job can get pretty complex -- it's any data that the job uses to produce its results.  The only thing that does optimization right now is docker image build tasks, and the question is actually pretty easy to answer there.  For example, for the desktop-test docker image, the inputs are the contents of the testing/docker/desktop-test directory in the source tree.

For a build, the "inputs" would include the source code used to compile the browser, and the environment in which the compilation takes place (compiler, libraries, etc.).

As for comment #1, my understanding of SETA is that there is some data in an API somewhere that says, for example, "run linux64/debug reftest chunk 5 on every seventh push or every 10 hours".  The question is, how do you keep track of whether the current push is the seventh?  Or what time the last push was that ran chunk 5?
Flags: needinfo?(dustin)
Hey dustin and armenzg, I have a few more questions:
1) What's the relationship between job_graph[2] and graph[3]? And, for a job (e.g. https://pastebin.mozilla.org/8885653), what do 'when' and 'post-build' represent? And what's the difference between 'task' (line 13) and 'build' (line 226)?
2) Still in that job structure (https://pastebin.mozilla.org/8885653), what does 'flags' (line 145) represent? I saw a list of tests and a list of builds in there; does that mean a job won't be scheduled if I remove some test or build from it?
3) Now that we have job_graph[4], does that mean a build won't be scheduled if we remove its item from job_graph (e.g. if we remove the whole 'linux-l10n' entry (lines 2-24) from https://pastebin.mozilla.org/8885656, then linux-l10n won't be included in the graph anymore)?
4) What's the difference between 'build' and 'task' exactly? I thought I knew the answer, but it confuses me more and more the closer I look at the codebase. I thought a build was 'something we use to run the tests, like linux-64 or windows 64' and a 'task' was 'a build plus a series of tests we want to run on that platform'.

[1]https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/task/legacy.py#l340
[2]https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/task/legacy.py#l342
[3]https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/task/legacy.py#l391
[4]https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/task/legacy.py#l342
Flags: needinfo?(dustin)
Flags: needinfo?(armenzg)
Some more questions:
1) I can't find the testing/taskcluster/ folder[1] in the codebase -- why? Was it created by someone who wanted to define tasks themselves?
2) We return True at the end of the should_run() function[2]; does that mean we should run the task if its file pattern is empty? (I still don't understand what the file patterns and 'when' mean here.)


[1]https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/task/legacy.py#l426
[2]https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/task/legacy.py#l445
'when' refers to which files need to be modified on a push for this task to be scheduled.

I don't know about 'post-build'.

I don't know the implication of 'tasks' and 'builds' being at the same level. That's new to me.

I think 'flags' refers to values that the try syntax could match.

I think a lot of your questions could get answered if you added pdb.set_trace() at strategic points and checked the values you get. You can even add comments for future reference.
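For example, a minimal illustration (file and variable chosen arbitrarily):

    import pdb

    # e.g. inside should_run() in taskcluster/taskgraph/task/legacy.py:
    pdb.set_trace()  # execution pauses here; inspect values with `p job_graph`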

I'm going to leave comment 5 to dustin. I'm adding no value.

The folder testing/taskcluster/ is no more. It recently got removed.

I'm not sure about should_run(), it looks weird to me that changed_files is not defined within the function. I'm unsure about the purpose of it.
Flags: needinfo?(armenzg)
(In reply to Armen Zambrano [:armenzg] - Engineering productivity from comment #7)

Actually I do use pdb, but I just can't figure out the meaning of that data and those names.
Please don't pay too much attention to legacy tasks or the legacy code.  It is all getting deleted.  The legacy.py file is a mess and nobody that still works here really understands it.  In particular, its use of terminology is *very* confusing.  If you really want to read the code, I would suggest ignoring the meaning of the variable names (so treat `task` as `x`, `build` as `y`, `job_graph` as `z`, etc.)

As of Tuesday, all test tasks have stable labels that do not begin with "TaskLabel=="
Flags: needinfo?(dustin)
Hey, I have an idea here. As you know, all_tasks[1] is a dict containing all the task labels of the tasks we want to create[2]. Maybe we could remove the low-value jobs' task labels from there? That would reduce the number of tasks.

[1] https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/generator.py#161
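A rough sketch of the idea (low_value_labels is a hypothetical list that would come from SETA):

    # Prune low-value labels from the full task dict before task creation;
    # all_tasks maps task labels to task objects, as in generator.py.
    # low_value_labels would come from SETA (example value invented).
    low_value_labels = {'desktop-test-linux64/opt-mochitest-e10s-4'}
    for label in list(all_tasks):
        if label in low_value_labels:
            del all_tasks[label]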
Flags: needinfo?(jmaher)
Flags: needinfo?(dustin)
Attached patch draft.patch (obsolete) — Splinter Review
Sorry, here is [2] https://pastebin.mozilla.org/8886586, which is a part of all_tasks. I forgot to add it.

I also created a draft-level patch to make things clearer to myself. Here is what I get:

>
>MikeLings-MacBook-Pro-2:firefox mikeling$ ./mach taskgraph target -p taskcluster/parameters.yml
> 0:00.24 Loading kinds
> 0:00.24 Generating full task set
> 0:00.34 Generated 7 tasks for kind docker-image
> 0:00.51 Starting new HTTPS connection (1): hg.mozilla.org
> 0:06.53 Generated 33 tasks for kind legacy
> 0:07.05 Generated 16 tasks for kind android-test
> 0:08.47 Generated 5 tasks for kind desktop-test
> 0:08.47 Generating full task graph
> 0:08.47 Generating target task set
>TaskLabel==AC7EO0kcRC6F_2Blbr5tGQ
>TaskLabel==AbHN1p_SR_OLKBfw21jkDg
>TaskLabel==C3qIwB7RS0-AhQi9MzYBhQ
>TaskLabel==CGLQPx9ARNi-rFX_jjjoQA
>TaskLabel==DK3ZcssuQJCUEFlvTXjrZw
>TaskLabel==E2udK38zRoquXFBUHOAz2A
>TaskLabel==GBBNrCZFTRCvqESIcNeXeQ
>TaskLabel==HjIm595hQp242MnuqvavMg
>TaskLabel==JNw_LymVTJO9bcg1O24BIQ
>TaskLabel==JRcTHvmoTo2MrJsik9hftQ
>TaskLabel==KYqMOdK0TRC6tjrve6JbbQ
>TaskLabel==KyVWPK2fSrua267Im3mMaQ
>TaskLabel==L848WKgTShey58cG-2Ag9Q
>TaskLabel==LiGpGaAyRzmgU0o-XwtGaA
>TaskLabel==MPYbOpl9S4qFlOiX12MisQ
>TaskLabel==NAqIawUzQy2b6fFva5KHlg
>TaskLabel==NLb3d-jDRjWP7BqIoosYSQ
>TaskLabel==R0oV8rl9RWClWOi4gVKXBA
>TaskLabel==S-INTzUUTP20_ou68mj5IA
>TaskLabel==SHAg5jEWTiqPl_SBNEwXaA
>TaskLabel==Svv41VUzTYGHvO6_JM4Ajw
>TaskLabel==T5dK417TRDCIwNZU7xMtRg
>TaskLabel==TCDH3RGOSZCiG4z_7vy1ag
>TaskLabel==TQT6kJZES-Ca1bdP77CFbA
>TaskLabel==T_VyEIw6SHCvm2cmj1xGdw
>TaskLabel==UmmTrL_pTB-hnLpnUpicRA
>TaskLabel==Vaz8Z9sJQRqwYCX6m9uZvQ
>TaskLabel==X1pnzqO1Re-IHHy5tM8EwQ
>TaskLabel==ZnNhwtQuQdC1W5f3K4adrg
>TaskLabel==csl9jT1ISHSpipbLL-OWVw
>TaskLabel==dEkwusVpQ6q9xj7KePEbmA
>TaskLabel==dT9uUwZgS8S14dViz02IMQ
>TaskLabel==eR_vbP7-SWegtFmF8m_Okw
>android-test-android-4.3-arm7-api-15/debug-cppunit
>android-test-android-4.3-arm7-api-15/debug-crashtest-1
>android-test-android-4.3-arm7-api-15/debug-crashtest-3
>android-test-android-4.3-arm7-api-15/debug-mochitest-4
>android-test-android-4.3-arm7-api-15/debug-mochitest-5
>android-test-android-4.3-arm7-api-15/debug-mochitest-media-1
>android-test-android-4.3-arm7-api-15/debug-mochitest-media-2
>android-test-android-4.3-arm7-api-15/opt-mochitest-1
>android-test-android-4.3-arm7-api-15/opt-mochitest-16
>android-test-android-4.3-arm7-api-15/opt-mochitest-18
>android-test-android-4.3-arm7-api-15/opt-mochitest-19
>android-test-android-4.3-arm7-api-15/opt-mochitest-20
>android-test-android-4.3-arm7-api-15/opt-mochitest-chrome
>android-test-android-4.3-arm7-api-15/opt-robocop-2
>android-test-android-4.3-arm7-api-15/opt-robocop-3
>android-test-android-4.3-arm7-api-15/opt-robocop-4
>desktop-test-linux64-asan/opt-mochitest-browser-chrome-1
>desktop-test-linux64-asan/opt-mochitest-clipboard
>desktop-test-linux64-asan/opt-mochitest-media-e10s
>desktop-test-linux64/opt-mochitest-browser-chrome-2
>desktop-test-linux64/opt-mochitest-e10s-4
>
mikeling, this is exactly what I would envision SETA doing with taskcluster.  Given input data about high/low value jobs and recent history on the tree, we would use that input to reduce the taskgraph to only run high value jobs.

As long as SETA outputs a straightforward .json blob of data, we can verify that the taskgraph generator can use it to reduce what is output to the target while running inside the gecko decision task.

^ that is actionable, resolving the inputs will take some creative thinking at the speed/scale we care about.
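For concreteness, one plausible shape for that blob (labels invented): a 'jobtypes' dict keyed by date, which the draft patches in this bug read as json.loads(response.content)['jobtypes'].values()[0] (Python 2 era, so .values()[0] works):

    seta_response = {
        "jobtypes": {
            "2016-07-25": [
                "desktop-test-linux64/opt-mochitest-e10s-4",
                "android-test-android-4.3-arm7-api-15/debug-cppunit",
            ],
        },
    }
    task_labels = seta_response['jobtypes'].values()[0]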
Flags: needinfo?(jmaher)
Yes, this sounds about right.  However, the tasks should be eliminated late in the generation process (at optimization time).  Without that, it will be impossible to trigger jobs that SETA skipped.
Flags: needinfo?(dustin)
(In reply to Joel Maher ( :jmaher ) from comment #12)
(In reply to Dustin J. Mitchell [:dustin] from comment #13)

Thank you Joel and Dustin, I will keep your words in mind :) Thank you for your help!
Attached patch draft.patch (obsolete) — Splinter Review
Now we remove low-value tasks in the optimization stage, and here is what we get:

Aat_oH6CTiiqZ4AvoDqpWA
C7DLh5G4SMeUQ7htrPyIGg
CFLmZJa5ROqXHHbqz6K8fQ
Cf8OXGk0QyudIhwYDO1E3A
DAvbq0WRRsylfWyxF5oR4g
E67TiSVMQTGBI1QXJ9ct2A
EtRKy8qTQXOqNnTSyXod_w
FzTecAvdTiu_oHN5jG0sJg
G-1lhN6cSZu04qENjpT-vg
HJXUPMK3R82KNrJUb3dB4A
HoXrVV0xT7WJ5fZHsHqZFA
I-GWyrl-SOe04QX5CsVdsA
JMi-GPo3S7mHldcHG2C4ZA
KalcE4fjQUWNNvlRQTxcHQ
LG1lGl16SnSQLp26k6r7gA
LKYYn8p9RwGfr3SfgroHyw
PHmTyGRiRKOVxJ-MvhS1ow
PWz938ehTs2wdN5MRuQOhA
SEq2Ph9mRfKaOnGMVf50pg
S_7ut6UzSHOY4H01lETchg
Sic2CPJmQTKce88qecgnUw
UeJ8Yb3NRXe-Sh7Oj98ZUQ
VB2Mw-CmR525VArQpesrZQ
WKxOrc7nQ9KGx__e1_dfew
WgGVnaSVS42FWsvTufPq9Q
X5eGEz4tSsWT1Qdm19jXsA
Y-NW4d60T4-l3rgf9UZcBw
Z3Jfa_uQSG2Clbae1v6-HA
ZgmxrTavRWy79t_ZmFUC8w
argKlzVmQd6LWZI9kl4YQA
bUhYJUwlSG6k9O0hNGjcSg
bY1BCJWHTiGa4MCOVPgEqw
cpApFw9iRue3eB3VURBk7g
dqCkRImoS52jiOsC8xPjAg
fWGxWLTiT_q9k8Gm4nNl5g
BMka4db6Tvudlw6Z-TReew
CCRgnramTgi_yjJWOI14Zw
DAI1_3s_RkmKQEycfP7oog
EVTPd9MMT1SSnTFwHQG0GQ
F_2tnpv3TOKax9qDd9lozA
JL-h4NU5T3-zXFG29ivLpw
Mpew4TeCR6atkIBPxkacVw
NEskoeYjQ6SqibaRhqNcgg
Rmvs8Uj3SMOcsHGuE2gfrw
RvxrR9HUQRalGi9Jb9ljag
V4-yLjFBRtWZyuQCcQr8aA
V5ut4hl2SsWM6iMsX_Nrmw
WFtYMbzsSR-dMTXqR0LGCw
WVS5jZImT1qo--1R7LYJDw
WtnhaaCISKqhZT2Xk09DqQ
XAx2edPJTpSCavBSjkL92w
XBkHNn-jQ4-4WrEYcPg3uQ
YU5a2xBTTPuM6txMAMrL3Q
bndESHrpR8OK1DL_kpRWbQ

which is much shorter :)
Attachment #8774212 - Attachment is obsolete: true
Attachment #8776011 - Flags: feedback?(jmaher)
Attachment #8776011 - Flags: feedback?(dustin)
Comment on attachment 8776011 [details] [diff] [review]
draft.patch

Review of attachment 8776011 [details] [diff] [review]:
-----------------------------------------------------------------

Great start!

However, I think this should be accomplished within the kind implementation (taskcluster/taskgraph/tasks/test.py) rather than in the generic task-graph generation code.
Attachment #8776011 - Flags: feedback?(dustin) → feedback+
Comment on attachment 8776011 [details] [diff] [review]
draft.patch

Review of attachment 8776011 [details] [diff] [review]:
-----------------------------------------------------------------

overall this is a great demonstration and simple code!  I was expecting the optimized tasks to have useful labels like:
desktop-test/opt-mochitest-1

are we optimizing all tests out and just keeping builds?

::: taskcluster/taskgraph/optimize.py
@@ +13,5 @@
>  from slugid import nice as slugid
>  
>  logger = logging.getLogger(__name__)
>  TASK_REFERENCE_PATTERN = re.compile('<([^>]+)>')
> +SETA_DATA_PATH = os.getcwd() + '/taskcluster/seta_data.json'

I am assuming this is temporary- a comment would help.  Or indication of where this data comes from.

@@ +25,5 @@
>      named_links_dict = target_task_graph.graph.named_links_dict()
>      label_to_taskid = {}
> +    with open(SETA_DATA_PATH, 'r+') as data:
> +        high_value_tasks = json.loads(
> +                data.read())['jobtypes'].values()[0]

a few things here:
1) nit: please leave a blank line between code and the next section/comment
2) this assumes a lot of stuff about the data, we need to do error handling here
3) I would prefer low value jobs be specified- in many cases we define a brand new job and while SETA would pick it up the next day, we could be forcing these new jobs to be skipped by accident.

@@ +106,5 @@
>          else:
>              optimized, replacement_task_id = task.optimize()
> +        if label.startswith('desktop-test') or label.startswith('android-test'):
> +            if label not in high_value_tasks:
> +                optimized = True

I would like some comment here explaining:
* why we are doing this
* why desktop-test and android-test only?
* how in the future we would need to solve this to not always optimize out the tasks.
Attachment #8776011 - Flags: feedback?(jmaher)
Attachment #8776011 - Flags: feedback?(dustin)
Attachment #8776011 - Flags: feedback+
Comment on attachment 8776011 [details] [diff] [review]
draft.patch

ack, bugzilla conflicts- I didn't mean to ask :dustin for feedback.
Attachment #8776011 - Flags: feedback?(dustin) → feedback+
(In reply to Dustin J. Mitchell [:dustin] from comment #16)
> Comment on attachment 8776011 [details] [diff] [review]
> draft.patch
> 
> Review of attachment 8776011 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Great start!
> 
> However, I think this should be accomplished within the kind implementation
> (taskcluster/taskgraph/tasks/test.py) rather than in the generic task-graph
> generation code.

hmmm yes, and I had tried making optimize() return True when it's a low value job. But 'optimized' will be set to False before we call task.optimize() (e.g. it will be set to False at https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/optimize.py#l89 because we put everything in do_not_optimize in the first place)
(In reply to Joel Maher ( :jmaher ) from comment #17)
> Comment on attachment 8776011 [details] [diff] [review]
> draft.patch
> 
> Review of attachment 8776011 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> overall this is a great demonstration and simple code!  I was expecting the
> optimized tasks to have useful labels like:
> desktop-test/opt-mochitest-1
> 
> are we optimizing all tests out and just keeping builds?
> 
> ::: taskcluster/taskgraph/optimize.py
> @@ +13,5 @@
> >  from slugid import nice as slugid
> >  
> >  logger = logging.getLogger(__name__)
> >  TASK_REFERENCE_PATTERN = re.compile('<([^>]+)>')
> > +SETA_DATA_PATH = os.getcwd() + '/taskcluster/seta_data.json'
> 
> I am assuming this is temporary- a comment would help.  Or indication of
> where this data comes from.
> 
> @@ +25,5 @@
> >      named_links_dict = target_task_graph.graph.named_links_dict()
> >      label_to_taskid = {}
> > +    with open(SETA_DATA_PATH, 'r+') as data:
> > +        high_value_tasks = json.loads(
> > +                data.read())['jobtypes'].values()[0]
> 
> a few things here:
> 1) nit: please leave a blank line between code and the next section/comment
> 2) this assumes a lot of stuff about the data, we need to do error handling
> here
> 3) I would prefer low value jobs be specified- in many cases we define a
> brand new job and while SETA would pick it up the next day, we could be
> forcing these new jobs to be skipped on accident.
> 
> @@ +106,5 @@
> >          else:
> >              optimized, replacement_task_id = task.optimize()
> > +        if label.startswith('desktop-test') or label.startswith('android-test'):
> > +            if label not in high_value_tasks:
> > +                optimized = True
> 
> I would like some comment here explaining:
> * why we are doing this
> * why desktop-test and android-test only?
> * how in the future we would need to solve this to not always optimize out
> the tasks.

Sure, I will add those comments :)
(In reply to MikeLing from comment #19)
> hmmm yes, and I had tried making optimize() return True when it's a low
> value job. But 'optimized' will be set to False before we call
> task.optimize() (e.g. it will be set to False at
> https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/
> optimize.py#l89 because we put everything in do_not_optimize in the first place)

Ah, hm, so this is an interesting question.  We seem to shift our allegiance constantly over try: does "-u all" really mean to run all of the unit tests, or just those tests that someone else thought you should see when you write "-u all" (bug 1244176)?  Does it mean that you actually want *all* of the tests, or just those which SETA has not decided to optimize away?

I think what you're suggesting is that "-u all" means this:
 - not all tests, but the set of tests that someone thought were important enough to be run when you type "-u all"
 - except important tests that SETA considers unimportant
 - but including tests which, based on the source files changed, do not need to be run
all of which is, I think, too complicated.

So, we could go a few different directions here:

 1. Apply all optimizations, even in try -- so if I run "-u robocop" and no Android code has changed, the robocop tests don't run

 2. Never apply optimizations (including SETA) in try, but still define "-u all" to mean some important subset of the tests; this would still leave SETA operational on integration branches, which is (I think?) where it is most useful

 3. Merge SETA and "-u all" importance, so that when users type "-u all", they only get important tests as defined by SETA.  If someone wants to define a particular test as not necessary in "-u all", they can insert that setting into the SETA database.  But if a user types "-u mochitest,robocop", they'll get all mochitest and robocop tests, regardless of SETA or any notion of importance.  We could also use the SETA data to pick the target tasks for integration branches.  Note that this involves changing the target set, rather than optimization.

Looked at a different way, I think the open questions (for Joel, perhaps?) are

 A. Is SETA meant to limit the tests run on try, or on integration branches?
 B. On try, should SETA override the user's request?
short answer is no- SETA is not compatible with try.

SETA isn't ideal for try server because it is optimized for providing jobs on specific platforms in weighted order based on our machine pool sizes and load.  In addition, SETA only runs the minimum set of jobs which have been proven to find regressions on integration trees- I suspect there are many other *high value* test jobs which exist, but try server ends up finding the issues prior to landing on an integration branch.

On try server we do not run all tests with -u all for a few reasons:
1) we have load issues on a specific platform
2) tests are experimental or known to be unstable
3) builds/tests are not run by default (for example code coverage)
4) builds/tests are not run regularly (screenshot, pgo, nightlies, jobs on specific branches such as ash)

So there is already a mystery involved in getting all tests to actually run on try server.  The question is do we want to make it easier to use some data (whether it is SETA or not) to provide default jobs on try server?  I think in the long term future, this answer is YES.  As for now and the next 6 months, I don't see an easy option for using data to make try server useful with a reduced set of jobs.  If we were to study try server and failures on integration branches, we could see what failures are caught on try, and likewise why failures are landing on integration branches.  Concluding that study, I could see a realistic path forward with that data to make logical recommendations about reducing jobs on try server.  Quite possibly the same inputs/outputs of SETA could be used, just with much different parameters.

Keep in mind that SETA as it is designed is only useful on:
mozilla-inbound
fx-team
autoland (this should be valid, but there are different patterns on autoland so we need to validate)

SETA is not currently useful on:
mozilla-central
mozilla-aurora
mozilla-beta
mozilla-release
esr-XX
try
twigs (cedar, ash, pine, etc.)
Attached patch draft.patch (obsolete) — Splinter Review
Hi jmaher and dustin, I still have some questions here.

1) If we query data from SETA on each push, and SETA should return no low-value jobs every 5 pushes, we need to find a way to identify requests coming from taskcluster.

2) Do we need to recognize the branch of the push? Because SETA doesn't work for try and some other branches.

3) About the timeout and retry handling in the patch: I'm not sure it's a good way to deal with this. Please give me some advice about it! Thank you :)
Attachment #8776011 - Attachment is obsolete: true
Attachment #8776936 - Flags: feedback?(jmaher)
Attachment #8776936 - Flags: feedback?(dustin)
Comment on attachment 8776936 [details] [diff] [review]
draft.patch

Review of attachment 8776936 [details] [diff] [review]:
-----------------------------------------------------------------

::: taskcluster/taskgraph/generator.py
@@ +217,5 @@
> +            response = retry(requests.get, attempts=2, sleeptime=10,
> +                             args=(SETA_ENDPOINT % time, ),
> +                             kwargs={'timeout': 5})
> +            low_value_tasks = json.loads(response.content)['jobtypes'].values()[0]
> +        except:

`except Exception` or, better, a more specific exception

@@ +223,5 @@
> +            try:
> +                with open(SETA_DATA_PATH, 'r+') as data:
> +                    low_value_tasks = json.loads(
> +                        data.read())['jobtypes'].values()[0]
> +            except:

same

::: taskcluster/taskgraph/optimize.py
@@ +109,5 @@
> +        # if length of the low value jobs is 0.
> +        if len(low_value_tasks) > 0:
> +            if label.startswith('desktop-test') or label.startswith('android-test'):
> +                if label in low_value_tasks:
> +                    optimized = True

My objection from the last patch still stands: this should be handled within the test task implementation, rather than in the generic taskgraph code.
Attachment #8776936 - Flags: feedback?(dustin)
(In reply to MikeLing from comment #23)
> 1) If we need to query data from seta for each push and doesn't return low
> value jobs for each 5 pushes, we need to find a way to identify request from
> taskcluster. 

If you consider the "each 5" operation to be probabilistic, then there's no need to track SETA requests at all.  Or perhaps I've misunderstood the question.

> 2) Do we need to recognise the branch of push? Because the seta doesn't work
> for try and some other branches.

We should have a parameter which controls whether seta is applied, and set that parameter differently based on the branch.

> 3) About the timeout and retry handle in the patch, I'm not sure it's a good
> way to deal with it. Please give me some advices about that! Thank you :)

I think the retry is reasonable.  I don't know where the fallback data would come from, though, so I think that if the SETA requests fail repeatedly the better next step would be to just assume that all tasks are high-value.  It's better to run tasks than not, in this sort of failure situation.
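A minimal sketch of that failure behavior (fetch_low_value_tasks is a hypothetical helper):

    import requests

    # If SETA can't be reached even after retrying, fall back to an empty
    # low-value list so every task is treated as high value and runs.
    try:
        low_value_tasks = fetch_low_value_tasks()
    except requests.exceptions.RequestException:
        low_value_tasks = []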
(In reply to Dustin J. Mitchell [:dustin] from comment #24)
> My objection from the last patch still stands: this should be handled within
> the test task implementation, rather than in the generic taskgraph code.

OK, I will find a way to implement that inside the test task. :)

> If you consider the "each 5" operation to be probabilistic, then there's no need to track SETA requests 
> at all.  Or perhaps I've misunderstood the question.

Hmm, so I think the whole logic here is: we no longer let taskcluster deal with skipping pushes (we want a full task graph every 5 pushes or every 90 minutes); taskcluster just queries the job list from SETA on each push and lets SETA control it (returning an empty list every 5th push, I think). But in order to get there, we need to track the requests from taskcluster and distinguish them from other requests.

I'm afraid I haven't made myself clear, so let me draw a flow:

taskcluster request -> seta -> (if it's the 5th request, return [], meaning don't skip the low value jobs; otherwise return the low value jobs list)

request from other app -> seta -> just return the low value jobs
in this model we would need:

taskcluster: requestSetaJobs(branchname)
seta: recordRequestForBranch(branchname); ifFiveRequestsOrNinetyMinutes(branchname): return allJobs(), else return highValueJobs()
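A minimal Python sketch of that model (all helper names hypothetical):

    # SETA side: count requests per branch; every 5th request, or after 90
    # minutes, return everything; otherwise return only the high-value jobs.
    def seta_jobs_for(branchname):
        record_request_for_branch(branchname)            # hypothetical helper
        if five_requests_or_ninety_minutes(branchname):  # hypothetical helper
            return all_jobs()
        return high_value_jobs()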


I agree with Dustin that if we have a timeout, etc., we just default to running all jobs- I would like to make the timeout/fallback robust and reasonably fast- so that we don't waste 5 minutes timing out for each decision task if there is an outage for a couple of hours.
For the one-in-five-requests, we don't need any tracking.  Replace "if it's the 5th request" with "if random() < 0.2".  And keep in mind that this is per-task.  A particular push might run task A, B, and C but skip task D.

The "every 90 minutes" bit is a little harder, but can also be interpreted a little more loosely.  That is, if two decision tasks occur within a few minutes of each other, say 91 and 93 minutes after the last build of task E, then it's OK for them both to schedule a new task.  However, in this case please use separate API calls for requesting information about SETA jobs and reporting that SETA jobs have been scheduled.  That will allow "dry runs" to only request information, without confusing the SETA service.  I *think* that's what you were suggesting regarding identifying requests from taskcluster?
in fact we could do if rand() < 0.2 inside the decision task, and modify our request accordingly.  That breaks with the 90 minute rule though, so we either query the pushlog, or use SETA to track per branch/job.  Given the fact that SETA will not know if we manually backfill a job, this current design will either schedule the low value jobs as: all or none.

While that isn't as perfect as accounting for backfilled jobs, it does ensure a full run on a more concrete schedule and it doesn't require SETA to track the current state of each job/branch for when it was run.  In fact, we could just track it once/branch for the 3 branches we care about.
Worth noting that we also discussed (I've forgotten where) making the 90-minute delay probabilistic.  The idea was that the probability of running a low-importance task would rise from its floor of 20% or 14% or whatever at 0 minutes after the last run, to 100% at 90 minutes after the last run.  That would mean that the SETA service has two operations:

 1. generate probabilities for all tasks (based on importance and time since last run on this branch)
 2. update last-run times

#1 would be a simple API call from the decision task, which could then apply the probabilities as Joel described.  #2 could be a different API call, also from the decision task, but only made when actually calling createTask().  It might also be possible to implement #2 using a pulse listener similar to the coalescer (although, would we listen for tasks to be scheduled? pending? running? completed?)
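A sketch of that ramp, assuming a linear rise (the floor and the 90-minute window are illustrative):

    # Probability of running a low-value task rises linearly from `floor` at
    # 0 minutes since the last run to 1.0 at `window` minutes.
    def run_probability(minutes_since_last_run, floor=0.2, window=90.0):
        if minutes_since_last_run >= window:
            return 1.0
        return floor + (1.0 - floor) * (minutes_since_last_run / window)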
Comment on attachment 8776936 [details] [diff] [review]
draft.patch

Review of attachment 8776936 [details] [diff] [review]:
-----------------------------------------------------------------

a lot of small things I added, but overall this is moving in the right direction!

::: taskcluster/taskgraph/generator.py
@@ +16,5 @@
>  from .optimize import optimize_task_graph
>  from .util.python_path import find_object
>  
>  logger = logging.getLogger(__name__)
> +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?taskcluster=1&date=%s"

nit: having a CONST with a variable to replace can be confusing- we might want to consider adding &date=%s to the url when actually using it- and ideally making the seta endpoint default to the current date.

@@ +19,5 @@
>  logger = logging.getLogger(__name__)
> +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?taskcluster=1&date=%s"
> +# It's a temporary path to store seta data as a fallback. We will dicuss more about
> +# where we should store it.
> +SETA_DATA_PATH = os.getcwd() + '/taskcluster/seta_data.json'

can we assume os.getcwd() returns a value?  I don't know what user/service taskcluster runs as- I think it will be ok.

@@ +204,5 @@
>          yield 'target_task_graph', target_task_graph
>  
>          logger.info("Generating optimized task graph")
>          do_not_optimize = set()
> +        # The low_value_tasks is a list of jobs which seta consider unnecessary to run

nit: add a space between previous line of code and the comments.

@@ +220,5 @@
> +            low_value_tasks = json.loads(response.content)['jobtypes'].values()[0]
> +        except:
> +            logger.info("SETA server is timeout and we will switch to use backup seta data.")
> +            try:
> +                with open(SETA_DATA_PATH, 'r+') as data:

I see we have this in try/except- there is a high chance that if we fail to reach the SETA endpoint we won't have this backup file.  One concern here: if we fail completely, I would like to avoid SETA optimizations so we run all tasks by default.  I am not sure if we have that programmed in here.
Attachment #8776936 - Flags: feedback?(jmaher) → feedback+
Attached patch draft.patch (obsolete) — Splinter Review
Hey dustin, I know we have a lot of things to discuss and we may need to rework all of this. But Joel thinks a prototype is needed as a starting point for the further work in the next couple of months, and I think we need this patch to wrap up the GSoC project.

I made the optimization happen inside the test task and changed a few places so that SETA data is only used on the 'mozilla-inbound', 'fx-team', and 'autoland' branches. I also added exception handling and some comments.
  
Thank you very much for reviewing this and for the feedback! :)
Attachment #8776936 - Attachment is obsolete: true
Attachment #8779997 - Flags: feedback?(dustin)
Attached patch draftv2.patch (obsolete) — Splinter Review
Sorry, I should have rechecked it before submitting the patch. I removed the redundant blank lines. So I just won't obsolete the last patch, and I'm asking for feedback on this one again. :)
Comment on attachment 8779997 [details] [diff] [review]
draft.patch

Review of attachment 8779997 [details] [diff] [review]:
-----------------------------------------------------------------

::: taskcluster/taskgraph/generator.py
@@ +18,5 @@
>  from .util.python_path import find_object
>  
>  logger = logging.getLogger(__name__)
> +# It's a list of project name which SETA is useful on
> +SETA_REPO = ['mozilla-inbound', 'fx-team', 'autoland']

This should probably be SETA_PROJECTS.

@@ +191,5 @@
>                  if task.label in all_tasks:
>                      raise Exception("duplicate tasks with label " + task.label)
>                  all_tasks[task.label] = task
>              logger.info("Generated {} tasks for kind {}".format(len(new_tasks), kind_name))
> +        seta_realted = False

`seta_related`

@@ +202,5 @@
> +            # if length of the low value jobs is 0.
> +            for task_label in all_tasks.keys():
> +                if task_label.startswith('desktop-test') or task_label.startswith('android-test'):
> +                    if task_label in low_value_tasks:
> +                        all_tasks[task_label].high_value = False

This is far too specific to this particular kind and its implementation.  This file (`generator.py`) should not know anything about specific kinds or details of their implementations.

You have hit on an interesting and as-yet-unanswered question, though: how, in a general fashion, do we gather external information that will be necessary for optimization?  There are several types of information we might need: index API lookups, SETA downloads, perhaps a last-good-build service, and so on.  The index API lookups, unlike SETA, will require some information from the kinds (specifically, they will require the index paths to look up!), so they can't be done up-front like this.

One option is for the kind to call the API endpoint on the first request, and then cache that data in memory and use the cache for the next request.  Then whatever test task is created first will perform the request, with subsequent test tasks just using the cached data.
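A sketch of that caching approach (names hypothetical):

    # Module-level cache: the first test task to load triggers the SETA
    # fetch; subsequent tasks in the same run reuse the cached result.
    _seta_low_value = None

    def get_low_value_tasks():
        global _seta_low_value
        if _seta_low_value is None:
            _seta_low_value = fetch_low_value_tasks()  # hypothetical helper
        return _seta_low_value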

::: taskcluster/taskgraph/optimize.py
@@ +104,5 @@
> +        # if it's a test task and its branch is seta related, we will just consider seta's choice
> +        if seta_realted:
> +            if label.startswith('desktop-test') or label.startswith('android-test'):
> +                optimized, replacement_task_id = task.optimize()
> +

I don't understand why this hunk is here -- the clause just above already calls `task.optimize`

::: taskcluster/taskgraph/task/test.py
@@ +156,5 @@
>      def optimize(self):
> +        if self.high_value:
> +            return False, None
> +        else:
> +            return True, None

So this *always* optimizes away low-value tasks?

This is where all of the action should be: if this is a SETA-governed tree, then download the SETA data (if not already downloaded), look up this label in that data to determine whether it's high or low value, and if low value, make the appropriate decision as to whether to optimize or not.
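A sketch of what that could look like (SETA_PROJECTS and get_low_value_tasks are assumed from the drafts in this bug; self.project is hypothetical):

    # Only consult SETA on governed projects; optimize away low-value labels.
    def optimize(self):
        if self.project not in SETA_PROJECTS:
            return False, None
        low_value = get_low_value_tasks()  # cached download, see comment 34
        return (self.label in low_value), None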
Attachment #8779997 - Flags: feedback?(dustin)
(In reply to Dustin J. Mitchell [:dustin] from comment #34)

> ::: taskcluster/taskgraph/optimize.py
> @@ +104,5 @@
> > +        # if it's a test task and its branch is seta related, we will just consider seta's choice
> > +        if seta_realted:
> > +            if label.startswith('desktop-test') or label.startswith('android-test'):
> > +                optimized, replacement_task_id = task.optimize()
> > +
> 
> I don't understand why this hunk is here -- the clause just above already
> calls `task.optimize`

Yes I know, but as I said in https://bugzilla.mozilla.org/show_bug.cgi?id=1287018#c19, if we don't re-evaluate whether this task should be optimized, the logic bails out and sets 'optimized' to False before we reach the last else clause. That's why I'm **still wondering** whether it's a good choice to have optimize() return True or False, even though I know it makes more logical sense, and optimize() was designed to check whether a task should be optimized in the first place.

I thought maybe we should just use SETA's data to create an optimized taskgraph if the branch is SETA-related and the task is a test task, overriding other conditions such as whether it's blacklisted. Otherwise, the task won't be optimized no matter whether it's low value or not.
Do you mean in the case where the task is in existing_tasks?  In that case, SETA does not apply because the task is not going to be re-created anyway (this occurs in action tasks).

What other return values would you like to see from optimize()?
Attached patch draft.patch (obsolete) — Splinter Review
Attachment #8779997 - Attachment is obsolete: true
Attachment #8779998 - Attachment is obsolete: true
(In reply to Dustin J. Mitchell [:dustin] from comment #36)
> Do you mean in the case where the task is in existing_tasks?  In that case,
> SETA does not apply because the task is not going to be re-created anyway
> (this occurs in action tasks).
> 
> What other return values would you like to see from optimize()?

hmm, I think I haven't made myself clear in the comments above. So I submitted another patch that makes everything happen within task/test.py.

In the patch I just submitted, you can see that we define all tasks as high-value jobs by default and don't optimize them, by making optimize() return (False, None). Then we query the SETA server for the low-value jobs and flag those tasks as low value, so that optimize() returns (True, None) for them, which means they will be optimized away.

Then we hit a problem in optimize.py when we try to drop the low-value tasks. A test task won't be dropped, no matter whether its priority is 'low' or 'high'. The optimization process never reaches https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/optimize.py#l100 because the logic bails out at https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/optimize.py#l89 or https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/optimize.py#l92. So I couldn't get a shorter task list, or test whether the optimization works, by making optimize() return True or False.

Does that make sense to you? Please tell me if I've misunderstood anything! Thank you :)
Flags: needinfo?(dustin)
This patch looks pretty good!  It's isolated to `test.py`, which I like a lot :)

As for the problem, you're right.  The logic in the first part is working as expected: if the jobs are in the do_not_optimize set, then we do not want to optimize them even if they are low value.

However, the dependencies check is not compatible with optimizing tests.  When I wrote that, the idea was that, if we perform a build, we'll naturally want to perform all of the tests that depend on that build.  This was intended as a short-cut to avoid the possibly expensive `task.optimize` call for a lot of tasks in the task graph.  It turns out, with SETA, that's not a valid short-cut.  So I think we could just drop the stanza:

        # if any dependencies can't be optimized, this task can't, either
        elif any(not t.optimized for t in dependencies):
            optimized = False

given that, I think this would work for you, right?  It won't work on try, since the `-u all` in try syntax puts all unittest tasks into do_not_optimize, but that's OK -- try isn't in SETA_PROJECTS anyway.
Flags: needinfo?(dustin)
Attached patch draft.patch (obsolete) — Splinter Review
I hope this patch works better :)
Attachment #8780519 - Attachment is obsolete: true
Attachment #8780773 - Flags: feedback?(dustin)
Comment on attachment 8780773 [details] [diff] [review]
draft.patch

Review of attachment 8780773 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good.  I've gotten down into nitpicks of formatting and comments, but with those updated I'm happy to see this merged.

Joel, does this match your expectations, where the SETA service itself is responsible for the "each 5 pushes" and "90 minutes" bits?  You're OK with counting runs of `./mach taskgraph optimized` by developers as among those five pushes?

::: taskcluster/taskgraph/task/test.py
@@ +37,5 @@
>              raise Exception("TestTask kinds must have exactly one item in kind-dependencies")
>          dep_kind = config['kind-dependencies'][0]
>  
> +        # The low_value_tasks is a list of jobs which seta consider unnecessary to run
> +        # and it will be empty for each 5 pushes or 90 minutes to run 'all' tasks.

Please omit the second line (as when the SETA service changes, this may no longer be accurate).  Please also capitalize SETA in the first line.

@@ +45,5 @@
> +            # If this task is in the low value jobs which determined by seta,
> +            # we consider it's a low value job and need to been optimized(dropped).
> +            # Furthermore, due to we only have desktop-test and andriod-test on seta
> +            # for now, it's desktop-test and android-test only.And we just skip this step
> +            # if length of the low value jobs is 0.

I'm not sure what step this is referring to.  I think this comment should read

  # Request the set of low value tasks from the SETA service.  Low value tasks will be
  # optimized out of the task graph.

@@ +48,5 @@
> +            # for now, it's desktop-test and android-test only.And we just skip this step
> +            # if length of the low value jobs is 0.
> +            time = datetime.datetime.now().strftime('%Y-%m-%d')
> +
> +            # It will return 404 after try 2 times, each try will last 5 seconds and

# Try to fetch the SETA data twice, falling back to an empty list of low value tasks

@@ +51,5 @@
> +
> +            # It will return 404 after try 2 times, each try will last 5 seconds and
> +            # there are 10 seconds between each try.
> +            try:
> +                logger.info("It's going to retrieve low value jobs list from seta.")

logger.debug("Retrieving low-value jobs list from SETA")

@@ +57,5 @@
> +                                 args=(SETA_ENDPOINT % time, ),
> +                                 kwargs={'timeout': 5})
> +                low_value_tasks = json.loads(response.content)['jobtypes'].values()[0]
> +            except exceptions.Timeout:
> +                logger.info("SETA server is timeout, we will treat all test tasks as high value.")

logger.warning

@@ +182,2 @@
>          self.dependencies = task['dependencies']
> +        self.high_value_task = False if task['label'] in low_value_tasks else True

Since we have a list of low value tasks, it is confusing to use high_value here.  How about

    self.low_value_task = task['label'] in low_value_tasks

@@ +191,5 @@
> +        # and we wouldn't optimize it. Otherwise, it will return 'True, None'
> +        if self.high_value_task:
> +            return False, None
> +        else:
> +            return True, None

if self.low_value_task:
    # Always optimize away low-value tasks
    return True, None
else:
    return False, None
Attachment #8780773 - Flags: review+
Attachment #8780773 - Flags: feedback?(jmaher)
Attachment #8780773 - Flags: feedback?(dustin)
Comment on attachment 8780773 [details] [diff] [review]
draft.patch

Review of attachment 8780773 [details] [diff] [review]:
-----------------------------------------------------------------

I agree with Dustin's comments; let's address them and file a bug to track reducing the hardcoded values here.

Regarding the question of whether this is what I expect- yes, the SETA endpoint seems like the logical place for this given our current thoughts.  Possibly we can configure things in-tree in the future and might have to rethink this.  There are more advantages to having this figured out at the api endpoint (multiple priorities, different data sets, etc.)

I don't like how we could lose granularity here with ./mach taskgraph optimized, but I suspect that is not run very often, so I view that as slight noise in the equation.  If we find that this is happening more often, then we could figure out a clean solution to avoid that.

::: taskcluster/taskgraph/task/test.py
@@ +20,5 @@
>  
>  logger = logging.getLogger(__name__)
> +# It's a list of project name which SETA is useful on
> +SETA_PROJECTS = ['mozilla-inbound', 'fx-team', 'autoland']
> +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?taskcluster=1&date=%s"

it would be great if we could get a list of projects from the SETA server, so this isn't hardcoded in many systems.  Also the endpoint is the seta-dev, ok for now- I see in the near future this will need to be a production endpoint.
Attachment #8780773 - Flags: feedback?(jmaher) → feedback+
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
I updated the patch based on dustin's comments, but I propose not to merge it for now, because we don't have a push-skip strategy on the server side yet :)
Comment on attachment 8782247 [details] [diff] [review]
patch_for_bug_1287018.patch

I don't think this is the patch you meant to upload :)
By the way, I just tried to use `redo` as you did, and got an ImportError, because it is not included in `build/mach_bootstrap.py`.  Did that import for you??
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
I'm very sorry. I believe this is the right one :P

And yes, I can import redo directly. :)
Attachment #8782247 - Attachment is obsolete: true
(In reply to Dustin J. Mitchell [:dustin] from comment #45)
> By the way, I just tried to use `redo` as you did, and got an ImportError,
> because it is not included in `build/mach_bootstrap.py`.  Did that import
> for you??

ok, I just checked mach_bootstrap.py and didn't find the redo package there. But the patch (the one I just committed) still works for me, even after using **hg pull** and **hg update** to refresh my codebase. Honestly, I'm not sure why it works.
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
Let me ask Joel for feedback first, and then ask dustin and Joel for review when Joel thinks it's ready to go :)
Attachment #8780773 - Attachment is obsolete: true
Attachment #8798065 - Flags: feedback?(jmaher)
Comment on attachment 8798065 [details] [diff] [review]
patch_for_bug_1287018.patch

Review of attachment 8798065 [details] [diff] [review]:
-----------------------------------------------------------------

I like how this is shaping up- isolating the code and keeping the integration simple.

::: taskcluster/taskgraph/util/load_low_value_jobs.py
@@ +11,5 @@
> +
> +
> +logger = logging.getLogger(__name__)
> +
> +SETA_ENDPOINT = "http://0.0.0.0:8157/data/setadetails/?taskcluster=1&date=%s&priority=5"

this should point to a real server, also I think it should be:
SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?taskcluster=1&branch=%s"

if we do this, then the time isn't needed and we would have to figure out how to get the branch for each query.
Attachment #8798065 - Flags: feedback?(jmaher) → feedback+
(In reply to Joel Maher ( :jmaher) from comment #50)

> if we do this, then the time isn't needed and we would have to figure out
> how to get the branch for each query.

hmm, why do we need to remove the time? How can we make sure the job list we get is the latest data if we don't pass a date?
would we assume the latest date/time for each query?  We do need the branch and by using just the branch we can return the latest set of jobs which match the priority based on the counter for the specific branch.
(In reply to Joel Maher ( :jmaher) from comment #52)
> would we assume the latest date/time for each query?  We do need the branch
> and by using just the branch we can return the latest set of jobs which
> match the priority based on the counter for the specific branch.

ok, I see I was wrong. Yeah, we can remove the date :) Thank you for your help!
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
Hope it's good to go :)
Attachment #8783793 - Attachment is obsolete: true
Attachment #8798065 - Attachment is obsolete: true
Attachment #8798115 - Flags: review?(jmaher)
Attachment #8798115 - Flags: review?(dustin)
Comment on attachment 8798115 [details] [diff] [review]
patch_for_bug_1287018.patch

Review of attachment 8798115 [details] [diff] [review]:
-----------------------------------------------------------------

feedback is good, review needs more work.

::: taskcluster/taskgraph/util/load_low_value_jobs.py
@@ +11,5 @@
> +
> +
> +logger = logging.getLogger(__name__)
> +
> +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?taskcluster=1&branch=%s&priority=5"

we actually do not need priority since the setadetails endpoint will return high/low value jobs based on the number of requests (and dates) for the input branch.

@@ +27,5 @@
> +    # There are 10 seconds between each try.
> +    try:
> +        logger.debug("Retrieving low-value jobs list from SETA")
> +        response = retry(requests.get, attempts=2, sleeptime=10,
> +                         args=(SETA_ENDPOINT % time, ),

here you are adding time, but we need the branch; we need to figure out where to get the branch.

@@ +29,5 @@
> +        logger.debug("Retrieving low-value jobs list from SETA")
> +        response = retry(requests.get, attempts=2, sleeptime=10,
> +                         args=(SETA_ENDPOINT % time, ),
> +                         kwargs={'timeout': 5})
> +        low_value_tasks = json.loads(response.content)['jobtypes'].values()[0]

I would like a comment for this line to indicate that the [0] gets us the value under the 'date' key that shows up in the response.
Attachment #8798115 - Flags: review?(jmaher) → review-
Comment on attachment 8798115 [details] [diff] [review]
patch_for_bug_1287018.patch

Due to the bug Joel found, I need to cancel this review until I fix it. Sorry about this :(
Attachment #8798115 - Flags: review?(dustin)
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
You can test it after https://github.com/mozilla/ouija/pull/159 on GitHub has been merged :)
Attachment #8798115 - Attachment is obsolete: true
Attachment #8798855 - Flags: feedback?(jmaher)
this isn't working because seta gives a 500 error:
2016-10-07T14:39:19.340383+00:00 app[web.1]:   File "/app/src/server.py", line 478, in run_seta_details_query
2016-10-07T14:39:19.340384+00:00 app[web.1]:     delta = (time_of_now - time_string).total_seconds()
2016-10-07T14:39:19.340384+00:00 app[web.1]: TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'str'


it appears that time_string is a string vs a datetime object.
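For reference, a minimal sketch of the kind of server-side fix needed, assuming the stored timestamp format is '%Y-%m-%d %H:%M:%S' (the format here is a guess):

  import datetime

  def seconds_since(time_string, fmt='%Y-%m-%d %H:%M:%S'):
      # parse the stored string into a datetime first; subtracting a str
      # from a datetime raises the TypeError seen in the traceback above
      parsed = datetime.datetime.strptime(time_string, fmt)
      return (datetime.datetime.now() - parsed).total_seconds()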


in addition, we should not be passing in priority=5; the decision on what priority to use needs to be made by the endpoint.  I also see we are passing in taskcluster=1 as a url param- this is actually not used anywhere, so possibly we can remove that from the taskgraph code and from the ouija code :)
Attachment #8798855 - Flags: feedback?(jmaher)
(In reply to Joel Maher ( :jmaher) from comment #58)
> in addition, we should not be passing in priority=5, that decision on what
> priority to use needs to be decided by the endpoint.  

I think we could make priority default to 5 so we can remove "priority=5" from the url: https://github.com/mozilla/ouija/pull/160/commits/eebd524e21bf94cd20cf5705db6b9cb69552cfd4

> I also see we are passing in taskcluster=1 as a urlparam, this is actually not used anywhere-
> possibly we can remove that from the taskgraph code and from the ouija code
> :)

hmm, we depend on 'taskcluster=1' to get the task id list rather than the job list: https://github.com/mozilla/ouija/blob/master/src/server.py#L535
could we use the user agent to detect taskcluster?
(In reply to Joel Maher ( :jmaher) from comment #60)
> could we use the user agent to detect taskcluster?

yep, but one thing worth mentioning is that any other application using the same user agent will be treated as TaskCluster.
true- we already have other logic to do taskcluster specific work based on the user-agent, I think we can assume all taskcluster logic will be used with user-agent=='taskcluster'
(In reply to Joel Maher ( :jmaher) from comment #62)
> true- we already have other logic to do taskcluster specific work based on
> the user-agent, I think we can assume all taskcluster logic will be used
> with user-agent=='taskcluster'

hmm, so you mean we no longer need 'taskcluster==1'? If so, I think things will become hard to trace if we move all the taskcluster logic behind user-agent=='taskcluster'. For example, if someone wants to test whether the task list is retrieved appropriately, they would have to fake a taskcluster user agent to test the taskcluster endpoint once we remove 'taskcluster==1'.
good point- so the count logic works for useragent==taskcluster only.  Could we make the logic return taskids for taskcluster==1 or useragent==taskcluster?  Then we could avoid passing in taskcluster=1 to simplify the url, but keep an option for testing and fine tuning.
(In reply to Joel Maher ( :jmaher) from comment #64)
> good point- so the count logic only works for useragent==taskcluster only. 
> Could we make the logic for returning taskids for taskcluster==1 or
> useragent==taskcluster?  then we could avoid passing in taskcluster=1 to
> simplify the url, but have an option for testing and fine tuning.

hmmm, we could do that, for now, by setting ‘taskcluster’ to 1 if useragent==taskcluster, e.g.:

> if useragent==taskcluster:
>     ......
>     taskcluster = 1
>     ......

It's not a beautiful way to deal with it, but I think it can simplify the endpoint :) Is that ok?
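Roughly, a sketch of what that could look like in the endpoint handler (the names here are made up; the real ouija code will differ):

  def normalize_taskcluster_flag(headers, taskcluster):
      # treat a 'taskcluster' user agent the same as an explicit
      # taskcluster=1, keeping the url param available for manual
      # testing and fine tuning
      if headers.get('User-Agent') == 'taskcluster':
          taskcluster = 1
      return taskcluster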
I think that is better than an or condition- but you choose, either way is fine- please add a comment to indicate why we are doing this.
(In reply to Joel Maher ( :jmaher) from comment #66)
> I think that is better than an or condition- but you choose, either way is
> fine- please add a comment to indicate why we are doing this.

I created a new PR: https://github.com/mozilla/ouija/pull/161 :)

I will update the patch here after that is merged.
merged, I will be afk for a few hours- looking forward to the updated patch for this bug.
Attached patch bug (obsolete) — Splinter Review
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
Hope it works well :)
Attachment #8798855 - Attachment is obsolete: true
Attachment #8799127 - Attachment is obsolete: true
Attachment #8799128 - Flags: feedback?(jmaher)
Comment on attachment 8799128 [details] [diff] [review]
patch_for_bug_1287018.patch

Review of attachment 8799128 [details] [diff] [review]:
-----------------------------------------------------------------

overall I like this.  A few nits- let's clean these up and then flag :dustin and myself for review.

::: taskcluster/taskgraph/util/load_low_value_jobs.py
@@ +21,5 @@
> +class LowValueTasks:
> +    TASKS = []
> +
> +    @classmethod
> +    def query_low_value_jobs(cls, branch):

nit: the class is LowValueTasks, but the method name has 'jobs' in it- let's make it 'tasks'

@@ +38,5 @@
> +                response = retry(requests.get, attempts=2, sleeptime=10,
> +                                 args=(url, ),
> +                                 kwargs={'timeout': 5, 'headers': headers})
> +                low_value_tasks = json.loads(response.content)['jobtypes'].values()[0]
> +            except exceptions.Timeout:

this only catches a timeout, and we need this to be 100% bulletproof.  While this does the job great for a timeout and for the working case, what about a 404, 500, or another error where we do get a response?  I can easily see a 'KeyError' for 'jobtypes', since that key doesn't exist if I hack the url to be setadetail instead of setadetails.
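For example, a sketch of broader handling (the function and variable names mirror the patch but are illustrative):

  import json
  import logging

  import requests
  from requests import exceptions

  logger = logging.getLogger(__name__)

  def query_low_value_tasks(url):
      low_value_tasks = []
      try:
          response = requests.get(url, timeout=5,
                                  headers={'User-Agent': 'taskcluster'})
          # raise_for_status() turns a 404/500 into an HTTPError we can catch
          response.raise_for_status()
          low_value_tasks = json.loads(response.content)['jobtypes'].values()[0]
      except exceptions.RequestException as error:
          # covers Timeout, ConnectionError, HTTPError, ...
          logger.warning(error)
      except (ValueError, KeyError, IndexError) as error:
          # malformed JSON, a missing 'jobtypes' key, or an empty payload
          logger.warning(error)
      return low_value_tasks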
Attachment #8799128 - Flags: feedback?(jmaher) → feedback+
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
Attachment #8799128 - Attachment is obsolete: true
Attachment #8799205 - Flags: review?(jmaher)
Attachment #8799205 - Flags: review?(dustin)
Attached patch patch_for_bug_1287018.patch (obsolete) — Splinter Review
Sorry, I committed the wrong patch :(
Attachment #8799205 - Attachment is obsolete: true
Attachment #8799205 - Flags: review?(jmaher)
Attachment #8799205 - Flags: review?(dustin)
Attachment #8799208 - Flags: review?(jmaher)
Attachment #8799208 - Flags: review?(dustin)
Comment on attachment 8799208 [details] [diff] [review]
patch_for_bug_1287018.patch

Review of attachment 8799208 [details] [diff] [review]:
-----------------------------------------------------------------

this is looking much better, thanks for addressing my previous feedback.  

I think for deployment purposes, I would like to limit this to 1 branch and watch it for a week.  This way if there are problems, we can quickly address them or turn them off while reducing impact on developers.  Probably the best branch would be autoland.

::: taskcluster/taskgraph/util/load_low_value_jobs.py
@@ +14,5 @@
> +}
> +
> +# It's a list of project name which SETA is useful on
> +SETA_PROJECTS = ['mozilla-inbound', 'fx-team', 'autoland']
> +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?branch=%s"

I would like to promote this to the non dev instance and use that.
Attachment #8799208 - Flags: review?(jmaher) → review+
(In reply to Joel Maher ( :jmaher) from comment #64)
> good point- so the count logic only works for useragent==taskcluster only. 
> Could we make the logic for returning taskids for taskcluster==1 or
> useragent==taskcluster?  then we could avoid passing in taskcluster=1 to
> simplify the url, but have an option for testing and fine tuning.

I think this is really magic and likely to lead to confusion.  In a few weeks when I've forgotten this comment and I'm trying to debug SETA, I will probably use curl, and I will see different data than the taskgraph code sees, and I'll spend an hour trying to figure out why.

It's also likely to break if the user-agent changes, and until this point we haven't done anything to be careful about user-agents in requests.

What's the advantage of this hidden magic, that justifies the future confusion?
great question :dustin, my goal is to reduce options and complexity- ideally 1 option to determine if we are in taskcluster.  Right now it is user-agent, should it be a urlparam?  You do offer a great perspective that debugging in the future will be less confusing if we have everything in a urlparam.

:mikeling, any thoughts?
Comment on attachment 8799208 [details] [diff] [review]
patch_for_bug_1287018.patch

Review of attachment 8799208 [details] [diff] [review]:
-----------------------------------------------------------------

This is looking good!

Is the intent to always optimize low-value tasks away?  I thought we wanted to run them sometimes, just not on every push?  Or is that "sometimes-ness" being handled on the seta server side?

From a design perspective, you've introduced a hidden dependency in here by sticking a call to query_low_value_tasks into the generation function (at a pretty random spot, and between a comment and the code the comment describes).  Later uses of LowValueTasks.TASKS assume that this function has been called, which is the hidden dependency.  This also amounts to using a global variable (even if it is a class variable), and access to that global variable occurs from multiple locations, which can lead to a lot of confusion and subtle bugs.

From a code style perspective, the use of a class which is never instantiated is very unusual in Python (although common in Java).  Also, all-uppercase identifiers (TASKS) are generally only used for constants, yet in this case you are modifying that value.

I guess the idea with naming the module "low_value_tasks" is that eventually this could incorporate other sources of task value, aside from SETA?  Actually, now I see that it's named "load_low_value_tasks" -- that sounds like a decent function name, but not a good module name.

I think a better design here would be to have a single module-level function that gets the list of low value tasks, caching it in a global variable only referenced in that module so that multiple calls do not repeat the HTTP requests.  Even better might be to have a single module-level function that, given a label, returns True if it is low value.  Then you would only need to change transform.py to call `<module>.is_low_value_task(self.label)` in the optimize function.  Notably, with this arrangement there is no need for a change to generator.py.
Attachment #8799208 - Flags: review?(dustin) → review-
thanks for the review Dustin.  I am glad we are to the point of these kind of details when dealing with getting SETA working for TaskCluster!

Regarding low value jobs and running them sometimes, this is taken care of on the seta server endpoint.  The idea for a fallback is that if low_value_jobs == [], then we do not optimize (remove) jobs from the taskgraph.  In other words, if we want to run all the jobs for a given push, the endpoint can return [] for low_value_jobs.
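So the taskgraph side would do something like this (a sketch, not the actual patch):

  def should_remove(label, low_value_tasks):
      # an empty list is the fallback signal from the endpoint: run
      # everything on this push rather than optimizing anything away
      if not low_value_tasks:
          return False
      return label in low_value_tasks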

I like your suggestions here- I am sure MikeLing can update the patch or ask questions as needed.
(In reply to Dustin J. Mitchell [:dustin] from comment #77)

Hi dustin, thank you for your review :)

>  Notably, with this arrangement there is no need for a change to generator.py.

Yeah, actually I remember you said that before, and I had noticed the issues you point out above when I committed this patch. But the thing is, we need to include the branch name in the request url (SETA counts the number of requests per branch). I don't know where to get that branch name in taskcluster if I don't call query_low_value_tasks in generator.py, and that's why I created a module named load_low_value_tasks but use it in different places (I mean, call query_low_value_tasks in generator.py but use TASKS in transform.py).

On the other hand, if we don't use query_low_value_tasks or some other function to get the branch name and initialize the task list in generator.py, we will request the task list for every kind of task if we query it in tasks.py, or make a request for every single task if we do the request in transform.py.

>I think a better design here would be to have a single module-level function that gets the list of low >value tasks, caching it in a global variable only referenced in that module so that multiple calls do not >repeat the HTTP requests.

hmmm, I haven't quite got the idea of 'caching it in a global variable only referenced in that module'. Does that mean we should stop using the LowValueTasks class in load_low_value_job.py and just create a variable named (say) LOW_VALUE_TASKS, then use query_low_value_tasks() to initialize it? Would that cause a race condition when we have multiple pushes being scheduled at the same time?

> Even better might be to have a single module-level function that, given a label, returns True if it is >low value. 

I think that would require us to maintain a task table for each kind of task, initializing its task labels on each push. If so, that's a pretty expensive way to make it happen.
I am not very clear on the next steps here.  The last few comments refer to modules, kinds, tasks- but I am not sure what that really means.

SETA is only designed to work on test tasks, not build, lint, or any other non Firefox test jobs.  If there is a desktop-test/android-test module to use, I am fine with that :)

We do need to ensure that we only call SETA once per gecko decision task, and that it includes the branch name.  Other than that, no data is needed.

:dustin, can you be more specific about what you mean by module in: <module>.is_low_value_task(self.label)?

:mikeling, can you explain more about what you see as a tasks table for each kind of task?
Flags: needinfo?(sabergeass)
Flags: needinfo?(dustin)
<module> meaning Python module.  I used that since I had earlier suggested changing the name of the module.  Let's call it "taskgraph.util.seta", in which case the function I'm suggesting would be "taskgraph.util.seta.is_low_value_task(self.label, config.params)"

Passing `config.params` there allows `is_low_value_task` to pass the branch (`config.params['project']`) along to the SETA server.
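The call site in the transform's optimize logic would then look roughly like this (the return convention below is assumed for illustration):

  from taskgraph.util import seta

  def optimize(self, params):
      # params['project'] carries the branch name (e.g. 'autoland')
      # down to the SETA query
      if seta.is_low_value_task(self.label, params['project']):
          return True, None
      return False, None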
Flags: needinfo?(dustin)
Attached patch patch_for_bug_1287018_v2.patch (obsolete) — Splinter Review
I kept the LowValueTasks class, actually, because it still makes multiple requests if I just define the TASKS list as a module-level value rather than defining it in the class. I will commit another patch to demonstrate that :)
Attachment #8799208 - Attachment is obsolete: true
Flags: needinfo?(sabergeass)
Attachment #8801440 - Flags: review?(dustin)
Attached patch patch_v2.patch (obsolete) — Splinter Review
Here is another version of the patch, and it still makes multiple requests to SETA: https://pastebin.mozilla.org/8918783
Comment on attachment 8801441 [details] [diff] [review]
patch_v2.patch

Review of attachment 8801441 [details] [diff] [review]:
-----------------------------------------------------------------

::: taskcluster/taskgraph/util/load_low_value_task.py
@@ +59,5 @@
> +        # We just print the error out as a debug message if we failed to catch the exception above
> +        except exceptions.RequestException as error:
> +            logger.warning(error)
> +
> +        TASKS = low_value_tasks

In Python, this is a local variable assignment, so it creates a new "TASKS" that is local to the query_low_value_tasks function.  Once that function exits, the value is lost.

One option here is to use the `global` keyword
  https://stereochro.me/ideas/global-in-python
That would fix this issue.  However, global is not considered great style..
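For illustration, the global-keyword version would look something like this (fetch_tasks is a stand-in for the HTTP logic):

  TASKS = []

  def fetch_tasks(url):
      # stand-in for the real HTTP request logic
      return []

  def query_low_value_tasks(url):
      # without this declaration, 'TASKS = ...' below would just create a
      # new local variable and the module-level TASKS would stay empty
      global TASKS
      TASKS = fetch_tasks(url)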

Another option is to structure the whole thing as a singleton class that is instantiated once (that's what "singleton" means) and used on every call to the function:

class LowValueTasks(object):

    def __init__(self):
        self.low_value_tasks = {}

    def query_low_value_tasks(self, branch):
        if branch in self.low_value_tasks:
            return self.low_value_tasks[branch]
        # .. fetch the tasks
        self.low_value_tasks[branch] = ...
        return self.low_value_tasks[branch]

    def is_low_value_task(self, label, branch):
        return label in self.query_low_value_tasks(branch)

is_low_value_task = LowValueTasks().is_low_value_task


Do you want to give that latter option a try?
Attachment #8801440 - Flags: review?(dustin)
Attached patch patch_for_bug_1287018_v2.patch (obsolete) — Splinter Review
Hmmm, it still initializes 'low_value_tasks' and makes the request multiple times. I'm not sure whether I created the singleton class the wrong way, but after looking more into how singleton classes are created (http://stackoverflow.com/questions/6760685/creating-a-singleton-in-python), I think the patch in https://bugzilla.mozilla.org/show_bug.cgi?id=1287018#c82 fits the singleton pattern.
Attachment #8801949 - Flags: feedback?(dustin)
Not quite -- you're creating a new instance of the class on every check ("LowValueTasks()" calls the constructor and makes a new instance).
(In reply to Dustin J. Mitchell [:dustin] from comment #86)
> Not quite -- you're creating a new instance of the class on every check
> ("LowValueTasks()" calls the constructor and makes a new instance).

Sorry, I can't log into my irccloud. Hmm, so how do I make it initialize only once?
Ok, LowValueTasks is a Singleton class now :) Hope it's good to go.
Attachment #8801440 - Attachment is obsolete: true
Attachment #8801441 - Attachment is obsolete: true
Attachment #8801949 - Attachment is obsolete: true
Attachment #8801949 - Flags: feedback?(dustin)
Attachment #8802837 - Flags: review?(dustin)
Attached patch bug1287018.patchSplinter Review
Nice work with the singleton metaclass.  However, the constructor is taking the branch as an argument but that is not being considered in the singleton calculation, so

  autoland_obj = LowValueTasks('autoland')
  central_obj = LowValueTasks('mozilla-central')

would result in central_obj being the same object as autoland_obj, and in fact using the low-value tasks list for autoland, despite being invoked with 'mozilla-central'.  Equivalently,

  is_low_value_task('task-123', 'autoland')
  is_low_value_task('task-456', 'mozilla-central')

would, for the second call, return whether task-456 was a low-value task on autoland, not on mozilla-central.

The Singleton metaclass is useful when the class constructor is called in a bunch of places, but you want to make sure all calls result in the same object.  More advanced versions of the metaclass can be set up to take the constructor arguments into account, which would fix the issue described above (you'd end up with one instance of the class for each branch).
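For example, an argument-aware variant of the metaclass might look like this (a sketch):

  class Singleton(type):
      _instances = {}

      def __call__(cls, *args, **kwargs):
          # key on the constructor arguments too, so LowValueTasks('autoland')
          # and LowValueTasks('mozilla-central') become distinct instances
          key = (cls, args, tuple(sorted(kwargs.items())))
          if key not in cls._instances:
              cls._instances[key] = super(Singleton, cls).__call__(*args, **kwargs)
          return cls._instances[key]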

However, in this case it is easier to just instantiate the class once (at the end of the module).  If we only ever call the constructor once, there's no need for any fancy logic to avoid calling it twice.  So in a way, it's a much simpler form of a "singleton class" than one using metaclasses.

Once we're sure there is only one instance, we can use caching within that instance, based on the branch name, to avoid multiple HTTP requests.

I've attached an implementation of this approach -- basically what I described in comment 84.  I took the liberty of making a few other changes, too:

 - change the module name to taskgraph.util.seta
 - change the class name to SETA
 - change 'branch' to 'project', since that is how we refer to it elsewhere in this codebase

This works in my testing -- what do you think?
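For reference, a condensed sketch of the shape of the attached patch (error handling and retries elided here; see the patch for the real details):

  import json
  import logging

  import requests

  logger = logging.getLogger(__name__)

  # the projects SETA data applies to, and the endpoint serving it
  SETA_PROJECTS = ['mozilla-inbound', 'fx-team', 'autoland']
  SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?branch=%s"


  class SETA(object):
      def __init__(self):
          # cache of low-value task labels, keyed by project name
          self.low_value_tasks = {}

      def query_low_value_tasks(self, project):
          if project not in self.low_value_tasks:
              response = requests.get(SETA_ENDPOINT % project, timeout=5,
                                      headers={'User-Agent': 'taskcluster'})
              self.low_value_tasks[project] = \
                  json.loads(response.content)['jobtypes'].values()[0]
          return self.low_value_tasks[project]

      def is_low_value_task(self, label, project):
          # never optimize anything away on projects SETA does not cover
          if project not in SETA_PROJECTS:
              return False
          return label in self.query_low_value_tasks(project)

  # instantiate exactly once, at module level; every import shares it
  is_low_value_task = SETA().is_low_value_task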
Attachment #8803163 - Flags: review?(sabergeass)
Attachment #8802837 - Flags: review?(dustin)
Comment on attachment 8803163 [details] [diff] [review]
bug1287018.patch

Alright, now I know where I kept going wrong :P Thank you for the patch, and sorry about the wasted time. This looks good to me :)
Attachment #8803163 - Flags: review?(sabergeass) → review+
as a note, I am on PTO today and Monday- if you want to get this landed, go ahead; it would be awesome to have this in.  If you want to wait for extra eyes on SETA, we can land it on Tuesday.
Comment on attachment 8803163 [details] [diff] [review]
bug1287018.patch

Review of attachment 8803163 [details] [diff] [review]:
-----------------------------------------------------------------

one change to make prior to landing!

::: taskcluster/taskgraph/util/seta.py
@@ +10,5 @@
> +}
> +
> +# It's a list of project name which SETA is useful on
> +SETA_PROJECTS = ['mozilla-inbound', 'fx-team', 'autoland']
> +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?branch=%s"

hey, this is still using seta-dev, can we make this use the production instance?
Mike, do you want to make a new patch with that changed?
(In reply to Joel Maher ( :jmaher) from comment #92)
> Comment on attachment 8803163 [details] [diff] [review]
> bug1287018.patch
> 
> Review of attachment 8803163 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> one change to make prior to landing!
> 
> ::: taskcluster/taskgraph/util/seta.py
> @@ +10,5 @@
> > +}
> > +
> > +# It's a list of project name which SETA is useful on
> > +SETA_PROJECTS = ['mozilla-inbound', 'fx-team', 'autoland']
> > +SETA_ENDPOINT = "http://seta-dev.herokuapp.com/data/setadetails/?branch=%s"
> 
> hey, this is still using seta-dev, can we make this use the production
> instance?

I saw this in the code here and there.
Can you please change every instance of "http://seta-dev.herokuapp.com" to os.environ['HOST']?

We can then go ahead and set the variable on each app with a different value.
Armen is talking about making that change in the ouija source code, not in this patch.
(In reply to Dustin J. Mitchell [:dustin] from comment #93)
> Mike, do you want to make a new patch with that changed?

Sure, do I need to commit a new patch?
https://hg.mozilla.org/mozilla-central/rev/e7ebeead3cf5
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52