Closed Bug 1388407 Opened 7 years ago Closed 7 years ago

Scheduling missing tests for autoland times out

Categories

(Taskcluster :: Services, enhancement)

Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla57

People

(Reporter: armenzg, Assigned: bstack)

Attachments

(1 file)

I pushed to try, canceled the builds, clicked on the Gecko decision task, added a payload of {} and clicked on trigger action.

Unfortunately only the missing tests action task got scheduled.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5cecffefd20de1c958d5201a39575c38b596dc1c&selectedJob=121686629

> [task 2017-08-08T15:22:04.847044Z] Out of 37 test tasks, 37 already existed and the action created 0
Right -- those tests had already been run. That they had been cancelled is an entirely different issue. The "run missing tests" action only runs tests that were not originally run.
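The log line quoted above follows from how "missing" is decided. Here is a minimal sketch, assuming the action compares the labels of the target test tasks against the labels the decision task already created. The names `find_missing_tests` and `summarize` are hypothetical and are not the in-tree code.

```python
def find_missing_tests(target_test_labels, label_to_taskid):
    """Return the test labels that have no task yet.

    A task that was created and later cancelled still has an entry in
    label_to_taskid, so it counts as existing and is not re-created.
    """
    return sorted(l for l in target_test_labels if l not in label_to_taskid)


def summarize(target_test_labels, label_to_taskid):
    """Build a summary line in the same shape as the log message."""
    missing = find_missing_tests(target_test_labels, label_to_taskid)
    total = len(target_test_labels)
    return "Out of {} test tasks, {} already existed and the action created {}".format(
        total, total - len(missing), len(missing))
```

Under this assumption, cancelling a task does not remove its label from the mapping, which is why cancelled tests are not picked up again.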
OK. I will try again w/o cancelling first.

I'm used to the old behaviour where cancelled or failed builds (e.g. infra failure) would be tried again.
That might be a good set of features to add -- and it's all in tree now, so anyone can do so :)
I tried it on autoland and it failed like this:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=91b30f63e8d804de50178f3f0e85ebaad4766b4b&selectedJob=121694289

> [taskcluster:error] Task timeout after 1800 seconds. Force killing container.

I also tried it on 'try' without canceling builds and it did not schedule anything:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2850f912e482988a582f0c8c5092337a8099b699&group_state=expanded&selectedJob=121715931

Should we assume this is the new behaviour and ask for feedback?
I will rename this bug to only focus on the timeout issue.
Summary: Scheduling missing tests did not schedule anything → Scheduling missing tests for autoland times out
Attachment #8896498 - Flags: review?(dustin)
This patch should use the same logic the rest of taskgraph uses to submit tasks concurrently. I believe it will also fix the missing dependencies in add_talos we've been investigating, and simplify the code needed for an action task as well! I assume I've missed something about how graph generation works, since it is pretty complex, but hopefully this is at least a move in the right direction.
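To illustrate what submitting tasks concurrently can look like, here is a rough sketch that is not the actual taskgraph code. It assumes tasks are given as a mapping from label to dependency labels, and that `create_task` is a hypothetical callable wrapping the slow API request. Each task is submitted to a thread pool only after its own dependencies have been created, so independent tasks overlap instead of running one at a time.

```python
from concurrent.futures import ThreadPoolExecutor


def topological_order(deps):
    """Return labels so that each label comes after its dependencies."""
    order, seen = [], set()

    def visit(label):
        if label in seen:
            return
        seen.add(label)
        for dep in deps[label]:
            visit(dep)
        order.append(label)

    for label in sorted(deps):
        visit(label)
    return order


def submit_concurrently(deps, create_task, max_workers=8):
    """Create all tasks; deps maps label -> list of dependency labels."""
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for label in topological_order(deps):
            # Block only until this task's own dependencies exist;
            # result() also re-raises any error from creating them.
            for dep in deps[label]:
                futures[dep].result()
            futures[label] = pool.submit(create_task, label)
    return {label: f.result() for label, f in futures.items()}
```

With a graph of one build and many tests, the build is created first and the test submissions then proceed in parallel, which is the kind of change that would help an action task finish within its time limit.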
Assignee: nobody → bstack
Status: NEW → ASSIGNED
Comment on attachment 8896498 [details]
Bug 1388407 - Fix timeouts in action-task graph submission

https://reviewboard.mozilla.org/r/167758/#review173540

::: taskcluster/taskgraph/actions/util.py:39
(Diff revision 2)
> +def create_tasks(to_run, full_task_graph, label_to_taskid, params, decision_task_id):
> +    """Create new tasks.  The task definition will have {relative-datestamp':
> +    '..'} rendered just like in a decision task.  Action callbacks should use
> +    this function to create new tasks,
> +    allowing easy debugging with `mach taskgraph action-callback --test`.
> +    This builds up all required tasks to run in order to run the tasks requested."""

This docstring may need some adjustment.  This will also render {task-reference: ..}, and it is suited to creating tasks based on the original decision task, rather than creating "new" tasks.  Is it legit to "edit" the full_task_graph before handing it to this function?
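For readers unfamiliar with the two placeholder forms mentioned here, the following is a simplified sketch of the rendering step. It assumes `{'relative-datestamp': '1 day'}` becomes an absolute timestamp and `{'task-reference': '...<name>...'}` has `<name>` replaced by the taskId of that dependency. The `render` helper, its timestamp format, and the example URL are illustrative assumptions, not the in-tree implementation.

```python
import re
from datetime import datetime, timedelta

_UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def _relative_to_iso(now, spec):
    # Accepts strings such as "1 day" or "2 hours"; other forms are out of scope.
    amount, unit = spec.split()
    seconds = int(amount) * _UNIT_SECONDS[unit.rstrip("s")]
    return (now + timedelta(seconds=seconds)).strftime("%Y-%m-%dT%H:%M:%SZ")


def render(value, now, dep_taskids):
    """Recursively replace both placeholder forms in a task definition."""
    if isinstance(value, list):
        return [render(v, now, dep_taskids) for v in value]
    if isinstance(value, dict):
        if list(value) == ["relative-datestamp"]:
            return _relative_to_iso(now, value["relative-datestamp"])
        if list(value) == ["task-reference"]:
            # Swap each <name> for the taskId of the dependency called name.
            return re.sub(r"<([^>]+)>",
                          lambda m: dep_taskids[m.group(1)],
                          value["task-reference"])
        return {k: render(v, now, dep_taskids) for k, v in value.items()}
    return value


task = {
    "deadline": {"relative-datestamp": "1 day"},
    "payload": {"env": {"URL": {"task-reference": "https://example.invalid/<build>/public"}}},
}
rendered = render(task, datetime(2017, 8, 8, 15, 0, 0), {"build": "abc123"})
```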

::: taskcluster/taskgraph/actions/util.py:47
(Diff revision 2)
> +    target_task_graph = TaskGraph(
> +        {l: full_task_graph[l] for l in target_graph.nodes},
> +        target_graph)
> +    optimized_task_graph, label_to_taskid = optimize_task_graph(target_task_graph,
> +                                                                params,
> +                                                                to_run)

I think you want to pass label_to_taskid as the 4th parameter here; without that, optimization will look in the index and may find newer versions of prerequisite tasks like toolchains.
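The concern can be shown with a toy model. This is a sketch only: `choose_taskids` and `index_lookup` are hypothetical stand-ins and do not reflect the real signature of optimize_task_graph. When the mapping from the original decision task is supplied, a prerequisite keeps the taskId it had in that push; when it is omitted, the lookup falls through to the index and may return a newer task.

```python
def choose_taskids(labels, index_lookup, existing_label_to_taskid=None):
    """Decide which labels reuse an existing task and which must be created.

    index_lookup: callable(label) -> taskId or None (stand-in for the index).
    existing_label_to_taskid: mapping from the original decision task, if any.
    """
    existing = existing_label_to_taskid or {}
    replaced, to_create = {}, []
    for label in labels:
        if label in existing:
            # Prefer the task from the original push.
            replaced[label] = existing[label]
            continue
        found = index_lookup(label)
        if found is not None:
            replaced[label] = found
        else:
            to_create.append(label)
    return replaced, to_create
```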
Attachment #8896498 - Flags: review?(dustin) → review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/5214574f51b6
Fix timeouts in action-task graph submission r=dustin
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/5214574f51b6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Awesome. I'm glad this works :)
Component: Integration → Services