Closed
Bug 1337300
Opened 8 years ago
Closed 8 years ago
decision task opt decision task for cron job nightly-mochitest-valgrind cron(vg) broken on central
Categories
(Taskcluster :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cbook, Assigned: pmoore)
Attachments
(1 file)
patch, 980 bytes (dustin: review+)
For example: https://treeherder.mozilla.org/logviewer.html#?job_id=75018174&repo=mozilla-central&lineNumber=779
[task 2017-02-07T04:01:56.234330Z] HTTPError: 409 Client Error: Conflict for url: http://taskcluster/queue/v1/task/XYQVC7MnQA2wZSjl949hXg
Retriggering does not help. I think the real conflict behind this error is:
[task 2017-02-07T04:01:56.227014Z] "PUT /queue/v1/task/XYQVC7MnQA2wZSjl949hXg HTTP/1.1" 409 10376
[task 2017-02-07T04:01:56.228066Z] Task group HDMoXSanTVCGzQB-ODJ-4g contains tasks with
[task 2017-02-07T04:01:56.228133Z] schedulerId gecko-level-3-cron. You are attempting
[task 2017-02-07T04:01:56.228208Z] to include tasks from schedulerId gecko-level-3,
[task 2017-02-07T04:01:56.228275Z] which is not permitted.
[task 2017-02-07T04:01:56.228347Z] All tasks in the same task-group must have the same schedulerId.
[task 2017-02-07T04:01:56.228515Z] ----
[task 2017-02-07T04:01:56.228573Z] errorCode: RequestConflict
[task 2017-02-07T04:01:56.228601Z] statusCode: 409
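The rule quoted in the log (all tasks in the same task group must share one schedulerId) can be illustrated with a minimal sketch. This is not the queue's implementation; the `TaskGroups` class and `TaskGroupConflict` exception are hypothetical names, and the sketch assumes the first task created in a group fixes that group's schedulerId.

```python
class TaskGroupConflict(Exception):
    """Raised when a task's schedulerId differs from its task group's."""


class TaskGroups:
    """Hypothetical in-memory stand-in for the queue's task-group bookkeeping."""

    def __init__(self):
        self._scheduler_by_group = {}

    def create_task(self, task_group_id, scheduler_id):
        # Assumption: the first task created in a group fixes its schedulerId.
        existing = self._scheduler_by_group.setdefault(task_group_id, scheduler_id)
        if existing != scheduler_id:
            raise TaskGroupConflict(
                f"Task group {task_group_id} contains tasks with schedulerId "
                f"{existing}; cannot include tasks from schedulerId {scheduler_id}"
            )


groups = TaskGroups()
# The cron decision task creates the group under the "-cron" schedulerId ...
groups.create_task("HDMoXSanTVCGzQB-ODJ-4g", "gecko-level-3-cron")
try:
    # ... and a later task in the same group uses a different schedulerId.
    groups.create_task("HDMoXSanTVCGzQB-ODJ-4g", "gecko-level-3")
    conflict = None
except TaskGroupConflict as exc:
    conflict = str(exc)
```

Under this model there are two ways out: use the same schedulerId for every task in the group, or put the tasks in separate task groups.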
Comment 1 • 8 years ago (Assignee)
I suspect https://hg.mozilla.org/mozilla-central/file/af8a2573d0f1/taskcluster/taskgraph/cron/decision.py#l95 should be changed to remove -cron in the schedulerId name, or those cron tasks should be put in a dedicated task group.
The choice about whether one task group or two task groups should be used is probably mostly an aesthetic one.
Looks like fallout from bug 1252948.
Flags: needinfo?(dustin)
See Also: → 1252948
Comment 2 • 8 years ago (Assignee)
Comment 3 • 8 years ago (Assignee)
Ah, looks like these decision cron jobs are not scheduled on try - I see no jobs created there.
Comment 4 • 8 years ago (Assignee)
Well, I guess these are scheduled based on a cron - so that isn't surprising - maybe these tasks would get added later to the try push by the taskcluster-hooks service.
From https://bugzilla.mozilla.org/show_bug.cgi?id=1252948#c19 it looks like these valgrind tasks only run once per week, but probably all cron tasks are affected, not just the valgrind ones (my guess is that all tasks added by taskcluster-hooks will get this new schedulerId).
Long story short, I haven't dived deeply into the code, but based on my superficial understanding, I'm guessing that https://hg.mozilla.org/try/rev/329b2478d8e8af9550a53015874127eef095fe3b will probably fix things, although it could affect any roles whose scopes include the schedulerId. In other words, it might also require adjusting some taskcluster roles.
I'm guessing it is probably best for us to wait until dustin/Callek/kmoir get in, who know this stuff far better than me. :-)
Comment 5 • 8 years ago (Assignee)
However, if this becomes tree-closing, I'm happy to work on rolling out the patch, looking for auth failures, and adjusting roles as necessary.
Comment 6 • 8 years ago (Assignee)
So looks like this cron runs every 15 mins, and then presumably based on the in-tree cron schedules decides what to run (so stuff can be scheduled more infrequently than every 15 mins), and when it runs, it uses the head of the default branch in mozilla-central.
https://tools.taskcluster.net/hooks/#project-releng/cron-task-mozilla-central
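As a rough illustration of the behaviour described above (a hook fires every 15 minutes and the in-tree schedule decides which jobs are actually due), here is a minimal sketch. The schedule layout (`when` entries with `hour`, `minute` and an optional `weekday`), the `should_run` helper, the daily job and the concrete times are all assumptions made for this example, not the real in-tree format or schedule.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def should_run(job, now):
    """Return True if the (hypothetical) job schedule matches this tick."""
    for slot in job["when"]:
        if slot["hour"] != now.hour or slot["minute"] != now.minute:
            continue
        # An optional weekday restricts the slot to one day per week.
        if "weekday" in slot and slot["weekday"] != WEEKDAYS[now.weekday()]:
            continue
        return True
    return False


# Hypothetical schedules; only the valgrind job name comes from this bug.
jobs = [
    {"name": "example-daily-job", "when": [{"hour": 11, "minute": 0}]},
    {"name": "nightly-mochitest-valgrind",
     "when": [{"hour": 4, "minute": 0, "weekday": "Tuesday"}]},
]

tick = datetime(2017, 2, 7, 4, 0)  # a Tuesday
due = [job["name"] for job in jobs if should_run(job, tick)]
```

With a layout like this, the hook itself can stay on a fixed 15-minute interval while individual jobs run daily, weekly, or at whatever coarser frequency their schedule states.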
Comment 7 • 8 years ago (Assignee)
From that hook, we can see these are the scopes available to the cron task that runs:
https://tools.taskcluster.net/auth/roles/#hook-id:project-releng%252fcron-task-mozilla-central
So this task already has
queue:create-task:aws-provisioner-v1/gecko-1-*
queue:create-task:aws-provisioner-v1/gecko-2-*
queue:create-task:aws-provisioner-v1/gecko-3-*
which means, removing '-cron' from the schedulerId name (e.g. gecko-level-3-cron => gecko-level-3) shouldn't break anything, since the 'gecko-{level}-*' still matches the shortened scheduler name.
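For reference, a Taskcluster scope ending in `*` is satisfied by prefix: it covers any required scope that starts with the text before the `*`. A minimal sketch of that check follows; the helper names and the two worker type names are hypothetical, and the sketch does not determine which exact scope string the queue requires for a given schedulerId.

```python
def scope_satisfies(granted, required):
    """Return True if one granted scope covers the required scope."""
    if granted.endswith("*"):
        # A trailing "*" matches any scope sharing the prefix before it.
        return required.startswith(granted[:-1])
    return granted == required


def satisfied(granted_scopes, required):
    """Return True if any granted scope covers the required scope."""
    return any(scope_satisfies(g, required) for g in granted_scopes)


granted = [
    "queue:create-task:aws-provisioner-v1/gecko-1-*",
    "queue:create-task:aws-provisioner-v1/gecko-2-*",
    "queue:create-task:aws-provisioner-v1/gecko-3-*",
]

# Hypothetical worker type names, used only to exercise the matching rule.
ok = satisfied(granted, "queue:create-task:aws-provisioner-v1/gecko-3-example-worker")
denied = satisfied(granted, "queue:create-task:aws-provisioner-v1/other-worker")
```

Running the required scope strings through a check like this before landing a schedulerId change is a cheap way to spot roles that would need adjusting.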
Comment 8 • 8 years ago (Assignee)
Updated • 8 years ago (Assignee)
Flags: needinfo?(dustin)
Attachment #8834324 - Flags: review?(rgarbas) → review?(dustin)
Comment 9 • 8 years ago
:pmoore: I'm not sure I understand (yet) what this change may cause (I have not yet played much with in-tree stuff). But thank you for adding me; I'll follow the discussion and hopefully learn something. :dustin: might be a better person to review it.
Comment 10 • 8 years ago
Comment on attachment 8834324
bug1337300_gecko_v1.patch
Review of attachment 8834324:
-----------------------------------------------------------------
Great minds think alike:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e74dc930625cea6f16fb5f9f9bb13f1431261521
so r+, but no need to land this since it's already landed.
Attachment #8834324 - Flags: review?(dustin) → review+
Updated • 8 years ago
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED