Closed Bug 1347889 Opened 3 years ago Closed 3 years ago

Nightly Decision Task for Android + Desktop fail with 400 Client Error: Bad Request for url: http://taskcluster/queue/v1/task/MqSPtdETSNySrz2lCMqGQw


(Taskcluster :: General, defect, blocker)






(Reporter: cbook, Assigned: garndt)





(1 file)

[task 2017-03-16T11:04:19.955263Z] HTTPError: 400 Client Error: Bad Request for url: http://taskcluster/queue/v1/task/MqSPtdETSNySrz2lCMqGQw

Not sure if this is related to bug 1347569, but the problem is that this failed again, so we are risking losing another day of Linux and Android nightlies. This is a real blocking bug :(
Flags: needinfo?(dustin)
Flags: needinfo?(bugspam.Callek)
<&pmoore> Tomcat|sheriffduty: i wonder if this might be an in-tree problem this time, it looks like we are getting a 400 back from the queue, indicating an invalid request
12:26 Tomcat|sheriffduty: also it says "data.created should be string"
12:27 so i wonder if something has changed in-tree wrt nightly task generation ... :/
12:28 my guess is, there is a decision task generated for the nightly builds, and that only runs when the nightlies run, so maybe it landed and didn't cause any bustage
12:28 since you would only see the bustage the next time the nightlies run
12:29 iirc we now have an in-tree mechanism for scheduling tasks on a cron, i can take a look in-tree to see if i can find it
<Tomcat|sheriffduty> maybe ?
<&pmoore> hmmm - could be, i'm not sure - let me see if i can find the generated task to find the problem
12:32 hmmm in we have `pushdate: 0` - i wonder if that is normal (could be unrelated, just looks strange)
<Tomcat|sheriffduty> i could retrigger them, but not sure if this would help
<&pmoore> Tomcat|sheriffduty: so i suspect that there is a mach command that will take the parameters file from and produce a list of task definitions, and those definitions will have a badly formatted "created" parameter
<Tomcat|sheriffduty> pmoore: there was a mach change in tree
<&pmoore> Tomcat|sheriffduty: the people who probably know how this task generation stuff works (as I think they have worked on it) are dustin, jlund|away, ahal, gps, Callek
12:39 Tomcat|sheriffduty: that change looks ok to me
12:39 Tomcat|sheriffduty: i think dustin will be around soon - he will know
<Tomcat|sheriffduty> yeah
<&pmoore> you could play with backing out any changes under /taskcluster top level directory
12:41 Tomcat|sheriffduty: if there is only a single change in there, it is likely to be the one
12:41 Tomcat|sheriffduty: it certainly looks like an issue with task generation to me, since it looks like a malform created parameter in the task definition
12:41 *malformed*
<Tomcat|sheriffduty> pmoore: ok will sylvestre
<Tomcat|sheriffduty> pmoore: will ping sylvestre
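The diagnosis above hinges on the queue's schema validation: the task's `created` field must be an absolute date-time *string*, so a task definition still carrying an unresolved placeholder object is rejected with a 400 and the message "data.created should be string". A minimal re-creation of that check (the `validate_created` helper and regex are illustrative, not the queue's actual code):

```python
import re

# Simplified version of the date-time format the Taskcluster queue
# expects for fields like 'created' and 'deadline'.
DATE_TIME = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$')

def validate_created(task_def):
    """Return an error string in the style of the queue's schema
    validator, or None if 'created' is acceptable."""
    created = task_def.get('created')
    if not isinstance(created, str):
        return 'data.created should be string'
    if not DATE_TIME.match(created):
        return 'data.created should match format "date-time"'
    return None

# An unresolved placeholder object fails, matching the log above:
validate_created({'created': {'relative-timestamp': '0 seconds'}})
# A resolved ISO 8601 string passes:
validate_created({'created': '2017-03-16T11:04:19.955263Z'})
```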
sylvestre: could this be a regression from Bug 1347474 ?
Flags: needinfo?(sledru)
See Also: → 1347896
My patch just changes string content. I don't think it is called in any automation currently.
Flags: needinfo?(sledru)
Comment on attachment 8848072 [details]
Bug 1347889 - use 'relative-datestamp' instead of typo 'relative-timestamp' in morphs, to unbreak task submission.

::: taskcluster/taskgraph/
(Diff revision 1)
>      task_def = {
>          'provisionerId': 'aws-provisioner-v1',
>          'workerType': 'gecko-misc',
>          'dependencies': [task.task_id, image_taskid],
> -        'created': {'relative-timestamp': '0 seconds'},
> +        'created': {'relative-datestamp': '0 seconds'},

Given that relative-datestamp is also used for the initial task generation, I think it's the right call to use it here too.  I'll let Dustin have the final word, though.
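The reason the typo broke submission: the pass that rewrites task definitions before submission only recognizes the `relative-datestamp` key, so the misspelled `relative-timestamp` object passed through unresolved and hit the queue's schema validation. A simplified sketch of that resolution pass (the function name and exact behavior here are illustrative, not taskgraph's actual implementation):

```python
from datetime import datetime, timedelta, timezone

def resolve_timestamps(now, obj):
    """Recursively replace {'relative-datestamp': '<offset>'} placeholders
    with absolute ISO 8601 strings. Any other key, including the typo'd
    'relative-timestamp', is left untouched, which is how the broken
    placeholder reached the queue."""
    if isinstance(obj, dict):
        if set(obj) == {'relative-datestamp'}:
            value, unit = obj['relative-datestamp'].split()
            unit = unit if unit.endswith('s') else unit + 's'
            resolved = now + timedelta(**{unit: int(value)})
            return resolved.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
        return {k: resolve_timestamps(now, v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [resolve_timestamps(now, v) for v in obj]
    return obj

now = datetime(2017, 3, 16, 11, 0, 0, tzinfo=timezone.utc)
# Corrected key: resolves to a string the queue accepts.
resolve_timestamps(now, {'created': {'relative-datestamp': '0 seconds'}})
# Typo'd key: passes through unresolved, so the queue rejects the task.
resolve_timestamps(now, {'created': {'relative-timestamp': '0 seconds'}})
```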
Comment on attachment 8848072 [details]
Bug 1347889 - use 'relative-datestamp' instead of typo 'relative-timestamp' in morphs, to unbreak task submission.
Attachment #8848072 - Flags: review?(dustin) → review+
Pushed by
use 'relative-datestamp' instead of typo 'relative-timestamp' in morphs, to unbreak task submission. r=dustin
Dustin, can we force the hooks service to run this while we're all around, to check it works now (Callek thought there may also be scopes issues)? Maybe if we alter the scheduled run time, and then reset it afterwards to its normal value?

Or would this cause too much load during the day? If we haven't had nightlies for two nights, it might not be a bad thing.
We've actually had nightlies today; this is just the submission of the index tasks failing, which happened after the nightly jobs were submitted. So the normal nightly stuff was fine, just the l10n stuff wasn't added to the gecko index....
Flags: needinfo?(bugspam.Callek)
Assignee: nobody → garndt
Closed: 3 years ago
Flags: needinfo?(dustin)
Resolution: --- → FIXED