Closed
Bug 1461919
Opened 7 years ago
Closed 2 years ago
[tracking] improve reruns in release automation
Categories
(Release Engineering :: Release Automation, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mtabara, Unassigned)
References
(Depends on 4 open bugs)
Details
(Whiteboard: [releaseduty])
I'm back on releaseduty for cycle 61 and couldn't help noticing how flaky and fragile the reruns are. Lots of tasks need manual reruns, although some of them do rerun automatically.
We've already filed a bunch of similar bugs in the past few weeks; let's track the whole discussion here and bump the priority. We need to automate the reruns as much as possible.
Comment 1•7 years ago (Reporter)
Kind of worrying to see the following (a jamun-based staging release task) referenced in the "balrog-my-linux64-nightly/opt" job currently running in "balrogworker-3":
2018-05-16T09:53:57 DEBUG - Getting source url for balrog:beetmover:signing:partials:docker-image:parent Z2xxHbjbR5a64PcnVvfW-Q...
2018-05-16T09:53:57 INFO - balrog:beetmover:signing:partials:docker-image:parent Z2xxHbjbR5a64PcnVvfW-Q: found https://hg.mozilla.org/projects/jamun/raw-file/3544f46fd55df80340b38f96d793e739c93b99a2/.taskcluster.yml
2018-05-16T09:53:57 DEBUG - task_ids: {'default': 'Z2xxHbjbR5a64PcnVvfW-Q', 'decision': 'Z2xxHbjbR5a64PcnVvfW-Q'}
2018-05-16T09:53:57 INFO - Pushlog url https://hg.mozilla.org/projects/jamun/json-pushes?changeset=3544f46fd55df80340b38f96d793e739c93b99a2&tipsonly=1&version=2&full=1
2018-05-16T09:53:57 INFO - Downloading https://hg.mozilla.org/projects/jamun/json-pushes?changeset=3544f46fd55df80340b38f96d793e739c93b99a2&tipsonly=1&version=2&full=1
2018-05-16T09:53:57 INFO - Done
2018-05-16T09:53:57 WARNING - Pushlog error: expected a single push at https://hg.mozilla.org/projects/jamun/json-pushes?changeset=3544f46fd55df80340b38f96d793e739c93b99a2&tipsonly=1&version=2&full=1 but got {}!
2018-05-16T09:53:57 CRITICAL - Fatal exception
Traceback (most recent call last):
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/worker.py", line 124, in main
    loop.run_until_complete(async_main(context))
  File "/tools/python36/lib/python3.6/asyncio/base_events.py", line 468, in run_until_complete
    return future.result()
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/worker.py", line 99, in async_main
    await run_tasks(context)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/worker.py", line 64, in run_tasks
    await verify_chain_of_trust(chain)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1843, in verify_chain_of_trust
    task_count = await verify_task_types(chain)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1624, in verify_task_types
    await valid_task_types[task_type](chain, obj)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1393, in verify_parent_task
    await verify_parent_task_definition(chain, link)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1298, in verify_parent_task_definition
    chain, parent_link, decision_link, tasks_for
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1224, in populate_jsone_context
    await _get_additional_hgpush_jsone_context(parent_link, decision_link)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1138, in _get_additional_hgpush_jsone_context
    pushlog_id = list(pushlog_info['pushes'].keys())[0]
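The last frame above indexes the first key of `pushlog_info['pushes']` right after the log warned that the pushlog response was empty. As a rough illustration only (this is not scriptworker's code, and the helper name `single_push_id` is hypothetical), a guard like the following would turn that case into an explicit, descriptive error instead of failing on the bare list index:

```python
def single_push_id(pushlog_info: dict) -> str:
    """Return the ID of the only push in a json-pushes style response.

    Hypothetical helper: raises ValueError with a descriptive message when
    the response does not contain exactly one push.
    """
    # Treat a missing or null "pushes" entry the same as an empty one.
    pushes = pushlog_info.get("pushes") or {}
    if len(pushes) != 1:
        raise ValueError(f"expected a single push but got {pushes!r}")
    # Exactly one key is present, so the first key is the push ID.
    return next(iter(pushes))
```

Whether such a failure should then be retried automatically or surfaced to releaseduty is a separate decision that the sketch does not make.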
Comment 2•7 years ago
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #1)
> Kind of worrying to see the following (a jamun-based staging release task)
> referenced in the "balrog-my-linux64-nightly/opt" job currently running in
> "balrogworker-3":
I'm guessing we first landed a docker-image change on jamun? We'll keep using that image until the hash of the various files for that docker image changes.
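To picture why an image built on jamun could still be in use elsewhere: if an image is keyed by a hash of its input files, any branch with identical files resolves to the same image. The sketch below is an assumption-laden illustration, not the actual taskgraph logic; the function name `context_hash` and the choice of SHA-256 over relative paths plus contents are hypothetical.

```python
import hashlib
from pathlib import Path


def context_hash(context_dir: str) -> str:
    """Hash every file under context_dir (relative path and contents).

    Hypothetical sketch: the result only changes when a file is added,
    removed, renamed, or edited.
    """
    digest = hashlib.sha256()
    root = Path(context_dir)
    # Sort paths so the hash does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Under that assumption, two pushes with byte-identical image inputs get the same hash, so the earlier image keeps being reused until one of those files changes.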
Comment 3•7 years ago (Reporter)
(In reply to Aki Sasaki [:aki] from comment #2)
> (In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #1)
> > Kind of worrying to see the following (a jamun-based staging release task)
> > referenced in the "balrog-my-linux64-nightly/opt" job currently running in
> > "balrogworker-3":
>
> I'm guessing we first landed a docker-image change on jamun? We'll keep
> using that image until the hash of the various files for that docker image
> changes.
This sounds plausible; I'll try to dive in a bit when I tackle this.
Comment 4•7 years ago (Reporter)
To pick the low-hanging fruit, we'd need to improve the exitCodes in the graph, possibly roping in ciduty for those one-liners. For the others, we'd need to take things step by step.
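As a rough sketch of what such a one-liner might amount to, assuming the affected tasks run on docker-worker (whose payload accepts an `onExitStatus.retry` list of exit codes): the helper name `with_retry_exit_codes` and the exit code `4` below are purely illustrative, not values taken from this bug.

```python
import copy
from typing import Iterable


def with_retry_exit_codes(task: dict, codes: Iterable[int]) -> dict:
    """Return a copy of task whose payload retries on the given exit codes.

    Hypothetical helper; assumes a docker-worker style payload with an
    onExitStatus.retry list. The input task is left untouched.
    """
    updated = copy.deepcopy(task)
    payload = updated.setdefault("payload", {})
    on_exit = payload.setdefault("onExitStatus", {})
    # Merge with any codes already configured, without duplicates.
    on_exit["retry"] = sorted(set(on_exit.get("retry", [])) | set(codes))
    return updated


# Illustrative usage: mark exit code 4 as retryable for one task definition.
example = with_retry_exit_codes({"payload": {"command": ["true"]}}, [4])
```

Other worker types may expose retries differently, so this only covers the case where the payload schema supports such a list.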
Comment 5•7 years ago
61.0b6 reruns:
J-3x3FJ0Tfar5hx8CKNLrA
Jo_-PXtwS32sxXC1GDAkkQ
Comment 6•7 years ago (Reporter)
We agreed that this is something that ciduty could help with if it happens again. Might be a good starting point for them in the release overview process.
Comment 7•7 years ago
repackage-l10n-mai-win32-nightly (bUByc-Y_TiCsGG7HYf6M6w) failed to check out and needed a rerun.
Updated•2 years ago
Severity: normal → S3
Comment 8•2 years ago
Most dependencies are fixed, so there is no need to keep this old tracker open.
Status: NEW → RESOLVED
Type: defect → task
Closed: 2 years ago
Resolution: --- → FIXED