Closed Bug 1813243 Opened 2 years ago Closed 2 years ago

beetmoverscript upload failed but task succeeded

Categories

(Release Engineering :: Release Automation, enhancement)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gbrown, Assigned: hneiva)

Details

There was some confusion today when beetmover uploads failed but the tasks succeeded.
https://treeherder.mozilla.org/jobs?repo=mozilla-release&selectedTaskRun=cyIHF1EsQheOStIg7wmUig.0&searchStr=beetmover&revision=30244986d6ff55bc3396db436fe1dba555828106
https://firefoxci.taskcluster-artifacts.net/cyIHF1EsQheOStIg7wmUig/0/public/logs/live_backing.log

2023-01-27 21:51:42,833 - beetmoverscript.script - INFO - put /app/workdir/cot/bOnMkNQnTbakgTvMNd9Qxg/public/build/th/target.complete.mar: 200
2023-01-27 21:51:42,835 - charset_normalizer - DEBUG - Encoding detection on empty bytes, assuming utf_8 intention.
2023-01-27 21:51:42,838 - scriptworker.utils - WARNING - Async task failed with error: ('Connection aborted.', timeout('The write operation timed out'))
2023-01-27 21:51:42,839 - beetmoverscript.utils - WARNING - Skipped exception:
2023-01-27 21:51:43,880 - beetmoverscript.script - INFO - put /app/workdir/cot/WhmYerPwTay9Way9_FC1Vw/public/build/th/target.tar.bz2: 200
2023-01-27 21:51:43,882 - charset_normalizer - DEBUG - Encoding detection on empty bytes, assuming utf_8 intention.
2023-01-27 21:51:44,796 - beetmoverscript.task - INFO - Action types: ['push-to-candidates']
2023-01-27 21:51:44,798 - beetmoverscript.task - DEBUG - Loading release_props from task's payload: {'appName': 'Firefox', 'appVersion': '109.0.1', 'branch': 'mozilla-release', 'buildid': '20230127170202', 'hashType': 'sha512', 'platform': 'linux64'}
2023-01-27 21:51:44,800 - beetmoverscript.script - INFO - Success!
<traceback object at 0x7fb66806b840>
<no message>
exit code: 0

The files were missing on https://archive.mozilla.org/pub/firefox/candidates/109.0.1-candidates/build1/update/linux-x86_64/; a re-run resolved the issue.

When an upload fails we should do more than warn: these should probably be ERRORs and fail the task.

From "Skipped exception:" in the log, I surmise the warning comes from https://github.com/mozilla-releng/scriptworker-scripts/blob/master/beetmoverscript/src/beetmoverscript/utils.py#L338, which means fail_task_on_error is False. I see fail_task_on_error set to false for many environments, even production ones, in worker.yml. Is that intentional? Why or when should an upload failure not fail the task?
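
To make the proposed behaviour concrete, here is a minimal sketch of how a fail_task_on_error switch could decide between "warn and continue" and "log an ERROR and fail the task". This is not the beetmoverscript code: `run_uploads` and `UploadError` are hypothetical names, and the only thing taken from this bug is the flag name and the "Skipped exception:" wording.

```python
import asyncio
import logging

log = logging.getLogger("upload_sketch")


class UploadError(Exception):
    """Hypothetical error type raised when at least one upload fails."""


async def run_uploads(uploads, fail_task_on_error):
    """Await all upload coroutines and return the exceptions that were skipped.

    With fail_task_on_error=False, failures are logged as warnings and the
    caller carries on (the behaviour seen in the log above).
    With fail_task_on_error=True, failures are logged as errors and
    UploadError is raised so the task cannot report success.
    """
    # return_exceptions=True collects failures instead of stopping at the first.
    results = await asyncio.gather(*uploads, return_exceptions=True)
    failures = [r for r in results if isinstance(r, Exception)]

    for exc in failures:
        if fail_task_on_error:
            log.error("Upload failed: %r", exc)
        else:
            log.warning("Skipped exception: %r", exc)

    if failures and fail_task_on_error:
        raise UploadError(f"{len(failures)} upload(s) failed")
    return failures
```

With the flag off, a caller gets the list of skipped exceptions back and can still exit 0; with the flag on, the raised exception is what would turn a missing file into a failed task rather than a silent gap on archive.mozilla.org.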

:hneiva - Can you clarify?

Flags: needinfo?(hneiva)

:gbrown We should confirm with SRE that gcloud is the primary storage facility, then switch the fail_task_on_error properties here:
https://github.com/mozilla-releng/scriptworker-scripts/blob/master/beetmoverscript/docker.d/worker.yml#L92

Assignee: gbrown → hneiva
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Component: Release Automation: Uploading → Release Automation