Closed Bug 1163998 Opened 9 years ago Closed 7 years ago

Balrog rule locking doesn't work in merge scripts

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(firefox44 fixed)

RESOLVED INCOMPLETE
Tracking Status
firefox44 --- fixed

People

(Reporter: rail, Unassigned)

References

Details

Attachments

(2 files, 1 obsolete file)

Because bug 1143910 changed config format.
manually locked aurora to latest buildid

this should match what migration tries to do: http://mxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/merge_day/central_to_aurora.py#78
Blocks: 1178507
Blocks: 1190766
Blocks: 1206664
No longer blocks: 1190766
Attached patch lock_rules.diff (obsolete) — Splinter Review
Locking rules by rule ID doesn't work on multiple AUS instances - the IDs are different. This patch removes the assumption that we need to iterate over a list of AUS servers, instead we have to explicitly pass it.

After landing this, I will need to change the merge docs to:
* get rid of "-c mozharness/configs/balrog/production.py"
* explicitly pass --balrog-api-root and --balrog-username
Attachment #8664902 - Flags: review?(jlund)
Comment on attachment 8664902 [details] [diff] [review]
lock_rules.diff

Review of attachment 8664902 [details] [diff] [review]:
-----------------------------------------------------------------

looks good. one comment below. setting r- for now

::: testing/mozharness/mozharness/mozilla/updates/balrog.py
@@ +140,3 @@
>  
> +        self.info("Calling Balrog rule locking script.")
> +        return self.retry(self.run_command, attempts=5, args=cmd)

I think the return here is ignored by https://dxr.mozilla.org/mozilla-central/rev/abe43c30d78d7546ed7c622c5cf62d265709cdfd/testing/mozharness/scripts/merge_day/gecko_migration.py?offset=0#525

as I see it we have two options.

1) allow the failure and continue but set self.return_code > 0 so that we notice the failure.
2) set the self.retry here to halt_on_failure=True so that we stop here before trying to commit, trigger pushes, etc

fwiw, either wfm. (1) might make more sense since we can always do it manually after or call the script with just the --lock-update-path action after a patch fix.
Attachment #8664902 - Flags: review?(jlund) → review-
Attached patch lock_rules.diffSplinter Review
Good catch! I blindly ported whatever we had before.

I added kwargs={"halt_on_failure": True} and removed "return", because there is no reason to return if we halt.
Attachment #8664902 - Attachment is obsolete: true
Attachment #8665229 - Flags: review?(jlund)
Comment on attachment 8665229 [details] [diff] [review]
lock_rules.diff

Review of attachment 8665229 [details] [diff] [review]:
-----------------------------------------------------------------

looks good but I may have been wrong about using halt_on_failure. After refreshing my mozharness BaseScript understanding, I *think* retry won't retry if we halt immediately on failure because we will self.fatal in the run_command call (the inner most nested try/catch exception):
https://dxr.mozilla.org/mozilla-central/rev/f1dffc8682fbba463cb4bb305f293ddcccbc20b4/testing/mozharness/mozharness/base/script.py#1135


I think we may need to use error_level=Fatal as a arg to retry() itself:
https://dxr.mozilla.org/mozilla-central/rev/f1dffc8682fbba463cb4bb305f293ddcccbc20b4/testing/mozharness/mozharness/base/script.py#867
Attachment #8665229 - Flags: review?(jlund) → review+
> I think we may need to use error_level=Fatal as a arg to retry() itself:
s/Fatal/FATAL
https://hg.mozilla.org/mozilla-central/rev/3dc71eb85e09
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Missing config variables:

* balrog_api_root
* balrog_username

Also:

07:44:43     INFO - Calling Balrog rule locking script.
07:44:43     INFO - retry: Calling run_command with args: ['python', '/home/rail/work/mozilla/merge/build/tools/scripts/updates/balrog-nightly-locker.py', '--credentials-file', '/home/rail/wo
rk/mozilla/merge/oauth.txt', '--api-root', 'https://aus4-admin.mozilla.org/api', '--username', 'ffxbld', '-r', '8', '-r', '10', '-r', '18', '-r', '106', '--verbose', 'lock'], kwargs: {'halt_o
n_failure': True}, attempt #1
07:44:43     INFO - retry: attempt #1 caught exception: run_command() takes at most 14 arguments (20 given)
07:44:43     INFO - retry: Failed, sleeping 60 seconds before retrying
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
back to the pool
Assignee: rail → nobody
Status: REOPENED → RESOLVED
Closed: 9 years ago7 years ago
Resolution: --- → INCOMPLETE
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: