Balrog submitter should retry smarter

RESOLVED FIXED

Status

Release Engineering
General Automation
P1
major
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: rail, Assigned: rail)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

(Assignee)

Description

3 years ago
We submit 2 PUT requests for every nightly builds. If the second one fails, we start from the first one. This adds unnecessary entries into the history table. We should retry 2 requests separately.
(Assignee)

Updated

3 years ago
Assignee: nobody → rail
Severity: normal → major
Priority: -- → P1
(Assignee)

Comment 1

3 years ago
Created attachment 8639470 [details] [diff] [review]
idempotent_submitter-tools.diff

This is an untested patch. It relies on dictionary comparison, which surprisingly works. I can test the patch by backing a custom funsize-balrog-submitter docker image with this patch applied.

One thing that bothers me is the test case, which uses mock's side_effects, what is a sign that the code could be refactored. I'm not sure if it's worth to make a big change for a simple if/else case...
Attachment #8639470 - Flags: feedback?(bhearsum)
Comment on attachment 8639470 [details] [diff] [review]
idempotent_submitter-tools.diff

Per IRC, I think it would be nicer to add retry calls around each update_build method. This gives you better granularity (retries will only happen for very specific failures), and avoids the need to check for existing data. It probably simplifies your test cases a lot too.

Might want to add retries around the get_data calls as well.
(Assignee)

Comment 3

3 years ago
Comment on attachment 8639470 [details] [diff] [review]
idempotent_submitter-tools.diff

Thanks, I'll update the patch
Attachment #8639470 - Flags: feedback?(bhearsum)
(Assignee)

Comment 4

3 years ago
Created attachment 8639650 [details] [diff] [review]
redo.diff
Attachment #8639650 - Flags: review?(bhearsum)
Comment on attachment 8639650 [details] [diff] [review]
redo.diff

Review of attachment 8639650 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/python/balrog/submitter/cli.py
@@ +299,5 @@
> +                                                           data)),
> +                alias=json.dumps(alias),
> +                schemaVersion=schemaVersion, data_version=data_version)
> +
> +        retry(update_dated, sleeptime=10)

Do you want to be more specific about how you retry? retrier would let you deal with specific http return codes: http://mxr.mozilla.org/build/source/tools/buildfarm/utils/repository_manifest.py#122

It's not going to be a huge problem if you just retry for anything, so up to you.
Attachment #8639650 - Flags: review?(bhearsum) → review+
(Assignee)

Comment 6

3 years ago
Comment on attachment 8639650 [details] [diff] [review]
redo.diff

https://hg.mozilla.org/build/tools/rev/346d313d1180

I pushed it without any changes. I may customize the list of exceptions in the future, when we have more data on common errors happening here.
Attachment #8639650 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(Assignee)

Comment 7

3 years ago
Probably I should work on the first patch to make sure we don't submit blobs if they are already there.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 8

3 years ago
Created attachment 8643339 [details] [diff] [review]
idempotent_submission-tools.diff

Here we go. This one checks both dated and latest with moar tests.
Attachment #8639470 - Attachment is obsolete: true
Attachment #8643339 - Flags: review?(bhearsum)
Comment on attachment 8643339 [details] [diff] [review]
idempotent_submission-tools.diff

Review of attachment 8643339 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine. The only thing I could say is that it might be useful to add a test where both dated and latest are up to date already.
Attachment #8643339 - Flags: review?(bhearsum) → review+
(Assignee)

Comment 10

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #9)
> Comment on attachment 8643339 [details] [diff] [review]
> idempotent_submission-tools.diff
> 
> Review of attachment 8643339 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks fine. The only thing I could say is that it might be useful to add a
> test where both dated and latest are up to date already.

The last one (test_same_latest_data()) covers that:

  self.assertEqual(update_build.call_count, 0)
(Assignee)

Comment 11

3 years ago
Comment on attachment 8643339 [details] [diff] [review]
idempotent_submission-tools.diff

https://hg.mozilla.org/build/tools/rev/224af2cb8bcc
Attachment #8643339 - Flags: checked-in+
(Assignee)

Comment 12

3 years ago
Created attachment 8643759 [details] [diff] [review]
idempotent_submission-tools-1.diff

A small followup for the previous one. This handles the cases where we have multiple partials and we should compare subsets of the blob.
Attachment #8643759 - Flags: review?(bhearsum)
Attachment #8643759 - Flags: review?(bhearsum) → review+
(Assignee)

Comment 13

3 years ago
Comment on attachment 8643759 [details] [diff] [review]
idempotent_submission-tools-1.diff

https://hg.mozilla.org/build/tools/rev/6bcab61fa67d
Attachment #8643759 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.