Closed Bug 1187962 Opened 9 years ago Closed 9 years ago

Balrog submitter should retry smarter

Categories

(Release Engineering :: General, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rail, Assigned: rail)

References

Details

Attachments

(3 files, 1 obsolete file)

We submit 2 PUT requests for every nightly builds. If the second one fails, we start from the first one. This adds unnecessary entries into the history table. We should retry 2 requests separately.
Assignee: nobody → rail
Severity: normal → major
Priority: -- → P1
Attached patch idempotent_submitter-tools.diff (obsolete) — Splinter Review
This is an untested patch. It relies on dictionary comparison, which surprisingly works. I can test the patch by backing a custom funsize-balrog-submitter docker image with this patch applied.

One thing that bothers me is the test case, which uses mock's side_effects, what is a sign that the code could be refactored. I'm not sure if it's worth to make a big change for a simple if/else case...
Attachment #8639470 - Flags: feedback?(bhearsum)
Comment on attachment 8639470 [details] [diff] [review]
idempotent_submitter-tools.diff

Per IRC, I think it would be nicer to add retry calls around each update_build method. This gives you better granularity (retries will only happen for very specific failures), and avoids the need to check for existing data. It probably simplifies your test cases a lot too.

Might want to add retries around the get_data calls as well.
Comment on attachment 8639470 [details] [diff] [review]
idempotent_submitter-tools.diff

Thanks, I'll update the patch
Attachment #8639470 - Flags: feedback?(bhearsum)
Attached patch redo.diffSplinter Review
Attachment #8639650 - Flags: review?(bhearsum)
Comment on attachment 8639650 [details] [diff] [review]
redo.diff

Review of attachment 8639650 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/python/balrog/submitter/cli.py
@@ +299,5 @@
> +                                                           data)),
> +                alias=json.dumps(alias),
> +                schemaVersion=schemaVersion, data_version=data_version)
> +
> +        retry(update_dated, sleeptime=10)

Do you want to be more specific about how you retry? retrier would let you deal with specific http return codes: http://mxr.mozilla.org/build/source/tools/buildfarm/utils/repository_manifest.py#122

It's not going to be a huge problem if you just retry for anything, so up to you.
Attachment #8639650 - Flags: review?(bhearsum) → review+
Comment on attachment 8639650 [details] [diff] [review]
redo.diff

https://hg.mozilla.org/build/tools/rev/346d313d1180

I pushed it without any changes. I may customize the list of exceptions in the future, when we have more data on common errors happening here.
Attachment #8639650 - Flags: checked-in+
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Probably I should work on the first patch to make sure we don't submit blobs if they are already there.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Here we go. This one checks both dated and latest with moar tests.
Attachment #8639470 - Attachment is obsolete: true
Attachment #8643339 - Flags: review?(bhearsum)
Comment on attachment 8643339 [details] [diff] [review]
idempotent_submission-tools.diff

Review of attachment 8643339 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine. The only thing I could say is that it might be useful to add a test where both dated and latest are up to date already.
Attachment #8643339 - Flags: review?(bhearsum) → review+
(In reply to Ben Hearsum [:bhearsum] from comment #9)
> Comment on attachment 8643339 [details] [diff] [review]
> idempotent_submission-tools.diff
> 
> Review of attachment 8643339 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks fine. The only thing I could say is that it might be useful to add a
> test where both dated and latest are up to date already.

The last one (test_same_latest_data()) covers that:

  self.assertEqual(update_build.call_count, 0)
Comment on attachment 8643339 [details] [diff] [review]
idempotent_submission-tools.diff

https://hg.mozilla.org/build/tools/rev/224af2cb8bcc
Attachment #8643339 - Flags: checked-in+
A small followup for the previous one. This handles the cases where we have multiple partials and we should compare subsets of the blob.
Attachment #8643759 - Flags: review?(bhearsum)
Attachment #8643759 - Flags: review?(bhearsum) → review+
Comment on attachment 8643759 [details] [diff] [review]
idempotent_submission-tools-1.diff

https://hg.mozilla.org/build/tools/rev/6bcab61fa67d
Attachment #8643759 - Flags: checked-in+
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: