Closed Bug 833062 Opened 12 years ago Closed 12 years ago

release runner shouldn't die when it gets ISE 500 errors from ship it

Categories

(Release Engineering :: Release Automation, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

Details

(Whiteboard: [shipit])

Attachments

(3 files, 2 obsolete files)

1) Release runner should retry connections to ship it.
2) It shouldn't halt, even if it can't update status. It should still exit with a non-zero status code, but interrupting the start of the release because it can't update status is bad.
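For illustration, a minimal Python sketch of the two requested behaviors (all names here, including ReleaseAPI and retrying(), are hypothetical stand-ins rather than the landed patch): the ship it call is retried, and a persistent failure is logged and reported via a non-zero exit code instead of halting the release.

import logging
import sys
import time

log = logging.getLogger(__name__)

def retrying(func, attempts=5, sleeptime=30):
    """Call func(), retrying on IOError with a pause between attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except IOError:
            log.warning("Attempt %d/%d failed", attempt, attempts)
            if attempt == attempts:
                raise
            time.sleep(sleeptime)

class ReleaseAPI(object):
    """Stand-in for the kickoff API client."""
    def update(self, name, status):
        raise IOError("ship it returned ISE 500")  # simulate the failure

def main():
    api = ReleaseAPI()
    name = "firefox-x.y-build1"
    status_ok = True
    try:
        # 1) retry connections to ship it
        retrying(lambda: api.update(name, status='Started'), sleeptime=0)
    except IOError:
        # 2) don't halt: log it and keep starting the release
        log.warning("Cannot update status for %s; continuing", name)
        status_ok = False
    # ... start the release as usual here ...
    return 0 if status_ok else 1

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    sys.exit(main())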
Attached patch retry (obsolete) — Splinter Review
An easy patch for 1). Everything except the self.config changes is PEP8 cleanup.
Attachment #706254 - Flags: review?(bhearsum)
Attached patch retry (obsolete) — Splinter Review
Actually this one includes a fix for 2) as well.
Attachment #706254 - Attachment is obsolete: true
Attachment #706254 - Flags: review?(bhearsum)
Attachment #706255 - Flags: review?(bhearsum)
Comment on attachment 706255 [details] [diff] [review]
retry

Review of attachment 706255 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/python/kickoff/api.py
@@ +73,5 @@
>          url_template_vars = {'name': name}
> +        try:
> +            return self.request(method='POST', data=data, url_template_vars=url_template_vars).content
> +        except IOError:
> +            log.warn("Cannot set status for %s to '%s'. Skipping..." % (name, status))

Hm, I'm not sure that eating errors in the API is the right thing to do. The logging is good, but I think the consumer should be allowed to make a decision about what to do upon error. What do you think?
Comment on attachment 706255 [details] [diff] [review]
retry

(In reply to Ben Hearsum [:bhearsum] from comment #3)
> Hm, I'm not sure that eating errors in the API is the right thing to do. The
> logging is good, but I think the consumer should be allowed to make a
> decision about what to do upon error. What do you think?

I'll rework the patch (should.stop.posting.patches.after.midnight)
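To sketch the direction of that rework (illustrative only; ReleaseAPI, request(), and set_status() are stand-ins, not the real kickoff module): the API method lets the IOError propagate, and the consumer catches it and decides what the failure means for the release.

import logging

log = logging.getLogger(__name__)

class ReleaseAPI(object):
    """Sketch of an API client; request() is a failing stand-in."""
    def request(self, **kwargs):
        raise IOError("simulated ship it error")

    def update(self, name, **data):
        # No try/except here: the IOError propagates to the consumer,
        # which knows whether a failed status update should abort anything.
        return self.request(method='POST', data=data,
                            url_template_vars={'name': name})

# Consumer side (release runner): it makes the decision.
def set_status(api, name, status):
    try:
        api.update(name, status=status)
    except IOError:
        log.warning("Cannot set status for %s to '%s'. Skipping...",
                    name, status)

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    set_status(ReleaseAPI(), "firefox-x.y-build1", "Started")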
Attachment #706255 - Flags: review?(bhearsum)
Assignee: nobody → rail
Attached patch retry — Splinter Review
Attachment #706255 - Attachment is obsolete: true
Attachment #707055 - Flags: review?(bhearsum)
Attachment #707055 - Flags: review?(bhearsum) → review+
I deployed the changes on bm36 and restarted supervisord.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Not just status updates going to ship-it: release queries shouldn't fail easily either.

2013-02-01 20:48:34,523 - DEBUG - Fetching release requests
2013-02-01 20:48:34,621 - ERROR - Caught HTTPError:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Service Unavailable</title>
    <style type="text/css">
      body, p, h1 { font-family: Verdana, Arial, Helvetica, sans-serif; }
      h2 { font-family: Arial, Helvetica, sans-serif; color: #b10b29; }
    </style>
  </head>
  <body>
    <h2>Service Unavailable</h2>
    <p>The service is temporarily unavailable. Please try again later.</p>
  </body>
</html>
Traceback (most recent call last):
  File "release-runner.py", line 268, in <module>
    new_releases = rr.get_release_requests()
  File "release-runner.py", line 73, in get_release_requests
    new_releases = self.releases_api.getReleases()
  File "/home/cltbld/release-runner/tools/lib/python/kickoff/api.py", line 61, in getReleases
    return json.loads(self.request(params={'ready': ready, 'complete': complete}).content)
  File "/home/cltbld/release-runner/tools/lib/python/kickoff/api.py", line 51, in request
    auth=self.auth, params=params)
  File "/home/cltbld/release-runner/tools/lib/python/vendor/requests-0.10.8/requests/sessions.py", line 203, in request
    r.send(prefetch=prefetch)
  File "/home/cltbld/release-runner/tools/lib/python/vendor/requests-0.10.8/requests/models.py", line 585, in send
    self.response.raise_for_status()
  File "/home/cltbld/release-runner/tools/lib/python/vendor/requests-0.10.8/requests/models.py", line 816, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 500 Server Error
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch retry 2 — Splinter Review
Attachment #709432 - Flags: review?(bhearsum)
Priority: -- → P2
Comment on attachment 709432 [details] [diff] [review]
retry 2

Review of attachment 709432 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/release/release-runner.py
@@ +266,5 @@
>  
>  # Main loop waits for new releases, processes them and exits.
>  while True:
>      log.debug('Fetching release requests')
> +    new_releases = retry(rr.get_release_requests)

Shouldn't Requests be doing this retrying? Or are we getting a bad 2xx response in some cases?
It should (and I hope it does! :) ), but there is no interval between requests, which may cause this kind of error.
(In reply to Rail Aliiev [:rail] from comment #11)
> It should (and I hope it does! :) ), but there is no interval between
> requests, which may cause this kind of error.

Can we set an interval between Requests' retries then? It seems better to use that rather than our own if possible.
Per IRC, Requests doesn't support sleeping between retries. Given that, we should use our own retry everywhere rather than max_retries. Can you make that change?
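For reference, a minimal sketch of what an in-house retry() with sleeping between attempts could look like (the signature here is illustrative; the actual helper lives in Mozilla's tools library):

import logging
import time

log = logging.getLogger(__name__)

def retry(action, attempts=5, sleeptime=60,
          retry_exceptions=(Exception,), args=(), kwargs=None):
    """Call action(*args, **kwargs); on a retriable exception, sleep
    and try again, re-raising after the final attempt."""
    kwargs = kwargs or {}
    for attempt in range(1, attempts + 1):
        try:
            return action(*args, **kwargs)
        except retry_exceptions:
            log.warning("Attempt %d/%d of %s failed", attempt, attempts,
                        getattr(action, '__name__', action))
            if attempt == attempts:
                raise
            time.sleep(sleeptime)

# As used in the patch under review:
#     new_releases = retry(rr.get_release_requests)

Unlike Requests' max_retries, this pauses between attempts, which gives a briefly unavailable ship it instance time to recover.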
Comment on attachment 709432 [details] [diff] [review]
retry 2

I will update the patch with the in-house retry() :)
Attachment #709432 - Flags: review?(bhearsum)
Whiteboard: [kickoff] → [shipit]
Assignee: rail → nobody
Priority: P2 → --
Since we only sleep for 60s after release runner dies, P3.
Priority: -- → P3
I tested this locally by polling a local ship it instance that wasn't running. It retried a few times, then I started the instance and it successfully got data back from it.
Assignee: nobody → bhearsum
Status: REOPENED → ASSIGNED
Attachment #728255 - Flags: review?(rail)
Attachment #728255 - Flags: review?(rail) → review+
Comment on attachment 728255 [details] [diff] [review]
use our retry in the api

Landed and updated bm36.
Attachment #728255 - Flags: checked-in+
I'm pretty sure this will be fixed now.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
