Closed
Bug 762922
Opened 13 years ago
Closed 12 years ago
improve signing client retry logic
Categories
(Release Engineering :: General, enhancement, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Assigned: catlee)
Details
(Whiteboard: [signing])
Attachments
(1 file)
3.31 KB, patch | bhearsum: review+ | catlee: checked-in+
The signing client currently seems to retry the same server 20 times before moving on to another one. For example:
2012-06-07 17:09:41,734 - a53774f6112fb2458853012d2ea02a21226f4e03: processing mac/is/Thunderbird 14.0b1.dmg on https://mac-signing2.srv.releng.scl3.mozilla.com:9120
2012-06-07 17:10:56,834 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:12:13,440 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:13:30,014 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:14:46,606 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:16:03,195 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:17:19,787 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:18:36,360 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:19:52,937 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:21:09,529 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:22:26,123 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:23:42,700 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:24:59,289 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:26:15,864 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:27:32,462 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:28:49,042 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:30:05,615 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:31:22,202 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:32:38,794 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:33:55,363 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:35:11,947 - a53774f6112fb2458853012d2ea02a21226f4e03: connection error; trying again soon
2012-06-07 17:35:12,948 - a53774f6112fb2458853012d2ea02a21226f4e03: giving up after 20 tries
2012-06-07 17:35:13,098 - a53774f6112fb2458853012d2ea02a21226f4e03: processing mac/is/Thunderbird 14.0b1.dmg on https://mac-signing4.build.scl1.mozilla.com:9100
2012-06-07 17:35:13,445 - a53774f6112fb2458853012d2ea02a21226f4e03: uploading for signing
2012-06-07 17:35:22,903 - a53774f6112fb2458853012d2ea02a21226f4e03: OK
For batched repacks, this more or less guarantees that your token will expire before your job is done if a signing server is down. The client should be switching servers after fewer failures than this.
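To put a number on the delay, the timestamps in the log above can be subtracted directly. This is only a sketch of the arithmetic: the two timestamps are copied from the log, and the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by the signing client log lines quoted above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return (ended - started).total_seconds()


# First "processing ..." line and the "giving up after 20 tries" line.
first_attempt = "2012-06-07 17:09:41,734"
gave_up = "2012-06-07 17:35:12,948"

total = elapsed_seconds(first_attempt, gave_up)
per_try = total / 20  # the log reports 20 tries against the same server

print(f"stuck on one server for {total / 60:.1f} minutes "
      f"(about {per_try:.0f} seconds per try)")
```

That works out to roughly 25.5 minutes spent on a single dead server, at a bit over 76 seconds per try, before the client moved to the server that answered in under 10 seconds.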
Comment 1•13 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #0)
> For batched repacks, this more or less guarantees that your token will
> expire before your job is done if a signing server is down. The client
> should be switching servers after fewer failures than this.
Do we have to wait before switching servers at all, i.e. can we iterate through all possible servers before 'trying again soon' on each cycle?
Reporter
Comment 2•13 years ago
(In reply to Chris Cooper [:coop] from comment #1)
> (In reply to Ben Hearsum [:bhearsum] from comment #0)
> > For batched repacks, this more or less guarantees that your token will
> > expire before your job is done if a signing server is down. The client
> > should be switching servers after fewer failures than this.
>
> Do we have to wait before switching servers at all, i.e. can we iterate
> through all possible servers before 'trying again soon' on each cycle?
We wait right now because we're retrying the same request to the same server, and want to give it a chance to come back up first. Something like this would probably be better:
* Try server A
* If that fails, try server B
* If that fails, try server C
* If that fails, wait N seconds and try them all again.
Probably should shuffle the servers, though.
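The steps above could be sketched roughly as follows. This is not the signing client's code: `try_all_servers`, `attempt`, `max_cycles`, and `wait_seconds` are names invented for the example, and `attempt` stands in for whatever actually submits the file to one server and reports success.

```python
import random
import time
from typing import Callable, Optional, Sequence


def try_all_servers(
    servers: Sequence[str],
    attempt: Callable[[str], bool],
    max_cycles: int = 3,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try every server once per cycle; only wait after a whole cycle fails.

    Returns the server that succeeded, or None if every cycle failed.
    """
    order = list(servers)
    random.shuffle(order)  # spread load instead of always hitting server A first

    for cycle in range(max_cycles):
        for server in order:
            if attempt(server):
                return server
        # Every server failed this cycle: wait N seconds, then try them all again.
        if cycle < max_cycles - 1:
            sleep(wait_seconds)
    return None
```

With this shape, one dead server costs a single failed attempt per cycle rather than 20 retries in a row, and the wait only happens when every server has failed.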
Assignee
Updated•13 years ago
Whiteboard: [signing]
Reporter
Updated•13 years ago
Severity: normal → enhancement
Priority: -- → P3
Assignee
Updated•12 years ago
Assignee: nobody → catlee
Assignee
Comment 3•12 years ago
This moves handling of multiple URLs inside remote_signfile.
The URLs are first shuffled and then tried in order. If we fail on one URL, that URL is moved to the end of the list. I think it's worthwhile to keep the small sleep that's in there in case something network-wide is failing.
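For readers without the patch open, the behaviour described in this comment might look something like the sketch below. It is an illustration only and not the contents of attachment 651979: the real `remote_signfile` signature is not shown in this bug, so `sign_with_rotation`, `sign_once`, `max_tries`, and `pause` are all hypothetical names.

```python
import random
import time
from collections import deque
from typing import Callable, Iterable, Optional


def sign_with_rotation(
    urls: Iterable[str],
    sign_once: Callable[[str], bool],
    max_tries: int = 20,
    pause: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Shuffle the URLs, try them in order, and push each failure to the back.

    Returns the URL that signed successfully, or None after max_tries failures.
    """
    shuffled = list(urls)
    random.shuffle(shuffled)
    queue = deque(shuffled)
    if not queue:
        return None

    for _ in range(max_tries):
        url = queue[0]
        if sign_once(url):
            return url
        # Move the failed URL to the end so the next try goes to a different server.
        queue.rotate(-1)
        # Small pause in case the failure is network-wide rather than one server.
        sleep(pause)
    return None
```

Compared with retrying one server 20 times, the same retry budget is now spread across all servers, so a single server being down only delays the job by one failed attempt plus the short pause.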
Attachment #651979 - Flags: review?(bhearsum)
Reporter
Comment 4•12 years ago
Comment on attachment 651979
move url retrying into remote_signfile
Review of attachment 651979:
-----------------------------------------------------------------
Looks reasonable to me.
Attachment #651979 - Flags: review?(bhearsum) → review+
Assignee
Updated•12 years ago
Attachment #651979 - Flags: checked-in+
Assignee
Updated•12 years ago
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering
Updated•7 years ago
Component: General Automation → General