Closed
Bug 890302
Opened 12 years ago
Closed 12 years ago
Increase timeout in test-mar-url.sh
Categories
(Release Engineering :: Release Automation, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 894368
People
(Reporter: jhopkins, Assigned: jhopkins)
Details
Attachments
(1 file)
1.15 KB,
patch
|
jhopkins
:
review+
jhopkins
:
checked-in+
|
Details | Diff | Splinter Review |
At http://hg.mozilla.org/build/tools/file/2787a5b6689e/release/test-mar-url.sh#l6 there is this line:
curl --retry 5 --retry-max-time 30 -k -s -I -L "${mar_url}" > "${mar_headers_file}" 2>&1
This will only retry up to 5 times or 30 seconds _total_ time for all retries, whichever comes first.
We should bump this up to something like
--retry 50 --retry-max-time 300
to avoid failing the whole verify operation on transient failures.
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → pmoore
Comment 1•12 years ago
|
||
Hi John,
I've also updated retry attempts and timeout options for get-update-xml.sh, which is the script called by final-verification.sh that fetches the update.xml files, which are then parsed in order to retrieve the urls which are then tested in test-mar-url.sh.
Thanks,
Pete
Attachment #773233 -
Flags: review?(jhopkins)
Updated•12 years ago
|
Assignee: pmoore → jhopkins
Assignee | ||
Updated•12 years ago
|
Attachment #773233 -
Flags: review?(jhopkins) → review+
Assignee | ||
Comment 2•12 years ago
|
||
Comment on attachment 773233 [details] [diff] [review]
Increased timeouts and retry attempts for curl operations during final-verification.sh steps
https://hg.mozilla.org/build/tools/rev/32a4053c58e5
Attachment #773233 -
Flags: checked-in+
Assignee | ||
Comment 3•12 years ago
|
||
Reopen if there are any problems.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 4•12 years ago
|
||
This change may not have done what was expected, because of this from the man page:
--retry <num>
If a transient error is returned when curl tries to perform a transfer, it will retry this number of times before giving up. ... Transient error means either: a timeout, an FTP 5xx response code or an HTTP 5xx response code.
We're getting 408 responses instead.
Comment 5•12 years ago
|
||
I think we can try some extra tactics to reduce the http 408 errors that we are now getting:
1) detect http/408 errors, and retry automatically (as per Nick's comment above, curl is not natively retrying server timeouts, so we could do this either using retry.py or handle directly in the shell script(s))
2) reduce number of parallel processes, to reduce load on CDN (suggest 48 instead of 128)
I'd also like to simplify the error reporting when all update.xml files are consistent, so that not every identical copy is output in the failure report (only when they are not identical). In the case they are all identical, just one copy would be shown, with an explanation that they were all identical.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•12 years ago
|
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → DUPLICATE
Updated•12 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•