Closed
Bug 513960
Opened 15 years ago
Closed 15 years ago
talos should go orange on failure to post to graph server
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: anodelman, Assigned: anodelman)
Attachments
(2 files, 1 obsolete file)
505 bytes, patch | rdoherty: review+, anodelman: checked-in+
8.89 KB, patch | catlee: review+, anodelman: checked-in+
Talos currently stays green when it fails to post to the graph server; it should report the error instead. We should first try to resend the data 3-5 times before giving up and going orange.
Assignee
Updated•15 years ago
Assignee: nobody → anodelman
Assignee
Comment 1•15 years ago
Some cleanup of how we generate correctly formatted data to send to the graph server, mostly getting rid of temp files.
Attachment #398466 - Flags: review?(catlee)
Assignee
Comment 2•15 years ago
Found with the new error reporting code. I figured it can be part of this bug, since we can't roll out the error reporting until we are reasonably sure that it won't cause everything to burn.
Attachment #398529 - Flags: review?(rdoherty)
Comment 3•15 years ago
Comment on attachment 398529
[Checked in]add extra character (%) to allowed string in graph server

Ran tests, all pass. Also changed

    assert c.StringValidator.validate('1Aa9Zz._()-+ ') == '1Aa9Zz._()-+ '

to

    assert c.StringValidator.validate('1Aa9Zz._()%-+ ') == '1Aa9Zz._()%-+ '

in server/pyfomatic/test/test_collect.py to verify it will accept a %. If that could be included when committing it would make it even awesomer :)
Attachment #398529 - Flags: review?(rdoherty) → review+
Comment 4•15 years ago
Comment on attachment 398466
report graph posting errors, try to send 5 times before failing out

>+    #send all the strings along to the graph server
>+    for data_string in result_strings:
>+      RETRIES = 5
>+      times = 0
>+      msg = ""
>+      while (times < RETRIES):
>+        try:
>+          utils.stamped_msg("Transmitting test: " + testname, "Started")
>+          links += process_Request(post_file.post_multipart(results_server, results_link, [("key", "value")], [("filename", "data_string", data_string)]))
>+          break
>+        except talosError, e:
>+          times += 1
>+          msg = e.msg
>+      if times == RETRIES:
>+        raise talosError("Failed to send data %d times... quitting\n%s" % (RETRIES, msg))

There should be a time.sleep() call in the except block so that the graph server has a chance to recover from whatever problem it's experiencing. It should also increase the time before the next retry each time through the loop. E.g. sleep for 5 seconds after the first failure, 15 seconds after the second, 45 after the 3rd, etc.
Attachment #398466 - Flags: review?(catlee) → review-
Assignee
Comment 5•15 years ago
Wait between each attempt to send to graph server. Double wait time after each failure.
Attachment #398466 - Attachment is obsolete: true
Attachment #398760 - Flags: review?(catlee)
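The retry behaviour discussed in comments 4 and 5 can be sketched roughly as follows. This is only an illustration in modern Python, not the checked-in patch: `send_with_backoff`, `PostError`, the `send` callable and the 5-second initial wait are assumed names and values chosen for the example.

```python
import time


class PostError(Exception):
    """Hypothetical stand-in for the harness's own error type."""


def send_with_backoff(send, retries=5, initial_wait=5.0, factor=2.0,
                      sleep=time.sleep):
    """Call send() until it succeeds or `retries` attempts have failed.

    After each failed attempt except the last, wait before retrying and
    multiply the wait by `factor` (2.0 doubles it, as in comment 5).
    """
    wait = initial_wait
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return send()
        except PostError as e:
            last_error = e
            if attempt < retries:
                # Give the server time to recover before the next attempt.
                sleep(wait)
                wait *= factor
    # Every attempt failed: surface the error so the run can be flagged.
    raise PostError("Failed to send data %d times... quitting\n%s"
                    % (retries, last_error))
```

With `factor=3.0` the waits follow the 5, 15, 45 second schedule suggested in comment 4; the `sleep` parameter exists only so the delays can be replaced in tests.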
Updated•15 years ago
Attachment #398760 - Flags: review?(catlee) → review+
Assignee
Comment 6•15 years ago
Comment on attachment 398529
[Checked in]add extra character (%) to allowed string in graph server
changeset: 241:0b8873afb30f
Attachment #398529 - Attachment description: add extra character (%) to allowed string in graph server → [Checked in]add extra character (%) to allowed string in graph server
Attachment #398529 - Flags: checked-in+
Assignee
Comment 7•15 years ago
Changes to graph server pushed to production.
Assignee
Updated•15 years ago
Attachment #398760 - Attachment description: report graph posting errors, try to send 5 times before failing out (take 2) → [checked in]report graph posting errors, try to send 5 times before failing out (take 2)
Attachment #398760 - Flags: checked-in+
Assignee
Comment 8•15 years ago
Successfully reported graph server errors.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering