Closed Bug 848227 Opened 12 years ago Closed 12 years ago

make mozharness retries generic

Categories

(Release Engineering :: Applications: MozharnessCore, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Assigned: mozilla)

References

Details

(Whiteboard: [mozharness])

Attachments

(1 file)

Currently we have a lot of retry logic in various places in mozharness, but they each have their own logic to do those retries.

We should make this generic.
Run a python call (that can be a self.run_command() or other), up to max retries.

A check for "good" (status 0 could be a default check, but parsing output [whitelist/blacklist], expecting a value or something in a set of values, etc.).

A time.sleep() between retries that can be set to 0.

A max_retries as well as a final step.

The final step could be used to set buildbot RETRY status if applicable.
Blocks: 847727
Assignee: nobody → aki
Attached patch generic retry()Splinter Review
Ok, this is another big one, sorry about that.
My work is here: https://github.com/escapewindow/mozharness/compare/generic-retry
I can break this patch up if that helps.

This patch:
* merges OSMixin and ShellMixin.  They had interdependencies, and I don't think we ever inherited one without the other.  Hello ScriptMixin.
* creates a generic ScriptMixin.retry() method that's based off of build/tools' util.retry.retry().
* ports various methods to use retry() instead of homegrown versions.
** This may result in different error messages, different num_retries and sleep times, etc.
** download_file() will now do an nslookup on every URLError, not just the final one.
** vcs_checkout() will rmtree(dest) on every failure, even the final attempt
* also gets a few more files pep8-whitespace-compliant.
* updates setup.py to add test dependencies.  (maybe I should add mercurial==1.7.3 until bug 828029 is fixed?)
* adds the retry tests from build/tools into test_base_script.py

This is not only a large patch, but it touches a lot of critical infrastructure methods.  I'd like to land on default, and get a green run on all Cedar platforms, before merging to production.
Attachment #722993 - Flags: review?(rail)
Comment on attachment 722993 [details] [diff] [review]
generic retry()

Review of attachment 722993 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/base/script.py
@@ +154,5 @@
> +    def _download_file(self, url, file_name):
> +        """ Helper script for download_file()
> +            """
> +        try:
> +            f = urllib2.urlopen(url)

Not a blocker, probably could be done as a separate bug/patch. It would be great to add and pass timeout to urlopen to avoid buildbot killing the process. Even None is OK.
Attachment #722993 - Flags: review?(rail) → review+
(In reply to Rail Aliiev [:rail] from comment #4)
> Comment on attachment 722993 [details] [diff] [review]
> generic retry()
> 
> Review of attachment 722993 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mozharness/base/script.py
> @@ +154,5 @@
> > +    def _download_file(self, url, file_name):
> > +        """ Helper script for download_file()
> > +            """
> > +        try:
> > +            f = urllib2.urlopen(url)
> 
> Not a blocker, probably could be done as a separate bug/patch. It would be
> great to add and pass timeout to urlopen to avoid buildbot killing the
> process. Even None is OK.

Makes sense; that may help the downloads that hit bug 840305.
Looks like we need to make sure there are no python 2.5 test machines first, though.
Comment on attachment 722993 [details] [diff] [review]
generic retry()

http://hg.mozilla.org/build/mozharness/rev/0815e7ebbd57

Merging m-c to cedar to make sure nothing is egregiously broken before merging to production.
Attachment #722993 - Flags: checked-in+
Gaia UI tests look broken, but I *think* those are mozpool-client retries.
https://tbpl.mozilla.org/php/getParsedLog.php?id=20537486&tree=Cedar&full=1#error0
Merged to production.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → Mozharness
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: