Closed Bug 574473 Opened 12 years ago Closed 11 years ago

Merge mozharness into mainstream use

Categories

(Release Engineering :: Applications: MozharnessCore, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aki, Assigned: aki)

References

Details

(Whiteboard: [mozharness])

Attachments

(4 files, 2 obsolete files)

Lower priority, but may involve lots of changes.
http://hg.mozilla.org/users/asasaki_mozilla.com/mozharness/file/PRODUCTION

Also
https://bugzilla.mozilla.org/attachment.cgi?id=452148 (remove signdebs.mk) once that's reviewed and landed.
I've been making changes to the default branch for multilocale, and we may need to make independent changes soon to deb signing for the 2.0a1 release.

I could create a named branch in my user repo, but I figure this is a good time to land in tools/.
Attachment #464132 - Flags: review?(jhford)
Comment on attachment 464132 [details] [diff] [review]
PRODUCTION mozharness tag -> tools/harness/

I haven't done a full review, but I have a few comments before I start an in-depth review:

According to PEP 8 (http://www.python.org/dev/peps/pep-0008/)

      Modules should have short, all-lowercase names.  Underscores can be used
      in the module name if it improves readability.  Python packages should
      also have short, all-lowercase names, although the use of underscores is
      discouraged.

I don't see any reason that we shouldn't follow this guidance since this is the initial implementation and don't risk breaking much. An example is that Log.py should be log.py.  

Also, from PEP8
      Note: When using abbreviations in CapWords, capitalize all the letters
      of the abbreviation.  Thus HTTPServerError is better than
      HttpServerError.

It would be great to rename SshErrorRegex to SSHErrorRegex (and make it plural while there)

I haven't read through the entire harness yet, but it seems like BasicFunctions and other classes in log.py are in danger of becoming god objects, if they aren't already.

I would like to see greater separation of configuration, logging, stock commands and subprocess management logic.  It seems like it'd be difficult to test small changes to implementing test runs.  I also fear that each implementing script is going either have to add to BasicFunctions or duplicate logic found in other scripts.  I see that the multilocalte repack step has logic for doing hg operations.  Would a theoretical build script have different logic? Would it be copy and pasted in from somewhere else?

On the topic of subprocess management, I see |stdout, stderr = p.communicate()| in line 144 of log.py.  According to the API docs, this might not be good for something like a 20MB+ build log (http://docs.python.org/library/subprocess.html#subprocess.Popen.communicate).  

I would prefer the command line arguments to be in the format --this-has-lots-of-words.  We use this convention in a lots of places in our own code and the mozilla-product python scripts

Some examples:
http://hg.mozilla.org/build/tools/file/826eee2affb1/buildbot-helpers/force_release_l10n.py#l157
http://hg.mozilla.org/build/tools/file/826eee2affb1/buildfarm/breakpad/cleanup-breakpad-symbols.py#l33
http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/runxpcshelltests.py#531

For more examples, the following search will show that this is a common convention:
http://mxr.mozilla.org/mozilla-central/search?string=\-\-[a-z]*\-[a-z]&regexp=1&find=\.py&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central 

Conversely, I can't find any examples of the --helloWorld style
http://mxr.mozilla.org/mozilla-central/search?string=\-\-[a-z]*[A-Z]%2B&regexp=1&find=\.py&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central

In this patch, the configuration files (*.json) use camelcasing where our current configuration logic uses '_'.join(words) styling.  I don't see any reason to change to using camelcasing over the established convention.

There is an example in the documentation (http://docs.python.org/library/optparse.html?highlight=optparse#adding-new-actions) that seems to describe what you are trying to accomplish with your MozOptionParser.  Since OptionParser has built in logic for handling lists it seems advantageous to take advantage of it.  It is also confusing that the number of times that you define an option on your command line determines whether or not the option is a list or a single item.

Because this is a very complicated harness, I think some more unit tests would be critical before widespread use.
(In reply to comment #2)
> I haven't read through the entire harness yet, but it seems like BasicFunctions
> and other classes in log.py are in danger of becoming god objects, if they
> aren't already.

Right, I saw this as a problem but haven't found a way around this yet -- do you have suggestions?

Naming changes are fine.  I'll look at the communicate() and other bits too, but I'm probably going to land these on default rather than PRODUCTION, though I'll have to deal with needing to update PRODUCTION config updates at some point soon.
(In reply to comment #3)
> (In reply to comment #2)
> > I haven't read through the entire harness yet, but it seems like BasicFunctions
> > and other classes in log.py are in danger of becoming god objects, if they
> > aren't already.
> 
> Right, I saw this as a problem but haven't found a way around this yet -- do
> you have suggestions?

Well, you could have one class that knows how to take a common format and run that and other classes that understand the high level operations and they know how to render down into a format that the runner class understands.  With that structure.

> Naming changes are fine.  I'll look at the communicate() and other bits too,
> but I'm probably going to land these on default rather than PRODUCTION, though
> I'll have to deal with needing to update PRODUCTION config updates at some
> point soon.

I think that the .communicate() issue is still fairly important.  A way to side-step this issue without changing current behaviour is to open three files, stdin(r+), stdout(w+), stderr(w+) and pass those to the Popen call instead of the subprocess.PIPE.  This will let you write to a file which avoids the problems of storing the entire file in memory.  These files can be then .seek(0)'d and read in for parsing either line by line (.readline()) or in one big go (.readlines()).
(In reply to comment #2)
> Modules should have short, all-lowercase names.

Fixed. I'm not sold on how I'm doing the sys.path stuff and we may want to split lib/ out to a generic area away from the moz-specific configs and scripts, but this works for me for now.

> It would be great to rename SshErrorRegex to SSHErrorRegex (and make it plural
> while there)

Done.  Moved these into errors.py.

> I haven't read through the entire harness yet, but it seems like BasicFunctions
> and other classes in log.py are in danger of becoming god objects, if they
> aren't already.

I consolidated BasicFunctions and SimpleConfig in BaseScript, because it seems like they belong there.

> I would like to see greater separation of configuration, logging, stock
> commands and subprocess management logic.

Me too, further than what I've done already, but I think it's going in the right direction.

> I see that the multilocalte repack step has
> logic for doing hg operations.

Pulled out into MercurialScript (and AbstractMercurialScript)

> On the topic of subprocess management, I see |stdout, stderr = p.communicate()|

Fixed.

I think I want to do a lot more in runCommand() and getOutputFromCommand(), as evidenced by the TODOs. However, things are much better at this point, and I can verify that long running commands output as they go, with timestamps.

(tested with find /)

> I would prefer the command line arguments to be in the format
> --this-has-lots-of-words.

Fixed.

> In this patch, the configuration files (*.json) use camelcasing where our
> current configuration logic uses '_'.join(words) styling.

Fixed.

> There is an example in the documentation
> (http://docs.python.org/library/optparse.html?highlight=optparse#adding-new-actions)
> that seems to describe what you are trying to accomplish with your
> MozOptionParser.

Fixed.

> It is also confusing that the
> number of times that you define an option on your command line determines
> whether or not the option is a list or a single item.

Nope, extend is always a list (as was *append*_split)

> Because this is a very complicated harness, I think some more unit tests would
> be critical before widespread use.

Me too.
Added coverage via easy_install and will add those.
I find it pretty easy to just run one action, sometimes with added self.debug()s, and test that way. But agreed, unit tests would rock.
Attachment #464132 - Attachment is obsolete: true
Attachment #466507 - Flags: review?(jhford)
Attachment #464132 - Flags: review?(jhford)
3 weeks since the fixed patch was attached; no review.
This is blocking a number of Q3 goals.
Comment on attachment 466507 [details] [diff] [review]
copy over non-multilocale files from mozharness e06c9254b5ab

Hey Catlee -- This is a pretty big patch. It might be best to look at the default branch of http://hg.mozilla.org/users/asasaki_mozilla.com/mozharness/ , minus the in-progress multilocale script/json configs.

This is my attempt at writing factory-level build scripts, though you can run actions independently (which is how I'm planning on running the multilocale script).
Attachment #466507 - Flags: feedback?(catlee)
Attachment #466507 - Attachment is obsolete: true
Attachment #466507 - Flags: review?(jhford)
Attachment #466507 - Flags: feedback?(catlee)
Catlee -- I believe I have addressed all your concerns to date, other than full test coverage.  Anything else that would prevent landing an initial pass of this, and/or using mozharness for multilocale?

Syed's porting o' the release builds to 080 make me want to get multilocale out of the way so we can get mobile over as well... hence the ping.
Priority: P5 → P2
At revision e8c258192101, I have:

 lib/config        91% coverage
 lib/errors       100% coverage
 lib/log           96% coverage
 lib/script        90% coverage
Adding adrianp who will be working this semester on mozharness.
I'm basically copying over all the files except .hgtags from users/asasaki_mozilla.com/mozharness into build/mozharness, then doing an hg addremove.  (I can also import all the old changesets for history if desired.)

* There is no direct outside pressure on this.  This is good to do, and should be done, but this should not be a blocker-priority review.
** However, the longer the review takes, the more likely this patch will slowly bitrot.
* This is a huge review, and this is not my first attempt at getting this reviewed. If there's anything I can do to help this along, let me know.
* There are a lot of TODOs; known.  If you have questions about these, let me know.
* Both multil10n.py and signdebs.py are currently live in production, and are currently part of our nightlies and mobile release process.  Not ideal, but driven by time pressures.
* unit.sh provides coverage for >90% of lib/base/. (script.py slipped to 89% recently)
Comment on attachment 487450 [details] [diff] [review]
mozharness addremove at changeset 73bd3e506866

Hoping Coop or Catlee has time for this over the next couple weeks.
Attachment #487450 - Flags: review?(coop)
Attachment #487450 - Flags: review?(catlee)
Whiteboard: [mozharness]
Comment on attachment 487450 [details] [diff] [review]
mozharness addremove at changeset 73bd3e506866

>+# ReadOnlyDict {{{1
>+class ReadOnlyDict(dict):
>+    def __init__(self, dictionary):
>+        self.__lock = False
>+        self.update(dictionary.copy())
>+
>+    def __checkLock__(self):
>+        assert not self.__lock, "ReadOnlyDict is locked!"

I think a single leading underscore is sufficient for __lock, and
__checkLock__.  Double underscores usually indicate special meaning,
particularly with methods.  Check out PEP8, Naming Conventions for some
rationale.

>+# parseConfigFile {{{1
>+def parseConfigFile(file_name, quiet=False):
>+    """Read a config file and return a dictionary.
>+    """
>+    file_path = None
>+    search_path = ['.', os.path.join(sys.path[0], '..', 'configs'),
>+                   os.path.join(sys.path[0], '..', '..', 'configs')]

I think we need to get rid of having this hardcoded list of directories to
search.  Passing in an explicit list of directories to check would be better
than internally searching ., ../configs, and ../../configs

>+# BaseConfig {{{1
>+class BaseConfig(object):
>+    """Basic config setting/getting.
>+    Debating whether to be paranoid about this stuff and put it all in
>+    self._config+rand(10000) and forcing everyone to use methods to access
>+    it, as I did elsewhere to lock down the config during runtime, but
>+    that's a little heavy handed to go with as the default.
>+    """

Should be a comment rather than a docstring?  Is this still needed now that you
have locked configs?

>diff --git a/lib/base/script.py b/lib/base/script.py
>new file mode 100755
>--- /dev/null
>+++ b/lib/base/script.py
>@@ -0,0 +1,422 @@
>+#!/usr/bin/env python
>+"""Generic script objects.
>+"""
>+
>+import os
>+import re
>+import shutil
>+import subprocess
>+import sys
>+import tempfile
>+import urllib2
>+
>+sys.path.insert(1, os.path.dirname(sys.path[0]))

Why is this required?  Also seems like a really bad idea to be modifying the
module search path when you import this module.

>+                               **kwargs)
>+        self.config = rw_config.getReadOnlyConfig()
>+        self.actions = tuple(rw_config.actions)
>+        if os.path.exists("localconfig.json"):
>+            self.move("localconfig.json", "localconfig.json.bak")
>+        rw_config.dumpConfig(file_name="localconfig.json")
>+        self.newLogObj(default_log_level=default_log_level)
>+        """I can definitely see wanting to get more runtime info before
>+        locking -- what's my hg revision? What's the latest ____
>+        in this json feed?  ... that you might want to save for later
>+        for a respin.  But as I think of what I'd want to add
>+        to this list, I keep thinking of more and more things.
>+
>+        Now I'm thinking it's two steps: 1) figure out runtime details;
>+        2) set up configs and run.  (2) can be iterated over multiple
>+        times during respins. (1) should be external to that."""

Use # for comments.

>+    def __lockConfig(self):
>+        self.config.lock()

Only needs to be called _lockConfigs

>diff --git a/scripts/multil10n.py b/scripts/multil10n.py
>new file mode 100755
>--- /dev/null
>+++ b/scripts/multil10n.py
>@@ -0,0 +1,354 @@
>+#!/usr/bin/env python
>+"""multil10n.py
>+
>+Our initial [successful] attempt at a multi-locale repack happened inside
>+of MaemoBuildFactory.  However, this was highly buildbot-intensive,
>+requiring runtime step truncation/creation with large amounts of build
>+properties that disallowed the use of "Force Build" for any multi-locale
>+nightly.
>+
>+To improve things, we're moving the logic slave-side where a dedicated
>+slave can use its cycles determining which locales to repack.
>+
>+Currently oriented towards Android multilocale, bug 563382.
>+
>+TODO: finish work to make this a standalone runnable script.
>+"""
>+
>+import hashlib
>+import os
>+import re
>+import sys
>+
>+# load modules from parent dir
>+sys.path.insert(1, os.path.join(os.path.dirname(sys.path[0]), "lib"))

same here

>+[fennec]
>+name =     Mozilla %(long_catalog_name)s %(locale)s Catalog
>+uri = %(repo_url)s
>+dist = %(platform)s
>+components = %(section)s
>+""" % replace_dict
>+        self.info("Writing install file to %s" % file_path)
>+        if self.config['noop']:
>+            print contents
>+            return
>+        fh = open(file_path, 'w')
>+        print >> fh, contents

fh.write("%s\n" % contents) is better here?  or do you need platform-specific newlines?

Overall, looks good, mostly just python style issues.  Don't see anything blocking landing this now really.

+1000000 for tests!
Attachment #487450 - Flags: review?(catlee) → review+
Comment on attachment 487450 [details] [diff] [review]
mozharness addremove at changeset 73bd3e506866

Woo woo: http://hg.mozilla.org/build/mozharness

I fixed Catlee's nits, other than the sys.path stuff (tracked in bug 611791): http://hg.mozilla.org/build/mozharness/rev/6fa417c68419

I closed the 0.1 branch and removed the PRODUCTION tag.
I pushed the entire repo rather than addremoving, to avoid further bitrotting the Seneca students.

To do:  Write+land patch to point buildbot to build/mozharness instead of users/asasaki_mozilla.com/mozharness.
Attachment #487450 - Flags: checked-in+
After these 3 patches land, you'll be patching http://hg.mozilla.org/build/mozharness for 4.0b3 instead of http://hg.mozilla.org/users/asasaki_mozilla.com/mozharness.  Also, you will no longer have to bump the PRODUCTION tag.

I'll update the docs once this lands.
Attachment #490330 - Flags: review?(lsblakk)
Attachment #490330 - Flags: review?(lsblakk) → review+
Attachment #490331 - Flags: review?(lsblakk) → review+
Attachment #490332 - Flags: review?(lsblakk) → review+
Flags: needs-reconfig?
Flags: needs-reconfig?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Android nightly went green; test deb repo updates went green.
Product: mozilla.org → Release Engineering
Component: Other → Mozharness
You need to log in before you can comment on or make changes to this bug.