Closed Bug 774916 Opened 12 years ago Closed 12 years ago

new mozbase package: mozfile

Categories

(Testing :: Mozbase, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: k0scist, Unassigned)

Details

Attachments

(3 files)

https://wiki.mozilla.org/Auto-tools/Projects/MozBase#File_Handling_-_MozFile

mozfile should take care of common file (and perhaps URL?) related
tasks, such as extracting, etc.  These will mostly be convenience
functions to not have to duplicate logic multiple places.  

For an example of this sort of thing, see

https://bugzilla.mozilla.org/show_bug.cgi?id=774875
https://github.com/mozilla/mozbase/blob/a48971b7cd835835efaeef88e3a7c8171b926174/mozinstall/mozinstall/mozinstall.py#L112
I have to admit that the idea behind this package makes me a little bit sad, but it's probably a good idea to consolidate this collection of hacks somewhere.
Yes, it makes me sad too.  I would really love to see a strong effort battling the uphill battle getting things like this into python proper (pretending for one second that we were actually on python 3.3).  That said, I'd rather gather them somewhere rather than have them as copy+paste workarounds
Another candidate I've worked around several times:

from help(shutil.copytree)

    copytree(src, dst, symlinks=False, ignore=None)
        Recursively copy a directory tree using copy2().
        
        The destination directory must not already exist.

This is actually easily fixable in the python source by changing os.makedirs to be conditional:

http://svn.python.org/projects/python/trunk/Lib/shutil.py

But it can't be monkey-patched :(
Comment on attachment 654654 [details]
just have 2 modules inside at present  ... wat else should i add

I think these two functions are enough for now.  We can also add more as we go forward.  I'm more interested in getting this module out there and getting consumers to use it.

It would be nice to have some tests for these things.

>
>#mozfile.py 

This should be a docstring, not a comment, preferably with a link to the bug

>
># extract tar or zip files to dest 
>
>def _extract(src, dest=None, delete=False):

please name this extract without the opening _ . I realize that's how the original function is, but its probably a mistake there too

>    """
>Takes in a tar or zip file and extracts it to dest
>
>If dest is not specified, extracts to os.path.dirname(src)
>If delete is set to True, deletes the bundle at path
>
>Returns the list of top level files that were extracted
>"""
>
>    import os #not sure if needed almost all files import os 
>    import tarfile
>    import zipfile

please put all imports at the module global level

>    assert not os.path.isfile(dest), "dest cannot be a file"
>
>    if dest is None:
>        dest = os.path.dirname(src)
>    elif not os.path.isdir(dest):
>        os.makedirs(dest)
>
>    if zipfile.is_zipfile(src):
>        bundle = zipfile.ZipFile(src)
>        namelist = bundle.namelist()
>
>        if hasattr(bundle, 'extractall'):
>            # zipfile.extractall doesn't exist in Python 2.5
>            bundle.extractall(path=dest)

We probably shouldn't have this conditional logic.  As long as we support 2.5, it is just better to roll our own rather than having two separate paths of failure. (I know as I've specifically been bitten by this before)

>        else:
>            for name in namelist:
>                filename = os.path.realpath(os.path.join(dest, name))
>                if name.endswith('/'):
>                    os.makedirs(filename)
>                else:
>                    path = os.path.dirname(filename)
>                    if not os.path.isdir(path):
>                        os.makedirs(path)
>                    dest = open(filename, 'wb')
>                    dest.write(bundle.read(name))
>                    dest.close()
>
>    elif tarfile.is_tarfile(src):
>        bundle = tarfile.open(src)
>        namelist = bundle.getnames()
>
>        if hasattr(bundle, 'extractall'):
>            # tarfile.extractall doesn't exist in Python 2.4
>            bundle.extractall(path=dest)

Likewise

>        else:
>            for name in namelist:
>                bundle.extract(name, path=dest)
>    else:
>        return
>
>    bundle.close()
>
>    if delete:
>        os.remove(src)

IMHO this is a misfeature, but fine to leave for now.

>    # namelist returns paths with forward slashes even in windows
>    top_level_files = [os.path.join(dest, name) for name in namelist
>                             if len(name.rstrip('/').split('/')) == 1]
>
>    # namelist doesn't include folders, append these to the list
>    for name in namelist:
>        root = os.path.join(dest, name[:name.find('/')])
>        if root not in top_level_files:
>            top_level_files.append(root)
>
>    return top_level_files
>
>
>
>
>
>
># recursively delete files for windows 
>
>
> 
>def rmdirRecursive(dir):
>    
>    import os 
>    from twisted.python import log

We don't want this import.  Please remove all references to log

>    
>    """This is a replacement for shutil.rmtree that works better under
>    windows. Thanks to Bear at the OSAF for the code."""
>    if not os.path.exists(dir):
>        return
>    if os.path.islink(dir):
>        os.remove(dir)
>        return
>    # Verify the directory is read/write/execute for the current user
>    os.chmod(dir, 0700)
>    # os.listdir below only returns a list of unicode filenames if the parameter is unicode
>    # Thus, if a non-unicode-named dir contains a unicode filename, that filename will get garbled.
>    # So force dir to be unicode.
>    try:
>        dir = unicode(dir, "utf-8")
>    except:
>        log.msg("rmdirRecursive: decoding from UTF-8 failed")

...Like this one

>
>        for name in os.listdir(dir):

Indentation error?

>            full_name = os.path.join(dir, name)
>            # on Windows, if we don't have write permission we can't remove
>            # the file/directory either, so turn that on
>            if os.name == 'nt':
>                if not os.access(full_name, os.W_OK):
>                    # I think this is now redundant, but I don't have an NT
>                    # machine to test on, so I'm going to leave it in place
>                    # -warner
>                    os.chmod(full_name, 0600)
>
>            if os.path.islink(full_name):
>                os.remove(full_name) # as suggested in bug #792
>            elif os.path.isdir(full_name):
>                rmdirRecursive(full_name)
>            else:
>                if os.path.isfile(full_name):
>                os.chmod(full_name, 0700)
>                os.remove(full_name)
>                os.rmdir(dir)
>

r- ing for now until these issues are fixed.  They shouldn't be too bad. Then I can help package and write tests
Attachment #654654 - Flags: feedback?(jhammel) → feedback-
Attached file changes made
Attachment #656508 - Flags: feedback?(jhammel)
Comment on attachment 656508 [details]
changes made

outside of way too much whitespace and some comment cleanup, this mostly looks good to me.  It will need to be packaged, but I'm happy to do that.

Any thoughts, :wlach?
Attachment #656508 - Flags: feedback?(jhammel) → feedback+
(In reply to Jeff Hammel [:jhammel] from comment #12)
> Comment on attachment 656508 [details]
> changes made
> 
> outside of way too much whitespace and some comment cleanup, this mostly
> looks good to me.  It will need to be packaged, but I'm happy to do that.
> 
> Any thoughts, :wlach?

Looks good to me on the whole, minus the comment and whitespace issues that you noticed. Two more things:

* This is kind of nitpicky, but I'd prefer we use pep8-recommended function/variable naming (foo_bar vs. fooBar) for the function/variable names in new mozbase modules. For this case, I'm partial to "rm_rf" vs. rmdirRecursive (nice and concise).
* Let's run pyflakes on this file just to be sure there's no undefined variables or obvious errors. For bonus anal-retentive points, also run pep8 on it.

Tests would indeed be nice here.
(In reply to William Lachance (:wlach) from comment #13)
> (In reply to Jeff Hammel [:jhammel] from comment #12)
> > Comment on attachment 656508 [details]
> > changes made
> > 
> > outside of way too much whitespace and some comment cleanup, this mostly
> > looks good to me.  It will need to be packaged, but I'm happy to do that.
> > 
> > Any thoughts, :wlach?
> 
> Looks good to me on the whole, minus the comment and whitespace issues that
> you noticed. Two more things:
> 
> * This is kind of nitpicky, but I'd prefer we use pep8-recommended
> function/variable naming (foo_bar vs. fooBar) for the function/variable
> names in new mozbase modules. For this case, I'm partial to "rm_rf" vs.
> rmdirRecursive (nice and concise).

How do you feel about rmtree to be in parity with shutil.rmtree?

> * Let's run pyflakes on this file just to be sure there's no undefined
> variables or obvious errors. For bonus anal-retentive points, also run pep8
> on it.

Sounds like a plan.
 
> Tests would indeed be nice here.

Yeah, I'll write a few smoke-screen tests.  We'll also need a setup.py

So, if rmtree is okay, I'll

* rename this
* cleanup whitespace a bit
* run pyflakes
* make a setup.py
* add some smokescreen tests
* put up for review

Sound like a plan?
Oh, and of course:

* add a README.md so that people don't have to blindly guess wth this software does
* update the MozBase wiki page accordingly
(In reply to Jeff Hammel [:jhammel] from comment #14)
> (In reply to William Lachance (:wlach) from comment #13)
> > (In reply to Jeff Hammel [:jhammel] from comment #12)
> > > Comment on attachment 656508 [details]
> > > changes made
> > > 
> > > outside of way too much whitespace and some comment cleanup, this mostly
> > > looks good to me.  It will need to be packaged, but I'm happy to do that.
> > > 
> > > Any thoughts, :wlach?
> > 
> > Looks good to me on the whole, minus the comment and whitespace issues that
> > you noticed. Two more things:
> > 
> > * This is kind of nitpicky, but I'd prefer we use pep8-recommended
> > function/variable naming (foo_bar vs. fooBar) for the function/variable
> > names in new mozbase modules. For this case, I'm partial to "rm_rf" vs.
> > rmdirRecursive (nice and concise).
> 
> How do you feel about rmtree to be in parity with shutil.rmtree?

Sounds good.

> > * Let's run pyflakes on this file just to be sure there's no undefined
> > variables or obvious errors. For bonus anal-retentive points, also run pep8
> > on it.
> 
> Sounds like a plan.
>  
> > Tests would indeed be nice here.
> 
> Yeah, I'll write a few smoke-screen tests.  We'll also need a setup.py
> 
> So, if rmtree is okay, I'll
> 
> * rename this
> * cleanup whitespace a bit
> * run pyflakes
> * make a setup.py
> * add some smokescreen tests
> * put up for review
> 
> Sound like a plan?

Yup, but run pep8 in addition to pyflakes. ;)
So looking at the patch, I see a few things I'd like to fix before landing:

- if extract is given something other than a zipfile or a tarfile, it returns.  Instead it should fail

- I see no reason for the 'delete' argument to extract.  The caller can do this

- extract is pretty magical.  I'm kinda fine on this, but I'd like to have extract_tarball and extract_zipfile which it calls.  This allows you to be explicit if you want.

I'll make these changes and add a test or two
(In reply to Andrew Halberstadt [:ahal] from comment #18)
> A couple more candidates I've needed in a few places:
> https://hg.mozilla.org/build/mozharness/file/bb5df91e065e/mozharness/base/
> script.py#l302
> https://hg.mozilla.org/build/mozharness/file/bb5df91e065e/mozharness/base/
> script.py#l121
> 
> If this is ready to land I can file a new bug.

I have a WIP here: https://github.com/k0s/mozbase/tree/bug-774916  It is essentially done except tests.  I'll try to finish this up soon, though (not unusually) unexpected things have come up.  So if you're blocked, feel free to take what I have and run with it (otherwise hopefully I'll finish up this week or early next)
This includes tests and a setup.py
Attachment #664719 - Flags: review?(wlachance)
Comment on attachment 664719 [details] [diff] [review]
proposed implementation

I'm happy with this. Some minor comments/suggestions:

>diff --git a/mozfile/README.md b/mozfile/README.md
>new file mode 100644
>index 0000000..15587b4
>--- /dev/null
>+++ b/mozfile/README.md
>@@ -0,0 +1,4 @@
>+mozfile should take care of common file (and perhaps URL?) related
>+tasks, such as extracting, etc.  These will mostly be convenience
>+functions to not have to duplicate logic multiple places.
>+

This kind of reads more as a statement of what the module would do if it were written. But it is written. ;)

Maybe better:

mozfile is a convenience library for taking care of some common file-related tasks in automated testing, such as extracting files or recursively removing directories.

>+        raise Exception("mozfile.extract: no archive format found for '%s'" % src)

This line is more than 80 characters long (says pep8)

>+    def create_tarball(self):
>+        """create a stub tarball for testing"""
>+        tempdir = create_stub()
>+        filename = tempfile.mktemp(suffix='.tar')
>+        archive = tarfile.TarFile(filename, mode='w')
>+        try:
>+            for path in files:
>+                archive.add(os.path.join(tempdir, *path), arcname=os.path.join(*path))
>+        except Exception, e:
>+            os.remove(archive)
>+            raise

pyflakes complains that "e" is created and never used here.

>+    def create_zip(self):
>+        """create a stub zipfile for testing"""
>+
>+        tempdir = create_stub()
>+        filename = tempfile.mktemp(suffix='.zip')
>+        archive = zipfile.ZipFile(filename, mode='w')
>+        try:
>+            for path in files:
>+                archive.write(os.path.join(tempdir, *path), arcname=os.path.join(*path))
>+        except Exception, e:
>+            os.remove(filename)
>+            raise

Likewise, complaints about an "e" being created and never used.
Attachment #664719 - Flags: review?(wlachance) → review+
pushed, with nits fixed: https://github.com/mozilla/mozbase/commit/673444c709d7338a29f2de9c2fbe931d5ff8d7a4

much better README (albeit small), thanks wlach!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: