Closed Bug 920938 Opened 6 years ago Closed 5 years ago

[manifestparser] handle symlinks in populate_directory and from directories

Categories

(Testing :: Mozbase, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla37

People

(Reporter: k0scist, Assigned: parkouss)

References

Details

Attachments

(1 file, 2 obsolete files)

[ManifestDestiny] handle symlinks in populate_directory and from directories
:cc: :wlach, :jgriffin

Bug 902610 introduced a decent cached directory walker to make
filtering fit our whims as well as to cache results so that multiple
walkings wouldn't be necessary e.g. for determining if a directory of
directories was empty:

https://github.com/mozilla/mozbase/commit/af120878d417aa630800726007522fad7db7b65e#L0R759

However, currently this class does not handle links well.  See
exposition in https://bugzilla.mozilla.org/show_bug.cgi?id=902610#c26
through https://bugzilla.mozilla.org/show_bug.cgi?id=902610#c32 .

In short, it is not necessarily straightforward what to do with
symlinks.  A few options:

- don't follow symlinks at all.  This is what os.walk does by default

- follow symlinks only if specified by caller.  os.walk allows this
  optionally if the caller sets `followlinks` to True.  However, it
  has the same problem as the code in
  https://bug902610.bugzilla.mozilla.org/attachment.cgi?id=810193 in
  that it does not logically teminate.

- allow symlinks so long as they cause no cyclical problems.  That is,
  you have a DAG, not just a graph.

- other: the sky's the limit!

It is also worth pointing out that how 'relative_to' should work wrt
symlinked resources is up for debate, but one conundrum at a time.
See Also: → 902610
Hi,

I worked a bit on it and this is an implementation that allow symlinks in every case.

Patch are stored to avoid infinite recursion and reprocessing of a given path. The code seems much simpler to me.

Note that I choosed to not recurse on a ignored folder name - I'm not sure what the previous code was doing for that (this may be changed if needed).

All the unit tests are passing (even the one with symlink now), I lauched them locally. I think it is fast enough but if it is a concern you may want to test more.

An option could be added to (dis)allow symlinks, but I was no sure it is needed.

What do you think ?
Attachment #8511514 - Flags: feedback?(wlachance)
Attachment #8511514 - Flags: feedback?(k0scist)
(In reply to Julien Pagès from comment #1)
> Patch are stored to avoid infinite recursion and reprocessing of a given

I meant "Path are stored ..." here. :)
Attachment #8511514 - Flags: feedback?(k0scist)
Comment on attachment 8511514 [details] [diff] [review]
handle symlinks in populate_directory and from directories

I feel like I'm not the best person to give feedback on this, as I don't work with manifests that much. Jmaher, could you have a quick look at Julien's approach and let us know if it makes sense?
Attachment #8511514 - Flags: feedback?(wlachance) → feedback?(jmaher)
Comment on attachment 8511514 [details] [diff] [review]
handle symlinks in populate_directory and from directories

Review of attachment 8511514 [details] [diff] [review]:
-----------------------------------------------------------------

overall this is a took a bit of thinking for me to put this patch into perspective- I have a few comments.  Nothing is jumping out at me, we do have some unittests, please ensure they work.  I am not sure if adding unittests for this is possible, but if so we should look into it.

::: testing/mozbase/manifestparser/manifestparser/manifestparser.py
@@ +884,5 @@
>  
> +        if isinstance(pattern, basestring):
> +            patterns = [pattern]
> +        else:
> +            patterns = pattern

what about the none case, in the previous example if pattern==None, then pattern=set()

@@ +898,2 @@
>  
> +        if not ignore:

if ignore=set(None) do we still work here?

@@ +936,5 @@
>  
> +                # here we got all subdirs and files filtered, we can
> +                # call the callback function if directory is not empty
> +                if subdirs or files:
> +                    function(rootdirectory, directory, subdirs, files)

I don't like this line, what is function, can we come up with a better name for this?
Attachment #8511514 - Flags: feedback?(jmaher) → feedback+
(In reply to Joel Maher (:jmaher) from comment #4)
> Comment on attachment 8511514 [details] [diff] [review]
> handle symlinks in populate_directory and from directories
> 
> ::: testing/mozbase/manifestparser/manifestparser/manifestparser.py
> @@ +884,5 @@
> >  
> > +        if isinstance(pattern, basestring):
> > +            patterns = [pattern]
> > +        else:
> > +            patterns = pattern
> 
> what about the none case, in the previous example if pattern==None, then
> pattern=set()
> 

The None test is handled few lines after (892):

> + if not patterns:
> +     accept_filename = lambda filename: True

> @@ +898,2 @@
> >  
> > +        if not ignore:
> 
> if ignore=set(None) do we still work here?

You're right, but ignore is not handled in the current code too - It can be
handled if needed.

> @@ +936,5 @@
> >  
> > +                # here we got all subdirs and files filtered, we can
> > +                # call the callback function if directory is not empty
> > +                if subdirs or files:
> > +                    function(rootdirectory, directory, subdirs, files)
> 
> I don't like this line, what is function, can we come up with a better name
> for this?


Sure; I took it from the current code. Maybe 'callback' or 'dir_callback' ?

I ran all the unittests with succes, see comment 1.
 I think we can switch 'function' to 'callback', it would make sense.

Are there any additional unittests you can write for this?  Otherwise, lets get this up for review soon!
I may add a couple of unittests for this, at least one to check that infinite symlinks recursion does not appear.

I would like to use "unittest.skipIf" to only ran this test on platform that have os.symlink, but is new in python 2.7. May I use it, or do you need test compatibility with older python version ?

Another solution in this case may be to use unittest2 as a test dependency (https://pypi.python.org/pypi/unittest2) or write custom code like it is already done (http://dxr.mozilla.org/mozilla-central/source/testing/mozbase/manifestparser/tests/test_convert_symlinks.py#61).
Python 2.7 should now be the oldest version we support, and we already have some conditional decorators in place for other modules. So this should be fine.
Well I changed the 'function' name to 'callback' and added one test to check cyclical symlimks.

I ran locally all the tests with success.
Attachment #8511514 - Attachment is obsolete: true
Attachment #8515930 - Flags: review?(jmaher)
Comment on attachment 8515930 [details] [diff] [review]
handle symlinks in populate_directory and from directories

Review of attachment 8515930 [details] [diff] [review]:
-----------------------------------------------------------------

thanks for updating the tests!  I would mark this as checkin-needed in the keywords and it will get picked up by the sheriffs!
Attachment #8515930 - Flags: review?(jmaher) → review+
Assignee: nobody → j.parkouss
Status: NEW → ASSIGNED
Keywords: checkin-needed
Sorry for this;

This is working well on my GNU/Linux system (I tried on archlinux and ubuntu).

The errors shown seems to appears on OSX, that unfortunately I don't have - so I can't investigate the problem.

Maybe someone can ? Or maybe I can do something with try server - I have a recent level 1 access so I can push to try but I don't know what command to give to test this - and I never pushed to try yet - if someone can guide me...
These seem to be b2g specific errors (running unit tests during the build step), Looking on http://trychooser.pub.build.mozilla.org/, we should be good with this syntax:
try: -b do -p macosx64_gecko -u none -t none
Thanks Joel for this. :) I'll look into it soon.
Summary: [ManifestDestiny] handle symlinks in populate_directory and from directories → [manifestparser] handle symlinks in populate_directory and from directories
Ok, I found the bug. It appears that on these machines, the tempfile.mkdtemp does not return a realpath - it is a symlinked path.

Since this patch needs realpath to check symlink recursion, the returned values are realpath. (understand os.path.realpath)

I fixed it by replacing every tempfile.mkdtemp() call with os.path.realpath(tempfile.mkdtemp()) and try push is now green:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=268072d592b6

However this is tricky - I will write a function in the file that is impacted that does that and explains the why as a comment. Is that ok for you ?
Julien, thanks for looking into this.  Your approach sounds great.  I look forward to reviewing it!
So, the only differences from the previous patch are in testing/mozbase/manifestparser/manifestparser/manifestparser.py.

Test runs fine on my local linux machine and try push is now green on B2G Desktop OS X: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=ab9310bb6608. Everything should be good then. :)
Attachment #8515930 - Attachment is obsolete: true
Attachment #8532570 - Flags: review?(jmaher)
Oups typo, the only differences are in testing/mozbase/manifestparser/tests/test_convert_directory.py, not in manifestparser.py;
Comment on attachment 8532570 [details] [diff] [review]
handle symlinks in populate_directory and from directories

Thanks for jumping on this and fixing it!
Attachment #8532570 - Flags: review?(jmaher) → review+
https://hg.mozilla.org/mozilla-central/rev/00c9dd32fd19
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
You need to log in before you can comment on or make changes to this bug.