Closed Bug 1468869 Opened 6 years ago Closed 6 years ago

Add git-cinnabar support for wpt manifest download

Categories

(Testing :: web-platform-tests, enhancement)

Version 3
enhancement
Not set
normal

Tracking

(firefox62 fixed)

RESOLVED FIXED
mozilla62
Tracking Status
firefox62 --- fixed

People

(Reporter: cactusmachete, Assigned: cactusmachete)

Details

Attachments

(1 file)

      No description provided.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review257528

This is looking really good, just a few minor changes to make before we land it.

::: testing/web-platform/manifestdownload.py:30
(Diff revision 1)
> +
> +def git_commits(repo_root):
> +    git = vcs.Git.get_func(repo_root)
> +    cinnabar = vcs.Cinnabar.get_func(repo_root)
> +    try:
> +        return [item for item in cinnabar("git2hg", git("log", "--format=%H",

So one thing you could do here is instead of returning an actual list, return a generator that will prevent us having to actually run `git2hg` on every commit, something like

```
for item in git(log, […]).splitlines():
    yield cinnabar("git2hg", item)
````

::: testing/web-platform/manifestdownload.py:39
(Diff revision 1)
> +    except Exception:
> +        return False
> +
> +
> +def get_commits(repo_root):
> +    return hg_commits(repo_root) or git_commits(repo_root)

So, just trying both is fine, but we can also test if the repo is a git repo or a hg repo in a cheaper way. In `testing/web-platform/tests/wptrunner/wptrunner/vcs.py` we have a function like that for git, and for hg I think we can run `hg root` and see if it fails (or we could just test for `os.path.exists(os.path.join(repo_root, ".hg"))` which I think is what `hg root` is actually doing.

::: testing/web-platform/manifestdownload.py:85
(Diff revision 1)
>  def download_manifest(logger, wpt_dir, commits_func, url_func, force=False):
>      if not force and not should_download(logger, wpt_dir):
>          return False
>  
>      commits = commits_func()
> +    if not commits:

Instead of using `if not commits` here and putting the exception handler in each `commits_func` varaint, put the exception handler here and return if there's an exception.

I'm not sure if catching every exception is good, or if we'd be better off catching `subprocess.CalledProcessError`

::: testing/web-platform/vcs.py:24
(Diff revision 1)
>              # TODO: Test on Windows.
>          return hg
> +
> +
> +class Git(object):
> +

No blank line

::: testing/web-platform/vcs.py:42
(Diff revision 1)
> +                return None
> +            # TODO: Test on Windows.
> +        return git
> +
> +
> +class Cinnabar(object):

Since we only really use this in one place, we can probably just write it as

```
git("cinnabar", "hg2git", […])
```

Rather than having a whole class just for this variant.
Attachment #8985535 - Flags: review?(james) → review-
Attachment #8985535 - Flags: review?(ato)
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review257528

> So, just trying both is fine, but we can also test if the repo is a git repo or a hg repo in a cheaper way. In `testing/web-platform/tests/wptrunner/wptrunner/vcs.py` we have a function like that for git, and for hg I think we can run `hg root` and see if it fails (or we could just test for `os.path.exists(os.path.join(repo_root, ".hg"))` which I think is what `hg root` is actually doing.

And when you do this you can get rid of the try…except blocks in
hg_commits and git_commits.  There are two problems with them:

  (1) They act as catch-alls for all problems inside the function,
      making it difficult to spot real problems with the actual
      command line you execute.

  (2) Differing the return type by returning a list or a boolean
      depending on the runtime environment (whether it is a git or
      hg repo) is kind of awkward.

> Instead of using `if not commits` here and putting the exception handler in each `commits_func` varaint, put the exception handler here and return if there's an exception.
> 
> I'm not sure if catching every exception is good, or if we'd be better off catching `subprocess.CalledProcessError`

If we have a check for whether the repo is hg or git we presumably
don’t need to catch anything here because we will now that there
is no repo.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review257952

I will make another pass once you have addressed the open issues
on this review.

::: testing/web-platform/vcs.py:17
(Diff revision 1)
> -            except Exception as e:
> -                raise(e)
> +            except subprocess.CalledProcessError:
> +                return None

I think it is slightly dangerous to not be alerted that the command
didn’t run.  When you implement the VCS detection as jgraham explained
above, it would make sense to drop this try…except altogether in
my opinion.

::: testing/web-platform/vcs.py:29
(Diff revision 1)
> +
> +    def __init__(self, repo_root, url_base):
> +        self.root = os.path.abspath(repo_root)
> +        self.git = Git.get_func(repo_root)
> +
> +

Another superfluous blank line here.
Attachment #8985535 - Flags: review?(ato) → review-
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

Some more comments, most of which are minor style trivialites.
Looking forward to your update!

::: testing/web-platform/manifestdownload.py:18
(Diff revision 2)
>      return os.path.abspath(os.path.expanduser(path))
>  
>  
>  def hg_commits(repo_root):
> -    hg = Mercurial.get_func(repo_root)
> -    return [item for item in hg("log", "-fl50", "--template={node}\n",
> +    hg = vcs.Mercurial.get_func(repo_root)
> +    return hg("log", "-fl50", "--template={node}\n",

Is the \n significant?  If not drop it.

::: testing/web-platform/manifestdownload.py:18
(Diff revision 2)
> -    return [item for item in hg("log", "-fl50", "--template={node}\n",
> -            "testing/web-platform/tests/", "testing/web-platform/mozilla/tests").split("\n")
> +    return hg("log", "-fl50", "--template={node}\n",
> +            "testing/web-platform/tests/", "testing/web-platform/mozilla/tests").splitlines()

I think it would be a good idea to make hg_commits have the same
return type as git_commits, which is an iterable over individual
commits.

Can you make this loop over the lines and yield each one?

::: testing/web-platform/manifestdownload.py:19
(Diff revision 2)
>  
>  
>  def hg_commits(repo_root):
> -    hg = Mercurial.get_func(repo_root)
> -    return [item for item in hg("log", "-fl50", "--template={node}\n",
> -            "testing/web-platform/tests/", "testing/web-platform/mozilla/tests").split("\n")
> +    hg = vcs.Mercurial.get_func(repo_root)
> +    return hg("log", "-fl50", "--template={node}\n",
> +            "testing/web-platform/tests/", "testing/web-platform/mozilla/tests").splitlines()

Drop "/" at end of first path.

::: testing/web-platform/manifestdownload.py:24
(Diff revision 2)
> -            "testing/web-platform/tests/", "testing/web-platform/mozilla/tests").split("\n")
> -            if item]
> +            "testing/web-platform/tests/", "testing/web-platform/mozilla/tests").splitlines()
> +
> +
> +def git_commits(repo_root):
> +    git = vcs.Git.get_func(repo_root)
> +    for item in git("log", "--format=%H", "-n50", "testing/web-platform/tests/",

We are we getting 150 commits from hg but only 50 from git?

::: testing/web-platform/manifestdownload.py:26
(Diff revision 2)
> +
> +def git_commits(repo_root):
> +    git = vcs.Git.get_func(repo_root)
> +    for item in git("log", "--format=%H", "-n50", "testing/web-platform/tests/",
> +                    "testing/web-platform/mozilla/tests/").splitlines():
> +                    yield git("cinnabar", "git2hg", item)

Because indentation matters in Python I think this will create some
problems.  I would expect this to be indented at two levels, which
equates to eight columns.

::: testing/web-platform/manifestdownload.py:29
(Diff revision 2)
> +    for item in git("log", "--format=%H", "-n50", "testing/web-platform/tests/",
> +                    "testing/web-platform/mozilla/tests/").splitlines():
> +                    yield git("cinnabar", "git2hg", item)
> +
> +
> +def get_commits(logger,repo_root):

Space after ,

::: testing/web-platform/manifestdownload.py:30
(Diff revision 2)
> +    if (vcs.Git.is_git_repo(repo_root)):
> +        return git_commits(repo_root)
> +
> +    elif (vcs.Mercurial.is_hg_repo(repo_root)):
> +        return hg_commits(repo_root)

Since Mercurial is the canonical VCS I think we should run is_hg_repo
first.

::: testing/web-platform/manifestdownload.py:30
(Diff revision 2)
> +    if (vcs.Git.is_git_repo(repo_root)):
> +        return git_commits(repo_root)
> +
> +    elif (vcs.Mercurial.is_hg_repo(repo_root)):

Parenthesis around statements in if-conditions are superfluous.

::: testing/web-platform/vcs.py:21
(Diff revision 2)
>          return hg
> +
> +    @staticmethod
> +    def is_hg_repo(repo_root):
> +        try:
> +            hg=Mercurial.get_func(repo_root)

Spaces around =.

::: testing/web-platform/vcs.py:22
(Diff revision 2)
> +
> +    @staticmethod
> +    def is_hg_repo(repo_root):
> +        try:
> +            hg=Mercurial.get_func(repo_root)
> +            rv = hg("root")

rv unused.  You will want to check that that it returned a newline
like is_git_repo() does.

::: testing/web-platform/vcs.py:33
(Diff revision 2)
> +class Git(object):
> +    def __init__(self, repo_root, url_base):
> +        self.root = os.path.abspath(repo_root)
> +        self.git = Git.get_func(repo_root)
> +
> +

One blank line too much.

::: testing/web-platform/vcs.py:42
(Diff revision 2)
> +            full_cmd = ["git", cmd] + list(args)
> +            return subprocess.check_output(full_cmd, cwd=repo_path, stderr=subprocess.STDOUT)
> +            # TODO: Test on Windows.
> +        return git
> +
> +    @classmethod

Since you’re not constructing cls, I think this should be a
@staticmethod too?

::: testing/web-platform/vcs.py:45
(Diff revision 2)
> +        return git
> +
> +    @classmethod
> +    def is_git_repo(cls, repo_root):
> +        try:
> +            git=Git.get_func(repo_root)

Spaces around =.
Attachment #8985535 - Flags: review?(ato) → review-
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

> We are we getting 150 commits from hg but only 50 from git?

50 from each. the flag is -f(L)50. -l in hg does the same as -n in git.

> rv unused.  You will want to check that that it returned a newline
> like is_git_repo() does.

It doesn't return a newline, though - hg root returns the topsrcdir. I can't think of a usecase where those results would be different (right now), but this was just to be safe.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

> Is the \n significant?  If not drop it.

Yes, it is. Without it the commits end up as one large unseparated string.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

> One blank line too much.

I thought it was two lines between functions? Or does that not include decorators?
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

> It doesn't return a newline, though - hg root returns the topsrcdir. I can't think of a usecase where those results would be different (right now), but this was just to be safe.

So I’m trying to understand why is_git_repo() checks that rv ==
"\n", and as far as I can tell it returns the relative path to the
top level directory of the repository, so if your CWD is that it
returns "\n".  If you’re in a subdirectory like testing/marionette,
it will return "../../".

This is probably the wrong assertion to be making because the repo
is still a git repo if you’re in a subdirectory.

I propose we change this from using subprocess.check_output to
.check_call, which waits for the program to finish and raises a
CalledProcessError if the command did not complete successfully.
I checked that both hg and git return an exit code of 0 when they
complete successfully.  Their exit code when they don’t differs,
but this shouldn’t be a problem.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

> 50 from each. the flag is -f(L)50. -l in hg does the same as -n in git.

Ah.  I misread the font.  Thanks for clarifying.

> I thought it was two lines between functions? Or does that not include decorators?

Double lines between all top-level classes and functions, single
line internally in a class.

(Yes this is silly.)
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258700
Attachment #8985535 - Flags: review?(ato) → review-
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258462

> So I’m trying to understand why is_git_repo() checks that rv ==
> "\n", and as far as I can tell it returns the relative path to the
> top level directory of the repository, so if your CWD is that it
> returns "\n".  If you’re in a subdirectory like testing/marionette,
> it will return "../../".
> 
> This is probably the wrong assertion to be making because the repo
> is still a git repo if you’re in a subdirectory.
> 
> I propose we change this from using subprocess.check_output to
> .check_call, which waits for the program to finish and raises a
> CalledProcessError if the command did not complete successfully.
> I checked that both hg and git return an exit code of 0 when they
> complete successfully.  Their exit code when they don’t differs,
> but this shouldn’t be a problem.

This makes the code a bit weird to look at, since we can't use get_func anymore and need to suppress output. I couldn't find options within git and hg to suppress the output, so redirecting it to dev/null seemed like the next best idea.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258714

::: testing/web-platform/vcs.py:13
(Diff revision 4)
> -
>      @staticmethod
> -    def get_func(repo_path):
> +    def get_func(repo_root):
>          def hg(cmd, *args):
>              full_cmd = ["hg", cmd] + list(args)
> -            try:
> +            return subprocess.check_output(full_cmd, cwd=repo_root, stderr=subprocess.STDOUT)

I’m curious, what is the reason we redirect stderr to stdout?

::: testing/web-platform/vcs.py:20
(Diff revision 4)
> +            devnull = open(os.devnull, 'w')
> +            rv = subprocess.check_call(["hg", "root"], cwd=repo_root, stdout=devnull,
> +                                        stderr=subprocess.STDOUT)

You can replace devnull with subprocess.DEVNULL.

::: testing/web-platform/vcs.py:21
(Diff revision 4)
> +            rv = subprocess.check_call(["hg", "root"], cwd=repo_root, stdout=devnull,
> +                                        stderr=subprocess.STDOUT)

We probably also want to suppress stderr output since "hg root"
would fail if the repository isn’t Mercurial.

::: testing/web-platform/vcs.py:21
(Diff revision 4)
> +            rv = subprocess.check_call(["hg", "root"], cwd=repo_root, stdout=devnull,
> +                                        stderr=subprocess.STDOUT)
> +        except subprocess.CalledProcessError:
> +            return False
> +        # TODO: Test on windows
> +        return rv == 0

Checking that rv == 0 here is unnecessary because it will always
be 0 if check_call doesn’t raise.
Attachment #8985535 - Flags: review?(ato) → review-
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258714

> I’m curious, what is the reason we redirect stderr to stdout?

My understanding of it is that we've been doing it for debugging purposes. 
I'm not sure if an exception thrown by check_output will have the same error details as git/hg would provide, also - "If there's an unexpected exception, give as many details as you can". 
I can understand not doing this in check_call because we rely on that to fail/pass to figure out which vcs is in use, but IMO it would be best to give as many details as possible here because this is *definitely* expected to pass when it gets called.

> You can replace devnull with subprocess.DEVNULL.

subprocess.DEVNULL isn't included in python 2.x :/
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258714

> subprocess.DEVNULL isn't included in python 2.x :/

OK, then you will want to define this as a constant near the top
of the file and probably open it with os.O_RDONLY or os.O_RDWR.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258724

::: testing/web-platform/manifestdownload.py:80
(Diff revision 5)
>      return ("https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central."
>              "latest.source.manifest-upload")
>  
>  
>  def download_manifest(logger, wpt_dir, commits_func, url_func, force=False):
> -    if not force and not should_download(logger, wpt_dir):
> +    if not force and not should_download(logger, os.path.join(wpt_dir, "meta", "MANIFEST.json")):

What about mozilla/meta/MANIFEST.json?

::: testing/web-platform/vcs.py:21
(Diff revision 5)
> +
> +    @staticmethod
> +    def is_hg_repo(repo_root):
> +        try:
> +            devnull = open(os.devnull, 'w')
> +            rv = subprocess.check_call(["hg", "root"], cwd=repo_root, stdout=devnull,

rv is unused.

::: testing/web-platform/vcs.py:46
(Diff revision 5)
> +
> +    @staticmethod
> +    def is_git_repo(repo_root):
> +        try:
> +            devnull = open(os.devnull, 'w')
> +            rv = subprocess.check_call(["git", "rev-parse", "--show-cdup"], cwd=repo_root,

rv is unused.
Attachment #8985535 - Flags: review?(ato) → review-
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258714

> My understanding of it is that we've been doing it for debugging purposes. 
> I'm not sure if an exception thrown by check_output will have the same error details as git/hg would provide, also - "If there's an unexpected exception, give as many details as you can". 
> I can understand not doing this in check_call because we rely on that to fail/pass to figure out which vcs is in use, but IMO it would be best to give as many details as possible here because this is *definitely* expected to pass when it gets called.

Please help me understand this, are we actually throwing away stderr?
If not, stderr will be printed too.
Comment on attachment 8985535 [details]
Bug 1468869 - Add git-cinnabar support for wpt-manifest download,

https://reviewboard.mozilla.org/r/251142/#review258776
Attachment #8985535 - Flags: review?(ato) → review+
Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/424703a5d9e1
Add git-cinnabar support for wpt-manifest download, r=ato
https://hg.mozilla.org/mozilla-central/rev/424703a5d9e1
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: