The default bug view has changed. See this FAQ.

gaia-try: do a merge locally instead of using the pull request reference

RESOLVED WONTFIX

Status

Release Engineering
General Automation
RESOLVED WONTFIX
3 years ago
a year ago

People

(Reporter: jhford, Assigned: jhford)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3386] )

Attachments

(4 attachments, 3 obsolete attachments)

When we started doing Gaia-Try PR testing we took commit specified as the head of the branch to test.  In doing this, we weren't testing the combination of the pull request on the head of the upstream (i.e. master) branch.

Then we switched (bug 1007437) to specifying only the pull request number, which we used to fetch a reference to the mozilla-b2g/gaia internal reference to the pull request branch.  This works, but it has a race condition.

Say the following situation occurs:

1. developer submits PR#12345 which has test jobs A B C D E with commit abc1234
2. Gaia-Try starts A B C for abc1234, commit to use resolves as master+abc1234
3. developer updates PR#12345 with commit abc1235
4. Gaia-Try starts D E for 1. and A B C D E for 3.

The tests run against:

For the creation:
A tested abc1234
B tested abc1234
C tested abc1234
D tested abc1235
E tested abc1235

For the update:
A tested abc1235
B tested abc1235
C tested abc1235
D tested abc1235
E tested abc1235

There is a correctness and efficiency problem here.  It's incorrect because the results from the jobs triggered creation are actually testing the update.  It's also incorrect because there's no way to determine which commit a given set of tests were run against.  It's inefficient because we're duplicating the combination of suites D and E on commit abc1235.



There are two things we can do to fix this:

1. cancel all previous jobs for a pull request when we update the pull request
2. send the pull requeset head commit and do the merge on the test machines

We definitely want to cancel previous jobs, but that doesn't address the correctness problem.  I think it's important to address this correctness problem.

What I propose doing is:

1. change the gaia.json format from:

  {
    "git": {
      "git_revision": "replacedByPrNumber",
      "remote": "https://github.com/try-server-hook/gaia.git",
      "github_pr_number": 1
    }
  }

to:

  {
    "git": {
      "git_revision": "master_branch_commit",
      "pr_git_revision": "revision_to_test",
      "remote": "https://github.com/mozilla-b2g/gaia.git",
      "pr_remote": "https://github.com/jhford/gaia.git",
      "github_pr_number": 1
    }
  }

2. Change the mozharness logic from http://hg.mozilla.org/build/mozharness/file/a9b8af62cf14/mozharness/mozilla/gaia.py#l82

to something that does:

  git clone $remote
  git checkout $git_revision
  if $github_pr_number:
    git fetch $pr_remote
    git merge --no-ff $pr_git_revision
  run_tests

This is what the Github systems do internally according to https://github.com/blog/843-the-merge-button
  
I'm happy to write and test this if there's a staging Gaia-Try tree somewhere.

Updated

3 years ago
Summary: do a merge locally instead of using the pull request reference → gaia-try: do a merge locally instead of using the pull request reference
Created attachment 8440150 [details]
Run the linter locally

Based on info from Aki, this is how you run the linter mozharness script locally
Created attachment 8440215 [details] [diff] [review]
do merge locally instead of relying on branch pointer (based on b88216680a51)

Reviewer notes:

The first change is to improve clobber, clone and fetch logic.  We could do this for regular builds as well, but I'd rather not test that.  This change first checks if we have the commit objects we need before doing any cloning or fetching.  It also cleans up the working directory with git clean to pristine.  In this case, we assure ourselves that the repository has the two commits we care about and is pristine as if it were freshly cloned.

The code that does merges is idempotent and deterministic.


I made the fewest number of non-pr codepath changes possible, which is why the code looks a little strange.  I'm happy to answer questions about adopting this for non-gaia-try, but I don't have the time or resources to test that.  The places where I touch the default codepath are (modified diffs):

- if os.path.exists(dest) and os.path.exists(os.path.join(dest, '.git')):
+ if not pr_remote and os.path.exists(dest) and os.path.exists(os.path.join(dest, '.git')):

This change makes sure we don't bother doing the reading of 'git remote' for pull requests, because we have new logic for that in this patch

+ needs_clone = True
+ <code that's PR only>
- cmd = [git_cmd,
-        'clone',
-        remote]
+ if needs_clone:
+     cmd = [git_cmd,
-            'fetch',
-            'origin',
-            '+refs/pull/%d/merge:%s' % (pr_num, local_pr_branch)]
+            'clone',
+             remote]

This was needed because we handle clones differently from the mainline path.  I just guarded the clone command by a boolean that's only ever not True when doing a pull request


{
  "git": {
    "git_revision": "259f517689d26e3b6099a97be5696453e27ac2ed",
    "pr_git_revision": "cd81fe3a2609986952fac98b3ca19dee80ebd7ff",
    "pr_remote": "file:///Users/jhford/b2g/gaia-test"
    "remote": "file:///Users/jhford/b2g/gaia",
    "github_pr_number": 324
  }
}


Gaia not there:

18:24:40     INFO - Running pre-action listener: _resource_record_pre_action
18:24:40     INFO - Running main action method: pull
18:24:40     INFO - dest: /Users/jhford/b2g/mozharness/gaia
18:24:40     INFO - retry: Calling <bound method GaiaIntegrationTest.load_json_from_url of <__main__.GaiaIntegrationTest object at 0x10da22a10>> with args: ('https://hg.mozilla.org/users/jford_mozilla.com/gaia-try/raw-file/default/gaia.json',), kwargs: {}, attempt #1
18:24:40     INFO - Running command: ['git', 'clone', u'file:///Users/jhford/b2g/gaia'] in /Users/jhford/b2g/mozharness
18:24:40     INFO - Calling ['git', 'clone', u'file:///Users/jhford/b2g/gaia'] with output_timeout 1760
18:24:40     INFO -  Cloning into 'gaia'...
18:25:12     INFO -  Checking out files:  37% (3185/8499)
<snip>
18:25:12     INFO -  Checking out files: 100% (8499/8499), done.
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'checkout', u'259f517689d26e3b6099a97be5696453e27ac2ed'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  Note: checking out '259f517689d26e3b6099a97be5696453e27ac2ed'.
18:25:12     INFO -  You are in 'detached HEAD' state. You can look around, make experimental
18:25:12     INFO -  changes and commit them, and you can discard any commits you make in this
18:25:12     INFO -  state without impacting any branches by performing another checkout.
18:25:12     INFO -  If you want to create a new branch to retain commits you create, you may
18:25:12     INFO -  do so (now or later) by using -b with the checkout command again. Example:
18:25:12     INFO -    git checkout -b new_branch_name
18:25:12     INFO -  HEAD is now at 259f517... Revert "Merge pull request #20158 from yzen/bug-1018214"
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:25:12    ERROR - Return code: 1
18:25:12     INFO - Running command: ['git', 'fetch', u'file:///Users/jhford/b2g/gaia-test'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  From file:///Users/jhford/b2g/gaia-test
18:25:12     INFO -   * branch            HEAD       -> FETCH_HEAD
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  cd81fe3a2609986952fac98b3ca19dee80ebd7ff
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  cd81fe3a2609986952fac98b3ca19dee80ebd7ff
18:25:12     INFO - Return code: 0
18:25:12     INFO - If you want to prove that this merge commit is the same
18:25:12     INFO - you get, use this environment while doing the merge
18:25:12     INFO - Running command: ['git', 'reset', '--hard', 'HEAD'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  HEAD is now at 259f517 Revert "Merge pull request #20158 from yzen/bug-1018214"
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'clean', '--force', '-x', '-d'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'merge', '--no-ff', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO - Using env: {'GIT_AUTHOR_DATE': 'Wed Feb 16 14:00 2037 +0100',
18:25:12     INFO -  'GIT_AUTHOR_EMAIL': 'auto@mati.on',
18:25:12     INFO -  'GIT_AUTHOR_NAME': 'automation',
18:25:12     INFO -  'GIT_COMMITTER_DATE': 'Wed Feb 16 14:00 2037 +0100',
18:25:12     INFO -  'GIT_COMMITTER_EMAIL': 'auto@mati.on',
18:25:12     INFO -  'GIT_COMMITTER_NAME': 'automation'}
18:25:12     INFO -  Merge made by the 'recursive' strategy.
18:25:12     INFO -   README.md | 1 +
18:25:12     INFO -   1 file changed, 1 insertion(+)
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'rev-parse', 'HEAD'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  82b97139dd060bdb8156665e0b432ab3d366e31c
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'log', '-1'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  commit 82b97139dd060bdb8156665e0b432ab3d366e31c
18:25:12     INFO -  Merge: 259f517 cd81fe3
18:25:12     INFO -  Author: automation <auto@mati.on>
18:25:12     INFO -  Date:   Mon Feb 16 14:00:00 2037 +0100
18:25:12     INFO -      Merge commit 'cd81fe3a2609986952fac98b3ca19dee80ebd7ff' into HEAD
18:25:12     INFO - Return code: 0
18:25:12     INFO - Running command: ['git', 'branch'] in /Users/jhford/b2g/mozharness/gaia
18:25:12     INFO -  * (detached from 259f517)
18:25:12     INFO -    master
18:25:12     INFO - Return code: 0
18:25:12     INFO - Pull has nothing to do!
18:25:12     INFO - Running post-action listener: _resource_record_post_action

Gaia already there:

18:20:43     INFO - Running pre-action listener: _resource_record_pre_action
18:20:43     INFO - Running main action method: pull
18:20:43     INFO - dest: /Users/jhford/b2g/mozharness/gaia
18:20:43     INFO - retry: Calling <bound method GaiaIntegrationTest.load_json_from_url of <__main__.GaiaIntegrationTest object at 0x104169990>> with args: ('https://hg.mozilla.org/users/jford_mozilla.com/gaia-try/raw-file/default/gaia.json',), kwargs: {}, attempt #1
18:20:44     INFO - Running command: ['git', 'clean', '--force', '-x', '-d'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'259f517689d26e3b6099a97be5696453e27ac2ed^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  259f517689d26e3b6099a97be5696453e27ac2ed
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'259f517689d26e3b6099a97be5696453e27ac2ed^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  259f517689d26e3b6099a97be5696453e27ac2ed
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'checkout', u'259f517689d26e3b6099a97be5696453e27ac2ed'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  Warning: you are leaving 2 commits behind, not connected to
18:20:44     INFO -  any of your branches:
18:20:44     INFO -    82b9713 Merge commit 'cd81fe3a2609986952fac98b3ca19dee80ebd7ff' into HEAD
18:20:44     INFO -    cd81fe3 Testing!
18:20:44     INFO -  If you want to keep them by creating a new branch, this may be a good time
18:20:44     INFO -  to do so with:
18:20:44     INFO -   git branch new_branch_name 82b9713
18:20:44     INFO -  HEAD is now at 259f517... Revert "Merge pull request #20158 from yzen/bug-1018214"
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  cd81fe3a2609986952fac98b3ca19dee80ebd7ff
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  cd81fe3a2609986952fac98b3ca19dee80ebd7ff
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'rev-parse', '--quiet', '--verify', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff^{commit}'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  cd81fe3a2609986952fac98b3ca19dee80ebd7ff
18:20:44     INFO - Return code: 0
18:20:44     INFO - If you want to prove that this merge commit is the same
18:20:44     INFO - you get, use this environment while doing the merge
18:20:44     INFO - Running command: ['git', 'reset', '--hard', 'HEAD'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO -  HEAD is now at 259f517 Revert "Merge pull request #20158 from yzen/bug-1018214"
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'clean', '--force', '-x', '-d'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO - Return code: 0
18:20:44     INFO - Running command: ['git', 'merge', '--no-ff', u'cd81fe3a2609986952fac98b3ca19dee80ebd7ff'] in /Users/jhford/b2g/mozharness/gaia
18:20:44     INFO - Using env: {'GIT_AUTHOR_DATE': 'Wed Feb 16 14:00 2037 +0100',
18:20:44     INFO -  'GIT_AUTHOR_EMAIL': 'auto@mati.on',
18:20:44     INFO -  'GIT_AUTHOR_NAME': 'automation',
18:20:44     INFO -  'GIT_COMMITTER_DATE': 'Wed Feb 16 14:00 2037 +0100',
18:20:44     INFO -  'GIT_COMMITTER_EMAIL': 'auto@mati.on',
18:20:44     INFO -  'GIT_COMMITTER_NAME': 'automation'}
18:20:45     INFO -  Merge made by the 'recursive' strategy.
18:20:45     INFO -   README.md | 1 +
18:20:45     INFO -   1 file changed, 1 insertion(+)
18:20:45     INFO - Return code: 0
18:20:45     INFO - Running command: ['git', 'rev-parse', 'HEAD'] in /Users/jhford/b2g/mozharness/gaia
18:20:45     INFO -  82b97139dd060bdb8156665e0b432ab3d366e31c
18:20:45     INFO - Return code: 0
18:20:45     INFO - Running command: ['git', 'log', '-1'] in /Users/jhford/b2g/mozharness/gaia
18:20:45     INFO -  commit 82b97139dd060bdb8156665e0b432ab3d366e31c
18:20:45     INFO -  Merge: 259f517 cd81fe3
18:20:45     INFO -  Author: automation <auto@mati.on>
18:20:45     INFO -  Date:   Mon Feb 16 14:00:00 2037 +0100
18:20:45     INFO -      Merge commit 'cd81fe3a2609986952fac98b3ca19dee80ebd7ff' into HEAD
18:20:45     INFO - Return code: 0
18:20:45     INFO - Running command: ['git', 'branch'] in /Users/jhford/b2g/mozharness/gaia
18:20:45     INFO -  * (detached from 259f517)
18:20:45     INFO -    master
18:20:45     INFO - Return code: 0
18:20:45     INFO - Pull has nothing to do!
18:20:45     INFO - Running post-action listener: _resource_record_post_action
Attachment #8440215 - Flags: review?(aki)
Oops, pressed submit before I meant to.  First paragraph should be:

The first change is to improve clobber, clone and fetch logic.  We first checks if we have the commit objects we need before doing any cloning or fetching.  Then we clean up the working directory with 'git clean'.  We know that because of these two things, we have all the git objects we depend on and we have a pristine working copy.  This is equivalent to a fresh clone for our purposes.  The second change is to create a deterministic merge commit locally instead of relying on the pull reference in the base repository.
My changes for this have been deployed.

Sample JSON from me:

     1.4    "git": {
     1.5      "git_revision": "c0709e95fa79e35913e3e6788e80cd874d29990e",
     1.6      "remote": "https://github.com/try-server-hook/gaia.git",
     1.7      "pr_git_revision": "9fa117e9d588e99cd27062617c7030e503982526",
     1.8      "pr_remote": "https://github.com/try-server-hook/gaia.git",
     1.9      "github_pr_number": 1
    1.10    },

Comment 5

3 years ago
Comment on attachment 8440215 [details] [diff] [review]
do merge locally instead of relying on branch pointer (based on b88216680a51)

I'm bringing up some questions, and leaving it to your discretion.

* Could you use -U9 for any future patches?
* The |if git:| block is getting pretty large and might belong in helper method(s).

>+                pr_git_revision = contents['git'].get('pr_git_revision')
>+                pr_remote = contents['git'].get('pr_remote')
>+                if pr_remote or pr_git_revision:
>+                    if not (pr_remote and pr_git_revision):
>+                        self.fatal('Pull request mode requres rev *and* remote')
>                 if not (branch or revision):
>                     self.fatal('Must specify branch or revision for git repo')

We should probably add exit_code=3 to both of these fatal()s so they show purple rather than red.

>-            if os.path.exists(dest) and os.path.exists(os.path.join(dest, '.git')):
>+            # For pull requests, we only want to clobber when we can't find the
>+            # two exact commit ids that we'll be working with.  As long as we
>+            # have those two commits, we don't care about the rest of the repo

Do we really never need cleanup?  I'd be surprised if a git reset --hard or git clean -ffdx didn't help with sanity.

>-            if pr_num:
>-                local_pr_branch = 'local_pr_%d' % pr_num
>+            if needs_clone:
>+                # git clone
>                 cmd = [git_cmd,
>-                       'fetch',
>-                       'origin',
>-                       '+refs/pull/%d/merge:%s' % (pr_num, local_pr_branch)]
>+                       'clone',
>+                       remote]
>                 self.run_command(cmd,
>-                                 cwd=dest,
>+                                 cwd=os.path.dirname(dest),
>                                  output_timeout=1760,
>                                  halt_on_failure=True,
>                                  fatal_exit_code=3)

I think you can also |git clone URL /abs/path| from anywhere, but cwd=os.path.dirname(dest) works too.
Looks like this isn't your code, though.
Attachment #8440215 - Flags: review?(aki) → review+
(In reply to Aki Sasaki [:aki] from comment #5)
> Comment on attachment 8440215 [details] [diff] [review]
> do merge locally instead of relying on branch pointer (based on b88216680a51)
> 
> I'm bringing up some questions, and leaving it to your discretion.
> 
> * Could you use -U9 for any future patches?

edited my .hgrc

> * The |if git:| block is getting pretty large and might belong in helper
> method(s).

I agree, it's probably a good idea to do that.  I can take a look at doing that when I have time, but right now I don't really have time to test all the other non-pull request codepaths.

> We should probably add exit_code=3 to both of these fatal()s so they show
> purple rather than red.

Done, and also two other cases where we Fatal for similar reasons.  Good call.
 
> Do we really never need cleanup?  I'd be surprised if a git reset --hard or
> git clean -ffdx didn't help with sanity.

We could throw in a reset, but the git clean followed by a checkout should be a super set of what that'd do.

> I think you can also |git clone URL /abs/path| from anywhere, but
> cwd=os.path.dirname(dest) works too.

Yep.

> Looks like this isn't your code, though.

Yep.
Comment on attachment 8440215 [details] [diff] [review]
do merge locally instead of relying on branch pointer (based on b88216680a51)

https://hg.mozilla.org/build/mozharness/rev/74f415324874
Attachment #8440215 - Flags: checked-in+
This was backed out because it caused errors.  I've filed bug 1026246 to get a proper test environment for these sorts of changes.
Attachment #8440215 - Flags: checked-in+ → checked-in-
Depends on: 1026246, 1026280
Created attachment 8445557 [details] [diff] [review]
Tested in production

From production:

https://pastebin.mozilla.org/5466469

https://tbpl.mozilla.org/?tree=Gaia-Try&rev=2156031ac614 -- All of those failures are because I short circuited the results by adding a self.fatal call at the end of the test run.

I've also tested locally the following cases:

1) no existing clone of Gaia
2) existing clone of gaia without an 'other' remote
3) existing clone of gaia with an 'other' remote

This should cover all the cases.

Thanks!
Assignee: nobody → jhford
Attachment #8440215 - Attachment is obsolete: true
Attachment #8445557 - Flags: review?(aki)

Comment 10

3 years ago
Comment on attachment 8445557 [details] [diff] [review]
Tested in production

I think this will be ok as-is, but it may be prudent to trigger a Cypress gaia test after landing on default, before merging to production.

>-            if os.path.exists(dest) and os.path.exists(os.path.join(dest, '.git')):
>+            # For pull requests, we only want to clobber when we can't find the
>+            # two exact commit ids that we'll be working with.  As long as we
>+            # have those two commits, we don't care about the rest of the repo
>+            def has_needed_commit(commit, fatal=False):
>+                cmd = [git_cmd, 'rev-parse', '--quiet', '--verify', '%s^{commit}' % commit]
>+                rc = self.run_command(cmd, cwd=dest, halt_on_failure=False, status_codes=[1,0])

s,status_codes,success_codes,
http://hg.mozilla.org/build/mozharness/file/c49fa160c924/mozharness/base/script.py#l613
Attachment #8445557 - Flags: review?(aki) → review+
Created attachment 8445598 [details] [diff] [review]
Fix typo

Carrying the r+ forward, minor typo fix.
Attachment #8445557 - Attachment is obsolete: true
Attachment #8445598 - Flags: review+
https://hg.mozilla.org/build/mozharness/rev/e89b33af0900

Landed.
Created attachment 8449133 [details] [diff] [review]
delete non-git gaias

I think what's going on here is that both HG and Git copies of Gaia are trying to be in the same location.  Ideally, we'd have different test slave side clone locations for hg and git, but I don't know how to accomplish that.

So, here's a patch that checks if the /builds/slave/test/gaia directory is a git repository.  If it's a git repo, we don't need to clone or clobber.  If it's missing, we need to clone but not clobber.  If it's a mercurial repository we need to clobber and clone.

This is in production for Gaia-Try right now.
Attachment #8449133 - Flags: review?(aki)
Created attachment 8449145 [details] [diff] [review]
gaia-try expects 1 to 5 files, more than the average tests change

Gaia-try changes are different than non-gaia-try test changes.  For non-gaia-try test changes, we expect there to only be between 1 to 3 files change (browser, tests, symbols?).

Gaia-try works differently.  We change gaia.json every build and potentially up to 4 other files which are each used to point to specific browser and test URLS for each platform.
Attachment #8449145 - Flags: review?(aki)
Comment on attachment 8449145 [details] [diff] [review]
gaia-try expects 1 to 5 files, more than the average tests change

so IIUC this whole try block[1] serves to override or initialize installer_url, symbols_url, and/or test_url. for gaia try this does not seem to be the case. In fact this code block of logic has been working up till now but only by mistake. For e.g., this else block[2] states that the remaining file here is the installer_url. This is illogical for gaia-try. It only works because we set the installer_url and test_url explicitly from self.config here[3].

This is my first time looking at gaia-try so as a 'band-aid' I'll r+. But John if you can confirm for sure we *never* pass the symbols/tests/installer url via files and in gaia-try test jobs and we only soley rely on self.config, I would much prefer to r+ something like:

# gaia-try uses 'files' for a different purpose than installer/test/symbol url
if not buildbot_prop_branch.startswith('gaia-try'): 
    try:
         files = self.buildbot_config['sourcestamp']['changes'][-1]['files']
         # etc ...

furthermore, this feels like we shouldn't be binding this logic to a branch. It feels like this should be based off a platform + suite/job. If we are or are going to be doing these tests outside of gaia-try we are going to run into this len(5) == files problem again.

does this help?

[1] http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/testing/testbase.py#134
[2] http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/testing/testbase.py#155
[3] http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/testing/testbase.py#130
(In reply to Jordan Lund (:jlund) from comment #15)
> This is my first time looking at gaia-try so as a 'band-aid' I'll r+. But
> John if you can confirm for sure we *never* pass the symbols/tests/installer
> url via files and in gaia-try test jobs and we only soley rely on
> self.config, I would much prefer to r+ something like:

We will never check in symbols/tests/installers to the gaia-try repository and unless someone changes how gaia-try works, we should never be sendchanging for it.

> # gaia-try uses 'files' for a different purpose than installer/test/symbol
> url
> if not buildbot_prop_branch.startswith('gaia-try'): 
>     try:
>          files = self.buildbot_config['sourcestamp']['changes'][-1]['files']
>          # etc ...

Yah, that definately feels cleaner.  My concern here is that the patch is very large because of the indentation.  I'd like to minimize the control flow changes to lessen the risk of breaking !gaia-try. 


> furthermore, this feels like we shouldn't be binding this logic to a branch.
> It feels like this should be based off a platform + suite/job. If we are or
> are going to be doing these tests outside of gaia-try we are going to run
> into this len(5) == files problem again.

This is one of a million places where, aiui, gaia-try is special cased by name.  This far out of the gaia clone function, I have no clue how else to figure out how to know that I'm on a gaia-try like branch.

> does this help?

Yes, thanks
Comment on attachment 8449145 [details] [diff] [review]
gaia-try expects 1 to 5 files, more than the average tests change

Review of attachment 8449145 [details] [diff] [review]:
-----------------------------------------------------------------

I spoke with jhford following his feedback in comment 16. we agreed to go with his patch as he has proof of it fixing things[1]. I will follow up with a patch of my own to remove the try block entirely for gaia-try and get aki to review it when he is back from PTO.

[1] https://tbpl.mozilla.org/?tree=Gaia-Try&rev=4b10071a9b8d
Attachment #8449145 - Flags: review?(aki) → review+
Comment on attachment 8449145 [details] [diff] [review]
gaia-try expects 1 to 5 files, more than the average tests change

https://hg.mozilla.org/build/mozharness/rev/a086b8d41e88
Attachment #8449145 - Flags: checked-in+
Comment on attachment 8449133 [details] [diff] [review]
delete non-git gaias

I am trying to get up to speed on this. Where is this code living? Is this a special mozharness repo? I ask because when I look at our production/default repo, I see: http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/gaia.py#57 instead
Maybe MXR is out of date?

http://hg.mozilla.org/build/mozharness/rev/e89b33af0900
Oh, also http://hg.mozilla.org/users/jford_mozilla.com/mozharness is what's running in production
Comment on attachment 8449133 [details] [diff] [review]
delete non-git gaias

Review of attachment 8449133 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/gaia.py
@@ +102,5 @@
> +                if os.path.exists(dest):
> +                    if os.path.exists(os.path.join(dest, '.git')):
> +                        needs_clone = False
> +                    else:
> +                        needs_clobber = True

So, this is somethign that I'd like to let aki get to once he's back. At any rate, I don't want to block so I'll try to review and keep this moving in the meantime.

couple notes: 

* all these conditions in the 'if git' block are very similar are hard for me to parse so sorry if I'm mistaken on something. At any rate, I feel like there could be a clean up. For example, these five lines [1][2][3][4][5] are either duplicates or very similar conditions.

* I think there is at least one situation where we we could get: (needs_clobber == True and needs_clone == False) IIUC, allowing for that scenario seems wrong. basically we would rmtree(dest) and not clone, followed by trying and failing to run a series of git commands here[6] and here[7] depending on if it's jhford's user repo or mainline mozharness repo.

This could happen if...

if not has_needed_commit(revision):
    self.warn('Repository does not contain required revisions, clobbering')
    needs_clobber = True  # THIS GETS HIT

if os.path.exists(dest):
    if os.path.exists(os.path.join(dest, '.git')):
        needs_clone = False  # AND THIS GETS HIT


does that make sense? Am I parsing this right?


[1] http://hg.mozilla.org/users/jford_mozilla.com/mozharness/file/0b8f093023c9/mozharness/mozilla/gaia.py#l76
[2] http://hg.mozilla.org/users/jford_mozilla.com/mozharness/file/0b8f093023c9/mozharness/mozilla/gaia.py#l85
[3] http://hg.mozilla.org/users/jford_mozilla.com/mozharness/file/0b8f093023c9/mozharness/mozilla/gaia.py#l87
[4] http://hg.mozilla.org/users/jford_mozilla.com/mozharness/file/0b8f093023c9/mozharness/mozilla/gaia.py#l102
[5] http://hg.mozilla.org/users/jford_mozilla.com/mozharness/file/0b8f093023c9/mozharness/mozilla/gaia.py#l104

[6] http://hg.mozilla.org/users/jford_mozilla.com/mozharness/file/0b8f093023c9/mozharness/mozilla/gaia.py#l123
[7] http://hg.mozilla.org/build/mozharness/file/141595cb922b/mozharness/mozilla/gaia.py#l121
Attachment #8449133 - Flags: review?(aki) → review-
> I spoke with jhford following his feedback in comment 16. we agreed to go
> with his patch as he has proof of it fixing things[1]. I will follow up with
> a patch of my own to remove the try block entirely for gaia-try and get aki
> to review it when he is back from PTO.

closing loop on this, aki replied via email and suggested the best fix would be to change how we schedule in buildbot itself rather than deal with it only from mozharness.

it sounds like we can 'get by' with what we have now. I am a bit swamped with other tasks but if I can look at it if it becomes a priority.
(In reply to Jordan Lund (:jlund) from comment #23)
> closing loop on this, aki replied via email and suggested the best fix would
> be to change how we schedule in buildbot itself rather than deal with it
> only from mozharness.

It's fine by me for this to be fixed differently, but someone else is going to have to do the changes if it involves scheduling changes inside of buildbot.

> it sounds like we can 'get by' with what we have now. I am a bit swamped
> with other tasks but if I can look at it if it becomes a priority.

We can't really get by with what's on mozharness's production or default branch, it's not doing the right thing.  I've had to make a couple small changes, so why don't I prepare a new diff and attach it.
Created attachment 8452886 [details] [diff] [review]
omnibus version

This is the updated patch of what has been running in production on Gaia-Try for the last couple weeks.  It works great there.  This production use covers both the PR code path and the regular git branch push code path without altering the hg codepath.  I've also verified that this patch is catching the exact situation that it was designed to catch.

The list of changes are:

1) make the git clean have -f -f instead of just one --force, basically a no-op
2) adjust needs_clone and needs_clobber to clobber+clone when there's a .hg directory but otherwise only clone if there is not a gaia.
3) only use git checkout for non-pull requests runs only.  For pull requests, instead force the HEAD to the desired revision

Once this (or equivalent) is done, I'm happy to help write a general cleanup patch, but I don't have the access required to test it everywhere, so someone else would need to take that on.
Attachment #8449133 - Attachment is obsolete: true
Attachment #8452886 - Flags: review?(jlund)
Just realised that I hadn't responded to your review feedback, sorry.

(In reply to Jordan Lund (:jlund) from comment #22)
> * all these conditions in the 'if git' block are very similar are hard for
> me to parse so sorry if I'm mistaken on something. At any rate, I feel like
> there could be a clean up. For example, these five lines [1][2][3][4][5] are
> either duplicates or very similar conditions.

Nope, those are all places where we have to 'insert' code for PRs.  There are so many checks because the ordering is critical and we'd otherwise have to duplicate whole sections for pr vs non-pr when the commands would be identical.

> * I think there is at least one situation where we we could get:
> (needs_clobber == True and needs_clone == False) IIUC, allowing for that
> scenario seems wrong. basically we would rmtree(dest) and not clone,
> followed by trying and failing to run a series of git commands here[6] and
> here[7] depending on if it's jhford's user repo or mainline mozharness repo.
> 
> This could happen if...
> 
> if not has_needed_commit(revision):
>     self.warn('Repository does not contain required revisions, clobbering')
>     needs_clobber = True  # THIS GETS HIT
> 
> if os.path.exists(dest):
>     if os.path.exists(os.path.join(dest, '.git')):
>         needs_clone = False  # AND THIS GETS HIT
> 
> 
> does that make sense? Am I parsing this right?

Yes, that's a possibility.  I think doing this:

+            if needs_clobber:
+                self.rmtree(dest)
+                needs_clone = True

would solve the issue here, since any time we do a clobber (PR or not), we will need to clone.  Also, needs_clone is hardcoded to True for non-PR runs.  I've made this change in my user repo and am about to deploy it to my user repo.
> > it sounds like we can 'get by' with what we have now. I am a bit swamped
> > with other tasks but if I can look at it if it becomes a priority.
> 
> We can't really get by with what's on mozharness's production or default
> branch, it's not doing the right thing.  I've had to make a couple small
> changes, so why don't I prepare a new diff and attach it.

sure sounds good. just sanity checking, my comments here are referring to comment 17 that I r+ not comment 22 that I r-. I didn't realize https://bugzilla.mozilla.org/attachment.cgi?id=8449145 wasn't solving or doing the right thing with the 'files' issue on production/default.
Comment on attachment 8452886 [details] [diff] [review]
omnibus version

weird, I'm signed in as jlund@mozilla.com yet I do not have any options to 'review' this patch. 

At any rate. lgtm. I'll trust in you that -f -f is needed even if it is a no-op as you say.

Also, unless you feel differently now, I think we should go with your suggestion in comment 26 by adding needs_clone = True every time we clobber. That is if the deploy on your user repo fairs well :)
Attachment #8452886 - Flags: review?(jlund) → review+
(In reply to Jordan Lund (:jlund) from comment #28)
> Comment on attachment 8452886 [details] [diff] [review]
> omnibus version
> 
> weird, I'm signed in as jlund@mozilla.com yet I do not have any options to
> 'review' this patch. 
> 
> At any rate. lgtm. I'll trust in you that -f -f is needed even if it is a
> no-op as you say.
> 
> Also, unless you feel differently now, I think we should go with your
> suggestion in comment 26 by adding needs_clone = True every time we clobber.
> That is if the deploy on your user repo fairs well :)

-f == --force, -f -f is forces cleaning of submodules which we don't have any of right now.  Adding 

Awesome,  I'll keep the needs_clone=True in there.

Landing on default.  Once this is merged over to production, I'll file a bug to point Gaia-Try back to upstream
Comment on attachment 8452886 [details] [diff] [review]
omnibus version

Checked in with the extra needs_clone = True line discussed.

https://hg.mozilla.org/build/mozharness/rev/8bd5a04c10f1
Attachment #8452886 - Attachment is patch: true
Attachment #8452886 - Flags: checked-in+

Comment 31

3 years ago
This is merged, and I reverted bug 1026721 (should take effect next reconfig, potentially tomorrow).

Updated

2 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3379]

Updated

2 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3379] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3384]

Updated

2 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3384] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3386]
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.