The default bug view has changed. See this FAQ.

Status

Release Engineering
General Automation
RESOLVED FIXED
5 years ago
2 years ago

People

(Reporter: aki, Assigned: armenzg)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [easier-mozharness], URL)

Attachments

(19 attachments, 25 obsolete attachments)

1.47 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
568 bytes, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
2.31 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
12.62 KB, patch
Details | Diff | Splinter Review
9.15 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
1.52 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
1.96 KB, patch
rail
: review+
catlee
: review-
armenzg
: checked-in+
Details | Diff | Splinter Review
1.60 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
3.02 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
3.79 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
6.96 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
6.26 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
2.68 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
6.53 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
1.21 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
858 bytes, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
6.57 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
2.63 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
17.61 KB, patch
rail
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
(Reporter)

Description

5 years ago
To help mozharness developers test their scripts in a RelEng environment.
(Reporter)

Updated

5 years ago
(Reporter)

Comment 1

5 years ago
Created attachment 662024 [details] [diff] [review]
[buildbotcustom] rebased generic scheduler code

Essentially Catlee's generic branch on buildbotcustom, rebased.

This passes test-masters.  This is currently pointing at Catlee's tools repo; I need to decide if it'll point at tools or mozharness.

There currently is only a generateBranchObjects builder (builds) not a generateTalosBranchObjects builder (tests).
(Reporter)

Comment 2

5 years ago
"(Maybe we need a "preproduction" integration environment like we have for buildbot code, where mozharness default changes run there, and can be merged to a production branch afterwards?)"

This may also be a way to catch errors in soon-to-go-live code, though we don't always catch problems in buildbot preprod.  This would probably be best in addition to rather than instead of mozharness try.
(In reply to Aki Sasaki [:aki] from comment #2)
> "(Maybe we need a "preproduction" integration environment like we have for
> buildbot code, where mozharness default changes run there, and can be merged
> to a production branch afterwards?)"
> 
> This may also be a way to catch errors in soon-to-go-live code, though we
> don't always catch problems in buildbot preprod.  This would probably be
> best in addition to rather than instead of mozharness try.

Yes, it would be nice to be able to land mozharness changes in a separate 'staging' branch and see the results somewhere in TBPL (maybe a specific gecko project branch), before rolling out the mozharness changes to production.

For instance, when updating the emulator for B2G tests (which requires a change to mozharness configs) it would be nice to be able to test that somewhere before making it live, since there is always a risk that the newer Gaia that comes with a new emulator will break something.
Currently, we update mozharness to default.

If we had something like talos.json but for mozharness that would be great [1].

We could make it point to a user repo and be able to push to try.
I can work on this if you would like me to.

[1] http://hg.mozilla.org/mozilla-central/raw-file/default/testing/talos/talos.json

Comment 5

4 years ago
From the peanut gallery, but I think a talos.json-like approach makes a lot of sense too.
(Reporter)

Comment 6

4 years ago
I think it's problematic, since:

a) not every mozharness job has a tree+revision directly associated
b) mozharness has to be pulled before the tree is pulled in the other cases

I've been arguing with Armen about this, but am open to looking at his approach should he get a working solution.
(Reporter)

Comment 7

4 years ago
Oh, and

c) this is going to be changing a *lot* across a lot of trees.
d) this could affect release revisions (go to build revision sent, but we have to update the mozharness revision in-tree)
e) possibly more
Created attachment 704640 [details] [diff] [review]
poc determine script_repo_revision from tree

This is a simplistic approach.
A more resilient and non-harcoded version would do the trick.
We would also have to check if a human has not set it manually through force build.
This also allow us to ride the trains.
Attachment #704640 - Flags: feedback?(catlee)
Attachment #704640 - Flags: feedback?(aki)
http://hg.mozilla.org/projects/ash/file/default/testing/talos/talos.json
     6     "repos": {
     7         "mozharness": {
     8             "repo": "http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness",
     9             "revision": "6d81a1321351"
    10         }
    11     }
(Reporter)

Comment 10

4 years ago
Comment on attachment 704640 [details] [diff] [review]
poc determine script_repo_revision from tree

This would need to be conditional, if we take this approach.

I'm ok with turning this sort of check on for tests, but I don't think it should necessarily be on for l10n or release jobs (which should take the release tag, not an in-tree revision).

This would also completely break build/tools based scripts, unless we somehow have a sha1 collision between mozharness and build/tools and that happens to be the revision we want to use for everything.

Since this has an ash hardcode and doesn't account for most scenarios (optimizes for tests, breaks everything else), f-.
Attachment #704640 - Flags: feedback?(aki) → feedback-
(Assignee)

Updated

4 years ago
Attachment #704640 - Flags: feedback?(catlee)
(Reporter)

Comment 11

4 years ago
Not actively working on this.  For the moment, ash-mozharness is working well.
Assignee: aki → nobody
Product: mozilla.org → Release Engineering
Ash-mozharness + Ash is being a wild wild west these days.
If we could revive this for Q3 or give it to an intern it would be great.
(Reporter)

Comment 13

3 years ago
I doubt this will happen before Taskcluster.

Updated

3 years ago
Component: Tools → Mozharness
Until we have this, effectively one person at a time can test changes on ash-mozharness. This has become inconvenient enough recently that I'd like to look into what it would take to revive this effort.

jlund, do you have any pointers for me? Much appreciated!
Flags: needinfo?(jlund)
Created attachment 8498377 [details] [diff] [review]
mozharness-try

I had a think about this. Because of the way buildbot must preconfigure everything before runtime, it takes a bit of love to get this to work.

I think I grep what armen was trying to do with his patch and I also see Aki's concerns for implementing a mozharness try. This patch I think addresses both of those. It will surely fail. It is also incomplete: off my head, it needs to handle every place we define ash-mozharness[1] and instantiate ScriptFactory/SigningScriptFactory for ash builders[2]

But, it will limit this to just ash (for now) and allow us to specify a mozharness revision and url from in tree.

Armen, I was not awarded any time to work on this or other in tree stuff as other projects took priority. Since you know releng infra really well, maybe you can play with this patch or do a whole new one? I support mozharness-try and I'm happy to be part of the review.

cmanchester, feel free to also iterate on this or come up with a new way.

[1] http://mxr.mozilla.org/build/search?string=ash-mozharness&find=&findi=&filter=%5E%5B%5E%5C0%5D*%24&hitlimit=&tree=build
[2] http://mxr.mozilla.org/build/search?string=ScriptFactory
Attachment #8498377 - Flags: feedback?(cmanchester)
Attachment #8498377 - Flags: feedback?(armenzg)
Flags: needinfo?(jlund)
Comment on attachment 8498377 [details] [diff] [review]
mozharness-try

Review of attachment 8498377 [details] [diff] [review]:
-----------------------------------------------------------------

::: misc.py
@@ +730,5 @@
>  
> +    scriptRepo = None
> +    mozharness_json = None
> +    if config.get('mozharness_json'):
> +        mozharness_json = config.get('mozharness_json')

bah, obviously we do not need the condition above, nor the separate init for mozharness_json since dict.get(key) returns none if key does not exist.

this is probably one of many ugly parts in this patch.

::: process/factory.py
@@ +6174,5 @@
> +                name='get_script_user_url',
> +                property='script_repo_url',
> +                command=[
> +                    'python', '-c',
> +                    'import json; import urllib; print str(json.load(urllib.urlopen(%s))["repos"]["mozharness"]["url"])'

this should be like above so:

s/["repos"]["mozharness"]/["repo"]/
Comment on attachment 8498377 [details] [diff] [review]
mozharness-try

Review of attachment 8498377 [details] [diff] [review]:
-----------------------------------------------------------------

This should do the job.
As a bonus we should have try syntax enabled on Ash to reduce unnecessary usage.

I think this is easy enough to test for me that I should take it. It will make my Q4 goal so much easier.

::: process/factory.py
@@ +6164,5 @@
> +                property='script_repo_revision',
> +                command=[
> +                    'python', '-c',
> +                    'import json; import urllib; print str(json.load(urllib.urlopen(%s))["repo"]["revision"])'
> +                    % WithProperties(script_json_url)

I think this is not supposed to use WithProperties().
I think a simple string substitution is all it is needed. I could be wrong though.
Attachment #8498377 - Flags: feedback?(armenzg) → feedback+
(Assignee)

Updated

3 years ago
Whiteboard: [mozharness] → [easier-mozharness]
Comment on attachment 8498377 [details] [diff] [review]
mozharness-try

Review of attachment 8498377 [details] [diff] [review]:
-----------------------------------------------------------------

From this I think I have a good idea of the viable approach. Thank you!

::: mozilla-tests/b2g_config.py
@@ +2181,4 @@
>  
>  BRANCHES['ash']['branch_name'] = "Ash"
>  BRANCHES['ash']['repo_path'] = "projects/ash"
> +BRANCHES['ash']['mozharness_json'] = "https://hg.mozilla.org/projects/ash/raw-file/s(revision)s/testing/talos/talos.json"

Maybe we can add testing/config/mozharness.json

Most useful, I think, would be to have this available for any push to try.
Attachment #8498377 - Flags: feedback?(cmanchester) → feedback+
(In reply to Chris Manchester [:chmanchester] from comment #18)
> Comment on attachment 8498377 [details] [diff] [review]
> mozharness-try
> 
> Review of attachment 8498377 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> From this I think I have a good idea of the viable approach. Thank you!
> 
> ::: mozilla-tests/b2g_config.py
> @@ +2181,4 @@
> >  
> >  BRANCHES['ash']['branch_name'] = "Ash"
> >  BRANCHES['ash']['repo_path'] = "projects/ash"
> > +BRANCHES['ash']['mozharness_json'] = "https://hg.mozilla.org/projects/ash/raw-file/s(revision)s/testing/talos/talos.json"
> 
> Maybe we can add testing/config/mozharness.json
> 
> Most useful, I think, would be to have this available for any push to try.

for try, the issue with this patch is it assumes the mozharness.json will always exist (something that is easy to enforce on ash). We could extend this logic to add a few steps that check if the in-tree json exists, otherwise use default mainline mozharness though it probably will be ugly having to put that logic in at config time (before runtime). I'm OK supporting that but I think, at least for now, we should limit it to Try/Ash.
> ::: process/factory.py
> @@ +6164,5 @@
> > +                property='script_repo_revision',
> > +                command=[
> > +                    'python', '-c',
> > +                    'import json; import urllib; print str(json.load(urllib.urlopen(%s))["repo"]["revision"])'
> > +                    % WithProperties(script_json_url)
> 
> I think this is not supposed to use WithProperties().
> I think a simple string substitution is all it is needed. I could be wrong
> though.

The reason I wrapped this in WithProperties is that I put %(revision)s in the script_json_url. That will only be known at runtime on the given ash push and will be part of the build properties.
(In reply to Jordan Lund (:jlund) from comment #19)
> (In reply to Chris Manchester [:chmanchester] from comment #18)
> > Comment on attachment 8498377 [details] [diff] [review]
> > mozharness-try
> > 
> > Review of attachment 8498377 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > From this I think I have a good idea of the viable approach. Thank you!
> > 
> > ::: mozilla-tests/b2g_config.py
> > @@ +2181,4 @@
> > >  
> > >  BRANCHES['ash']['branch_name'] = "Ash"
> > >  BRANCHES['ash']['repo_path'] = "projects/ash"
> > > +BRANCHES['ash']['mozharness_json'] = "https://hg.mozilla.org/projects/ash/raw-file/s(revision)s/testing/talos/talos.json"
> > 
> > Maybe we can add testing/config/mozharness.json
> > 
> > Most useful, I think, would be to have this available for any push to try.
> 
> for try, the issue with this patch is it assumes the mozharness.json will
> always exist (something that is easy to enforce on ash). We could extend
> this logic to add a few steps that check if the in-tree json exists,
> otherwise use default mainline mozharness though it probably will be ugly
> having to put that logic in at config time (before runtime). I'm OK
> supporting that but I think, at least for now, we should limit it to Try/Ash.

I see. Is this just a matter of checking in the file to central and letting it merge around?
I will try to give this a nice push as it literally will make my Q4 goals trivial.

I'm OK to land such a json file across while we only use it on Ash.

This is the same concept as downloading talos_from_code.py.
Assignee: nobody → armenzg
Component: Mozharness → General Automation
> > for try, the issue with this patch is it assumes the mozharness.json will
> > always exist (something that is easy to enforce on ash). We could extend
> > this logic to add a few steps that check if the in-tree json exists,
> > otherwise use default mainline mozharness though it probably will be ugly
> > having to put that logic in at config time (before runtime). I'm OK
> > supporting that but I think, at least for now, we should limit it to Try/Ash.
> 
> I see. Is this just a matter of checking in the file to central and letting
> it merge around?

it depends how we want to implement things. my patch tries to avoid aki's concerns listed at the top of this bug. i.e. the edge cases are all avoided by limiting this to ash. I'm open to extending it everywhere but the implementation will have to be more robust.
> This is the same concept as downloading talos_from_code.py.

true but the nice thing with talos is that logic is handled by mozharness. I think it will be uglier in buildbot since we won't know things like 'revision' or whether the mozharness json file exists at any given future job. not to mention that ScriptFactory is not meant to be purely for mozharness and is an entry point for so many jobs.

that might just be a matter of me not knowing how to store/use buildbot properties cleanly.
(In reply to Jordan Lund (:jlund) from comment #24)
> > This is the same concept as downloading talos_from_code.py.
> 
> true but the nice thing with talos is that logic is handled by mozharness. I
> think it will be uglier in buildbot since we won't know things like
> 'revision' or whether the mozharness json file exists at any given future
> job. not to mention that ScriptFactory is not meant to be purely for
> mozharness and is an entry point for so many jobs.
> 
> that might just be a matter of me not knowing how to store/use buildbot
> properties cleanly.

I'm from the previous wars. I remember things the old ways :P
http://hg.mozilla.org/build/buildbotcustom/file/56830bc4b6a5/process/factory.py#l5623

I forgot that we had switched to mozharness for talos! I saw the code and I recognized it.
Or this:
http://hg.mozilla.org/build/buildbotcustom/file/56830bc4b6a5/process/factory.py#l5672
I'm going to be adding this to try since I also want the try syntax.
I tried adding the try behavior to Ash, however, it requires lots of small little parameters that are only specific to try.

If we get this working for try, it should work for any other branch in the future.
(Assignee)

Updated

3 years ago
Depends on: 1080135
Created attachment 8502757 [details] [diff] [review]
wip - custom_try_mozharness.diff
Attachment #8502757 - Flags: review?(jlund)
Created attachment 8502759 [details] [diff] [review]
bconfigs_try_mozharness.diff
Attachment #8502759 - Flags: review?(jlund)
(Assignee)

Updated

3 years ago
Attachment #704640 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Attachment #8498377 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Attachment #8502757 - Flags: review?(jlund)
(Assignee)

Updated

3 years ago
Attachment #8502759 - Flags: review?(jlund)
Comment on attachment 8502757 [details] [diff] [review]
wip - custom_try_mozharness.diff

I got it working, however, I think I lost the good patch.

I need to put this to the side while I focus a bit on win8 64-bit talos.
Attachment #8502757 - Attachment description: custom_try_mozharness.diff → wip - custom_try_mozharness.diff
(Assignee)

Updated

2 years ago
Blocks: 1067535
Created attachment 8516980 [details] [diff] [review]
[buildbotcustom] use script_from_code and script_manifest

I download a script to help me download a manifest and set the properties.
I was considering moving the script inside of the tools repo if that was checked out everywhere, however, I believe the right course of action is to turn the script into a buildbot step class.

For the record, the script has to handle a default fallback if there is no mozharness.json.

What do you think?
Attachment #8502757 - Attachment is obsolete: true
Attachment #8516980 - Flags: feedback?(rail)
(Assignee)

Updated

2 years ago
Attachment #8516980 - Attachment description: mh_try_cu.diff → [buildbotcustom] use script_from_code and script_manifest
Created attachment 8516982 [details] [diff] [review]
[buildbot-configs] add a couple of flag to the try repo
Attachment #8502759 - Attachment is obsolete: true
Attachment #8516982 - Flags: feedback?(rail)
Created attachment 8516984 [details] [diff] [review]
[mc] add mozharness.json and create script to expose it as a manifest

As mentioned, the script does not have to live in there.

This setup was worked for me locally.
Attachment #8516984 - Flags: feedback?(rail)
Comment on attachment 8516980 [details] [diff] [review]
[buildbotcustom] use script_from_code and script_manifest

Review of attachment 8516980 [details] [diff] [review]:
-----------------------------------------------------------------

I will need to steal your brain to ask more questions. Some thoughts below.

::: misc.py
@@ +769,5 @@
>                          stagePlatform=None, stageProduct=None,
>                          mozharness=False, mozharness_python=None,
>                          mozharness_suite_config=None,
>                          mozharness_repo=None, mozharness_tag='production',
> +                        script_from_code=None, script_manifest=None, is_debug=None):

The script_from_code name wasn't obvious to me.

@@ +2762,5 @@
>                                  test_builder_kwargs['mozharness_repo'] = branch_config['mozharness_repo']
>                                  test_builder_kwargs['mozharness_tag'] = branch_config['mozharness_tag']
>                                  test_builder_kwargs['mozharness'] = True
> +                                test_builder_kwargs['script_from_code'] = branch_config.get('repository_from_code')
> +                                test_builder_kwargs['script_manifest'] = branch_config.get('mozharness_manifest')

Can you use the same names for variables and configs? Hard to grep.

::: process/factory.py
@@ +5241,5 @@
>  
>  
>  class ScriptFactory(RequestSortingBuildFactory, TooltoolMixin):
>  
> +    def __init__(self, scriptName, scriptRepo=None, cwd=None, interpreter=None,

I don't like the idea of changing the current signature. You can leave it as is and pass None explicitly.

@@ +5291,5 @@
>                  self.cmd = [interpreter, script_path]
>          else:
>              self.cmd = [script_path]
>  
> +        assert scriptRepo or (script_from_code and script_manifest)

I'd prefer more context in the logs here. Something like:
if ...:
    raise exception and message

@@ +5336,5 @@
>              log_eval_func=rc_eval_func({0: SUCCESS, None: RETRY}),
>          ))
> +        if script_from_code and script_manifest:
> +            # XXX: this does not retry
> +            self.addStep(ShellCommand(

I believe it's RetryingShellCommand

@@ +5347,5 @@
> +                name="set_script_repo_url_and_script_repo_revision",
> +                extract_fn=extractProperties,
> +                command=['bash', '-c',
> +                    WithProperties(
> +                    'python repository_from_code.py ' +

Wouldn't it be easier to restrict this part to fetch the manifest instead of running an arbitrary command? Running a command sounds more flexible, but adds a bit of complexity...
I will address the comments you addressed.
I have these two notes for you:

(In reply to Rail Aliiev [:rail] from comment #34)
> @@ +5336,5 @@
> >              log_eval_func=rc_eval_func({0: SUCCESS, None: RETRY}),
> >          ))
> > +        if script_from_code and script_manifest:
> > +            # XXX: this does not retry
> > +            self.addStep(ShellCommand(
> 
> I believe it's RetryingShellCommand
> 
That unfortunately forces that we also checkout the tools repository. I've tried it.

> @@ +5347,5 @@
> > +                name="set_script_repo_url_and_script_repo_revision",
> > +                extract_fn=extractProperties,
> > +                command=['bash', '-c',
> > +                    WithProperties(
> > +                    'python repository_from_code.py ' +
> 
> Wouldn't it be easier to restrict this part to fetch the manifest instead of
> running an arbitrary command? Running a command sounds more flexible, but
> adds a bit of complexity...

I'm not sure of what you mean. Do you want two steps?

I want to turn the script into a BuildbotStep so we don't need to download the script and we can just pass parameters for it.

Let me know if this approach makes sense to you.
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #35)
> I will address the comments you addressed.
> I have these two notes for you:
> 
> (In reply to Rail Aliiev [:rail] from comment #34)
> > @@ +5336,5 @@
> > >              log_eval_func=rc_eval_func({0: SUCCESS, None: RETRY}),
> > >          ))
> > > +        if script_from_code and script_manifest:
> > > +            # XXX: this does not retry
> > > +            self.addStep(ShellCommand(
> > 
> > I believe it's RetryingShellCommand
> > 
> That unfortunately forces that we also checkout the tools repository. I've
> tried it.

Ah, right...

> > @@ +5347,5 @@
> > > +                name="set_script_repo_url_and_script_repo_revision",
> > > +                extract_fn=extractProperties,
> > > +                command=['bash', '-c',
> > > +                    WithProperties(
> > > +                    'python repository_from_code.py ' +
> > 
> > Wouldn't it be easier to restrict this part to fetch the manifest instead of
> > running an arbitrary command? Running a command sounds more flexible, but
> > adds a bit of complexity...
> 
> I'm not sure of what you mean. Do you want two steps?
> 
> I want to turn the script into a BuildbotStep so we don't need to download
> the script and we can just pass parameters for it.
> 
> Let me know if this approach makes sense to you.

I don't mind to have 2 steps here. I was thinking that in this case we'd be stricter (no arbitrary commands to run) and easier for dev (you'd be editing the manifest only).
Comment on attachment 8516980 [details] [diff] [review]
[buildbotcustom] use script_from_code and script_manifest

We chatted a bit about this today.

The approach should work.
Attachment #8516980 - Flags: feedback?(rail) → feedback+
Attachment #8516982 - Flags: feedback?(rail) → feedback+
Comment on attachment 8516984 [details] [diff] [review]
[mc] add mozharness.json and create script to expose it as a manifest

Review of attachment 8516984 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mozharness/repository_from_code.py
@@ +36,5 @@
> +            # Fallback to default values for branches where the manifest
> +            # is not defined
> +            print "script_repo_url: %s" % args.default_repo
> +            print "script_repo_revision: %s" % args.default_revision
> +            exit(0)

exit(0) is not necessary here.
Attachment #8516984 - Flags: feedback?(rail) → feedback+
The caching code that just landed requires me to re-think this:
http://hg.mozilla.org/build/buildbotcustom/rev/c0185f80b7fa

The previous code worked because we were clobbering every-time.

We now have to determine in which cases we use the caching and which places we clobber.

I'm thinking of teaching hgtool.py learn how to checkout through a manifest.
Created attachment 8520650 [details] [diff] [review]
[tools][wip] adding the ability of using a manifest for checking out the right repo
Created attachment 8520854 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url (both with caching and without)
Attachment #662024 - Attachment is obsolete: true
Attachment #8516980 - Attachment is obsolete: true
Attachment #8520854 - Flags: review?(rail)
Created attachment 8520856 [details] [diff] [review]
[buildbot-configs] enable checking mozharness through a manifest on try
Attachment #8516982 - Attachment is obsolete: true
Attachment #8520856 - Flags: review?(rail)
Created attachment 8520862 [details] [diff] [review]
[tools] adding the ability of using a manifest for checking out the right repo

If you think that I can move the core functionality to somewhere common let me know.

I see repository_manifest.py to be deprecated in the future once we have the tools repository checkout shared across platforms.
Attachment #8520650 - Attachment is obsolete: true
Attachment #8520862 - Flags: review?(rail)
Created attachment 8520869 [details] [diff] [review]
[mc] add mozharness.json to lock mozharness checkout (only for try)
Attachment #8516984 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Attachment #8520869 - Flags: review?(rail)
Comment on attachment 8520862 [details] [diff] [review]
[tools] adding the ability of using a manifest for checking out the right repo

Sorry for the delay, I was at the conference last week.

I'm not that familiar with hgtool.py internals, forwarding the request to catlee.
Attachment #8520862 - Flags: review?(rail) → review?(catlee)
Comment on attachment 8520869 [details] [diff] [review]
[mc] add mozharness.json to lock mozharness checkout (only for try)

Review of attachment 8520869 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mozharness/mozharness.json
@@ +1,2 @@
> +{
> +    "repo": "http://hg.mozilla.org/build/mozharness",

Can you use https:// instead of http:// here?
Attachment #8520869 - Flags: review?(rail) → review+
Attachment #8520856 - Flags: review?(rail) → review+
Comment on attachment 8520862 [details] [diff] [review]
[tools] adding the ability of using a manifest for checking out the right repo

Review of attachment 8520862 [details] [diff] [review]:
-----------------------------------------------------------------

I kind think this doesn't belong in hgtool.py itself. Something else that fetches the manifest and calls hgtool with the right parameters maybe is cleaner?

What's the justification for putting support for this into hgtool?
Attachment #8520862 - Flags: review?(catlee)
(In reply to Chris AtLee [:catlee] from comment #47)
> Comment on attachment 8520862 [details] [diff] [review]
> [tools] adding the ability of using a manifest for checking out the right
> repo
> 
> Review of attachment 8520862 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I kind think this doesn't belong in hgtool.py itself. Something else that
> fetches the manifest and calls hgtool with the right parameters maybe is
> cleaner?
> 
> What's the justification for putting support for this into hgtool?

I wanted to make it tightly integrated with hgtool so it is available for any future repositories. I guess we could make it with just repository_manifest.py if anyone wants.
Created attachment 8524072 [details] [diff] [review]
[tools] repository_manifest.py
Attachment #8520862 - Attachment is obsolete: true
Attachment #8524072 - Flags: review?(rail)
Created attachment 8524074 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url (both with caching and without)
Attachment #8520854 - Attachment is obsolete: true
Attachment #8520854 - Flags: review?(rail)
Attachment #8524074 - Flags: review?(rail)
Comment on attachment 8524072 [details] [diff] [review]
[tools] repository_manifest.py

Review of attachment 8524072 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/utils/repository_manifest.py
@@ +15,5 @@
> +import urllib2
> +
> +# When an infra error happens we want to turn purple and
> +# let sheriffs determine if re-triggering is needed
> +INFRA_CODE=3

A nit. Surrounding spaces would increase your PEP8 point.

@@ +22,5 @@
> +    '''
> +    Determine which repository and revision mozharness.json indicates.
> +    If none is found we fall back to the default repository
> +    '''
> +    parser = argparse.ArgumentParser(description='Process some arguments.')

The description doesn't look informative. Can you either kill it or rephrase? :)

@@ +47,5 @@
> +            print "We have failed to retrieve the manifest (http code %s)" % http_code
> +            exit(INFRA_CODE)
> +    except Exception, e:
> +        print str(e)
> +        exit(INFRA_CODE)

Can you refactor the code so you don't have multiple exit() calls? Something like

exit_code = 0
if ...:
    print ...
elif:
    print ...
    exit_code = INFRA_CODE
...

# and finally exit for reals
exit(exit_code)
Attachment #8524072 - Flags: review?(rail) → review-
Comment on attachment 8524074 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url (both with caching and without)

Review of attachment 8524074 [details] [diff] [review]:
-----------------------------------------------------------------

::: process/factory.py
@@ +5377,5 @@
> +                    name="set_script_repo_url_and_script_repo_revision",
> +                    extract_fn=extractProperties,
> +                    command=['bash', '-c',
> +                        WithProperties(
> +                        'python %s ' % repository_manifest_path +

Why do you need to use bash -c "python ...." here instead of ["python", "...", "..."]? The latter would be more readable and safer.

@@ +5386,5 @@
> +                    haltOnFailure=True,
> +                ))
> +                script_repo_url = WithProperties('%(script_repo_url)s')
> +            else:
> +                script_repo_url = scriptRepo

Hmm, the else clause is hiding here... Can you refactor it to look like:

script_repo_url = scriptRepo
if script_repo_manifest:
    ...
    ...
    script_repo_url = WithProperties('%(script_repo_url)s')

so you get rid of that block completely.

@@ +5388,5 @@
> +                script_repo_url = WithProperties('%(script_repo_url)s')
> +            else:
> +                script_repo_url = scriptRepo
> +
> +            hgtool_cmd += [script_repo_url, self.script_repo_cache]

Will this work with unmodified hgtool.py (since we are not going to modify it)?

@@ +5419,5 @@
> +                self.addStep(ShellCommand(
> +                    command=['bash', '-c',
> +                             WithProperties('wget -Orepository_manifest.py ' + \
> +                             '--no-check-certificate --tries=10 --waitretry=3 ' + \
> +                             'http://hg.mozilla.org/users/armenzg_mozilla.com/tools/raw-file/default/buildfarm/utils/repository_manifest.py')],

This doesn't look like a final patch, correct? :)

@@ +5428,5 @@
> +                    name="set_script_repo_url_and_script_repo_revision",
> +                    extract_fn=extractProperties,
> +                    command=['bash', '-c',
> +                        WithProperties(
> +                        'python repository_manifest.py ' +

See my comment regarding bash -c above.

@@ +5437,5 @@
> +                    haltOnFailure=True,
> +                ))
> +                script_repo_url = WithProperties('%(script_repo_url)s')
> +            else:
> +                script_repo_url = scriptRepo

Hmm, the else clause is hiding here... Can you refactor it to look like:

script_repo_url = scriptRepo
if script_repo_manifest:
    ...
    ...
    script_repo_url = WithProperties('%(script_repo_url)s')

so you get rid of that block completely.
Attachment #8524074 - Flags: review?(rail) → review-
Created attachment 8524794 [details] [diff] [review]
[tools] repository_manifest.py
Attachment #8524072 - Attachment is obsolete: true
Attachment #8524794 - Flags: review?(rail)
Comment on attachment 8524794 [details] [diff] [review]
[tools] repository_manifest.py

lgtm!
Attachment #8524794 - Flags: review?(rail) → review+
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #48)
> (In reply to Chris AtLee [:catlee] from comment #47)
> > Comment on attachment 8520862 [details] [diff] [review]
> > [tools] adding the ability of using a manifest for checking out the right
> > repo
> > 
> > Review of attachment 8520862 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > I kind think this doesn't belong in hgtool.py itself. Something else that
> > fetches the manifest and calls hgtool with the right parameters maybe is
> > cleaner?
> > 
> > What's the justification for putting support for this into hgtool?
> 
> I wanted to make it tightly integrated with hgtool so it is available for
> any future repositories. I guess we could make it with just
> repository_manifest.py if anyone wants.

I remembered what was the reason.
The reason is that if we use the repo indicated through the manifest we can then determine if we should use the script_repo_cache (in case the default mozharness repo is specified), otherwise, we don't use the cache and clobber whatever other mozharness repo.

Without hgtool.py dealing with that case I doubt I will be able to do that.
Comment on attachment 8524074 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url (both with caching and without)

Review of attachment 8524074 [details] [diff] [review]:
-----------------------------------------------------------------

::: process/factory.py
@@ +5377,5 @@
> +                    name="set_script_repo_url_and_script_repo_revision",
> +                    extract_fn=extractProperties,
> +                    command=['bash', '-c',
> +                        WithProperties(
> +                        'python %s ' % repository_manifest_path +

I think it is a side effect of using extractProperties():
http://hg.mozilla.org/build/buildbotcustom/file/default/process/factory.py#l5262

If I don't, I get this issue:
Upon execvpe python /tools/checkouts/build-tools/buildfarm/utils/repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/7a1c0d2e330/testing/mozharness/mozharness.json ['python /tools/checkouts/build-tools/buildfarm/utils/repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/7a1c0d2e330/testing/mozharness/mozharness.json'] in environment id 139690170502512
:Traceback (most recent call last):
  File "/home/armenzg/workdir/mozharness_try/venv/local/lib/python2.7/site-packages/twisted/internet/process.py", line 414, in _fork
    executable, args, environment)
  File "/home/armenzg/workdir/mozharness_try/venv/local/lib/python2.7/site-packages/twisted/internet/process.py", line 460, in _execChild
    os.execvpe(executable, args, environment)
  File "/home/armenzg/workdir/mozharness_try/venv/lib/python2.7/os.py", line 353, in execvpe
    _execvpe(file, args, env)
  File "/home/armenzg/workdir/mozharness_try/venv/lib/python2.7/os.py", line 368, in _execvpe
    func(file, *argrest)
OSError: [Errno 2] No such file or directory
https://hg.mozilla.org/projects/ash/rev/5749782b8639
Created attachment 8526316 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url (both with caching and without)

Due to the complexity of this patch I'm going to split it into two.
One only dealing with normal path (w/o caching) and the other with caching (for a follow up bug).

Would you nevertheless have a look at this review to see what makes sense?
Attachment #8524074 - Attachment is obsolete: true
Created attachment 8526338 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url without caching
Attachment #8526338 - Flags: review?(rail)
Created attachment 8526339 [details] [diff] [review]
[buildbot-configs] enable manifest checkout through mozharness.json on Ash

Let's start first with Ash. We can roll Try next once we're confident.
Attachment #8526339 - Flags: review?(rail)
Attachment #8526339 - Flags: review?(rail) → review+
Attachment #8526338 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/tools/rev/085f1d0c2a68
(Assignee)

Updated

2 years ago
Attachment #8524794 - Flags: checked-in+
https://hg.mozilla.org/integration/mozilla-inbound/rev/a6cbb4db302b
(Assignee)

Updated

2 years ago
Attachment #8520869 - Flags: checked-in+
Created attachment 8526939 [details] [diff] [review]
[buildbotcustom] set script_repo_url as a properties by default for non-manifest based branches

I caught this on my dry run.
Attachment #8526939 - Flags: review?(rail)
Created attachment 8526945 [details] [diff] [review]
[tools] handle 404 case properly
Attachment #8526945 - Flags: review?(rail)
Attachment #8526945 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/buildbot-configs/rev/198d3b21d967
Attachment #8526939 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/tools/rev/730eb806d023
https://hg.mozilla.org/build/buildbotcustom/rev/95ec68e14777
(Assignee)

Updated

2 years ago
Attachment #8526339 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8526939 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8526945 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8526338 - Flags: checked-in+
In production: https://hg.mozilla.org/build/buildbot-configs/rev/198d3b21d967
https://hg.mozilla.org/mozilla-central/rev/a6cbb4db302b
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
This is live now on Ash.
I want to see if any issues come up there before enable it on Try.
So far so good.

http://armenzg.blogspot.ca/2014/11/pinning-mozharness-from-in-tree-aka.html
https://hg.mozilla.org/build/buildbot-configs/rev/96a1b24e0067
In production: https://hg.mozilla.org/build/buildbot-configs/rev/96a1b24e0067
Comment on attachment 8526338 [details] [diff] [review]
[buildbotcustom] checkout repository according to manifest_url without caching

BTW this was pushed straight to production branch and wasn't picked up by maintenance or change set
Comment on attachment 8520856 [details] [diff] [review]
[buildbot-configs] enable checking mozharness through a manifest on try

this caused try to be closed and bustage due to 10.6: https://tbpl.mozilla.org/php/getParsedLog.php?id=53351343&tree=Try

it looks like this broke ash for the same reason. I have not backed out the ash equivalent in case you want to debug.

backed out:
http://hg.mozilla.org/build/buildbot-configs/rev/1906412f1c56
Attachment #8520856 - Flags: checked-in-

Updated

2 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Created attachment 8528385 [details] [diff] [review]
[tools] use optparse instead of argparse

I'm sorry I should have been more patient and meticulous.

This deals with argparse not being available by switching to optparse.
Attachment #8528385 - Flags: review?(rail)
Comment on attachment 8528385 [details] [diff] [review]
[tools] use optparse instead of argparse

Can you add a check for required options (it was for free in argparse :( )? Calling urlopen with None passed may give more troubles understanding the error message. Something like

if not args.manifest_url or not options.default_repo or not options.default_revision:
    parser.error("blah blah blah")
Attachment #8528385 - Flags: review?(rail) → review-
Created attachment 8528396 [details] [diff] [review]
[tools] use optparse instead of argparse

I also adjusted usage so it looks better in the console.
Attachment #8528385 - Attachment is obsolete: true
Attachment #8528396 - Flags: review?(rail)
With missing option:

$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness 
Usage: 
   Reads a repository manifest and outputs the repo and
   revision/branch in a format digestable for buildbot
   properties ("key: value").


repository_manifest.py: error: You have to call the script with all options
Comment on attachment 8528396 [details] [diff] [review]
[tools] use optparse instead of argparse

Review of attachment 8528396 [details] [diff] [review]:
-----------------------------------------------------------------

,@;@,
        ,@;@;@;@;@;@/ )@;@;
      ,;@;@;@;@;@;@|_/@' e\
     (|@;@:@\@;@;@;@:@(    \
       '@;@;@|@;@;@;@;'`"--'
        '@;@;/;@;/;@;' 
         ) //   | ||
         \ \\   | ||
          \ \\  ) \\  jgs
           `"`  `"``
Attachment #8528396 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/tools/rev/b53b63b5e135
After landing that tools it seems to also be working for Mac:
https://tbpl.mozilla.org/?tree=Ash&rev=6a32858f85ff
https://hg.mozilla.org/build/buildbot-configs/rev/f8f66f175f5f
(Assignee)

Updated

2 years ago
No longer blocks: 1067535
https://hg.mozilla.org/build/buildbotcustom/rev/cb6ce28989c2
In production: https://hg.mozilla.org/build/buildbot-configs/rev/f8f66f175f5f
In production: https://hg.mozilla.org/build/buildbotcustom/rev/cb6ce28989c2
(Assignee)

Updated

2 years ago
Attachment #8528396 - Flags: checked-in+
Created attachment 8530294 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions

This fixes the horrible issues caused by a typo for the repository (bug 1105826).
Attachment #8530294 - Flags: review?(rail)
Comment on attachment 8530294 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions

Review of attachment 8530294 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/utils/repository_manifest.py
@@ +48,5 @@
>          http_code = url_opener.getcode()
>          if http_code == 200:
>              manifest = json.load(url_opener)
> +            repo = manifest["repo"]
> +            revision = manifest["revision"]

These 2 lines may raise ValueError and it'll be caught by the generic except Exception, e: (line 74 in patched file). So, a broken amy bring the infra down again.

Let's refactor this script to return INFA_CODE only on known errors.
Attachment #8530294 - Flags: review?(rail) → review-
Can't type today, broken amy = broken manifest
Created attachment 8530368 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions and handle invalided manifest

All of the cases we have discussed.

armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/valid_manifest.json ; echo $?
script_repo_url: http://hg.mozilla.org/mozilla-central
script_repo_revision: default
0
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/invalid_manifest.json ; echo $?
We have a non-valid json manifest.
Expecting property name: line 4 column 1 (char 82)
1
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/wrong_revision.json ; echo $?
We've had an exception when verifiying the existance of:
 -> http://hg.mozilla.org/mozilla-central/rev/aaaaaaaaaaaaa
HTTP Error 404: Not Found
1
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/wrong_repository.json ; echo $?
We've had an exception when verifiying the existance of:
 -> http://hg.mozilla.org/mozilla-central_typo
HTTP Error 404: Not Found
1
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/http_code404.json --timeout 0.1; echo $?
script_repo_url: https://hg.mozilla.org/build/mozharness
script_repo_revision: production
0
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/mozilla-central/raw-file/default/testing/mozharness/mozharness.json --timeout 0.1; echo $?
SSLError: The read operation timed out
3
Attachment #8530294 - Attachment is obsolete: true
Attachment #8530368 - Flags: review?(rail)
Created attachment 8530386 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions and handle invalided manifest

I have dropped the SSLError case since it was being force due to the insane 0.1 seconds timeout (plus I don't want to deal with manifests that have bad certs).

I have also tested for the bad url and missing url cases:
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/bad_url.json ; echo $?
URLError: <urlopen error unknown url type: htttp>
3
armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/missing_url.json ; echo $?
URLError: <urlopen error [Errno -2] Name or service not known>
3
Attachment #8530368 - Attachment is obsolete: true
Attachment #8530368 - Flags: review?(rail)
Attachment #8530386 - Flags: review?(rail)
Below you can see the output of one of the jobs that failed.
What we want to prevent is to return a code of 0.

We can see that this step is providing a URL to a repo that does not exist and the step should not continue.
> script_repo_url: https://hg.mozilla.org/user/armenzg_mozilla.com/mozharness
> script_repo_revision: default
> program finished with exit code 0


bash -c 'python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/a0271fbe61fe6380fff8f4c260b2345a3ce4e5c3/testing/mozharness/mozharness.json'
 in dir /builds/slave/test/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['bash', '-c', u'python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/a0271fbe61fe6380fff8f4c260b2345a3ce4e5c3/testing/mozharness/mozharness.json']
 environment:
  HOME=/home/cltbld
  LANG=en_US.UTF-8
  LOGNAME=cltbld
  MAIL=/var/mail/cltbld
  NODE_PATH=/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript
  PATH=/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
  PWD=/builds/slave/test/build
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  TMOUT=86400
  USER=cltbld
  XDG_SESSION_COOKIE=5a152710dd62cc268aad214e000002d8-1417043021.702782-1827239843
  _=/tools/buildbot/bin/python
 using PTY: False
script_repo_url: https://hg.mozilla.org/user/armenzg_mozilla.com/mozharness
script_repo_revision: default
program finished with exit code 0
elapsedTime=0.898924
script_repo_revision: 'default'
script_repo_url: 'https://hg.mozilla.org/user/armenzg_mozilla.com/mozharness'
========= Finished set props: script_repo_revision script_repo_url (results: 0, elapsed: 3 secs) (at 2014-11-26 15:07:56.869293) =========
...
========= Started 'hg clone ...' failed (results: 5, elapsed: 1 secs) (at 2014-11-26 15:08:03.344679) =========
hg clone https://hg.mozilla.org/user/armenzg_mozilla.com/mozharness scripts
 in dir /builds/slave/test/. (timeout 1320 secs)
 watching logfiles {}
 argv: ['hg', 'clone', 'https://hg.mozilla.org/user/armenzg_mozilla.com/mozharness', 'scripts']
 environment:
  HOME=/home/cltbld
  LANG=en_US.UTF-8
  LOGNAME=cltbld
  MAIL=/var/mail/cltbld
  NODE_PATH=/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript
  PATH=/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
  PWD=/builds/slave/test
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  TMOUT=86400
  USER=cltbld
  XDG_SESSION_COOKIE=5a152710dd62cc268aad214e000002d8-1417043021.702782-1827239843
  _=/tools/buildbot/bin/python
 using PTY: False
abort: HTTP Error 404: Not Found
program finished with exit code 255
Comment on attachment 8530386 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions and handle invalided manifest

Review of attachment 8530386 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/utils/repository_manifest.py
@@ +42,5 @@
>         not options.default_repo or \
>         not options.default_revision:
>          parser.error("You have to call the script with all options")
>  
> +    exit_code = SUCCESS_CODE

Can you set the default exit code to FAILURE_CODE to be super duper sure we never set it to SUCCESS_CODE implicitly?

@@ +52,5 @@
> +                manifest = json.load(url_opener)
> +                repo = manifest["repo"]
> +                revision = manifest["revision"]
> +                # Let's determine if the repo and revision exist
> +                urllib2.urlopen(repo, timeout=options.timeout)

You can drop the line above since you are checking the exact URL below.

@@ +55,5 @@
> +                # Let's determine if the repo and revision exist
> +                urllib2.urlopen(repo, timeout=options.timeout)
> +                urllib2.urlopen('%s/rev/%s' % (repo, revision), timeout=options.timeout)
> +                print "script_repo_url: %s" % repo
> +                print "script_repo_revision: %s" % revision

You'll need to set exit_code = SUCCESS_CODE when you change the default exit code

@@ +59,5 @@
> +                print "script_repo_revision: %s" % revision
> +            except urllib2.HTTPError, e:
> +                print "We've had an exception when verifiying the " \
> +                        "existance of:\n -> %s" % e.url
> +                print str(e)

Can you unify exception handling? traceback.print_exc() or something similar. I'd prefer logging.exception() though.

@@ +75,5 @@
> +        if e.getcode() == 404:
> +            # Fallback to default values for branches where the manifest
> +            # is not defined
> +            print "script_repo_url: %s" % options.default_repo
> +            print "script_repo_revision: %s" % options.default_revision

You'll need to set exit_code = SUCCESS_CODE when you change the default exit code

@@ +80,5 @@
> +        else:
> +            traceback.print_exc()
> +            exit_code = FAILURE_CODE
> +    except urllib2.URLError, e:
> +        print traceback.format_exc().splitlines()[-1]

Can you use the same approach here you used below? I mean 

traceback.print_exc()

In the future, it'd be better to use logging for this kind of stuff. log.exeption("blah") is more informative than a traceback.
Created attachment 8530397 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions and handle invalided manifest

Is this better?

(everything)armenzg@armenzg-thinkpad:~/repos/tools/buildfarm/utils$ for i in `cd repository_manifest_test_manifests && ls *json && cd ..`; do echo "## $i ##" && python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/$i; echo $?; done
## bad_url.json ##
ERROR:root:URLError for htttp://mozilla.org/rev/default
Traceback (most recent call last):
  File "repository_manifest.py", line 59, in main
    urllib2.urlopen(url, timeout=options.timeout)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 427, in _open
    'unknown_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1247, in unknown_open
    raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: htttp>
3
## invalid_manifest.json ##
ERROR:root:We have a non-valid json manifest.
Traceback (most recent call last):
  File "repository_manifest.py", line 54, in main
    manifest = json.load(url_opener)
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 4 column 1 (char 82)
1
## missing_url.json ##
ERROR:root:URLError for http://example.domain.dont.exist/blah/rev/default
Traceback (most recent call last):
  File "repository_manifest.py", line 59, in main
    urllib2.urlopen(url, timeout=options.timeout)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno -2] Name or service not known>
3
## valid_manifest.json ##
script_repo_url: http://hg.mozilla.org/mozilla-central
script_repo_revision: default
0
## wrong_repository.json ##
ERROR:root:http://hg.mozilla.org/mozilla-central_typo/rev/default
Traceback (most recent call last):
  File "repository_manifest.py", line 59, in main
    urllib2.urlopen(url, timeout=options.timeout)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found
1
## wrong_revision.json ##
ERROR:root:http://hg.mozilla.org/mozilla-central/rev/aaaaaaaaaaaaa
Traceback (most recent call last):
  File "repository_manifest.py", line 59, in main
    urllib2.urlopen(url, timeout=options.timeout)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found
1
Attachment #8530397 - Flags: review?(rail)
Comment on attachment 8530397 [details] [diff] [review]
[tools] do not allow non-existant repositories or revisions and handle invalided manifest

Review of attachment 8530397 [details] [diff] [review]:
-----------------------------------------------------------------

\o/
Attachment #8530397 - Flags: review?(rail) → review+
Attachment #8530386 - Attachment is obsolete: true
Attachment #8530386 - Flags: review?(rail)
(Assignee)

Updated

2 years ago
Blocks: 1106143
https://hg.mozilla.org/build/tools/rev/c554d1982e43
(Assignee)

Updated

2 years ago
Attachment #8530397 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8520856 - Flags: checked-in- → checked-in+
Created attachment 8531057 [details] [diff] [review]
[buildbot-configs] enable try mozharness for mobile and b2g jobs + reset Ash/Cypress to be normal twig repos

This is the last thing I will be needing to close this bug.

This also resets Ash and Cypress to be normal twig repos.

Ash will have the bonus of having script_repo_manifest ability like try.

After this we can remove ash-mozharness.
Attachment #8531057 - Flags: review?(rail)
Comment on attachment 8531057 [details] [diff] [review]
[buildbot-configs] enable try mozharness for mobile and b2g jobs + reset Ash/Cypress to be normal twig repos

Review of attachment 8531057 [details] [diff] [review]:
-----------------------------------------------------------------

The patch looks ok to me. Over to Jordan to confirm that we don't need to use the mozharness-ash branch on ash anymore.
Attachment #8531057 - Flags: review?(rail)
Attachment #8531057 - Flags: review?(jlund)
Attachment #8531057 - Flags: review+
https://hg.mozilla.org/build/buildbot-configs/rev/93c8973c10c1
rail, jlund says that letting Ash point to the main mozharness repo.
He can land a mozharness.json change if he needs to change it.
Attachment #8531057 - Flags: review?(jlund)
Comment on attachment 8526939 [details] [diff] [review]
[buildbotcustom] set script_repo_url as a properties by default for non-manifest based branches

Review of attachment 8526939 [details] [diff] [review]:
-----------------------------------------------------------------

No reason we need the slave to echo this out. Please change this to use buildbotcustom.steps.misc.SetBuildProperty.
Attachment #8526939 - Flags: review-
(Assignee)

Updated

2 years ago
Attachment #8531057 - Flags: checked-in+
In production: https://hg.mozilla.org/build/buildbot-configs/rev/93c8973c10c1
https://hg.mozilla.org/build/tools/rev/b5d2f2b01b68
It seems that with all the recent network/hg/ftp issues a 30 seconds timeout is still being hit once in a while on try for the retrieving of mozharness.json from hg web.
The workaround for now is to retry the job.

Please don't back it out; it is not happening often enough, I'm watching Try for retriggers and I will have a fix soon.
I will be working a bit this weekend to have a fix by Monday (I already have a wip locally).
This is for the repository_manifest.py code that lives in tools.

bash -c 'python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/399c9fd1db437d8923e4d61e607e2eac59f47a8f/testing/mozharness/mozharness.json'
 in dir /builds/slave/test/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['bash', '-c', u'python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/399c9fd1db437d8923e4d61e607e2eac59f47a8f/testing/mozharness/mozharness.json']
 environment:
  HOME=/home/cltbld
  LANG=en_US.UTF-8
  LOGNAME=cltbld
  MAIL=/var/mail/cltbld
  NODE_PATH=/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript
  PATH=/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
  PWD=/builds/slave/test/build
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  TMOUT=86400
  USER=cltbld
  XDG_SESSION_COOKIE=5a152710dd62cc268aad214e000002d8-1417848268.568816-551144976
  _=/tools/buildbot/bin/python
 using PTY: False
ERROR:root:URLError for https://hg.mozilla.org/build/mozharness/rev/production
Traceback (most recent call last):
  File "repository_manifest.py", line 58, in main
    urllib2.urlopen(url, timeout=options.timeout)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
    raise URLError(err)
URLError: <urlopen error timed out>
program finished with exit code 3
elapsedTime=31.216216
Created attachment 8532864 [details] [diff] [review]
[tools] use retry logic to fetch manifest

I would like to land this on Monday or sooner if possible (I don't know if we're still going to see some of these intermittent timeouts).

gps:Any ideas as to how I could determine if these timeouts are still happening?
I wonder if there is some sort of logging of failed hg web fetches.
Flags: needinfo?(gps)
Attachment #8532864 - Flags: review?(rail)
Comment on attachment 8532864 [details] [diff] [review]
[tools] use retry logic to fetch manifest

Review of attachment 8532864 [details] [diff] [review]:
-----------------------------------------------------------------

It looks good to me. Please coordinate the landing with buildutty.

Extra bonus if you fix 4 PEP8 space violations.

::: buildfarm/utils/repository_manifest.py
@@ +29,5 @@
>  INFRA_CODE = 3
>  
> +# Logic based on lib/python/util/retry.py
> +# The day that we don't wget repository_manifest.py
> +# we can import directly the functionality

I would rather use retrier() instead. It gives you more control of how you handle possible exceptions. For example, I'd retry on HTTP 500 error, but retry() won't distinguish HTTP 404 and 500, because both are HTTPErrors.

See example here: http://hg.mozilla.org/build/tools/file/5561325c8115/lib/python/util/hg.py#l320
Attachment #8532864 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/tools/rev/403e6f08b968
rail and I spoke on IRC and I will make the changes requested in comment 105 in a follow patch.

Re-triggered jobs in Try seem to be working well.
I will review Try again later today.
The timeout issues to hg.mozilla.org are concerning to me. I didn't fully realize until last week how serious the problem was.

Server-side logs aren't enough to diagnose these failures: we need client-side metrics and logs. There's just too many things between buildbot slaves and hg.mozilla.org origin servers that can go wrong.

Feel free to file a bug against hg.mozilla.org with details of some failures and maybe we can track things down.
Flags: needinfo?(gps)
(In reply to Gregory Szorc [:gps] from comment #108)
> The timeout issues to hg.mozilla.org are concerning to me. I didn't fully
> realize until last week how serious the problem was.
> 
> Server-side logs aren't enough to diagnose these failures: we need
> client-side metrics and logs. There's just too many things between buildbot
> slaves and hg.mozilla.org origin servers that can go wrong.
> 
> Feel free to file a bug against hg.mozilla.org with details of some failures
> and maybe we can track things down.

Filed as bug 1109393
(Assignee)

Updated

2 years ago
Attachment #8532864 - Flags: checked-in+
Created attachment 8534627 [details] [diff] [review]
[custom] talos try mozharness support

Tested locally. Works as expected.
Attachment #8534627 - Flags: review?(rail)
Attachment #8534627 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/buildbotcustom/rev/d199583ea0a8
(Assignee)

Updated

2 years ago
Blocks: 1110286
Created attachment 8535260 [details] [diff] [review]
[custom] script_repo_manifest + caching support
Attachment #8535260 - Flags: review?(rail)
Created attachment 8535261 [details] [diff] [review]
[tools] repository_manifest.py can help determine to either use shared checkout or a different checkout location
Attachment #8535261 - Flags: review?(rail)
Created attachment 8535263 [details] [diff] [review]
[configs] enable mozharness try support for builds

We should wait for the other two to land first.
Attachment #8535263 - Flags: review?(rail)
mshal, the last three patches help you with what you need.
Remember that you have to specify a revision on the buildbot builder or you will hit bug 1109894.

If you could see a build running with your user repo all the way to completion please let me know.

I can only test that:
A) we use the right repo, revision
B) we call a script in the shared checkout OR the script from user repo (which lives under $basedir/build/scripts)

jlund, the tools patch makes the intelligent choice of choosing the shared checkout.
Flags: needinfo?(mshal)
Flags: needinfo?(jlund)
In production: https://hg.mozilla.org/build/buildbotcustom/rev/d199583ea0a8
Created attachment 8535731 [details] [diff] [review]
[tools] repository_manifest.py can help determine to either use shared checkout or a different checkout location

We should not forget that we need to support the old approach.
Attachment #8535261 - Attachment is obsolete: true
Attachment #8535261 - Flags: review?(rail)
Flags: needinfo?(mshal)
Flags: needinfo?(jlund)
Attachment #8535731 - Flags: review?(rail)
Output for try _build_ job which has caching:
script_repo_url: https://hg.mozilla.org/build/mozharness
script_repo_revision: production
script_repo_checkout: /tools/checkouts/mozharness

Output for try _build_ job which has caching but it points to user repo:
script_repo_url: https://hg.mozilla.org/users/armenzg_mozilla.com/mozharness
script_repo_revision: 1070041_move_to_tree
script_repo_checkout: /home/armenzg/workdir/mozharness_try/slaves/try/try-l64-0000000000000000000000/build/scripts

Output of a test job:
script_repo_url: https://hg.mozilla.org/build/mozharness
script_repo_revision: production

Code used in each case:
/tools/checkouts/build-tools/buildfarm/utils/repository_manifest.py - patched locally
http://hg.mozilla.org/users/armenzg_mozilla.com/tools/raw-file/default/buildfarm/utils/repository_manifest.py - patched user repo
Created attachment 8535744 [details] [diff] [review]
[configs] enable mozharness try support for builds on Ash

Let's start first with Ash. Just in case; you never know.
Attachment #8535744 - Flags: review?(rail)
I mentioned this in IRC, but as far as I can tell from my dev-master, these patches appear to be working. I do have lots of little changes locally, so I'm not sure how valid of a test it is overall, but I hope it lends some confidence to the approach at least.
Comment on attachment 8535260 [details] [diff] [review]
[custom] script_repo_manifest + caching support

Jordan has gracefully volunteered.
Attachment #8535260 - Flags: review?(rail) → review?(jlund)
(Assignee)

Updated

2 years ago
Attachment #8535263 - Flags: review?(rail) → review?(jlund)
(Assignee)

Updated

2 years ago
Attachment #8535731 - Flags: review?(rail) → review?(jlund)
(Assignee)

Updated

2 years ago
Attachment #8535744 - Flags: review?(rail) → review?(jlund)
(Assignee)

Updated

2 years ago
Attachment #8535260 - Flags: review?(jlund) → review?(rail)
(Assignee)

Updated

2 years ago
Attachment #8535263 - Flags: review?(jlund) → review?(rail)
(Assignee)

Updated

2 years ago
Attachment #8535731 - Flags: review?(jlund) → review?(rail)
(Assignee)

Updated

2 years ago
Attachment #8535744 - Flags: review?(jlund) → review?(rail)
Comment on attachment 8535260 [details] [diff] [review]
[custom] script_repo_manifest + caching support

This patch doesn't apply. Can you refresh it?
Attachment #8535260 - Flags: review?(rail)
Attachment #8535263 - Flags: review?(rail) → review+
Attachment #8535744 - Flags: review?(rail) → review+
Comment on attachment 8535731 [details] [diff] [review]
[tools] repository_manifest.py can help determine to either use shared checkout or a different checkout location

Review of attachment 8535731 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/utils/repository_manifest.py
@@ +84,5 @@
>      parser.add_option("--manifest-url", dest="manifest_url")
>      parser.add_option("--default-repo", dest="default_repo")
>      parser.add_option("--default-revision", dest="default_revision")
> +    parser.add_option("--default-checkout", dest="default_checkout")
> +    parser.add_option("--checkout", dest="checkout")

Can you add, help="" argument to clarify the difference? It would be great to add help to other options too (not blocking on that though).
Attachment #8535731 - Flags: review?(rail) → review-
Attachment #8535260 - Flags: review+
Created attachment 8536641 [details] [diff] [review]
[tools] repository_manifest.py can help determine to either use shared checkout or a different checkout location
Attachment #8535731 - Attachment is obsolete: true
Attachment #8536641 - Flags: review?(rail)
https://hg.mozilla.org/build/buildbot-configs/rev/93cf69c65c3b
https://hg.mozilla.org/build/buildbotcustom/rev/8e7940ff9558
Attachment #8536641 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/tools/rev/154c99255467
(Assignee)

Updated

2 years ago
Attachment #8534627 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8535260 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8535744 - Flags: checked-in+
(Assignee)

Updated

2 years ago
Attachment #8536641 - Flags: checked-in+
https://tbpl.mozilla.org/?tree=Try&rev=aef133c113a0
In production: https://hg.mozilla.org/build/buildbot-configs/rev/93cf69c65c3b
In production: https://hg.mozilla.org/build/buildbotcustom/rev/8e7940ff9558
https://hg.mozilla.org/build/buildbot-configs/rev/333ff4057cb9
I see *some* build jobs using the repository_manifest.py on Ash:
https://tbpl.mozilla.org/?tree=Ash&jobname=build

It seems that b2g-desktop, android and mulet jobs do not use ScriptFactory but MozillaBuildFactory:
http://hg.mozilla.org/build/buildbotcustom/file/default/process/factory.py#l402

I have landed a change to enable mozharness repo caching on Ash:
http://hg.mozilla.org/build/buildbot-configs/rev/333ff4057cb9

Once that lands we can test this new configuration before we enable it for try.

NOTE: Even if all we get now out of this latter work is that we have control over Firefox desktop mozharness on Try; we will also be getting ready for the day when we don't checkout tools and mozharness every single time for test jobs.
In production: https://hg.mozilla.org/build/buildbot-configs/rev/333ff4057cb9
Merging mc to ash:
https://tbpl.mozilla.org/?tree=Ash&jobname=build
Created attachment 8538021 [details] [diff] [review]
[tools] Let's create future checkout if not yet ready

To deal with the Linux machines not having build/scripts ready:
https://tbpl.mozilla.org/?tree=Ash&jobname=build


bash -c 'python /tools/checkouts/build-tools/buildfarm/utils/repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --default-checkout /tools/checkouts/mozharness --checkout /builds/slave/ash-lx-00000000000000000000000/build/scripts --manifest-url https://hg.mozilla.org/projects/ash/raw-file/393ee2a64e6243ec31b0a4f9795e842177588366/testing/mozharness/mozharness.json'
 in dir /builds/slave/ash-lx-00000000000000000000000/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['bash', '-c', u'python /tools/checkouts/build-tools/buildfarm/utils/repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --default-checkout /tools/checkouts/mozharness --checkout /builds/slave/ash-lx-00000000000000000000000/build/scripts --manifest-url https://hg.mozilla.org/projects/ash/raw-file/393ee2a64e6243ec31b0a4f9795e842177588366/testing/mozharness/mozharness.json']
...
script_repo_url: https://hg.mozilla.org/build/ash-mozharness
script_repo_revision: default
The specified checkout (/builds/slave/ash-lx-00000000000000000000000/build/scripts) does not exist.
program finished with exit code 1
Attachment #8538021 - Flags: review?(rail)
Created attachment 8538109 [details] [diff] [review]
[tools] stop checking for the existance of the checkout dir since hg will create it
Attachment #8538109 - Flags: review?(rail)
Attachment #8538109 - Flags: review?(rail) → review+
Attachment #8538021 - Attachment is obsolete: true
Attachment #8538021 - Flags: review?(rail)
https://hg.mozilla.org/build/tools/rev/2b272a6d73f2
(Assignee)

Updated

2 years ago
Attachment #8538109 - Flags: checked-in+
It seems I have to wait a bit before the Linux machines have an up-to-date tools checkout.
Once runner starts buildbot, it will mean that the tools/mozharness shared checkout will be out-of-date until that machine takes a job. Filed bug 1112873

https://tbpl.mozilla.org/?tree=Ash&jobname=build
https://hg.mozilla.org/build/buildbot-configs/rev/88ec902944f6
In production: https://hg.mozilla.org/build/buildbot-configs/rev/88ec902944f6
Excellent! [1][2][3][4]

glandium: you should now be able to point to your user mozharness by editing testing/mozharness/mozharness.json.

I will not backout for the following issue since the normal workflow (not touching mozharness.json) works as expected.

The only issue I have encountered is that the linux64-br-haz_try_dep builds don't work when I point them to a user repo (filed as bug 1113144):
https://treeherder.mozilla.org/#/jobs?repo=try&filter-searchStr=haz
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/armenzg@mozilla.com-8e764a594c80/try-linux64-br-haz/linux64-br-haz_try_dep-bm87-try1-build2482.txt.gz
It seems that normal workflow works so I won't back out.
06:31:42     INFO - Copy/paste: /builds/slave/l64-br-haz_try_dep-00000000000/build/scripts/external_tools/clobberer.py -s scripts -s logs -s buildprops.json -s token -s oauth.txt -t 168 https://api.pub.build.mozilla.org/clobberer/lastclobber try linux64-br-haz_try_dep l64-br-haz_try_dep-00000000000 b-linux64-hp-0002 http://buildbot-master87.srv.releng.scl3.mozilla.com:8101/
06:31:42    ERROR - caught OS error 2: No such file or directory while running ['/builds/slave/l64-br-haz_try_dep-00000000000/build/scripts/external_tools/clobberer.py', '-s', 'scripts', '-s', 'logs', '-s', 'buildprops.json', '-s', 'token', '-s', 'oauth.txt', '-t', '168', 'https://api.pub.build.mozilla.org/clobberer/lastclobber', u'try', u'linux64-br-haz_try_dep', 'l64-br-haz_try_dep-00000000000', u'b-linux64-hp-0002', u'http://buildbot-master87.srv.releng.scl3.mozilla.com:8101/']
06:31:42    ERROR - <bound method SpidermonkeyBuild.run_command of <__main__.SpidermonkeyBuild object at 0xe0b790>> failed after 3 tries!
06:31:42    FATAL - failed to clobber build


[1] With script_repo_cache
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jcoppeard@mozilla.com-ab7931ce4450/try-linux/try-linux-bm83-try1-build5167.txt.gz
script_repo_url: https://hg.mozilla.org/build/mozharness
script_repo_revision: production
script_repo_checkout: /tools/checkouts/mozharness

[2] Without script_repo_cache
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jcoppeard@mozilla.com-ab7931ce4450/try-macosx64-debug/try-macosx64-debug-bm83-try1-build14606.txt.gz
script_repo_url: https://hg.mozilla.org/build/mozharness
script_repo_revision: production

[3] With user repo (script_repo_cache)
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/armenzg@mozilla.com-8e764a594c80/try-linux/try-linux-bm76-try1-build4432.txt.gz
script_repo_url: https://hg.mozilla.org/users/armenzg_mozilla.com/mozharness
script_repo_revision: default
script_repo_checkout: /builds/slave/try-lx-00000000000000000000000/build/scripts

[4] With user repo (no script_repo_cache)
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/armenzg@mozilla.com-8e764a594c80/try-macosx64-debug/try-macosx64-debug-bm87-try1-build17412.txt.gz
script_repo_url: https://hg.mozilla.org/users/armenzg_mozilla.com/mozharness
script_repo_revision: default
This is basically done.
I will leave it open to deal with the refactoring agreed about using retrier().

I have a loaner so I might try to fix bug 1113144. Contributors are welcome.
Created attachment 8546145 [details] [diff] [review]
[tools] refactor + use retrier from retry.py + fix retrying for second file fetch

rail, my apologies for the big refactor. I tried to make it more readable.

This still passes all of my tests.

This fixes two issues:
* switching from retry to retrier
** requested earlier on this bug
* add retrier functionality to the second urlopen request
** this was discovered during a corruption of an hgweb this week

No rush for this review.
This is the last patch for this bug to be completed.
Attachment #8546145 - Flags: review?(rail)
Comment on attachment 8546145 [details] [diff] [review]
[tools] refactor + use retrier from retry.py + fix retrying for second file fetch

Review of attachment 8546145 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/utils/repository_manifest.py
@@ +70,5 @@
> +    parser.add_option("--debug", dest="debug", action="store_true",
> +                      default=False, help="Enable debug logging.")
> +
> +    global options
> +    options, args = parser.parse_args()

Instead of using "global options" can you pass them around as an argument instead?

@@ +86,5 @@
> +    if options.debug:
> +        log.setLevel(logging.DEBUG)
> +        log.info("Setting DEBUG logging.")
> +
> +    return options, args

Exactly like this. You don't need to have it global if you return it.

@@ +93,5 @@
> +    '''
> +    This function simply fetches a manifest from a URL.
> +    If succesful it returns the manifest.
> +    '''
> +    global exit_code

Can you return a tuple of exit code and manifest instead of modifying a global variable?

@@ +125,5 @@
> +                if e.getcode() == 404:
> +                    # Fallback to default values for branches where the manifest
> +                    # is not defined
> +                    print "script_repo_url: %s", options.default_repo
> +                    print "script_repo_revision: %s", options.default_revision

you have print_values(), why not use it?

@@ +150,5 @@
> +    except:
> +        log.exception("Unknown case")
> +        exit_code = FAILURE_CODE
> +
> +    return manifest

I mean:

return (exit_code, manifest)

This is not ideal though...

@@ +157,5 @@
> +    '''
> +    We validate that the URL passes is existent.
> +    If not, we dump the exception's infromation and change the global exit_code.
> +    '''
> +    global exit_code

The same here.

@@ +237,5 @@
> +        repo = manifest["repo"]
> +        revision = manifest["revision"]
> +        url = '%s/rev/%s' % (repo, revision)
> +        if valid_repository_revision(url):
> +            print_values(repo, revision)

Because you change exit_code somewhere else, from reading this part I had an impression that that the script always exits with FAILURE_CODE. Can I convince you to refactor the approach using globals?

Also, do you think that it'd be great to have unittests for this script? Once we had an issue caused a tree closure. Plus the script tries to cover more edge cases and could be more fragile than before. What do you think?
Attachment #8546145 - Flags: review?(rail) → review-
(Assignee)

Updated

2 years ago
Depends on: 1113144
(Assignee)

Updated

2 years ago
Attachment #8535263 - Flags: checked-in+
Created attachment 8547734 [details] [diff] [review]
[tools] refactor + use retrier from retry.py + fix retrying for second file fetch

I currently have a contributor working on adding the tests (bug 1109221).
I also run the tests locally before asking you for review [1]

I have stopped in this patch using options as a global variable but I instead pass it around.
I am using print_values in the other case as well.
I keep using global for exit_code.

With regards to exit_code, we discussed a while back that we would start with a failure code as the initial value and change to success in only those cases. In other cases we're interested to set it to INFRA_CODE instead of FAILURE_CODE to indicate to sheriffs that we hit a known infrastructure issue rather than a man produced issues (generally typos).

[1]
python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/7a1c0d2e3300/testing/mozharness/mozharness.json --timeout 2 --max-retries 1 --sleeptime 0 --debug; echo $?

for i in `cd repository_manifest_tests && ls *json && cd ..`; do echo "## $i ##" && python repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url http://localhost:8080/repository_manifest_tests/$i --timeout 5 --max-retries 1 --sleeptime 0; echo $?; done
Attachment #8546145 - Attachment is obsolete: true
Attachment #8547734 - Flags: review?(rail)
Comment on attachment 8547734 [details] [diff] [review]
[tools] refactor + use retrier from retry.py + fix retrying for second file fetch

Review of attachment 8547734 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildfarm/utils/repository_manifest.py
@@ +213,5 @@
> +            checkout = options.checkout
> +
> +        print "script_repo_checkout: %s" % checkout
> +
> +    return SUCCESS_CODE

The return value is not used anywhere. r+ with the line above removed (to avoid confusion).
Attachment #8547734 - Flags: review?(rail) → review+
Created attachment 8549008 [details] [diff] [review]
mh_try_tools.diff

The interdiff is minimal but I would like you to have a look at it.

I will be landing this tomorrow and watch Try closely.
Below you can see both success scenarios.

armenzg@armenzg-thinkpad:/tools/checkouts/build-tools$ python buildfarm/utils/repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/releases/mozilla-release/raw-file/default/testing/mozharness/mozharness.json --timeout 2 --max-retries 1 --sleeptime 0 --debug; echo $?
2015-01-14 12:23:42,896 Setting DEBUG logging.
2015-01-14 12:23:42,897 Fetching https://hg.mozilla.org/releases/mozilla-release/raw-file/default/testing/mozharness/mozharness.json
2015-01-14 12:23:42,897 Attempt number 1 out of 1
2015-01-14 12:23:44,188 About to print the values...
script_repo_url: https://hg.mozilla.org/build/mozharness
script_repo_revision: production
0
armenzg@armenzg-thinkpad:/tools/checkouts/build-tools$ python buildfarm/utils/repository_manifest.py --default-repo https://hg.mozilla.org/build/mozharness --default-revision production --manifest-url https://hg.mozilla.org/try/raw-file/7a1c0d2e3300/testing/mozharness/mozharness.json --timeout 2 --max-retries 1 --sleeptime 0 --debug; echo $?
2015-01-14 12:23:53,216 Setting DEBUG logging.
2015-01-14 12:23:53,216 Fetching https://hg.mozilla.org/try/raw-file/7a1c0d2e3300/testing/mozharness/mozharness.json
2015-01-14 12:23:53,216 Attempt number 1 out of 1
2015-01-14 12:23:54,469 Attempting to load manifest...
2015-01-14 12:23:54,470 Contents of manifest: {u'repo': u'http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness', u'revision': u'default'}
2015-01-14 12:23:54,470 We have a manifest {u'repo': u'http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness', u'revision': u'default'}
2015-01-14 12:23:54,471 Making sure that http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness/rev/default is a valid URL.
2015-01-14 12:23:54,471 Attempt number 1 out of 1
2015-01-14 12:23:54,870 We have been able to reach http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness/rev/default
2015-01-14 12:23:54,871 About to print the values...
script_repo_url: http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness
script_repo_revision: default
0
Attachment #8547734 - Attachment is obsolete: true
Attachment #8549008 - Flags: review?(rail)
Comment on attachment 8549008 [details] [diff] [review]
mh_try_tools.diff

the inteddiff lgtm
Attachment #8549008 - Flags: review?(rail) → review+
https://hg.mozilla.org/build/tools/rev/a3be7be13c78
Created attachment 8549646 [details] [diff] [review]
[tools] refactor + fork retrier from retry.py + fix retrying for second file fetch

We forgot that we download repository_manifest.py and there is no tools checkout.
Forking retrier instead of importing it.
I tested this locally. I will look into a way of testing it on Ash.

> Traceback (most recent call last):
>   File "repository_manifest.py", line 27, in <module>
>     from util.retry import retrier
> ImportError: No module named util.retry
Attachment #8549008 - Attachment is obsolete: true
Attachment #8549646 - Flags: review?(rail)
Created attachment 8549684 [details] [diff] [review]
[configs] allow to point to a different repository_manifest.py
Attachment #8549684 - Flags: review?(rail)
Created attachment 8549685 [details] [diff] [review]
[custom] allow to point to a different repository_manifest.py
Attachment #8549685 - Flags: review?(rail)
We can also just do the tools patch since I don't foresee anymore refactorings of repository_manifest.py
Attachment #8549684 - Flags: review?(rail) → review+
Attachment #8549685 - Flags: review?(rail) → review+
Attachment #8549646 - Flags: review?(rail) → review+
(Assignee)

Updated

2 years ago
Attachment #8549646 - Flags: checked-in+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c0d30d0b82fd
https://hg.mozilla.org/build/tools/rev/b0dc9e1cd9b9
Comment on attachment 8549684 [details] [diff] [review]
[configs] allow to point to a different repository_manifest.py

Since we did not make use of this I will obsolete it.
I don't expect any further developments, however, we have the patches obsoleted in here if we ever have to.
We never ended up gaining any value of having repository_manifest.py in the tools repo since we ended up needing to fork the retrier function.
Attachment #8549684 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Attachment #8549685 - Attachment is obsolete: true
All done in here! :)
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago2 years ago
Resolution: --- → FIXED
To summarize what happened last in this bug, we added retry logic to the second hg web fetch [1] that we have in repository_manifest.py [1].

This solved the temporary issue that we saw in bug 1118267 which affected certain try users who tried to use their own mozharness repositories, however, one of the hg webs was corrupted and we had no retry logic for this situation.

[1] http://hg.mozilla.org/build/tools/file/b0dc9e1cd9b9/buildfarm/utils/repository_manifest.py#l187
You need to log in before you can comment on or make changes to this bug.