Closed Bug 1234500 Opened 8 years ago Closed 7 years ago

allow for ./mach talos-test to download tp5 pageset and use it properly

Categories

(Testing :: Talos, defect)

defect
Not set
normal

Tracking

(firefox53 fixed)

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: jmaher, Assigned: rwood)

References

Details

Attachments

(3 files, 1 obsolete file)

right now we don't have an easy way to download the pageset whilst running ./mach talos-test.  This should change!  Currently we download via a location in talos.json:
"pagesets_url": "http://talos-bundles.pvt.build.mozilla.org/zips/tp5n.zip"

from:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos.json?case=true&from=talos.json#47

I would like to either move that to a location that both local and automation can access, or add a secondary field for when local automation is run that we use a different url.

As a hack, we could take the pageset_url and hard code it if we are running locally.  I would prefer not to do that, but it would work :)
:Callek, is there a resource that local runs and buildbot can access the same file server?
Flags: needinfo?(bugspam.Callek)
Resolving this would solve bug 1185539 I believe the two bugs are duplicates. Feel free to reopen if I'm wrong.
we are confirming there are no issues with uploading these files, then the plan is to host the .zip file on tooltool which can be accessed by automation and developers.
Flags: needinfo?(bugspam.Callek)
ok - so would it be possible to have a md5sum of the file content somewhere also ? To allow automation to download the file once, but update in case it changes.
(In reply to Julien Pagès (:parkouss) from comment #5)
> ok - so would it be possible to have a md5sum of the file content somewhere
> also ? To allow automation to download the file once, but update in case it
> changes.

thats basically what tooltool does, (noteworthy, my commitment to upload to tooltool does not mean I will also change the automation to use it there though).  https://api.pub.build.mozilla.org/docs/usage/tooltool/
Oh excellent then. Thanks :Callek for the link!
:callek, as discussed in bug 1234611, we have authorization to upload tp5n.zip (http://people.mozilla.org/~jmaher/taloszips/zips/tp5n.zip) to tooltool.  This will be for our automation to access as well as the public (such as mach talos-test which a developer could run).  There should not need to be credentials here, as this is a resource for reproducing tests that any third party should have access to.
Flags: needinfo?(bugspam.Callek)
Attached file manifest.tt
Uploaded to tooltool, attached is the tooltool manifest this created:

callek@Centaurus:~/tmp$ python src/tooltool.py upload --authentication-file=~/.tooltool-token --message="Bug 1234500 - allow for ./mach talos-test to download tp5 pageset"
INFO - tp5n.zip: starting upload
INFO - tp5n.zip: uploaded
INFO - notifying server of upload completion for tp5n.zip
Flags: needinfo?(bugspam.Callek)
Joel, I'll be happy to work on this bug - I can look at it early next week.
Callek, how do I access this link?  I am not that familiar with tooltool and looking on dxr for other examples I find a lot of references to https://api.pub.build.mozilla.org/login_request?next=%2Ftooltool%2F

I don't think that is what we are looking for to access this.

:parkouss, let me assign this to you and see if you have the time and availability next week!  Thanks for taking this on!
Flags: needinfo?(j.parkouss)
Flags: needinfo?(bugspam.Callek)
I started to look at it, looks like tooltool works reading a manifest, an example here:

https://dxr.mozilla.org/mozilla-central/source/browser/config/tooltool-manifests/win32/releng.manifest

Then we need to call the tooltool script, and this can be done using the ToolToolMixin:

https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/tooltool.py?from=tooltool.py#24

I think I know everything to make it work, and from a local testing it seems good. Some more news soon.
Assignee: nobody → j.parkouss
Status: NEW → ASSIGNED
Flags: needinfo?(j.parkouss)
Flags: needinfo?(bugspam.Callek)
I have something almost working with the following patch:

https://hg.mozilla.org/try/rev/7659887a4f69

But it fails on windows (maybe on mac ?), https://treeherder.mozilla.org/#/jobs?repo=try&revision=3c9a2c72b9a9&selectedJob=15541842, with the following error:

> ERROR - Unknown zip extension for filename 'tp5n.zip'
> ERROR - The following files failed: 'tp5n.zip'
> Return code: 1

My guess is that on the machine, the installed tooltool is too old. I can see multiple possible fixes:

 - update tooltool on those machines
 - do not use tooltool to unzip the file, do that in the mozharness talos.py file (and remove "unpack": true in the manifest)
 - download tooltool script each time on windows (and mac?) by adding 'download_tooltool': True in the mozharness talos config files.

Any preference ? I believe the first option is the best one, but I don't know exactly what it involves. The two other options should be easy to implement.
download_tooltool seems like the easiest method and something we already do in many other jobs!  A few pros/cons about it- thoughts?
Well, WFM. The cons are that for each build the script will be downloaded (but it is small), the other problem is that the test will requires github server to be up. I remember this is something we should not do (depend on a third party web server that we don't own in harness), but if other jobs are doing it anyway...

I think I would prefer us to "manually" unpack inside the talos.py file, if we can't update tooltool. But that has cons too;

Joel, I let you decide - just tell me what is your preference. :)
for local developers we should use download_tooltool (which is already set for mozharness developer_config.py).  It seems that it would magically just work.  upon further review, I don't see production jobs using this.  Actually taskcluster uses it, but I am not sure if we want that.  Looking at the other two options, I don't see any that I really like!  How about we try download_tooltool :)
(In reply to Joel Maher (:jmaher) from comment #17)
> for local developers we should use download_tooltool (which is already set
> for mozharness developer_config.py).  It seems that it would magically just
> work.

Yes, there is no problem locally, that work just fine already with the patch.

> upon further review, I don't see production jobs using this. 
> Actually taskcluster uses it, but I am not sure if we want that.  Looking at
> the other two options, I don't see any that I really like!  How about we try
> download_tooltool :)

Thinking more about it I really like the idea of updating the old tooltool versions. It is the best long-term approach, as other jobs could reuse that without even thinking about a workaround. But of course currently it is the solution that cost more. I don't mind asking in #releng, if this is not possible to do that, we can fall back to one other solution ?

Else, I would go to implement the unzip ourself. This will be like 5 line of code to replicate tooltool behavior (remove the dir then unzip each time) and we don't break the rule of relying to an external service. Unless that rule doesn't apply in this case ?

As a side note joke, we could use tooltool to download itself. :)
that sounds like a plan!
So, I asked in #releng:

 - according to :dustin, it may be possible to update the machines, using puppet. Quoted from irc "https://github.com/mozilla/build-puppet/blob/master/modules/packages/templates/tooltool.py takes care of it for everything else (including windows builds)"
 - :Callek says that we are not quite there yet for puppet on every Windows host. Though he had a great idea, that is upload tp5.tar.gz instead of tp5n.zip with tooltool. Old tooltool version probably handle well this format.
$ unzip tp5n.zip 
$ tar czvf tp5n.tar.gz tp5n

...

callek@Centaurus:~/tmp$ python src/tooltool.py add --visibility public tp5n.tar.gz 
callek@Centaurus:~/tmp$ python src/tooltool.py upload --authentication-file=~/.tooltool-token --message="Bug 1234500 - allow for ./mach talos-test to download tp5 pageset -- use tar.gz"
INFO - tp5n.tar.gz: starting upload
INFO - tp5n.tar.gz: uploaded
INFO - notifying server of upload completion for tp5n.tar.gz
Thanks :Callek!
Comment on attachment 8709399 [details]
MozReview Request: Bug 1234500 - allow for ./mach talos-test to download tp5 pageset and use it properly. r=jmaher

https://reviewboard.mozilla.org/r/31425/#review28117

my only concern here is that we download the pageset for all tests now- could we have need_pageset: True|False?  If this were in the talos.json file, we could determine if we need to download the pageset.  Please correct me if there is other evidence to show I am incorrect.
Attachment #8709399 - Flags: review?(jmaher)
Comment on attachment 8709399 [details]
MozReview Request: Bug 1234500 - allow for ./mach talos-test to download tp5 pageset and use it properly. r=jmaher

https://reviewboard.mozilla.org/r/31425/#review28119

upon conversing in irc, we do cache the .tar.gz thanks to using tooltool- so we incur a cost of unpacking for all jobs (buildbot and local), but this will save the download for half the jobs and ensure we have all data available for all jobs while simplifying things.  Please ensure we have results for all platforms before pushing.
Attachment #8709399 - Flags: review+
Hm, We have an issue on windows, https://treeherder.mozilla.org/#/jobs?repo=try&revision=03c899cee4db&selectedJob=15616958

Errors like:

IOError: [Errno 22] invalid mode ('wb') or filename: '.\\tp5n\\linkedin.com\\www.linkedin.com\\in\\christopherblizzard@goback=.nppvan_%2Flemuelf.html:'

I made a diff of the content of tp5.zip and tp5n.tar.gz:

diff --recursive --brief tp5n-zip tp5n-tar-gz 
Only in tp5n-zip/bild.de/www.bild.de/code: core,15400948.29-15400978.22-15400980.22-15400986.22-15400984.22-15400960.27-15400952.18-15400958.22-15400950.22-15400966.31-15400974.24-15400968.39-15400962.22.bild.js
Only in tp5n-zip/bild.de/www.facebook.com/plugins: recommendations.php@site=www.bild.de&filter=news&api_key=25604775729&width=445&height=285&header=true&colorscheme=light&font=arial&border_color.html
Only in tp5n-zip/cnet.com/www.facebook.com/plugins: like.php@href=http%3A%2F%2Fwww.facebook.com%2Fcnet&layout=button_count&show_faces=false&width=120&action=like&font&colorscheme=light&height=21.html
Only in tp5n-zip/dailymotion.com/www.facebook.com/plugins: likebox.php@href=http%3A%2F%2Fwww.facebook.com%2Fdailymotionusa&width=300&colorscheme=light&connections=5&stream=false&header=false&height=180.html
Only in tp5n-zip/digg.com/dads.new.digg.com: view.html@kw=zone%3A5&kw=mozilla&kw=nice&kw=logo&kw=firefox&kw=mozzilla&kw=new&kw=proposal&kw=really&kw=browser&kw=check&kw=pagetype%3Apermalink&template=5.html
Only in tp5n-zip/filestube.com/www.facebook.com/plugins: likebox.php@href=http%3A%2F%2Fwww.facebook.com%2Fpages%2FFilesTube-Media-Search-Engine%2F135577699745&width=292&connections=0&stream=false&header=true&height=62.html
Only in tp5n-zip/terra.com.br/p2.trrsf.com.br/image: get@o=cf&w=89&h=67&src=http%3A%2F%2Fsdp.terra.com.br%2FThumBox%2Ffree%2Fcnt358895_h90_w120_imagens-mostram-frieza-de-assassino-em-escola-de-realengo.jpg
Only in tp5n-zip/tmall.com/a.tbcdn.cn/s/kissy/1.1.6: index.html@%3Fsuggest%2Fsuggest-pkg-min.js,anim%2Fanim-pkg-min.js,switchable%2Fswitchable-pkg-min.js,datalazyload%2Fdatalazyload-pkg-min.js,ajax%2Fajax-pkg-min.js
Only in tp5n-zip/yahoo.co.jp/b8.yahoo.co.jp: b@P=23L2vsvY..mh_8X5j6PdByB3RagwF02fgoIADTIT&T=14335av37%2FX=1302299266%2FE=2079181999%2FR=jp_toppage%2FK=5%2FV=3.1%2FW=J%2FY=jp%2FF=4001638655%2FQ=-1%2FS=1%2FJ=16F8D8CB

I am thinking that I will try to use the zip file, and do the unpack in mozharness to see if it is a zip/tar format issue. maybe the tar encoder or decoder are not playing well with some files in there.
See Also: → 1240814
Looks like bug 1240814 is on a good way, let's wait a bit for it.
Depends on: 1240814
Ok, so now it goes a bit further, but there is still an issue while decompressing the file on windows.

Looking at the log, we are trying to create a file "?AQB=1&ndh=1&AQE=1". Is that normal ? Looks like a url parameter to me. We should compare the zip on tooltool with the zip that was working previously - if both the same, we should look at how it was unzipped.
Flags: needinfo?(jmaher)
the url params are normal- we have json requests that are dynamic queries and we pull those dynamic queries down to be localhost.

I think we should compare the different versions of the .zip file and determine if the filenames are the same.  And as you said, compare the method for unzipping!

Thanks for jumping on this- I think we are close :)
Flags: needinfo?(jmaher)
:rwood, this bug might be easy to pick up and get landed- that should finish up the work required for talos in taskcluster- want to take it?
Flags: needinfo?(rwood)
(In reply to Joel Maher ( :jmaher) from comment #34)
> :rwood, this bug might be easy to pick up and get landed- that should finish
> up the work required for talos in taskcluster- want to take it?

Ok sir!
Assignee: j.parkouss → rwood
Flags: needinfo?(rwood)
Attachment #8709399 - Attachment is obsolete: true
Took :parkouss' existing patch, created a new mozreview patch, and sent it to try to see how it looks in it's current form:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=219ac62aff09dec65310d278e070e16679dced2d
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92024

::: testing/mozharness/mozharness/mozilla/testing/talos.py:279
(Diff revision 3)
>  
> -        src_talos_webdir = os.path.join(self.talos_path, 'talos')
> +        self.info("Downloading pageset with tooltool...")
> +        manifest_file = os.path.join(self.talos_path, 'pageset-tooltool.manifest')
> +        self.tooltool_fetch(
> +            manifest_file,
> +            output_dir=os.path.join(self.talos_path, 'talos', 'tests'),

should output dir be |src_talos_pageset| ?

::: testing/mozharness/mozharness/mozilla/testing/talos.py:289
(Diff revision 3)
> +            with open(manifest_file) as manifest:
> +                manifest_data = json.load(manifest)
> +                archive = os.path.join(output_dir, manifest_data['filename'])
> +                manifest.close()
> +            except:
> +                self.exception("Unable to read tool tool manifest")

I don't understand the need to open the manifest file- does tooltool_fetch download the tp5n.tar.bz into the manifest['filename']?

when we get an except it will cause the unzip a couple lines down to fail as archive doesn't exist?

::: testing/talos/pageset-tooltool.manifest:3
(Diff revision 3)
> +[
> +    {
> +        "filename": "tp5n.zip",

I would prefer if this file was called |tp5n-pageset.manifest| or something to indicate the pageset name.
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92024

> I don't understand the need to open the manifest file- does tooltool_fetch download the tp5n.tar.bz into the manifest['filename']?
> 
> when we get an except it will cause the unzip a couple lines down to fail as archive doesn't exist?

So the problem is unpacking via tooltoold doesn't work (I can't figure out why) so I was just trying out using tooltool to download but unzipping the same way we do now. However we don't use pageset url anymore, so I need to get the actual filename from somewhere so I was thinking of grabbing it from the tooltool manifest.
it is ok to unzip the old way- it would be good to add a comment or two to indicate why we are doing things, etc.
Attachment #8808428 - Flags: review?(jmaher)
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92342

this is great, thank you for figuring this out.

2 things to confirm:
1) does this run reliably on try server for all platforms?
2) does this work locally with |./mach talos-test -a tps| ?
Attachment #8808428 - Flags: review?(jmaher) → review+
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

Patch updated: Now when you run talos locally it will check if the suite you're running (--suite) or the test you're running (-a or --activeTests) requires the pageset; if yes then tooltool is called to download the pageset if it doesn't exist already.

On try it worked previously on all the platforms but landed this latest update on try via MozReview again to be sure.
Attachment #8808428 - Flags: review+ → review?(jmaher)
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92456

overall, lets check the inputs better- developers can shoot themselves in the foot too easily, I would prefer a missing or invalid test/suite name to result in a nice error message.

One way to maybe do this is;
if -a || --activetests:
  # suite = find suite(test name)
else if --suite:
  # suite = suite name
  
validate_suite(suite)
pageset_url = get_suite_pageset(suite)

of course the code would be more involved, but that kind of flow might work.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:238
(Diff revision 8)
>          if self.query_talos_json_config() and 'suite' in self.config:
> -            self.pagesets_url = self.talos_json_config['suites'][self.config['suite']].get('pagesets_url')
> -            return self.pagesets_url
> +            self.pagesets_name = self.talos_json_config['suites'][self.config['suite']].get('pagesets_name')
> +            return self.pagesets_name
> +        if self.config.get('run_local') and '--suite' in self.config['talos_extra_options']:
> +            # running locally, suite specified, get suite name from talos_extra_options --suite <value>
> +            suite_name_index = self.config['talos_extra_options'].index('--suite') + 1

this assumes a suite is given, |mach talos-test --suite --tpcycles 1| or |mach talos-test --suite ts_paint| (and ts_paint is not a suite, but a test)
Attachment #8808428 - Flags: review?(jmaher) → review-
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92788

::: testing/mozharness/mozharness/mozilla/testing/talos.py:246
(Diff revision 9)
> +        if self.query_talos_json_config():
> +            if '-a' in self.config['talos_extra_options']:
> +                test_name_index = self.config['talos_extra_options'].index('-a') + 1
> +            if '--activeTests' in self.config['talos_extra_options']:
> +                test_name_index = self.config['talos_extra_options'].index('--activeTests') + 1
> +            test_name = self.config['talos_extra_options'][test_name_index]

if test_name_index > len(self.config['talos_extra_options'])

we could throw an exception here.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:318
(Diff revision 9)
> +                # test name (-a or --activeTests) specified, find out what suite it is a part of
> +                self.suite = self.get_suite_from_test()
> +            elif '--suite' in self.config['talos_extra_options']:
> +                # --suite specified, get suite from cmd line and ensure is valid
> +                suite_name_index = self.config['talos_extra_options'].index('--suite') + 1
> +                self.suite = self.config['talos_extra_options'][suite_name_index]

same concern here about the suitename index being > length of the incoming options array.
Attachment #8808428 - Flags: review?(jmaher) → review-
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92788

> if test_name_index > len(self.config['talos_extra_options'])
> 
> we could throw an exception here.

Ah yes good catch thx
Comment on attachment 8808428 [details]
Bug 1234500 - Allow mach talos-test to download tp5 pageset;

https://reviewboard.mozilla.org/r/91232/#review92796

lets land this!
Attachment #8808428 - Flags: review?(jmaher) → review+
Thanks for the reviews, tested locally and the latest run on try looks good:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f09059c45e5a817d4759627e121f6eb90d5a1b5

I'll land it.
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8e93080f949a
Allow mach talos-test to download tp5 pageset; r=jmaher
https://hg.mozilla.org/mozilla-central/rev/8e93080f949a
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
You need to log in before you can comment on or make changes to this bug.