Closed Bug 528064 Opened 11 years ago Closed 9 years ago

Script to download builds from the Mozilla server

Categories

(Mozilla QA Graveyard :: Mozmill Automation, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: whimboo)


Details

Attachments

(3 files, 7 obsolete files)

Before we start the software update tests we have to download a couple of builds from the FTP server. It's kind of annoying to navigate through the FTP folders manually. A Python script could help us here, which downloads the given build and saves it in the given folder. That way we can automate the process a bit more.

The following options would be necessary:
* version number (e.g. 3.5, 3.5.3, or 3.6b1)
* platform (linux-i686, mac, and win32)
* locale (e.g. en-US, or de)
* target folder (relative or absolute)

Given that information the build can be downloaded from the FTP server. Here an example URL:

ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases/3.6b1/win32/de/

When saving this file locally I propose to use the following naming convention:

firefox-%version%-%locale%.%ext% (e.g. firefox-3.5.5-en-US.dmg)
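As a rough illustration (the helper names and the platform-to-extension mapping are assumptions, not part of this proposal), the options could map onto the URL and local file name like this:

```python
# Hypothetical sketch: map the proposed options onto the release URL
# and the local file naming convention above. The platform-to-extension
# mapping is an assumption for illustration.
BASE_URL = 'ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases'

EXTENSIONS = {'linux-i686': 'tar.bz2', 'mac': 'dmg', 'win32': 'exe'}

def build_url(version, platform, locale):
    """Directory URL for a given release build."""
    return '%s/%s/%s/%s/' % (BASE_URL, version, platform, locale)

def local_name(version, platform, locale):
    """Local file name: firefox-%version%-%locale%.%ext%"""
    return 'firefox-%s-%s.%s' % (version, locale, EXTENSIONS[platform])
```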

Would be great if this script could also be used to download nightly builds in the future. So we should create a new Python module under mozmill-tests/scripts/lib/.

Any objections?
As pointed out by Tracy, Bob has created a similar script in bash:
http://test.bclary.com/bin/download.sh

It takes a URL as its parameter, while we want to have three parameters which will be used to create the URL automatically.

Shriram, would you still be interested to create this Python script?
Sure. I took a look at it and it sounds pretty straightforward.
I will have to review it one more time before writing up the script.
I ended up having to do this for my regression range finder, in python, so this code might help some:

http://code.google.com/p/mozregression/source/browse/trunk/runnightly.py
Assigning to Shriram, who will work on it.
Assignee: nobody → kshriram18
Status: NEW → ASSIGNED
Attached patch WorkInProgress (obsolete) — Splinter Review
I apologize for the big delay. 
I had to deal with some school stuff in the past two weeks.
I shall continue to work on it and finalize the patch in the next two to three days.
If someone has more time to work on it immediately, please let me know.
Attached patch WIP (obsolete) — Splinter Review
Attachment #419151 - Attachment is obsolete: true
As of now the most recent patch crashes at line 90.
This is probably because of line 87.
I shall continue to work on it in the next few days.
Shriram, I have taken a look at your WIPs and have some comments:

* Try to use the OptionParser (http://docs.python.org/library/optparse.html) to parse the command line options. It makes it much easier.

* Can you please make use of a downloader class we can stick into the scripts/lib folder? The executable download script itself should only have to use that class. That way we can integrate it into other scripts too.

* The platform version should also be a command line option as given in comment 0. Only the target folder is an argument.

* Just concentrate your work on releases. Downloading nightly builds is not our primary goal for now.

* The local file should be named as given in comment 0 too.

If something is unclear to you or when you need help please ping me on IRC as usual. Thanks!
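A minimal sketch of the OptionParser suggestion, assuming the option names from comment 0 (everything here is illustrative, not the actual patch):

```python
from optparse import OptionParser

# Illustrative only: option names follow the proposal in comment 0.
parser = OptionParser(usage='usage: %prog [options] folder')
parser.add_option('-v', '--version', dest='version',
                  help='Version of the build, e.g. 3.5.3 or 3.6b1')
parser.add_option('-p', '--platform', dest='platform',
                  choices=['linux-i686', 'mac', 'win32'],
                  help='Platform of the build')
parser.add_option('-l', '--locale', dest='locale', default='en-US',
                  help='Locale of the build, default: en-US')

# Example invocation: download Firefox 3.6b1 for win32 into "downloads"
(options, args) = parser.parse_args(['-v', '3.6b1', '-p', 'win32', 'downloads'])
```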
Shriram, can you please give us an update? If you will not have time please tell us so I can take over that work. We really need it in the near future.
I shall give an update in the next 12 hours.
I started work on the second patch, and had to spend some more time on python docs.
I shall make few more changes, test a little bit and upload the second patch.
Shriram, I can offer some more days for you. But if we do not have anything by Feb 8th I have to take over the work.
Henrik,
I will have a chance to upload a patch in the next few hours.
Sorry but time flies by. I have to take over the work for that script. Will work on it the next days.
Assignee: kshriram18 → hskupin
Sorry about this Henrik. I had network connectivity problems due to a trojan (spoofed attacker) for the last week and a half. I resolved it with the Internet Service Provider's assistance. I plan to resume work tomorrow and upload it before the weekend. Let me know if you could grant me that time.
It would be the last extension of time. So yes, if you have something by end of this week, please go on.
Blocks: 562352
Blocks: 562639
No longer blocks: 562352
Attached file temporary script (obsolete) —
That's a temporary script I have hacked together quickly to help us in downloading the builds.
Assignee: hskupin → nobody
Attachment #443646 - Attachment mime type: text/x-python-script → text/plain
This is a mass change. Every comment has "assigned-to-new" in it.

I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW
The script should be part of our automation repository:
http://hg.mozilla.org/qa/mozmill-automation/
Whiteboard: [mozmill-automation]
Component: Mozmill → Mozmill Tests
QA Contact: mozmill → mozmilltests
Whiteboard: [mozmill-automation] → [shared module]
Summary: [shared module] Script to download builds from the FTP site → Script to download builds from the FTP site
Blocks: 596828
Priority: -- → P2
I'd be happy to modify http://hg.mozilla.org/automation/getlatest-tinderbox/ to fulfill these requirements, if desired.  It doesn't download the files, just gets the URLs.  The base url would have to become a parameter (which it should, anyways) but most of the innards, despite the name, have nothing to do with tinderbox.  You'd probably also have to make mozmill-automation a python package to sensibly depend on it.

Or if you want to roll your own or copy+paste code, then that's up to you.
I would be happy to take some basic stuff from your code to build up the url for all the different kinds of builds we have to download. I will investigate next month.
(In reply to comment #0)

Since the initial comment some things have changed and we need a revised proposal for what we want to have. Let's extend the proposal from comment 0:

> * version number (e.g. 3.5, 3.5.3, or 3.6b1)
> * platform (linux-i686, mac, and win32)
> * locale (e.g. en-US, or de)
> * target folder (relative or absolute)

* version number (i.e. 3.5, 3.5.3, or 4.0bx)
* type of build (release, release candidate build %x%, nightly)
* platform (linux, mac, and win)
* processor (32 bit or 64 bit) [exception is OS X with a combined build]
* locale (i.e. en-US, or de)
* target folder (relative or absolute)

By default we should set the type, platform, processor, and locale to values which match the system specifications. That way we can keep the number of options which have to be specified as small as possible.
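A sketch of how such system-matching defaults might be detected with the standard library (the mapping and the key names are assumptions):

```python
import locale
import platform
import struct

# Sketch (all mappings are assumptions): derive option defaults from
# the running system so most options can be omitted on the command line.
def system_defaults():
    platforms = {'Linux': 'linux', 'Darwin': 'mac', 'Windows': 'win'}
    return {
        'type': 'release',
        'platform': platforms.get(platform.system(), 'linux'),
        # the pointer size tells us if this Python runs as 32 or 64 bit
        'processor': '%dbit' % (struct.calcsize('P') * 8),
        'locale': locale.getlocale()[0] or 'en-US',
    }

defaults = system_defaults()
```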

Anthony, Al, and Geo, am I missing something which is important for our automated release testing? If not I will use the given proposal to implement the download script. I will check the proposed solutions and available download scripts before starting with our version. I don't think an existing version fits all of our requirements.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Priority: P2 → P1
Whiteboard: [shared module] → [mozmill-automation]
Can we add "application" to the list of requirements as well please?
(In reply to comment #24)
> Can we add "application" to the list of requirements as well please?

Sure. I hope you have the same naming scheme for folders and files? I haven't checked that yet in detail for Thunderbird.
(In reply to comment #25)
> (In reply to comment #24)
> > Can we add "application" to the list of requirements as well please?
> 
> Sure. I hope you have the same naming scheme for folders and files? I haven't
> checked that yet in detail for Thunderbird.

All our build automation is based on the same as Firefox. So they should be the same!
This fulfills my requirements.
Move of Mozmill related project bugs to newly created components. You can
filter out those emails by using "Mozmill-Tests-to-MozillaQA" as criteria.
Component: Mozmill Tests → Mozmill Automation
Product: Testing → Mozilla QA
QA Contact: mozmill-tests → mozmill-automation
Whiteboard: [mozmill-automation]
Also we would need an --url option which will override all the other options, and which can be used when you already have the url to the build to download.

I will try to get started on this script mid/end of this week.
Something we will have to take care of is the renaming of the folders as announced in dev.tree-management:

http://groups.google.com/group/mozilla.dev.tree-management/browse_thread/thread/4343accb10cb9986#
Summary: Script to download builds from the FTP site → Script to download builds from the Mozilla server
Attached patch WIP v1 (obsolete) — Splinter Review
Except for the --url option, which we probably don't need, this WIP makes it possible to download release and candidate builds for Firefox and Thunderbird. The next step would be to also support nightly builds.

Geo, would be great if you could test the patch. Right now it relies on lxml and I'm not really happy about using it because it needs to be compiled. What do you think?
Attachment #419400 - Attachment is obsolete: true
Attachment #443646 - Attachment is obsolete: true
Attachment #503519 - Flags: feedback?(gmealer)
Perhaps a stupid question, but why aren't we just connecting via FTP? FTP is structured so you don't need to parse, maintains cwd, etc.

The python FTP lib is what I used in (the quick/dirty) http://hg.mozilla.org/users/clegnitto_mozilla.com/pulseshims/file/default/ftp_shim.py and it was trivial to download the nightly builds (and attach their build id in the same object by reading the text file next to them)
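For reference, the direct-FTP approach could look roughly like this with the stdlib ftplib; the server path layout is assumed from the release URLs discussed in this bug:

```python
from ftplib import FTP

def release_path(version, platform, locale):
    """Server-side directory of a release build (assumed layout)."""
    return 'pub/mozilla.org/firefox/releases/%s/%s/%s' % (version, platform,
                                                          locale)

def download_build(version, platform, locale, filename, target,
                   host='ftp.mozilla.org'):
    """Fetch a single build via anonymous FTP (untested sketch)."""
    ftp = FTP(host)
    ftp.login()  # anonymous login
    ftp.cwd(release_path(version, platform, locale))
    with open(target, 'wb') as f:
        # retrbinary streams the remote file in chunks to the callback
        ftp.retrbinary('RETR %s' % filename, f.write)
    ftp.quit()
```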
Christian, this script will be available and probably be used in the crowd extension. So I don't want to download builds from our FTP server. We should hit mirrors as much as possible. Not sure about the FTP network, but it's not mirrored, right? But either way I should not use stage.mozilla.org in the current version of my script.
The requirement for a compiled library will hose you for the crowd extension anyway, I think. Would mechanize or another "web automation" library like that work instead of lxml, maybe without a compilation step?

Also, I really wouldn't worry about network impact until we have such adoption on the crowd extension that it actually starts causing some sort of issue. The ftp site is open to the public, after all.

For in-house use we probably don't want to hit mirrors anyway. If we hit a bad one it'll make the test launch super, super slow. We already have that potential in the update itself, let's not double the chances in the environment setup.

All said, I'll give it a shot, but I'd consider using straight ftp for our current requirement.
Ah, right. Hitting ftp directly is not mirrored, great point.
Comment on attachment 503519 [details] [diff] [review]
WIP v1

Appears to work well, though I didn't try an exhaustive cross of all options (and really concentrated on firefox).

Minor nitpick, it doesn't like having no arguments supplied--should throw help text. Same for leaving off any mandatory options (-p, for example).
Attachment #503519 - Flags: feedback?(gmealer) → feedback+
Oh, also your help text is caught in between revisions. At the top you say dest is a non-flag argument, in the text and code itself it's -d. That should be fixed.
(In reply to comment #34)
> The requirement for a compiled library will hose you for the crowd extension
> anyway, I think. Would mechanize or another "web automation" library like that
> work instead of lxml, maybe without a compilation step?

I know. That's why I'm searching for alternatives. If you know something please feel free to point me to that tool. In all simplicity we could just use a regex to get all that information from the HTML document.

> For in-house use we probably don't want to hit mirrors anyway. If we hit a bad
> one it'll make the test launch super, super slow. We already have that
> potential in the update itself, let's not double the chances in the environment
> setup.

Not an argument for me. It's even a chance for us to hit slow mirrors before we start update testing at all. It would even be an improvement. Besides that, the FTP server is hardly slow compared to most of our mirrors. I myself don't see speeds higher than 370kB/s.
(In reply to comment #37)
> Oh, also your help text is caught in between revisions. At the top you say dest
> is a non-flag argument, in the text and code itself it's -d. That should be
> fixed.

Sorry, but I don't get it. You can specify -d XXX or --destination XXX. Where is your issue?
(In reply to comment #39)
> (In reply to comment #37)
> > Oh, also your help text is caught in between revisions. At the top you say dest
> > is a non-flag argument, in the text and code itself it's -d. That should be
> > fixed.
> 
> Sorry, but I don't get it. You can specify -d XXX or --destination XXX. Where
> is your issue?

 usage = 'usage: %prog [options] destination'

You don't really take destination as the final arg, you take it as a flagged option. Should just be %prog [options].

On that point, and I realize there are counter-examples, CLI design usually dictates that flags should always be optional, and anything mandatory should always be an ordered argument after the flags.

So I would have gone for %prog [options] build platform

Just a suggestion though. The help text one is a bigger deal.
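The suggested CLI shape, with mandatory values as ordered positional arguments and parser.error() for bad input, might look like this (a sketch; the option and argument names are assumptions):

```python
from optparse import OptionParser

# Sketch of the suggested CLI shape: mandatory values become ordered
# positional arguments; option and argument names are assumptions.
def parse_arguments(argv):
    parser = OptionParser(usage='usage: %prog [options] build platform')
    parser.add_option('-l', '--locale', dest='locale', default='en-US')
    (options, args) = parser.parse_args(argv)
    if len(args) != 2:
        # parser.error() prints the usage text and exits with status 2
        parser.error('build and platform are required arguments')
    return options, args

(options, (build, platform)) = parse_arguments(['-l', 'de', '3.6b1', 'win32'])
```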
Attached patch Patch v1 (obsolete) — Splinter Review
This is an updated version of the last WIP patch and should contain all the necessary items for the first version. I have changed the following:

* Removed the dependency on lxml. Instead the HTMLParser is used now, which is a default package.
* Refactored the class structure to make handling different kinds of builds more flexible.
* The destination folder is now an argument and not an option.

Handling of nightly builds will be a follow-up work we can't do at the moment.
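For illustration, collecting links from an HTML directory index with the stdlib HTMLParser could look like this (the class name and markup are assumptions, not the actual patch code):

```python
from html.parser import HTMLParser  # module named 'HTMLParser' on Python 2

# Illustration of the lxml replacement: collect the href targets from
# an HTML directory index with the stdlib parser (names are assumptions).
class DirectoryParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    # strip the trailing slash of sub-directory links
                    self.entries.append(value.rstrip('/'))

parser = DirectoryParser()
parser.feed('<html><body><a href="3.6b1/">3.6b1/</a>'
            '<a href="latest/">latest/</a></body></html>')
```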
Attachment #503519 - Attachment is obsolete: true
Attachment #505102 - Flags: review?(jhammel)
Attached patch Patch v1.1 (obsolete) — Splinter Review
The active_url in the parser now also gets trailing slashes stripped.
Attachment #505102 - Attachment is obsolete: true
Attachment #506281 - Flags: review?(jhammel)
Attachment #505102 - Flags: review?(jhammel)
Attached patch Patch v1.2 (obsolete) — Splinter Review
Moved the Scraper classes into their own module, now that we also want to access the classes from other scripts. That makes it much easier to use.

And as a side-note: Any packaging solution will be decided later on.
Attachment #506281 - Attachment is obsolete: true
Attachment #506355 - Flags: review?(jhammel)
Attachment #506281 - Flags: review?(jhammel)
We have the same need for many of our automation utilities.  But we're taking a much different approach.  For our stuff, we're using pulse and moving away from directory scraping since it is rather brittle.  If you're interested, I can get you the code we're creating.  The one thing I'm not certain of is whether or not Thunderbird's build messages are in pulse, so that is one requirement you have that I don't know if our approach will satisfy, but IMHO, if Thunderbird's buildbot isn't talking to pulse, that's something we ought to fix.
Comment on attachment 506355 [details] [diff] [review]
Patch v1.2

+base_path = os.path.dirname(os.path.abspath(__file__))
+sys.path.append(os.path.join(base_path, 'libs'))

You should never append to sys.path in this manner. Please use a standard python way of doing this.

+    usage = 'usage: %prog [options] folder'

Maybe directory vs. folder? The latter is non-canonical.

+    parser = OptionParser(usage=usage)

Should have a description= , preferably from the docstring

+    parser.add_option('--application', '-a',
+                      dest='application',
+                      choices=scraper.APPLICATIONS,
+                      default=scraper.APPLICATIONS[0],
+                      metavar='APPLICATION',
+                      help='The name of the application to download, default: "firefox"')

Should either be default='firefox', or help='The name..., default: "%s"' % APPLICATIONS[0]

+    # Check required options
+    if not options.version:
+        raise Exception("The version of the application to download has not been specified.")
+    if not options.platform:
+        raise Exception("The platform of the application to download has not been specified.")

     probably better off using parser.error()
     Also, in general, required arguments should be positional arguments, instead of the one positional argument being optional. ("Required options" is not internally consistent wording)


+    # Check for destination folder
+    destination = os.path.abspath(args[0]) if args else os.getcwd()

Should probably warn/error if there is more than one argument since you don't use them

+    # Instantiate scraper and download the build
+    if options.type in 'release':

should be '==' instead of 'in'

+    elif options.type in 'candidate':

should be '==' instead of 'in'; also, its probably better to have a more control-flowy way of getting the appropriate class:


scraper_args = {'application': options.application,
                'platform': options.platform, 
                'version': options.version}
# could also do scraper_args = dict([(i, getattr(options, i)) for i in 'application', 'platform', 'version'])
scrapers = {'release': ReleaseScraper,
            'candidate': ReleaseCandidateScraper}
scraper_options = { 'release': {'build': options.build}}
kwargs = scraper_args.copy()
kwargs.update(scraper_options.get(options.type, {}))
build = scrapers[options.type](**kwargs)


+    build.download(destination)

Probably better off having scraper.download return the file-like object; then you can do with it what you want.  This way, you can use scraper as an API.


+#!/usr/bin/env python

Is the shebang usable?  Is this script executable?  If not, remove

+# ***** END LICENSE BLOCK *****

Should probably have a docstring that explains what this file does.  Should probably pass said docstring as the description argument to the parser

+# Constants
+BASE_URL = 'http://stage.mozilla.org/pub/mozilla.org';

No need for the semicolon

+APPLICATIONS = ['firefox', 'thunderbird']
+BUILD_TYPES = ['release', 'candidate', 'nightly']

Unless I misunderstand the intentions of the code, BUILD_TYPES is a bijection with respect to 
string <-> Scraper class; in this case, why not make a dict?  Following scraper declarations:

BUILD_TYPES = {'release': ReleaseScraper,
               'candidate': ReleaseCandidateScraper,
               'nightly': NightlyScraper }

+class AbstractException(Exception):
+    """Exception for an abstract class"""

Probably should be "Not implemented Exception for an abstract base class". I didn't understand what this was on first read. (Personally, I usually use NotImplementedError for such things).

+        req = urllib.urlopen(url)

Should probably use urllib2 instead of urllib so you can get reasonable timeouts

+        name = data.strip('/').replace(' ', '%20')

Should probably use urllib.quote()

+class GenericScraper(object):
+    """Generic class to download an application from the Mozilla server"""

Maybe should be renamed MozillaScraper or similar.  It's not that generic.

+    def download(self, destination=os.getcwd()):
+        """Download the specified file"""
+
+        destination = os.path.abspath(destination)

Is there any reason to use abspath here?  Also you should probably note in the docstring that destination is supposed to be a directory.  Again, I'd advise just returning the file-like object of what is downloaded vs. having the scraper determine where to put the contents.


+        if not os.path.exists(destination):
+            raise NotFoundException('Destination not found', destination)

         Why does the destination have to exist?  Can't you do os.makedirs if its supposed to be a directory?
         Also, sanity checking should be os.path.isdir()?

+            print 'Downloading "%s" to "%s"' % (url, target)

Ideally shouldn't print from this class.

+            urllib.urlretrieve(url, target)

To match the comment ("Download the first matched directory entry"), this should break; otherwise, multiple entries will be downloaded to the same destination.  And again, for timeout reasons, you should probably use urllib2


+class NightlyScraper(GenericScraper):
+    """Class to download a nightly build from the Mozilla server"""
+
+    def __init__(self, branch='mozilla-central', debug=False, *args, **kwargs):
+        GenericScraper.__init__(self, *args, **kwargs)
+        self.branch = branch
+        self.debug = debug

raise NotImplementedError since its not done.
Attachment #506355 - Flags: review?(jhammel) → review-
(In reply to comment #44)
> We have the same need for our many of our automation utilities.  But we're
> taking a much different approach.  For our stuff, we're using pulse and moving
> away from directory scraping since it is rather brittle.  If you're interested,

Our use cases are a bit different, as already said a lot of comments earlier. We also want to download builds (releases, release candidates, nightlies) from any release in the past. I believe Pulse can't manage that. Further, we have to hit the mirrors to be able to download the builds as fast as possible. Your builds are probably 100% on internal FTP servers; ours are not.
(In reply to comment #45)
>      probably better off using parser.error()
>      Also, in general, required arguments should be positional arguments,
> instead of the one positional argument being optional. ("Required options" is
> not internally consistent wording)

Both options will be optional in the future once the INI file support has landed. It's something I have to work on next.

> +    # Check for destination folder
> +    destination = os.path.abspath(args[0]) if args else os.getcwd()
> 
> Should probably warn/error if there is more than one argument since you don't
> use them

I moved this to an option, which is what it really is. It's not required.


> scraper_args = {'application': options.application,
>                 'platform': options.platform, 
>                 'version': options.version}
> # could also do scraper_args = dict([(i, getattr(options, i)) for i in
> 'application', 'platform', 'version'])
> scrapers = {'release': ReleaseScraper,
>             'candidate': ReleaseCandidateScraper}
> scraper_options = { 'release': {'build': options.build}}
> kwargs = scraper_args.copy()
> kwargs.update(scraper_options.get(options.type, {}))
> build = scrapers[options.type](**kwargs)

That's great! I love it!
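Written out runnably, the dispatch pattern looks like this; the stub classes stand in for the real Scraper classes, and note that dict.update() returns None, so copy() and update() need to be separate statements:

```python
# Runnable sketch of the reviewed dispatch pattern. The stub classes
# stand in for the real Scraper classes; note that dict.update()
# returns None, so copy() and update() have to be separate statements.
class ReleaseScraper(object):
    def __init__(self, application, platform, version):
        self.info = (application, platform, version)

class ReleaseCandidateScraper(ReleaseScraper):
    def __init__(self, application, platform, version, build=None):
        ReleaseScraper.__init__(self, application, platform, version)
        self.build = build

scraper_args = {'application': 'firefox', 'platform': 'win32',
                'version': '4.0b8'}
scrapers = {'release': ReleaseScraper,
            'candidate': ReleaseCandidateScraper}
# extra options only some scrapers take (candidate builds have a number)
scraper_options = {'candidate': {'build': '1'}}

build_type = 'candidate'
kwargs = scraper_args.copy()
kwargs.update(scraper_options.get(build_type, {}))
build = scrapers[build_type](**kwargs)
```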

> +#!/usr/bin/env python
> 
> Is the shebang usable?  Is this script executable?  If not, remove

No, it's not. Just a left-over from moving that code to its own module.

> BUILD_TYPES = {'release': ReleaseScraper,
>                'candidate': ReleaseCandidateScraper,
>                'nightly': NightlyScraper }

Yeah, makes sense. It should even be moved to the script itself. It's nothing which should be part of this library.

> +class AbstractException(Exception):
> +    """Exception for an abstract class"""
> 
> Probably should be "Not implemented Exception for an abstract base class". I
> didn't understand what this was on first read. (Personally, I usually use
> NotImplementedError for such things).

Good call.

> +        req = urllib.urlopen(url)
> 
> Should probably use urllib2 instead of urllib so you can get reasonable
> timeouts

Timeouts are not supported on Python 2.4, which I have to stay compatible with, so I will stay with urllib. It also makes downloading files a lot easier.

> +        name = data.strip('/').replace(' ', '%20')
> 
> Should probably use urllib.quote()

Indeed.

> +class GenericScraper(object):
> +    """Generic class to download an application from the Mozilla server"""
> 
> Maybe should be renamed MozillaScraper or similar.  Its not that generic

done.

> +    def download(self, destination=os.getcwd()):
> +        """Download the specified file"""
> +
> +        destination = os.path.abspath(destination)
> 
> Is there any reason to use abspath here?  Also you should probably note in the
> docstring that destination is supposed
> to be a directory.  Again, I'd advise just returning the file-like object of
> what is downloaded vs. having the scraper determine where to put th contents

I have refactored that code which makes it more granular to use. It should fit your expectations.

> +        if not os.path.exists(destination):
> +            raise NotFoundException('Destination not found', destination)
> 
>          Why does the destination have to exist?  Can't you do os.makedirs if
> its supposed to be a directory?
>          Also, sanity checking should be os.path.isdir()?

Definitely better approach.

> +            print 'Downloading "%s" to "%s"' % (url, target)
> 
> Ideally shouldn't print from this class.

Moved out to the script.
Attached patch Patch v2Splinter Review
Updated script according to review comments. I really like the improvements! Thanks for the review Jeff!
Attachment #506355 - Attachment is obsolete: true
Attachment #506621 - Flags: review?(jhammel)
Blocks: 628655
Blocks: 628659
Comment on attachment 506621 [details] [diff] [review]
Patch v2

Since you're using re.match, which matches the whole expression, you probably don't need the '^' or '$' in your regexes.  Other than that, much better!
Attachment #506621 - Flags: review?(jhammel) → review+
See also https://bugzilla.mozilla.org/show_bug.cgi?id=628685 for consolidating this with other work
See Also: → 628685
(In reply to comment #49)
> Since you're using re.match, which matches the whole expression, you probably
> don't need the '^' or '$' in your regexes.  Other than that, much better!

No I cannot. At least not in the current state. Files like "firefox-4.0b8.tar.bz2.asc" would cause us to fetch the wrong file. I could get rid of the prefix but the trailing '$' is important here. I would rather leave both in.
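A small demonstration of the point: re.match() anchors only at the start of the string, so without the trailing '$' the detached signature file matches too:

```python
import re

# Without the trailing '$' the detached signature file also matches,
# because re.match() only anchors at the beginning of the string.
pattern_loose = r'firefox-.*\.tar\.bz2'
pattern_exact = r'firefox-.*\.tar\.bz2$'

files = ['firefox-4.0b8.tar.bz2', 'firefox-4.0b8.tar.bz2.asc']

loose = [f for f in files if re.match(pattern_loose, f)]
exact = [f for f in files if re.match(pattern_exact, f)]
```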
Ah, you're right. re.match anchors at the start of the string, but as long as the beginning matches it doesn't impose an exact match.  My mistake.
Agreed with Jeff on IRC on it. I left both in the regex for now.

Landed as:
http://hg.mozilla.org/qa/mozmill-automation/rev/43e5ad800f96
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I should have taken a deeper look at all the options we specify on the command line. I missed adding the parameter for the locale. Patch upcoming.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #507084 - Flags: review?(jhammel)
Attachment #507084 - Flags: review?(jhammel) → review+
Landed as:
http://hg.mozilla.org/qa/mozmill-automation/rev/0f55a53a9e67
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment on attachment 507303 [details] [diff] [review]
Don't hard-code the application

Changes look fine. :D
Attachment #507303 - Flags: review?(gmealer) → review+
Landed as:
http://hg.mozilla.org/qa/mozmill-automation/rev/5c1b517e42b5

Hopefully the last one. Anything else will get filed as separate bugs.
Depends on: 653524
Blocks: 677250
Blocks: 679746
Blocks: 736098
Product: Mozilla QA → Mozilla QA Graveyard