Closed Bug 943871 Opened 11 years ago Closed 11 years ago

Autophone - use http instead of ftp for build urls

Categories

(Testing Graveyard :: Autophone, defect)

x86_64
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bc, Assigned: bc)

References

Details

Attachments

(1 file)

While investigating the archived inbound builds for Autophone regression triage, I ran into a problem with our use of ftp urls for builds.

The archived inbound builds are only available over http: e.g.:

http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android/

Apart from ftp's perhaps simpler logic, I think http is the better choice, as it is 1) more readily available and 2) less likely to be constrained by network flow restrictions.

My first thoughts on the approach would be to:

a) use something like BeautifulSoup to parse the http responses and make minimal changes to the overall logic.

b) use something like mozdownload, though it doesn't appear to support fennec or the inbound archive.

c) use mozregression, which has added support for the inbound archive but doesn't appear to support fennec/arm/x86.

I'm leaning towards a) at least initially. Suggestions / Comments solicited.
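A minimal sketch of what option a) could look like, using the stdlib HTMLParser as a dependency-free stand-in for BeautifulSoup: extract the links from an HTTP directory index and keep only the timestamped nightly directories. The HTML listing and the directory name below are hand-made examples, not real archive data, and the regex is illustrative rather than Autophone's actual matching logic.

```python
import re
from html.parser import HTMLParser

class IndexParser(HTMLParser):
    """Collect the href of every <a> tag in a directory listing."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.hrefs.extend(v for k, v in attrs if k == 'href')

# Stand-in for the body of an http response from the archive server.
listing = """
<a href="../">Parent Directory</a>
<a href="2013-12-01-03-02-03-mozilla-central-android/">dir</a>
<a href="README.txt">file</a>
"""

parser = IndexParser()
parser.feed(listing)
# Keep only directories whose names start with a build timestamp.
build_dirs = [h for h in parser.hrefs
              if re.match(r'\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}-', h)]
print(build_dirs)  # → ['2013-12-01-03-02-03-mozilla-central-android/']
```

The same loop works against a real listing by feeding in the body of an http GET on the directory url instead of the canned string.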
Blocks: 943873
Tough call.  If mozdownload does most of what you need, I'd be tempted to expand that.  The most annoying part will be parsing the responses, so I'd see if mozdownload mostly handles that now.
mozdownload is very general but appears to be very tightly coupled to downloading desktop-type builds from ftp.mozilla.org for the platform where it is running. The parsing is quite different in that it uses HTMLParser to collect links and then greps them for different file patterns. I think expanding mozdownload for our use case would be a non-trivial undertaking.
Two things:

(1) It should be fairly easy to extend mozregression to support fennec/arm. I can give guidance on this if necessary.
(2) It's definitely annoying that we have duplicated build-fetching logic between mozdownload and mozregression, and the lack of Android support in mozdownload is one of the main things blocking consolidation. If you have the cycles, I don't think working on that would be a waste.
I'm primarily focused at the moment on getting autophone in shape to do regression testing so I can identify specific revisions which cause issues. The Autophone use cases don't seem to fit well with the existing uses in mozregression or mozdownload. I'm just afraid of getting tied up working on mozregression or mozdownload when my real goal is making sure actionable bugs are filed for autophone.
Mm, yeah. Normally I would say that improving our general tools is the way to go, but it's true that we're really focusing on getting actionable items out of autophone right now, so let's take the easiest route to get that done. Just keep an eye on what you're doing from a high level to make sure you don't end up duplicating too much; even better would be writing it in such a way that it could be easily applied to mozdownload or mozregression. I still want to generalize the build cache and potentially put it into mozdownload some day.
I spent a little time yesterday looking at all the repos that are available. I was thinking that supporting access to builds and related files by date, buildid, or revision, for an arbitrary platform and cpu, regardless of the repo name and the site's base url, would make a general-purpose component that could be easily reused outside of autophone.

mozregression's use of the hg json-pushes api was very interesting as well, and supporting fetching builds by revision would be very helpful in autophone.
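To sketch the json-pushes idea: hg.mozilla.org can return, as JSON, a map of push ids to pushes (each with its changesets, date, and user), which lets you resolve a revision to the push that landed it. The payload below is a hand-made example of that format; the push id, changeset hash, and email are invented, and the lookup helper is hypothetical rather than mozregression's actual code.

```python
import json
from datetime import datetime, timezone

# Hand-made example of a json-pushes response, e.g. from a url like
# https://hg.mozilla.org/integration/mozilla-inbound/json-pushes?changeset=<rev>
sample = json.loads("""
{
  "30123": {
    "changesets": ["8fb4a21b1a6ae51e1c4b34e0ea4b86e0502b0dfc"],
    "date": 1386000000,
    "user": "someone@example.com"
  }
}
""")

def push_for_revision(pushes, revision):
    """Return (push_id, push) for the push whose changesets include revision."""
    for push_id, push in pushes.items():
        if any(cset.startswith(revision) for cset in push['changesets']):
            return push_id, push
    return None

# A short revision prefix is enough to match the full changeset hash.
push_id, push = push_for_revision(sample, '8fb4a21b')
print(push_id, datetime.fromtimestamp(push['date'], tz=timezone.utc).isoformat())
```

Given the push date, picking the matching build directory reduces to the same date-range search the nightly code already does.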
I spent a bit more time on this and tried some first stabs at writing some code for the general case of supporting arbitrary base urls and the general set of patterns in use for build directories and file names. It felt as if someone was going to jump out from behind a rock and shout "You are in a maze of twisty little passages, all alike". I'm going to just focus on the immediate need for Autophone for this bug but will continue trying to come up with a more general approach that we can use everywhere we need to get builds.
I used httplib2, beautifulsoup4, renamed some things to be more descriptive and removed the unused build_date_from_url|build_date. I also added a bunch of debug log messages and changed the default log level to INFO so we don't see them normally.

I tested with the following:

python trigger_runs.py 20131201030203
python trigger_runs.py 2013-11-10 2013-11-12
python trigger_runs.py 2013-12-01
python trigger_runs.py latest
python trigger_runs.py --build-location=tinderbox 20131202034749
python trigger_runs.py --build-location=tinderbox 2013-11-10 2013-11-12
python trigger_runs.py --build-location=tinderbox 2013-12-01
python trigger_runs.py --build-location=tinderbox latest
Attachment #8341156 - Flags: review?(mcote)
Comment on attachment 8341156 [details] [diff] [review]
bug-943871-http-builds.patch

Review of attachment 8341156 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, cleaner than the ftp implementation!  Just one nit and one suggestion.

::: builds.py
@@ +32,4 @@
>          self.nightly_dirnames = [(re.compile('(.*)-%s-android(-armv6|-x86)?$'
>                                               % repo)) for repo in repos]
>  
> +    def search_directories(self, start_time, end_time):

This function name kind of sounds like it's *doing* the search... maybe dirs_to_search() or something like that.

@@ +198,5 @@
> +                        directory_href, httplib.responses[builddir_response.status]))
> +                    continue
> +
> +                builddir_soup = BeautifulSoup(builddir_content)
> +                for build_link in builddir_soup.findAll('a'):

I think you could refactor the link-fetching code to avoid duplication between the list of builds and the build files, maybe using a generator.
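One way to read that suggestion: a single generator that takes a url and yields the hrefs of its links, so the build-directory pass and the per-directory file pass share one fetch-and-parse routine. This is a sketch under assumptions — the fetch callable is a stub standing in for the httplib2 request in the patch, and the listings dict fakes a two-level directory tree with invented names.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.hrefs.extend(v for k, v in attrs if k == 'href')

def url_links(url, fetch):
    """Yield the href of each link in the directory listing at url."""
    parser = LinkCollector()
    parser.feed(fetch(url))
    for href in parser.hrefs:
        yield href

# Stub fetch: canned listings for a build directory and its contents.
listings = {
    'builds/': '<a href="2013-12-01-03-02-03/">dir</a>',
    'builds/2013-12-01-03-02-03/': '<a href="fennec-28.0a1.apk">apk</a>',
}

# Both traversal levels now go through the same url_links generator.
files = [f for d in url_links('builds/', listings.get)
         for f in url_links('builds/' + d, listings.get)]
print(files)  # → ['fennec-28.0a1.apk']
```

The real version would keep the status-check-and-continue error handling from the patch inside url_links, so neither caller has to repeat it.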
Attachment #8341156 - Flags: review?(mcote) → review+
(In reply to Mark Côté ( :mcote ) from comment #9)
> Comment on attachment 8341156 [details] [diff] [review]
> bug-943871-http-builds.patch
> 
> Review of attachment 8341156 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good, cleaner than the ftp implementation!  Just one nit and one
> suggestion.
> 
> ::: builds.py
> @@ +32,4 @@
> >          self.nightly_dirnames = [(re.compile('(.*)-%s-android(-armv6|-x86)?$'
> >                                               % repo)) for repo in repos]
> >  
> > +    def search_directories(self, start_time, end_time):
> 
> This function name kind of sounds like it's *doing* the search... maybe
> dirs_to_search() or something like that.
> 

Yeah, I struggled over that while trying to keep it readable and short. How about get_search_directories?

> @@ +198,5 @@
> > +                        directory_href, httplib.responses[builddir_response.status]))
> > +                    continue
> > +
> > +                builddir_soup = BeautifulSoup(builddir_content)
> > +                for build_link in builddir_soup.findAll('a'):
> 
> I think you could refactor the link-fetching code to avoid duplication
> between the list of builds and the build files, maybe using a generator.

I don't understand.

directory_soup.findAll('a') is the list of potential build directories. We skip the directories that do not have a build_time, then make a separate request to each remaining build directory to get its file list via builddir_soup.findAll('a'). I don't see how to redo this with generators, or what the benefit would be.
https://github.com/mozilla/autophone/commit/ac8391aacc45cfb1d14c5720a2e0e81854b71567

I went ahead and changed it to get_search_directories but left the other as is for now. I pushed so I can keep going with the inboundarchive and revision changes, I'll leave this open until we decide on the link-fetching code refactoring.
Apparently I didn't submit my comment last time... I was saying that lines 175-181 and lines 195-202 in the patched version are essentially the same, aside from the URL. They could be refactored to take a URL and then yield the results of findAll().
Mark and I talked, and I now understand what he meant. I'm going to use mozregression's approach for fetching the contents of the directories, with a distinct function like urlLinks that achieves the desired goal. Marking this fixed.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
oops, forgot to mention the changes will be in bug 945390
Product: Testing → Testing Graveyard