Closed Bug 979435 Opened 8 years ago Closed 8 years ago

[Travis] mozilla-download is failing (no output received in 10 minutes)

Categories

(Firefox OS Graveyard :: General, defect)

defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: mikehenrty, Unassigned)

References

Details

Unit tests, integration tests and Gaia UI tests are all failing in Travis due to the inability to download builds from tinderbox.

For example: https://travis-ci.org/mozilla-b2g/gaia/jobs/20070700
What's going on? Should we fail over to nightly builds?
Super slow network connection:
Downloading XULRunner...
wget -c http://ftp.mozilla.org/pub/mozilla.org/xulrunner/nightly/2013/08/2013-08-07-03-02-16-mozilla-central/xulrunner-26.0a1.en-US.linux-x86_64.sdk.tar.bz2
--2014-03-04 17:28:25-- http://ftp.mozilla.org/pub/mozilla.org/xulrunner/nightly/2013/08/2013-08-07-03-02-16-mozilla-central/xulrunner-26.0a1.en-US.linux-x86_64.sdk.tar.bz2
Resolving ftp.mozilla.org (ftp.mozilla.org)... 63.245.215.56, 63.245.215.46
Connecting to ftp.mozilla.org (ftp.mozilla.org)|63.245.215.56|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 67616122 (64M) [application/x-bzip2]
Saving to: `xulrunner-26.0a1.en-US.linux-x86_64.sdk.tar.bz2'
86% [================================> ] 58,319,910 18.5K/s eta 7m 22s

I'm sorry but your test run exceeded 50.0 minutes.
The problem turns out to be that from certain locations in Europe, downloads from ftp.mozilla.org are extremely slow (~50K/s). Speaking with travis on twitter [1], it seems that at least some of their VMs are in Europe. As a band-aid solution here, :lightsofapollo set up an s3 instance to pull our builds from [2]. We will re-open once travis goes green on this build.



1.) https://twitter.com/travisci/status/440938418508677120

2.) https://github.com/mozilla-b2g/gaia/commit/4b0e62c713d83a36d4db42556625a2bc2b44e3f2
So obviously we need a longer term solution here, which in my opinion should involve us moving off travis ASAP. Even though the problem turned out to be ftp.mozilla.org from Europe, the thing that actually killed us was no insight into these travis VMs, and no way to ultimately control them. In any case, closing this bug for now, since it is not the right place to have this discussion.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Is there an automated process that uploads new builds to s3 periodically?
(In reply to Andrew Sutherland (:asuth) from comment #5)
> Is there an automated process that uploads new builds to s3 periodically?

We are working on this right now.
(In reply to Michael Henretty [:mhenretty] from comment #4)
> So obviously we need a longer term solution here, which in my opinion should
> involve us moving off travis ASAP. Even though the problem turned out to be
> ftp.mozilla.org from Europe, the thing that actually killed us was no
> insight into these travis VMs, and no way to ultimately control them. In any
> case, closing this bug for now, since it is not the right place to have this
> discussion.

Well, downloading from ftp.mozilla.org is currently fast for me, and I actually am in Europe ;) From my home connection:

For me, the IPs for ftp.mozilla.org are:

ftp.mozilla.org has address 63.245.215.56
ftp.mozilla.org has address 63.245.215.46

Is it the same for you?

I tried downloading from both of these IPs over FTP (because over HTTP it's forbidden when you use the IP directly), and I found the second one to be somewhat slower (not 18kB/s though, more like 300kB/s, while the first IP gets 600kB/s).

From a server that is also hosted in Europe, I also found the second IP less stable, but it's still from 3MB/s to 5MB/s (the first IP was more reliably at 5MB/s).

Might be worth displaying the DNS entries for ftp.mozilla.org in the travis output.
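
As a rough illustration of that suggestion, here is a minimal Python sketch that prints every address a hostname resolves to, in the same "has address" form as the listing above. The function names `resolve_all` and `log_dns` are hypothetical and not part of the Gaia build scripts; how such a step would be wired into the travis configuration is left open.

```python
import socket


def resolve_all(hostname, port=80):
    """Return the sorted, de-duplicated IP addresses that hostname resolves to."""
    # Restricting to TCP avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def log_dns(hostname):
    """Print one line per resolved address (hypothetical helper)."""
    for address in resolve_all(hostname):
        print(f"{hostname} has address {address}", flush=True)
```

Calling `log_dns("ftp.mozilla.org")` right before the download step would show which addresses a given VM sees, which might help when comparing a slow run against a fast one.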

Also, maybe ftp.mozilla.org actually had an issue yesterday? (although I remember I tried yesterday while we had the issue and it worked correctly for me).

I think that long term, having an S3 instance will be more troublesome than discussing with the people in charge of ftp.mozilla.org.

The long-term plan is still to use TBPL but I don't see us moving off travis really soon, for a number of reasons.
Just had a look at the "fix".

Really guys, is it really a solution?

Trading automatic use of the latest TBPL build for something we need to update manually?

We need to back this out _now_. This is just wrong.
(In reply to Julien Wajsberg [:julienw] from comment #8)
> Just had a look at the "fix".
> 
> Really guys, is it really a solution?
> 
> Trading automatic use of the latest TBPL build for something we need to
> update manually?
> 
> We need to back this out _now_. This is just wrong.

No - this is just temporary while we work on a fix. The alternative is closing the tree until we have a fix. TBH - I'm fine with either solution, but I would imagine people would rather have an open tree and we can live with the pain of a manual update every day or so?
I would say that the network issues are resolved now, anyway. Otherwise the XULRunner download would not succeed, as you didn't change this.
PR for backout now that the hosting company Hetzner seems to have fixed the issue on their side: https://github.com/mozilla-b2g/gaia/pull/16903
Travis is green so:

reverted in master in 093f35afbe967996c1352258a3b923dd4db7c357
Resolution: FIXED → WONTFIX
(In reply to Julien Wajsberg [:julienw] from comment #10)
> I would say that the network issues are resolved now, anyway. Otherwise the
> XULRunner download would not succeed, as you didn't change this.

Just for informational purposes here: downloading XULRunner is done via wget (as opposed to using the npm package mozilla-download for b2g-desktop). This is important to note because wget has progress output, while mozilla-download does not. So, travis will kill a build where a b2g-desktop download takes longer than 10 minutes, whereas XULRunner can continue downloading for longer than 10 minutes since it is outputting its progress.
In travis we use the "-nv" option for wget, which means we have no progress output :)
I found it funny that after reading this thread, I just got a XULRunner timeout on travis :)

Downloading XULRunner...
wget -nv http://ftp.mozilla.org/pub/mozilla.org/xulrunner/nightly/2014/03/2014-03-08-03-02-03-mozilla-central/xulrunner-30.0a1.en-US.linux-x86_64.sdk.tar.bz2

No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.
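
One possible way around the "no output" kill, sketched here in Python as an illustration only: wrap the silent command and print a short heartbeat line at a fixed interval until it exits. The name `run_with_heartbeat` and its parameters are hypothetical, and the sketch assumes that any line written to the log is enough to reset travis' 10 minute timer, which this thread does not confirm.

```python
import subprocess
import threading


def run_with_heartbeat(command, interval_seconds=60.0, emit=None):
    """Run command and emit a heartbeat line every interval_seconds until it exits.

    Returns (exit_code, number_of_heartbeats). Hypothetical helper.
    """
    if emit is None:
        # Flush so the line reaches the CI log immediately.
        emit = lambda line: print(line, flush=True)

    finished = threading.Event()
    beats = 0

    def heartbeat():
        nonlocal beats
        # Event.wait returns False on timeout and True once finished is set.
        while not finished.wait(interval_seconds):
            beats += 1
            emit(f"still running (about {beats * interval_seconds:.0f}s elapsed)")

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        exit_code = subprocess.call(command)
    finally:
        # Stop the heartbeat even if launching the command failed.
        finished.set()
        worker.join()
    return exit_code, beats
```

With this shape, the existing quiet wget invocation could be passed as a list of arguments and would keep its short log while still producing a line every minute. It would not make a slow download any faster, and the overall 50 minute limit would still apply.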
(In reply to Julien Wajsberg [:julienw] from comment #14)
> In travis we use the "-nv" option for wget, which means we have no progress
> output :)

Crap, you're right! I swear I had seen a progress bar for that in travis..
On Linux we normally use wget without -nv; only travis has -nv, because otherwise this gives a very big log (wget rewrites the progress line, and each rewrite adds one more line to Travis' log). It would be better if it simply appended a character. I don't know if this is possible with either wget or curl.
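
For what it's worth, wget does have a `--progress=dot` style that appends dots instead of redrawing a bar, though how it interacts with `-nv` in the travis setup is not verified here. The append-a-character idea can also be sketched directly in Python: copy the download in chunks and emit one dot per fixed amount of data. The names `copy_with_dots` and `download` are hypothetical, and whether dots without a trailing newline count as output for travis is an assumption.

```python
import sys
import urllib.request


def copy_with_dots(chunks, write, emit, bytes_per_dot=1024 * 1024):
    """Pass each chunk to write() and call emit(".") once per bytes_per_dot copied.

    Returns the total number of bytes copied. Hypothetical helper.
    """
    total = 0
    dots = 0
    for chunk in chunks:
        write(chunk)
        total += len(chunk)
        # Catch up on dots in case one chunk spans several thresholds.
        while total // bytes_per_dot > dots:
            dots += 1
            emit(".")
    return total


def download(url, path, chunk_size=64 * 1024):
    """Download url to path, appending one dot to stdout per MiB received."""
    def emit(mark):
        sys.stdout.write(mark)
        sys.stdout.flush()

    with urllib.request.urlopen(url) as response, open(path, "wb") as target:
        # iter() with a sentinel keeps reading until read() returns b"".
        chunks = iter(lambda: response.read(chunk_size), b"")
        total = copy_with_dots(chunks, target.write, emit)
    sys.stdout.write("\n")
    return total
```

For the 64M XULRunner archive this would add roughly 64 characters to the log rather than one line per progress update.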
See Also: → 1033221