Closed Bug 1191353 Opened 9 years ago Closed 9 years ago

l10n repacks sometimes hang when downloading mars

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

Details

Attachments

(1 file, 1 obsolete file)

We've hit this a few times in recent releases, eg:
Downloading http://stage.mozilla.org/pub/mozilla.org/firefox/candidates/40.0-candidates/build1/linux-x86_64/en-US/firefox-40.0.tar.bz2 to firefox.tar.bz2
Downloading http://stage.mozilla.org/pub/mozilla.org/firefox/candidates/38.0.5-candidates/build4/update/linux-x86_64/bn-BD/firefox-38.0.5.complete.mar to firefox-38.0.5.bn-BD.complete.mar
Downloading http://stage.mozilla.org/pub/mozilla.org/firefox/candidates/35.0.1-candidates/build1/update/linux-x86_64/bn-BD/firefox-35.0.1.complete.mar to firefox-35.0.1.bn-BD.complete.mar
Downloading http://stage.mozilla.org/pub/mozilla.org/firefox/candidates/37.0.2-candidates/build1/update/linux-x86_64/bn-BD/firefox-37.0.2.complete.mar to firefox-37.0.2.bn-BD.complete.mar

command timed out: 2400 seconds without output ... program finished with exit code 247


We should make sure these actually have a timeout, and know how to retry if we hit it.
Attached patch try to timeout and retry (obsolete) — Splinter Review
Needs testing still, but I think this will work?
Attachment #8643756 - Flags: feedback?(rail)
Comment on attachment 8643756 [details] [diff] [review]
try to timeout and retry

Review of attachment 8643756 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/python/release/download.py
@@ +11,5 @@
>  from release.l10n import makeReleaseRepackUrls
>  from release.paths import makeCandidatesDir
>  from util.paths import windows2msys, msys2windows
>  from util.file import directoryContains
> +from util.retry import retrier

I think you can use "from redo import retrier" because the scripts sets the sitedir properly.

@@ +48,5 @@
>          url = '/'.join([p.strip('/') for p in [candidatesDir,
>                                                 urllib.quote(remoteFile)]])
>          log.info("Downloading %s to %s", url, fileName)
> +        for _ in retrier():
> +            f = open(fileName, "w")

Use "wb", it may hit us on Windows.

@@ +53,5 @@
> +            try:
> +                r = requests.get(url, stream=True)
> +                for chunk in r.iter_content():
> +                    f.write(chunk)
> +                f.close()

It feels like f.close() belongs to "finally". I'd use "with open(...) as f" even though it adds another level of indentation.

@@ +57,5 @@
> +                f.close()
> +                r.close()
> +                break
> +            except (requests.HTTPError, requests.ConnectionError):
> +                log.debug("Caught exception downloading")

Can you use log.exception(...) instead so we see the stacktrace?
Attachment #8643756 - Flags: feedback?(rail) → feedback+
All good points! Here's a fixed version that I actually tested. It raises exceptions when there's errors, unlike the first version.
Assignee: nobody → bhearsum
Attachment #8643756 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8645847 - Flags: review?(rail)
Attachment #8645847 - Flags: review?(rail) → review+
Attachment #8645847 - Flags: checked-in+
In production
We didn't hit this in 41.0b2, so the patch certainly didn't make things worse. I tried to find evidence of a download retrying, but there either wasn't any need to in this release, or there's no extra output when it does. Either way, this is almost certainly fixed.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
See Also: → 1206035
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: