Closed Bug 654938 Opened 9 years ago Closed 9 years ago

StagingRepositorySetupFactory should retry on clone failures

Categories

(Release Engineering :: General, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rail, Assigned: rail)

References

Details

(Whiteboard: [staging][automation][simple])

Attachments

(2 files, 2 obsolete files)

Attached patch use retry.py (obsolete)
The clone usually fails because of hg.m.o timeouts or closed connections.
Attached patch use retry.py (obsolete)
Tested in staging.

Normal clone output:
==========================
Calling <function run_with_timeout at 0xb7c84fb4> with args: (['ssh', '-l', 'stage-ffxbld', '-oIdentityFile=~cltbld/.ssh/ffxbld_dsa', 'hg.mozilla.org', 'clone', 'nn-NO', 'l10n-central/nn-NO'], 300, None, None, False, True), kwargs: {}, attempt #1
Executing: ['ssh', '-l', 'stage-ffxbld', '-oIdentityFile=~cltbld/.ssh/ffxbld_dsa', 'hg.mozilla.org', 'clone', 'nn-NO', 'l10n-central/nn-NO']
Process stdio:
Please wait.  Cloning /l10n-central/nn-NO to /users/stage-ffxbld/nn-NO
Clone complete.
Fixing permissions, don't interrupt.

Process stderr:
==========================

Output with failure:
==========================
Calling <function run_with_timeout at 0xb7c6cfb4> with args: (['ssh', '-l', 'stage-ffxbld', '-oIdentityFile=~cltbld/.ssh/ffxbld_dsa', 'hg.mozilla.org', 'clone', 'nl', 'l10n-central/nl'], 300, None, None, False, True), kwargs: {}, attempt #4
Executing: ['ssh', '-l', 'stage-ffxbld', '-oIdentityFile=~cltbld/.ssh/ffxbld_dsa', 'hg.mozilla.org', 'clone', 'nl', 'l10n-central/nl']
Failed, sleeping 30 seconds before retrying
Calling <function run_with_timeout at 0xb7c6cfb4> with args: (['ssh', '-l', 'stage-ffxbld', '-oIdentityFile=~cltbld/.ssh/ffxbld_dsa', 'hg.mozilla.org', 'clone', 'nl', 'l10n-central/nl'], 300, None, None, False, True), kwargs: {}, attempt #5
Executing: ['ssh', '-l', 'stage-ffxbld', '-oIdentityFile=~cltbld/.ssh/ffxbld_dsa', 'hg.mozilla.org', 'clone', 'nl', 'l10n-central/nl']
Process stdio:
Please wait.  Cloning /l10n-central/nl to /users/stage-ffxbld/nl

Process stderr:
Killed by signal 15.

Process stdio:
Please wait.  Cloning /l10n-central/nl to /users/stage-ffxbld/nl
Clone complete.
Fixing permissions, don't interrupt.

Process stderr:
==========================
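
For readers without the patch at hand, the retry behaviour visible in the logs above can be sketched roughly as follows. This is not the actual retry.py; the function name, its parameters, and the defaults (5 attempts, 30 second sleep) are assumptions chosen to mirror the log lines.

```python
import time


def retry(action, attempts=5, sleeptime=30, args=(), kwargs=None, sleep=time.sleep):
    """Call action(*args, **kwargs) until it succeeds or attempts run out.

    Hypothetical stand-in for retry.py: any exception counts as a failure,
    and the last failure is re-raised once all attempts are used up.
    """
    kwargs = kwargs or {}
    for attempt in range(1, attempts + 1):
        print("Calling %r with args: %r, kwargs: %r, attempt #%d"
              % (action, args, kwargs, attempt))
        try:
            return action(*args, **kwargs)
        except Exception:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the failure
            print("Failed, sleeping %d seconds before retrying" % sleeptime)
            sleep(sleeptime)
```

The `sleep` argument exists only so the wait can be stubbed out when trying the sketch locally.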
Attachment #530253 - Attachment is obsolete: true
Attachment #530258 - Flags: review?(catlee)
Comment on attachment 530258 [details] [diff] [review]
use retry.py

Testing another, more robust version.
Attachment #530258 - Flags: review?(catlee)
* Add "?rnd=$randomInt" to the end of the user repo URLs. Sometimes we get a 404 even though the repo exists.

Generates URLs like the following:
http://hg.mozilla.org/users/stage-ffxbld/pt-PT?rnd=528539
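
A minimal sketch of how such a URL could be built. The helper name and the six-digit range are assumptions, not taken from the patch. The idea that a unique query string makes each request look distinct to whatever is returning the stale 404 is also an assumption.

```python
import random


def add_random_suffix(url, rng=random):
    """Append rnd=<random int> to url as a query parameter.

    Hypothetical helper: uses '?' when the URL has no query string yet
    and '&' otherwise. The 100000-999999 range is an arbitrary choice.
    """
    separator = "&" if "?" in url else "?"
    return "%s%srnd=%d" % (url, separator, rng.randint(100000, 999999))
```

For example, add_random_suffix("http://hg.mozilla.org/users/stage-ffxbld/pt-PT") returns the same URL with a different ?rnd= value on each call.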
Attachment #530258 - Attachment is obsolete: true
Attachment #530274 - Flags: review?(catlee)
Attachment #530274 - Flags: review?(catlee) → review+
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
retry.py has its own timeout, so cloning the mozilla repo times out.
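
A rough illustration of the problem, assuming the wrapper's limit is enforced the way subprocess.run enforces one. The run_with_timeout below is a stand-in for the function of the same name in the logs, not its real code, and clone() and step_timeout are hypothetical names. Passing the outer step's limit through to the wrapper keeps the two values from drifting apart.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and return its stdout.

    subprocess.run kills the child and raises subprocess.TimeoutExpired
    when the command runs longer than timeout seconds.
    """
    completed = subprocess.run(cmd, timeout=timeout, capture_output=True,
                               text=True, check=True)
    return completed.stdout


def clone(cmd, step_timeout):
    # Hypothetical: reuse the outer step's limit instead of a separate,
    # shorter hard-coded value inside the wrapper.
    return run_with_timeout(cmd, timeout=step_timeout)
```

With a fixed short limit in the wrapper, a long-running clone is killed on every attempt no matter how generous the outer step timeout is.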
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch sync timeouts
Attachment #530333 - Flags: review?(catlee)
Attachment #530333 - Flags: review?(catlee) → review+
Passed.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering