Closed
Bug 1047207
Opened 10 years ago
Closed 10 years ago
hgtool should retry or exit if it hits a DNS or server error during pull, not clobber and unbundle
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: nthomas, Assigned: catlee)
References
Details
(Whiteboard: [capacity])
Attachments
(2 files)
17.35 KB, patch | rail: review+ | catlee: checked-in+
29.85 KB, patch | rail: review+ | catlee: checked-in+
Wasteful:

21:15:04 INFO - Copy/paste: /usr/local/bin/hgtool.py --bundle https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles/mozilla-b2g32_v2_0.hg https://hg.mozilla.org/releases/mozilla-b2g32_v2_0 /builds/b2g_bumper/v2.0/build/mozilla-b2g32_v2_0
21:15:04 INFO - Using env: {'PATH': '/usr/local/bin:/usr/bin:/bin'}
21:15:04 INFO - Reporting hg version in use
21:15:04 INFO - command: START
21:15:04 INFO - command: hg -q version
21:15:04 INFO - command: cwd: .
21:15:04 INFO - command: output:
21:15:05 INFO - Mercurial Distributed SCM (version 2.5.4)
21:15:05 INFO - command: END (0.34s elapsed)
21:15:05 INFO - command: START
21:15:05 INFO - command: hg path default
21:15:05 INFO - command: cwd: /builds/b2g_bumper/v2.0/build/mozilla-b2g32_v2_0
21:15:05 INFO - command: output:
21:15:05 INFO - https://hg.mozilla.org/releases/mozilla-b2g32_v2_0
21:15:05 INFO - command: END (0.38 elapsed)
21:15:05 INFO - command: START
21:15:05 INFO - command: hg pull https://hg.mozilla.org/releases/mozilla-b2g32_v2_0
21:15:05 INFO - command: cwd: /builds/b2g_bumper/v2.0/build/mozilla-b2g32_v2_0
21:15:05 INFO - command: output:
21:15:31 ERROR - abort: error: Name or service not known
21:15:31 ERROR - Automation Error: hg not responding
21:15:31 INFO - command: ERROR
21:15:31 INFO - Traceback (most recent call last):
21:15:31 INFO - File "<string>", line 47, in run_cmd
21:15:31 INFO - File "/usr/lib64/python2.6/subprocess.py", line 502, in check_call
21:15:31 INFO - raise CalledProcessError(retcode, cmd)
21:15:31 INFO - CalledProcessError: Command '['hg', 'pull', 'https://hg.mozilla.org/releases/mozilla-b2g32_v2_0']' returned non-zero exit status 255
21:15:31 INFO - command: END (25.54s elapsed)
21:15:31 INFO - Error pulling changes into /builds/b2g_bumper/v2.0/build/mozilla-b2g32_v2_0 from https://hg.mozilla.org/releases/mozilla-b2g32_v2_0; clobbering
21:17:44 INFO - Attempting to initialize clone with bundles
21:17:44 INFO - command: START
21:17:44 INFO - command: hg init /builds/b2g_bumper/v2.0/build/mozilla-b2g32_v2_0
21:17:44 INFO - command: cwd: /builds/b2g_bumper/v2.0/build
21:17:44 INFO - command: output:
21:17:44 INFO - command: END (0.22s elapsed)
21:17:44 INFO - Trying to use bundle https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles/mozilla-b2g32_v2_0.hg
21:17:44 INFO - command: START
21:17:44 INFO - command: hg unbundle https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles/mozilla-b2g32_v2_0.hg
21:17:44 INFO - command: cwd: /builds/b2g_bumper/v2.0/build/mozilla-b2g32_v2_0

And more than 30 minutes later it is still going, plus three others on bm66 for b2g_bumper.
Updated • 10 years ago (Reporter)
Summary: hgtool should abort and exit if it hits a DNS error, not clobber and unbundle → hgtool should retry or exit if it hits a DNS error during pull, not clobber and unbundle
Comment 1 • 10 years ago (Reporter)
Same applies to 500/502/503/504 responses. The hard bit seems to be that the exit statuses of mercurial are poor, so we may have to parse the output.
Summary: hgtool should retry or exit if it hits a DNS error during pull, not clobber and unbundle → hgtool should retry or exit if it hits a DNS or server error during pull, not clobber and unbundle
Updated • 10 years ago (Assignee)
Assignee: nobody → catlee
Comment 2 • 10 years ago
You'll have to parse output to distinguish DNS failures from other failures. Also, I question the sanity of our automation environment if hg.mozilla.org ever fails to resolve. I'm somewhat surprised at how often DNS seems to break in automation. I think that is a problem you should investigate fixing. Also, I think wiping the local repo after a single pull failure is bad. hg.mozilla.org just has to go down for a few minutes and then you effectively DDoS hg.mozilla.org. I'm trying to think of a valid scenario for wiping the local repo immediately after a pull failure. Corruption is the only one that comes to mind.
Comment 3 • 10 years ago (Assignee)
Networks and servers are flaky, so we'll always have to deal with some number of hiccups. This bug is precisely about what you describe: not blowing away local repos on a single remote failure. My first approach is going to be to parse hg's output to look for DNS or HTTP 5XX errors, and retry pull/clone operations in those cases. Hopefully hg's output here isn't version-, platform-, or locale-dependent.
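For illustration, the approach described in this comment could look roughly like the sketch below. It is not the code from the attached patches: the function and class names are hypothetical, it uses modern Python rather than the Python 2.6 seen in the log, the DNS pattern is copied from the log in the bug description, and the HTTP 5XX pattern is an assumption about hg's wording that may vary between versions.

```python
import re
import subprocess
import time

# Output patterns treated as "remote temporarily unavailable".
# The DNS message is copied from the log in this bug; the HTTP 5XX wording
# is an assumption and may differ between hg versions.
TRANSIENT_PATTERNS = [
    re.compile(r"abort: error: Name or service not known"),
    re.compile(r"abort: HTTP Error 5\d\d"),
]


class HgError(Exception):
    """hg failed for a reason that does not look transient."""


class TransientHgError(HgError):
    """hg kept failing with DNS/5XX-style errors on every attempt."""


def is_transient_failure(output):
    """Return True if hg's output matches one of the transient patterns."""
    return any(p.search(output) for p in TRANSIENT_PATTERNS)


def run_hg(args):
    """Run hg and return (exit_code, combined stdout and stderr text)."""
    proc = subprocess.run(
        ["hg"] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def hg_with_retry(args, attempts=3, delay=30, runner=run_hg, sleep=time.sleep):
    """Retry hg on transient failures; never touch the local repo here."""
    output = ""
    for attempt in range(1, attempts + 1):
        code, output = runner(args)
        if code == 0:
            return output
        if not is_transient_failure(output):
            # Unknown failure: let the caller decide what to do.
            raise HgError(output)
        if attempt < attempts:
            sleep(delay)
    raise TransientHgError(output)
```

A caller could catch TransientHgError and exit without clobbering, and reserve the clobber-and-unbundle path for other failures. The runner and sleep parameters exist only so the retry logic can be exercised without a network.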
Comment 4 • 10 years ago
hg does have locale-dependent output. Automated agents should have the HGPLAIN environment variable set to keep hg's output as consistent as possible.

HGPLAIN
    When set, this disables any configuration settings that might change Mercurial's default output. This includes encoding, defaults, verbose mode, debug mode, quiet mode, tracebacks, and localization. This can be useful when scripting against Mercurial in the face of existing user configuration. Equivalent options set via command line flags or environment variables are not overridden.
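As a minimal sketch of that suggestion, a wrapper could add HGPLAIN to the environment it passes to every hg subprocess. The helper name plain_hg_env is hypothetical, and the value "1" is an arbitrary choice, since the help text above only requires the variable to be set.

```python
import os


def plain_hg_env(base_env=None):
    """Return a copy of the environment with HGPLAIN set.

    base_env defaults to the current process environment; the input
    mapping is copied, not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HGPLAIN"] = "1"
    return env
```

With the environment shown in the log, plain_hg_env({'PATH': '/usr/local/bin:/usr/bin:/bin'}) would hand hg the same PATH plus HGPLAIN, for example via the env argument of a subprocess call.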
Comment 5 • 10 years ago (Assignee)
Attachment #8489362 - Flags: review?(rail)
Updated • 10 years ago
Attachment #8489362 - Flags: review?(rail) → review+
Comment 6 • 10 years ago (Assignee)
Comment on attachment 8489362: retry pull/clone operations
If this sticks, we still need to update the pre-built version in puppet.
Attachment #8489362 - Flags: checked-in+
Comment 7 • 10 years ago (Assignee)
Attachment #8490945 - Flags: review?(rail)
Comment 8 • 10 years ago
Comment on attachment 8490945: update dependency-free version of hgtool in puppet
Rubber stamp 8-)
Attachment #8490945 - Flags: review?(rail) → review+
Updated • 10 years ago (Assignee)
Attachment #8490945 - Flags: checked-in+
Updated • 10 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated • 6 years ago
Component: General Automation → General