Closed
Bug 810393
Opened 12 years ago
Closed 11 years ago
deploy release runner
Categories
(Release Engineering :: Release Automation: Other, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Assigned: rail)
References
Details
(Whiteboard: [shipit])
Attachments
(1 file)
637 bytes, patch | bhearsum: review+
We need to deploy the release runner portion of the kickoff system. Our current idea is to run it on buildbot-master36.
Assignee | ||
Updated•12 years ago
|
Whiteboard: [kickoff]
Assignee | ||
Updated•12 years ago
|
Priority: -- → P2
Assignee | ||
Comment 1•11 years ago
|
||
I deployed the current dev version on bm36.

ssh cltbld@buildbot-master36

# install supervisord
su -
yum install supervisor
chkconfig supervisord on

# Add the following section to /etc/supervisor.conf:
[program:releaserunner]
command=/home/cltbld/release-runner/build-tools/buildfarm/release/release-runner.sh
exitcodes=0
user=cltbld
log_stderr=true
log_stdout=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/release-runner.log

# as cltbld
cd ~
mkdir release-runner && cd release-runner
virtualenv-2.6 --no-site-packages $PWD/venv
source venv/bin/activate
pip install simplejson
pip install fabric
pip install buildbot
git clone git://github.com/rail/build-tools.git
cd build-tools
git checkout -b release-runner origin/release-runner-comments

# Set up ~cltbld/.release-runner.ini
service supervisord start
Reporter | ||
Comment 2•11 years ago
|
||
This isn't quite working. Here are the problems I've encountered:
* Killing supervisord doesn't terminate release runner (at least, not when it's doing something like cloning a repository). I end up with the .sh process dead and a release-runner.py in sleep state. It eventually dies, presumably after it finishes retrying its current operation.
* Not getting all of the output from release runner. We get a lot of output when polling release kickoff, but almost nothing when cloning repositories. For example:

==> /var/log/supervisor/release-runner.log <==
Buildbot version: 0.8.7p1
Twisted version: 12.3.0
2012-12-28 11:42:32,666 - DEBUG - Fetching release requests
2012-12-28 11:42:32,712 - INFO - Got a new release request: {'status': 'Pending', 'product': 'fennec', 'name': 'Fennec-20.0-build1', 'dashboardCheck': False, 'buildNumber': 1, 'ready': True, 'l10nChangesets': '{\r\n "ca": {\r\n "revision": "3e911ef81869",\r\n "platforms": ["android"]\r\n },\r\n "cs": {\r\n "revision": "4d8963178613",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "da": {\r\n "revision": "65d017f4f5fe",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "de": {\r\n "revision": "d6ff03c97175",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "es-AR": {\r\n "revision": "03c68802bdc4",\r\n "platforms": ["android"]\r\n },\r\n "es-ES": {\r\n "revision": "d2185a7e4f7f",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "fi": {\r\n "revision": "68cb3de2b609",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "fr": {\r\n "revision": "6513cd7d17ae",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "fy-NL": {\r\n "revision": "0bb46089e086",\r\n "platforms": ["android"]\r\n },\r\n "ga-IE": {\r\n "revision": "b611f1be732c",\r\n "platforms": ["android"]\r\n },\r\n "gl": {\r\n "revision": "6a40cee822a6",\r\n "platforms": ["android"]\r\n },\r\n "it": {\r\n "revision": "d986d54e6074",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "ja": {\r\n "revision": "066c7401bfdc",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "ko": {\r\n "revision": "de1302d7a7f9",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "lt": {\r\n "revision": "c2087557d498",\r\n "platforms": ["android"]\r\n },\r\n "nb-NO": {\r\n "revision": "1a967b024168",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "nl": {\r\n "revision": "3aec3a20b84b",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "pa-IN": {\r\n "revision": "4afbb88e0ccf",\r\n "platforms": ["android"]\r\n },\r\n "pl": {\r\n "revision": "9ba4c01e429f",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "pt-BR": {\r\n "revision": "f230d9a3c797",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "pt-PT": {\r\n "revision": "25ed21fde549",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "ru": {\r\n "revision": "ccfccb0a343d",\r\n "platforms": ["android", "android-multilocale"]\r\n },\r\n "sk": {\r\n "revision": "e371468ef6c9",\r\n "platforms": ["android"]\r\n },\r\n "sl": {\r\n "revision": "07442918d3aa",\r\n "platforms": ["android"]\r\n },\r\n "uk": {\r\n "revision": "e54754531f14",\r\n "platforms": ["android"]\r\n },\r\n "zh-CN": {\r\n "revision": "65723862ba6d",\r\n "platforms": ["android"]\r\n },\r\n "zh-TW": {\r\n "revision": "a5dac614368e",\r\n "platforms": ["android"]\r\n }\r\n}\r\n', 'version': '20.0', 'branch': 'releases/mozilla-beta', 'submitter': 'rail', 'mozillaRevision': 'default', 'complete': False}
warning: hg.mozilla.org certificate not verified (check web.cacerts config setting)
pulling from https://hg.mozilla.org/users/bhearsum_mozilla.com/buildbot-configs
searching for changes
no changes found
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
abort: no suitable response from remote hg!

The latter makes it very difficult to debug other problems.
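One possible way to get at the missing clone output, sketched here under assumptions: how release-runner.py actually launches its subprocesses is not shown in this bug, so `run_and_log` and the logger name below are hypothetical. The idea is to merge the child's stderr into stdout and forward every line to the log, so whatever the child prints ends up in the same file that supervisord captures.

```python
import logging
import subprocess
import sys

# Hypothetical logger name; the real script's logging setup is not shown here.
log = logging.getLogger("release-runner-sketch")


def run_and_log(cmd):
    """Run cmd and forward each line of its combined stdout/stderr to the log.

    Returns a (return_code, lines) tuple so callers can inspect the output.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
        universal_newlines=True,   # decode the output as text
    )
    lines = []
    for line in proc.stdout:
        line = line.rstrip("\n")
        log.info("%s", line)
        lines.append(line)
    proc.stdout.close()
    return proc.wait(), lines


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
    run_and_log([sys.executable, "-c", "print('hello from the child process')"])
```

This only addresses the second item above. It does not change what happens to the child when supervisord is killed.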
Reporter | ||
Comment 3•11 years ago
|
||
Release runner hit an ISE 500 on the 24th and e-mailed us to say that it will sleep for 259200 seconds before retry. A few problems here:
1) ISE 500 from the server shouldn't necessarily cause a sleep like that. I think you fixed this already by retrying requests to ship it?
2) Nothing in the server log indicating that it would sleep - you'd only know that if you got the e-mail.
3) More than 259200 seconds (3 days) have gone by since then, and it's still not polling again.
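For point 1), a minimal sketch of retrying a transient server error a few times with a short delay instead of going straight to a multi-day sleep. This is not the fix from any real patch: `TransientServerError`, `retry_transient`, and the default attempt count and delay are all hypothetical names and values chosen for illustration.

```python
import logging
import time

# Hypothetical logger name, for illustration only.
log = logging.getLogger("release-runner-sketch")


class TransientServerError(Exception):
    """Hypothetical marker for a 5xx response from the server."""


def retry_transient(action, attempts=5, delay=30, sleep=time.sleep):
    """Call action(), retrying on TransientServerError.

    Sleeps `delay` seconds between attempts and re-raises the error
    once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TransientServerError as error:
            if attempt == attempts:
                raise
            log.warning(
                "Attempt %d of %d failed (%s); retrying in %d seconds",
                attempt, attempts, error, delay,
            )
            sleep(delay)
```

The `sleep` parameter is injectable so the behaviour can be exercised without actually waiting.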
Assignee | ||
Comment 4•11 years ago
|
||
(In reply to Ben Hearsum [:bhearsum] from comment #3)
> Release runner hit an ISE 500 on the 24th and e-mailed us to say that it
> will sleep for 259200 seconds before retry. A few problems here:
> 1) ISE 500 from the server shouldn't necessarily cause a sleep like that. I
> think you fixed this already by retrying requests to ship it?

This should be fixed by bug 833062.

> 2) Nothing in the server log indicating that it would sleep - you'd only
> know that if you got the e-mail.

The attached patch prints a line which goes to logs.

> 3) More than 259200 seconds (3 days) have gone by since then, and it's still
> not polling again.

Hmmm, not sure what happens here... I suspect supervisord is not restarting the process when it exits 0. I'll try to reproduce this.
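For point 2), a rough sketch of what "print a line which goes to logs" could look like. The attached patch itself is not reproduced in this bug, so the function names and message wording here are assumptions. The sketch also renders the duration in human-readable form, since 259200 seconds is not obviously 3 days at a glance.

```python
import datetime
import logging
import time

# Hypothetical logger name, for illustration only.
log = logging.getLogger("release-runner-sketch")


def describe_sleep(seconds):
    """Build the log message, showing both raw seconds and a readable duration."""
    readable = datetime.timedelta(seconds=seconds)
    return "Sleeping for %d seconds (%s) before retrying" % (seconds, readable)


def sleep_with_log(seconds, sleep=time.sleep):
    """Log the upcoming sleep first, so it shows up in the log and not only in e-mail."""
    log.warning(describe_sleep(seconds))
    sleep(seconds)


if __name__ == "__main__":
    print(describe_sleep(259200))
```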
Attachment #708142 -
Flags: review?(bhearsum)
Reporter | ||
Updated•11 years ago
|
Attachment #708142 -
Flags: review?(bhearsum) → review+
Assignee | ||
Comment 5•11 years ago
|
||
I added autorestart=true to the config, which looks like it solves problem 3); at least, it worked fine with a simple test script.
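To illustrate why exit code 0 combined with exitcodes=0 could leave the process stopped, here is a small model of the restart decision as described in the Supervisor 3.x documentation, where autorestart accepts false, unexpected, or true. This is an assumption about the version in use: the supervisor package installed on bm36 may behave differently, and `should_restart` is a hypothetical helper, not part of Supervisor.

```python
def should_restart(autorestart, exit_code, expected_exitcodes):
    """Model of the documented autorestart semantics (Supervisor 3.x assumed).

    autorestart: "true", "false", or "unexpected"
    exit_code: the code the process exited with
    expected_exitcodes: the codes listed in the program's exitcodes= setting
    """
    if autorestart == "true":
        # Restart unconditionally, whatever the exit code.
        return True
    if autorestart == "false":
        return False
    if autorestart == "unexpected":
        # Restart only when the exit code is not one of the expected ones.
        return exit_code not in expected_exitcodes
    raise ValueError("unknown autorestart value: %r" % (autorestart,))


if __name__ == "__main__":
    # With exitcodes=0, a clean exit is "expected" and would not be restarted.
    print(should_restart("unexpected", 0, {0}))
    # With autorestart=true, the same exit would be restarted.
    print(should_restart("true", 0, {0}))
```

Under this model, a release runner that exits 0 after its sleep stays down unless autorestart is set to true, which is consistent with the behaviour reported in comment 3.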
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•11 years ago
|
Whiteboard: [kickoff] → [shipit]
Updated•11 years ago
|
Product: mozilla.org → Release Engineering