Bug 685124
Opened 13 years ago
Closed 13 years ago
hgtool needs to recover from being killed
Categories
(Release Engineering :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Assigned: catlee)
Attachments
(1 file, 1 obsolete file)
4.55 KB, patch | bhearsum: review+, nthomas: review+, nthomas: checked-in+
If hgtool is killed because it's taking too long, .hg/hgrc doesn't get written, which means there is no [paths] section, which in turn breaks buildsymbols.
hgtool needs to detect this case and recover from it.
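A minimal sketch of what the detection could look like, assuming the hgrc is simple enough for Python's configparser to read (no %include lines or other Mercurial-specific syntax). The function names read_default_path and has_expected_default are hypothetical and are not taken from hgtool or from the attached patch.

```python
import configparser
import os


def read_default_path(repo_dir):
    """Return the [paths] default value from repo_dir/.hg/hgrc, or None."""
    hgrc = os.path.join(repo_dir, ".hg", "hgrc")
    # RawConfigParser performs no interpolation, so '%' in URLs is safe.
    parser = configparser.RawConfigParser()
    try:
        # read() silently skips files that do not exist.
        parser.read(hgrc)
    except configparser.Error:
        # Treat an unparseable hgrc the same as a missing one.
        return None
    if parser.has_option("paths", "default"):
        return parser.get("paths", "default")
    return None


def has_expected_default(repo_dir, expected_url):
    """True only if hgrc exists and its default path matches expected_url."""
    return read_default_path(repo_dir) == expected_url
```

A repo whose clone was killed before hgrc was written would make has_expected_default return False, which is the signal a recovery step could act on.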
Comment 1 (Assignee)•13 years ago
Attachment #558927 -
Flags: review?(bhearsum)
Comment 2•13 years ago
The manual fix for these slaves (so far linux-ix-slave22, linux-ix-slave30, linux64-ix-slave06) is to drop this
[paths]
default = http://hg.mozilla.org/integration/mozilla-inbound
at /builds/hg-shared/integration/mozilla-inbound/.hg/hgrc
/builds/slave/m-in-lnx/build/.hg/hgrc
(adjusting lnx vs lnx64, adding -dbg if necessary)
I found that fixing the first location then doing an hg up at the copy doesn't fix up hgrc at the copy, so the patch is probably incomplete.
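The manual fix could be scripted roughly as follows. This is only a sketch: write_default_path is a hypothetical helper, it assumes the .hg directory already exists, and it deliberately leaves an existing hgrc alone rather than trying to merge into it.

```python
import os


def write_default_path(repo_dir, url):
    """Create repo_dir/.hg/hgrc with a [paths] default entry if it is missing.

    Returns True if a file was written, False if an hgrc already existed.
    Assumes repo_dir/.hg already exists.
    """
    hgrc = os.path.join(repo_dir, ".hg", "hgrc")
    if os.path.exists(hgrc):
        # Do not overwrite whatever is already configured.
        return False
    with open(hgrc, "w") as f:
        f.write("[paths]\ndefault = %s\n" % url)
    return True
```

Since fixing the shared repo does not fix the copy, it would have to be called once per location, for example once for the hg-shared directory and once for the build directory.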
Comment 3•13 years ago
Comment on attachment 558927 [details] [diff] [review]
clobber shared repo if the default path isn't correct
(In reply to Nick Thomas [:nthomas] from comment #2)
> The manual fix for these slaves (so far linux-ix-slave22, linux-ix-slave30,
> linux64-ix-slave06), is to drop this
>
> [paths]
> default = http://hg.mozilla.org/integration/mozilla-inbound
>
> at /builds/hg-shared/integration/mozilla-inbound/.hg/hgrc
> /builds/slave/m-in-lnx/build/.hg/hgrc
> (adjusting lnx vs lnx64, adding -dbg if necessary)
>
> I found that fixing the first location then doing an hg up at the copy
> doesn't fix up hgrc at the copy, so the patch is probably incomplete.
Hmmm, this confuses me - I'm not sure how we can get into a state where both the shared version and final destination are missing an hgrc. If hgtool gets killed while cloning the hg-shared directory, it won't have an hgrc, and the m-in-lnx one won't get created at all. If hgtool gets killed while sharing to the m-in-lnx one, the shared version will be fine, and the m-in-lnx one won't have an hgrc at all, but sharedpath will be set.
Catlee's patch addresses the former. I'm not sure we need to address hgrc in the latter, since subsequent updates should work fine because of sharedpath being set. Let me know if I've missed something here.
The patch looks fine to me, but r- because it needs tests!
Attachment #558927 -
Flags: review?(bhearsum) → review-
Comment 4•13 years ago
linux64-ix-slave06 again, https://tbpl.mozilla.org/php/getParsedLog.php?id=6332102&full=1
Comment 5 (Assignee)•13 years ago
Attachment #558927 -
Attachment is obsolete: true
Attachment #559221 -
Flags: review?(bhearsum)
Comment 6•13 years ago
New shooter, linux-ix-slave38 in https://tbpl.mozilla.org/php/getParsedLog.php?id=6333123&full=1
Comment 7 (Assignee)•13 years ago
(In reply to Phil Ringnalda (:philor) from comment #6)
> New shooter, linux-ix-slave38 in
> https://tbpl.mozilla.org/php/getParsedLog.php?id=6333123&full=1
I just patched up this slave.
Comment 8•13 years ago
Comment 9•13 years ago
Comment on attachment 559221 [details] [diff] [review]
yay for tests!
Review of attachment 559221 [details] [diff] [review]:
-----------------------------------------------------------------
Looks fine to me, but Nick should have a look too, to make sure my statements in comment #3 are indeed correct.
Attachment #559221 -
Flags: review?(nrthomas)
Attachment #559221 -
Flags: review?(bhearsum)
Attachment #559221 -
Flags: review+
Comment 10•13 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #3)
> Hmmm, this confuses me - I'm not sure how we can get into a state where both
> the shared version and final destination are missing an hgrc. If hgtool gets
> killed while cloning the hg-shared directory, it won't have an hgrc, and the
> m-in-lnx one won't get created at all.
Agreed. The issue is what happens on the next build. Take linux-ix-slave22 on Wednesday, where the initial clone times out:
Updating shared repo
command: START
command: hg clone -U -r 8a9c10ebbf00795f005b3612fe472f5bd2bf3382 http://hg.mozilla.org/integration/mozilla-inbound /builds/hg-shared/integration/mozilla-inbound
command: cwd: /builds/slave/m-in-lnx
command: output:
command: output:
command timed out: 3600 seconds without output, attempting to kill
process killed by signal 9
--------
Then the next build pulls rather than clobbering & cloning:
Updating shared repo
command: START
command: hg pull -r 8a9c10ebbf00795f005b3612fe472f5bd2bf3382 http://hg.mozilla.org/integration/mozilla-inbound
command: cwd: /builds/hg-shared/integration/mozilla-inbound
command: output:
pulling from http://hg.mozilla.org/integration/mozilla-inbound
requesting all changes
adding changesets
adding manifests
adding file changes
added 76311 changesets with 360561 changes to 79556 files
(run 'hg update' to get a working copy)
command: END (325.95s elapsed)
----
Buildbot has used a kill -9 for the timeout, so I suspect that abruptly leaves the share in a state with some/most of the history but no hgrc. Then the 2nd build fixes the history but not the hgrc, and we set up our copy from that busted state.
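To illustrate the sequence above, here is a rough sketch of how the "pull vs. clobber and clone" decision could take the hgrc into account, in the spirit of the patch title ("clobber shared repo if the default path isn't correct"). It is not the actual patch: plan_shared_update is a hypothetical name, and the sketch only decides what to do and removes a broken share; it does not run hg.

```python
import configparser
import os
import shutil


def plan_shared_update(shared_dir, expected_url):
    """Return 'clone' or 'pull', clobbering shared_dir first if it looks broken."""
    if not os.path.isdir(shared_dir):
        return "clone"
    parser = configparser.RawConfigParser()
    try:
        # A missing hgrc is skipped silently, leaving the parser empty.
        parser.read(os.path.join(shared_dir, ".hg", "hgrc"))
        ok = (parser.has_option("paths", "default")
              and parser.get("paths", "default") == expected_url)
    except configparser.Error:
        ok = False
    if ok:
        return "pull"
    # The share exists but has no usable default path, e.g. because an
    # earlier clone was killed; remove it so the caller clones from scratch.
    shutil.rmtree(shared_dir)
    return "clone"
```

With a check like this, the second build in the logs above would have clobbered and re-cloned instead of pulling into the share that had no hgrc.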
Updated•13 years ago
Attachment #559221 -
Flags: review?(nrthomas) → review+
Comment 11•13 years ago
Comment 12•13 years ago
linux-ix-slave25 is fixed.
Comment 13•13 years ago
Comment on attachment 559221 [details] [diff] [review]
yay for tests!
hg.mozilla.org/build/tools/rev/6fd5d6755de5 (doesn't require a reconfig)
Attachment #559221 -
Flags: checked-in+
Updated (Assignee)•13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering