Closed Bug 886822 Opened 12 years ago Closed 11 years ago

Errors during Chief deploy of TBPL: "fatal: Unable to create '/data/genericrhel6/www/.git/index.lock': File exists."

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

task
Not set
minor

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: emorley, Unassigned)

References

Details

Whilst doing the latest TBPL deploy using Chief (bug 886815), I got this error:

{
...
...
[localhost] err: [2013-06-25 07:45:19] Finished rsync_project (0.160s)
[localhost] err: [2013-06-25 07:45:19] Running commit_www
[localhost] err: [2013-06-25 07:45:19] [localhost] running: cd /data/genericrhel6/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['tbpl.mozilla.org']'
[localhost] err: [2013-06-25 07:45:20] [localhost] failed: cd /data/genericrhel6/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['tbpl.mozilla.org']' (0.787s)
[localhost] err: [localhost] err: fatal: Unable to create '/data/genericrhel6/www/.git/index.lock': File exists.
[localhost] err: [localhost] err:
[localhost] err: [localhost] err: If no other git process is currently running, this probably means a
[localhost] err: [localhost] err: git process crashed in this repository earlier. Make sure no other git
[localhost] err: [localhost] err: process is running and remove the file manually to continue.
[localhost] err: [localhost] err: fatal: Unable to create '/data/genericrhel6/www/.git/index.lock': File exists.
[localhost] err: [localhost] err:
[localhost] err: [localhost] err: If no other git process is currently running, this probably means a
[localhost] err: [localhost] err: git process crashed in this repository earlier. Make sure no other git
[localhost] err: [localhost] err: process is running and remove the file manually to continue.
}

Full log:
http://genericadm.private.phx1.mozilla.com/chief/tbpl.prod/logs/f3639f2c9462.1372171514

(not the first time it has occurred)

I'm guessing that's because someone else was deploying at the same time?
I then tried to deploy again, but got:

{
[localhost] err: [2013-06-25 07:46:37] Finished rsync_project (0.065s)
[localhost] err: [2013-06-25 07:46:37] Running commit_www
[localhost] err: [2013-06-25 07:46:37] [localhost] running: cd /data/genericrhel6/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['tbpl.mozilla.org']'
[localhost] err: [2013-06-25 07:46:43] [localhost] failed: cd /data/genericrhel6/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['tbpl.mozilla.org']' (6.099s)
[localhost] err: [localhost] out: # On branch master
[localhost] err: [localhost] out: nothing to commit (working directory clean)
}

Full log:
http://genericadm.private.phx1.mozilla.com/chief/tbpl.prod/logs/f3639f2c9462.1372171596

Few things:
1) I think the deploy script should retry in case the git lock file exists.
2) I don't understand why git thinks there is nothing to commit, if the first commit failed? This now leaves me with no way to deploy unless I push another dummy commit to the repo (whilst an easy workaround for now, it would be nice to fix the root cause if possible :-)).
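
A minimal sketch of the retry suggested in point 1, for illustration only. This is not Chief's actual deploy code: the function names, the attempts/delay parameters, and the injectable run/sleep hooks are assumptions. Only the `git add .` / `git commit -a -m` sequence and the index.lock and "nothing to commit" messages come from the logs in this comment.

```python
import subprocess
import time

LOCK_MARKER = "index.lock"  # substring of git's lock-file error message


def run_git(args, cwd):
    """Run git in cwd and return (exit code, combined stdout + stderr)."""
    proc = subprocess.run(["git", *args], cwd=cwd,
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def commit_with_retry(cwd, message, attempts=5, delay=2.0,
                      run=run_git, sleep=time.sleep):
    """Hypothetical helper: retry add + commit while the index is locked.

    Returns True once the tree is committed (or already clean), False if
    the lock never cleared. Any other git failure raises RuntimeError.
    """
    for attempt in range(1, attempts + 1):
        code, out = run(["add", "."], cwd)
        if code == 0:
            code, out = run(["commit", "-a", "-m", message], cwd)
        if code == 0 or "nothing to commit" in out:
            return True
        if LOCK_MARKER not in out:
            raise RuntimeError(out)  # not a lock problem, do not retry
        if attempt < attempts:
            sleep(delay)  # give the other deploy time to finish
    return False
```

A call such as commit_with_retry("/data/genericrhel6/www", "deploy tbpl.mozilla.org") would wait out a concurrent deploy rather than fail on the first lock error. It deliberately never deletes index.lock itself, since the lock may belong to a git process that is still running.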
Blocks: 827473
(In reply to Ed Morley [:edmorley UTC+1] from comment #0)
>
> Few things:
> 1) I think the deploy script should retry in case the git lock file exists.

this would require a rewrite in the way our deploy script works. i agree it's something that would be nice to have, but at this point, doing a repush should resolve this. see my next comments, which outline how we use (abuse?) git repos for deploying. the lock file error is the result of another push happening at the same time as yours.

> 2) I don't understand why git thinks there is nothing to commit, if the
> first commit failed? This now leaves me with no way to deploy unless I push
> another dummy commit to the repo (whilst an easy workaround for now, it
> would be nice to fix the root cause if possible :-)).

we use a local git repo to store all the code pulled down from github. then all the web heads do a git pull from the admin's local repo to get an update of the latest code. as you can see there are two "git" repos in play here. the one that says it has nothing to commit means it has already committed the code from the src directory (github) into the www directory (local repo).
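
The shared-repo behaviour described in this comment can be illustrated with a toy model. It is not real git and not the actual deploy script: SharedRepo, its methods, and the second site name other.example.org are hypothetical. It shows one plausible reading of the second log, assuming the concurrent deploy's `git add .; git commit -a` ran over the whole www tree and so also picked up the TBPL files that had already been rsynced.

```python
class SharedRepo:
    """Toy model of the admin host's shared www repo (not real git)."""

    def __init__(self):
        self.worktree = {}   # path -> content currently on disk
        self.committed = {}  # path -> content in the last commit

    def rsync(self, site, files):
        """Copy a site's files into the shared working tree."""
        for name, content in files.items():
            self.worktree[f"{site}/{name}"] = content

    def commit_all(self):
        """Like `git add .; git commit -a`: commit every pending change."""
        if self.worktree == self.committed:
            return "nothing to commit"
        self.committed = dict(self.worktree)
        return "committed"


def simulate():
    repo = SharedRepo()
    # Deploy A (TBPL) copies its files, then fails to commit on the lock.
    repo.rsync("tbpl.mozilla.org", {"index.html": "new"})
    # Deploy B (hypothetical other site) holds the lock and commits the
    # whole tree, which includes A's pending files.
    repo.rsync("other.example.org", {"index.html": "new"})
    first = repo.commit_all()
    # Deploy A retries: rsync changes nothing, so the tree is clean.
    repo.rsync("tbpl.mozilla.org", {"index.html": "new"})
    second = repo.commit_all()
    return first, second, repo.committed
```

Under this reading the TBPL files are already in the local repo after the first attempt, which would explain why the retry reports a clean working directory.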
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
marking as r/wontfix. while potentially annoying, it's functioning as expected.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard