Closed Bug 980435 Opened 11 years ago Closed 11 years ago

processor lockfiles outlive their usefulness after SIGKILL

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

People

(Reporter: lars, Assigned: dmaher)

Details

K Lars Lohn [:lars] [:klohn]

Reporter

Description

11 years ago

The socorro-processor init.d script uses the killproc function to shut down.  If the processor exceeds killproc's delay time, killproc sends SIGKILL to the processor.  While that stops the processor, it also appears to stop "daemonize" from properly removing the lockfile, so we end up having to remove it manually.  That breaks automation for deployments of new code that require a processor restart.

Look into using the pidfile instead of the lockfile, as the system documentation suggests is possible.  Alternatively, in socorro-processor, try to just delete the lockfile after killproc has run.

Daniel Maher [:phrawzty]

Assignee

Comment 1

11 years ago

(In reply to K Lars Lohn [:lars] [:klohn] from comment #0)
> look into using the pidfile instead of the lockfile as system documentation
> suggests is possible.  Alternatively, in socorro-processor, try to just
> delete the lock file after killproc has run.

To amplify:

The "killproc" function is provided by the RHEL-standard init library "functions".  The final code block of that function is as follows:

    # Remove pid file if any.
    if [ -z "$killlevel" ]; then
        rm -f "${pid_file:-/var/run/$base.pid}"
    fi
    return $RC

Since we are not explicitly declaring a kill level (i.e. we run killproc without passing one as an argument), this block should run; of course, it only removes the pidfile (not the lockfile), which is where an alternate usage of daemonize comes into play.  The man page of daemonize states:

       It is possible to use the pidfile as the lock file (e.g., "-p /var/run/foo -l /var/run/foo"), though typical daemons use separate files.

This would have the net effect of using the same file for both purposes, meaning that when killproc removes the pidfile, it would also remove the lockfile, as they would be one and the same.  This could be a relatively clean solution to the problem that leverages existing system functionality without altering expected behaviour too much (i.e. makes sysadmins happy).
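As a rough sketch of the "finesse" option, the start and stop functions of the init script might look something like the following. This is not the actual socorro-processor script: the pidfile path and the processor command are hypothetical placeholders, and only the `-p` and `-l` flags quoted from the daemonize man page above are used. A real init script would also source the RHEL "functions" library to get killproc.

```shell
#!/bin/bash
# Sketch only: the pidfile path and processor command are hypothetical
# placeholders, not values taken from the real socorro-processor script.
prog="socorro-processor"
pidfile="${PIDFILE:-/var/run/${prog}.pid}"
processor_cmd="${PROCESSOR_CMD:-/path/to/processor}"

start() {
    # Pass the same path to -p and -l so the pidfile doubles as the lockfile.
    daemonize -p "$pidfile" -l "$pidfile" "$processor_cmd"
}

stop() {
    # No kill level is passed, so killproc's final block runs and removes
    # the pidfile, which here is also the lockfile.
    killproc -p "$pidfile" "$prog"
}
```

Because no signal is given to killproc, the cleanup block quoted earlier still applies even when killproc has to escalate to SIGKILL.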

The alternative, as :lars noted, is to simply rm the lockfile after killproc has run.  This would also have the net effect of removing the lockfile, but is generally considered bad form from a systems perspective.  If the former solution is one of finesse, this is one of brute force; that said, it's simple and effective.
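For comparison, a minimal sketch of the "brute force" option could look like this. Again, both file paths are hypothetical placeholders rather than the real ones, and the function assumes killproc has already been made available by sourcing the init "functions" library.

```shell
#!/bin/bash
# Sketch only: both paths are hypothetical placeholders.
prog="socorro-processor"
pidfile="${PIDFILE:-/var/run/${prog}.pid}"
lockfile="${LOCKFILE:-/var/lock/${prog}.lock}"

stop() {
    local rc=0
    # killproc removes only the pidfile once the process is gone.
    killproc -p "$pidfile" "$prog" || rc=$?
    # Delete the lockfile explicitly, whether the processor exited cleanly
    # or had to be stopped with SIGKILL.
    rm -f "$lockfile"
    return "$rc"
}
```

The return code of killproc is preserved so that callers of the init script still see whether the shutdown itself succeeded.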

I'll run some tests and see how the finesse option feels, first.

Daniel Maher [:phrawzty]

Assignee

Comment 2

11 years ago

Some basic tests on an RHEL VM show that the lockfile is removed as expected in both cases.  There is no clear winner between the two.  At the end of the day this isn't such a big issue, so I'm inclined to try the "finesse" option first (since it makes the sysadmin in me happier), and if it doesn't work out, we'll just go the rm route.

Status: NEW → ASSIGNED

[github robot]

Comment 3

11 years ago

Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/5030fecc75467a9aaf5098358e50a8f6b94d427e
Merge pull request #1934 from phrawzty/bug980435

fixes 980435

Daniel Maher [:phrawzty]

Assignee

Comment 4

11 years ago

Closing this for now; re-open if the issue persists.

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Lonnen :lonnen

Updated

11 years ago

Target Milestone: --- → 78

Categories

(Socorro :: Infra, task)
