Closed Bug 868512 Opened 11 years ago Closed 7 years ago

investigate why socorro processors are not shut down in timely manner

Tracking

(Not tracked)

Status:

RESOLVED WONTFIX

People

(Reporter: rhelmer, Unassigned)

Robert Helmer [:rhelmer]

Reporter

Description

11 years ago

See bug 864823 comment 5. I've been investigating why Socorro apps are not always shut down in a timely manner.

The only workable theory I have right now is:

1) kill signal sent to app
2) 15 seconds elapse
3) kill -KILL signal sent to app
4) app is still running, in uninterruptible sleep
5) start is sent (before #4 has completed)

Is 15s just too short? Can we change the design of processor/monitor/crashmover so that this can't happen?

I'd rather throw away the work they are doing and have them respond to a kill signal in a timely manner, rather than wait an indeterminate length of time.
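The sequence in the theory above can be sketched as a stop routine that only reports success once the old process has actually exited, so that "start" is never issued while step 4 is still in progress. This is a minimal sketch, not Socorro's init script: the function name `stop_and_confirm`, both timeout parameters, and the stand-in child process are hypothetical, and it assumes a POSIX system and that the "kill signal" in step 1 is SIGTERM.

```python
import signal
import subprocess
import sys
import time


def stop_and_confirm(proc, term_timeout=15.0, kill_timeout=60.0):
    """Send SIGTERM, escalate to SIGKILL, and report whether the process exited.

    Returns True only after the process has been reaped, so the caller can
    refuse to start a new instance while the old one still exists.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=term_timeout)
        return True
    except subprocess.TimeoutExpired:
        pass  # the process did not respond to SIGTERM in time

    # Escalate. A process in uninterruptible sleep will not die until its
    # pending I/O returns, so the wait below is still bounded by a timeout.
    proc.kill()
    try:
        proc.wait(timeout=kill_timeout)
        return True
    except subprocess.TimeoutExpired:
        return False  # still running: the caller must not send "start" yet


if __name__ == "__main__":
    # Hypothetical stand-in for an app that ignores SIGTERM.
    child = subprocess.Popen([
        sys.executable, "-c",
        "import signal, time; "
        "signal.signal(signal.SIGTERM, signal.SIG_IGN); time.sleep(60)",
    ])
    time.sleep(0.5)  # give the child time to install its handler
    exited = stop_and_confirm(child, term_timeout=1.0, kill_timeout=5.0)
    print("safe to start:", exited)
```

Lengthening the 15s window alone would not close the race. The point of the sketch is that the restart decision depends on the observed exit rather than on elapsed time.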

Robert Helmer [:rhelmer]

Reporter

Updated

11 years ago

Assignee: nobody → rhelmer

Status: NEW → ASSIGNED

Lonnen :lonnen

Comment 1

11 years ago

I suspect one or more threads are in an I/O operation that prevents them from checking for SIGINT/SIGTERM within the 15s window. I'm not sure how long to extend the waiting period, though.
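One way to keep worker threads responsive to a stop request is to have the signal handler set a shared flag and to give every blocking call a short timeout, so each thread re-checks the flag regularly. The sketch below is illustrative only: `stop_requested`, `worker`, `run`, and the queue-based job source are hypothetical names, not Socorro's threading code. It also only helps with blocking calls that accept a timeout; a thread stuck in uninterruptible sleep in the kernel cannot be woken this way.

```python
import queue
import signal
import threading
import time

# Shared flag: set by the signal handler, polled by every worker.
stop_requested = threading.Event()


def handle_signal(signum, frame):
    stop_requested.set()


def install_handlers():
    # Must run in the main thread; signal.signal() raises ValueError elsewhere.
    for signum in (signal.SIGINT, signal.SIGTERM):
        signal.signal(signum, handle_signal)


def worker(jobs, done):
    while not stop_requested.is_set():
        try:
            # A short timeout bounds how long a stop request can go unnoticed.
            job = jobs.get(timeout=0.5)
        except queue.Empty:
            continue
        done.append(job)  # stand-in for the real processing step
        jobs.task_done()


def run(jobs, n_threads=2):
    done = []
    threads = [
        threading.Thread(target=worker, args=(jobs, done))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    return threads, done


if __name__ == "__main__":
    install_handlers()
    jobs = queue.Queue()
    for i in range(5):
        jobs.put(i)
    threads, done = run(jobs)
    jobs.join()            # wait for the queued jobs to be handled
    stop_requested.set()   # same effect as receiving SIGTERM
    for t in threads:
        t.join(timeout=2.0)
    print("workers stopped:", all(not t.is_alive() for t in threads))
```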

Robert Helmer [:rhelmer]

Reporter

Comment 2

11 years ago

Not actively working on this

Assignee: rhelmer → nobody

Lonnen :lonnen

Comment 3

11 years ago

During today's release we had multiple failures to shut down. See release bug: https://bugzilla.mozilla.org/show_bug.cgi?id=897612

There were also 10 minidump_stackwalk processes running on processor 05 when it failed.

solarce: [root@sp-processor05 ~]# ps waux | grep mini | wc -l
solarce: 10
solarce: should there be /data/socorro/stackwalk/bin/minidump_stackwalk processes running when the service is stopped?

I remember lars talking about unexpectedly long-running minidump_stackwalk processes in IRC before he left. Is that related to what we're experiencing here?
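If long-running external processes turn out to be part of the problem, one option is to give each invocation a hard deadline so that it cannot outlive a shutdown indefinitely. The following is a hedged sketch: `run_with_deadline` and its timeout value are hypothetical, and the demo commands stand in for the real minidump_stackwalk invocation, whose arguments are not shown in this bug. It relies on the documented behaviour of `subprocess.run`, which kills and reaps the child when the timeout expires.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_seconds):
    """Run an external command; return (finished, stdout).

    On timeout, subprocess.run() kills the child and waits for it before
    raising TimeoutExpired, so no orphaned process is left behind.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, result.stdout


if __name__ == "__main__":
    # Stand-in commands; a real caller would pass the stackwalker command line.
    print(run_with_deadline([sys.executable, "-c", "print('done')"], 10))
    print(run_with_deadline(
        [sys.executable, "-c", "import time; time.sleep(30)"], 0.5
    ))
```

A crash whose stackwalk is cut off this way would need to be retried or marked as failed, which matches the preference stated in the description for discarding in-flight work over waiting an indeterminate time.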

Summary: investigate why socorro apps are not shut down in timely manner → investigate why socorro processors are not shut down in timely manner

Lonnen :lonnen

Comment 4

7 years ago

pass

Status: ASSIGNED → RESOLVED

Closed: 7 years ago

Resolution: --- → WONTFIX

Categories

(Socorro :: Backend, task)
