Closed
Bug 1111389
Opened 10 years ago
Closed 10 years ago
sp-processor03.phx1.mozilla.com:Socorro Processors - procs is CRITICAL: PROCS CRITICAL: 0 processes with regex args processor_app
Categories
(Infrastructure & Operations :: MOC: Problems, task)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: achavez, Unassigned)
Details
I followed the troubleshooting steps in the runbook here:
https://mana.mozilla.org/wiki/display/NAGIOS/Socorro+Processors+-+procs
Here's what happened:
/var/log/socorro/socorro-processor.log
/var/log/socorro/socorro-processor.log: line 1: /data/socorro/socorro-virtualenv/lib/python2.6/site-packages/configman/config_manager.py:747:: No such file or directory
/var/log/socorro/socorro-processor.log: line 2: syntax error near unexpected token `('
/var/log/socorro/socorro-processor.log: line 2: ` 'Invalid options: %s' % ', '.join(unmatched_keys)'
[achavez@sp-processor03.phx1 ~]$ /etc/init.d/socorro-processor restart
Stopping socorro-processor: [FAILED]
Starting socorro-processor: Can't create lock file "/var/run/socorro-processor.pid": Permission denied
I'm going to cc lars@mozilla.com to see if this can get fixed.
Reporter
Updated•10 years ago
Flags: needinfo?(lars)
Reporter
Comment 1•10 years ago
Also received this alert: sp-processor03.phx1.mozilla.com:Socorro Processors - log file age is CRITICAL: FILE_AGE CRITICAL: /var/log/socorro/socorro-processor.log is 808 seconds old and 10221153 bytes
I tried running the commands in the runbook:
https://mana.mozilla.org/wiki/display/NAGIOS/Socorro+Processors+-+log+file+age
and got "permission denied".
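For reference, here is a minimal sketch of what a file-age check like the one in the alert above computes. This is not the actual Nagios plugin: the function name `check_file_age` and the warning/critical thresholds are assumptions chosen for illustration, and only the message format is copied from the alert text.

```python
import os
import time


def check_file_age(path, warn_seconds=300, crit_seconds=600, now=None):
    """Return (status, message) based on how long ago `path` was modified.

    The thresholds are hypothetical defaults, not the values configured
    for the Socorro processors.
    """
    st = os.stat(path)
    now = time.time() if now is None else now
    age = int(now - st.st_mtime)  # seconds since the last write

    if age >= crit_seconds:
        status = "CRITICAL"
    elif age >= warn_seconds:
        status = "WARNING"
    else:
        status = "OK"

    message = "FILE_AGE %s: %s is %d seconds old and %d bytes" % (
        status, path, age, st.st_size)
    return status, message
```

A stale log is a symptom rather than a cause: a processor that has stopped writing for several minutes has usually stopped running, which matches the procs alert in the description.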
Comment 2•10 years ago
It looks like at 2014-12-14T13:28:55 the processor was killed abruptly, without having a chance to log anything. The last thing it logged was:
2014-12-14 13:28:54,747 DEBUG - Thread-4 - BotoBenchmarkWrite save_raw_and_processed 0:00:00.562493
Further investigation shows this entry in /var/log/messages at that instant:
Dec 14 13:28:55 sp-processor03.phx1.mozilla.com kernel: python[19139] general protection ip:38a287a83f sp:7fec097d9230 error:0 in libc-2.12.so[38a2800000+197000]
So something went seriously wrong within the Python process itself. About ten minutes later, it appears that Puppet restarted the processor, and it has worked normally since.
I put this down as an anomaly as we've not seen this behavior before and none of the other processors have experienced it. If it happens again, then it'll warrant a more serious investigation.
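If this does come up again, the kernel line above already says where in libc the fault occurred: the offset is the instruction pointer minus the mapping's base address. A minimal sketch of that arithmetic follows. The regex is written against the one line quoted in this bug and may not cover other kernel message variants; `fault_offset` and `EXAMPLE` are names made up for illustration.

```python
import re

# Matches the "general protection" line format quoted from /var/log/messages.
FAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\] general protection "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

EXAMPLE = (
    "Dec 14 13:28:55 sp-processor03.phx1.mozilla.com kernel: python[19139] "
    "general protection ip:38a287a83f sp:7fec097d9230 error:0 "
    "in libc-2.12.so[38a2800000+197000]"
)


def fault_offset(line):
    """Return details of the faulting address, or None if the line does not match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base  # position of the faulting instruction inside the library
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "lib": m.group("lib"),
        "offset": offset,
        "in_mapping": 0 <= offset < size,
    }


if __name__ == "__main__":
    info = fault_offset(EXAMPLE)
    print(info["lib"], hex(info["offset"]))  # libc-2.12.so 0x7a83f
```

The offset 0x7a83f could then be looked up against the symbols of the matching libc-2.12.so build to see which libc function was executing, assuming debug symbols for that exact build are available.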
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(lars)
Resolution: --- → WORKSFORME