Closed
Bug 1484428
Opened 7 years ago
Closed 6 years ago
Clamd didn't restart properly after a reboot on smtp1
Categories
(Infrastructure & Operations :: Infrastructure: Mail, task)
Infrastructure & Operations
Infrastructure: Mail
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Usul, Assigned: cshields)
References
Details
Attachments
(1 file)
755 bytes,
patch
|
cshields
:
review+
Usul
:
checked-in+
|
Details | Diff | Splinter Review |
<nagios-mdc1> (IRC) Sat 09:05:23 UTC [9056] [Unknown] smtp1.mail.mdc1.mozilla.com:Uptime is WARNING: WARNING: Linux smtp1.mail.mdc1.mozilla.com 3.10.0-862.11.6.el7.x86_64 - up 3 minutes (http://m.mozilla.org/Uptime)
<nagios-mdc1> (IRC) Sat 09:10:30 UTC [9059] [Unknown] smtp1.mail.mdc1.mozilla.com:nf_conntrack_count is UNKNOWN: WARNING: iptables module not loaded? (http://m.mozilla.org/nf_conntrack_count)
<nagios-mdc1> Sat 09:15:01 UTC [9063] [Unknown] smtp1.mail.mdc1.mozilla.com:clamd Procs is CRITICAL: PROCS CRITICAL: 0 processes with args 'clamd', UID = 995 (clamav) (http://m.mozilla.org/clamd+Procs)
[lhirlimann@smtp1.mail.mdc1 ~]$ sudo systemctl status clamd@amavisd.service
● clamd@amavisd.service - clamd scanner (amavisd) daemon
Loaded: loaded (/usr/lib/systemd/system/clamd@.service; static; vendor preset: disabled)
Active: active (running) since Sat 2018-08-18 09:01:55 UTC; 15min ago
Process: 756 ExecStart=/usr/sbin/clamd -c /etc/clamd.d/%i.conf (code=exited, status=0/SUCCESS)
Main PID: 3134 (clamd)
CGroup: /system.slice/system-clamd.slice/clamd@amavisd.service
└─3134 /usr/sbin/clamd -c /etc/clamd.d/amavisd.conf
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: Mail files support e...
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: OLE2 support enabled.
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: PDF support enabled.
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: SWF support enabled.
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: HTML support enabled.
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: XMLDOCS support enab...
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: HWP3 support enabled.
Aug 18 09:01:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: Self checking every ...
Aug 18 09:01:55 smtp1.mail.mdc1.mozilla.com systemd[1]: Started clamd scanner...
Aug 18 09:11:53 smtp1.mail.mdc1.mozilla.com clamd[3134]: SelfCheck: Database ...
Hint: Some lines were ellipsized, use -l to show in full.
[lhirlimann@smtp1.mail.mdc1 ~]$ ps aux |grep cla
root 278 0.0 0.0 0 0 ? S< 09:01 0:00 [xfs-reclaim/sda]
root 430 0.0 0.0 0 0 ? S< 09:01 0:00 [xfs-reclaim/sda]
amavis 3134 0.0 13.6 717300 530740 ? Ssl 09:01 0:00 /usr/sbin/clamd -c /etc/clamd.d/amavisd.conf
lhirlim+ 10708 0.0 0.0 112704 976 pts/0 R+ 09:18 0:00 grep --color=auto cla
[lhirlimann@smtp1.mail.mdc1 ~]$ sudo service clamd@clamd status
Redirecting to /bin/systemctl status clamd@clamd.service
● clamd@clamd.service - clamd scanner (clamd) daemon
Loaded: loaded (/usr/lib/systemd/system/clamd@.service; static; vendor preset: disabled)
Active: inactive (dead)
[lhirlimann@smtp1.mail.mdc1 ~]$ sudo service clamd@clamd start
Redirecting to /bin/systemctl start clamd@clamd.service
Job for clamd@clamd.service failed because the control process exited with error code. See "systemctl status clamd@clamd.service" and "journalctl -xe" for details.
[lhirlimann@smtp1.mail.mdc1 ~]$ sudo systemctl status clamd@clamd
● clamd@clamd.service - clamd scanner (clamd) daemon
Loaded: loaded (/usr/lib/systemd/system/clamd@.service; static; vendor preset: disabled)
Active: activating (start) since Sat 2018-08-18 09:21:50 UTC; 197ms ago
Control: 10951 (clamd)
CGroup: /system.slice/system-clamd.slice/clamd@clamd.service
└─10951 /usr/sbin/clamd -c /etc/clamd.d/clamd.conf
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com systemd[1]: clamd@clamd.service holdoff time over, scheduling restart.
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com systemd[1]: Starting clamd scanner (clamd) daemon...
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: Received 0 file descriptor(s) from systemd.
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: clamd daemon 0.99.4 (OS: linux-gnu, ARCH: x86_64, CPU: x86_64)
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: Running as user clamav (UID 995, GID 992)
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: Log file size limited to 4294967295 bytes.
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: Reading databases from /var/clamav
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: Not loading PUA signatures.
Aug 18 09:21:50 smtp1.mail.mdc1.mozilla.com clamd[10951]: Bytecode: Security mode set to "TrustSigned".
Hint: Some lines were ellipsized, use -l to show in full.
Reporter | ||
Comment 1•7 years ago
|
||
<nagios-mdc1> Usul: rechecking all services on host smtp1.mail.mdc1.mozilla.com
<nagios-mdc1> Sat 09:22:20 UTC [9067] [Unknown] smtp1.mail.mdc1.mozilla.com:clamd Procs is OK: PROCS OK: 1 process with args 'clamd', UID = 995 (clamav) (http://m.mozilla.org/clamd+Procs)
nagios-mdc1> (IRC) Sat 09:35:39 UTC [9070] [Unknown] smtp1.private.mdc1.mozilla.com:clamd Procs is CRITICAL: PROCS CRITICAL: 0 processes with args 'clamd', UID = 995 (clamav) (http://m.mozilla.org/clamd+Procs)
ran puppet.
[root@smtp1.mail.mdc1 clamav]# systemctl restart clamd@amavisd.service
[root@smtp1.mail.mdc1 clamav]# systemctl start clamd@clamd.service
fixed it.
Sat Aug 18 09:39:40 2018 -> ERROR: Can't unlink the socket file /var/run/clamav/clamd.sock
Sat Aug 18 09:39:57 2018 -> ERROR: LOCAL: Socket file /var/run/clamav/clamd.sock could not be bound: No such file or directory
Sat Aug 18 09:39:57 2018 -> ERROR: Can't unlink the socket file /var/run/clamav/clamd.sock
ERROR: NotifyClamd: Can't find or parse configuration file /etc/clamd.conf
ERROR: NotifyClamd: Can't find or parse configuration file /etc/clamd.conf
ERROR: NotifyClamd: Can't find or parse configuration file /etc/clamd.conf
Maybe the reboot caused an issue. Let see how things evolve.
Reporter | ||
Updated•7 years ago
|
Summary: Clamd didn't restart properly after a reboot → Clamd didn't restart properly after a reboot on smtp1
Reporter | ||
Comment 2•7 years ago
|
||
(IRC) Fri 09:04:56 UTC [9629] [Unknown] smtp1.mail.mdc2.mozilla.com:Uptime is WARNING: WARNING: Linux smtp1.mail.mdc2.mozilla.com 3.10.0-862.11.6.el7.x86_64 - up 3 minutes (http://m.mozilla.org/Uptime)
Fri 09:07:05 UTC [9630] [Unknown] smtp1.mail.mdc2.mozilla.com:clamd Procs is CRITICAL: PROCS CRITICAL: 0 processes with args 'clamd', UID = 994 (clamav) (http://m.mozilla.org/clamd+Procs)
Fri 09:11:39 UTC [9634] [Unknown] smtp1.mail.mdc2.mozilla.com:Swap is CRITICAL: SWAP CRITICAL - 1% free (0 MB out of 2046 MB) (http://m.mozilla.org/Swap)
(IRC) Fri 09:15:05 UTC [9636] [Unknown] smtp1.mail.mdc2.mozilla.com:Out of memory - killed process is WARNING: Log errors: Aug 24 09:05:22 smtp1.mail.mdc2.mozilla.com kernel: [ 264.073540] Out of memory: Kill process 3060 (clamd) score 88 or sacrifice child (http://m.mozilla.org/Out+of+memory+-+killed+process)
Processes eating up the memory are clamd.
seeing lines like Thu Apr 26 20:07:39 2018 -> ERROR: Can't open file or directory
in the clamd.log.
seeing ERROR: NotifyClamd: Can't find or parse configuration file /etc/clamd.conf in freshclam.log.
swap recovered when I stopped the services.
After restart journald had the following
Aug 24 09:31:40 smtp1.mail.mdc2.mozilla.com amavis[2257]: (02257-03) Checking: NMceFIMotLky MYNETS [10.50.77.18] <root@genericadm.private.phx1.mozilla.com> -> <infra-shared@infra.m
Aug 24 09:31:44 smtp1.mail.mdc2.mozilla.com amavis[2257]: (02257-03) p001 1 Content-Type: text/plain, 7bit, size: 113, SHA1 digest: 1a9d595a54f4da5a8ec35aeb0262767c41ed3bc3
Aug 24 09:31:46 smtp1.mail.mdc2.mozilla.com amavis[2257]: (02257-03) (!)connect to /var/run/clamav/clamd.sock failed, attempt #1: Can't connect to a UNIX socket /var/run/clamav/cla
Aug 24 09:31:46 smtp1.mail.mdc2.mozilla.com amavis[2257]: (02257-03) ClamAV-clamd: All attempts (1) failed connecting to /var/run/clamav/clamd.sock, retrying (1)
Aug 24 09:31:47 smtp1.mail.mdc2.mozilla.com amavis[2257]: (02257-03) (!)connect to /var/run/clamav/clamd.sock failed, attempt #1: Can't connect to a UNIX socket /var/run/clamav/cla
Aug 24 09:31:47 smtp1.mail.mdc2.mozilla.com amavis[2257]: (02257-03) (!)ClamAV-clamd: All attempts (1) failed connecting to /var/run/clamav/clamd.sock, retrying (2)
Aug 24 09:31:49 smtp1.mail.mdc2.mozilla.com amavis[2273]: (02273-03) bounce unverifiable, originating, <root@syslog1.private.mdc1.mozilla.com> -> <infra-shared@infra.mozilla.com>
Aug 24 09:31:49 smtp1.mail.mdc2.mozilla.com clamd[12093]: Loaded 6620892 signatures.
Aug 24 09:31:49 smtp1.mail.mdc2.mozilla.com postfix/smtpd[12096]: connect from localhost4.localdomain[127.0.0.1]
~
Reporter | ||
Comment 3•7 years ago
|
||
from /var/log/maillog :
Aug 24 09:48:18 smtp1.mail.mdc2.mozilla.com amavis[2254]: (02254-05) ESMTP :10024 /var/amavis/tmp/amavis-20180824T090325-02254-kj_NRIG6: <root@proxy2.dmz.phx1.mozilla.com> -> <infra-shared@infra.mozilla.com> SIZE=4958 Received: from smtp.mozilla.org ([127.0.0.1]) by localhost (smtp1.mail.mdc2.mozilla.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP for <infra-shared@infra.mozilla.com>; Fri, 24 Aug 2018 09:48:14 +0000 (UTC)
Aug 24 09:48:18 smtp1.mail.mdc2.mozilla.com amavis[2241]: (02241-05) (!)ClamAV-clamscan av-scanner FAILED: /usr/bin/clamscan KILLED, signal 9 (0009) at (eval 131) line 951.
Aug 24 09:48:19 smtp1.mail.mdc2.mozilla.com amavis[2241]: (02241-05) (!!)AV: ALL VIRUS SCANNERS FAILED
Reporter | ||
Comment 4•7 years ago
|
||
<nagios-mdc1> Sat 09:05:45 UTC [9188] [Unknown] smtp1.mail.mdc1.mozilla.com:clamd Procs is CRITICAL: PROCS CRITICAL: 0 processes with args 'clamd', UID = 995 (clamav) (http://m.mozilla.org/clamd+Procs)
<nagios-mdc1> (IRC) Sat 09:06:09 UTC [9189] [Unknown] smtp1.mail.mdc1.mozilla.com:Uptime is WARNING: WARNING: Linux smtp1.mail.mdc1.mozilla.com 3.10.0-862.11.6.el7.x86_64 - up 5 minutes (http://m.mozilla.org/Uptime)
This is the 3rd day in a row. Corey, can you look into this please?
Flags: needinfo?(cshields)
Reporter | ||
Comment 5•7 years ago
|
||
nagios-mdc2> Fri 09:12:00 UTC [9646] [Unknown] smtp1.mail.mdc2.mozilla.com:clamd Procs is CRITICAL: PROCS CRITICAL: 0 processes with args 'clamd', UID = 994 (clamav) (http://m.mozilla.org/clamd+Procs)
<nagios-mdc2> Fri 09:12:01 UTC [9647] [Unknown] smtp1.mail.mdc2.mozilla.com:Swap is CRITICAL: SWAP CRITICAL - 9% free (167 MB out of 2046 MB) (http://m.mozilla.org/Swap)
restarted the services.
Assignee | ||
Updated•7 years ago
|
Assignee: infra → cshields
Flags: needinfo?(cshields)
Comment 6•7 years ago
|
||
Working with cshields, bumped the CPU/RAM of the smtp1.mail.mdc2 to 2CPU, 8G RAM.
Reporter | ||
Comment 7•7 years ago
|
||
Today it happened on the other host :
<nagios-mdc1> (IRC) Sat 09:36:33 UTC [9251] [Unknown] smtp1.private.mdc1.mozilla.com:Swap is WARNING: SWAP WARNING - 19% free (344 MB out of 1906 MB) (http://m.mozilla.org/Swap)
<nagios-mdc1> (IRC) Sat 09:41:33 UTC [9256] [Unknown] smtp1.private.mdc1.mozilla.com:Swap is CRITICAL: SWAP CRITICAL - 1% free (4 MB out of 1906 MB) (http://m.mozilla.org/Swap)
<nagios-mdc1> (IRC) Sat 09:42:45 UTC [9258] [Unknown] smtp1.private.mdc1.mozilla.com:Load is WARNING: WARNING - load average: 20.09, 20.95, 19.12 (http://m.mozilla.org/Load)
Reporter | ||
Comment 8•7 years ago
|
||
Error: /Stage[main]/Clamav/Package[clamav-server]/ensure: change from absent to present failed: Execution of '/bin/yum -d 0 -e 0 -y list clamav-server' returned 1: Error: No matching Packages to list
Notice: /Stage[main]/Postfix::Mta/Ldap_users::Groupmember[user_amavis_to_group_amavis]/Exec[usermod -G clamav -a amavis]: Dependency Package[clamav-server] has failures: true
Warning: /Stage[main]/Postfix::Mta/Ldap_users::Groupmember[user_amavis_to_group_amavis]/Exec[usermod -G clamav -a amavis]: Skipping because of failed dependencies
[lhirlimann@smtp1.mail.mdc1 ~]$ rpm -qa |grep clamav
clamav-data-0.100.1-1.el7.noarch
clamav-update-0.100.1-1.el7.x86_64
clamav-server-systemd-0.100.1-1.el7.x86_64
clamav-filesystem-0.100.1-1.el7.noarch
clamav-lib-0.100.1-1.el7.x86_64
clamav-0.100.1-1.el7.x86_64
[lhirlimann@smtp1.mail.mdc1 ~]$
looks like our puppet module needs an update :
# TODO: This part here is ugly really should clean it up
if ($::osfamily == 'RedHat') {
if (versioncmp($::operatingsystemmajrelease, '6') == 0) {
package { 'clamd':
ensure => $package_ensure,
name => 'clamd',
}
package { 'clamav-db':
ensure => $package_ensure,
name => 'clamav-db',
}
}
elsif (versioncmp($::operatingsystemmajrelease, '7') == 0) {
package { 'clamav-server':
ensure => $package_ensure,
}
package { 'clamav-update':
ensure => $package_ensure
}
package { 'clamav-server-systemd':
ensure => $package_ensure
47,1 22%
Sudo yum search clamav :
============================= N/S matched: clamav ==============================
clamav-filesystem.noarch : Filesystem structure for clamav
clamav-milter-systemd.x86_64 : Systemd initscripts for the clamav
: sendmail-milter
clamav-scanner-systemd.x86_64 : systemd initscripts for clamav scanner daemon
clamav-server-systemd.x86_64 : Systemd initscripts for clamav server
clamav-unofficial-sigs.noarch : Scripts to download unoffical clamav signatures
clamav.x86_64 : End-user tools for the Clam Antivirus scanner
clamav-data.noarch : Virus signature data for the Clam Antivirus scanner
clamav-devel.x86_64 : Header files and libraries for the Clam Antivirus scanner
clamav-lib.x86_64 : Dynamic libraries for the Clam Antivirus scanner
clamav-milter.x86_64 : Milter module for the Clam Antivirus scanner
clamav-update.x86_64 : Auto-updater for the Clam Antivirus scanner data-files
according to https://pkgs.org/download/clamav-server it should be clamd
Assignee | ||
Comment 9•7 years ago
|
||
(update)
worked on this some Friday, and fell into a rabbit hole of epel not being updated across our mrepo servers due to a 404 mirror site. We increased resources on this VM which should help when clam is running, however I don't have a confident fix to it running on reboot yet.
Reporter | ||
Comment 10•7 years ago
|
||
When one runs on the host it barfs at :
Notice: /Stage[main]/Postfix::Mta/Ldap_users::Groupmember[user_amavis_to_group_amavis]/Exec[usermod -G clamav -a amavis]: Dependency Package[clamav-server] has failures: true
Warning: /Stage[main]/Postfix::Mta/Ldap_users::Groupmember[user_amavis_to_group_amavis]/Exec[usermod -G clamav -a amavis]: Skipping because of failed dependencies
Attachment #9010891 -
Flags: review?(cshields)
Reporter | ||
Comment 11•7 years ago
|
||
It also seems , but I can't prove it - that things work better once iptables are loaded.
Reporter | ||
Comment 12•7 years ago
|
||
(IRC) Fri 09:03:44 UTC [9732] [Unknown] smtp1.mail.mdc2.mozilla.com:Uptime is WARNING: WARNING: Linux smtp1.mail.mdc2.mozilla.com 3.10.0-862.14.4.el7.x86_64 - up 2 minutes (http://m.mozilla.org/Uptime)
(In reply to Ludovic Hirlimann [:Usul] from comment #11)
> It also seems , but I can't prove it - that things work better once iptables
> are loaded.
so stopping the services and running puppet fixes the issues.
Assignee | ||
Updated•7 years ago
|
Attachment #9010891 -
Flags: review?(cshields) → review+
Reporter | ||
Comment 13•7 years ago
|
||
Comment on attachment 9010891 [details] [diff] [review]
Fix puppet
checked in 15db7cd8025833346616697fe20cd2e21fcc0a2e
Attachment #9010891 -
Flags: checked-in+
Reporter | ||
Comment 16•6 years ago
|
||
fixing puppet fixed the issue. It didn't come back in a bad state since the puppet change.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•