increase number of clients rsyslog can handle in the datacenter

RESOLVED WONTFIX

Status

Infrastructure & Operations
RelOps: Puppet
3 years ago
2 years ago

People

(Reporter: arr, Assigned: arr)

Tracking

Details

(Whiteboard: [relsec])

Attachments

(1 attachment)

Description

3 years ago
Now that we've brought the Windows machines online, we're handling many more machines. We need to modify the allocated resources and the Nagios check to accommodate this.

1) The papertrail config file should set MaxSessions high enough that a single server can handle the full load of machines (~1500) if we lose one of the servers:

module(load="imtcp" MaxSessions="2000" KeepAlive="on")

2) The number of open files the process can handle needs to be the number of ports + 1000, which means modifying /etc/security/limits.conf:
root soft nofile 4096
root hard nofile 8192

3) The Nagios check should be raised to warn at 1000 sessions (with round robin, this is significantly more than half the load) and to go critical at 1600 (something has clearly gone wrong, since that's more than the total number of clients).
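
The arithmetic behind items 2 and 3 can be sanity-checked with a short sketch. It assumes that "number of ports" in item 2 means the MaxSessions value from item 1, and that the thresholds are inclusive (a count equal to the threshold trips it). Neither is stated in this bug. The function names are hypothetical. Only the exit codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
# Figures copied from the description; everything else is illustrative.
MAX_SESSIONS = 2000   # imtcp MaxSessions (item 1)
FD_HEADROOM = 1000    # extra descriptors on top of the sessions (item 2)
SOFT_NOFILE = 4096    # "root soft nofile" from limits.conf
HARD_NOFILE = 8192    # "root hard nofile" from limits.conf
WARN_SESSIONS = 1000  # proposed warning threshold (item 3)
CRIT_SESSIONS = 1600  # proposed critical threshold (item 3)

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def required_nofile(max_sessions=MAX_SESSIONS, headroom=FD_HEADROOM):
    """Descriptors needed if every session slot is in use (assumed reading of item 2)."""
    return max_sessions + headroom


def classify_sessions(count, warn=WARN_SESSIONS, crit=CRIT_SESSIONS):
    """Map a session count to a Nagios exit code; inclusive thresholds are an assumption."""
    if count >= crit:
        return CRITICAL
    if count >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    needed = required_nofile()
    print(f"descriptors needed: {needed}, soft limit: {SOFT_NOFILE}")
    print("soft limit sufficient:", needed <= SOFT_NOFILE)
    for count in (750, 1000, 1600):
        print(count, "sessions ->", classify_sessions(count))
```

Under these assumptions the soft limit of 4096 leaves room above the 3000 descriptors implied by item 2, and a server carrying half of ~1500 clients stays below the warning threshold.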

Comment 1

3 years ago
Created attachment 8613536
log-aggregator-resources.diff

This bumps up the limits in AWS, too, but that shouldn't be an issue.
Attachment #8613536 - Flags: review?(dustin)
Attachment #8613536 - Flags: review?(dustin) → review+

Comment 2

3 years ago
For whatever reason, the modifications to /etc/security/limits.conf don't seem to be taking effect.
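
One way to confirm what limit the running daemon actually has, as opposed to what the file says, is to read /proc/<pid>/limits on Linux. The sketch below parses the "Max open files" row from that file. The function names are hypothetical and nothing here comes from the attached patch. As a possible (unconfirmed) explanation: /etc/security/limits.conf is applied by pam_limits to PAM sessions, so a daemon started by the init system may never pick it up.

```python
def _to_int(value):
    """Convert a limits field to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            fields = line.split()
            return _to_int(fields[3]), _to_int(fields[4])
    raise ValueError("no 'Max open files' row found")


def read_max_open_files(pid):
    """Read the effective open-file limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())


# Example use, with the rsyslogd PID looked up separately:
#   soft, hard = read_max_open_files(1234)   # 1234 is a placeholder PID
#   print(soft, hard)
```

If the soft value reported for the rsyslogd process is still the distribution default rather than 4096, the limits.conf change is not reaching the daemon.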

Updated

2 years ago
Whiteboard: [relsec]

Comment 3

2 years ago
Unfortunately we don't have the cycles to work on this. Added another rsyslog server to compensate.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX