Closed Bug 623410 Opened 14 years ago Closed 14 years ago

syslog drops socorro messages

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: rhelmer, Assigned: rhelmer)

Details

Attachments

(1 file, 1 obsolete file)

provide socorro option for logging directly to local socket (obsolete), 14 years ago, Robert Helmer [:rhelmer], 1.83 KB, patch
provide socorro option for logging directly to local socket, 14 years ago, Robert Helmer [:rhelmer], 1.83 KB, patch, lars: review+

Robert Helmer [:rhelmer]

Assignee

Description

14 years ago

We use syslog for Socorro, and we've seen instances of it dropping messages when doing load testing recently (both on a production node and also on staging). This is not a remote syslog, the messages are being sent from the local machine.

We aren't currently monitoring disk activity with ganglia (I'll file a bug for this), but using atop on production, even outside of peak times, I see that syslogd, kjournald and httpd together (mostly the first two) frequently keep the disk at 100% utilization.

So a couple things:
* we should figure out which messages we can't possibly live without, and log them in a way which can't get dropped like syslog-ng, directly to file
** I think buffering would be OK in most cases
* we should make sure the "INFO" level has all the messages we need, and disable "DEBUG" in production. This should decrease the overall volume of log traffic, most of which we don't need.
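As a rough illustration of the two suggestions above (buffered logging straight to a file, and dropping DEBUG in production), here is a minimal sketch using Python's standard `logging` module. The function name, logger name, file path and buffer capacity are all hypothetical and are not taken from Socorro's code.

```python
import logging
import logging.handlers


def make_buffered_file_logger(name, path, capacity=100):
    """Return a logger that buffers records in memory and writes them to a file.

    Hypothetical helper for illustration only; not part of Socorro.
    """
    file_handler = logging.FileHandler(path)
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )

    # Hold up to `capacity` records in memory; flush early on ERROR or worse.
    buffered = logging.handlers.MemoryHandler(
        capacity, flushLevel=logging.ERROR, target=file_handler
    )

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)  # DEBUG records are discarded by the logger
    logger.propagate = False       # keep records out of the root logger
    logger.addHandler(buffered)
    return logger
```

Buffering trades durability for fewer disk writes: records still in memory are lost if the process dies before a flush, so this would only suit messages we can afford to lose in a crash.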

Laura Thomson :laura

Comment 1

14 years ago

Specifically, we need a list of things that are DEBUG that should be INFO. (Lars?)

Robert Helmer [:rhelmer]

Assignee

Comment 2

14 years ago

Attached patch: provide socorro option for logging directly to local socket (obsolete)

Default to logging to local socket rather than UDP. May want to add TCP in the future, but I think we might be better off always logging to local and letting the local syslogd decide what to do.
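For readers without the patch at hand, the idea can be sketched with the standard library's `logging.handlers.SysLogHandler`, which accepts either a Unix socket path or a `(host, port)` pair. This is only a sketch under assumptions: the `transport` values (`"local"`, `"udp"`, `"tcp"`), the helper names, and the default `/dev/log` path are illustrative and may not match what the patch actually does.

```python
import logging
import logging.handlers
import socket


def syslog_handler_kwargs(transport="local", socket_path="/dev/log",
                          host="localhost", port=514):
    """Map a (hypothetical) transport setting to SysLogHandler arguments."""
    if transport == "local":
        # A string address makes SysLogHandler use a Unix domain socket.
        return {"address": socket_path}
    if transport == "udp":
        return {"address": (host, port), "socktype": socket.SOCK_DGRAM}
    if transport == "tcp":
        return {"address": (host, port), "socktype": socket.SOCK_STREAM}
    raise ValueError("unknown syslog transport: %r" % (transport,))


def make_syslog_handler(transport="local", **options):
    """Build a SysLogHandler for the chosen transport."""
    return logging.handlers.SysLogHandler(
        **syslog_handler_kwargs(transport, **options)
    )
```

With this in place, something like `logging.getLogger("socorro").addHandler(make_syslog_handler("local"))` would send records to the local syslogd, which can then decide whether to write them to disk or forward them.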

Assignee: nobody → rhelmer

Status: NEW → ASSIGNED

Attachment #504283 - Flags: review?(lars)

Robert Helmer [:rhelmer]

Assignee

Comment 3

14 years ago

Attached patch: provide socorro option for logging directly to local socket

Fixed errant comment.

Attachment #504283 - Attachment is obsolete: true

Attachment #504285 - Flags: review?(lars)

Attachment #504283 - Flags: review?(lars)

K Lars Lohn [:lars] [:klohn]

Updated

14 years ago

Attachment #504285 - Flags: review?(lars) → review+

Robert Helmer [:rhelmer]

Assignee

Comment 4

14 years ago

Landed: r2881
Fixed misspelling of "transport" in config: r2882

Robert Helmer [:rhelmer]

Assignee

Comment 5

14 years ago

Have done some testing on staging and PHX pre-production and this seems to work. I would like to see this under high volume before we mark this VERIFIED (make sure that the number of submitted crashes matches the number of received crashes).

(In reply to comment #0)
> We aren't currently monitoring disk activity with ganglia (I'll file a bug for
> this), but using atop on production, even outside of peak times, I see that
> syslogd, kjournald and httpd together (mostly the first two) frequently keep
> the disk at 100% utilization.

We are now monitoring disk activity with ganglia in both PHX and SJC. PHX seems much better in this regard; it's probably either something about RHEL6 versus RHEL5 (like rsyslog) or perhaps just better hardware in PHX. We can figure this out when we turn SJC into staging.

> * we should figure out which messages we can't possibly live without, and log
> them in a way which can't get dropped like syslog-ng, directly to file
> ** I think buffering would be OK in most cases

We use local socket now, and could easily add TCP if we need to (I think letting local syslogd handle remoting is simpler for us right now).

> * we should make sure the "INFO" level has all the messages we need, and
> disable "DEBUG" in production. This should decrease the overall volume of log
> traffic, most of which we don't need.

I'm not sure we should worry about this after all; it's very useful to have detailed info in production for troubleshooting and debugging live problems. Also the nagios monitors currently depend on these files being constantly written to.

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

13 years ago

Component: Socorro → General

Product: Webtools → Socorro

Categories

(Socorro :: General, task)
