Bug 1437646 (bmo-cloud-logs)

Support Mozlog logs using Log::Log4perl

RESOLVED FIXED

Status

()

enhancement
RESOLVED FIXED
Last year
Last year

People

(Reporter: dylan, Assigned: dylan)

Tracking

Production
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 2 obsolete attachments)

45 bytes, text/x-github-pull-request
Details | Review
45 bytes, text/x-github-pull-request
Details | Review
45 bytes, text/x-github-pull-request
Details | Review
45 bytes, text/x-github-pull-request
Details | Review
Alias: bmo-cloud-logs
Assignee: dylan → glob
putting this back with dylan; he's already invested some time against this, and there's some hairy corners to consider:

<dylan> but even if we say "Bugzilla->audit() should output json", there are some potentially weird interactions with apache's stderr error log
Assignee: glob → dylan
details:

1. using Apache2::ServerRec::warn() and a JSON::XS object with the ->ascii() option, everything will probably be fine. When looking at the apache code I got the impression that for very large strings it might insert newlines. It's not clear if each structured log is read by something that does json-sensitive buffering or not. I kinda wanted to test this.
2. the only thing I'd want to avoid doing is hooking into __DIE__ and/or __WARN__ which are bad things we should never used.
3. Alternatively we could run syslog from the container; this seemed like major clown shoes.
Oh! the other problem / unknown: what happens when 20 or 30 workers all write a structured json message at the same time? 

if we're using Apache2::ServerRec::warn() and that is hooked into a stderr filehandle does it do any locking? 
Intermixing freeform text lines is one problem, doing this with json is another problem. :-/
it's worth thinking about initially running syslog in the container (clown shoes and all) so we can unblock this with minimal fuss - ie. write something that consumes our existing audit entries from syslog and publishes as mozlog.

then once we're over the deployment hump look at smarter ways to do this that may involve more work (eg. using SNS).
syslog seems to get interwoven messages if is set to deliver to stdout, but apache (surprisingly) handles this properly.

The remaining problem is that each call to apache's logging function is limited to 8157 bytes.
If you break it up over multiple messages, then you lose ordering.
Posted file PR (obsolete) —
Attachment #8952063 - Flags: feedback?(glob)
Are y'all aware that multiple processes writing to the same file descriptor in amounts over 4k are not atomic on linux? e.g. any log over 4k is potentially corruptible?
Flags: needinfo?(ckolos)
Flags: needinfo?(bobm)
(if and only if the file descriptor is a pipe, which is true for stdout to docker logs)
Do we change PIPE_BUF in production somehow?
stdbuf might be able to change this behavior, but that means this will still fail on alpine docker containers because that doesn't use glibc.

My recollections are correct:
https://www.notthewizard.com/2014/06/17/are-files-appends-really-atomic/

Also here is a badly-behaved WSGI: https://gist.github.com/dylanwh/dbb7acc97151503ef061f0895ec0d137

For the moment we can live with apache's almost-8k limit, which is better than existing services.

Fixing post-apache will involve either a custom log aggregator in the container, or syslog-ng
(which has a configurable log limit).
Okay, so logging. Can we have our own log type ("audit") for getting the audit entries into nightwish or whatever the logging thing is called?

{ Timestamp => $nanoseconds, Type => "audit", Fields => { message => "audit message" } }
?
s/nightwish/hindsight/
Clearing the needinfo for :ckolos and I, as we talked about this in a meeting.  The general Dockerflow guidance is to use Syslog.  Additional discussion here: https://github.com/mozilla-services/Dockerflow/issues/31
Flags: needinfo?(ckolos)
Flags: needinfo?(bobm)
Adding this because we talked about it during that meeting: The maximum message size through Kinesis and Firehose is 1MB including a newline record separator.  So imposing a limit of 1MB is sensible.
(In reply to Dylan Hardison [:dylan] (he/him) from comment #10)
> Okay, so logging. Can we have our own log type ("audit") for getting the
> audit entries into nightwish or whatever the logging thing is called?
> 
> { Timestamp => $nanoseconds, Type => "audit", Fields => { message => "audit
> message" } }
> ?

:hwoo can you provide some guidance here?
Flags: needinfo?(hwoo)

Comment 15

Last year
I think this should be fine.  The type field is only used for kibana/elasticsearch filtering

https://wiki.mozilla.org/Firefox/Services/Logging
Flags: needinfo?(hwoo)
Comment on attachment 8952063 [details] [review]
PR

feedback was provided over irc
Attachment #8952063 - Flags: feedback?(glob)
Summary: Redirect syslog messages to stderr in mozlog format → Redirect syslog messages to stdout in mozlog format
Posted file Stdout redirector part (obsolete) —
Posted file PR 1
Attachment #8952063 - Attachment is obsolete: true
Attachment #8953998 - Attachment is obsolete: true
Posted file PR 2
Posted file PR 3
Posted file PR 4
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Duplicate of this bug: 426463
Summary: Redirect syslog messages to stdout in mozlog format → Support Mozlog logs using Log::Log4perl
You need to log in before you can comment on or make changes to this bug.