Bug 1437646 (bmo-cloud-logs)

Support Mozlog logs using Log::Log4perl

RESOLVED FIXED

Status

()

RESOLVED FIXED
9 months ago
9 months ago

People

(Reporter: dylan, Assigned: dylan)

Tracking

Production

Details

Attachments

(4 attachments, 2 obsolete attachments)

(Assignee)

Description

9 months ago
SSIA
(Assignee)

Updated

9 months ago
Alias: bmo-cloud-logs
Assignee: dylan → glob
putting this back with dylan; he's already invested some time against this, and there's some hairy corners to consider:

<dylan> but even if we say "Bugzilla->audit() should output json", there are some potentially weird interactions with apache's stderr error log
Assignee: glob → dylan
(Assignee)

Comment 2

9 months ago
details:

1. using Apache2::ServerRec::warn() and a JSON::XS object with the ->ascii() option, everything will probably be fine. When looking at the apache code I got the impression that for very large strings it might insert newlines. It's not clear if each structured log is read by something that does json-sensitive buffering or not. I kinda wanted to test this.
2. the only thing I'd want to avoid doing is hooking into __DIE__ and/or __WARN__ which are bad things we should never used.
3. Alternatively we could run syslog from the container; this seemed like major clown shoes.
(Assignee)

Comment 3

9 months ago
Oh! the other problem / unknown: what happens when 20 or 30 workers all write a structured json message at the same time? 

if we're using Apache2::ServerRec::warn() and that is hooked into a stderr filehandle does it do any locking? 
Intermixing freeform text lines is one problem, doing this with json is another problem. :-/
it's worth thinking about initially running syslog in the container (clown shoes and all) so we can unblock this with minimal fuss - ie. write something that consumes our existing audit entries from syslog and publishes as mozlog.

then once we're over the deployment hump look at smarter ways to do this that may involve more work (eg. using SNS).
(Assignee)

Comment 5

9 months ago
syslog seems to get interwoven messages if is set to deliver to stdout, but apache (surprisingly) handles this properly.

The remaining problem is that each call to apache's logging function is limited to 8157 bytes.
If you break it up over multiple messages, then you lose ordering.
(Assignee)

Comment 6

9 months ago
Created attachment 8952063 [details] [review]
PR
Attachment #8952063 - Flags: feedback?(glob)
(Assignee)

Comment 7

9 months ago
Are y'all aware that multiple processes writing to the same file descriptor in amounts over 4k are not atomic on linux? e.g. any log over 4k is potentially corruptible?
Flags: needinfo?(ckolos)
Flags: needinfo?(bobm)
(Assignee)

Comment 8

9 months ago
(if and only if the file descriptor is a pipe, which is true for stdout to docker logs)
Do we change PIPE_BUF in production somehow?
(Assignee)

Comment 9

9 months ago
stdbuf might be able to change this behavior, but that means this will still fail on alpine docker containers because that doesn't use glibc.

My recollections are correct:
https://www.notthewizard.com/2014/06/17/are-files-appends-really-atomic/

Also here is a badly-behaved WSGI: https://gist.github.com/dylanwh/dbb7acc97151503ef061f0895ec0d137

For the moment we can live with apache's almost-8k limit, which is better than existing services.

Fixing post-apache will involve either a custom log aggregator in the container, or syslog-ng
(which has a configurable log limit).
(Assignee)

Comment 10

9 months ago
Okay, so logging. Can we have our own log type ("audit") for getting the audit entries into nightwish or whatever the logging thing is called?

{ Timestamp => $nanoseconds, Type => "audit", Fields => { message => "audit message" } }
?
(Assignee)

Comment 11

9 months ago
s/nightwish/hindsight/
Clearing the needinfo for :ckolos and I, as we talked about this in a meeting.  The general Dockerflow guidance is to use Syslog.  Additional discussion here: https://github.com/mozilla-services/Dockerflow/issues/31
Flags: needinfo?(ckolos)
Flags: needinfo?(bobm)
Adding this because we talked about it during that meeting: The maximum message size through Kinesis and Firehose is 1MB including a newline record separator.  So imposing a limit of 1MB is sensible.
(In reply to Dylan Hardison [:dylan] (he/him) from comment #10)
> Okay, so logging. Can we have our own log type ("audit") for getting the
> audit entries into nightwish or whatever the logging thing is called?
> 
> { Timestamp => $nanoseconds, Type => "audit", Fields => { message => "audit
> message" } }
> ?

:hwoo can you provide some guidance here?
Flags: needinfo?(hwoo)

Comment 15

9 months ago
I think this should be fine.  The type field is only used for kibana/elasticsearch filtering

https://wiki.mozilla.org/Firefox/Services/Logging
Flags: needinfo?(hwoo)
Comment on attachment 8952063 [details] [review]
PR

feedback was provided over irc
Attachment #8952063 - Flags: feedback?(glob)
(Assignee)

Updated

9 months ago
Summary: Redirect syslog messages to stderr in mozlog format → Redirect syslog messages to stdout in mozlog format
(Assignee)

Comment 17

9 months ago
Created attachment 8953998 [details] [review]
Stdout redirector part
(Assignee)

Comment 18

9 months ago
Created attachment 8954180 [details] [review]
PR 1
Attachment #8952063 - Attachment is obsolete: true
Attachment #8953998 - Attachment is obsolete: true
(Assignee)

Updated

9 months ago
Status: NEW → RESOLVED
Last Resolved: 9 months ago
Resolution: --- → FIXED
(Assignee)

Updated

9 months ago
Duplicate of this bug: 426463
(Assignee)

Updated

9 months ago
Summary: Redirect syslog messages to stdout in mozlog format → Support Mozlog logs using Log::Log4perl
You need to log in before you can comment on or make changes to this bug.