Bug 1437646 (bmo-cloud-logs)

Support Mozlog logs using Log::Log4perl

RESOLVED FIXED

Status

()

enhancement
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: dylan, Assigned: dylan)

Tracking

Production

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 2 obsolete attachments)

45 bytes, text/x-github-pull-request
Details | Review
45 bytes, text/x-github-pull-request
Details | Review
45 bytes, text/x-github-pull-request
Details | Review
45 bytes, text/x-github-pull-request
Details | Review
SSIA
(Assignee)

Updated

a year ago
Alias: bmo-cloud-logs
Assignee: dylan → glob
putting this back with dylan; he's already invested some time against this, and there's some hairy corners to consider:

<dylan> but even if we say "Bugzilla->audit() should output json", there are some potentially weird interactions with apache's stderr error log
Assignee: glob → dylan
(Assignee)

Comment 2

a year ago
details:

1. using Apache2::ServerRec::warn() and a JSON::XS object with the ->ascii() option, everything will probably be fine. When looking at the apache code I got the impression that for very large strings it might insert newlines. It's not clear if each structured log is read by something that does json-sensitive buffering or not. I kinda wanted to test this.
2. the only thing I'd want to avoid doing is hooking into __DIE__ and/or __WARN__ which are bad things we should never used.
3. Alternatively we could run syslog from the container; this seemed like major clown shoes.
(Assignee)

Comment 3

a year ago
Oh! the other problem / unknown: what happens when 20 or 30 workers all write a structured json message at the same time? 

if we're using Apache2::ServerRec::warn() and that is hooked into a stderr filehandle does it do any locking? 
Intermixing freeform text lines is one problem, doing this with json is another problem. :-/
it's worth thinking about initially running syslog in the container (clown shoes and all) so we can unblock this with minimal fuss - ie. write something that consumes our existing audit entries from syslog and publishes as mozlog.

then once we're over the deployment hump look at smarter ways to do this that may involve more work (eg. using SNS).
(Assignee)

Comment 5

a year ago
syslog seems to get interwoven messages if is set to deliver to stdout, but apache (surprisingly) handles this properly.

The remaining problem is that each call to apache's logging function is limited to 8157 bytes.
If you break it up over multiple messages, then you lose ordering.
(Assignee)

Comment 6

a year ago
Posted file PR (obsolete) —
Attachment #8952063 - Flags: feedback?(glob)
(Assignee)

Comment 7

a year ago
Are y'all aware that multiple processes writing to the same file descriptor in amounts over 4k are not atomic on linux? e.g. any log over 4k is potentially corruptible?
Flags: needinfo?(ckolos)
Flags: needinfo?(bobm)
(Assignee)

Comment 8

a year ago
(if and only if the file descriptor is a pipe, which is true for stdout to docker logs)
Do we change PIPE_BUF in production somehow?
(Assignee)

Comment 9

a year ago
stdbuf might be able to change this behavior, but that means this will still fail on alpine docker containers because that doesn't use glibc.

My recollections are correct:
https://www.notthewizard.com/2014/06/17/are-files-appends-really-atomic/

Also here is a badly-behaved WSGI: https://gist.github.com/dylanwh/dbb7acc97151503ef061f0895ec0d137

For the moment we can live with apache's almost-8k limit, which is better than existing services.

Fixing post-apache will involve either a custom log aggregator in the container, or syslog-ng
(which has a configurable log limit).
Okay, so logging. Can we have our own log type ("audit") for getting the audit entries into nightwish or whatever the logging thing is called?

{ Timestamp => $nanoseconds, Type => "audit", Fields => { message => "audit message" } }
?
s/nightwish/hindsight/
Clearing the needinfo for :ckolos and I, as we talked about this in a meeting.  The general Dockerflow guidance is to use Syslog.  Additional discussion here: https://github.com/mozilla-services/Dockerflow/issues/31
Flags: needinfo?(ckolos)
Flags: needinfo?(bobm)
Adding this because we talked about it during that meeting: The maximum message size through Kinesis and Firehose is 1MB including a newline record separator.  So imposing a limit of 1MB is sensible.
(In reply to Dylan Hardison [:dylan] (he/him) from comment #10)
> Okay, so logging. Can we have our own log type ("audit") for getting the
> audit entries into nightwish or whatever the logging thing is called?
> 
> { Timestamp => $nanoseconds, Type => "audit", Fields => { message => "audit
> message" } }
> ?

:hwoo can you provide some guidance here?
Flags: needinfo?(hwoo)

Comment 15

a year ago
I think this should be fine.  The type field is only used for kibana/elasticsearch filtering

https://wiki.mozilla.org/Firefox/Services/Logging
Flags: needinfo?(hwoo)
Comment on attachment 8952063 [details] [review]
PR

feedback was provided over irc
Attachment #8952063 - Flags: feedback?(glob)
(Assignee)

Updated

a year ago
Summary: Redirect syslog messages to stderr in mozlog format → Redirect syslog messages to stdout in mozlog format
Posted file Stdout redirector part (obsolete) —
Posted file PR 1
Attachment #8952063 - Attachment is obsolete: true
Attachment #8953998 - Attachment is obsolete: true
Posted file PR 2
Posted file PR 3
Posted file PR 4
(Assignee)

Updated

a year ago
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
(Assignee)

Updated

a year ago
Duplicate of this bug: 426463
(Assignee)

Updated

a year ago
Summary: Redirect syslog messages to stdout in mozlog format → Support Mozlog logs using Log::Log4perl
You need to log in before you can comment on or make changes to this bug.