Closed Bug 1546517 Opened 6 years ago Closed 5 years ago

start sending hardware worker logs to stackdriver

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dhouse, Assigned: dhouse)

We're moving worker logs into stackdriver. We can do this for the hardware workers from the log-aggregators, and have stackdriver and papertrail both as output targets.

  1. determine where the logs should go (are we organized?)
  2. get access set up for a service account (get auth)
  3. configure hardware log-aggregators to forward logs to this new place also

Sending logs into stackdriver from an external host requires secrets on that client. Or we can set up a proxy with the secrets (like our log-aggregators).

And it all goes through the api (no syslog receiver):

https://cloud.google.com/logging/docs/reference/v2/rpc/google.logging.v2#google.logging.v2.LoggingServiceV2.WriteLogEntries
rpc WriteLogEntries(WriteLogEntriesRequest) returns (WriteLogEntriesResponse)

Writes log entries to Logging. This API method is the only way to send log entries to Logging. This method is used, directly or indirectly, by the Logging agent (fluentd) and all logging libraries configured to use Logging. A single request may contain log entries for a maximum of 1000 different resources (projects, organizations, billing accounts or folders)

https://cloud.google.com/logging/docs/reference/v2/rest/v2/entries/write
POST https://logging.googleapis.com/v2/entries:write

The URL uses gRPC Transcoding syntax.
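
As a rough illustration of what a direct call to that REST endpoint involves, here is a minimal sketch that only builds the request and does not send it. The project id, log id, and access token are hypothetical placeholders; obtaining a real token for the service account is out of scope here. The `global` resource type is an assumption chosen for simplicity, not something decided in this bug.

```python
import json
import urllib.request

WRITE_URL = "https://logging.googleapis.com/v2/entries:write"


def build_write_request(project_id, log_id, lines, access_token):
    """Build (but do not send) an entries:write request for plain text lines.

    project_id, log_id and access_token are caller-supplied placeholders.
    """
    body = {
        # Default log name applied to every entry in this request.
        "logName": f"projects/{project_id}/logs/{log_id}",
        # Assumed resource type; a real setup may pick a more specific one.
        "resource": {"type": "global"},
        "entries": [{"textPayload": line} for line in lines],
    }
    return urllib.request.Request(
        WRITE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call. The secret (the token) lives only on whichever host builds this request, which is why a proxy like the log-aggregators is attractive.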

examples:

  • container with auth in env
    https://github.com/siriscac/edge-stackdriver-agent
    sets up a container for:
    syslog input -> syslog file
    tail on syslog file -> wrap into api call (js client) with auth

  • do the same in gce with a role
    similarly, gce instances/containers have agents and could be given auth without putting the auth into the environment/disk. A machine in gce could serve as a proxy/agent
    google's logging agent (based on fluentd) can receive structured logs over tcp or watch/tail a syslog file (like the js script above)

So we can use google's modified fluentd, which uses the api, or we can use the api directly.

the google-cloud-agent package is available with and without a default config (watching standard log files): https://dl.google.com/cloudagents/install-logging-agent.sh
So the simplest approach may be to install it without the default input config on the log-aggregators, then configure it to act as a syslog receiver (like https://www.fluentd.org/guides/recipes/rsyslogd-aggregation), making sure the hostnames of the remote hosts forwarding to it are preserved.
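
To make the hostname concern concrete: when logs are relayed through an aggregator, the sender address is the aggregator's, so the originating host has to be taken from the syslog header itself. The sketch below is not how fluentd does it. It is a hand-rolled illustration that assumes RFC 3164 style lines and uses a hypothetical `hostname` label, mapping the syslog severity onto the Logging API's severity names.

```python
import re

# Syslog severity codes 0-7 mapped to Logging API LogSeverity names.
SEVERITIES = ["EMERGENCY", "ALERT", "CRITICAL", "ERROR",
              "WARNING", "NOTICE", "INFO", "DEBUG"]

# Assumed RFC 3164 layout: <PRI>Mmm dd hh:mm:ss HOST MESSAGE
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<time>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<msg>.*)$"
)


def syslog_to_entry(line, fallback_host="unknown"):
    """Turn one syslog line into a log entry dict, keeping the origin host."""
    match = SYSLOG_RE.match(line)
    if match is None:
        # Unparseable line: keep it verbatim and fall back to a known host.
        return {
            "textPayload": line,
            "severity": "DEFAULT",
            "labels": {"hostname": fallback_host},
        }
    pri = int(match.group("pri"))
    return {
        "textPayload": match.group("msg"),
        # Severity is the low three bits of the PRI value.
        "severity": SEVERITIES[pri % 8],
        # "hostname" is an illustrative label key, not an agreed convention.
        "labels": {"hostname": match.group("host")},
    }
```

For example, `syslog_to_entry("<34>Oct 11 22:14:15 t-linux64-ms-001 generic-worker: task started")` yields an entry whose `hostname` label is the worker's name rather than the aggregator's.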

there is a fluent-bit plugin for stackdriver also:
https://github.com/fluent/fluent-bit-docs/blob/master/output/stackdriver.md

However, there is a note that the officially supported impl. is the ruby gem:
"Stackdriver officially supports a logging agent based on Fluentd."

treasuredata's notes suggest fluentd is better for aggregation and fluent-bit is better for forwarding (https://docs.fluentbit.io/manual/about/fluentd_and_fluentbit)
and when we're doing both? probably fluentd, since it is older and its plugin is the "official" one

Depends on: 1523744

I'm working on the set up of fluentd on the log aggregators. It "just worked" in a centos6 docker container, but I had gcc and ruby-dev available:

centos package http://packages.treasuredata.com.s3.amazonaws.com/3/redhat/6/x86_64/td-agent-3.4.1-0.el6.x86_64.rpm
td-agent installs without problems

however, the fluent-plugin-google requires the json gem, which is a native build requiring gcc/build-tools. I'm looking at options to package the gem with prebuilt binaries for json instead of requiring the build on every machine.

Assignee: relops → dhouse

for windows, the logging client is still "beta" but it is the official client:
https://cloud.google.com/logging/docs/agent/installation#joint-install

It looks like it is self-contained and doesn't have any dependencies! :) although it looks like it is fluentd underneath, the same as the client for linux (https://github.com/GoogleCloudPlatform/google-fluentd).

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED