because it is not running ;-)
root 9757 0.5 0.0 0 0 ? Zs 21:05 0:00 [hekad] <defunct>
Should this be running and collecting log info?
This should be collecting log info, yes. It's supposed to replace the statsd implementation, at least. Benson (or Bob), can you confirm it's set up correctly?
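For anyone reading the `ps` line above: the `Z` in the STAT column (together with the `<defunct>` suffix) marks a zombie, i.e. a process that has exited and is only waiting to be reaped by its parent. A small sketch of checking for that programmatically, assuming the usual `ps aux` column order; the helper names are made up for illustration:

```python
# Column order of `ps aux` output (assumed; other ps formats differ).
PS_AUX_COLUMNS = ["user", "pid", "cpu", "mem", "vsz", "rss",
                  "tty", "stat", "start", "time", "command"]


def parse_ps_aux_line(line):
    """Split one `ps aux` row into a dict keyed by column name."""
    # The command may contain spaces, so split at most 10 times.
    parts = line.split(None, len(PS_AUX_COLUMNS) - 1)
    if len(parts) != len(PS_AUX_COLUMNS):
        raise ValueError("unexpected ps aux row: {!r}".format(line))
    return dict(zip(PS_AUX_COLUMNS, parts))


def is_zombie(row):
    """A STAT value starting with 'Z' means the process is defunct."""
    return row["stat"].startswith("Z")


if __name__ == "__main__":
    row = parse_ps_aux_line(
        "root 9757 0.5 0.0 0 0 ? Zs 21:05 0:00 [hekad] <defunct>")
    print(row["command"], is_zombie(row))
```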
Changed the title to be more accurate. We need to get logging and aggregation all hooked up.
Summary: Investigate hekad on Loop-Server Stage environment → Get hekad running on Loop-Server Stage environment
:alexis what encoding (JSON?) and schema (fxa?) is loop outputting logs in? This is our preferred way:
- loop server outputs logs on stdout using JSON and a predefined schema
- circus writes stdout logs to a file
- heka tails the log file, parses it, and sends it to our main heka aggregation point
- we configure the aggregation point to do something with it, e.g. aggregate metrics, send to elasticsearch, etc.
We're currently using heka as a transport for statsd. We don't output any JSON to stdout; we're using sentry for logging instead. What kind of JSON logs are you usually sending to stdout in other projects? (Also, it would be useful to have this way of doing things, and all our best practices when it comes to deployment, defined somewhere.)
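For reference, the first step of that pipeline (the app writing one JSON object per line to stdout) could look roughly like the sketch below. The field names (`timestamp`, `severity`, `type`, `fields`) and the `format_event`/`log_event` helpers are assumptions for illustration, not an agreed schema.

```python
import json
import sys
import time


def format_event(event_type, severity="INFO", **fields):
    """Serialize one log event as a single-line JSON string."""
    record = {
        "timestamp": time.time(),  # seconds since the epoch
        "severity": severity,
        "type": event_type,
        "fields": fields,          # application-specific payload
    }
    # sort_keys keeps the key order stable, which makes lines easier to diff.
    return json.dumps(record, sort_keys=True)


def log_event(event_type, severity="INFO", stream=sys.stdout, **fields):
    """Write one event per line so a file tailer can split on newlines."""
    stream.write(format_event(event_type, severity, **fields) + "\n")
    stream.flush()


if __name__ == "__main__":
    # Hypothetical event; the names are placeholders.
    log_event("request.summary", path="/calls", status=200, duration_ms=12)
```

Keeping each event on a single line matters here because the tailing step downstream treats a newline as the record boundary.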
Flags: needinfo?(bobm) → needinfo?(bwong)
Actually we have a statsd/graphite server that loop logs to directly. There is no heka there. We just pushed a change where the nginx logs will be sent via heka to elasticsearch. We don't have a standard schema for JSON logging as it is highly application dependent. However, OpSec does have a schema they use for application-level logs for MozDef: http://mozdef.readthedocs.org/en/latest/usage.html#json-format
:mostlygeek this is most unusual. Can you drop the Prod and Stage shared links to Graphite here so we can verify this is actually working?
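For context on what logging to statsd directly means on the wire: statsd accepts plain-text datagrams over UDP in the form `name:value|type`. A minimal sketch follows; the metric names, host, and port (8125 is statsd's conventional default) are placeholders, not the actual loop-server configuration.

```python
import socket


def format_metric(name, value, metric_type):
    """Build a statsd datagram: '<name>:<value>|<type>'.

    Common types: 'c' (counter), 'ms' (timer), 'g' (gauge).
    """
    return "{}:{}|{}".format(name, value, metric_type)


def send_metric(name, value, metric_type, host="127.0.0.1", port=8125):
    """Fire-and-forget a single metric over UDP."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    send_metric("loop.calls.created", 1, "c")       # hypothetical counter
    send_metric("loop.request.duration", 42, "ms")  # hypothetical timer
```

Because UDP is connectionless, a missing or misconfigured statsd endpoint does not raise an error in the sender, which is one reason an empty Graphite dashboard is worth checking from the receiving side.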
This seems so .... empty: https://graphite.shared.us-west-2.prod.mozaws.net What's the Stage version of this link?
Also, hmmmm.... this appears to use Persona just to get the dashboard to show. But Graphite has its own account system - do we need logins for graphite also?
The new loop-server w/ heka for nginx log shipping is deployed on stage now. Here are some appropriate URLs for monitoring stuff:
- https://graphite.shared.us-east-1.stage.mozaws.net (statsd data)
- https://heka.shared.us-east-1.stage.mozaws.net (shared heka)
- https://kibana.shared.us-east-1.stage.mozaws.net (kibana for looking at the elastic search data)
The prod endpoints when things are all hooked up are:
- https://graphite.shared.us-west-2.prod.mozaws.net (statsd data)
- https://heka.shared.us-west-2.prod.mozaws.net (shared heka)
- https://kibana.shared.us-west-2.prod.mozaws.net (kibana for looking at the elastic search data)
Also I decided to use our graphite/statsd stack for the statsd output. Shipping it through heka just added extra complexity.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Oh I forgot, for sentry:
- https://sentry.shared.us-east-1.stage.mozaws.net
- https://sentry.shared.us-west-2.prod.mozaws.net
I'm getting a "user not authorized" error on these links so far.
(As a note, they're working for me; check with :whd about that, Rémy.)
They all work for me, but here is one very important correction:
WRONG:
- https://sentry.shared.us-east-1.stage.mozaws.net
- https://sentry.shared.us-west-2.prod.mozaws.net
CORRECT:
- http://sentry.shared.us-east-1.stage.mozaws.net
- http://sentry.shared.us-west-2.prod.mozaws.net
:mostlygeek did we want the sentry links to be http or https?
OK, with Stage load running, this site gets populated now: https://graphite.shared.us-east-1.stage.mozaws.net
Look under Graphite to see new subcategories: carbon, stats, stats_counts, statsd
This site is updated in real time now: https://heka.shared.us-east-1.stage.mozaws.net/
These are now working:
- https://kibana.shared.us-east-1.stage.mozaws.net
- https://kibana.shared.us-east-1.stage.mozaws.net/index.html#/dashboard/file/loop_http_status.json
The percentage error graph (colored circle) is very nice.
This is also showing updates: https://heka.shared.us-east-1.stage.mozaws.net/#sandboxes/LoopHTTPStatus/outputs/LoopHTTPStatus.HTTPStatus.cbuf
This also looks good: http://sentry.shared.us-east-1.stage.mozaws.net/loop/loop-stage/
Will assume same for Prod.
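As a rough illustration of the kind of number behind a percentage error graph: how the dashboard actually computes it is not stated in this bug, so the sketch below simply assumes "error" means a 5xx response and that the input is a list of HTTP status codes pulled from the nginx logs. Both function names are hypothetical.

```python
from collections import Counter


def error_percentage(status_codes, is_error=lambda code: code >= 500):
    """Return the percentage of responses counted as errors.

    By default only 5xx responses count; pass a different predicate
    to include 4xx as well.
    """
    codes = list(status_codes)
    if not codes:
        return 0.0
    errors = sum(1 for code in codes if is_error(code))
    return 100.0 * errors / len(codes)


def status_classes(status_codes):
    """Bucket codes into '2xx', '4xx', ... for a per-class breakdown."""
    return Counter("{}xx".format(code // 100) for code in status_codes)
```

Whether 4xx responses should count toward the error rate is a judgment call; under load testing they often reflect client behaviour rather than server faults, which is why the predicate is left configurable here.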
Status: RESOLVED → VERIFIED