Closed Bug 1031403 Opened 10 years ago Closed 10 years ago

Convert APK Monolith logs to Kibana JSON format

Categories

(Cloud Services :: Server: Other, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ozten, Assigned: trink)

References

Details

Rob thanks for all your help!

For the sake of progress, this bug will deal with the simpler parts discussed in Bug#973079 Comment#2.

We need a custom decoder to produce 3 streams from the monolith.log currently being written in stage and production on the APK Factory controller servers.

These streams are in the Kibana JSON format.

All lines in monolith.log are CSV formatted, starting with a type and ending with a timestamp.

1) log lines prefixed with "apk-install" should produce a stream that will eventually be used in Kibana to build a "Total number of APKs installed per day" view

Example log line:

    apk-install,"https://ozten.github.io/misfeasance/manifest.webapp",2014-04-21T22:03:52.402Z

These fields are: string, URL, timestamp

2) log lines prefixed with "apk-update-apps-installed" should produce a stream used for "Active Daily Users per day", as each log line is a 24 hour ping from android.

Example log line:

    apk-update-apps-installed,4,2014-04-21T22:18:45.611Z

These fields are: string, integer, timestamp

3) log lines prefixed with "apk-update" should produce a stream used for "Total number of times an out-of-date APK is reported"

Example log line:

    apk-update,f549b671-bb06-4f24-a03c-048a9e147ff4,"http://deltron3030.testmanifest.com/manifest.webapp",1398118689,2014-04-21T22:18:45.605Z

These fields are: string, UUID, URL, Unix timestamp, timestamp
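
The actual decoder is a Heka Lua sandbox (apk_monolith.lua, not shown here). Purely as an illustration of the three field layouts above, a sketch of the parsing step might look like the following Python; field names such as manifest_url and apps_installed are invented for the example and are not the real decoder's schema:

```python
import csv
import json
from io import StringIO

# Hypothetical field names for the three monolith.log line types,
# matching the example log lines above.
FIELDS = {
    "apk-install": ("type", "manifest_url", "timestamp"),
    "apk-update-apps-installed": ("type", "apps_installed", "timestamp"),
    "apk-update": ("type", "uuid", "manifest_url", "unix_ts", "timestamp"),
}

def decode_line(line):
    """Parse one CSV log line into a dict keyed by its type's fields.

    Returns None for unknown prefixes or malformed lines.
    """
    row = next(csv.reader(StringIO(line)))
    names = FIELDS.get(row[0])
    if names is None or len(row) != len(names):
        return None
    return dict(zip(names, row))

line = 'apk-install,"https://ozten.github.io/misfeasance/manifest.webapp",2014-04-21T22:03:52.402Z'
print(json.dumps(decode_line(line)))
```

Note that the quoted URL field is handled by the CSV parser, so commas inside a quoted manifest URL would not break the split.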
Blocks: 973079
Assignee: nobody → mtrinkala
The decoder has been created and the puppet_config for the apk_factory edge node has been updated. However, you don't actually need Kibana to meet any of the above requirements.  
1) Do you want me to add the configs to the shared Heka aggregator to produce those views?
1a) If so, do you still want the data loaded into ElasticSearch?  
2) Do you want any monitoring/anomaly detection/alerting on this data?
Flags: needinfo?(ozten.bugs)
Thanks for the fast turnaround!

> 1) Do you want me to add the configs to the shared Heka aggregator to produce those views?

This sounds great!

> 1a) If so, do you still want the data loaded into ElasticSearch?  

Nah.

> 2) Do you want any monitoring/anomaly detection/alerting on this data?

I don't think so, unless you want to add that.
Flags: needinfo?(ozten.bugs)
I suspect that the raw Heka dashboard view might not be exactly what you're looking for in terms of visual presentation at the moment. We have real time line graphs but no histograms or other types of visual widgets, currently. Plus you can only see a single graph at a time. We can build other static pages that embed multiple dashboard graphs into a single view, but we have to do that separately.

Also it's worth mentioning that Heka itself isn't doing any persistence. The dashboard will give you real time data, but there's no way to go back in time and say "show me what this graph looked like 6 months ago". I suspect that's a requirement, which would bring us back to needing to ship the data into a persistent data store like ES.

I might be wrong, though, and Heka's dashboard might be enough. At the very least, it doesn't hurt to have it as well, to provide a different flavor of visibility into the data. So we'll go ahead and create it (it's easy) and show you what we have, and once you see it we can talk about the ways it does or doesn't meet your needs.
> Plus you can only see a single graph at a time...
> Also it's worth mentioning that Heka itself isn't doing any persistence...

This is acceptable to get going.

> At the very least, it doesn't hurt to have it as well, to provide a different flavor of visibility into the data. So we'll go ahead and create it (it's easy) and show you what we have, and once you see it we can talk about the ways it does or doesn't meet your needs.

Thanks for being flexible!

bwalker can comment on persistence and multiple graphs per page, but I think this first step will give us a lot of value.
Awesome! Expectations managed. ;)
Clarification: It has persistence, but you cannot backfill data from before the point the plugin was originally started (although I just recently added "fast backfill" to some FxA plugins against my better judgement ;))
Changes:
- Requested items
    - A Monolith decoder was created (apk_monolith.lua)
    - A Monolith counter plugin was created (apk_monolith_counters.lua); it is just a simple count of the number of each type of event.
        - Active daily user analysis must assume each request is sent only once a day, since there is no unique identifier to differentiate the requests.

- Additional items
    - The deprecated PayloadJsonDecoder has been replaced with a SandboxDecoder (apk_json.lua)
    - The APK Nginx access and error logs are now processed
    - The puppet configs are now generated for the specific instances of the APK Factory (generator/controller)
    - A Monolith 'top installs' plugin has been added.  This produces a list of the most frequent installs by URL within a UTC day.
    - An HttpStatus plugin was added to analyze the Nginx access logs; it graphs the number of responses in each status category (200, 300, etc.). Anomaly detection and alerting are disabled at the moment.
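
The counting logic above lives in the Heka Lua sandbox plugins (apk_monolith_counters.lua and the 'top installs' plugin). As a rough illustration only of the approach described, with invented names and no Heka API, a sketch of per-UTC-day event counts plus a most-frequent-installs-by-URL table:

```python
from collections import Counter, defaultdict

# day -> Counter of event types (e.g. "apk-install" per UTC day)
counts_by_day = defaultdict(Counter)
# day -> Counter of manifest URLs, for the "top installs" view
installs_by_day = defaultdict(Counter)

def record(event_type, timestamp, url=None):
    """Bucket one event by its UTC day (the ISO 8601 date prefix)."""
    day = timestamp[:10]  # e.g. "2014-04-21"
    counts_by_day[day][event_type] += 1
    if event_type == "apk-install" and url:
        installs_by_day[day][url] += 1

record("apk-install", "2014-04-21T22:03:52.402Z",
       "https://ozten.github.io/misfeasance/manifest.webapp")
record("apk-update-apps-installed", "2014-04-21T22:18:45.611Z")

# Most frequent installs by URL within a UTC day:
print(installs_by_day["2014-04-21"].most_common(5))
```

As noted above, treating each apk-update-apps-installed ping as one active daily user relies on the assumption that each device sends the ping only once per 24 hours, since the ping carries no unique identifier.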
Any ETA on when this deployment will go out?
Flags: needinfo?(mtrinkala)
:ozten aggregator configs went out a week or two ago, and :jason expects a production APK deploy to happen tomorrow 2014-07-21.
Flags: needinfo?(mtrinkala)
So Heka client configs go out tomorrow and then graphs like this will start to have data:
https://heka.shared.us-west-2.prod.mozaws.net/#sandboxes/ApkMonolithCounters/outputs/ApkMonolithCounters.DailyInstalls.cbuf

Thanks whd!
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED