browserid: coordinate log collection with metrics

VERIFIED FIXED

Status

task
VERIFIED FIXED
8 years ago

People

(Reporter: petef, Assigned: petef)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qa-])

Talk to metrics and figure out where we need to dump browserid logs, and which logs. This includes firewall rules, cronjobs, etc.
Whiteboard: [qa-]
thunder, do you know who our metrics contact is? I'd like to coordinate what logs we'll have to collect, and how we're going to do the analysis and ship to metrics (or whatever is needed).
Assignee: nobody → petef
Status: NEW → ASSIGNED
Cc'ing Lloyd & Anurag; they can help you out.
@petef: 
Currently, we have a cron job that copies logs stored on im-log03 to our data warehouse for further analysis via Hadoop and Hive.
I assume browserid logs are different with respect to privacy and other issues.
Can you provide more details on log collection? A few questions that come to mind:
 
* Do you want to store the logs for offline analysis?
* What type of analysis would you want to perform on the logs: ad hoc, nightly stat jobs, anything else?
* What's the retention policy for logs?
What's the path on im-log03, and which log files do you guys need? Just the browserid-metrics and verifier-metrics logs from the webheads, or the zeus access logs too?

I'll defer to thunder on types of analysis. I just want to maintain whatever existing browserid metrics dashboards are there when we switch servers.
@petef: 
(some background on how things happen right now)
The identity logs (browserid-metrics & verifier-metrics) aren't pushed to im-log03 for privacy reasons. They currently reside on a separate VM, browserid-stats1.vm1.labs.sjc1.mozilla.com, where they are processed nightly by a Kettle script. The aggregate numbers are then pushed to populate the dashboards at https://metrics.mozilla.com/pentaho/content/pentaho-cdf-dd/Render?solution=metrics2&path=identity/&file=identity.wcdf


The other aspect relates to logs collected via the website, https://browserid.org/, which should be standard Apache logs. Metrics can use that data to find the number of 404s, 200s, etc., click counts, and other web-analytics details. AFAIK, these logs aren't being written to im-log03. We'll need access to those logs so we can build dashboards for funnel drop-off in identity signup and for other analysis.
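
To illustrate the kind of status-code counting mentioned above, here is a minimal Python sketch. It assumes the web logs use the Apache common or combined log format; the sample lines and the name count_statuses are hypothetical and are not taken from the real browserid.org logs or from any existing metrics job.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it in an
# Apache common/combined log line, e.g.: "GET /path HTTP/1.1" 200 1234
# (assumed format; adjust if the servers use a custom LogFormat).
LINE_RE = re.compile(r'"[A-Z]+ \S+ [^"]*" (?P<status>\d{3})\b')

def count_statuses(lines):
    """Return a Counter mapping HTTP status code (as a string) to hit count."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # silently skip lines that do not look like access-log entries
            counts[match.group("status")] += 1
    return counts

# Hypothetical sample lines, using documentation-only IP addresses.
sample = [
    '203.0.113.5 - - [10/Dec/2011:13:55:36 -0800] "GET / HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Dec/2011:13:55:40 -0800] "GET /signin HTTP/1.1" 200 2048',
    '203.0.113.9 - - [10/Dec/2011:13:55:41 -0800] "GET /missing HTTP/1.1" 404 -',
    'not an access log line',
]

if __name__ == "__main__":
    for status, hits in sorted(count_statuses(sample).items()):
        print(status, hits)
```

A real job would read from the log files rather than an in-memory list, and would need to respect whatever access restrictions end up applying to these logs.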

AFAIK, the logs stored on the VM are confidential for privacy reasons, and only a restricted few should have access to them. The same might be true for the web logs, and I defer to Dan for further insight.

I am assuming we will need to ship the browserid/verifier logs to a locked-down machine, and likewise the web logs unless Dan says otherwise.
Note that as currently deployed, any Zeus logs that BrowserID writes in production are pushed automatically to both im-log02 and im-log03, same as every other application.

To avoid this, reconfigure the BrowserID vservers to write logs somewhere other than /var/log/zeus/.
Depends on: 711082, 711237
Working and documented:
https://intranet.mozilla.org/Services/Ops/BrowserID/Pentaho
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Verified by documentation on link above.
Status: RESOLVED → VERIFIED