Closed Bug 993914 Opened 11 years ago Closed 11 years ago

Set up long term storage of appropriate data

Categories

(Content Services Graveyard :: Tiles: Ops, defect)

defect
Not set
normal
Points: 3

Tracking

(Not tracked)

RESOLVED FIXED
Iteration: 37.1

People

(Reporter: Mardak, Assigned: relud)

Details

(Whiteboard: .005)

We only want to persist certain types of data for various types of analysis. oyiptong? Not sure how this relates to, or blocks, other bugs.
Flags: needinfo?(oyiptong)
Flags: firefox-backlog?
Component: General → Content Services
Flags: firefox-backlog? → firefox-backlog+
Bryan mentions storing the logs for up to 180 days, which I envision is relevant to this bug. The long term storage can also be used to store summarized data, by-products of the raw logs. These would be useful for our data scientist.
Flags: needinfo?(oyiptong)
Our servers generate a few logs and aggregates:

## raw data

These will have a TTL of 7 days.

1. front-end nodes store data locally and ship it to AWS S3
2. a server pulls the data from S3 and stores it in DDFS (a distributed filesystem)
3. data on DDFS is used to compute aggregates

## aggregates

These will have a TTL of 13 months.

Aggregates computed from DDFS will be stored in AWS Redshift, in the table impression_stats_daily. This table contains the data that will generate our reports for tiles and slots.

## application and operational data

We have operational metrics data, which are counters, streamed from front-end servers via a statsd-like server and stored.

1. operational metrics are sent to statsd-like infrastructure (heka) and stored in graphite. These include response times and counters incremented for dashboard/trend analysis
2. application logs tracking errors or success conditions

clarkbw: what would be a good TTL for this data? It doesn't contain any PII
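For reference, the TTLs proposed in this comment could be written down as a small retention table. This is only a sketch, not what is deployed: the names `RETENTION_DAYS` and `is_expired` are hypothetical, 13 months is approximated as 395 days, and the operational data TTL is left unset because it has not been decided here.

```python
from datetime import date, timedelta
from typing import Optional

# TTLs in days, taken from the comment above.
# Assumption: "13 months" is approximated as 395 days (365 + 30).
RETENTION_DAYS = {
    "raw": 7,             # raw logs on S3 / DDFS
    "aggregates": 395,    # impression_stats_daily in Redshift
    "operational": None,  # TTL still undecided in this bug
}


def is_expired(category: str, created: date, today: date) -> Optional[bool]:
    """Return True if data of this category is past its TTL.

    Returns None when no TTL has been decided for the category.
    """
    ttl = RETENTION_DAYS[category]
    if ttl is None:
        return None
    return today - created > timedelta(days=ttl)
```

A cleanup job could call `is_expired("raw", created, date.today())` per log file and skip any category for which it returns `None`.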
Flags: needinfo?(clarkbw)
(In reply to Olivier Yiptong [:oyiptong] from comment #3)
> clarkbw: what would be a good TTL for this data? It doesn't contain any PII

AFAIK Meta data about the service performance doesn't fall under any of our other TTL rules. I can investigate. How long would you think you need it for?
Flags: needinfo?(clarkbw)
(In reply to Bryan Clark (Firefox PM) [:clarkbw] from comment #4)
> (In reply to Olivier Yiptong [:oyiptong] from comment #3)
> > clarkbw: what would be a good TTL for this data? It doesn't contain any PII
>
> AFAIK Meta data about the service performance doesn't fall under any of our
> other TTL rules. I can investigate. How long would you think you need it
> for?

Do you have a TTL for the Meta data?
Flags: needinfo?(oyiptong)
Product: Mozilla Services → Content Services
Status: NEW → RESOLVED
Closed: 11 years ago
Component: General → Tiles: Ops
Flags: needinfo?(oyiptong)
Resolution: --- → FIXED
Iteration: --- → 37.1
Flags: qe-verify-
Points: --- → 3
Whiteboard: .005
Assignee: nobody → dthornton