Investigate having new CEP instances start from the beginning of the current day

Status: NEW
Assignee: Unassigned

Product: Data Platform and Tools
Component: Pipeline Ingestion
Priority: P3
Severity: normal
Opened: a year ago
Last updated: 2 months ago

People

Reporter: trink
Assignee: Unassigned

(Reporter)

Description

a year ago
This would require tracking Kafka checkpoints for each topic at the day boundary and giving the CEPs the ability to access that information and start the stream from that location. On a catastrophic CEP failure (instance termination) this would also serve as an automatic backfill, so near-real-time reporting can be lossless without any manual Ops intervention.
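
One possible shape for the "start from the day boundary" step, as a rough sketch only. It assumes a consumer object with the kafka-python `KafkaConsumer` interface (`assignment`, `offsets_for_times`, `seek`, `seek_to_end`), that partitions have already been assigned, and that the day boundary is midnight UTC. The function names are hypothetical and the CEP integration itself is not shown. It also relies on timestamp lookups rather than separately stored checkpoints, which needs brokers and messages that carry timestamps.

```python
from datetime import datetime, timezone


def start_of_day_ms(now=None):
    """Return midnight UTC of the current day as epoch milliseconds."""
    now = now or datetime.now(timezone.utc)
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return int(midnight.timestamp() * 1000)


def seek_to_start_of_day(consumer, now=None):
    """Position every assigned partition at the first message of the day.

    `consumer` is assumed to behave like kafka-python's KafkaConsumer.
    Partitions with no message at or after the boundary are moved to
    the end, so only new data is read from them.
    """
    ts_ms = start_of_day_ms(now)
    partitions = list(consumer.assignment())
    # Maps each partition to an object with an `offset` attribute,
    # or to None when no message is at or after the timestamp.
    found = consumer.offsets_for_times({tp: ts_ms for tp in partitions})
    for tp in partitions:
        hit = found.get(tp)
        if hit is None:
            consumer.seek_to_end(tp)
        else:
            consumer.seek(tp, hit.offset)
    return ts_ms
```

A new or replacement instance would call `seek_to_start_of_day(consumer)` once after assignment and before its first poll. Whether midnight UTC is the right boundary for the reports is an open question for this bug.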

Updated

a year ago
Points: --- → 3
Priority: -- → P2
(Reporter)

Updated

9 months ago
Component: Metrics: Pipeline → Pipeline Ingestion
Product: Cloud Services → Data Platform and Tools

Updated

8 months ago
Priority: P2 → P3

Comment 1

2 months ago
This functionality is supported in Kafka as of the 0.11 release, which we are running: https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling, so we could probably set the CEP up to auto-backfill data on bad restarts, instance terminations, etc.
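
For reference, a sketch of how the KIP-122 tooling could be driven to rewind a group to the start of the current day. It only builds the argument list for `kafka-consumer-groups.sh`; the group, topic, and broker values are placeholders, the helper name is hypothetical, and it assumes the CEP consumes through a consumer group whose committed offsets the tool can reset. Per the KIP, the group has to be inactive while the reset runs. How the tool interprets the timezone of `--to-datetime` should be checked against the Kafka version in use.

```python
from datetime import datetime, timezone


def reset_to_start_of_day_cmd(group, topic, bootstrap_server,
                              now=None, execute=False):
    """Build a kafka-consumer-groups.sh command that resets `group`
    on `topic` to midnight (UTC assumed) of the current day."""
    now = now or datetime.now(timezone.utc)
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return [
        "kafka-consumer-groups.sh",
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--topic", topic,
        "--reset-offsets",
        # The tool expects the YYYY-MM-DDTHH:mm:SS.sss format.
        "--to-datetime", midnight.strftime("%Y-%m-%dT%H:%M:%S.000"),
        # Preview the planned offsets unless explicitly told to apply them.
        "--execute" if execute else "--dry-run",
    ]


if __name__ == "__main__":
    # Placeholder names; print the preview command instead of running it.
    print(" ".join(reset_to_start_of_day_cmd(
        "example-cep-group", "example-topic", "localhost:9092")))
```

The resulting list can be passed to `subprocess.run` from whatever restarts the CEP, after the dry-run output has been checked.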