This would require tracking Kafka checkpoints for each topic at the day boundary and giving the CEPs access to that information so they can start the stream from that location. On a catastrophic CEP failure (instance termination), this would also serve as an automatic backfill, so near-real-time reporting can be lossless without any manual Ops intervention.
Component: Metrics: Pipeline → Pipeline Ingestion
Product: Cloud Services → Data Platform and Tools
This functionality is supported in Kafka as of the 0.11 release, which we are running: https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling, so we could probably set the CEP up to automatically backfill data after bad restarts, instance terminations, etc.
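
A rough sketch of the restart-time rewind, to make the idea concrete. It is not the KIP-122 command-line tool; it swaps in the consumer-side timestamp lookup instead, and assumes the CEP's consumer behaves like kafka-python's KafkaConsumer (offsets_for_times and seek). The function names, the UTC day boundary, and the idea that the partitions are already assigned are all assumptions for illustration, not something this bug specifies.

```python
from datetime import datetime, timezone


def day_boundary_ms(now=None):
    """Epoch milliseconds of the most recent UTC midnight (assumed day boundary)."""
    now = now or datetime.now(timezone.utc)
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return int(midnight.timestamp() * 1000)


def seek_to_day_boundary(consumer, partitions, now=None):
    """Rewind each partition to the first offset at or after the day boundary.

    `consumer` is assumed to expose offsets_for_times() and seek() in the
    style of kafka-python's KafkaConsumer, with `partitions` already assigned.
    Returns a dict of partition -> offset that was seeked to.
    """
    boundary = day_boundary_ms(now)
    found = consumer.offsets_for_times({tp: boundary for tp in partitions})
    resumed = {}
    for tp, hit in found.items():
        if hit is None:
            # No message at or after the boundary; leave this partition alone.
            continue
        consumer.seek(tp, hit.offset)
        resumed[tp] = hit.offset
    return resumed
```

A CEP would call seek_to_day_boundary once on startup, before polling, so a replacement instance replays everything since the boundary. Whether replayed events are deduplicated downstream is outside this sketch.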