Closed Bug 1416826 Opened 8 years ago Closed 8 years ago

Investigate the Nov 12 prod cep pipeline outage

Categories

(Data Platform and Tools Graveyard :: Operations, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: trink, Assigned: trink)

Details

It appears the production pipeline CEP failed at 7pm on Nov 12 and was restarted at 8pm.
Assignee: nobody → mtrinkala
Status: NEW → ASSIGNED
Points: --- → 2
It was a Hindsight segfault (no details).

# Re-review of the packages deployed on Friday

## lua_sandbox changes
- new message matcher functionality (not used on the pipeline CEP)

## Hindsight changes
- startup directory checks and rtc relocations (would fail on deployment during initial start)
- thread id lookup (no deployment was happening at that time, and plugin deployment/re-deployment has been tested on that box)

## lua_sandbox_extensions changes
- the only thing the pipeline CEP uses that changed is alert.lua (the change would not segfault)

# Activities of interest
- There was a Kafka upgrade/repartition on Sunday

# Conclusion
- The failure is not due to the current release. Best guess: the Kafka input. Sadly, this only narrows it down to:
  1. https://github.com/edenhill/librdkafka
  2. https://github.com/mozilla-services/lua_sandbox_extensions/tree/master/kafka

Created an issue for the Kafka extension: https://github.com/mozilla-services/lua_sandbox_extensions/issues/181
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Data Platform and Tools → Data Platform and Tools Graveyard