When there is an ES outage, pulse messages are lost. The logparser uses a PulseBuildMonitor to get notifications, but PulseBuildMonitor.pulse_message_received() acks the message immediately, before passing it on to the logparser. It would be great if we could change it to ack only if the callback was successful.

One complication is that two callbacks can be called for a single message (on_pulse_message() and one of the other on_*() methods). If the second one fails and we don't ack the message, the first will be called a second time when the message is redelivered. We will also have to craft some retry logic, since the message will sit in the queue until we successfully write to the db.
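A minimal sketch of the ack-on-success idea, in case it helps frame the discussion. It is not the PulseBuildMonitor code: `FakeMessage`, `deliver()`, the `done` set, and the `max_attempts`/`delay` parameters are all hypothetical names invented for illustration, and the only assumption about a real message is that it exposes an `ack()` method. The message is acked only after every callback has succeeded, and callbacks that already succeeded are skipped on retry so the first one is not run twice.

```python
import time


class FakeMessage:
    """Hypothetical stand-in for a pulse message; it only records ack()."""

    def __init__(self, body):
        self.body = body
        self.acked = False

    def ack(self):
        self.acked = True


def deliver(message, callbacks, done, max_attempts=3, delay=0.0):
    """Run callbacks, then ack only if all of them succeeded.

    `done` is a set holding the names of callbacks that have already
    succeeded for this message, so a retry does not call them again.
    Returns True if the message was acked, False if we gave up.
    """
    for attempt in range(max_attempts):
        try:
            for callback in callbacks:
                if callback.__name__ in done:
                    continue  # succeeded on an earlier attempt
                callback(message.body)
                done.add(callback.__name__)
        except Exception:
            # Assumed failure mode: the db write failed. Back off, retry.
            time.sleep(delay * (2 ** attempt))
            continue
        message.ack()
        return True
    # Not acked: the message stays in the queue for redelivery.
    return False
```

A caller would pass something like `[on_pulse_message, on_build_complete]` plus an empty `set()` for each new message. Note that `done` lives in memory here, so it only avoids duplicate calls within one process; if the message is redelivered after a restart, the callbacks would need to be idempotent or the set would have to be persisted.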
Product: Testing → Tree Management
WONTFIX in favour of OrangeFactor v2, which will consume Treeherder's API and will likely be written from scratch.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → WONTFIX