Closed Bug 1183246 Opened 5 years ago Closed 1 year ago
We frequently parse logs twice
It seems like we're occasionally parsing the same log twice in Treeherder (see bug 1182282). The only "protection" we currently have against this is a check for whether the log in question has already been parsed:
https://github.com/mozilla/treeherder/blob/master/treeherder/log_parser/tasks.py#L26
https://github.com/mozilla/treeherder/blob/master/treeherder/log_parser/utils.py#L24
... however, that doesn't account for the case where a log parse is already in progress. I wonder if, before starting to actually parse the log, we should set a log parse status of "parsing" (and not parse the log if that status is set). Thoughts? Other solutions?
This might make an interesting bug for :moijes12 to work on, depending on what others say.
I agree. This seems like a good first bug, to be sure.
It occurred to me later that it might be a problem if log parsing got accidentally interrupted and no retry was scheduled. Then logs could be stuck in the "parsing" state indefinitely. Do we need to set some kind of timeout here?
I think memcached entries for logs in progress are the answer here: no DB table churn, and expiry for free.
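To make the idea concrete, here is a minimal sketch of an "in progress" marker with expiry. It does not use Treeherder's code or a real memcached client: `ExpiringCache`, `parse_once`, the key prefix and the 600-second TTL are all hypothetical, and the in-memory class only stands in for memcached's atomic add-if-absent with an expiry time.

```python
import threading
import time


class ExpiringCache:
    """Hypothetical in-memory stand-in for a cache with atomic add + expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # key -> time at which the marker expires
        self._lock = threading.Lock()

    def add(self, key, ttl):
        """Set the key only if it is absent or expired; return True if set."""
        now = self._clock()
        with self._lock:
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False  # a live marker already exists
            self._expiry[key] = now + ttl
            return True

    def delete(self, key):
        with self._lock:
            self._expiry.pop(key, None)


def parse_once(cache, log_url, parse, ttl=600):
    """Run parse(log_url) unless another parse of the same log is in progress.

    The key prefix and the default TTL are assumptions for illustration.
    """
    key = "log-parse-in-progress:" + log_url
    if not cache.add(key, ttl):
        return False  # someone else holds the marker; skip this parse
    try:
        parse(log_url)
    finally:
        # Normal exit or a caught failure: release the marker right away.
        # If the worker dies before reaching this line, the TTL clears it.
        cache.delete(key)
    return True
```

Because the marker expires on its own, an interrupted parse cannot leave a log stuck in the "parsing" state indefinitely, which covers the timeout concern raised above. The existing "already parsed" check would still be needed for parses that have completed.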
Not convinced this is a good first bug; we need to look at all root causes (e.g. repeat parsing should only happen if, say, parsing is slow and the user selects the job in the UI, thus causing an _hp task to be triggered; or are we broken in other ways?).
If a parse-log task fails, we have a retry policy in place.
The current Celery setup doesn't allow multiple workers to pick up the same task. When a worker acknowledges a task, it is removed from the queue. In case of a catchable failure, the retry mechanism triggers a new task, which is then picked up by one of the workers in the pool as if it were new. Ideally, all the tasks that we want to be able to retry should be idempotent. If that's not the case, it may be a problem.
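As a rough illustration of what "idempotent" would mean for a parse task, the sketch below writes its result under a fixed key, so a retry overwrites the earlier result instead of adding a second copy. Everything here is hypothetical: `parse_log_task`, the `(job_id, log_name)` key, the dict used as a store and the "ERROR" matching are assumptions for the example, not Treeherder's actual schema or parser.

```python
def store_parse_result(store, job_id, log_name, summary):
    """Upsert keyed on (job_id, log_name): a second write replaces the first."""
    store[(job_id, log_name)] = summary


def parse_log_task(store, job_id, log_name, lines):
    """Hypothetical parse task: record the 1-based numbers of error lines."""
    summary = {
        "error_lines": [
            number for number, line in enumerate(lines, 1) if "ERROR" in line
        ]
    }
    store_parse_result(store, job_id, log_name, summary)
    return summary


# Running the task twice (e.g. an original run plus a retry) leaves one record.
store = {}
log_lines = ["starting", "ERROR something broke", "done"]
parse_log_task(store, 1, "builds-4h", log_lines)
parse_log_task(store, 1, "builds-4h", log_lines)
```

With a task shaped like this, a duplicate parse wastes worker time but does not corrupt or duplicate stored data. A task that appends rows on every run would not have that property.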
Component: Treeherder: Data Ingestion → Treeherder: Log Parsing & Classification
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → WORKSFORME