Closed Bug 1125088 · Opened 10 years ago · Closed 10 years ago
Ensure log parser doesn't re-parse the log if another task in the queue has already done so
Categories: Tree Management :: Treeherder: Data Ingestion, defect, P2
Status: RESOLVED WORKSFORME
Reporter: emorley · Unassigned
Similar to the missing pushlog ingestion task, we should also check that we don't unnecessarily repeat a log parsing task if we've already performed it in the meantime.
i.e. in this scenario:
1) Log parse for job X is scheduled, but hasn't completed promptly.
2) User clicks on job X in the Treeherder UI, parsed log isn't available, so high priority on-demand log parsing task is scheduled.
3) Potentially another user does the same as #2 on a different machine.
4) We now (potentially) have multiple duplicate entries in the high priority log parse queue, plus the original normal-priority task in the queues.
5) One of the tasks completes, leaving several redundant tasks, that ideally should be no-ops.
We may handle this case already - but it's worth checking - since if we don't, it can massively compound a log parser backlog: we end up with hundreds of high-priority tasks (from people clicking in the UI) on top of the real backlog, and they get handled before it.
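For illustration, one way to keep these duplicates out of the queues entirely is an atomic test-and-set at enqueue time. A minimal sketch, assuming a Django cache backend - schedule_log_parse, parse_log, the cache key, and the TTL are all hypothetical names, not Treeherder's actual API:

# Hypothetical enqueue-time de-duplication for log parse tasks.
# cache.add() is atomic: it sets the key only if it is absent, so a
# second concurrent caller gets False and skips scheduling a duplicate.
from django.core.cache import cache

PARSE_SCHEDULED_TTL = 15 * 60  # seconds; assumed, should outlive a typical parse

def schedule_log_parse(job_id, high_priority=False):
    if not cache.add("log-parse-scheduled-%s" % job_id, True, PARSE_SCHEDULED_TTL):
        return  # a parse task for this job is already queued or running
    queue = "log_parser_hp" if high_priority else "log_parser"
    parse_log.apply_async(args=[job_id], queue=queue)  # parse_log: the Celery task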
Reporter · Updated 10 years ago
Summary: Check log parsing doesn't do busy work if we've since parsed the log (eg via on demand parsing) → Check log parser doesn't re-parse the log if another task in the queue has already done so
Reporter · Comment 1 • 10 years ago
Seems like we should be ok, given:
https://github.com/mozilla/treeherder-service/blob/6bf711fb4a50e1bd88751b8a1d87b2ae6f789c16/treeherder/log_parser/tasks.py#L27
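That check amounts to an early return when the log has already been parsed. A rough sketch of the idea - the JobLog model, its status field, and do_parse are assumptions for illustration, not the actual Treeherder code:

from celery import shared_task

@shared_task
def parse_log(job_log_id):
    job_log = JobLog.objects.get(id=job_log_id)  # assumed model
    if job_log.status == "parsed":
        # Another task (e.g. the high-priority on-demand one) finished
        # first, so this invocation is a cheap no-op.
        return
    do_parse(job_log)  # assumed helper: parse, then mark status "parsed"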
Comment 2 • 10 years ago
Yes, that's what guarantees the no-op, although there could be a case where the log parser keeps failing and that condition is never met.
Reporter · Comment 3 • 10 years ago
But we'll want to retry in many cases, right?
And for the cases where we don't, that's bug 1125104 - i.e. don't retry 10 times if we hit an exception that should not be retried.
Presuming we're happy with this - we can close this bug now :-)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Summary: Check log parser doesn't re-parse the log if another task in the queue has already done so → Ensure log parser doesn't re-parse the log if another task in the queue has already done so
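The distinction drawn in comment 3 might look something like this: retry transient failures, let everything else fail immediately. A sketch only; the exception choice, the fetch_and_parse helper, and the retry settings are assumptions:

import requests
from celery import shared_task

@shared_task(bind=True, max_retries=10, default_retry_delay=60)
def parse_log(self, job_log_id):
    try:
        fetch_and_parse(job_log_id)  # assumed helper that downloads and parses
    except requests.exceptions.RequestException as exc:
        # Transient network failure (e.g. an ftp.m.o timeout): retrying
        # is worthwhile, up to the max_retries cap.
        raise self.retry(exc=exc)
    # Any other exception (e.g. a genuine parser bug) propagates and fails
    # the task immediately, rather than burning 10 pointless retries.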
Reporter · Comment 4 • 10 years ago
Oh, but I see what you mean - even for the "we should retry" case (e.g. an ftp.m.o timeout) we can end up with multiple duplicate tasks in log_parser_hp (plus the original in log_parser), which will _all_ retry 10 times each, hammering ftp.m.o.
Plus of course the more obvious "two tasks racing, both end up running, but both succeed, so at least we don't retry" case.
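One way to defuse that scenario even after duplicates have reached log_parser_hp is a run-time lock, complementing the enqueue-time check sketched in the description, so only the first task to start actually downloads and retries. Again a sketch, assuming a shared Django cache; the key name and TTL are made up:

from django.core.cache import cache

LOCK_TTL = 30 * 60  # seconds; assumed, should exceed worst-case parse + retry time

def parse_log_guarded(job_log_id):
    lock_key = "log-parse-running-%s" % job_log_id
    # Atomic test-and-set: only the first of the duplicate tasks wins.
    if not cache.add(lock_key, True, LOCK_TTL):
        return  # a sibling task is already parsing (or retrying); no-op
    try:
        fetch_and_parse(job_log_id)  # assumed helper
    finally:
        # Release on exit so a later task can take over if this one failed.
        cache.delete(lock_key)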