Closed Bug 1060339 — Opened 7 years ago, Closed 7 years ago

"errors" is empty for successful jobs whose logs contained failure lines

Categories

(Tree Management :: Treeherder, defect, P2)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

Details

Attachments

(1 file)

In an ideal world, we would never have a successful job whose log contained error lines that our log parser regexes matched. Unfortunately this is not the case in practice, since:

1) There is sometimes spammy log output that isn't intended to make the job fail, but which false-positive matches against our log regexes.
2) Sometimes the test/harness/buildbot/... is broken, the log error is real, and it should have caused the run to fail.

In the case of #1, hiding these errors in successful runs means that when someone looks at the error summary for a failed run, they don't realise that a proportion of the failures they can see are actually present in _every_ job, not just the failed ones.

In the case of #2, hiding the errors means we don't realise <foo> is broken and needs fixing.

Now in TBPL, whilst we didn't parse every log by default, you could request one-off log parsing by just loading the brief log for a successful job.

As I understand it, Treeherder parses every log anyway (for things like buildstep times), so parsing for errors shouldn't be too much extra infra load.
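For illustration, here is a minimal sketch of the kind of regex matching the log parser does. The patterns below are hypothetical stand-ins, not Treeherder's real pattern set:

```python
import re

# Hypothetical error-line patterns, standing in for the parser's real set.
ERROR_PATTERNS = [
    re.compile(r"^TEST-UNEXPECTED-FAIL"),
    re.compile(r"\bTraceback \(most recent call last\)"),
    re.compile(r"^\d+ ERROR "),
]

def find_error_lines(log_lines):
    """Return (line_number, line) pairs matching any error pattern."""
    return [
        (n, line)
        for n, line in enumerate(log_lines, start=1)
        if any(p.search(line) for p in ERROR_PATTERNS)
    ]

# Case #1 above: a job can emit matching lines yet still end up green.
log = [
    "Started build step",
    "TEST-UNEXPECTED-FAIL | test_foo.py | assertion failed",
    "Build finished: success",
]
errors = find_error_lines(log)
```

With `check_errors=False` for green jobs, lines like the one matched here are simply never surfaced, which is exactly the problem described above.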
Summary: "errors" is empty for successful jobs that contained failure log lines → "errors" is empty for successful jobs whose logs contained failure lines
https://github.com/mozilla/treeherder-service/blob/2da76b9d0db770190e60356c17d105a22dbb5536/treeherder/model/derived/jobs.py#L1959

    if result != 'success':
        task['check_errors'] = True
        task['routing_key'] = 'parse_log.failures'
    else:
        task['check_errors'] = False
        task['routing_key'] = 'parse_log.success'

https://github.com/mozilla/treeherder-service/blob/2ef3b13cb1ec73d6aabc7f0d3638e75d4b22fec7/treeherder/webapp/api/job_log_url.py#L73

    def parse(self, request, project, jm, pk=None):
        """
        Trigger an async task to parse this log. This can be requested by the ui
        in case the log parsing had an intermittent failure
        """
...
        has_failed = job["result"] in jm.FAILED_RESULTS
...
        parse_log.delay(project, log_obj["url"],
                        job["job_guid"], job["resultset_id"],
                        check_errors=has_failed)
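The proposed change boils down to no longer keying `check_errors` off the job result, while keeping the routing-key split so failed-job logs still go to their own queue. A hedged sketch (the function name is hypothetical, not Treeherder code):

```python
# Hypothetical sketch of the fix: always check for errors, but keep the
# per-result routing split from jobs.py quoted above.
def build_parse_task(result):
    task = {"check_errors": True}  # previously: result != 'success'
    if result != "success":
        task["routing_key"] = "parse_log.failures"
    else:
        task["routing_key"] = "parse_log.success"
    return task
```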
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Cameron, how would you like me to handle this? We currently have check_errors defaulting to both false and true at various points - shall I make them all default to true? Or should I just remove the pref entirely?

https://github.com/mozilla/treeherder-service/search?q=check_errors
Flags: needinfo?(cdawson)
Ed-- Yeah, this was a performance tuning choice so that we didn't waste time looking for valid errors where there would be none.  But your cases above prove that assumption wrong.  We could set check_errors to true everywhere, if that's more correct.  And, like you say, if it's always going to be true, we don't need it.  So it could be removed if we're pretty sure we'll never want to skip error checking.

This will place a heavier load on the workers, since matching the error regexes is the bulk of the log-parsing time. And earlier on, it was overloading us. But now that log parsing is a separate worker, perhaps we could handle that load. Or I guess I should say: we need to figure out a way to handle that load. :)
Flags: needinfo?(cdawson)
This makes us parse error lines even if the job was successful. I'll leave any cleanup (removal of check_errors) to later, when we're sure we don't need it any more.

Is my understanding correct that we already parse the build step times (durations) for successful logs? If not, we need to do that too. And if we do, I'm not sure this really increases load that much more, given that we'll already have had to download the gzipped log, extract it, etc.

Another thought - I seem to recall us talking about this before (or maybe it was with Mauro) and saying that to ease load, we could make the successful log parsing take a lower priority than the failures. It seems like we're already doing this...?

https://github.com/mozilla/treeherder-service/blob/f53fd6ba0a9ebec92d8d1cca77820fe44bb96626/treeherder/model/derived/jobs.py#L1970
                if result != 'success':
                    task['routing_key'] = 'parse_log.failures'
                else:
                    task['routing_key'] = 'parse_log.success'

https://github.com/mozilla/treeherder-service/blob/14b3d317936e92eaccf38211334334aa052fb04e/treeherder/settings/base.py#L186

    CELERY_QUEUES = (
        Queue('default', Exchange('default'), routing_key='default'),
        # queue for failed jobs/logs
        Queue('log_parser_fail', Exchange('default'), routing_key='parse_log.failures'),
        # queue for successful jobs/logs
        Queue('log_parser', Exchange('default'), routing_key='parse_log.success'),
    )
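That queue split is what gives failed-job logs effective priority: a dedicated worker can drain `log_parser_fail` independently of `log_parser`. A minimal stand-alone sketch of the routing-key-to-queue mapping (plain Python, no Celery required; queue names mirror the setting above):

```python
# Map routing keys to queue names, mirroring the CELERY_QUEUES setting.
ROUTING_TABLE = {
    "default": "default",
    "parse_log.failures": "log_parser_fail",
    "parse_log.success": "log_parser",
}

def queue_for(routing_key):
    """Return the queue a task with this routing key lands on."""
    return ROUTING_TABLE.get(routing_key, "default")
```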
Attachment #8484086 - Flags: review?(cdawson)
Comment on attachment 8484086 [details] [review]
service PR198 - Parse logs for error lines even if the job was successful

Adding jeads as a reviewer to spread the load a bit, whichever of you gets to this first can cancel the other person's review.

See comment 5 for PR overview.
Attachment #8484086 - Flags: review?(jeads)
Attachment #8484086 - Flags: review?(cdawson)
Attachment #8484086 - Flags: review+
Blocks: 1078396