Closed Bug 1043741 Opened 10 years ago Closed 10 years ago

Logviewer: Use buildbot buildstep 'results' code for step pass/fail determination

Categories

(Tree Management :: Treeherder, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: glandium, Assigned: emorley, Mentored)

References

()

Details

(Whiteboard: [good first bug])

Attachments

(2 files)

There are many errors, especially on windows, that are unknown to the structured log parser, and in such cases, all steps are considered and displayed green, despite "failed" being the status of one of those steps.

See 
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=772080&repo=try
And
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=160217&repo=mozilla-central
for a similar case for warnings.
Yeah good spot - the log viewer should be relying on non-zero in:

 ========= Started check test complete warnings (results: 1, elapsed: 10 mins, 55 secs) (at 2014-07-24 08:00:15.141252) ========= 

...and not whether the parser found any error message matches.
Priority: -- → P2
Summary: Structured log viewer shows failed step as green when it has nothing matching in it → Log viewer should use buildbot buildstep 'results' code for green/orange determination
Summary: Log viewer should use buildbot buildstep 'results' code for green/orange determination → Logviewer: Use buildbot buildstep 'results' code for green/orange determination
Thinking this blocker might be suitable for the upcoming work week, and/or upcoming development resources.
Mentor: mdoglio
Whiteboard: [good first bug]
For clarity (or my own benefit :)) a failing job which shows the bug consistently atm, is 'Windows 2012 x64 debug' and 'Windows 2012 x64 opt' build jobs on mozilla-central. Each has (1) warning and incorrectly shows a green log step for each.
You'll have to be quick about using that, that warning will go away on the next merge from mozilla-inbound.
Thanks Phil :) Ok, I will dig around later for more.
So the current log artefact is of form:

[{
	"blob": {
		"header": {
			"slave": "b-2008-sm-0037",
			"buildid": "20140724063607",
			"builder": "mozilla-central-win64-debug",
			"results": "warnings (1)",
			"starttime": "1406208970.18",
			"builduid": "295f9e2fcc8a47c9bd4d1c9ac8693c3b",
			"revision": "616e6924cb0b"
		},
		"step_data": {
			"all_errors": [],
			"steps": [{
				"errors": [],
				"name": "set props: basedir",
				"started": "2014-07-24 06:36:10.176795",
				"started_linenumber": 8,
				"finished_linenumber": 91,
				"finished": "2014-07-24 06:36:11.328212",
				"error_count": 0,
				"duration": 1,
				"order": 0
			},
			{
				"errors": [],
				"name": "set props: hashType",
				"started": "2014-07-24 06:36:11.328575",
				"started_linenumber": 93,
				"finished_linenumber": 95,
				"finished": "2014-07-24 06:36:11.328937",
				"error_count": 0,
				"duration": 0,
				"order": 1
			},

Currently the log viewer is only using the error_count to decide what colour to use, but this is zero in cases where the parser didn't find anything. I seem to remember us discussing whether we needed a per-step result code in the DB - I can't remember the conclusion.

It seems like we need to:
* Add a 'result' property to each step at ingestion
* Decide whether then "error count" property adds any value over len(errors), and if not, remove it
* Get the log viewer to use the per step result property to determine the colour of each step (and ignore error count)
* Add CSS styles for the other result colours (there's only orange and green at the moment)
Assignee: nobody → emorley
Status: NEW → ASSIGNED
* Switches to using named groups for the regex, to make things clearer.
* Switches the regex to raw literal string, since it's safer (we should probably do this everywhere at some point).
* I'm not overly happy with copy-pasta-ing RESULT_DICT yet again - the mappings within are used in several places in the repo, but I'm not sure if the log parser wants to pull them in, since they include a lot of other stuff.
* The tests were updated by uncommenting that block in test_log_view_artifact_builder.py, but since the dicts are unordered, we're at the mercy of whatever order Python decides to use, which is why the test diffs are larger (and less readable) than I would have liked.
* Counter-intuitively, the result can be gleaned even when we've only reached the start step, since the step log output was buffered and the result code added to both the start and end markers.
* I'm not sure why we have constants both inside and outside the StepParser class. I can get that for the regex compiles we perhaps want to save duplicating for each instance, but perhaps we should move ST_STARTED/ST_FINISHED/DATE_FORMAT/RESULT_DICT out too?
Attachment #8484324 - Flags: review?(cdawson)
Comment on attachment 8484324 [details] [review]
service PR200 - Add step result to structured log artefact

(In reply to Ed Morley [:edmorley] from comment #7)
> service PR200 - Add step result to structured log artefact
> 
> * Switches to using named groups for the regex, to make things clearer.
> * Switches the regex to raw literal string, since it's safer (we should
> probably do this everywhere at some point).
> * I'm not overly happy with copy-pasta-ing RESULT_DICT yet again - the
> mappings within are used in several places in the repo, but I'm not sure if
> the log parser wants to pull them in, since they include a lot of other
> stuff.

I've switched this to importing the dict only from treeherder.etl.buildbot.

> * The tests were updated by uncommenting that block in
> test_log_view_artifact_builder.py, but since the dicts are unordered, we're
> at the mercy of whatever order Python decides to use, which is why the test
> diffs are larger (and less readable) than I would have liked.
> * Counter-intuitively, the result can be gleaned even when we've only
> reached the start step, since the step log output was buffered and the
> result code added to both the start and end markers.

Adding jeads as a reviewer to spread the load a bit, whichever of you gets to this first can cancel the other person's review.
Attachment #8484324 - Flags: review?(jeads)
(In reply to Ed Morley [:edmorley] from comment #7)
> Created attachment 8484324 [details] [review]
> service PR200 - Add step result to structured log artefact
> 
> * Switches to using named groups for the regex, to make things clearer.

Much better.

> * Switches the regex to raw literal string, since it's safer (we should
> probably do this everywhere at some point).

Also, much better. We should definitely do this everywhere, especially in https://github.com/mozilla/treeherder-service/blob/master/treeherder/etl/buildbot.py

> * The tests were updated by uncommenting that block in
> test_log_view_artifact_builder.py, but since the dicts are unordered, we're
> at the mercy of whatever order Python decides to use, which is why the test
> diffs are larger (and less readable) than I would have liked.

The testing strategy here, https://github.com/mozilla/treeherder-service/blob/master/tests/log_parser/test_log_view_artifact_builder.py#L48, is brittle and will likely keep biting us as we continue to modify the log view artifact in the future. A better strategy would be to maintain explicit json schemas for json artifacts when treeherder has hard structure requirements. The jsonschema module is already in our vendor lib https://github.com/mozilla/treeherder-service/tree/master/vendor/jsonschema so this is just a matter of defining the schema. 

If we do this, we can modify the test assertion to be schema validation results, which would be much easier to maintain going forward. Would also allow us to validate structures at run time in production so we can log errors when we hit them. This is out of the scope of this PR but should be added as a refactor task for treeherder.

> * Counter-intuitively, the result can be gleaned even when we've only
> reached the start step, since the step log output was buffered and the
> result code added to both the start and end markers.

Not totally sure how to exploit this in the current parser architecture. Do you have some restructuring ideas here?

> * I'm not sure why we have constants both inside and outside the StepParser
> class. I can get that for the regex compiles we perhaps want to save
> duplicating for each instance, but perhaps we should move
> ST_STARTED/ST_FINISHED/DATE_FORMAT/RESULT_DICT out too?

The only reason I could see for this is the need to import RE_HEADER_VALUE/RE_HEADER_START/PATTERN/RE_STEP_START/RE_STEP_FINISH explicitly in other modules independently of HeaderParser and StepParser. Maybe at one time in development that was the case. However, grepping for those constants, I find no explicit imports.

These constants should be converted to class attributes on the classes, HeaderParser and StepParser, that use them.
Flags: needinfo?(emorley)
Comment on attachment 8484324 [details] [review]
service PR200 - Add step result to structured log artefact

This looks good. We can address the remaining issues in a future refactor.
Attachment #8484324 - Flags: review?(jeads)
Attachment #8484324 - Flags: review?(cdawson)
Attachment #8484324 - Flags: review?
Attachment #8484324 - Flags: review+
Flags: needinfo?(emorley)
Attachment #8484324 - Flags: review?
Summary: Logviewer: Use buildbot buildstep 'results' code for green/orange determination → Logviewer: Use buildbot buildstep 'results' code for step pass/fail determination
Blocks: 1063678
I've broken the "use more colours than just green/orange" part out to bug 1063678, since this bug is more about the "if the parser found no failures, we show as success and sheriffs have to click on every step" issue.

Once the service part is in production, I'll work on the UI part, since it will be easier to test locally.
(In reply to Jonathan Eads ( :jeads ) from comment #9)
> Also, much better. We should definitely do this everywhere, especially in
> https://github.com/mozilla/treeherder-service/blob/master/treeherder/etl/
> buildbot.py

Filed bug 1063688.

> The testing strategy here,
> https://github.com/mozilla/treeherder-service/blob/master/tests/log_parser/
> test_log_view_artifact_builder.py#L48, is brittle and will likely keep
> biting us as we continue to modify the log view artifact in the future. A
> better strategy would be to maintain explicit json schemas for json
> artifacts when treeherder has hard structure requirements

Filed bug 1063691.

> > * Counter-intuitively, the result can be gleaned even when we've only
> > reached the start step, since the step log output was buffered and the
> > result code added to both the start and end markers.
> 
> Not totally sure how to exploit this in the current parser architecture. Do
> you have some restructuring ideas here?

I don't think there's anything more to exploit, it was more of a "yeah I know it looks weird setting the result on the 'have found start header' block, but this is why" :-)

> The only reason I could see for this is the need to import
> RE_HEADER_VALUE/RE_HEADER_START/PATTERN/RE_STEP_START/RE_STEP_FINISH
> explicitly in other modules independently of HeaderParser and StepParser.
> Maybe at one time in development that was the case. However, grepping for
> those constants, I find no explicit imports.
> 
> These constants should be converted to class attributes on the classes,
> HeaderParser and StepParser, that use them.

Filed bug 1063695.
I have a WIP for the ui part, but waiting on some fixes on master to be deployed to dev, since the log viewer is broken on dev currently.
Since there are fewer cases that the parser doesn't find now (after bug 1040418 and friends) and there are links to the raw log, this is only a minor regression over TBPL, in that one has to realise to open the raw log and look through it. As such demoting slightly.
Previously only the presence of error lines found by the log parser would cause the step to be marked as failing in the log viewer - even if the step had a result value of !success. With this patch we use the step result, thereby also handling cases where the log parser didn't find a match & avoiding parser false positives from acting as a red herring, if the failure occurred in a different step.

My only concerns with this patch are:
1) The repetition of |!== "success"|. However bug 1063678 will be replacing some of the instances in lvLogSteps.html (eg for "btn-warning"). Could be persuaded to modify the steps artefact in the for loop to have a "stepFailed" property or similar, so we can use that.
2) We'll need to enforce that all submitters of data (ie the non-buildbot ones) provide step results. I think this is generally desirable though.
Attachment #8493840 - Flags: review?(cdawson)
Commits pushed to master at https://github.com/mozilla/treeherder-ui

https://github.com/mozilla/treeherder-ui/commit/2b2c4bf80aa0cafc5d35f004224ed89ccd516731
Bug 1043741 - Use the step result code for step failure determination

Previously only the presence of error lines found by the log parser
would cause the step to be marked as failing in the log viewer - even
if the step had a result value of !success. With this patch we use the
step result, thereby also handling cases where the log parser didn't
find a match & avoiding parser false positives from acting as a red
herring, if the actual failure occurred in a different step.

https://github.com/mozilla/treeherder-ui/commit/82f9b16a2b3d244291658ec8be1ce636bc8b465a
Merge pull request #203 from mozilla/logviewer-step-result

Bug 1043741 - Use the step result code for step failure determination
Attachment #8493840 - Flags: review?(cdawson) → review+
Thanks :-)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Depends on: 1086896
Commits pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/bb14dabcafe7649e1de3537474818e0da7cdeecc
Bug 1043741 - Use the step result code for step failure determination

Previously only the presence of error lines found by the log parser
would cause the step to be marked as failing in the log viewer - even
if the step had a result value of !success. With this patch we use the
step result, thereby also handling cases where the log parser didn't
find a match & avoiding parser false positives from acting as a red
herring, if the actual failure occurred in a different step.

https://github.com/mozilla/treeherder/commit/a7b95e539886a5185c56cca94b5b6a0e92b1f31b
Merge pull request #203 from mozilla/logviewer-step-result

Bug 1043741 - Use the step result code for step failure determination
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: