Closed Bug 1177630 Opened 5 years ago Closed 5 years ago

Create a structured log summary file containing only the data needed for Treeherder's error summary

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

defect
Not set

Tracking

(firefox42 fixed)

RESOLVED FIXED
Tracking Status
firefox42 --- fixed

People

(Reporter: jgraham, Assigned: jgraham)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Autostarring needs to know about various kinds of errors:

* Tests that get an unexpected result
* Crashes
* Messages logged as error/critical

Other data may be required, but that's an easy set to start with.

The plan is to create a line-based JSON format which is a subset of structured logs with just that data in, write it to a file that will be uploaded by blobber and can then be consumed by treeherder (I don't know if that's the *best* plan; maybe it should POST to treeherder directly, or something).
I think Treeherder should consume it, like it does the current "raw" json log (eg https://github.com/mozilla/treeherder/blob/29a704cc907ec8a64be02dc38d6bff5f555b5771/treeherder/etl/buildapi.py#L173-L181).
The data structure we need here is more for the "use a signature as the identifier of a failure, rather than the bug number" work, rather than the automatic classification work, which will come later; changing 'blocks' accordingly.
Blocks: 1179263
No longer blocks: autostar
Summary: Create a structured log summary file containing only the data needed for autostarring → Create a structured log summary file containing only the data needed for Treeherder's error summary
Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary.
Attachment #8630960 - Flags: review?(mdoglio)
Attachment #8630960 - Flags: review?(cmanchester)
https://reviewboard.mozilla.org/r/12817/#review11381

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:1
(Diff revision 1)
> +import json

Not sure that this is the right data or the right format; let's discuss that here.
Attachment #8630960 - Flags: review?(emorley)
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary.
https://reviewboard.mozilla.org/r/12817/#review11381

> Not sure that this is the right data or the right format; let's discuss that here.

To expand on this a bit, depending on what treeherder side wants, we could make the data flatter e.g. have a single object per row with names that map directly to column names. We might want to include more of the information from the stackwalker in the case that it can't produce a backtrace (i.e. the stderr / returncode).
Attachment #8630960 - Flags: review?(cmanchester)
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

https://reviewboard.mozilla.org/r/12817/#review11425

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:21
(Diff revision 1)
> +        test_id = (item["test"], item["subtest"])

Unused variable.

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:14
(Diff revision 1)
> +                           "message":item.get("message", None),
> +                           "stack":item.get("stack", None)})

The result of "get" when the key is missing is already None.

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:5
(Diff revision 1)
> +class ErrorSummaryFormatter(BaseFormatter):

The other thing we talked about was an auto-increment field to help locate errors in a fuller log.

I can imagine context being useful to more sophisticated classification algorithms, but depends on what you have in mind.
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

I don't really know what half the structure log properties are, or how this would work in practice, so not sure I'm the best person to review.

(My heads still at the whiteboard thinking the best way for us to do this stage TBH)
Attachment #8630960 - Flags: review?(emorley)
https://reviewboard.mozilla.org/r/12817/#review11381

> To expand on this a bit, depending on what treeherder side wants, we could make the data flatter e.g. have a single object per row with names that map directly to column names. We might want to include more of the information from the stackwalker in the case that it can't produce a backtrace (i.e. the stderr / returncode).

I changed this a little to make things a little closer to the raw structured log format.
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary.
Attachment #8630960 - Flags: review?(emorley)
Attachment #8630960 - Flags: review?(cmanchester)
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

https://reviewboard.mozilla.org/r/12817/#review11599

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:15
(Diff revision 2)
> +        data = details.copy()

Copy necessary?

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:1
(Diff revision 2)
> +import json

License header!?

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:25
(Diff revision 2)
> +                "message":item.get("message"),
> +                "stack":item.get("stack")}

Nit: space after ":"

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:32
(Diff revision 2)
> +        return self._output_test(item["test"], item["subtest"], data)

s/data/item/

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:21
(Diff revision 2)
> +        data = {"test": test,
> +                "subtest": subtest,

The other thing to do would be exclude the metadata fields rather than include everything but them, which seems like the effect here (when we decide later it's important to have stackwalk_stderr in the database we might need to change this).

Put another way, this result of this is starting to look a lot like bug 1113868, moving the implementation closer might simplify things as well.
Attachment #8630960 - Flags: review?(cmanchester)
https://reviewboard.mozilla.org/r/12817/#review11599

> The other thing to do would be exclude the metadata fields rather than include everything but them, which seems like the effect here (when we decide later it's important to have stackwalk_stderr in the database we might need to change this).
> 
> Put another way, this result of this is starting to look a lot like bug 1113868, moving the implementation closer might simplify things as well.

Well the counter argument is that there will be a fixed set of fields that actually get stored in the database; the intent here is not to actually store a json blob but to turn it into a tabular format. So that imposes an effective whitelist of fields anyway. Given that I'd prefer to maintain the current approach for now.
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary.
Attachment #8630960 - Flags: review?(cmanchester)
Attachment #8630960 - Flags: review?(emorley)
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

https://reviewboard.mozilla.org/r/12817/#review11725

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:43
(Diff revision 3)
> +        if item["level"] not in ("ERROR", "CRITICAL"):

Treeherder uses:
`("ERROR", "WARNING", "CRITICAL")`
https://github.com/mozilla/treeherder/blob/58813b0c51969b9471b8afe1b8624eaf46cde0c5/treeherder/log_parser/artifactbuilders.py#L127

And also:
`(?:ERROR|CRITICAL|FATAL)`
https://github.com/mozilla/treeherder/blob/b8df082ae272cf9846a34fd8d84e08309e273ae3/treeherder/log_parser/parsers.py#L294

Not sure which is correct.
Attachment #8630960 - Flags: review?(cmanchester)
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

https://reviewboard.mozilla.org/r/12817/#review11759

::: testing/mozbase/mozlog/mozlog/structured/formatters/errorsummary.py:30
(Diff revision 3)
> +        return self._output("result", data)

Call this test_result?

I'll save r+ for after mdoglio has given feedback.
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester
Attachment #8630960 - Attachment description: MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary. → MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester
Attachment #8630960 - Flags: review?(mdoglio) → review?(cmanchester)
https://reviewboard.mozilla.org/r/12817/#review11725

> Treeherder uses:
> `("ERROR", "WARNING", "CRITICAL")`
> https://github.com/mozilla/treeherder/blob/58813b0c51969b9471b8afe1b8624eaf46cde0c5/treeherder/log_parser/artifactbuilders.py#L127
> 
> And also:
> `(?:ERROR|CRITICAL|FATAL)`
> https://github.com/mozilla/treeherder/blob/b8df082ae272cf9846a34fd8d84e08309e273ae3/treeherder/log_parser/parsers.py#L294
> 
> Not sure which is correct.

This matches what mozharness uses to mark a job orange when it is processing the structured logs (not all testsuites do that, of course). FATAL isn't a structured log level, and dealing with ERROR and CRITICAL seems hard enough without also adding WARNING into the mix. If we find there is a use case for keeping the WARNINGs later, we can do that of course.
I went through this with mdoglio in person and I think we agree that it provides a reasonable starting point for experimentation.
Attachment #8630960 - Flags: review?(cmanchester) → review+
Comment on attachment 8630960 [details]
MozReview Request: Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester

https://reviewboard.mozilla.org/r/12817/#review11873

Ship It!
url:        https://hg.mozilla.org/integration/mozilla-inbound/rev/77e627b220825a3b7b022819076bbd1c6b5bcf26
changeset:  77e627b220825a3b7b022819076bbd1c6b5bcf26
user:       James Graham <james@hoppipolla.co.uk>
date:       Wed Jul 08 11:32:57 2015 +0100
description:
Bug 1177630 - Add formatter to mozlog for producing a machine readable error summary, r=chmanchester
https://hg.mozilla.org/mozilla-central/rev/77e627b22082
Assignee: nobody → james
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Depends on: 1185453
You need to log in before you can comment on or make changes to this bug.