Closed Bug 1024055 Opened 10 years ago Closed 8 years ago

Make mozharness use structured logging for marionette tests

Tracking

(firefox52 fixed)

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

firefox52

---

fixed

People

(Reporter: chmanchester, Assigned: pyang)

References

Details

Attachments

(4 files, 5 obsolete files)

Alter marionette's mozharness script to use structured log data to determine job status 10 years ago Chris Manchester (limited bugmail, email directly) 8.97 KB, patch	jgraham : feedback+	Details \| Diff \| Splinter Review
Alter marionette's mozharness script to use structured log data to determine job status 10 years ago Chris Manchester (limited bugmail, email directly) 8.36 KB, patch		Details \| Diff \| Splinter Review
Alter marionette's mozharness script to use structured log data to determine job status 10 years ago Chris Manchester (limited bugmail, email directly) 8.42 KB, patch	ahal : review-	Details \| Diff \| Splinter Review
Alter marionette's mozharness script to use structured log data to determine job status 10 years ago Chris Manchester (limited bugmail, email directly) 8.93 KB, patch	ahal : review+ mozilla : review+ chmanchester : checked-in+	Details \| Diff \| Splinter Review
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status. 10 years ago Chris Manchester (limited bugmail, email directly) 14.64 KB, patch	ahal : feedback+ jgraham : feedback+	Details \| Diff \| Splinter Review
Provide a mozlog handler to report overall status of a test run for the benefit of mozharness and others 10 years ago Chris Manchester (limited bugmail, email directly) 4.67 KB, patch	jgraham : feedback+ ahal : feedback+	Details \| Diff \| Splinter Review
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status. 10 years ago Chris Manchester (limited bugmail, email directly) 14.40 KB, patch		Details \| Diff \| Splinter Review
enable_config 8 years ago Paul Yang [:pyang] (away) 5.87 KB, patch	ahal : feedback+	Details \| Diff \| Splinter Review
enable_config 8 years ago Paul Yang [:pyang] (away) 5.89 KB, patch	ahal : review+	Details \| Diff \| Splinter Review

Chris Manchester (limited bugmail, email directly)

Reporter

Description

•

10 years ago

mozharness should use structured log data provided that will be provided by bug 956739 with the parser landed in bug 1021817.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 1

•

10 years ago

Attached patch Alter marionette's mozharness script to use structured log data to determine job status (obsolete) — Details — Splinter Review

This puts alternate parsing behind a config option for marionette tests. Still need to figure out the best strategy for testing this.

James, are these changes to the structured parser compatible with the web platform tests?

Attachment #8438705 - Flags: feedback?(james)

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Assignee: nobody → cmanchester

Status: NEW → ASSIGNED

James Graham [:jgraham]

Comment 2

•

10 years ago

Comment on attachment 8438705 [details] [diff] [review]
Alter marionette's mozharness script to use structured log data to determine job status

Review of attachment 8438705 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/structuredlog.py
@@ +99,5 @@
>  
>          try:
>              data = json.loads(line)
>          except ValueError:
> +            self.warning("Failed to parse line '%s' as json" % line)

What was the reason for this change?

::: scripts/marionette.py
@@ +50,5 @@
> +    A class that extends StructuredOutputParserHelper such that it can parse
> +    if gecko did not install properly
> +    """
> +
> +    bad_gecko_install = 'Error installing gecko!'

So I don't understand why this needs to be special cased here? Can't we just change marionette to do something more sane?

Attachment #8438705 - Flags: feedback?(james) → feedback+

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 3

•

10 years ago

Comment on attachment 8438705 [details] [diff] [review]
Alter marionette's mozharness script to use structured log data to determine job status

Review of attachment 8438705 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/structuredlog.py
@@ +99,5 @@
>  
>          try:
>              data = json.loads(line)
>          except ValueError:
> +            self.warning("Failed to parse line '%s' as json" % line)

Seemed like a reflection of the other changes to the behavior in this patch, but I don't think it matters.

::: scripts/marionette.py
@@ +50,5 @@
> +    A class that extends StructuredOutputParserHelper such that it can parse
> +    if gecko did not install properly
> +    """
> +
> +    bad_gecko_install = 'Error installing gecko!'

From looking at the code and surrounding comments, it seems this is a specific case the test harness needs to cause a special behavior in mozharness.

James Graham [:jgraham]

Comment 4

•

10 years ago

> ::: scripts/marionette.py
> @@ +50,5 @@
> > +    A class that extends StructuredOutputParserHelper such that it can parse
> > +    if gecko did not install properly
> > +    """
> > +
> > +    bad_gecko_install = 'Error installing gecko!'
> 
> From looking at the code and surrounding comments, it seems this is a
> specific case the test harness needs to cause a special behavior in
> mozharness.

This still seems like a bad approach. Having marionette produce random bits of unstructured data requires every consumer to deal with it in some way.

It seems like the problem is really that marionette pushes the logic for retrying when something fails into mozharness, which generally assumes no retries. But I wonder if the right fix is to log at some level e.g. "error" but to reset the status and tbpl status every time the marionette process is restarted so although the logs would show errors on tbpl, the overall job status would be green (assuming it worked after N retries).

Andrew Halberstadt [:ahal]

Comment 5

•

10 years ago

The gecko install hack isn't used anymore.. feel free to nuke it! (my marionette refactor patch which will land shortly actually removes the ability to even do this in the first place)

Andrew Halberstadt [:ahal]

Comment 6

•

10 years ago

For context, that hack was used back when we didn't have releng-provided emulator builds. Basically we had a snapshot of the emulator up on a server, and for each check-in we would download it and then install the newer gecko into it. To quote chmanchester, this worked about as well as you would expect ;).

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 7

•

10 years ago

Attached patch Alter marionette's mozharness script to use structured log data to determine job status (obsolete) — Details — Splinter Review

Nuke the gecko install hack.

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8438705 - Attachment is obsolete: true

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 8

•

10 years ago

We're going to need to do something like this for each harness as they adopt structured logging, so I think we should put this config option in testing_config_options. Then we can land changes to corresponding mozharness scripts as in-tree harness parts are ready. To test, sounds like we can pass the commandline option to turn this on via in-tree configs for those suites that have them, and via buildbot configs on ash for those that don't. Does this sound reasonable?

James Graham [:jgraham]

Comment 9

•

10 years ago

Sounds pretty reasonable to me.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 10

•

10 years ago

Attached patch Alter marionette's mozharness script to use structured log data to determine job status (obsolete) — Details — Splinter Review

This puts the config option in testing_config_options.

Attachment #8442106 - Flags: review?(aki)

Attachment #8442106 - Flags: review?(ahalberstadt)

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8440033 - Attachment is obsolete: true

Andrew Halberstadt [:ahal]

Comment 11

•

10 years ago

Comment on attachment 8442106 [details] [diff] [review]
Alter marionette's mozharness script to use structured log data to determine job status

Review of attachment 8442106 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, but giving r- because this will cause bustage as is.

::: scripts/marionette.py
@@ -41,5 @@
> -
> -    def parse_single_line(self, line):
> -        if self.bad_gecko_install.search(line):
> -            self.install_gecko_failed = True
> -        super(MarionetteOutputParser, self).parse_single_line(line)

Thanks for getting rid of this!

@@ +157,5 @@
>          self.binary_path = c.get('binary_path')
>          self.test_url = c.get('test_url')
> +        self.parser_class = (StructuredOutputParser
> +                             if   c.get('structured_output')
> +                             else TestSummaryOutputParserHelper)

nit: if a single line if statement is too big for a single line, then use a normal if statement ;)

@@ +426,5 @@
> +        if self.config.get("structured_output"):
> +            # Make sure that the logging directory exists
> +            if self.mkdir_p(dirs["abs_blob_upload_dir"]) == -1:
> +                self.fatal("Could not create blobber upload directory")
> +            cmd.append("--log-raw=-")

Shouldn't the formatted log be output to stdout?

@@ +454,5 @@
>              code = self.run_command(cmd, env=env,
>                                      output_timeout=1000,
>                                      output_parser=marionette_parser)
>              if not marionette_parser.install_gecko_failed:
>                  break

Since you removed the MarionetteOutputParser above, please also remove this for loop. This will fail because marionette_parser no longer has an install_gecko_failed attribute.

Attachment #8442106 - Flags: review?(ahalberstadt) → review-

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8442106 - Flags: review?(aki)

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 12

•

10 years ago

Attached patch Alter marionette's mozharness script to use structured log data to determine job status — Details — Splinter Review

Addressed ahal's comments. The in-tree harness writing the raw log to standout is correct as long as we use the structured parser in mozharness.

Attachment #8442158 - Flags: review?(aki)

Attachment #8442158 - Flags: review?(ahalberstadt)

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8442106 - Attachment is obsolete: true

Andrew Halberstadt [:ahal]

Comment 13

•

10 years ago

Comment on attachment 8442158 [details] [diff] [review]
Alter marionette's mozharness script to use structured log data to determine job status

Review of attachment 8442158 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm, thanks! Aki, I think it's mostly the --structured-output argument in testbase.py that you need to review.

Attachment #8442158 - Flags: review?(ahalberstadt) → review+

Aki Sasaki (not active)

Comment 14

•

10 years ago

Comment on attachment 8442158 [details] [diff] [review]
Alter marionette's mozharness script to use structured log data to determine job status

(In reply to Andrew Halberstadt [:ahal] from comment #13)
> Lgtm, thanks! Aki, I think it's mostly the --structured-output argument in
> testbase.py that you need to review.

--structured-output argument lgtm.
We may want to have a --no-structured-output argument if the config file says structured_output = True and we want to override that, but I'm also fine with leaving that until we actually have a need for it.

Attachment #8442158 - Flags: review?(aki) → review+

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 15

•

10 years ago

Before we can use this for parsing, we need to ensure that all the errors specified here: http://hg.mozilla.org/build/mozharness/annotate/404dfba5c39d/scripts/marionette.py#l141 are logged at the error level by the harness. These all appear to correspond to exceptions thrown in python or javascript, so this should be doable.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 16

•

10 years ago

In the meantime:

https://hg.mozilla.org/build/mozharness/rev/0040022e55be

Chris Cooper [:coop] (he/him)

Comment 17

•

10 years ago

Merged to production, and deployed.

Chris Cooper [:coop] (he/him)

Updated

•

10 years ago

Component: Tools → Mozharness

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 18

•

10 years ago

Attached patch Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status. — Details — Splinter Review

This warrants further testing, but it passes my local sanity check that a failing test results in # TBPL WARNING # for marionette tests.

The goal is to make DesktopUnittestOutputParser obsolete for marionette based tests as well as anything outputting structured logs, meaning the regex in mozharness/testing/errors.py will never be used to determine job status.

This passes in an argument to the parser to determine whether output from the harness other than line delimeted JSON is always a failure. jgraham pointed out we can wrap stdout for the harness process, but I think the result is about the same.

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8442158 - Flags: checked-in+

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8478519 - Flags: feedback?(james)

Attachment #8478519 - Flags: feedback?(ahalberstadt)

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 19

•

10 years ago

Comment on attachment 8478519 [details] [diff] [review]
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status.

Review of attachment 8478519 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/structuredlog.py
@@ +22,5 @@
> +        # will be used to re-interpret formatted logs.
> +        if 'strict' in kwargs:
> +            self.strict = kwargs.pop('strict')
> +        else:
> +            self.strict = False

This should default to true, that's a mistake.

Andrew Halberstadt [:ahal]

Comment 20

•

10 years ago

Comment on attachment 8478519 [details] [diff] [review]
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status.

Review of attachment 8478519 [details] [diff] [review]:
-----------------------------------------------------------------

I think this approach is probably the best we can do at the moment. It does make things more confusing, but it accomplishes what we want: getting mozharness to use structured data for determining status. To me that is a win.

::: mozharness/mozilla/structuredlog.py
@@ +38,5 @@
>          self.worst_log_level = INFO
>          self.tbpl_status = TBPL_SUCCESS
>  
> +
> +    def _log(self, line, level, test_unexpected=False):

This function isn't doing any actual logging, I'd call it a handler or something similar.

::: scripts/marionette.py
@@ +480,5 @@
>          if self.mkdir_p(dirs["abs_blob_upload_dir"]) == -1:
>              # Make sure that the logging directory exists
>              self.fatal("Could not create blobber upload directory")
>  
> +        if self.tree_config.get("structured_output"):

Note: just be aware that this will propagate down the release train, so it can't depend on anything outside of the tree (i.e buildbot-configs or regular mozharness configs).

Attachment #8478519 - Flags: feedback?(ahalberstadt) → feedback+

James Graham [:jgraham]

Comment 21

•

10 years ago

Comment on attachment 8478519 [details] [diff] [review]
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status.

Review of attachment 8478519 [details] [diff] [review]:
-----------------------------------------------------------------

I approve of the general idea of using the structured logs, but I think we need to be careful not to fall back to regexp in places where it isn't actually necessary.

::: mozharness/mozilla/structuredlog.py
@@ +118,5 @@
> +
> +        if "status" in data and data["status"] == "SKIP":
> +            self.skipped += 1
> +
> +        self._log(self.formatter(data), level, test_unexpected)

I am still not terribly happy about the fact that this causes lines to be interpreted both as structured data and using regexp (with strict=false). I'm not sure structured logging is buying us anything as long as we still think that's necessary.

@@ +128,5 @@
>              self.update_levels(TBPL_WARNING, WARNING)
> +        if self.count == 0:
> +            self.update_levels(TBPL_FAILURE, ERROR)
> +        if not self.has_suite_end:
> +            self.log("No suite end message was logged by this test harness", ERROR)

Need to update .log to set the tbpl status if the level is ERROR or higher (or maybe WARNING or higher).

@@ +130,5 @@
> +            self.update_levels(TBPL_FAILURE, ERROR)
> +        if not self.has_suite_end:
> +            self.log("No suite end message was logged by this test harness", ERROR)
> +            self.update_levels(TBPL_FAILURE, ERROR)
> +        if self.num_errors != 0:

This shouldn't be needed if we directly set the status when the action=log message is handled.

@@ +141,5 @@
> +        # currently in the format <pass count>/<fail count>/<todo count
> +        # but <expected count>/<unexpected count>/<skip count> is more reflective of
> +        # the meaning of structured logs, and will yield the relevant information.
> +        if self.unexpected_count > 0:
> +            fail_text = '<em class="testfail">%s</em>' % self.unexpected_count

Is anything actually formatting this as HTML?

::: scripts/marionette.py
@@ +346,5 @@
> +
> +        if self.tree_config.get("structured_output"):
> +            # Also when isinstance(parser, StructuredOutputParser)
> +            tbpl_status, log_level = parser.evaluate_parser(return_code)
> +            if return_code == 10 and tbpl_status == TBPL_WARNING:

10 seems like a magic number. Where does it come from?

@@ +348,5 @@
> +            # Also when isinstance(parser, StructuredOutputParser)
> +            tbpl_status, log_level = parser.evaluate_parser(return_code)
> +            if return_code == 10 and tbpl_status == TBPL_WARNING:
> +                text_status = "test failures"
> +            elif return_code != 0:

Can't this go in the StructuredOutputParser? If we need to we can add an option to turn it off, but for now it seems like a good idea.

Attachment #8478519 - Flags: feedback?(james) → feedback+

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 22

•

10 years ago

Comment on attachment 8478519 [details] [diff] [review]
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status.

Review of attachment 8478519 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/structuredlog.py
@@ +118,5 @@
> +
> +        if "status" in data and data["status"] == "SKIP":
> +            self.skipped += 1
> +
> +        self._log(self.formatter(data), level, test_unexpected)

We can probably do better with marionette, but there are some producers we truly don't control.

It's possible mozharness isn't the place to focus our efforts right now. Maybe that place is treeherder. I do see this change as relatively high risk given complexities in mozharness.

@@ +128,5 @@
>              self.update_levels(TBPL_WARNING, WARNING)
> +        if self.count == 0:
> +            self.update_levels(TBPL_FAILURE, ERROR)
> +        if not self.has_suite_end:
> +            self.log("No suite end message was logged by this test harness", ERROR)

I think this could cause false positives if "log" is called from a superclass.

@@ +130,5 @@
> +            self.update_levels(TBPL_FAILURE, ERROR)
> +        if not self.has_suite_end:
> +            self.log("No suite end message was logged by this test harness", ERROR)
> +            self.update_levels(TBPL_FAILURE, ERROR)
> +        if self.num_errors != 0:

This is how the errors logged by a super class would get propagated to the tbpl status. It's not great.

@@ +141,5 @@
> +        # currently in the format <pass count>/<fail count>/<todo count
> +        # but <expected count>/<unexpected count>/<skip count> is more reflective of
> +        # the meaning of structured logs, and will yield the relevant information.
> +        if self.unexpected_count > 0:
> +            fail_text = '<em class="testfail">%s</em>' % self.unexpected_count

It appears that tbpl and treeherder do so.

::: scripts/marionette.py
@@ +346,5 @@
> +
> +        if self.tree_config.get("structured_output"):
> +            # Also when isinstance(parser, StructuredOutputParser)
> +            tbpl_status, log_level = parser.evaluate_parser(return_code)
> +            if return_code == 10 and tbpl_status == TBPL_WARNING:

Seems unique to marionette among the harnesses.

@@ +348,5 @@
> +            # Also when isinstance(parser, StructuredOutputParser)
> +            tbpl_status, log_level = parser.evaluate_parser(return_code)
> +            if return_code == 10 and tbpl_status == TBPL_WARNING:
> +                text_status = "test failures"
> +            elif return_code != 0:

I think that's right.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 23

•

10 years ago

Reviewing a review didn't do what I expected, but comment 22 consists of replies to the review in comment 21.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 24

•

10 years ago

Comment on attachment 8478519 [details] [diff] [review]
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status.

Review of attachment 8478519 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/structuredlog.py
@@ +46,5 @@
> +        # By interpreting log levels directly and counting them
> +        # against num_errors, a test harness can turn a job red
> +        # by logging at the error level without logging "TEST-UNEXPECTED".
> +        if level in (ERROR, CRITICAL, FATAL):
> +            self.num_errors += 1

Just making a note of an irc discussion to do this in parse_single_line to make this more explicit.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 25

•

10 years ago

Reading bug 1017559 made me understand this a bit better. I think we can be more obvious about the distinction. Now I'm thinking:

Mozlog can look at a set of log messages and give us a status for a test run. This is pretty much: any test_end or test_status where 'expected' is in the json message means test failure (orange), any log action at a level 'ERROR' (maybe 'WARNING') or above is error (red). Doing it in mozlog means our definitions don't have an opportunity to get subtly out of step when this logic is implemented in treeherder or some other tool. This can just be a handler.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 26

•

10 years ago

Submitted too early, here's the rest:

Mozharness can pass messages to this handler (the thing that tells us how a run went based on a set of JSON messages) and set status based on that. For those that request it, mozharness will pass unstructured messages ('log' actions, or any non-json lines left on stdout if we allow that), to a regex based parser to detect anything unexpected that should fail the run. We might still need to detect arbitrary python exceptions (like those raised by mozlog unexpectedly) or output from things we don't control with a fallback, but a test result will never be passed through a regex based parser.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 27

•

10 years ago

Attached patch Provide a mozlog handler to report overall status of a test run for the benefit of mozharness and others — Details — Splinter Review

This is what the handler in mozlog to interpret statuses would look like.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 28

•

10 years ago

Attached patch Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status. (obsolete) — Details — Splinter Review

This is a version of the patch I was working on last week that uses mozlog to set job status (the changes to marionette.py are almost identical). It's not that much simpler than the other version, but I think it's a step in the right direction.

If we agree this is a reasonable way forward, I'd like to land these on ash so we can test the various failure scenarios.

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8482883 - Flags: feedback?(james)

Attachment #8482883 - Flags: feedback?(ahalberstadt)

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8482921 - Flags: feedback?(james)

Attachment #8482921 - Flags: feedback?(ahalberstadt)

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 29

•

10 years ago

(In reply to Chris Manchester (PTO 9/6 - 9/10) [:chmanchester] from comment #28)
> Created attachment 8482921 [details] [diff] [review]
> Modify marionette's mozharness script and the structured output parser to
> use structured logs to determine job status.
> 
> This is a version of the patch I was working on last week that uses mozlog
> to set job status (the changes to marionette.py are almost identical). It's
> not that much simpler than the other version, but I think it's a step in the
> right direction.
> 
> If we agree this is a reasonable way forward, I'd like to land these on ash
> so we can test the various failure scenarios.

I just noticed this doesn't set the log level when the fallback parse_single_line is used. That needs to be fixed.

Andrew Halberstadt [:ahal]

Comment 30

•

10 years ago

Comment on attachment 8482883 [details] [diff] [review]
Provide a mozlog handler to report overall status of a test run for the benefit of mozharness and others

Review of attachment 8482883 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense. I have some optional suggestions that are more of a brainstorm than hard requirements.

::: testing/mozbase/mozlog/mozlog/structured/handlers/statushandler.py
@@ +8,5 @@
> +TBPL_SUCCESS = 'SUCCESS'
> +TBPL_WARNING = 'WARNING'
> +TBPL_FAILURE = 'FAILURE'
> +TBPL_EXCEPTION = 'EXCEPTION'
> +TBPL_RETRY = 'RETRY'

I'd like for these to not have the 'TBPL' prefix.

@@ +16,5 @@
> +MOZLOG_OK = 'OK'
> +MOZLOG_TEST_FAIL = 'FAIL'
> +MOZLOG_ERROR = 'ERROR'
> +
> +_tbpl_statuses = {

Same.

@@ +39,5 @@
> +
> +    def __call__(self, data):
> +        status = MOZLOG_OK
> +        action = data['action']
> +        if action == 'log' and log_levels[data['level']] <= log_levels['ERROR']:

Why stop at ERROR? The status should just be the maximum log level, whether that's WARNING or ERROR or CRITICAL. E.g:

if log_levels[data['level']] > log_levels[self.worst_level]:
    status = self.worst_level = data['level']

It might also be nice to keep track of the number of log messages in each level (I know mozharness wouldn't make use of that, but other things may want to).

@@ +45,5 @@
> +        elif action == 'test_end':
> +            self.count += 1
> +            if 'expected' in data:
> +                self.unexpected += 1
> +                status = MOZLOG_TEST_FAIL

Also, maybe test failures and log levels should be stored separately.

@@ +64,5 @@
> +        if status == MOZLOG_OK or self.status != MOZLOG_OK:
> +            return
> +        self.status = status
> +
> +    def evaluate_status(self, style):

I'd just call this 'evaluate' so that the api is `status.evaluate()` instead of `status.evaluate_status()`.

Attachment #8482883 - Flags: feedback?(ahalberstadt) → feedback+

Andrew Halberstadt [:ahal]

Comment 31

•

10 years ago

Comment on attachment 8482921 [details] [diff] [review]
Modify marionette's mozharness script and the structured output parser to use structured logs to determine job status.

Review of attachment 8482921 [details] [diff] [review]:
-----------------------------------------------------------------

I'm going to hold off feedback on this patch until the first one is reviewed/landed as this is heavily dependent on that. But from a quick glance, it looks ok.

Attachment #8482921 - Flags: feedback?(ahalberstadt)

James Graham [:jgraham]

Comment 32

•

10 years ago

Comment on attachment 8482883 [details] [diff] [review]
Provide a mozlog handler to report overall status of a test run for the benefit of mozharness and others

Review of attachment 8482883 [details] [diff] [review]:
-----------------------------------------------------------------

So I think I like the general idea, but it seems much nicer to do all the evaluation at the end when evaluate is called rather than doing it continually. I also agree with everything ahal said.

::: testing/mozbase/mozlog/mozlog/structured/handlers/statushandler.py
@@ +39,5 @@
> +
> +    def __call__(self, data):
> +        status = MOZLOG_OK
> +        action = data['action']
> +        if action == 'log' and log_levels[data['level']] <= log_levels['ERROR']:

I tend to agree that going for a design where we track everything and, at the end, use the data we've gathered to set the status, rather than doing the thing where we continually update the status as we go along.

e.g. 

in __init__
self.log_count = defaultdict(int)

for each message
if action == "log"
   self.log_count[data["level"]] += 1

When we evaluate the status
worst_level = None
for level in log_levels:
    if level in self.log_count:
        worst_level = level
        break

@@ +45,5 @@
> +        elif action == 'test_end':
> +            self.count += 1
> +            if 'expected' in data:
> +                self.unexpected += 1
> +                status = MOZLOG_TEST_FAIL

We can just set status at the end based on there being > 1 unexpected, right?

@@ +70,5 @@
> +        translation from mozlog statuses is provided."""
> +        diagnostic = None
> +        if not self.suite_ended:
> +            self._update_status(MOZLOG_ERROR)
> +            diagnostic = 'No suite_end message was found for this run'

Is this the right thing to do? It depends if you think that the most probable cause of this error is that the logging is broken (so the suite_end won't ever be produced) or evaluate was called too early. It's not quite clear to me which we should worry about more.

Attachment #8482883 - Flags: feedback?(james) → feedback+

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 33

•

10 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #30)
> Comment on attachment 8482883 [details] [diff] [review]
> Provide a mozlog handler to report overall status of a test run for the
> benefit of mozharness and others
> 
> Review of attachment 8482883 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Makes sense. I have some optional suggestions that are more of a brainstorm
> than hard requirements.
> 
> ::: testing/mozbase/mozlog/mozlog/structured/handlers/statushandler.py
> @@ +8,5 @@
> > +TBPL_SUCCESS = 'SUCCESS'
> > +TBPL_WARNING = 'WARNING'
> > +TBPL_FAILURE = 'FAILURE'
> > +TBPL_EXCEPTION = 'EXCEPTION'
> > +TBPL_RETRY = 'RETRY'
> 
> I'd like for these to not have the 'TBPL' prefix.

I want to make it explicit that these specific strings mean something to TBPL, and users interacting with this can therefore use them directly.

> 
> @@ +16,5 @@
> > +MOZLOG_OK = 'OK'
> > +MOZLOG_TEST_FAIL = 'FAIL'
> > +MOZLOG_ERROR = 'ERROR'
> > +
> > +_tbpl_statuses = {
> 
> Same.

These can lose the prefix, but again want to be explicit that this is an adapter between the meaning of a test run as the mozlog format knows it and other tools.

> 
> @@ +39,5 @@
> > +
> > +    def __call__(self, data):
> > +        status = MOZLOG_OK
> > +        action = data['action']
> > +        if action == 'log' and log_levels[data['level']] <= log_levels['ERROR']:
> 
> Why stop at ERROR? The status should just be the maximum log level, whether
> that's WARNING or ERROR or CRITICAL. E.g:
> 
> if log_levels[data['level']] > log_levels[self.worst_level]:
>     status = self.worst_level = data['level']
> 

I guess these would all translate to TBPL_FAILURE (or maybe WARNING if we need the harness to be able to turn things orange).

> It might also be nice to keep track of the number of log messages in each
> level (I know mozharness wouldn't make use of that, but other things may
> want to).
> 
> @@ +45,5 @@
> > +        elif action == 'test_end':
> > +            self.count += 1
> > +            if 'expected' in data:
> > +                self.unexpected += 1
> > +                status = MOZLOG_TEST_FAIL
> 
> Also, maybe test failures and log levels should be stored separately.
> 
> @@ +64,5 @@
> > +        if status == MOZLOG_OK or self.status != MOZLOG_OK:
> > +            return
> > +        self.status = status
> > +
> > +    def evaluate_status(self, style):
> 
> I'd just call this 'evaluate' so that the api is `status.evaluate()` instead
> of `status.evaluate_status()`.

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 34

•

10 years ago

(In reply to James Graham [:jgraham] from comment #32)
> Comment on attachment 8482883 [details] [diff] [review]
> Provide a mozlog handler to report overall status of a test run for the
> benefit of mozharness and others
> 
> Review of attachment 8482883 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> So I think I like the general idea, but it seems much nicer to do all the
> evaluation at the end when evaluate is called rather than doing it
> continually. I also agree with everything ahal said.
> 
> ::: testing/mozbase/mozlog/mozlog/structured/handlers/statushandler.py
> @@ +39,5 @@
> > +
> > +    def __call__(self, data):
> > +        status = MOZLOG_OK
> > +        action = data['action']
> > +        if action == 'log' and log_levels[data['level']] <= log_levels['ERROR']:
> 
> I tend to agree that going for a design where we track everything and, at
> the end, use the data we've gathered to set the status, rather than doing
> the thing where we continually update the status as we go along.

I think that's right, but the advantage to continually updating is we have a notion of the first non-OK status that gets set. Assuming that's the most relevant status to return (I could still be wrong about that), we can just keep a specific field for that.

> 
> e.g. 
> 
> in __init__
> self.log_count = defaultdict(int)
> 
> for each message
> if action == "log"
>    self.log_count[data["level"]] += 1
> 
> When we evaluate the status
> worst_level = None
> for level in log_levels:
>     if level in self.log_count:
>         worst_level = level
>         break
> 
> @@ +45,5 @@
> > +        elif action == 'test_end':
> > +            self.count += 1
> > +            if 'expected' in data:
> > +                self.unexpected += 1
> > +                status = MOZLOG_TEST_FAIL
> 
> We can just set status at the end based on there being > 1 unexpected, right?

Same comment about keeping the first non-OK status.

> 
> @@ +70,5 @@
> > +        translation from mozlog statuses is provided."""
> > +        diagnostic = None
> > +        if not self.suite_ended:
> > +            self._update_status(MOZLOG_ERROR)
> > +            diagnostic = 'No suite_end message was found for this run'
> 
> Is this the right thing to do? It depends if you think that the most
> probable cause of this error is that the logging is broken (so the suite_end
> won't ever be produced) or evaluate was called too early. It's not quite
> clear to me which we should worry about more.

I guess this is right if we're thinking about mozharness because it takes the place of evaluate_parser, which is done after the test process finishes. But now that I'm thinking about other users, and a world where the result of a run might be available incrementally, this isn't right, and I think this check should be moved back to mozharness.

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Attachment #8482921 - Flags: feedback?(james)

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Depends on: 1063535

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 35

•

10 years ago

Split off bug 1063535 to land the status handler.

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Depends on: 1066785

Chris Manchester (limited bugmail, email directly)

Reporter

Updated

•

10 years ago

Blocks: 1068138

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 36

•

9 years ago

Chris, what is the status of this bug? Looks like you missed to finish it off. I wonder if this bug only exists for the marionette script and not for mozharness in general, e.g. as an enhancement to base/log.py. It would be interesting for our firefox-ui-tests which get run through mozharness, and need to report to Treeherder including the creation of the error_summary_text artifact.

Flags: needinfo?(cmanchester)

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 37

•

9 years ago

I think the suite-agnostic parts of this ended up landing in other bugs. I don't think anything would block landing the marionette part, but I'm sure the patches wouldn't apply cleanly by now.

Note that this bug as it is is about setting job status based on structured logs, not using structured logs for mozharness' own logging. The error summary artifact can and should be added to marionette and can be done independent of this, the other scripts like destktop_unittest.py have examples of how this is done.

Assignee: cmanchester → nobody

Status: ASSIGNED → NEW

Flags: needinfo?(cmanchester)

Paul Yang [:pyang] (away)

Assignee

Comment 38

•

8 years ago

I just turned on structured log in mozharness' marionette script and most of suites look fine.  however config 'strict' need to be turned off because microformats suite pushed non-structured message to stdout.
If that wasn't important I guess we can simply turn it off.
try result with strict on https://treeherder.mozilla.org/#/jobs?repo=try&revision=f25e78b231d8539c8b48e5601a8abc7a43b3e6da
strict off https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb9d8f10533eb5a129c2f0e3cd623a1212683ea5

ni? Chris for suggestion

Flags: needinfo?(cmanchester)

Chris Manchester (limited bugmail, email directly)

Reporter

Comment 39

•

8 years ago

Ideally tests would log everything through structured logging and we would be able to rely on that. This might not be easy for all tests, but I'm not familiar with the microformats suite. As long as we're aware of what's being printed without structured logging and none of it is critical information (such as test results, or an error from the harness intended to be fatal) this might be ok

Flags: needinfo?(cmanchester)

Paul Yang [:pyang] (away)

Assignee

Comment 40

•

8 years ago

Attached patch enable_config (obsolete) — Details — Splinter Review

This should be minimized patch to enable structured log for marionette.
feedback? to see if anything else we need to add.

Attachment #8797054 - Flags: feedback?(ahalberstadt)

Andrew Halberstadt [:ahal]

Comment 41

•

8 years ago

Comment on attachment 8797054 [details] [diff] [review]
enable_config

Review of attachment 8797054 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this looks good to me!

::: testing/mozharness/scripts/marionette.py
@@ +454,5 @@
>          level = INFO
> +        success_codes = None
> +        tbpl_status, log_level = marionette_parser.evaluate_parser(
> +            code,
> +            success_codes=success_codes)

No point in passing in "success_codes" if it's just None anyway, might as well remove that.

Attachment #8797054 - Flags: feedback?(ahalberstadt) → feedback+

Paul Yang [:pyang] (away)

Assignee

Comment 42

•

8 years ago

Attached patch enable_config — Details — Splinter Review

- Remove passing success_code argument
- Rename 'code' to 'return_code'

Assignee: nobody → pyang

Attachment #8797054 - Attachment is obsolete: true

Status: NEW → ASSIGNED

Paul Yang [:pyang] (away)

Assignee

Comment 43

•

8 years ago

Comment on attachment 8797355 [details] [diff] [review]
enable_config

Here is the try result of the patch.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=681a374efcf34de135af62d848a9437973e7aa7d

Looks ok and I'll confirm failed tests are  intermittent.

Attachment #8797355 - Flags: review?(ahalberstadt)

Andrew Halberstadt [:ahal]

Updated

•

8 years ago

Attachment #8482921 - Attachment is obsolete: true

Andrew Halberstadt [:ahal]

Comment 44

•

8 years ago

For the future, you should try to avoid using "all" when selecting unittests. For example, in this case you can use a try syntax like:

try: -b o -p <platforms> -u marionette -t none

This will only run the Mn tests (which is what we care about in this patch) and will help keep test machines available for other pushes.

Andrew Halberstadt [:ahal]

Comment 45

•

8 years ago

Comment on attachment 8797355 [details] [diff] [review]
enable_config

Review of attachment 8797355 [details] [diff] [review]:
-----------------------------------------------------------------

Here are a couple more try runs to test the failure cases:
https://treeherder.mozilla.org/#/jobs?repo=try&author=ahalberstadt@mozilla.com&fromchange=848a6d1091b854fc765d9ca84300a7c4b560bacb&tochange=e74b0ca092f4a27712b35f2588309059066da8ff

First one purposefully adds a test failure, second one purposefully causes a crash (though not on windows). Both have the expected results! I'll take the liberty of landing this.

Attachment #8797355 - Flags: review?(ahalberstadt) → review+

Pulsebot

Comment 46

•

8 years ago

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a0789005be29
Make mozharness use StructuredOutputParser for marionette tests, r=ahal

Ryan VanderMeulen [:RyanVM]

Comment 47

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/a0789005be29

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.