Closed Bug 1209463 Opened 4 years ago Closed 2 years ago

Print a master summary when running multiple suites from |mach test| or |mach mochitest|

Categories

(Testing :: General, defect, critical)

defect
Not set
critical

Tracking

(firefox-esr52 wontfix, firefox57 wontfix, firefox58 wontfix, firefox59 fixed)

RESOLVED FIXED
mozilla59
Tracking Status
firefox-esr52 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- fixed

People

(Reporter: standard8, Assigned: ahal)

References

(Blocks 1 open bug)

Details

(Keywords: regression, Whiteboard: [PI:December])

Attachments

(8 files)

On Loop, we typically are running commands like:

./mach mochitest browser/components/loop/test/mochitest/ browser/components/uitour/test/browser_UITour_loop.js  browser/base/content/test/general/browser_devices_get_user_media_about_urls.js browser/base/content/test/general/browser_parsable_css.js

If one of the early tests fail, then when that section exits, there's a summary, however, it continues running the tests, and when it ends, it looks like everything finished successfully.

I'm pretty sure this is a regression, as I think it used to exit straight away.

We either need a "stop on first failure" or a summary at the end of the individual runs and their states.
I found some more information on this. If I run:

./mach mochitest browser/components/loop/test/mochitest browser/components/uitour/test/browser_UITour_loop.js browser/base/content/test/general/browser_devices_get_user_media_about_urls.js browser/base/content/test/general/browser_parsable_css.js

as per comment 0, then it fails to report properly. However, if I run:

./mach mochitest browser/components/loop/test/mochitest browser/components/uitour/test/browser_UITour_loop.js browser/base/content/test/general/browser_devices_get_user_media_about_urls.js

it reports errors properly. I took a look but I've not really got any idea of what's going on here.
See Also: → 1338240
This also fails to report if you're running a directory that is covered with a subsuite as well, e.g.

./mach mochitest browser/components/places/tests/browser/

The main set run, and then the clipboard sub-suite runs. If one of the main tests fail, you don't get to find out about it without scrolling back through the log.
Severity: major → critical
Summary: If a test fails running ./mach mochitest <list of tests>, there's no summary at the end to tell that a test has failed → If a test fails running ./mach mochitest <list of tests>, there's no summary at the end to tell that a test has failed (e.g. due to multiple suites)
Is there any way we could get this looked at?

It feels like something developers can easily miss and cause bad pushes... or waste time as they've either got to watch the log as it goes through or scroll back up to confirm.
:ahal, could you look into this bug in the shorter term?
Flags: needinfo?(ahalberstadt)
Whiteboard: [PI:November]
Yeah, this is something I've thought about a fair bit in the past. It's especially bad when using |mach test|. I agree this is a problem, but just want to note that a non-zero return code will let you know if anything failed (still need to search backscroll to see what the failure was though).

My ideas on fixing this revolved around extending the mozlog data format to allow specifying up front that multiple suites are expected to run. With this, we could modify formatters to print out a summary of all results after the last suite's "suite-end" is received.  We could also do cool things like use the footer that |mach build| has to track progress.

This will likely be a decently complicated change, so I think November is a bit ambitious, but I'll see how far I can get.
Assignee: nobody → ahalberstadt
Status: NEW → ASSIGNED
Flags: needinfo?(ahalberstadt)
Depends on: 1419826
Component: Mochitest → General
Summary: If a test fails running ./mach mochitest <list of tests>, there's no summary at the end to tell that a test has failed (e.g. due to multiple suites) → Print a master summary when running multiple suites from |mach test| or |mach mochitest|
Blocks: 1421799
Blocks: mach-test
Here's an example screenshot of the summary in the 'mach' format.

Also here's some try runs (android failures fixed in the subsequent pushes):
https://treeherder.mozilla.org/#/jobs?repo=try&author=ahalberstadt@mozilla.com&tochange=6c928ea4e89b42f0933793622c900b3171504caa&fromchange=203d37c2570f86f24f9fe2532c06e459b14a7c0e
Whiteboard: [PI:November] → [PI:December]
Attachment #8932149 - Flags: review?(gbrown)
Just found a |mach test| regression from that commit, will have a fix up shortly. The other commits should be unaffected.
Comment on attachment 8933383 [details]
Bug 1209463 - [mochitest] Stop re-logging errors at the end of a mochitest run,

https://reviewboard.mozilla.org/r/204276/#review210636

::: testing/mochitest/runtests.py:2578
(Diff revision 2)
>  
>              # If we are using --run-by-manifest, we should not use the profile path (if) provided
>              # by the user, since we need to create a new directory for each run. We would face
>              # problems if we use the directory provided by the user.
>              tests_in_manifest = [t['path'] for t in tests if t['manifest'] == m]
> -            result = self.runMochitests(options, tests_in_manifest)
> +            res = self.runMochitests(options, tests_in_manifest)

For improved readability, consider

s/res/manifest_result/

and/or

s/result/overall_result/
Attachment #8933383 - Flags: review?(gbrown) → review+
Comment on attachment 8933385 [details]
Bug 1209463 - [mozlog] Add a 'summary_on_shutdown' attribute to MachFormatter,

https://reviewboard.mozilla.org/r/204280/#review210660
Attachment #8933385 - Flags: review?(james) → review+
Comment on attachment 8932150 [details]
Bug 1209463 - [mozlog] Create a SummaryHandler class and refactor the 'mach' formatter to use it,

https://reviewboard.mozilla.org/r/203186/#review210666

::: testing/mozbase/mozlog/mozlog/formatters/machformatter.py:59
(Diff revision 4)
>          self.last_time = None
>          self.terminal = terminal
>          self.verbose = False
>          self._known_pids = set()
>  
> -        self.summary_values = {"tests": 0,
> +        self.summary = SummaryHandler()

Somehow this patch breaks |mach web-platform-tests|. I think maybe this `SummaryHandler()` attribute isn't shared across the `threading` module?

Either way, it results in this logger:
https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py#318

having 0 handlers. I'll leave the review flag up for now, maybe there's a better way of setting this up.
Comment on attachment 8932150 [details]
Bug 1209463 - [mozlog] Create a SummaryHandler class and refactor the 'mach' formatter to use it,

https://reviewboard.mozilla.org/r/203186/#review210666

> Somehow this patch breaks |mach web-platform-tests|. I think maybe this `SummaryHandler()` attribute isn't shared across the `threading` module?
> 
> Either way, it results in this logger:
> https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py#318
> 
> having 0 handlers. I'll leave the review flag up for now, maybe there's a better way of setting this up.

This is weird, |mach web-platform-tests| is broken even if not using the `mach` formatter. This patch shouldn't even affect the other formatters :/
Attachment #8932150 - Flags: review?(james)
Attached patch Debug patchSplinter Review
This is caused by some very bizarre python import mechanics. With the attached patch, you can see that using `from mozlog.structuredlog import StructuredLogger` works, but using `from mozlog import structuredlog; structuredlog.StructuredLogger` doesn't work.

I tracked this down to the `from ..handlers import SummaryHandler` line in machformatter.py. Upon removing that import, both cases work again. Why this happens is beyond my python knowledge. I'm tempted to just fix this by using the former "import method" and not understanding why.
Changing this line:
https://searchfox.org/mozilla-central/source/testing/mozbase/mozlog/mozlog/handlers/base.py#10

to:
from mozlog.structuredlog import log_levels

fixes the behaviour I saw. I don't understand why, but it is good enough for me. I'll re-push the patches soon.
All of the issues I'm aware of have been fixed. Sorry for the review churn.
Comment on attachment 8932149 [details]
Bug 1209463 - [test] Use a common logger across suites from |mach test| and other test commands,

https://reviewboard.mozilla.org/r/203184/#review211064
Attachment #8932149 - Flags: review?(gbrown) → review+
Comment on attachment 8933384 [details]
Bug 1209463 - [test] Log suite names in test harnesses,

https://reviewboard.mozilla.org/r/204278/#review211066
Attachment #8933384 - Flags: review?(gbrown) → review+
An observation from testing: the summary is quite terse. Consider making it a little more verbose.

Example (mach test <a-mochitest-dir> on Android) currently says:

Overall Summary
mochitest-plain: 28/28

I would prefer something like:

Overall Summary
mochitest-plain: 28 passed of 28 tests run
Try it with the 'mach' formatter :).

I plan on switching the 'mach' format to be the default very soon in bug 1421799, so the tbpl summary was mostly just a fill-in until that happens. That said, we could still expand the 'tbpl' summary if you think it would be useful.
Comment on attachment 8932150 [details]
Bug 1209463 - [mozlog] Create a SummaryHandler class and refactor the 'mach' formatter to use it,

https://reviewboard.mozilla.org/r/203186/#review210658

::: testing/mozbase/mozlog/mozlog/formatters/machformatter.py:145
(Diff revision 4)
> -        has_subtests = self.summary_values["subtests"] > 0
> +        rv = ["", suite, "~" * len(suite)]
> -        has_assert_counts = self.summary_values["assertion_counts"] > 0
>  
> -        test_count = self.summary_values["tests"]
> -        components = ["%i parents" % self.summary_values["tests"]]
> -        if has_subtests:
> +        # Format check counts
> +        checks = self.summary.aggregate('count', count)
> +        rv.append("Ran {} checks ({})".format(sum(checks.values()),

I don't really know what a "check" is in this case.

::: testing/mozbase/mozlog/mozlog/formatters/machformatter.py:178
(Diff revision 4)
>  
> -        if not self.summary_values["unexpected"]:
> +        # Format status
> +        if not any(count[key]["unexpected"] for key in ('test', 'subtest', 'assert')):
>              rv.append(term.green("OK"))
>          else:
> -            heading = "Unexpected Results"
> +            heading = "Failures"

That's misleading. An unexpected result can be a PASS.

::: testing/mozbase/mozlog/mozlog/handlers/summaryhandler.py:125
(Diff revision 4)
> +        count = self.current['counts']
> +        count['subtest']['count'] += 1
> +
> +        if 'expected' in data:
> +            count['subtest']['unexpected'][data['status'].lower()] += 1
> +            if data['test'] not in failures:

Calling this "failures" here is also misleading. They aren't "tests that fail", they are "tests that don't get the expected result".

::: testing/mozbase/mozlog/mozlog/formatters/machformatter.py:121
(Diff revision 5)
> -        self.summary_unexpected = []
>          num_tests = reduce(lambda x, y: x + len(y), data['tests'].itervalues(), 0)
>          return "%i" % num_tests
>  
>      def suite_end(self, data):
> +        return self._format_suite_summary(self.summary.current_suite, self.summary.current)

I feel a little like this is over-strong coupling, but maybe it's fine.
Attachment #8932150 - Flags: review?(james) → review+
Comment on attachment 8933385 [details]
Bug 1209463 - [mozlog] Add a 'summary_on_shutdown' attribute to MachFormatter,

https://reviewboard.mozilla.org/r/204280/#review213030
Comment on attachment 8933386 [details]
Bug 1209463 - [mozlog] Print an overall summary from the tbplformatter,

https://reviewboard.mozilla.org/r/204282/#review213032
Attachment #8933386 - Flags: review?(james) → review+
Comment on attachment 8932150 [details]
Bug 1209463 - [mozlog] Create a SummaryHandler class and refactor the 'mach' formatter to use it,

https://reviewboard.mozilla.org/r/203186/#review210658

> I don't really know what a "check" is in this case.

It means one of 'tests', 'subtests' or 'asserts'. The old format used to look like:

    Ran 10 tests (1 parent, 5 subtests, 4 asserts)

I thought this was a bit confusing, since normally what the formatter calls a `parent` is usually just referred to as a `test`. I thought something like:

    Ran 10 checks (1 test, 5 subtests, 4 asserts)
    
was a bit less confusing.
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f18122fa2324
[mochitest] Stop re-logging errors at the end of a mochitest run, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/6aae83bee604
[test] Log suite names in test harnesses, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/d3afe2be8685
[test] Use a common logger across suites from |mach test| and other test commands, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/bdecf757602b
[mozlog] Create a SummaryHandler class and refactor the 'mach' formatter to use it, r=jgraham
https://hg.mozilla.org/integration/autoland/rev/ee8cf2d008e2
[mozlog] Add a 'summary_on_shutdown' attribute to MachFormatter, r=jgraham
https://hg.mozilla.org/integration/autoland/rev/de51994befaf
[mozlog] Print an overall summary from the tbplformatter, r=jgraham
Blocks: 1483003
You need to log in before you can comment on or make changes to this bug.