Closed Bug 1704142 Opened 3 years ago Closed 6 months ago

Errorsummary formatter missing group results in some cases

Categories

(Testing :: Mozbase, defect)

Default
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ahal, Assigned: ahal)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(3 files)

I noticed many of the tasks in this push are missing group results:
https://treeherder.mozilla.org/jobs?repo=try&revision=141ca31b4fd518086a6876783d29514a71bbd227

Some of the failures also had the following traceback in them:

[task 2021-04-09T16:12:33.778Z] 16:12:33     INFO -  mozlog.structuredlog: Failure calling log handler:
[task 2021-04-09T16:12:33.779Z] 16:12:33     INFO -  Traceback (most recent call last):
[task 2021-04-09T16:12:33.779Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/structuredlog.py", line 294, in _handle_log
[task 2021-04-09T16:12:33.780Z] 16:12:33     INFO -      handler(data)
[task 2021-04-09T16:12:33.780Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/handlers/base.py", line 73, in __call__
[task 2021-04-09T16:12:33.781Z] 16:12:33     INFO -      formatted = self.formatter(data)
[task 2021-04-09T16:12:33.781Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/handlers/base.py", line 41, in __call__
[task 2021-04-09T16:12:33.781Z] 16:12:33     INFO -      return self.inner(item)
[task 2021-04-09T16:12:33.782Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/formatters/errorsummary.py", line 27, in __call__
[task 2021-04-09T16:12:33.782Z] 16:12:33     INFO -      rv = BaseFormatter.__call__(self, data)
[task 2021-04-09T16:12:33.783Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/reader.py", line 74, in __call__
[task 2021-04-09T16:12:33.783Z] 16:12:33     INFO -      return handler(data)
[task 2021-04-09T16:12:33.784Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/formatters/errorsummary.py", line 79, in suite_end
[task 2021-04-09T16:12:33.784Z] 16:12:33     INFO -      for group, info in self.groups.items()
[task 2021-04-09T16:12:33.785Z] 16:12:33     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozlog/formatters/errorsummary.py", line 79, in <genexpr>
[task 2021-04-09T16:12:33.785Z] 16:12:33     INFO -      for group, info in self.groups.items()
[task 2021-04-09T16:12:33.786Z] 16:12:33     INFO -  TypeError: unsupported operand type(s) for -: 'NoneType' and 'NoneType'

Though some passing tasks (like this one) are also missing group results and don't have the above traceback.

Assignee: nobody → ahal
Status: NEW → ASSIGNED

There is still investigation needed even after the above patches land.

Keywords: leave-open
Regressed by: 1671422
Attachment #9214754 - Attachment description: Bug 1704142 - [mozlog] Handle case when a manifest has no 'test_end' logs due to a crash, r?marco → Bug 1704142 - [mozlog] Handle duration in errorsummary formatter when no 'test_end' is found, r?marco
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/91b8025a08ff
[mozlog] Refactor 'errorsummary' test into standalone file, r=marco
https://hg.mozilla.org/integration/autoland/rev/6f0a78a02bc6
[mozlog] Handle duration in errorsummary formatter when no 'test_end' is found, r=marco
Depends on: 1705431

Similar case in OVuTZ20wTxa60Rd29DcfLA.

The test that failed was browser/base/content/test/general/browser_save_video_frame.js, belonging to group "browser/base/content/test/general/browser.ini".
There is an error (last line of https://firefoxci.taskcluster-artifacts.net/OVuTZ20wTxa60Rd29DcfLA/0/public/test_info/mochitest-browser-chrome_errorsummary.log), and then no "group_result" line is logged.

Probably some groups didn't even run because of the error, but at least we should have a group_result with PASS for the groups that run and passed and FAIL for the groups that run and failed.

This came up in https://github.com/mozilla/mozci/pull/571, linked to https://github.com/mozilla/mozci/issues/563.

Flags: needinfo?(ahal)

Yeah I think this is a known edge case. Currently we only log group results when we get the suite_end action:
https://searchfox.org/mozilla-central/rev/3d03a3ca09f03f06ef46a511446537563f62a0c6/testing/mozbase/mozlog/mozlog/formatters/errorsummary.py#77

So if there's a crash and no suite_end, then we won't output any group_results. Maybe we could instead log them when the ErrorSummaryFormatter instance is destroyed? Though we might output misleading answers in that case since if there's a crash we would mark the group as passing even though half the tests didn't run. This would be possible to solve, but might be a bit tricky.

Flags: needinfo?(ahal)

(In reply to Andrew Halberstadt [:ahal] from comment #7)

Yeah I think this is a known edge case. Currently we only log group results when we get the suite_end action:
https://searchfox.org/mozilla-central/rev/3d03a3ca09f03f06ef46a511446537563f62a0c6/testing/mozbase/mozlog/mozlog/formatters/errorsummary.py#77

So if there's a crash and no suite_end, then we won't output any group_results. Maybe we could instead log them when the ErrorSummaryFormatter instance is destroyed? Though we might output misleading answers in that case since if there's a crash we would mark the group as passing even though half the tests didn't run. This would be possible to solve, but might be a bit tricky.

Could we output only groups that actually finished running?
We can take note of which ones finished by looking at the group associated to test_start. If it is not the same as the previous test_start, then we know that the previous group ended. For the last group, we know it is ended when we are in suite_end.

Severity: -- → S4

handling test specific crashes in bug 1758984, going to use this bug to deal with leak/timeout/crash on shutdown.

The leave-open keyword is there and there is no activity for 6 months.
:ahal, maybe it's time to close this bug?
For more information, please visit auto_nag documentation.

Flags: needinfo?(ahal)

there are more edge cases to deal with.

Flags: needinfo?(ahal)

any remaining edge cases will have their own bugs, we are really getting down to the rare edge cases now

Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: