Open Bug 1591678 Opened 5 years ago Updated 9 months ago

task with message 'TEST-UNEXPECTED-FAIL | leakcheck | tab missing output line for total leaks!' set as successful

Categories

(Testing :: Mozbase, defect, P2)

defect

Tracking

(firefox72 fixed)

REOPENED
Tracking Status
firefox72 --- fixed

People

(Reporter: aryx, Assigned: ahal)

References

Details

Attachments

(1 file, 1 obsolete file)

Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/ae350e8881e5
add process name to regexps for harness errors to let their occurrences trigger the task to fail. r=jlund
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
See Also: → 1535922

Can somebody please look at fixing this? This failure is still not turning jobs orange. Thanks.

The green jobs here are an example of this: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&searchStr=linux%2C18.04%2Cx64%2Cdebug%2Cmochitests%2Ctest-linux1804-64%2Fdebug-mochitest-plain-e10s-&revision=821bbb36f7d37f1e2dd73d193e728712a55c143d&selectedTaskRun=F02_m2pHQ0yH2GZhGiad-g.0

Fortunately a sheriff noticed that this was happening so we were able to back out my patch that caused it, but it would be better to not have to rely on that. It would also be harder to notice if the failure to create a log was only happening in some test directories instead of in all of them.

Flags: needinfo?(aryx.bugmail)
See Also: → 1649331

See the 'Failure Summary' tab for this task as example.

For missing output line for total leaks!, self.leaked gets set to false (src. In other words, nothing tracks it as an error and the logic to set the task as failure will keep the default TBPL_SUCCESS.

Shall in this case self.num_errors be set and self.tbpl_status be set based on that?

Flags: needinfo?(aryx.bugmail) → needinfo?(gbrown)

I don't recall ever being involved in mozharness leak reporting -- I may not know what I'm talking about.

I note the harness_error_re is defined by https://searchfox.org/mozilla-central/rev/7ec7ee4a9bde171ba195ab46ed6077e4baaef34d/testing/mozharness/mozharness/mozilla/testing/errors.py#102. Reading through https://searchfox.org/mozilla-central/rev/7ec7ee4a9bde171ba195ab46ed6077e4baaef34d/testing/mozharness/mozharness/mozilla/testing/unittest.py#159-168, I think the intent has been:

    if "negative leaks caught!" or "\d+ bytes leaked" is found on a "TEST-UNEXPECTED-FAIL" line:
         self.leaked = True
   elif "missing output line for total leaks!" is found on a "TEST-UNEXPECTED-FAIL" line:
         self.leaked = None
   else:
         self.leaked = False

With https://bugzilla.mozilla.org/show_bug.cgi?id=1591678#c3, "<something> missing output line for total leaks!" is now treated the same as "missing output line for total leaks!" was originally - that seems right.

I don't see any existing code that links self.leaked to success/failure of the mozharness run.

I don't think self.num_errors should be modified for leaks.

If we want to fail, I think this is sufficient:
self.tbpl_status = self.worst_level(TBPL_FAILURE, self.tbpl_status, levels=TBPL_WORST_LEVEL_TUPLE)
but I note that we do not currently do that for other leak conditions, nor for crashes (which seems odd/wrong).

Flags: needinfo?(gbrown)

Andrew, can you provide insight into the reasoning for the current behavior of the leak reporting and guide how it shall be done?

Flags: needinfo?(ahal)

It's also been a long time since I've looked at mozharness log parsing and status setting, but those regexes should only be used for unstructured output. The mochitest harness uses mozlog actions to handle leak messages:
https://searchfox.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/structuredlog.py#22

I believe this is a bug in mozlog's StatusHandler here:
https://searchfox.org/mozilla-central/source/testing/mozbase/mozlog/mozlog/handlers/statushandler.py#79

Basically we default the bytes value to 0, though according to the other formatters, a None value means the total line is missing. I'll put up a patch shortly, though may be a little bit before I can test it. I'll leave the needinfo to remind myself.

Assignee: nobody → ahal
Flags: needinfo?(ahal)
Severity: normal → S2
Component: Applications: MozharnessCore → Mozbase
Priority: -- → P2
Product: Release Engineering → Testing
QA Contact: jlund
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b932e78a6bbc
[mozlog] Ensure missing leak totals count as a failure in the StatusHandler, r=gbrown

Thank you for fixing this!

Status: REOPENED → RESOLVED
Closed: 5 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
Regressions: 1651716
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
No longer regressions: 1651716

Any interest in the remaining unlanded patch?

Flags: needinfo?(ahal)

Yeah, I think it's still something we'd want. Though if I'm remembering properly fixing the thing that caused the backout was going to be tricky. There should be more context in bug 1651716 (I see there have been many comments since the backout that I haven't been following though).

Flags: needinfo?(ahal)
Attachment #9162304 - Attachment is obsolete: true
Severity: S2 → S4
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: