Closed Bug 1678896 Opened 4 years ago Closed 3 years ago

Intermittent toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js | test_BHRObserver - [test_BHRObserver : 155] false == true | After xpcshell return code: 0

Categories

(Toolkit :: General, defect, P5)

defect

Tracking

()

RESOLVED FIXED
94 Branch
Tracking Status
firefox-esr91 --- fixed
firefox94 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jstutte)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Filed by: nbeleuzu [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=322563521&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/IQcPIOAgTr-YkS9ugj-oNw/runs/0/artifacts/public/logs/live_backing.log


[task 2020-11-22T08:50:54.954Z] 08:50:54     INFO -  TEST-PASS | toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js | test_BHRObserver - [test_BHRObserver : 154] "number" == "number"
[task 2020-11-22T08:50:54.954Z] 08:50:54  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js | test_BHRObserver - [test_BHRObserver : 155] false == true
[task 2020-11-22T08:50:54.954Z] 08:50:54     INFO -  Z:/task_1606034378/build/tests/xpcshell/tests/toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js:test_BHRObserver/</<:155
[task 2020-11-22T08:50:54.954Z] 08:50:54     INFO -  Z:/task_1606034378/build/tests/xpcshell/tests/toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js:test_BHRObserver/<:148
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  Z:/task_1606034378/build/tests/xpcshell/tests/toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js:test_BHRObserver:140
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  Z:\task_1606034378\build\tests\xpcshell\head.js:_do_main:239
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  Z:\task_1606034378\build\tests\xpcshell\head.js:_execute_test:568
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  -e:null:1
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  exiting test
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  Unexpected exception NS_ERROR_ABORT:
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  _abort_failed_test@Z:\task_1606034378\build\tests\xpcshell\head.js:824:20
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  do_report_result@Z:\task_1606034378\build\tests\xpcshell\head.js:925:5
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  Assert<@Z:\task_1606034378\build\tests\xpcshell\head.js:73:21
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  proto.report@resource://testing-common/Assert.jsm:233:10
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  proto.ok@resource://testing-common/Assert.jsm:259:10
[task 2020-11-22T08:50:54.955Z] 08:50:54     INFO -  test_BHRObserver/</<@Z:/task_1606034378/build/tests/xpcshell/tests/toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js:155:11
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  test_BHRObserver/<@Z:/task_1606034378/build/tests/xpcshell/tests/toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js:148:16
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  test_BHRObserver@Z:/task_1606034378/build/tests/xpcshell/tests/toolkit/components/backgroundhangmonitor/tests/test_BHRObserver.js:140:24
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  _do_main@Z:\task_1606034378\build\tests\xpcshell\head.js:239:6
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  _execute_test@Z:\task_1606034378\build\tests\xpcshell\head.js:568:5
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  @-e:1:1
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  exiting test
[task 2020-11-22T08:50:54.956Z] 08:50:54     INFO -  PID 13716 | Exiting due to channel error.```
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INCOMPLETE
Status: REOPENED → RESOLVED
Closed: 4 years ago3 years ago
Resolution: --- → INCOMPLETE

My pushes over in D124350 seem to fail now consistently here. Reading test_BHRObserver.js it seems we should change that condition to the following:

  // Check that the telemetry service collected pings which make sense
  ok(telSvc.payload.hangs.length - beforeLen >= 3);
  ok(Array.isArray(telSvc.payload.modules));
  telSvc.payload.modules.forEach(module => {
    ok(Array.isArray(module));
    equal(module.length, 2);
    equal(typeof module[0], "string");
    equal(typeof module[1], "string");
  });

  telSvc.payload.hangs.forEach(hang => {
    ok(hang.duration > 0);
    ok(hang.thread == "Gecko" || hang.thread == "Gecko_Child");
    equal(typeof hang.runnableName, "string");

    // hang.stack
    ok(Array.isArray(hang.stack));
    ok(!!hang.stack.length);
    hang.stack.forEach(entry => {
      // Each stack frame entry is either a native or pseudostack entry. A
      // native stack entry is an array with module index (number), and offset
      // (hex string), while the pseudostack entry is a bare string.
      if (Array.isArray(entry)) {
        equal(entry.length, 2);
        equal(typeof entry[0], "number");
>>>     ok(entry[0] < telSvc.payload.modules.length);
        equal(typeof entry[1], "string");
      } else {
        equal(typeof entry, "string");
      }
    });

    // hang.annotations
    equal(typeof hang.annotations, "object");
    Object.keys(hang.annotations).forEach(key => {
      equal(typeof hang.annotations[key], "string");
    });
  });

This might moot this specific failure. IIUC this entire block just checks the written telemetry data for consistency ("is this a series of hangs with stack traces?") but not for the expected content.

Nika, ni? to you as you apparently introduced that check in bug 1390585. It still remains somewhat concerning that this test seems not to be deterministic in terms of the generated output? Is there any hidden magic I missed? Thanks!

Flags: needinfo?(nika)

(In reply to Jens Stutte [:jstutte] from comment #30)

Nika, ni? to you as you apparently introduced that check in bug 1390585. It still remains somewhat concerning that this test seems not to be deterministic in terms of the generated output? Is there any hidden magic I missed? Thanks!

Yeah, I've unfortunately paged out most of the context here, but reading the code again it looks like the use of .hangs.length instead of .modules.length was a typo on my part when I wrote the patch.

Flags: needinfo?(nika)

It still remains somewhat concerning that this test seems not to be deterministic in terms of the generated output? Is there any hidden magic I missed? Thanks!

Yeah, the feature is quite non-deterministic as it is trying to detect hangs, and the information in the report is quite unstable and can't be easily asserted without going through a whole symbolication process (and even then it is unlikely to be stable across builds of Firefox). I don't work on this code anymore, but maybe things have improved enough in BHR that it's possible to make it deterministic now?

Assignee: nobody → jstutte

(In reply to Nika Layzell [:nika] (ni? for response) from comment #32)

Yeah, the feature is quite non-deterministic as it is trying to detect hangs, and the information in the report is quite unstable and can't be easily asserted without going through a whole symbolication process (and even then it is unlikely to be stable across builds of Firefox). I don't work on this code anymore, but maybe things have improved enough in BHR that it's possible to make it deterministic now?

I have not enough context to do more than just repair the typo, I fear. Thanks!

Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0850c74da42f Repair old typo in test_BHRObserver.js that recently caused intermittent failures. r=mossop
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: