test-verify tasks often have bad errorsummary.log and raw.log information
Categories
(Testing :: General, defect, P1)
Tracking
(firefox79 fixed)
Tracking | Status | |
---|---|---|
firefox79 | --- | fixed |
People
(Reporter: marco, Assigned: gbrown)
References
(Blocks 2 open bugs)
Details
Attachments
(1 file)
E.g. in GHFio16nR1auPd8oTMS6jw browser/extensions/doh-rollout/test/browser/browser_dirtyEnable.js
failed, but the errorsummary.log is:
{"action": "test_groups", "line": 5, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 141, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 147, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 236, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 239, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 330, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 333, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 426, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 429, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 519, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 522, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 616, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 622, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 760, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
{"action": "test_groups", "line": 763, "groups": ["browser/extensions/doh-rollout/test/browser/browser.ini"]}
{"status": "OK", "action": "group_result", "line": 777, "group": "browser/extensions/doh-rollout/test/browser/browser.ini"}
Similarly in TsTTjrsmRES41BGmliS7Sg, netwerk/test/unit/test_trr_case_sensitivity.js
failed, but the errorsummary.log file is:
{"action": "test_groups", "line": 7, "groups": ["netwerk/test/unit/xpcshell.ini"]}
{"action": "test_groups", "line": 38, "groups": ["netwerk/test/unit/xpcshell.ini"]}
Reporter | ||
Comment 1•4 years ago
|
||
The raw log is also wrong, which means data about test-verify is also bogus in ActiveData.
Assignee | ||
Comment 2•4 years ago
|
||
test-verify runs the same test multiple times, so repeated groups are not unexpected, but I don't understand why many successful mochitests show ~13 runs in the errorsummary.log, where such test-verify tasks would run the test 30 (10+5+10+5) times.
Reporter | ||
Comment 3•4 years ago
|
||
:gbrown I'm not so concerned about the repeated groups, but about the fact that failures might not be reported (in the two cases above, they weren't).
Reporter | ||
Comment 4•4 years ago
|
||
Any chance you could prioritize this? It'd be nice to be able to use test-verify data to train the scheduler!
Assignee | ||
Comment 5•4 years ago
|
||
Let's see what I can do....
Assignee | ||
Comment 6•4 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #2)
test-verify runs the same test multiple times, so repeated groups are not unexpected, but I don't understand why many successful mochitests show ~13 runs in the errorsummary.log, where such test-verify tasks would run the test 30 (10+5+10+5) times.
This is actually expected. For a single mochitest TV run, there are 13 group entries in errorsummary.log, corresponding to 13 suite_start logging events: 1 suite_start for step 1 (repeat 10 times in 1 browser session), 5 for step 2 (5 browser sessions), 1 for step 3, 5 for step 4, and finally 1 extra suite_start for the mochitest summary logging (a harmless kludge).
Now, why would failures not be reflected in the errorsummary...
Assignee | ||
Comment 7•4 years ago
|
||
I don't see anything specific to test-verification which would make the errorsummary differ from a corresponding result from a mochitest/reftest/xpcshell-test.
If I force a test failure in a TV task, the errorsummary correctly records the failure(s):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=152b6eae8d650f6427a3aef543d7db47c527c15f
Assignee | ||
Comment 8•4 years ago
|
||
But, a TV task that verifies more than one test in the same suite will run the test harness multiple time, and each time will produce a new errorsummary (and raw) log -- and each time will use the same log file name(s): Log files from the first test will be over-written by the next.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 9•4 years ago
|
||
:marco -- The issue here is that a test-verify task sometimes runs a test harness multiple times, and each run over-writes the logs from the previous run (oops!). Would it be okay to have multiple errorsummary (and raw) logs for test-verify tasks? It would look something like this:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d003f6a6238fe8da96e17f95e1349731223e0b53
Reporter | ||
Comment 10•4 years ago
|
||
It would work for me, I don't know if other tools might be depending on the existence of a single errorsummary.log file though (maybe Treeherder?).
Comment 11•4 years ago
|
||
I don't see problems with it (other than the irony of the fact that we've been trying to eliminate multiple errorsummaries by running one suite per task). But overwriting errorsummaries is objectively worse.. so no issues here. Maybe in the long term we could have suite specific TV tasks (not sure if this actually solves a problem or not, just seems like it would make things more intuitive).
One comment is that the number makes it look like it corresponds to a specific chunk. I'd vote we call it mochitest-plain_run1_errorsummary.log
or something like that.
Comment 12•4 years ago
•
|
||
(In reply to Marco Castelluccio [:marco] from comment #10)
It would work for me, I don't know if other tools might be depending on the existence of a single errorsummary.log file though (maybe Treeherder?).
Also TV already emits more than one errorsummary, so not sure this would break anything more than it already does.
Comment 13•4 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #10)
It would work for me, I don't know if other tools might be depending on the existence of a single errorsummary.log file though (maybe Treeherder?).
Treeherder would handle this fine. It parses any number of logs that end with _errorsummary.log
. Thanks for checking. :)
Assignee | ||
Comment 14•4 years ago
|
||
In all mozharness scripts running test-verify (or per-test coverage), use distinct
file names for errorsummary and raw logs. This usually means that the command needs
to be built inside the per-test loop.
Reporter | ||
Comment 15•4 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #9)
:marco -- The issue here is that a test-verify task sometimes runs a test harness multiple times, and each run over-writes the logs from the previous run (oops!). Would it be okay to have multiple errorsummary (and raw) logs for test-verify tasks? It would look something like this:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d003f6a6238fe8da96e17f95e1349731223e0b53
Just noticed though that there is a problem. In test-android-em-7.0-x86_64/debug-geckoview-test-verify-e10s (SPVn270YSmu0Hh8lBPL96Q), there is a xpcshell failure, but the xpcshell_errorsummary.log doesn't report it.
Assignee | ||
Comment 16•4 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #15)
Yes, that try push did not have the android changes included in the final patch -- so expected and nothing to worry about.
Comment 17•4 years ago
|
||
Comment 18•4 years ago
|
||
bugherder |
Description
•