Closed Bug 937181 Opened 6 years ago Closed 6 years ago

mochitest should do xpcshell-style test summarization

Categories

(Testing :: Mochitest, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla30

People

(Reporter: froydnj, Assigned: froydnj)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 2 obsolete files)

(aka "The output from mochitest is TOO DAMN LONG")

I just ran mochitests in a subdirectory that took ~55 minutes and produced 20MB of output.  This is silly.  I don't need to know whether every individual test failed or passed, just whether any tests in a particular file failed.  xpcshell already does this and it's very convenient.

This would make mochitest running a more pleasant experience locally and might even make tbpl runs a little faster by producing less output.

I think this is pretty easy to do in SimpleTest, but I have a feeling the Right Way to do this is to put all the logic in runtests.py.
It's probably easier in runtests.py only because of the volume of output produced that's not from SimpleTest. It's actually not clear to me that this is entirely desirable, since Mochitests are not separated from each other like xpcshell tests are--xpcshell tests run in separate xpcshell processes, whereas Mochitests run sequentially in the same browser, so there's lots of potential for interference. If we were to suppress the output that happened during a test we might suppress clues about why a previous or following test was malfunctioning.
(But I'm supportive of the idea in general, in that Mochitest logs are super noisy.)
Now that we can upload files from automation, we should start focusing on writing machine readable files to capture all important output. Once that's done, we can trim stdout to be just the tests that failed and other exceptional messages.
I was motivated to do this today after waiting around for a single directory (!),
layout/style/test/, to finish its tests.  It takes 20 minutes to run that directory
on my machine and produces 32MB of log output.  With this patch, it takes a minute
and a half and produces ~850KB of log output.

The implementation feels a little complicated.  I would be willing to add an option
to disable this coalescing to address Ted's concern about hiding clues as to why
a test is failing.

I intended to limit this to single file, so e.g. test_huge_test.html would get log
coalescing, while the following test_small_test.html would get to output all its
messages.  That seems to not be what this patch does, but I am not so sure that's
really a bad thing.

Feedback welcome.
Attachment #8368851 - Flags: feedback?(jmaher)
Attachment #8368851 - Flags: feedback?(ted)
Testing shows that this can also help with memory pressure, e.g. in layout/style/test/test_extra_inherit_initial.html on x86-64 Linux:

Without patch, before test:

10324 INFO TEST-INFO | MEMORY STAT vsize after test: 581881856
10325 INFO TEST-INFO | MEMORY STAT residentFast after test: 162660352
10326 INFO TEST-INFO | MEMORY STAT heapAllocated after test: 51752512

Without patch, after test:

50481 INFO TEST-INFO | MEMORY STAT vsize after test: 681824256
50482 INFO TEST-INFO | MEMORY STAT residentFast after test: 252424192
50483 INFO TEST-INFO | MEMORY STAT heapAllocated after test: 113041616

Deltas: vsize +100MB, residentFast: +90MB, heapAllocated +60MB

With patch, before test:

482 INFO TEST-INFO | MEMORY STAT vsize after test: 572485632
483 INFO TEST-INFO | MEMORY STAT residentFast after test: 165752832
484 INFO TEST-INFO | MEMORY STAT heapAllocated after test: 52749048

Note that the numbers are roughly similar, even with test coalescing.  Now with patch, after test:

887 INFO TEST-INFO | MEMORY STAT vsize after test: 585351168
888 INFO TEST-INFO | MEMORY STAT residentFast after test: 184643584
889 INFO TEST-INFO | MEMORY STAT heapAllocated after test: 74922264

Deltas: vsize: +13MB, residentFast: +20MB, heapAllocated +22MB

Patch improvement (delta of deltas, smaller is better): vsize -87MB, residentFast: -70MB, heapAllocated -38MB

I think that's worthwhile.
Blocks: MochiMem
Note that dholbert has been doing work to reduce how long test_extra_initial_inherit takes to run (bug 958075, bug 958802).  This patch makes it almost trivial:

Without patch: 50484 INFO TEST-END | /tests/layout/style/test/test_extra_inherit_initial.html | finished in 318964ms

With patch: 890 INFO TEST-END | /tests/layout/style/test/test_extra_inherit_initial.html | finished in 692ms

That's a huge win.
Comment on attachment 8368851 [details] [diff] [review]
coalesce log messages from SimpleTest

Review of attachment 8368851 [details] [diff] [review]:
-----------------------------------------------------------------

I am confused in general by this patch.  It could be my brain is just parsing it incorrectly.  Maybe more comments.  I know we have a lot of internal state variables already in SimpleTest, this is only making it more complex.  Can this be written (enhanced by comments) to make it more readable?

::: testing/mochitest/tests/SimpleTest/SimpleTest.js
@@ +320,5 @@
> +    // Test suites that have output a lot of messages will probably continue
> +    // to output a lot of messages.
> +    if (!SimpleTest._collapsingLogMessages &&
> +	SimpleTest._numLogMessages > SimpleTest._logMessageCollapseThreshold) {
> +	SimpleTest._collapsingLogMessages = true;

this is set to true, and not reset- why not just set it as true by default?

@@ +328,3 @@
>      var isError = !test.result == !test.todo;
> +    var outputCollapsedMessage = (SimpleTest._collapsingLogMessages
> +				  && SimpleTest._currentCollapsedMessage == SimpleTest._logMessageCollapseThreshold);

so we only output the 100th line?
Attachment #8368851 - Flags: feedback?(jmaher) → feedback-
(In reply to Joel Maher (:jmaher) from comment #7)
> I am confused in general by this patch.  It could be my brain is just
> parsing it incorrectly.  Maybe more comments.  I know we have a lot of
> internal state variables already in SimpleTest, this is only making it more
> complex.  Can this be written (enhanced by comments) to make it more
> readable?

It is complicated, probably because I was trying to ensure that if you had a test with a small amount of output, you would see all the output, and if you had a test with a large amount of output, you would see only some of the output.  Being willing to coalesce messages all the time would make things simpler.

> ::: testing/mochitest/tests/SimpleTest/SimpleTest.js
> @@ +320,5 @@
> > +    // Test suites that have output a lot of messages will probably continue
> > +    // to output a lot of messages.
> > +    if (!SimpleTest._collapsingLogMessages &&
> > +	SimpleTest._numLogMessages > SimpleTest._logMessageCollapseThreshold) {
> > +	SimpleTest._collapsingLogMessages = true;
> 
> this is set to true, and not reset- why not just set it as true by default?

The intent here was that we'd get a new load of SimpleTest.js for each mochitest, and that would take care of the reset.  For plain mochitests, that doesn't seem to be how the test harness works, and if we want to make coalescing the default, that's fine too.

> @@ +328,3 @@
> >      var isError = !test.result == !test.todo;
> > +    var outputCollapsedMessage = (SimpleTest._collapsingLogMessages
> > +				  && SimpleTest._currentCollapsedMessage == SimpleTest._logMessageCollapseThreshold);
> 
> so we only output the 100th line?

No, because this later bit:

+    if (outputCollapsedMessage) {
+	diagnostic += " (elided " + SimpleTest._currentCollapsedMessage + " passes or known failures)"
+	SimpleTest._currentCollapsedMessage = 1;
+    }

causes us to start counting again.
I did a full mochitest try run with this patch and the results were somewhat mixed:

- For browser-chrome debug, this patch wins ~5-10% of runtime.  That's good.
- For browser-chrome opt, this patch helps a little, maybe 2-3%.  Linux64 opt showed a massive 25% win; I'm not quite sure what's going on there.
- For mochitest-other (opt and debug), there's very little win, except on Win8, where we get 25% improvement.  Linux platforms also seem to benefit 5-10% from it.
- For mochitest-{1,2,3,4,5} debug, ~5-10% wins are common.
- For mochitest-{1,2,3,4,5} opt, there's slight-to-none decreases in runtime, and on several platforms (OS X, WinXP, all Android versions), this patch makes us go slightly slower.

Overall, I think this patch is a good idea, except that it needs some polishing so the reviewers understand what it's doing. :(
Does this differ significantly from what ahal has already implemented in bug 957768?
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #10)
> Does this differ significantly from what ahal has already implemented in bug
> 957768?

I think the idea is the same, but my implementation is just in the python harness, so logs are still being emitted, just not printed and there isn't really any perf win. It was basically a quick hack to get around the fact that b2g desktop mochitest logs were surpassing 50MB.

So if this lands and becomes default behaviour, I think we should backout my patch in bug 957768.
Here's a potentially simpler patch that assumes we want to coalesce log messages
all the time.  That gets rid of some tricky logic.

The second improvement is to move all the state required inside of _logResult.
That means that the state machine is at least local to _logResult, rather than
exposed for the rest of SimpleTest to start abusing.  The waitForFocus logic
variables could use this same treatment, Joel, if you think that would be
worthwhile.

If I have done my git-bz modifications appropriately, the patch should be
generated with --ignore-all-space, which should make the modifications a little
easier to follow and verify.
Attachment #8368851 - Attachment is obsolete: true
Attachment #8368851 - Flags: feedback?(ted)
Attachment #8370088 - Flags: feedback?(jmaher)
Comment on attachment 8370088 [details] [diff] [review]
coalesce log messages from SimpleTest

Review of attachment 8370088 [details] [diff] [review]:
-----------------------------------------------------------------

hey, this is a lot simpler- I assume it functions the same?
Attachment #8370088 - Flags: feedback?(jmaher) → feedback+
Yes, the effect is exactly the same as before.  I've added a bit more
explanation as to why we do log message coalescing in this version.
Attachment #8370088 - Attachment is obsolete: true
Attachment #8370127 - Flags: review?(jmaher)
Comment on attachment 8370127 [details] [diff] [review]
coalesce log messages from SimpleTest

Review of attachment 8370127 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mochitest/tests/SimpleTest/SimpleTest.js
@@ +337,5 @@
>          var resultString = test.result ? passString : failString;
>          var url = SimpleTest._getCurrentTestURL();
>          var diagnostic = test.name + (test.diag ? " - " + test.diag : "");
> +        if (outputCoalescedMessage) {
> +            diagnostic += " (elided " + numCoalescedMessages + " passes or known failures)"

nit: add a semicolon here
Attachment #8370127 - Flags: review?(jmaher) → review+
Assignee: nobody → nfroyd
https://hg.mozilla.org/mozilla-central/rev/c8abc650f217
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
So does this mean that my patch in bug 957768 should get backed out? It looks like they do mostly the same thing, though mine is in the python harness, is stupid (just hides all output between TEST-START and TEST-END unless there is an error), and has little to no perf win. Note that the behaviour is disabled unless --hide-subtests is passed in to the harness.

I'm just not clear if they complement each other or are redundant.
Flags: needinfo?(nfroyd)
(In reply to Andrew Halberstadt [:ahal] from comment #18)
> So does this mean that my patch in bug 957768 should get backed out?

I believe so.  You probably want to verify that backing your patch out doesn't cause b2g mochitest output to balloon to tens of megabytes again.
Flags: needinfo?(nfroyd)
Depends on: 969437
Depends on: 973000
Duplicate of this bug: 926576
Depends on: 974073
Depends on: 977198
You need to log in before you can comment on or make changes to this bug.