Open Bug 1337839 Opened 3 years ago Updated 3 years ago

consider logging all information from test logs now that we are on taskcluster

Categories

(Testing :: Mochitest, defect)


Tracking

(Not tracked)

People

(Reporter: jmaher, Unassigned)

References

(Blocks 1 open bug)

Details

In the past we ran into log size limitations on buildbot (iirc 50MB), so we suppressed log information from tests unless there was a failure. This keeps our log sizes smaller and theoretically reduces runtime.

Now that we no longer run tests in buildbot for linux/android (and soon osx), we should start running tests with the full log output.

I would like to:
* figure out what other reasons we have for reduced log sizes
* determine what issues this might bring up in taskcluster
* determine the storage requirements
* see if this causes our tests to be more flaky!!!

There were two implementations of hiding log results from non-failing tests for two different reasons:
1) bug 957768, implemented because some B2G logs were over 50MB which caused issues with buildbot. This code was actually backed out in bug 969446 in favor of...
2) bug 937181, implemented because it made running tests a lot faster, and it made the test output more manageable locally. We could test turning that off in a try push to see if the test speedups are still there. froydnj also noted that it helped with browser memory pressure, probably because we generate a lot of objects doing logging. It's possible we could optimize the in-browser portion of the logging routines to help with that.

I'm generally in favor of logs in automation being verbose, since it's hard to diagnose problems if you don't have enough info, and we already have tooling to summarize errors from logs so most people don't have to grovel through all the log content anyway. If the current implementation is still a significant speedup for test runtime though, I'm not sure it's worth spending a bunch of developer time to try to get that speedup while still logging verbose logs.
My understanding is that we keep a buffer of the most recent log messages and write out that buffer when a test hits a failure. That makes it seem like we're already doing almost all the work for logging--we're just not writing the data to disk. And that seems like the easy (and fast) part. Maybe I'm missing something?
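For illustration, the buffering described above might look roughly like the sketch below. This is not the mochitest harness code: `FailureBufferedLogger`, `emit`, and `max_buffered` are hypothetical names, and treating only PASS messages as bufferable is an assumption. Each message is stamped when it is logged rather than when it is flushed, and the bounded deque shows why buffered context only goes back a fixed number of messages.

```python
import time
from collections import deque


class FailureBufferedLogger:
    """Hold recent passing messages in memory; write them out only on failure.

    Hypothetical sketch, not the real harness implementation.
    """

    def __init__(self, emit, max_buffered=100):
        self.emit = emit  # callable that writes one formatted line
        # Bounded buffer: once full, the oldest message is silently dropped.
        self.buffer = deque(maxlen=max_buffered)

    def log(self, status, message, timestamp=None):
        # Record the time at logging, so a later flush keeps the original time.
        stamp = time.time() if timestamp is None else timestamp
        record = (stamp, status, message)
        if status == "PASS":
            self.buffer.append(record)
            return
        # Any non-passing status flushes the buffered context first,
        # then writes the failing message itself.
        while self.buffer:
            self._write(self.buffer.popleft())
        self._write(record)

    def test_end(self):
        # A test that ended without failure discards its buffered messages.
        self.buffer.clear()

    def _write(self, record):
        stamp, status, message = record
        self.emit(f"{stamp:.3f} {status} {message}")
```

With `max_buffered=2`, logging three passing messages followed by a failure writes only the last two passing messages and then the failure, each with the time it was originally logged.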

The reason I don't like the buffer is that it doesn't go back very far and it also seems to mess up the timestamps of the buffered log messages. The latter is especially bad when you're trying to debug a slow test.
If we ever get to hyper chunking, it would probably negate whatever reason we had for buffering in the first place.
FWIW we should redo the measurements in bug 937181 comments 4 and 5 if we decide to change this.