Closed Bug 1294548 Opened 8 years ago Closed 8 years ago

Some NSS-try log files are too huge for Treeherder to parse

Categories

(NSS :: Build, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: camd, Assigned: franziskus)

References

(Depends on 1 open bug)

Details

Attachments

(1 file)

Some of these logs are 2GB, and have tons of repeated information in them, like this:

https://queue.taskcluster.net/v1/task/cCAdbuPmQa6p_aptbdJ9Nw/runs/0/artifacts/public%2Flogs%2Flive_backing.log

Treeherder isn't able to parse logs this large and will skip parsing them.  Please work to limit these log sizes to more like 50 or 100 MB.
wladd: I saw your email mentioned in Treeherder UI.  Would you be able to look at the log above and see if you could reduce the log size?

I'm working on a fix to just reject logs this big, but if you could fix on your end, that'd be great, too.  :)

Thanks!
Flags: needinfo?(wladd)
I'm moving the flag to the person who understands the CI stuff and will be around. I don't know what could have made that monstrosity: it's not ASCII.
Flags: needinfo?(wladd) → needinfo?(ttaubert)
I can't find the task with the inspector and have no idea what test suite produces that much output. Anyway, chances are high that we do have to minimize the output for some test suites.
Flags: needinfo?(ttaubert)
Example task:
https://tools.taskcluster.net/task-inspector/#cCAdbuPmQa6p_aptbdJ9Nw

Example 2GB log:
https://queue.taskcluster.net/v1/task/cCAdbuPmQa6p_aptbdJ9Nw/runs/0/artifacts/public%2Flogs%2Flive_backing.log
(don't load in browser, will hang)

To expand on comment 0 - these logs caused a backlog of tens of thousands of log parsing tasks in Treeherder last night. Whilst it should be Treeherder's responsibility to handle cases like this more gracefully (or perhaps Taskcluster's to not accept a 2GB log being uploaded to S3), it would be good if we could stop try pushes of these types until at least one of the links in the chain is resolved.

Many thanks :-)
That log file contains 39,545,208 lines that are one of these and only 193,787 lines that aren't:

tstclnt: about to call PR_Poll on writable socket !
tstclnt: PR_Poll returned with writable socket !

The usual pattern seems to be "tstclnt: Writing 18 bytes to server", then a run of ~40k lines that are just those two messages alternating, then "tstclnt: using asynchronous certificate validation".

Maybe something isn't quite right with that poll loop in tstclnt.c.
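
For anyone who wants to reproduce the counts above on another log, here is a minimal sketch. The helper name count_poll_noise is hypothetical (it is not part of NSS, Taskcluster or Treeherder); it only assumes the two message strings quoted above and reads the log from stdin, so a multi-GB file is streamed rather than loaded into memory.

```shell
#!/bin/sh
# Hypothetical helper, not part of any existing tool.
# Reads a log on stdin and prints "<poll-noise lines> <other lines>".
count_poll_noise() {
    awk '
        # The two alternating tstclnt messages quoted in this comment.
        /^tstclnt: about to call PR_Poll on writable socket !$/ ||
        /^tstclnt: PR_Poll returned with writable socket !$/ { noise++; next }
        # Everything else counts as a regular log line.
        { other++ }
        END { printf "%d %d\n", noise, other }
    '
}

# Example (assumes the log was saved locally under this name):
#   count_poll_noise < live_backing.log
```

Lines are only counted as noise when they match one of the two messages exactly, so prefixed or timestamped variants would land in the second number.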
See Also: → 1295997
Depends on: 1295997
See Also: 1295997
Depends on: 1294544
This patch removes "-v" from calls in ssl.sh.

I don't really understand the log file from comment 4. I've never seen so many lines of PR_Poll from tstclnt before. The machine must have some issues that cause this. But I think it doesn't hurt to decrease verbosity here.
Assignee: nobody → franziskuskiefer
Attachment #8783620 - Flags: review?(kaie)
Franziskus, which platforms show the large amount of poll status messages from tstclnt? Just one, or multiple platforms?

Was that a sudden large increase, or did it just grow above the thresholds that can be handled? If a sudden increase, we probably should investigate what happened.
Comment on attachment 8783620 [details] [diff] [review]
non-verbose-ssl.patch

I don't know if the tstclnt -v option could ever help with identifying any future regressions.

Please have a look at file ssl_dist_stress.sh for an alternative approach that would make it easy for anyone who needs that output to re-enable -v.

r=kaie, but instead of completely removing -v, please introduce a new variable $verbose (which can be empty by default)
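
A rough sketch of what the suggested variable could look like. This is an assumption for illustration, not the landed patch and not the actual contents of ssl.sh or ssl_dist_stress.sh. The function build_tstclnt_cmd and its port/host arguments are hypothetical; it prints the command line instead of running tstclnt, so the sketch works without an NSS build.

```shell
#!/bin/sh
# Illustrative only: names here are hypothetical, not taken from ssl.sh.
# Empty by default; export verbose="-v" to get the verbose output back.
verbose=${verbose:-}

# Print the tstclnt command line for a given port and host.
build_tstclnt_cmd() {
    # ${verbose} is deliberately unquoted so an empty value adds no argument.
    echo tstclnt -p "$1" -h "$2" ${verbose}
}

# Example:
#   build_tstclnt_cmd 8443 localhost
#   verbose="-v" build_tstclnt_cmd 8443 localhost
```

Keeping the flag in one variable means the default run stays quiet, while someone chasing a regression can turn the output back on without editing every call site.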
Attachment #8783620 - Flags: review?(kaie) → review+
r=kaie is also valid for the patch where you make the suggested change
landed as https://hg.mozilla.org/projects/nss/rev/b4312374170a

I'm closing this now. Please reopen if this happens again (it really shouldn't).
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.27
See Also: → 1343831