Closed Bug 1295997 Opened 9 years ago Closed 6 years ago
Add limit to size of log we will parse
Categories: Tree Management :: Treeherder, defect, P1
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: (Reporter: camd, Assigned: emorley)
Attachments: (3 files, 1 obsolete file)
We should skip parsing logs that are too large so that it doesn't stop our processing if a ridiculously large log (like 2GB) comes in.
Reporter
Updated•9 years ago
Assignee: nobody → cdawson
Reporter
Updated•9 years ago
Reporter
Comment 1•9 years ago
Part of this work is to switch over to requests which helps us determine the log size before we try to parse. So I will likely be fixing bug 1165356 while I'm at it.
Assignee
Updated•9 years ago
Comment 2•9 years ago
Reporter
Comment 3•9 years ago
Comment on attachment 8782981 [details] [review]
[treeherder] mozilla:requests-log-parser-responses > mozilla:master
Hey Ed -- I've been beating my head against a wall on this one for a while now. :) I'd love any ideas you may have. Thanks for taking a look. No huge rush, because I'm going to task-switch to clear my head a bit. :) Perhaps some distance will give me a better perspective...
Attachment #8782981 -
Flags: feedback?(emorley)
Assignee
Comment 4•8 years ago
Comment on attachment 8782981 [details] [review]
[treeherder] mozilla:requests-log-parser-responses > mozilla:master
I've had a look but can't see anything too obvious.
Will take a deeper look when I have more time :-)
Attachment #8782981 -
Flags: feedback?(emorley)
Assignee
Updated•8 years ago
Assignee: cdawson → emorley
Assignee
Comment 5•8 years ago
In addition to the instances in bug 1294548 and bug 1343831, we just had another today.
23:28 <emorley> gbrown: this try run is creating 400MB logs which momentarily caused log parser backlogs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1b2c3fe5bbe2cb539e9513067d333891ed4e511c
23:29 <gbrown> emorley: wow, spectacular failure. sorry. cancelled.
23:30 <emorley> gbrown: np, treeherder should handle this case better (doing so in bug 1294544, though requires changing the way we record log parser failures so we can improve the UX and explain that log parsing was skipped deliberately)
23:31 <emorley> it's just that in the meantime, 500 jobs x 200-400MB logs is quite a bit of parsing time :-)
In today's instance the largest logs were 400MB uncompressed but only 22.5MB compressed (since lots of repetition).
Unfortunately the log parser can only see the Content-Length of the compressed log, so we'll have to see whether we can set a compressed size limit of say 10-20MB to catch these, or whether that would be too low for logs that don't have much repetition.
It might be that for such high-compression cases we'll have to rely instead on the time limit of bug 1294544.
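The difficulty described here -- that compressed size is a poor proxy for uncompressed size when logs are highly repetitive -- can be demonstrated with a small, self-contained sketch. This is purely illustrative (the byte sizes are invented, not taken from real Treeherder logs):

```python
import gzip
import os

# Highly repetitive data (like a log full of near-identical failure lines)
# compresses dramatically, so its compressed size says very little about
# the uncompressed size the parser will actually have to process.
repetitive = b"TEST-UNEXPECTED-FAIL | some repeated log line\n" * 10_000
random_data = os.urandom(len(repetitive))  # incompressible, same length

repetitive_ratio = len(repetitive) / len(gzip.compress(repetitive))
random_ratio = len(random_data) / len(gzip.compress(random_data))

# A single compressed-size threshold treats these two inputs very
# differently: the repetitive log may be orders of magnitude larger
# once decompressed, while the incompressible one barely shrinks at all.
```

This is why a fixed compressed-size limit either lets huge repetitive logs through, or is unfairly strict on logs with little repetition.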
Assignee
Comment 6•8 years ago
00:11 <emorley> gbrown: Aryx: though the jobs do have a failure message that warns about the size (https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/buildbot.py?q=%22Log+file+size%22&redirect_type=single#91) -- perhaps that should truncate the logs down to the max size, so the full log never gets uploaded?
Comment 7•8 years ago
Assignee
Updated•8 years ago
Attachment #8877268 -
Flags: review?(cdawson)
Assignee
Updated•8 years ago
Attachment #8782981 -
Attachment is obsolete: true
Reporter
Updated•8 years ago
Attachment #8877268 -
Flags: review?(cdawson) → review+
Comment 8•8 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/ba4bf09e5ebc1da3bbf46a99e5cf69e605173310
Bug 1295997 - Record the size of the unstructured log download
This will help determine what maximum size threshold is appropriate
to allow the common log sizes, but still prevent the extreme
offenders. The Content-Encoding is also recorded, to check if there
are any other logs being served without gzip.
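The recording described in this commit could be sketched as a small helper that pulls the two header fields of interest. The function and attribute names here are hypothetical, not Treeherder's actual ones:

```python
def download_metadata(headers):
    """Collect the fields described in the commit above: the download
    size (compressed, when the log is served with gzip) and the
    Content-Encoding, so non-gzip logs can be spotted.

    `headers` is a case-sensitive dict standing in for HTTP response
    headers; attribute names are illustrative only.
    """
    return {
        "log_download_size": headers.get("Content-Length"),
        # A missing Content-Encoding means the body is served as-is.
        "log_encoding": headers.get("Content-Encoding", "identity"),
    }
```

For example, a gzip-served log might yield `{"log_download_size": "23592960", "log_encoding": "gzip"}` -- note the size is still a string at this point, which is exactly the problem comment 11 below fixes.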
Assignee
Comment 9•8 years ago
Whilst working on the implementation for blocking downloads I spotted a requests daftness, which I fixed and has now been released:
https://github.com/requests/requests/pull/4137
https://github.com/requests/requests/blob/master/HISTORY.rst#2180-2017-06-14
Once we update to the new release (https://github.com/mozilla/treeherder/pull/2560) we can remove the `closing()` boilerplate from here:
https://github.com/mozilla/treeherder/blob/ba4bf09e5ebc1da3bbf46a99e5cf69e605173310/treeherder/log_parser/artifactbuildercollection.py#L90
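For reference, `contextlib.closing()` adapts any object with a `close()` method for use in a `with` statement. A minimal stand-in class (not the real `requests.Response`) shows the mechanics; the requests change linked above (released in 2.18.0) made `Response` a context manager in its own right, which is what makes the `closing()` boilerplate removable:

```python
from contextlib import closing


class StreamedResponse:
    """Stand-in for a pre-2.18.0 requests.Response: it has close()
    but no __enter__/__exit__, so `with response:` would fail."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


response = StreamedResponse()
# closing() supplies the missing context-manager protocol, guaranteeing
# close() runs (releasing the connection) even if the body raises.
with closing(response):
    pass
```

Once `Response` itself implements `__enter__`/`__exit__`, `with requests.get(url, stream=True) as response:` works directly and the wrapper can be dropped.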
Comment 10•8 years ago
Comment 11•8 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/ce5527c820aba79ddef0e76c8e1756ca11d01499
Bug 1295997 - Send the unstructured log size to New Relic as ints
New Relic Insights doesn't coerce strings to integers, so doesn't allow
the graphing of custom attributes sent as strings. HTTP headers are
always exposed as strings, even for fields that are expected to
represent numbers, so we must explicitly cast Content-Length.
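The cast described in this commit message can be sketched as follows (the helper name is hypothetical, not the actual Treeherder function):

```python
def content_length_bytes(headers):
    """Return Content-Length as an int, or None if the header is absent.

    HTTP header values always arrive as strings, and New Relic Insights
    will not coerce a string attribute to a number for graphing, so the
    cast must happen before the attribute is recorded.
    """
    raw = headers.get("Content-Length")
    return int(raw) if raw is not None else None
```

With this in place, `content_length_bytes({"Content-Length": "23592960"})` yields the integer `23592960`, which Insights can bucket into a histogram.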
Assignee
Comment 12•8 years ago
I've written a query to generate a histogram of download sizes:
https://insights.newrelic.com/accounts/677903/dashboards/339080
The fix in comment 11 was only just deployed, but we'll gradually get more data over the next few days (our non-pro Insights plan keeps 7 days of data).
Assignee
Updated•7 years ago
Component: Treeherder: Data Ingestion → Treeherder: Log Parsing & Classification
Priority: -- → P1
Assignee
Comment 13•7 years ago
This try push created logs that were 486 MB compressed and a whopping 12 GB uncompressed!
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f9627aa74b1083fd2dab6f4f39fae24fcd9ebbd2
Assignee
Comment 14•7 years ago
To summarise the current state here:
I've had a WIP for a while that checks the Content-Length of log files (using requests) before performing the download; however, it needs some additional changes to Treeherder so we can store a "log parsing failed" reason to display in the UI, otherwise people will blame Treeherder rather than their logs.
I also added the New Relic stats annotation above to help pick a threshold, though not everything is compressed, so it's hard to pick a threshold that's fair for both cases (plus, depending on how much duplication there is in the over-verbose log lines, the appropriate compressed threshold can vary dramatically).
Comment 15•6 years ago
Assignee
Updated•6 years ago
Status: NEW → ASSIGNED
Assignee
Comment 16•6 years ago
https://github.com/mozilla/treeherder/commit/52d6017c5b3751d547a445f0a3fc891c3406d52b
Logs whose download size (i.e. before decompression, for logs that are compressed) is larger than 5MB will now be skipped by the log parser.
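The behaviour just described can be sketched as a pre-download check. The 5MB constant matches the limit stated here; the function and exception names are hypothetical, not the actual Treeherder implementation:

```python
MAX_DOWNLOAD_SIZE_BYTES = 5 * 1024 * 1024  # 5MB, before decompression


class LogSizeError(Exception):
    """Raised when a log is skipped, so the reason can be surfaced."""


def check_log_size(headers):
    """Raise LogSizeError if the reported download size exceeds the limit.

    The check uses Content-Length, i.e. the size on the wire: the
    compressed size for gzip-served logs, the raw size otherwise.
    A missing Content-Length is treated as 0 (the download proceeds).
    """
    size = int(headers.get("Content-Length", 0))
    if size > MAX_DOWNLOAD_SIZE_BYTES:
        raise LogSizeError(
            f"Download size {size} exceeds limit {MAX_DOWNLOAD_SIZE_BYTES}"
        )
    return size
```

Skipping before the body is fetched is the point: the parser never spends time (or memory) on a 400MB-uncompressed log, at the cost of relying on the server reporting Content-Length honestly.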
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Assignee
Comment 17•6 years ago
(This will prevent one of the main causes for rabbitmq queue size alerts in production)
Updated•3 years ago
Component: Treeherder: Log Parsing & Classification → TreeHerder