Open
Bug 1492805
Opened 7 years ago
Updated 3 years ago
Some logs fail to parse due to "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 467: unexpected end of data"
Categories
(Tree Management :: Treeherder, defect, P3)
Tracking
(Not tracked)
NEW
People
(Reporter: dluca, Unassigned)
Logs on both autoland and inbound are not parsed.
https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed,busted,exception&classifiedState=unclassified&fromchange=7c4bf9e1d72e7c80e573de221d36fe02a84dfdc9&group_state=expanded&selectedJob=200369075
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&resultStatus=testfailed,busted,exception&classifiedState=unclassified&fromchange=5699c66801f5cbbc71548e5bada8276c22f20994&group_state=expanded&selectedJob=200395532
Comment 1•7 years ago
Both have this failure in the log:
[task 2018-09-20T05:48:04.764Z] 05:48:04 WARNING - TEST-UNEXPECTED-FAIL | dom/base/test/unit/test_xmlserializer.js | run_test/< - [run_test/< : 36] "<?xml version=\\"1.0\\" encoding=\\"UTF-8\\"?>\\r<foo/>\\r<!-- é -->" == "<?xml version=\\"1.0\\" encoding=\\"UTF-8\\"?>\\r<!DOCTYPE whatever PUBLIC \\"-//MOZ//WHATEVER//EN\\" \\"http://mozilla.org/ns/foo\\">\\r<foo xmlns=\\"htp://mozilla.org/ns\\">\\r <baz/><!-- a comment --> <bar> <robots> & <aliens>\\r<mozilla> a a a a a éèàùûî</mozilla>\\r <firefox>Lorem ip<!-- aaa -->sum dolor sit amet, consectetuer adipiscing elit. Nam eu sapien. Sed viverra lacus. Donec quis ipsum. Nunc cursus aliquet lectus. Nunc vitae eros. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Nam tellus massa, fringilla aliquam, fermentum sit amet, posuere ac, est. Duis tristique egestas ligula. Mauris quis felis. Fusce a ipsum non lacus posuere aliquet. Sed fermentum posuere nulla. Donec tempor. Donec sollicitudin tortor lacinia libero ullamcorper laoreet. Cras quis nisi at odio consectetuer molestie.</firefox>\\r <?xml-foo \\"hey\\" ?>\\r</bar>\\r <!-- a comment \\r on several lines-->\\r <?xml-foo \\"another pi on two lines\\" \\r example=\\"hello\\"?>\\r</foo>"
Comment 2•7 years ago
Looking at New Relic (I believe some of you have access; let me know if anyone else wants to be added), I see:
https://rpm.newrelic.com/accounts/677903/applications/14179757/filterable_errors?tw%5Bend%5D=1537443927&tw%5Bstart%5D=1537400727#/show/5b2d8a00-bcb7-11e8-bfca-0242ac110008_5138_10270/stack_trace?top_facet=transactionUiName&primary_facet=error.class&barchart=barchart
i.e. in the last 12 hours, 3 instances of:
Traceback:
...
File "/app/treeherder/workers/task.py", line 43, in inner
File "/app/treeherder/log_parser/tasks.py", line 54, in parse_logs
File "/app/treeherder/log_parser/tasks.py", line 97, in parse_unstructured_log
File "/app/treeherder/log_parser/utils.py", line 38, in post_log_artifacts
File "/app/treeherder/log_parser/utils.py", line 27, in extract_text_log_artifacts
File "/app/.heroku/python/lib/python2.7/site-packages/simplejson/__init__.py", line 382, in dumps
File "/app/.heroku/python/lib/python2.7/site-packages/simplejson/encoder.py", line 296, in encode
File "/app/.heroku/python/lib/python2.7/site-packages/simplejson/encoder.py", line 378, in iterencode
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 467: unexpected end of data
Example log:
https://queue.taskcluster.net/v1/task/Q1rJboiTS4Sf_cAacBKF9w/runs/0/artifacts/public/logs/live_backing.log
It looks like the log line that was extracted isn't valid UTF-8.
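As a rough illustration of what "unexpected end of data" means here, the Python 3 sketch below reports where a byte string stops being valid UTF-8. The helper name `describe_decode_failure` is hypothetical and not part of Treeherder, and the sample bytes are made up from the `é` in the failing test output; the production traceback above comes from Python 2.7 and simplejson, so this only mirrors the same class of error.

```python
def describe_decode_failure(raw: bytes):
    """Return (position, offending byte, reason) for invalid UTF-8, or None if valid."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, raw[exc.start], exc.reason
    return None


# "é" is the two bytes 0xc3 0xa9 in UTF-8. Cutting right after 0xc3
# leaves a lead byte with no continuation byte following it.
complete = "<!-- é -->".encode("utf-8")
cut = complete[: complete.index(b"\xc3") + 1]

print(describe_decode_failure(complete))
print(describe_decode_failure(cut))
```

The second call reports byte 0xc3 as the failure point, which matches the byte named in the error message above.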
I'm pretty swamped at the moment, and since this is fairly specific (i.e. it affects only these particular failures, and the raw log can still be viewed), I won't have a chance to look into this right now.
However if anyone wanted to have a go at fixing it, the line in question is here:
https://github.com/mozilla/treeherder/blob/b621781f2606000bac3693eb324e8e565728682c/treeherder/log_parser/utils.py#L27
And `artifact` comes from:
https://github.com/mozilla/treeherder/blob/b621781f2606000bac3693eb324e8e565728682c/treeherder/log_parser/artifactbuilders.py#L61-L64
https://github.com/mozilla/treeherder/blob/b621781f2606000bac3693eb324e8e565728682c/treeherder/log_parser/parsers.py#L37-L39
https://github.com/mozilla/treeherder/blob/b621781f2606000bac3693eb324e8e565728682c/treeherder/log_parser/parsers.py#L85-L88
Perhaps this line truncation isn't helping?
https://github.com/mozilla/treeherder/blob/b621781f2606000bac3693eb324e8e565728682c/treeherder/log_parser/artifactbuilders.py#L49
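If the truncation is indeed the cause, one possible approach is to cut the line on a character boundary rather than at a fixed byte offset. The sketch below is only an assumption about how that could look: `truncate_utf8` is a hypothetical helper, not existing Treeherder code, and it assumes the input is valid UTF-8 before truncation.

```python
def truncate_utf8(raw: bytes, max_bytes: int) -> bytes:
    """Cut raw to at most max_bytes without splitting a multi-byte UTF-8 character.

    Assumes raw is valid UTF-8 to begin with.
    """
    if len(raw) <= max_bytes:
        return raw
    end = max_bytes
    # Continuation bytes have the form 0b10xxxxxx. If the first excluded
    # byte is one, the cut falls inside a character, so step back to the
    # start of that character.
    while end > 0 and (raw[end] & 0xC0) == 0x80:
        end -= 1
    return raw[:end]


line = ("x" * 466 + "é").encode("utf-8")  # 468 bytes, "é" occupies the last two
naive = line[:467]                        # ends on a dangling 0xc3
safe = truncate_utf8(line, 467)           # drops the partial character instead
```

Here `naive` cannot be decoded as UTF-8, while `safe` is one byte shorter and decodes cleanly. Truncating after decoding to text would avoid the issue as well.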
Priority: -- → P2
Summary: Log are not being parsed → Some logs fail to parse due to "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 467: unexpected end of data"
Comment hidden (Intermittent Failures Robot)
Comment 5•6 years ago
Another example:
https://queue.taskcluster.net/v1/task/SBmZyNJ9QJCADFLDfytWwQ/runs/0/artifacts/public/logs/live_backing.log
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 467: unexpected end of data
Comment hidden (Intermittent Failures Robot)
Updated•6 years ago
Priority: P2 → P3
Comment 7•5 years ago
I'm not seeing any errors like this in New Relic.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INVALID
Comment hidden (Intermittent Failures Robot)
Updated•5 years ago
Assignee: nobody → sclements
Status: RESOLVED → REOPENED
Priority: P3 → P2
Resolution: INVALID → ---
Updated•5 years ago
Status: REOPENED → ASSIGNED
Comment 9•5 years ago
I am also hitting this locally when running some (locally bad) xpcshell tests:
5:30.90 INFO Following exceptions were raised:
5:30.90 ERROR Traceback (most recent call last):
File "/home/simon/work/mozilla-unified/testing/xpcshell/runxpcshelltests.py", line 224, in run
self.run_test()
File "/home/simon/work/mozilla-unified/testing/xpcshell/runxpcshelltests.py", line 855, in run_test
self.log_full_output()
File "/home/simon/work/mozilla-unified/testing/xpcshell/runxpcshelltests.py", line 649, in log_full_output
self.log_line(line)
File "/home/simon/work/mozilla-unified/testing/xpcshell/runxpcshelltests.py", line 630, in log_line
line = self.fix_text_output(line).rstrip('\r\n')
File "/home/simon/work/mozilla-unified/testing/xpcshell/runxpcshelltests.py", line 621, in fix_text_output
line = cleanup_encoding(line)
File "/home/simon/work/mozilla-unified/testing/xpcshell/runxpcshelltests.py", line 109, in cleanup_encoding
return six.ensure_str(s)
File "/home/simon/work/mozilla-unified/third_party/python/six/six.py", line 899, in ensure_str
s = s.decode(encoding, errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 266: invalid continuation byte
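For context, "invalid continuation byte" means a lead byte (here 0xe1, which starts a three-byte sequence in UTF-8) was followed by a byte that cannot continue it. A minimal sketch of a tolerant decode is shown below; `decode_log_line` is a hypothetical name, the sample bytes are invented, and this is not necessarily how `cleanup_encoding` should be changed.

```python
def decode_log_line(raw: bytes) -> str:
    """Decode test output as UTF-8, substituting U+FFFD for undecodable bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Keep the line rather than aborting the whole run; every bad
        # byte sequence becomes the replacement character.
        return raw.decode("utf-8", errors="replace")


# Invented sample: 0xe1 followed by a space, which is not a continuation byte.
sample = b"caf\xe1 ok"
print(decode_log_line(sample))
```

The trade-off is that the original bytes are lost in the output; `errors="backslashreplace"` would keep them visible as escapes if that matters for debugging.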
Comment 10•5 years ago
Sorry, we're a little short staffed at the moment so I'm not sure when I'll be able to get to this.
Priority: P2 → P3
Updated•4 years ago
Assignee: sclements → nobody
Status: ASSIGNED → NEW
Updated•3 years ago
Component: Treeherder: Log Parsing & Classification → TreeHerder