UnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-10: invalid data

VERIFIED FIXED in 5.12.6

Status

P1
normal
VERIFIED FIXED
8 years ago
3 years ago

People

(Reporter: krupa.mozbugs, Assigned: basta)

Details

(Whiteboard: [z][Step 2][traceback], URL)

Attachments

(1 attachment)

1.34 MB, application/java-archive
(Reporter)

Description

8 years ago
Created attachment 496625
test theme file

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12

steps to reproduce:
1. Load https://addons.allizom.org/z/en-US/developers/addon/submit/2
2. Upload the attached theme

observed behavior:
Upload fails with "Unexpected server error while validating."

details:
{"url": "/z/en-US/developers/upload/aede81ab0f104dc994ef9bbc57ac847e/json", "full_report_url": "/z/en-US/developers/upload/aede81ab0f104dc994ef9bbc57ac847e", "validation": "", "upload": "aede81ab0f104dc994ef9bbc57ac847e", "error": "Traceback (most recent call last):\n  File \"/data/amo_python/www/preview/zamboni/apps/devhub/tasks.py\", line 24, in validator\n    result = _validator(upload)\n  File \"/data/amo_python/www/preview/zamboni/apps/devhub/tasks.py\", line 51, in _validator\n    addon_validator.prepare_package(eb, upload.path, PACKAGE_ANY)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/submain.py\", line 59, in prepare_package\n    output = test_package(err, package, path, expectation)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/submain.py\", line 146, in test_package\n    return test_inner_package(err, package_contents, package)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/submain.py\", line 209, in test_inner_package\n    test_func(err, package_contents, package)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/content.py\", line 161, in test_packed_packages\n    file_data)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/scripting.py\", line 23, in test_js_file\n    tree = _get_tree(name, data)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/scripting.py\", line 106, in _get_tree\n    data = json.dumps(code)\n  File \"/usr/lib/python2.6/json/__init__.py\", line 230, in dumps\n    return _default_encoder.encode(obj)\n  File \"/usr/lib/python2.6/json/encoder.py\", line 361, in encode\n    return encode_basestring_ascii(o)\nUnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-10: invalid data\n"}
Assignee: kumar.mcmillan → thepotch
Target Milestone: 5.12.5 → 5.12.6
Looks like this is in the validator:

/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/scripting.py
Assignee: thepotch → mbasta
Priority: -- → P1
(Assignee)

Comment 2

8 years ago
Is this bug a duplicate of bug 617778? It might have been fixed.
(In reply to comment #2)
> Is this bug a duplicate of bug 617778? It might have been fixed.

It's not a dupe, but it certainly might have been fixed.  What do the tests say?
(Assignee)

Comment 4

8 years ago
Ewww... Just looked at the code. They've got some sort of terrible encoding on the file. Unknown characters all over the place.

What do you think? I'd say ban it on the grounds that the file contains non-standard characters that can't be converted to UTF-8. I doubt that there's any way to get this into SpiderMonkey, and even if it gets there, it still has to be re-encoded somehow on the way back.
Where does this error occur?  Are you the one opening the file?  vim does fine opening chrome/global/myFirefoxTab/myUrlbar.js, and python can read it too.
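
A possible explanation for why the file opens fine but the validator still fails: reading raw bytes never decodes them, so the error can only surface at the point where something treats the bytes as UTF-8. The sketch below is illustrative only. It uses Python 3 rather than the Python 2.6 seen in the traceback, and the sample string and the helper name decodes_as_utf8 are hypothetical, not taken from the validator.

```python
# Hypothetical sample: GB2312-encoded bytes standing in for the theme's JS file.
raw = "var label = '中文';".encode("gb2312")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Holding the raw bytes is harmless; only the UTF-8 decode step fails.
print(decodes_as_utf8(raw))            # False
print(decodes_as_utf8(b"var x = 1;"))  # True
print(raw.decode("gb2312"))            # var label = '中文';
```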
(Assignee)

Comment 6

8 years ago
Fixed:

https://github.com/mattbasta/amo-validator/commit/076e7c51c170971c25ee28c5576ebe90ea48305c

The error occurred when I ran the code through json.dumps() to form a JS string for the reflection API. The JS file is encoded in GB2312, but json.dumps() assumes byte strings are UTF-8 by default. I moved the unicodeification out of the JSON encoder, added chardet.detect(code) to the validator to detect the encoding, and passed the result as the encoding parameter of json.dumps():

encoding = chardet.detect(code)["encoding"].lower()
data = json.dumps(code, encoding=encoding)
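
For readers on Python 3, where json.dumps() no longer accepts an encoding argument, the same idea can be sketched by decoding the bytes to text first and then serializing. This is not the code from the commit above. To stay within the standard library it swaps chardet for a fixed list of candidate encodings, and the function name decode_js_source and the candidate list are assumptions made for illustration. Note also that chardet.detect() can report an encoding of None for input it cannot classify, so a fallback of some kind is likely worth having either way.

```python
import json
from typing import Sequence, Tuple


def decode_js_source(
    raw: bytes,
    candidates: Sequence[str] = ("utf-8", "gb2312"),
) -> Tuple[str, str]:
    """Decode raw bytes with the first candidate encoding that succeeds.

    Hypothetical helper: a stand-in for encoding detection, not the
    validator's actual logic. Returns (text, encoding_used).
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte to a code point, so it cannot fail,
    # although the resulting text may be wrong.
    return raw.decode("latin-1"), "latin-1"


# Hypothetical GB2312-encoded script, similar in spirit to the failing file.
raw = "var label = '中文';".encode("gb2312")
text, used = decode_js_source(raw)

# json.dumps() now receives text, so no implicit UTF-8 decode takes place.
data = json.dumps(text)
print(used)
print(data)
```

Trying strict UTF-8 first is a deliberate choice here: arbitrary GB2312 text rarely happens to be valid UTF-8, so a successful UTF-8 decode is a fairly strong signal, whereas a statistical detector such as chardet is better suited when the set of possible encodings is open-ended.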
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Reporter)

Comment 7

8 years ago
Uploading the test file does not result in a traceback. Marking this verified.
Status: RESOLVED → VERIFIED
Product: addons.mozilla.org → addons.mozilla.org Graveyard