Closed Bug 618093 Opened 14 years ago Closed 14 years ago

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-10: invalid data

Categories

(addons.mozilla.org Graveyard :: Developer Pages, defect, P1)

Tracking

(Not tracked)

VERIFIED FIXED
5.12.6

People

(Reporter: krupa.mozbugs, Assigned: basta)

Details

(Whiteboard: [z][Step 2][traceback])

Attachments

(1 file)

1.34 MB, application/java-archive
Attached file test theme file
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12

steps to reproduce:
1. Load https://addons.allizom.org/z/en-US/developers/addon/submit/2
2. Upload the attached theme

observed behavior:
Upload fails with "Unexpected server error while validating."

details:
{"url": "/z/en-US/developers/upload/aede81ab0f104dc994ef9bbc57ac847e/json", "full_report_url": "/z/en-US/developers/upload/aede81ab0f104dc994ef9bbc57ac847e", "validation": "", "upload": "aede81ab0f104dc994ef9bbc57ac847e", "error": "Traceback (most recent call last):\n  File \"/data/amo_python/www/preview/zamboni/apps/devhub/tasks.py\", line 24, in validator\n    result = _validator(upload)\n  File \"/data/amo_python/www/preview/zamboni/apps/devhub/tasks.py\", line 51, in _validator\n    addon_validator.prepare_package(eb, upload.path, PACKAGE_ANY)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/submain.py\", line 59, in prepare_package\n    output = test_package(err, package, path, expectation)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/submain.py\", line 146, in test_package\n    return test_inner_package(err, package_contents, package)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/submain.py\", line 209, in test_inner_package\n    test_func(err, package_contents, package)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/content.py\", line 161, in test_packed_packages\n    file_data)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/scripting.py\", line 23, in test_js_file\n    tree = _get_tree(name, data)\n  File \"/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/scripting.py\", line 106, in _get_tree\n    data = json.dumps(code)\n  File \"/usr/lib/python2.6/json/__init__.py\", line 230, in dumps\n    return _default_encoder.encode(obj)\n  File \"/usr/lib/python2.6/json/encoder.py\", line 361, in encode\n    return encode_basestring_ascii(o)\nUnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-10: invalid data\n"}
Assignee: kumar.mcmillan → thepotch
Target Milestone: 5.12.5 → 5.12.6
Looks like this is in the validator:

/data/amo_python/www/preview/zamboni/vendor/src/amo-validator/validator/testcases/scripting.py
Assignee: thepotch → mbasta
Priority: -- → P1
Is this bug a duplicate of bug 617778? It might have been fixed.
(In reply to comment #2)
> Is this bug a duplicate of bug 617778? It might have been fixed.

It's not a dupe, but it certainly might have been fixed.  What do the tests say?
Ewww... Just looked at the code. They've got some sort of terrible encoding on the file. Unknown characters all over the place.

What do you think? I'd say ban it on the grounds that the file contains non-standard characters that can't be converted to UTF-8. I doubt that there's any way to get this into Spidermonkey, and even if it gets there, it's still got to somehow get re-encoded back.
Where does this error occur?  Are you the one opening the file?  vim does fine opening chrome/global/myFirefoxTab/myUrlbar.js, and python can read it too.
Fixed:

https://github.com/mattbasta/amo-validator/commit/076e7c51c170971c25ee28c5576ebe90ea48305c

The error occurred when I ran the code through json.dumps() to form a JS string for the reflection API. The JS file is encoded in GB2312, but json.dumps assumes UTF-8 by default. I moved the unicodeification out of the json encoder, added chardet.detect(code) to the validator to grab the encoding, and changed the encoding parameter of json.dumps to match:

encoding = chardet.detect(code)["encoding"].lower()
data = json.dumps(code, encoding=encoding)
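
For anyone hitting the same thing on Python 3, where json.dumps no longer takes an encoding parameter, here is a rough sketch of the same idea: decode the raw bytes to text first, then serialize. This is not the code from the commit above. To stay stdlib-only it swaps chardet for a fixed list of candidate encodings, and the names CANDIDATE_ENCODINGS, decode_source, and to_js_string are made up for illustration.

```python
import json

# Hypothetical candidate list (an assumption, not from the validator).
# Order matters: latin-1 accepts any byte sequence, so it goes last.
CANDIDATE_ENCODINGS = ("utf-8", "gb2312", "latin-1")


def decode_source(raw, candidates=CANDIDATE_ENCODINGS):
    """Return (text, encoding_name) for the first candidate that decodes raw."""
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding could decode the input")


def to_js_string(raw):
    """Decode raw bytes, then serialize the text as a JSON/JS string literal."""
    text, _ = decode_source(raw)
    # json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
    # so the result is plain ASCII regardless of the source encoding.
    return json.dumps(text)


# A small GB2312-encoded snippet standing in for the attached theme's JS file.
gb_bytes = 'var s = "中文";'.encode("gb2312")
literal = to_js_string(gb_bytes)
```

Trying encodings in a fixed order is cruder than statistical detection: it can pick the wrong encoding when the bytes happen to be valid in an earlier candidate, so the list should be ordered from strictest to most permissive.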
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Uploading the test file does not result in a traceback. Marking this verified.
Status: RESOLVED → VERIFIED
Product: addons.mozilla.org → addons.mozilla.org Graveyard