Bug 635933
Opened 13 years ago
Closed 13 years ago
Validator ignores files with UCS2-Little endian encoding
Categories
(addons.mozilla.org Graveyard :: Developer Pages, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
Q2 2011
People
(Reporter: basta, Assigned: basta)
References
Details
It appears that files with UCS2-Little endian encoding are not passed through the JS tests.
Updated•13 years ago
Severity: normal → minor
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Comment 1 (Assignee)•13 years ago
Through my research, it seems like little-endian UCS-2 is invalid (it should always be big-endian). Can anyone confirm this? If it's not invalid, we should be focusing on ways to detect and decode it. If it is invalid, then a filter needs to be written that converts the raw bytes to ASCII or UTF-8.
Comment 2 (Assignee)•13 years ago
I've got a preliminary fix which I'll be testing tonight and tomorrow. However, I should note that a file from the original package (content/overlay.js) is using an encoding which cannot be decoded by SpiderMonkey, which means that the validator will fail with a compilation error for that file. In the other files (namely content/smftn.js), there are characters which cannot be properly encoded for error output, so you get a lot of question marks, but the actual validation is taking place. Should it be encoded to something like UTF-8, everything would be just peachy. No issue there.
Comment 3 (Assignee)•13 years ago
Waiting on the resolution of bug 648102. This will make the tests pass.
Depends on: 648102
Comment 4•13 years ago
Obsolete response to comment 1: both big- and little-endian UTF-16 are supported; this is probably true for its obsolete predecessor, UCS-2, as well. We determine big- or little-endian based on the BOM (byte order mark) at the beginning of the content. Strings are stored in the JS engine roughly as uint16[], in the platform's native byte order.
Supplying UTF-8 input, as noted in comment 2, is far safer; it is endian-proof and won't trigger BOM bugs. Also, the JS engine knows how to convert from UTF-8 to native-endian UTF-16, which is what JS developers expect to find in their strings.
Matt, can you point me to a failing test?
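To make the BOM-based approach described above concrete, here is a minimal Python sketch of a pre-processing step that sniffs the BOM and normalizes a file to UTF-8 before it is handed to the JS engine. The function name `to_utf8` and the `fallback` parameter are hypothetical, and this is not the code that was merged for this bug; it only illustrates the idea under the assumption that the input is UTF-8 or BOM-prefixed UTF-16.

```python
import codecs

# Known BOM prefixes and the codec used for the bytes that follow them.
# UTF-32 is deliberately left out: its little-endian BOM begins with the
# same two bytes as the UTF-16 LE BOM, and handling it is outside this sketch.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def to_utf8(raw: bytes, fallback: str = "utf-8") -> bytes:
    """Return `raw` re-encoded as BOM-less UTF-8 (hypothetical helper)."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # Strip the BOM, then decode with the byte order it announced.
            text = raw[len(bom):].decode(encoding)
            break
    else:
        # No BOM found: assume the fallback encoding and replace any
        # undecodable bytes rather than raising.
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")
```

For example, `to_utf8(codecs.BOM_UTF16_LE + "var a;".encode("utf-16-le"))` yields the plain bytes `b"var a;"`, which no longer depend on byte order.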
Comment 5 (Assignee)•13 years ago
Hey Wes,
There's only one failing test at the moment. You can find it on the "encoding" branch of the validator:
https://github.com/mattbasta/amo-validator/branches/encoding
The test is found here:
https://github.com/mattbasta/amo-validator/blob/encoding/tests/test_controlchars.py#L33
When the following string is passed to SpiderMonkey via read():
    function täst() {}
...SpiderMonkey throws this error:
    missing ( before formal parameters
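One way to sidestep encoding disagreements of this kind is to hand the engine pure-ASCII source by rewriting every non-ASCII character as a JavaScript `\uXXXX` escape. The sketch below is an illustration only: the helper name `escape_non_ascii` is hypothetical, it is not the fix merged for this bug, and it assumes the source has already been decoded to a Python string. Characters outside the Basic Multilingual Plane are emitted as surrogate pairs, which is fine inside string literals but may not be accepted inside identifiers.

```python
def escape_non_ascii(source: str) -> str:
    """Rewrite non-ASCII characters as JS \\uXXXX escapes (hypothetical helper)."""
    out = []
    for ch in source:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        elif cp <= 0xFFFF:
            # BMP characters map to a single escape.
            out.append("\\u%04x" % cp)
        else:
            # Astral characters become a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append("\\u%04x\\u%04x" % (high, low))
    return "".join(out)
```

Applied to the failing test string, this produces `function t\u00e4st() {}`, which declares the same identifier using only ASCII bytes.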
Updated•13 years ago
Target Milestone: --- → Q2 2011
Comment 6 (Assignee)•13 years ago
Merged into mozilla/amo-validator: https://github.com/mozilla/amo-validator/commit/6e2e1fd6b4b98a968b66acae7529cebcc9fe19c6
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•8 years ago
Product: addons.mozilla.org → addons.mozilla.org Graveyard