Closed Bug 712310 Opened 9 years ago Closed 6 years ago

Shifting byte in <script> tag still causes it to decode as <script>


(Core :: Internationalization, defect)






(Reporter: hsivonen, Assigned: smontagu)


(Blocks 1 open bug)


(Keywords: sec-moderate, Whiteboard: [sg:moderate] potential XSS contributor)


(1 file)

Attached file Test case
<annevk> in shift_jis
<annevk> 84 3C 73 63 72 69 70 74 20 84 3E
<annevk> gives <script> in Gecko/Chrome, but "�script �" in Opera

This might confuse blacklist-based XSS filters (which are inherently unsafe, of course), so doing what Opera does would be on the safe side.
I'm not sure if this is a real issue. If you remove the 0x20 for the space, the script doesn't execute.  The 0x84 before the > is being parsed as an attribute name. This would be similar to using <script a> to bypass a blacklist filter.
Thinking about it more, the issue may be that we are interpreting the 0x84 0x3C and 0x84 0x3E sequences as individual bytes rather than as one character.
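To make the byte-level question concrete, here is a rough sketch that scans the test case for lead bytes followed by something that cannot be a trail byte. The lead and trail ranges are the ones I understand the Encoding Standard's Shift_JIS decoder to use; the helper names (isLeadByte, isTrailByte, findBrokenPairs) are made up for illustration and are not Gecko code.

```javascript
// Lead and trail byte ranges, assumed from the Encoding Standard's Shift_JIS decoder.
const isLeadByte = (b) => (b >= 0x81 && b <= 0x9f) || (b >= 0xe0 && b <= 0xfc);
const isTrailByte = (b) => (b >= 0x40 && b <= 0x7e) || (b >= 0x80 && b <= 0xfc);

// The byte sequence from the attached test case.
const testCase = [0x84, 0x3c, 0x73, 0x63, 0x72, 0x69, 0x70, 0x74, 0x20, 0x84, 0x3e];

// Collect every [lead, next] pair where the byte after a lead byte
// cannot be a trail byte (hypothetical helper, for illustration only).
function findBrokenPairs(bytes) {
  const broken = [];
  for (let i = 0; i + 1 < bytes.length; i++) {
    if (!isLeadByte(bytes[i])) continue;
    if (isTrailByte(bytes[i + 1])) {
      i++; // skip the trail byte of a well-formed pair
    } else {
      broken.push([bytes[i], bytes[i + 1]]);
    }
  }
  return broken;
}

// Reports two broken pairs: 0x84 0x3C and 0x84 0x3E.
console.log(findBrokenPairs(testCase).map((p) => p.map((b) => b.toString(16))));
```

Both 0x3C and 0x3E fall below 0x40, so neither can complete a two-byte character that starts with 0x84.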
Whiteboard: [sg:moderate] potential XSS contributor
Blocks: encoding
Masatoshi-san, would this problem be fixed by implementing the Encoding Standard for Shift_JIS (bug 747762)?
Flags: needinfo?(VYV03354)
new TextDecoder("shift_jis").decode(new Uint8Array([0x84,0x3C,0x73,0x63,0x72,0x69,0x70,0x74,0x20,0x84,0x3E]))
"�<script �>"
But I don't think this needs to be "fixed".
- The Encoding Standard requires this behavior.
- Virtually all browsers (including Blink-based Opera) are now "vulnerable" to this.
- No valid shift_jis sequence uses 0x3C/0x3E as a second byte. If an XSS filter misses this sequence, that should be considered a serious bug in the filter.
- Eating the second byte unconditionally would introduce a different vulnerability (consider <a href="<0x84>">).
I suggest WONTFIX.
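A simplified model of the two error-handling policies, applied to the href example above. This is only a sketch under stated assumptions: modelDecode is a hypothetical helper, not a real Shift_JIS decoder, and well-formed two-byte sequences are replaced by a placeholder instead of being looked up in the index table.

```javascript
// Byte ranges assumed from the Encoding Standard's Shift_JIS decoder.
const isLead = (b) => (b >= 0x81 && b <= 0x9f) || (b >= 0xe0 && b <= 0xfc);
const isTrail = (b) => (b >= 0x40 && b <= 0x7e) || (b >= 0x80 && b <= 0xfc);
const ascii = (s) => Array.from(s, (c) => c.charCodeAt(0));

// policy "restore": an ASCII byte after a lead byte is processed again on its own.
// policy "eat":     the byte after a lead byte is always swallowed with it.
function modelDecode(bytes, policy) {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (b <= 0x80) { out += String.fromCharCode(b); continue; }
    if (b >= 0xa1 && b <= 0xdf) { // single-byte half-width katakana
      out += String.fromCharCode(0xff61 - 0xa1 + b);
      continue;
    }
    if (!isLead(b) || i + 1 === bytes.length) { out += "\uFFFD"; continue; }
    const next = bytes[i + 1];
    if (isTrail(next)) { out += "\u25A1"; i++; continue; } // placeholder, no table lookup
    out += "\uFFFD";
    if (policy === "eat" || next > 0x7f) i++; // swallow the following byte
  }
  return out;
}

const href = [...ascii('<a href="'), 0x84, ...ascii('">')];
console.log(modelDecode(href, "restore")); // <a href="\uFFFD">  (closing quote survives)
console.log(modelDecode(href, "eat"));     // <a href="\uFFFD>   (closing quote is lost)
```

Under the "eat" policy the attribute value is never terminated, so whatever markup follows would be pulled into it.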
Flags: needinfo?(VYV03354)
Thanks. I'm resolving this as invalid: since the Encoding Standard requires this behavior, it's not a bug.

Henri, please raise a spec issue if you think there's something wrong with the
required behavior.
Closed: 6 years ago
Resolution: --- → INVALID
Anne, is the state of the Encoding Standard on this topic intentional?
Flags: needinfo?(annevk)
Partially. "Eating" the second byte when it is ASCII is itself a vulnerability, as emk points out.
Flags: needinfo?(annevk)
Group: core-security