Extra 0xa4 byte throws off EUC-JP detection in Firefox and Safari, but not Chrome
Categories
(Core :: Internationalization, defect, P3)
Tracking
(Webcompat Priority: P3)
People
(Reporter: twisniewski, Unassigned)
References
()
Details
See this comment on webcompat.com for more details.
Essentially, Chrome guesses the intended EUC-JP encoding of this Oracle document and does not reject that guess despite an invalid character, a 0xA4 byte, in <code>DBL_MIN</code> 3.0. Firefox (and presumably Safari) reject the guess because of that byte.
Since this is a webcompat issue, it seems like something the spec should rectify, so I've filed this bug for now until proper next steps can be determined.
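To illustrate why a single 0xA4 byte can matter: in EUC-JP, 0xA4 is a lead byte that must be followed by a valid trail byte. The sketch below uses Python's strict `euc_jp` codec as a stand-in validator. That codec is not the browsers' decoder, and the byte strings are hypothetical, not copied from the Oracle page.

```python
# Hypothetical byte strings for illustration; not taken from the Oracle page.
valid_pair = b"\xa4\xa2"          # 0xA4 lead byte followed by a valid trail byte
stray_byte = b"DBL_MIN \xa4 3.0"  # 0xA4 followed by an ASCII space


def is_valid_euc_jp(data: bytes) -> bool:
    """Return True if data decodes under Python's strict euc_jp codec."""
    try:
        data.decode("euc_jp")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_euc_jp(valid_pair))
print(is_valid_euc_jp(stray_byte))
```

A detector that treats any invalid sequence as disqualifying would drop EUC-JP for the second input, even if the rest of the page were valid EUC-JP.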
| Reporter | ||
Updated•3 years ago
(In reply to Thomas Wisniewski [:twisniewski] from comment #0)
and presumably Safari
Was this tested with Safari's UI language set to Japanese? With other UI languages, Safari doesn't run a detector at all, which explains the result.
| Reporter | ||
Comment 2•3 years ago
Was this tested with Safari's UI language set to Japanese?
Ah, no it wasn't.
Updated•2 years ago
While the behavior on the reported page is unfortunate, we've gone for a couple of years without other complaints about detection accuracy. Addressing the problem reported here would be risky, since it would violate the principle of operation of chardetng (byte sequences that are invalid according to an encoding, with old unsupported proprietary extensions not considered invalid, result in immediate rejection of that encoding from the set of potential candidates), which could disrupt the balance of how other cases currently end up being detected correctly.
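The rejection principle described above can be sketched roughly as follows. This is only a simplified model of the elimination step, not chardetng's actual code: `surviving_candidates` is a hypothetical helper, Python's strict codecs stand in for the real validators, and the scoring that chardetng applies to the remaining candidates is left out entirely.

```python
from typing import Iterable, List


def surviving_candidates(data: bytes, candidates: Iterable[str]) -> List[str]:
    """Drop every candidate encoding for which data contains an invalid sequence.

    Hypothetical helper: models only the "invalid bytes reject the candidate"
    rule, using Python's strict codecs as stand-in validators.
    """
    survivors = []
    for name in candidates:
        try:
            data.decode(name)  # strict mode raises on the first invalid sequence
        except UnicodeDecodeError:
            continue  # one invalid sequence is enough to reject this candidate
        survivors.append(name)
    return survivors


# Hypothetical input: ASCII text containing one stray 0xA4 byte.
page = b"DBL_MIN \xa4 3.0"
print(surviving_candidates(page, ["euc_jp", "cp1252"]))
```

Under this model, tolerating the stray byte for EUC-JP would mean softening the `continue` branch for every input, which is the kind of change the comment describes as risky for cases that are currently detected correctly.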
Therefore, WONTFIX.