Bug 802082
Opened 12 years ago
Closed 12 years ago
Merge encodings that IE or WebKit treat as the same
Categories
(Core :: Internationalization, enhancement)
RESOLVED DUPLICATE of bug 801402
People
(Reporter: ayg, Assigned: ayg)
Attachments
(1 file)
Bug 802030 deals with merging us-ascii, iso-8859-1, and windows-1252. That's potentially risky, because all major browsers (IE, Firefox, Chrome) treat them as distinct. But the encoding spec mandates merging lots of other encodings that we treat as distinct too, listed at bug 802030 comment 2. All of these other merges are already implemented by IE or Chrome, so we have much more assurance that they're safe.
Comment 1 • 12 years ago (Assignee)
So the following charsets will no longer exist:
* Big5-HKSCS -> Big5
* GB2312 -> gbk
* ISO-8859-6-E -> ISO-8859-6
* ISO-8859-6-I -> ISO-8859-6
* ISO-8859-8-E -> ISO-8859-8
* ISO-8859-9 -> windows-1254
* ISO-8859-11 -> windows-874
* TIS-620 -> windows-874
* windows-949 -> euc-kr
I have to figure out which encoder/decoder to choose in each case, though.
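The list above amounts to a lookup from a retired charset name to the name it is merged into. A minimal sketch of that lookup follows; `MergedCharsetName` and `ToLowerAscii` are hypothetical names chosen for illustration, not functions from the patch or from Gecko, and the table covers only the nine merges listed in this comment.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper: lower-case an ASCII label so lookups are
// case-insensitive.
static std::string ToLowerAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical function: return the name a charset label is merged into.
// Labels outside the list above are returned unchanged (lower-cased).
std::string MergedCharsetName(const std::string& label) {
  static const std::unordered_map<std::string, std::string> kMerges = {
      {"big5-hkscs", "big5"},
      {"gb2312", "gbk"},
      {"iso-8859-6-e", "iso-8859-6"},
      {"iso-8859-6-i", "iso-8859-6"},
      {"iso-8859-8-e", "iso-8859-8"},
      {"iso-8859-9", "windows-1254"},
      {"iso-8859-11", "windows-874"},
      {"tis-620", "windows-874"},
      {"windows-949", "euc-kr"},
  };
  const std::string key = ToLowerAscii(label);
  const auto it = kMerges.find(key);
  return it == kMerges.end() ? key : it->second;
}
```

A table like this only answers which name survives; it does not settle which encoder or decoder implementation backs the surviving name, which is the open question in this comment.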
Comment 2 • 12 years ago (Assignee)
41 files changed, 42 insertions(+), 1180 deletions(-)
This patch tackles only three of the merges (ISO-8859-9, ISO-8859-11, and TIS-620), which Anne advised me would be the safest to start with. It was mechanically adapted from the patch for bug 623610. I have no idea whether it even makes sense, but it seems to compile. Try: https://tbpl.mozilla.org/?tree=Try&rev=264a46a7828e
I guess we want someone interested in mail to comment on whether this is a bad idea for them. It will probably make outgoing mail declare its encoding as windows-* instead of ISO-8859-*, which other clients might not like. If that is a problem, what do we want to do about it, here and in similar cases?
Attachment #671796 - Flags: review?(smontagu)
Comment 3 • 12 years ago
Since you remove variants, you might want to have the UI just say "Thai" instead of also listing the encoding. Chrome does the same.
Comment 4 • 12 years ago
Comment on attachment 671796
Patch part 1 -- Merge ISO-8859-9 and -11 and TIS-620 with windows-1254 and -874
>- {"ISO-8859-9", "ISO8859_9"},
>+ {"ISO-8859-9", "Cp1254"},
Just remove this line. The left-hand side is a canonical charset name, so "ISO-8859-9" will never appear after the merge.
>- {"iso88599",iso9_tbl}, //ISO-8859-9
> {"iso885910",iso10_tbl}, //ISO-8859-10
>- {"tis620",tis620_tbl}, //TIS-620/ISO-8859-11
>- {"tis6202533",tis620_tbl}, //TIS-620/ISO-8859-11
>- {"iso885911",tis620_tbl}, //TIS-620/ISO-8859-11
Then hunspell doesn't support Thai anymore? I don't think that's correct.
Comment 5 • 12 years ago
(In reply to Masatoshi Kimura [:emk] from comment #4)
> Then hunspell doesn't support Thai anymore? I don't think that's correct.
What are those tables in hunspell used for? Do we ever feed hunspell with an encoding other than UTF-* anyway?
Comment 6 • 12 years ago
Dunno. Please ask spellchecker folks.
Comment 7 • 12 years ago (Assignee)
I needed this change too to fix a failing test:
--- a/extensions/universalchardet/src/base/LangThaiModel.cpp
+++ b/extensions/universalchardet/src/base/LangThaiModel.cpp
@@ -180,10 +180,10 @@ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
 const SequenceModel TIS620ThaiModel =
 {
   TIS620CharToOrderMap,
   ThaiLangModel,
   (float)0.926386,
   false,
-  "TIS-620"
+  "windows-874"
 };
Maybe the classes in that file should be renamed? I'll address other feedback when smontagu says how he'd like me to proceed (or if he's even the right person to ask for review).
Comment 8 • 12 years ago
Comment on attachment 671796
Patch part 1 -- Merge ISO-8859-9 and -11 and TIS-620 with windows-1254 and -874
Change
> {"TIS-620", "MS874"},
to
{"windows-874", "MS874"},
.
Comment 9 • 12 years ago
I'm not happy with merging encoders (as opposed to decoders) until we have a way to make it not apply to sent mail.
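The decoder/encoder distinction can be illustrated with a small sketch that keeps two names per charset: the table used to decode incoming bytes, and the label declared on outgoing mail. Everything here (`CharsetChoice`, `ResolveForMail`) is hypothetical and shows only the shape of such a split; it is not how Gecko or the patch handles it, and it does not address which encoder would actually produce the outgoing bytes.

```cpp
#include <string>

// Hypothetical struct: one name for decoding, one for outgoing mail.
struct CharsetChoice {
  std::string decoder;     // table used to decode incoming bytes
  std::string mail_label;  // charset name declared on sent mail
};

// Hypothetical function: merge decoders for the three charsets covered by
// the patch, but leave the label used on sent mail untouched.
CharsetChoice ResolveForMail(const std::string& canonical) {
  if (canonical == "ISO-8859-9") {
    return {"windows-1254", canonical};
  }
  if (canonical == "TIS-620" || canonical == "ISO-8859-11") {
    return {"windows-874", canonical};
  }
  // Charsets not affected by the merge use the same name for both roles.
  return {canonical, canonical};
}
```

With a split like this, merging on the decode side would not change what other mail clients see in the charset declaration.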
Comment 11 • 12 years ago (Assignee)
If bug 801402 brings us in line with the spec just as well, this bug is no longer necessary.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Updated • 12 years ago
Attachment #671796 - Flags: review?(smontagu)