Open Bug 746911 (encoding) Opened 12 years ago Updated 2 years ago

[meta] Implement the Encoding Standard

Categories

(Core :: Internationalization, enhancement)

People

(Reporter: hsivonen, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: meta)

Gecko should implement http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html for better interoperability and for reduced brokenness.

This involves:
 * Changing the charset alias list to match the spec
 * Changing the Unicode converter API to be able to signal the end of the stream
 * Fixing various converters to signal errors per spec
 * Fixing various converters to behave per spec
 * Removing some converters
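
To illustrate why a converter API needs an end-of-stream signal, here is a minimal sketch using Python's standard `codecs` incremental decoder rather than Gecko's converter interfaces (which are not shown in this bug). The helper name `decode_chunks` is hypothetical. Without a final "no more input" call, a decoder cannot tell a truncated multi-byte sequence from one that is merely waiting for its next chunk, so it can never report the error.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode an iterable of byte chunks, then signal end of stream.

    Hypothetical helper for illustration; this is not Gecko's converter API.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    out = []
    for chunk in chunks:
        # final=False: more input may follow, so incomplete sequences are buffered.
        out.append(decoder.decode(chunk, final=False))
    # final=True: the stream has ended, so any buffered partial sequence is an error.
    out.append(decoder.decode(b"", final=True))
    return "".join(out)


# A euro sign (U+20AC) split across two chunks decodes normally.
complete = decode_chunks([b"\xe2\x82", b"\xac"])

# The same sequence cut short only becomes an error once the end is signaled.
truncated = decode_chunks([b"\xe2\x82"])
```

In this sketch, `truncated` contains only replacement characters (U+FFFD), and they appear only because of the final call; dropping that call would silently lose the trailing bytes.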
Depends on: 562096
Depends on: 711101
Depends on: 715833, 634541, 504831, 716579
Depends on: 747762
Depends on: 749052
Depends on: 203838
Depends on: 712310
Depends on: 782412
Depends on: 782721
Depends on: 736438
Depends on: 562091
Depends on: 687859
Depends on: 796882
Depends on: 797385
Depends on: 799910
Depends on: 799913
Depends on: 801402
Depends on: 802030
Depends on: 802979
Depends on: 805374
Depends on: 809934
Blocks: whatwg
Depends on: 827796
Alias: encoding
Depends on: 844776
Is there any documentation on why the choices made in the encoding standard are as they are? In particular on the mix between windows and iso latin encodings?
(In reply to Axel Hecht [:Pike] from comment #1)
> Is there any documentation on why the choices made in the encoding standard
> are as they are?

The set of encodings is based on researching the commonality between browsers. The basic assumption is that Web content wants to work in multiple browsers, so if only one browser supports a given fringe encoding, the encoding is probably not in significant use. IIRC, actual documentation of test results is somewhere in www-archive.

> In particular on the mix between windows and iso latin
> encodings?

When a Windows encoding is a superset of an ISO encoding, the Encoding Standard retains the Windows encoding and makes the ISO labels aliases thereof. In practice, the ISO labels already invoke decoders that are actually decoders for the corresponding Windows encoding, and IE reports the Windows label (and has notable market share in many places).

In the cases where there's no Windows superset for an ISO encoding, the Encoding Standard uses the ISO-8859-x label as the preferred name.
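
The aliasing described above can be sketched as a label lookup. This is a rough illustration, not Gecko code: the table is a small hand-picked excerpt of the labels in the Encoding Standard, and the names `LABEL_TO_ENCODING`, `ascii_lower`, and `get_encoding` are hypothetical. The lookup trims ASCII whitespace and compares ASCII case-insensitively, as the spec describes for label matching.

```python
# Tiny excerpt of a label table (assumption: the full list is in the Encoding Standard).
LABEL_TO_ENCODING = {
    # ISO labels whose Windows superset is kept as the actual encoding.
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "us-ascii": "windows-1252",
    "windows-1252": "windows-1252",
    "iso-8859-9": "windows-1254",
    "windows-1254": "windows-1254",
    # No Windows superset: the ISO name stays the preferred name.
    "iso-8859-2": "ISO-8859-2",
    "latin2": "ISO-8859-2",
}

ASCII_WHITESPACE = "\t\n\x0c\r "


def ascii_lower(text):
    """Lowercase only A-Z, so non-ASCII look-alikes never match a label."""
    return "".join(c.lower() if "A" <= c <= "Z" else c for c in text)


def get_encoding(label):
    """Return the encoding name for a label, or None if the label is unknown."""
    key = ascii_lower(label.strip(ASCII_WHITESPACE))
    return LABEL_TO_ENCODING.get(key)


# Byte 0x80 shows the superset relationship, using Python's own codecs:
as_windows = b"\x80".decode("cp1252")   # euro sign
as_iso = b"\x80".decode("latin-1")      # C1 control character U+0080
```

With this table, `get_encoding(" ISO-8859-1 ")` yields `"windows-1252"`, while `get_encoding("latin2")` yields `"ISO-8859-2"`.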
Depends on: 863728
Depends on: 562590
Depends on: 912470
Depends on: 936440
Depends on: 951691
Depends on: 959058
Depends on: 1093781
Depends on: 1092737
Depends on: 1102679
Depends on: 1200152
Depends on: 1241432
Depends on: 1257877
Depends on: 1312384
Depends on: 1215860
Depends on: 1486949
Depends on: 1460233
Depends on: 1513517
Depends on: 1514664
Type: defect → enhancement
See Also: → 254868
See Also: 254868
Severity: normal → S3