Closed Bug 203838 Opened 21 years ago Closed 10 years ago

Make Mac codepages either work or go away from the UI

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: smontagu, Assigned: smontagu)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [fixed by bug 805374])

Attachments

(2 files, 1 obsolete file)

Spin off from bug 199143.

x-mac-devanagari, x-mac-gurmukhi and x-mac-gujarati aren't listed in
charsetalias.properties and it's impossible to use them as Character Coding.

I don't know if all the Mac codepage decoders are really needed, but we should
either remove them from the menus or make them functional.

Steps to reproduce:

Go to 
http://smontagu.damowmow.com/genEncodingTest.cgi?family=apple&codepage=devanagari
http://smontagu.damowmow.com/genEncodingTest.cgi?family=apple&codepage=gujarati
http://smontagu.damowmow.com/genEncodingTest.cgi?family=apple&codepage=gurmukhi

Expected results: 0 failures
Actual results: lots of failures (basically all characters above 0x7E fail)
Attached patch Patch to make the encodings work (obsolete) — Splinter Review
Oops, that should include MacFarsi as well, new patch coming up.
Attachment #122045 - Attachment is obsolete: true
Summary: Mac Indic encodings don't work → Make Mac codepages either work or go away from the UI
Comment on attachment 122050 [details] [diff] [review]
Patch to make the encodings work

I now think we need both these patches. Requesting review accordingly.
Attachment #122050 - Flags: superreview?(alecf)
Attachment #122050 - Flags: review?(ftang)
Attachment #122051 - Flags: superreview?(alecf)
Attachment #122051 - Flags: review?(ftang)
both? Can you explain? is ftang ever going to review this? I'm reluctant to
spend the sime sr'ing if ftang hasn't reviewed yet.
We need the patch to make the encodings work because we use them internally, as
in bug 61031 and bug 65773, and we need the patch to remove them from the UI
because (AFAIK) there aren't any webpages that use them.
Comment on attachment 122051 [details] [diff] [review]
Alternative patch: remove Mac encodings from the menus

sr=alecf
Attachment #122051 - Flags: superreview?(alecf) → superreview+
Comment on attachment 122050 [details] [diff] [review]
Patch to make the encodings work

sr=alecf
Attachment #122050 - Flags: superreview?(alecf) → superreview+
re : bug 199143 comment #10
> IMHO support for Macintosh character sets is pointless for the Unix and
> MS Windows versions because web pages are not written in Mac encodings.

  If we can be 100% sure of that (there might be some web sites in MacRoman??)
_and_ we don't want Mozilla to be a (user-friendly but huge :-)) replacement for
iconv(1) or other charset converters, we can remove decoders for Mac charset to
Unicode on all platforms and keep encoders _only_ on Mac. This will cut down the
footprint a little (150 bytes per decoder? and there are about a dozen of them).
I am not sure if this is the 'right' thing because Mozilla as a 'user-friendly'
charset converter may be useful to some people.
I just realized that intl/uconv is not only for Mozilla but also can be used by
other applications that may have little to do with web browsing. For them,
decoders for Mac charsets may be useful...
Comment on attachment 122050 [details] [diff] [review]
Patch to make the encodings work

moving review request
Attachment #122050 - Flags: review?(ftang) → review?(jshin)
Comment on attachment 122051 [details] [diff] [review]
Alternative patch: remove Mac encodings from the menus

.
Attachment #122051 - Flags: review?(ftang) → review?(jshin)
 I have made you confused. When properties are looked up, charset names
are turned to lowercase. By properties, I mean things like notForBrowser,
notForMail(?), multiByte, and so forth. So, there's no problem with them.

 However, in other places, as bz wrote, we assume that every charset
coming from the internal source (as opposed to external documents) is
canonical and don't invoke the charset alias resolution code (either by
skipping GetCharsetAlias or calling GetUnicode(De|En)coderRaw instead of
GetUnicode(De|Encode). This assumption breaks down with non-canonical
names hard-coded and leads Mozilla to fail GetUnicode(En|De)coderRaw
call.

This hadn't been noticed (by me and others) because most
of us(them) have a set of character codings in the 'static' list that we
use often and only rarely picks up a character coding from the View |
Char. coding | More list (that goes into 'cached' list). Even on those
rare occasions, we'd not have noticed the problem if the one picked up
is windows-xxx or x-* whose canonical names are in lowercase.

  Hope this I made the case clearer than before :-)
sorry the prev. comment was for another bug
Comment on attachment 122050 [details] [diff] [review]
Patch to make the encodings work

r=jshin
Attachment #122050 - Flags: review?(jshin) → review+
Comment on attachment 122051 [details] [diff] [review]
Alternative patch: remove Mac encodings from the menus

I have some reservation about removing all x-mac-*. At least x-mac-roman seems
to have to be left. If someone has old plain text files in MacOS 9 and want to
open them withMozilla, it'd be nice to have MacRoman.

BTW, I think the corresponding entries in intl/ uconv/ src/
charsetTitles.properties should be removed as well.
Comment on attachment 122051 [details] [diff] [review]
Alternative patch: remove Mac encodings from the menus

Pls, ignore my previous comment. I forgot that users can always add back
character encodings with View | character coding | Custom. So, charsetTitle
should be kept as they're even though charset names are removed in 'More' list.
Anyway, I'd keep x-mac-roman, but it's up to you.
Attachment #122051 - Flags: review?(jshin) → review+
Yes, I agree with you about x-mac-roman and will check in without removing it,
after only a week or two ago I found myself opening an HTML document at my day
job which turned out to be in MacRoman, though labelled as Latin1, and I had to
use that menu item to make the quotes and apostrophes display correctly.
FWIW I have web pages on my site in the following encodings:

index.ar.utf8.html         index.nb.html
index.ar.win-1256.html     index.nl.html
index.cy.html              index.nn.html
index.de.html              index.po.html
index.en.html              index.pt-br.html
index.eo.html              index.pt-pt.html
index.es-es.html           index.ro.html
index.es-mx.html           index.ru.iso8859-5.html
index.et.html              index.ru.koi8-r.html
index.fr.html              index.ru.mac-cyr.html
index.he.mac-hebrew.html   index.ru.utf8.html
index.he.utf8.html         index.ru.win-1251.html
index.html2                index.sv.html
index.html3                index.tr.iso8859-5.html
index.hu.php               index.tr.mac-turkish.html
index.it.html              index.tr.utf8.html
index.ja.html              index.uk-unfinished.html
index.kn.html              index.zh-hans.html
index.la.html              index.zh-hant.html
QA Contact: amyy → i18n
Blocks: encoding
Bug 801402 made these unavailable except via the menu. Bug 805374 removed these from the menu. Bug 997133 removed the encoders.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 805374]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: