Closed Bug 74741 Opened 23 years ago Closed 23 years ago

Bidi: extra bidi encodings to be removed

Categories

(Core :: Internationalization, defect)

x86
Windows 98
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla0.9.1

People

(Reporter: mrous, Assigned: mkaply)

Attachments

(1 file)

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)
BuildID:    00000000 - Bidi build dated 20010402

The ISO-8859-6-I, ISO-8859-6-E, ISO-8859-8-E,
and IBM-864-I encodings need to be removed.

Reproducible: Always
Steps to Reproduce:
1. Go to the View menu - Character sets encodings

Actual Results:  The menu lists far too many Bidi encodings

Expected Results:  
For Arabic there are only: Windows-1256, IBM-864, & ISO-8859-6
For Hebrew : Windows-1255, IBM-862, ISO-8859-8, & ISO-8859-8-I
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
These things don't show up in the trunk build yet.
erik, please edit your patch. Take a look at
xpfe/browser/resources/locale/en-US/navigator.properties
and remove them from intl.charsetmenu.browser.more5=
You should remove iso-8859-6-i, iso-8859-6-e, iso-8859-8-e, and ibm864i from the list.
I am reassigning this to you since we have not checked in
xpfe/browser/resources/locale/en-US/navigator.properties yet.
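
For illustration only, here is a minimal Python sketch of the requested edit: dropping the four charsets from a comma-separated property value. The sample value is hypothetical (the real contents of intl.charsetmenu.browser.more5 are not quoted in this bug), and strip_charsets is an illustrative helper, not part of any Mozilla tooling.

```python
# Charsets this bug asks to drop from the menu (names as given in the comment above).
REMOVE = {"iso-8859-6-i", "iso-8859-6-e", "iso-8859-8-e", "ibm864i"}


def strip_charsets(line, remove=REMOVE):
    """Return a 'key=a, b, c' property line without the charsets in `remove`."""
    key, sep, value = line.partition("=")
    if not sep:
        return line  # not a key=value line; leave it untouched
    kept = [
        name.strip()
        for name in value.split(",")
        if name.strip() and name.strip().lower() not in remove
    ]
    return key + "=" + ", ".join(kept)


# Hypothetical sample value; the real list in navigator.properties may differ.
sample = (
    "intl.charsetmenu.browser.more5="
    "ISO-8859-6, ISO-8859-6-I, ISO-8859-6-E, ISO-8859-8-E, IBM864, ibm864i, windows-1256"
)
print(strip_charsets(sample))
```

The comparison is case-insensitive, so entries are matched whether they are written as "ISO-8859-6-I" or "iso-8859-6-i".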
Assignee: ftang → erik
Status: ASSIGNED → NEW
Marking as moz0.9 since this could be fixed when we land the
xpfe/browser/resources/locale/en-US/navigator.properties changes.
Target Milestone: --- → mozilla0.9
Why should we remove: ISO-8859-6-I, ISO-8859-6-E and ISO-8859-8-E?
They are listed by IANA.
See ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets:
  
  ...
  Name: ISO_8859-6-E                                       [RFC1556,IANA]
  MIBenum: 81
  Source: RFC-1556
  Alias: csISO88596E

  Name: ISO_8859-6-I                                       [RFC1556,IANA]
  MIBenum: 82
  Source: RFC-1556
  Alias: csISO88596I

  ...

  Name: ISO_8859-8-E                                  [RFC1556,Nussbacher]
  MIBenum: 84
  Source: RFC-1556
  Alias: csISO88598E
Just because they are listed in IANA's registry does not mean that they are
actually used in the real world. Maha and Simon, please confirm.
I agree with Erik. Also note that we are only talking about removing these
charsets from the menus, where they don't add anything except confusion, since
the Bidi options allow the selection of implicit or visual ordering separately
from the charset.
Target Milestone: mozilla0.9 → mozilla0.9.1
Do we want to do something different with the naming of the Visual and Logical
Hebrew menus?

IE actually says Visual and Logical; we have Visual Hebrew and Hebrew
(ISO-8859-8-I).
Can we get feedback from users in Israel on what is expected or easier to
understand?  Are there other examples of software that handle this?  Shall
we post this question to n.p.m.i18n?
The nomenclature question is certainly worth raising on the newsgroup.

Although it might seem inconsistent to have "Hebrew" and "Visual Hebrew",
there's a subliminal evangelism issue: it promotes the concept of logical Hebrew
as normative and visual Hebrew as something exceptional and implicitly less
desirable.
I'll take this one
Assignee: erik → mkaply
Frank, can I get an r= for this and an sr= from Erik?

I'll check it in for 0.9.1 after the tree opens.

I see Simon's point and I think I agree with it. I think the term Logical would 
just confuse the issue.

A couple more things: IE 6 added a new Arabic encoding - ASMO 708 - any idea 
what that is?

Also, IE calls the 8XX encodings DOS encodings rather than IBM encodings. While 
I like putting the IBM name all over the product, what do you think?
Status: NEW → ASSIGNED
I have looked everywhere I could, and everything indicates that ASMO 708 is
exactly the same as ISO 8859-6. Take a look at:
<http://www.marko.net/arabic/lists/9810/0006.html>
Blocks: 70344
mkaply> Also, IE calls the 8XX encodings DOS encodings rather than IBM
mkaply> encodings.

I looked at the IANA charset list, and all of these are listed as IBM8xx
without any DOS8xx aliases or any specified preferred MIME name.  So I 
think the IBM names are more valid than the DOS names.

ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets:
<snip>
...
Name: IBM852                                              [RFC1345,KXS2]
MIBenum: 2010
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp852
Alias: 852
Alias: csPCp852

Name: IBM855                                              [RFC1345,KXS2]
MIBenum: 2046
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp855
Alias: 855
Alias: csIBM855

Name: IBM857                                              [RFC1345,KXS2]
MIBenum: 2047
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp857
Alias: 857
Alias: csIBM857

...

Name: IBM862                                              [RFC1345,KXS2]
MIBenum: 2013
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp862
Alias: 862
Alias: csPC862LatinHebrew

Name: IBM864                                              [RFC1345,KXS2]
MIBenum: 2051
Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991
Alias: cp864
Alias: csIBM864

...

Name: IBM866                                                     [Pond]
MIBenum: 2086
Source: IBM NLDG Volume 2 (SE09-8002-03) August 1994
Alias: cp866
Alias: 866
Alias: csIBM866
</snip>

mkaply> IE 6 added a new Arabic encoding - ASMO 708 - any idea what that is?

IANA lists this as an alias to ISO_8859-6:
<snip>
Name: ISO_8859-6:1987                                    [RFC1345,KXS2]
MIBenum: 9
Source: ECMA registry
Alias: iso-ir-127
Alias: ISO_8859-6
Alias: ISO-8859-6 (preferred MIME name)
Alias: ECMA-114
Alias: ASMO-708
Alias: arabic
Alias: csISOLatinArabic
</snip>

There's already an "asmo-708" alias in charsetalias.properties.
(http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/charsetalias.properties#188)
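
As a rough sketch of what such an alias means in practice, the Python below builds a lookup table from the IANA entry quoted in this comment and resolves a label to the preferred MIME name. It only illustrates the idea; it does not reproduce the format of charsetalias.properties or how Mozilla's converter code reads it, and the helper names are made up for this example.

```python
# Names and aliases copied from the IANA entry for ISO_8859-6:1987 quoted above.
ISO_8859_6_NAMES = [
    "ISO_8859-6:1987",
    "iso-ir-127",
    "ISO_8859-6",
    "ISO-8859-6",
    "ECMA-114",
    "ASMO-708",
    "arabic",
    "csISOLatinArabic",
]


def build_alias_table(preferred, names):
    """Map every lower-cased name to the preferred MIME name."""
    return {name.lower(): preferred for name in names}


ALIAS_TABLE = build_alias_table("ISO-8859-6", ISO_8859_6_NAMES)


def resolve(label):
    """Return the preferred name for a label, or None if it is unknown."""
    return ALIAS_TABLE.get(label.strip().lower())


print(resolve("ASMO-708"))  # ISO-8859-6
print(resolve("asmo-708"))  # ISO-8859-6 (lookup is case-insensitive)
```

With a table like this, a page labelled "ASMO-708" is handled as ISO-8859-6, so a separate menu entry adds nothing.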
Another thought on the IBM vs DOS naming.  We could take the more neutral
name of CP8xx.  These are aliased in IANA (see previous comment).
More data:  Netscape 4.x uses "Cyrillic (CP866)", but this was changed in Mozilla
to "Cyrillic/Russian (IBM-866)".  This may be unintentional because there are
2 entries in charsetTitles.properties:

     cp-866.title = Cyrillic (CP-866)
     ...
     ibm866.title = Cyrillic/Russian (IBM-866)
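
One quick way to look for other pairs like this is to group title keys by the number embedded in the key. The Python below is only a heuristic sketch over the two entries quoted above; the function name is made up, and on a full charsetTitles.properties it could flag unrelated charsets that happen to end in the same number.

```python
import re
from collections import defaultdict

# The two entries quoted above from charsetTitles.properties.
TITLES = {
    "cp-866.title": "Cyrillic (CP-866)",
    "ibm866.title": "Cyrillic/Russian (IBM-866)",
}


def group_by_code_page(titles):
    """Return {number: [keys]} for numbers that appear in more than one key."""
    groups = defaultdict(list)
    for key in titles:
        match = re.search(r"(\d+)\.title$", key)
        if match:
            groups[match.group(1)].append(key)
    return {num: sorted(keys) for num, keys in groups.items() if len(keys) > 1}


print(group_by_code_page(TITLES))  # {'866': ['cp-866.title', 'ibm866.title']}
```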

I also prefer CP names. Are there any other company names in the
charset menu?
sr=erik

Brian, since Frank isn't here, would you please review the patch and add your
r= if OK?
I'm marking this one fixed.

We'll open a bug for the name changes (CP, IBM) after the email discussion is 
complete.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
QA Contact: andreasb → ylong
Verified that those extra bidi charsets have been removed on the 05-15 trunk build.
Status: RESOLVED → VERIFIED