Closed Bug 910187 Opened 6 years ago Closed 6 years ago

Armenian localization should not use armscii-8 as the fallback encoding; should use windows-1252

Categories

(Mozilla Localizations :: hy-AM / Armenian, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hsivonen, Unassigned)

Details

(Whiteboard: [fixed by bug 910192])

https://mxr.mozilla.org/l10n-central/source/hy-AM/toolkit/chrome/global/intl.properties sets intl.charset.default to armscii-8. https://mxr.mozilla.org/l10n-central/source/hy-AM/toolkit/chrome/global-platform/win/intl.properties , https://mxr.mozilla.org/l10n-central/source/hy-AM/toolkit/chrome/global-platform/unix/intl.properties and https://mxr.mozilla.org/l10n-central/source/hy-AM/toolkit/chrome/global-platform/mac/intl.properties set it to ISO-8859-15.

ISO-8859-15 is obviously an inappropriate value, since that encoding is much younger than the legacy encodings out there and it has nothing to do with Armenian (and is not the universal "nothing to do" encoding--windows-1252--either).

armscii-8 obviously has something to do with Armenian, but it still seems incredible that unlabeled Armenian-language legacy content on the Web would be relying on it, since it's not universally supported by browsers that are not Gecko-based. Chrome doesn't support armscii-8, and according to StatCounter Chrome is way more popular than Firefox in Armenia. Therefore, a legacy dependency on armscii-8 seems unlikely.

In fact, we're in the process of removing armscii-8 support, since it didn't get standardized as part of http://encoding.spec.whatwg.org/ due to the non-implementation in all other browsers.

Please set intl.charset.default (in all four files) to windows-1252 for the reasons given in https://developer.mozilla.org/en-US/docs/Localizations_and_character_encodings
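
The requested change amounts to a one-line edit in each properties file. As a minimal sketch only (the helper name `set_default_charset` is hypothetical and not part of any Mozilla tooling), the edit could be expressed like this:

```python
import re

# Matches a line of the form "intl.charset.default=<anything>".
_KEY_LINE = re.compile(r"^([ \t]*intl\.charset\.default[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_default_charset(properties_text: str, charset: str = "windows-1252") -> str:
    """Return the properties text with intl.charset.default set to `charset`.

    Hypothetical helper for illustration; it only understands simple
    "key=value" lines and does not handle line continuations.
    """
    if _KEY_LINE.search(properties_text):
        # Keep the key and any spacing, replace only the value.
        return _KEY_LINE.sub(lambda m: m.group(1) + charset, properties_text)
    # Key not present: append it on its own line.
    separator = "" if properties_text.endswith("\n") or not properties_text else "\n"
    return properties_text + separator + f"intl.charset.default={charset}\n"


example = "# fallback encoding\nintl.charset.default=armscii-8\n"
updated = set_default_charset(example)
```

The same call would apply unchanged to the three platform-specific files that currently say ISO-8859-15.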
[fixed by bug 910192]
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 910192]
Hi. I propose to not remove the support of Armscii-8 _now_ and wait some 1-2 years. There are plenty of legacy websites in Armenia that are encoded in Armscii-8. As a webmaster working in Ubuntu/Firefox I have no other choice than to work in Firefox when dealing with this kind of legacy website. There is an ongoing process of transferring Armenian websites to UTF-8, but before this is fully achieved Armenian webmasters need a browser with Armscii-8 support. Pointing to StatCounter or Chrome is not correct IMHO. Please keep Armscii-8 for some time; it will die naturally some time later. By the way, MySQL continues to support Armscii-8.
(In reply to Aram Palian from comment #2)
> Hi. I propose to not remove the support of Armscii-8 _now_ and wait some 1-2
> years.

We already removed support in Firefox 19 (except for the menu override, which is going away in Firefox 28). The current release is 25. That's a long time without complaints and suggests users aren't actively encountering Armscii-8 sites. Usually we get feedback much faster even about breaking a single obscure site--let alone "plenty" of sites.

Was your comment prompted by a specific site no longer working (which one?) or by seeing the title of this bug (which is misleading, since the Armenian localization has actually been falling back to ISO-8859-15 instead of Armscii-8, because the Armscii-8 value was overridden by ISO-8859-15 for each platform)?

> There are plenty of legacy websites in Armenia that are encoded in
> Armscii-8. 

Armscii-8 doesn't work in IE, Chrome, Opera (tested even Presto-based) or Safari. Seems like a very bad idea for Web sites to have been Firefox-only. How come Web authors have considered it reasonable to keep their sites Firefox-only for this long? If having their sites not work in IE, Chrome, Opera and Safari hasn't prompted a migration to UTF-8 already, why would a couple more years of Armscii-8 in Firefox help?

Can you, please, provide URLs for some of the Armscii-8-based sites?
Firefox 19 was released on February 19, 2013, so we went 9 months before anyone noticed.
If there is no support in a browser, we use the User-Defined feature. So there is no problem to use Armscii 8 in these browsers. But the problem was that You need to go to the browser Preferences, add the necessary Font to the User Defined. It is the main cause that You didn't see much complaint, because users in Armenia always relied on this User Defined feature. In Firefox it is much easier for a newbie. That was the difference.

Well actually I am using Firefox 25, and when it doesn't automatically recognise Armscii-8 then I go to View->Character Encodings->More Encodings -> ... -> Armscii 8.

So I propose to not remove this feature, which You plan to remove in Firefox 28, from the Firefox menu.

///Can you, please, provide URLs for some of the Armscii-8-based sites?///

For example actually I am working on this website. We prepare to move it to UTF-8 the next year.
(In reply to Aram Palian from comment #5)
> If there is no support in a browser, we use the User-Defined feature. So
> there is no problem to use Armscii 8 in these browsers. But the problem was
> that You need to go to the browser Preferences, add the necessary Font to
> the User Defined.

No, the site provides http://main.am/armenianchurch/ARIALAM.TTF via CSS, so there's no need for the user to configure anything. Except for some text in the middle column (which should be trivial to fix via CSS), the site renders using Armenian glyphs (assigned to the Latin 1 range; not the x-user-defined range!) in Firefox 28, too, so for this site, there's no need to keep Armscii-8 support in Firefox.

> It is the main cause that You didn't see much complaint,
> because users in Armenia always relied on this User Defined feature.

That feature remains in Firefox, though this particular site doesn't actually use the feature even though on superficial inspection it appears to.

> Well actually I am using Firefox 25, and when it doesn't automatically
> recognise Armscii-8 then I go to View->Character Encodings->More Encodings
> -> ... -> Armscii 8.
>
> So I propose to not remove this feature, which You plan to remove in
> Firefox 28, from the Firefox menu.

Can you give examples of sites that have recently required you to do this?

> ///Can you, please, provide URLs for some of the Armscii-8-based sites?///
> 
> For example actually I am working on this website. We prepare to move it to
> UTF-8 the next year.

(In reply to Aram Palian from comment #6)
> The link http://main.am/armenianchurch/Program/General/FS2-1.htm

Well, there's a significant difference between "plenty of legacy websites" and "my own site". Especially when it turns out that the site in question isn't actually relying on armscii-8 support. And not even relying on x-user-defined support but using Latin 1 entities to commandeer the Latin 1 range for non-Latin 1 glyphs via @font-face.
This site, for example:
http://archive.aravot.am/
It is the archive of the "Aravot" daily newspaper from 1998 to 2005. It is a well-known newspaper.
Choose a random day.
When I opened it, I didn't see the letters, even though the browser picked "User defined".
Then I went to the menu, View->Character encodings ..., and the Armenian letters appeared.
The same thing happens here:
http://archive.hetq.am/arm/
It is another well-known media outlet in Armenia that keeps its old material archived in Armscii-8.

Note that I have properly configured "User defined", despite this fact I use the "View->..->Armscii 8" menu to properly view the site.

The behaviour is exactly the same both in Windows and Ubuntu. Firefox 25.
So if you remove Armscii-8 from the "More Encodings" then I don't know how I can access these web-sites.
(In reply to Aram Palian from comment #9)
> http://archive.aravot.am/
> http://archive.hetq.am/arm/

Thanks. OK, so there are at least two *archive* sites that are indeed armscii-8-encoded but declare x-user-defined instead of declaring armscii-8 and rely on users having a font called Arial AM (http://main.am/armenianchurch/ARIALAM.TTF) that puts Armenian glyphs at Latin-1 Supplement code points.
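
To illustrate why the font trick works, here is a rough Python sketch. It assumes that ARMSCII-8 places the 38 Armenian letters at bytes 0xB2 to 0xFD, alternating capital and small forms; that layout is an assumption of this sketch, and punctuation bytes are deliberately not handled. The function name is hypothetical.

```python
# Assumed ARMSCII-8 letter range: 0xB2 (capital AYB) through 0xFD (small FEH),
# with capital and small letters alternating. Punctuation is not covered here.
ARMSCII8_FIRST_LETTER = 0xB2
ARMSCII8_LAST_LETTER = 0xFD


def decode_armscii8_letters(data: bytes) -> str:
    """Decode ASCII and the assumed ARMSCII-8 letter range; other bytes become U+FFFD."""
    out = []
    for b in data:
        if ARMSCII8_FIRST_LETTER <= b <= ARMSCII8_LAST_LETTER:
            index, is_small = divmod(b - ARMSCII8_FIRST_LETTER, 2)
            base = 0x0561 if is_small else 0x0531  # Unicode Armenian small / capital AYB
            out.append(chr(base + index))
        elif b < 0x80:
            out.append(chr(b))
        else:
            out.append("\ufffd")
    return "".join(out)


raw = bytes([0xB2, 0xB3])                  # capital and small AYB under the assumed layout
as_armenian = decode_armscii8_letters(raw)  # real Armenian code points
as_windows_1252 = raw.decode("cp1252")      # what a windows-1252 fallback produces
```

Read as windows-1252, the same bytes land in the Latin-1 Supplement range, which is exactly where a font like Arial AM paints its Armenian glyphs. The text looks right but is not Armenian as far as copy-paste or search is concerned.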

> Note that I have properly configured "User defined", despite this fact I use
> the "View->..->Armscii 8" menu to properly view the site.

"Properly" is an ambiguous word when it comes to "User defined". What exactly do you mean by "properly" in this case?

> The behaviour is exactly the same both in Windows and Ubuntu. Firefox 25.
> So if you remove Armscii-8 from the "More Encodings" then I don't know how I
> can access these web-sites.

I have to wonder if other users know how to access those sites. They don't work in any browser out-of-the-box. Up to and including Firefox 27, they can be read with fonts that put Armenian glyphs in Armenian code points by choosing ARMSCII-8 from deeply-nested menus for each page. Alternatively, they can be read in IE/Chrome/Safari by having a font called "Arial AM" that puts Armenian glyphs in the Latin-1 Supplement range installed. I.e. http://main.am/armenianchurch/ARIALAM.TTF

In theory, they could be read in Firefox by having a font called "Arial AM" that puts Armenian glyphs in the PUA in the U+F780 to U+F7FF range, but I doubt such a font actually exists in wide use. (Does it?)
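
For reference, the PUA range mentioned above follows from how x-user-defined is decoded in the WHATWG Encoding Standard: ASCII bytes map to themselves and each byte from 0x80 to 0xFF maps to U+F780 plus its offset from 0x80. A minimal Python sketch (the function name is hypothetical):

```python
def decode_x_user_defined(data: bytes) -> str:
    """Decode bytes the way the x-user-defined encoding does.

    Bytes below 0x80 are ASCII; bytes 0x80..0xFF map to the
    Private Use Area code points U+F780..U+F7FF.
    """
    return "".join(
        chr(b) if b < 0x80 else chr(0xF780 + (b - 0x80))
        for b in data
    )


sample = decode_x_user_defined(b"A\xb2\xff")
```

So a font would need glyphs at those Private Use Area code points, not at Latin-1 Supplement or Armenian code points, for such a page to display as Armenian under this decoding.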

Since the sites seem to value having their archives available, it is rather bizarre to make them available in a way that does not work in any browser out-of-the-box when converting them into a format that would work (UTF-8) would be trivial.

I think we should primarily pursue this as an evangelism issue and ask those sites to convert their archives into UTF-8. Secondarily, I think we should consider making x-user-defined in non-XHR context an alias of windows-1252 to enable those archives to edit their CSS to add http://main.am/armenianchurch/ARIALAM.TTF to their archives via @font-face and have it at least look visually like Armenian text, even if bogus for copy-paste/search/etc., across browsers (bug 213517).
Filed bug 944673 and bug 944671 about evangelizing the sites to make their archives readable cross-browser without special config or user action.

Firefox code changes continue over in bug 213517.

This is all off-topic for this bug report anyway.
Well, You persuaded me :)
Move on as You have planned before.

I came here just wondering why my "User defined" option is not working automatically as it does in IE/Opera.

By saying properly I mean that I assigned the font "Arial AM" to the User defined. This works automatically in IE but not in Firefox. Also I was concerned because I was relying on Firefox in my Linux work.

Now looking at bug 213517 I think I understand that it is a Firefox-specific issue.

Excuse me for going off-topic. I will send a notification to Aravot and Hetq.am
(In reply to Aram Palian from comment #12)
> I came here just wondering why my "User defined" option is not working
> automatically as it does in IE/Opera.

Where's the UI for that in Opera? I don't see font UI for User Defined in Linux Presto-Opera or in Mac Blink-Opera. (In fact, I see no script-specific font UI in Blink-Opera.)

> By saying properly I mean that I assigned the font "Arial AM" to the User
> defined.

Do you also have Arial LatArm installed and is it common to have both installed if one of them is installed? (Asking to figure out if bug 213517 should be fixed the Chrome way or the IE way.)

> Excuse me for going off-topic.

Thanks for making me aware of the pattern of using ARMSCII-8 but declaring x-user-defined in <meta>. That sort of thing wouldn't have been apparent from either telemetry or a Web crawl looking for data about ARMSCII-8.
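
A crawl could in principle look for this pattern with a check along the following lines. This is only a hedged sketch: the function name, the regular expression and the 0.9 threshold are all hypothetical, and the 0xB2 to 0xFD letter range is an assumption about the ARMSCII-8 layout, so false positives and negatives should be expected.

```python
import re

# Looks for a <meta> tag whose charset is declared as x-user-defined.
META_X_USER_DEFINED = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?x-user-defined", re.IGNORECASE
)


def looks_like_mislabeled_armscii8(page: bytes, threshold: float = 0.9) -> bool:
    """Heuristic: page declares x-user-defined and its non-ASCII bytes
    fall mostly inside the assumed ARMSCII-8 letter range (0xB2..0xFD)."""
    if not META_X_USER_DEFINED.search(page):
        return False
    high_bytes = [b for b in page if b >= 0x80]
    if not high_bytes:
        return False
    letter_bytes = sum(1 for b in high_bytes if 0xB2 <= b <= 0xFD)
    return letter_bytes / len(high_bytes) >= threshold


page = (
    b'<meta http-equiv="Content-Type" content="text/html; charset=x-user-defined">'
    + bytes([0xB2, 0xB3, 0xB4, 0xB5])
)
flagged = looks_like_mislabeled_armscii8(page)
```

A crawl that only counts declared armscii-8 labels would miss such pages entirely, which is the gap described above.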

> I will send a notification to Aravot and Hetq.am

Thank you.
///Where's the UI for that in Opera?///

* Oh, now that You ask, I see that Opera handles this situation automatically. I don't know how, but I opened an Armscii site and it displayed automatically. This is the latest Opera on Win XP. I always thought that I had made adjustments there... now I see that I didn't. Excuse me for this incorrect information.

* Google Chrome falls back to windows-1252 and displays the text automatically if Arial LatArm and Arial AM are installed. Personally I don't use Chrome because I am concerned about privacy issues there.

* In IE 8 I configure User Defined manually and it works afterwards without problems.

* Only in Firefox do I need to go to the menu and select ArmSCII-8 each time, despite the fact that User Defined is set the same as in IE.

If You plan to change this in Firefox then IMHO it is better to have this feature work the Google Chrome way. It is much easier.

///Do you also have Arial LatArm installed and is it common to have both installed if one of them is installed?////

Yes, it is very common here. It is a de facto standard in Armenia for Windows computers to have these fonts.
Thanks Aram, this is great stuff!