Closed Bug 939311 Opened 11 years ago Closed 3 years ago

Consider removing ISO-8859-8 (Visual Hebrew) support

Categories

(Core :: Internationalization, enhancement)

x86_64
Other
enhancement
Not set
normal

RESOLVED WONTFIX

People

(Reporter: eric, Unassigned)

Details

(Keywords: memory-footprint)

User Agent: Mozilla/5.0 (X11; CrOS x86_64 4731.62.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.50 Safari/537.36

Steps to reproduce:

Consider removing ISO-8859-8 (Visual Hebrew) support

ISO-8859-8 (Visual Hebrew) is one of several encodings specified in 1987. It's not used in any real pages on the web (see below), yet it adds a surprising amount of complexity to Blink, and probably to Gecko too. The Blink bug is:
https://code.google.com/p/chromium/issues/detail?id=319643

At the same time I'm also proposing removing -webkit-rtl-ordering, which appears to have been added as a way to disable support for this encoding inside form controls:
http://trac.webkit.org/changeset/12027
I don't believe Mozilla ever had an equivalent to this property.

Motivation:

To quote a member of Google's Bidi team:
"Visual ISO-8859-8 Hebrew is left over from the days circa Windows 3.1 when browsers could not handle right-to-left languages, so people created this hack of writing Hebrew text backwards per line, using the ISO-8859-8 one-byte encoding. The newer standard is Logical ISO-8859-8-I Hebrew, which is exactly the same byte encoding but signifies that the text is in normal letter order and the browser is responsible for displaying it correctly.

There is no visual Arabic because by the time they got on the Web bandwagon, browsers had already advanced enough to display RTL properly. So ISO-8859-6 Arabic is logical only."
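
To make the visual/logical distinction in the quote concrete, here is a minimal Python sketch. It assumes pure Hebrew text with no digits or Latin runs, since real visual-order authoring also had to deal with those. The helper name `to_visual` and the sample string are illustrative and do not come from any browser code. It shows that both orderings use the same one-byte code table and differ only in the order of the bytes within each line.

```python
# Illustrative sketch only: models "writing Hebrew text backwards per line".
# Assumes pure Hebrew text; digits and Latin runs are deliberately ignored.

# "shalom olam" (hello world) in logical (reading) order, written with
# escapes so the source does not depend on the editor's bidi rendering.
LOGICAL = "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd"


def to_visual(text: str) -> str:
    """Reverse each line so that a left-to-right renderer shows it correctly."""
    return "\n".join(line[::-1] for line in text.split("\n"))


# Python's "iso8859_8" codec is the shared one-byte table. The Visual versus
# Logical label changes only how a browser lays the characters out.
logical_bytes = LOGICAL.encode("iso8859_8")
visual_bytes = to_visual(LOGICAL).encode("iso8859_8")

print(logical_bytes.hex(" "))
print(visual_bytes.hex(" "))
```

A page labeled ISO-8859-8 would carry something like `visual_bytes` and expect the browser to paint the characters left to right without reordering, while a page labeled ISO-8859-8-I would carry `logical_bytes` and rely on the browser's bidi algorithm.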

The same gentleman was also kind enough to run a crawl over Google's search index, and was unable to find a page which used this encoding in the top (extremely large number) of sites.

I'm proposing removing support for this dinosaur from the Web.
Here is a link to the blink-dev discussion on the topic:
https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/nhCfOhRV-I4

Was the crawl based on looking at encoding labels or did it also include looking at byte patterns in unlabeled pages?

Confirming in the sense that we should definitely consider.

Status: UNCONFIRMED → NEW
Ever confirmed: true

Using the search URL at https://groups.google.com/a/chromium.org/d/msg/blink-dev/nhCfOhRV-I4/kReGHNYzV3gJ and some variations on the same idea I found several pages that use this encoding, mostly old but some dated from this year. Granted the number of such pages is some tiny percentage of the web, and if we didn't support it there would be no justification for spending resources on *adding* support, but I don't understand that as an argument for ripping out existing support.

I think the argument is that if Servo does not need this, we don't need the added complexity either. But it's not entirely clear to me that is true given what has been stated about various sites in the Blink thread.

Component: General → Internationalization
Severity: normal → enhancement
Keywords: footprint

The Chromium bug claims "Fixed", but Visual Hebrew support hasn't been removed from Chrome.

These two look the same in both Firefox and Chrome:
https://hsivonen.com/test/moz/hebrew-visual.htm
https://hsivonen.com/test/moz/hebrew-logical.htm

Let's check back in another 7 years.

Logical encoding and Unicode support became common in the mid-2000s, with sites such as Wikipedia that never had visual Hebrew support (please correct me if I'm wrong). While logical Hebrew support, together with easier authoring tools, was an important part of the rise of user-generated content on the web, removing visual support from browsers means that we will lose access to content created in visual Hebrew during the first decade or more of the web, including content archived by the Internet Archive and similar projects.

Yeah, in the light of the Support Existing Content design principle and how little (already written) code this feature involves, I think it makes sense to outright WONTFIX this. (I checked the code only after writing comment 6.)

(Anecdotally, a university student from Israel told me in late 2018 that the front page of their university had been using visual Hebrew recently.)

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX