Last Comment Bug 308187 - (haaretz) haaretz.co.il - wrong charset in HTTP headers (csISOLatinHebrew instead of Windows-1255)
Status: RESOLVED FIXED
[contacted] - see comment #3
Product: Tech Evangelism Graveyard
Classification: Graveyard
Component: Hebrew (show other bugs)
Version: unspecified
Platform: All All
Importance: P1 normal
Target Milestone: ---
Assigned To: hebrew
http://www.haaretz.co.il/hasite/pages...
Duplicates: 308447 308532 311658 314715 315864 317089 318102 319781 320269 342747 412942 431066
Reported: 2005-09-12 10:14 PDT by Reuven M. Lerner
Modified: 2015-04-19 23:45 PDT (History)


Attachments
e-mail sent to helpdesk@haaretz.co.il (3.94 KB, text/html)
2005-11-02 01:42 PST, Uri Bernstein (Google)

Description Reuven M. Lerner 2005-09-12 10:14:18 PDT
User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b4) Gecko/20050908 Firefox/1.4
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b4) Gecko/20050908 Firefox/1.4

Firefox 1.0.x displayed Hebrew Web pages quite nicely, taking into account the
fact that numbers are displayed from left to right.  The 1.5 beta build, on the
Mac at least, seems to be a regression.  Numbers are normally displayed just
fine, but when they are attached to a word (as is often the case with a hyphen),
the numbers are reversed.  It might also occur sometimes without a hyphen. 
Moreover, the number is placed on the wrong part of the line, out of its correct
place within the sentence.

Reproducible: Always

Steps to Reproduce:
1. Go to
http://www.haaretz.co.il/hasite/pages/ShArtPE.jhtml?contrassID=2&subContrassID=3&sbSubContrassID=0&itemNo=623784
2. Look at the number in the headline, which should be 1967.
3.

Actual Results:  
I filed this bug report.

Expected Results:  
It should have shown the accurate year in the right place.
Comment 1 Simon Montagu :smontagu 2005-09-12 11:12:26 PDT
Seeing this on Windows also
Comment 2 Simon Montagu :smontagu 2005-09-12 11:33:46 PDT
There is an incorrect charset in the HTTP header:

Content-Type: text/html; charset="csISOLatinHebrew"

and the page encoding is set to Hebrew Visual. Resetting manually to Hebrew
(Windows-1255) makes the ordering OK again.

This is really an evangelism bug, since csISOLatinHebrew is registered as an
alias of ISO-8859-8 in the IANA registry.
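
As a rough illustration of the diagnosis above, the following Python sketch reads the charset out of the quoted header value and resolves it through Python's own codec registry, which also lists csISOLatinHebrew as an alias of ISO-8859-8. The helper name http_charset and the sample word are assumptions made for this example; nothing here is browser code.

```python
import codecs
from email.message import Message
from typing import Optional


def http_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, unquoted and lowercased."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# The header value reported in this comment, quotes included.
label = http_charset('text/html; charset="csISOLatinHebrew"')

# Python's codec registry resolves the label to its ISO-8859-8 codec.
codec_name = codecs.lookup(label).name

# Hebrew letters occupy the same byte values in ISO-8859-8 and Windows-1255,
# so the wrong label changes how the text is ordered, not which letters appear.
sample = "שלום"
same_bytes = sample.encode("iso8859_8") == sample.encode("cp1255")
```

This is consistent with the symptom in the report: the letters themselves are readable, but the page is laid out as Hebrew Visual, so numbers and mixed-direction runs end up reversed or misplaced.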
Comment 3 Uri Bernstein (Google) 2005-09-14 03:04:41 PDT
I sent an e-mail with the details of the problem to helpdesk@haaretz.co.il (I
was provided with this address by phone, after e-mails sent to the published
address, online@haaretz.co.il, bounced).
Comment 4 Uri Bernstein (Google) 2005-09-14 05:01:26 PDT
*** Bug 308447 has been marked as a duplicate of this bug. ***
Comment 5 Uri Bernstein (Google) 2005-09-14 11:34:16 PDT
*** Bug 308532 has been marked as a duplicate of this bug. ***
Comment 6 Jo Hermans 2005-10-08 06:58:59 PDT
*** Bug 311658 has been marked as a duplicate of this bug. ***
Comment 7 Eythan Weg 2005-10-08 09:44:40 PDT
(In reply to comment #2)
> There is an incorrect charset in the HTTP header:
> 
> Content-Type: text/html; charset="csISOLatinHebrew"

Today, I see this in the header of one article:
HTTP-EQUIV="Content-Type" content="text/html; charset=windows-1255"
Still the browser uses iso8859-8.
Comment 8 Uri Bernstein (Google) 2005-10-08 09:50:30 PDT
(In reply to comment #7)
> Today, I see this in the header of one article:
> HTTP-EQUIV="Content-Type" content="text/html; charset=windows-1255"
> Still the browser uses iso8859-8.

The problem isn't (and never was) with the HTTP-EQUIV meta tags in the HTML. The
problem is with the HTTP headers themselves (which are not part of the HTML
source, so you can't see them there).
One way to see the HTTP headers (where the problem is), is using the "Live HTTP
Headers" extension (http://livehttpheaders.mozdev.org/).
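
For readers without the extension, a minimal Python sketch of the same check follows. It assumes the server answers a HEAD request with the same Content-Type it would send for a normal page load; fetch_content_type and meta_charset are hypothetical helper names, and the regular expression is a loose illustration rather than a real HTML parser.

```python
import re
import urllib.request
from typing import Optional


def fetch_content_type(url: str) -> Optional[str]:
    """Return the Content-Type header the server sends for url (not the META tag)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Type")


# Loose pattern for a charset declared inside a META tag; illustrative only.
META_CHARSET = re.compile(rb"<meta[^>]+charset=[\"']?([\w-]+)", re.IGNORECASE)


def meta_charset(html: bytes) -> Optional[str]:
    """Return the charset named in the first META declaration, if any."""
    match = META_CHARSET.search(html)
    return match.group(1).decode("ascii").lower() if match else None
```

Comparing the value returned by fetch_content_type with the result of meta_charset on the page body shows the two declarations side by side; in this bug they disagree, and the HTTP one wins.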
Comment 9 Uri Bernstein (Google) 2005-11-02 00:54:27 PST
*** Bug 314715 has been marked as a duplicate of this bug. ***
Comment 10 Uri Bernstein (Google) 2005-11-02 01:42:26 PST
Created attachment 201602 [details]
e-mail sent to helpdesk@haaretz.co.il

For the record, this is the e-mail I sent to helpdesk@haaretz.co.il on 2005-09-14.
I sent a reminder to online@haaretz.co.il on 2005-09-22, and a second reminder (to helpdesk@haaretz.co.il, online bounced again) today.
I have not received any response so far.
Comment 11 Zvi Devir 2005-11-02 09:47:04 PST
Uri, try sending an e-mail to a reporter from Captain Internet (Shachar Samocha, maybe?) and ask them if they are aware of this issue.
Comment 12 Simon Montagu :smontagu 2005-11-11 03:31:16 PST
*** Bug 315864 has been marked as a duplicate of this bug. ***
Comment 13 Dan A 2005-11-11 08:15:29 PST
Hi

I don't understand why you're not addressing the problem in FFox.

I understand that the headers say encoding X and the page says encoding Y. Why not just override the page's encoding with the one written in the HTML code? To me this looks like the right behaviour.

Consider the option where this is not an error: The author of the page really wants the browser to view the page in an encoding different than the one the server stipulates in the headers. Why not comply?
(I realize that in the case of haaretz site this IS probably an error).

I think the correct behaviour would be to use the header encoding if none was specified in the page (by HTML code), but to always allow the encoding specified in the HTML code to override.

Dan
Comment 14 Uri Bernstein (Google) 2005-11-11 09:29:05 PST
(In reply to comment #13)
> I understand that the headers say encoding X and the page says encoding Y. Why
> not just override the page's encoding with the one written in the HTML code? To
> me this looks like the right behaviour.

It might look right to you, but it would be a clear violation of the W3C standards. See e.g. http://www.w3.org/TR/i18n-html-tech-char/#IDARVFO :
"According to the HTML specification, in a case of conflict the HTTP charset declaration has the highest priority of all means of declaring the character set."

Also, doing what you are suggesting will break compatibility with all other browsers (which do follow the standard), and might cause many sites to appear broken.
Comment 15 Uri Bernstein (Google) 2005-11-11 09:33:30 PST
The actual standard says (http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2):
-----
To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

   1. An HTTP "charset" parameter in a "Content-Type" field.
   2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
   3. The charset attribute set on an element that designates an external resource.
-----
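
The priority list quoted above can be sketched as a small Python function. The function and parameter names are assumptions made for this illustration, and returning None when nothing is declared is a simplification (a real user agent would fall back to heuristics or user settings).

```python
from typing import Optional


def document_charset(
    http_charset: Optional[str],
    meta_charset: Optional[str],
    link_charset: Optional[str],
) -> Optional[str]:
    """Pick a document's charset using the HTML 4 priority order quoted above."""
    # Highest priority first: HTTP header, then META declaration,
    # then the charset attribute on the element that linked to the resource.
    for candidate in (http_charset, meta_charset, link_charset):
        if candidate:
            return candidate.lower()
    return None
```

With the values from this bug, document_charset("csISOLatinHebrew", "windows-1255", None) selects the HTTP label, which is why the correct META declaration does not help.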
Comment 16 Zvi Devir 2005-11-12 04:47:57 PST
(In reply to comment #14)
> 
> Also, doing what you are suggesting will break compatibility with all other
> browsers (which do follow the standard), and might cause many sites to appear
> broken.
> 
What? MSIE displays Haaretz pages in the correct encoding, meaning that it doesn't follow the standard. 
Comment 17 Uri Bernstein (Google) 2005-11-12 04:54:30 PST
(In reply to comment #16)
> What? MSIE displays Haaretz pages in the correct encoding, meaning that it
> doesn't follow the standard. 

MSIE displays Haaretz pages in the correct encoding for the same reason Firefox 1.0 did: it does not support quotes around encoding names in HTTP headers (bug 244964). So it does violate a standard, but not this one.

If you want to achieve bug-compatibility with IE, you'll have to suggest un-fixing bug 244964. But that too, according to that bug, will break some sites.
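
A rough Python sketch of the mechanism described in this comment follows: a parser that does not strip the quotes sees an unknown label and falls through to the META declaration, while a quote-aware parser honours the HTTP header. The function names and the small label set are hypothetical, and this is not the actual Firefox or MSIE code.

```python
from typing import Optional

# Illustrative subset of labels a browser might recognise.
KNOWN_LABELS = {"windows-1255", "iso-8859-8", "csisolatinhebrew"}


def naive_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter without removing surrounding quotes."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset":
            return value.lower()
    return None


def quote_aware_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter and strip the optional double quotes."""
    value = naive_charset(content_type)
    return value.strip('"') if value else None


def effective_charset(http_label: Optional[str], meta_label: Optional[str]) -> Optional[str]:
    """Use the HTTP label if it is recognised, otherwise fall back to META."""
    if http_label in KNOWN_LABELS:
        return http_label
    return meta_label


header = 'text/html; charset="csISOLatinHebrew"'
old_behaviour = effective_charset(naive_charset(header), "windows-1255")
new_behaviour = effective_charset(quote_aware_charset(header), "windows-1255")
```

Under these assumptions the naive parser ends up with windows-1255 by accident, and the quote-aware one ends up with the (wrong) label the server actually sent.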
Comment 18 Shoshannah Forbes 2005-11-12 08:13:16 PST
(In reply to comment #16)
> What? MSIE displays Haaretz pages in the correct encoding, meaning that it
> doesn't follow the standard. 


MSIE isn't the only browser out there. Haaretz has been broken in Safari for a long time already, and when Mac users complained, they were told "but it works fine in Firefox". So be careful about making Firefox bug-compatible with MSIE.
Comment 19 Uri Bernstein (Google) 2005-11-19 02:23:30 PST
*** Bug 317089 has been marked as a duplicate of this bug. ***
Comment 20 Jungshik Shin 2005-11-20 06:24:28 PST
(In reply to comment #17)
> (In reply to comment #16)
> > What? MSIE displays Haaretz pages in the correct encoding, meaning that it
> > doesn't follow the standard. 
> 
> MSIE displays Haaretz pages in the correct encoding for the same reason Firefox
> 1.0 did: it does not support quotes around encoding names in HTTP headers (bug
> 244964). So it does violate a standard, but not this one.

Actually, MSIE isn't compliant with the W3C spec: it gives higher priority to the META declaration than to HTTP, which is why so many web sites (where MSIE is dominant) are broken in this respect. Anyway, has anybody contacted the webmaster of the site?

 
Comment 21 Uri Bernstein (Google) 2005-11-20 06:31:21 PST
(In reply to comment #20)
> 
> Actually, MSIE isn't compliant with the W3C spec and gives a higher priority to
> the META declaration over HTTP, which is why so many web sites (where MSIE is
> dominant) are broken in this respect. 

Hmmm, so I guess I was misinformed. I apologize.

> Anyway, has anybody contacted the webmaster of the site?

I tried to - see comment #10. After getting no response, I sent an e-mail today to the editor of the online edition (peterh@haaretz.co.il), in which I complained about the unresponsiveness of the technical team. I'll post an update if I get any response.

Comment 22 zohar 2005-11-20 07:18:46 PST
I've sent mail to help@haaretz.co.il earlier today. This is "captain internet's" support Q&As address (supposed to be). I'll post anything I get back.
Comment 23 ron v 2005-12-08 14:32:20 PST
I got a response from Oded, the editor of Captain Internet section as follows:

Ron hi,

We are aware of the problem

Unfortunately our computer people have yet to figure out how to mend it without damaging the rest of the site

In the meantime, a good Samaritan (hell, a great one) who goes by the name of Effie Nadiv wrote a patch for the problem

http://www.effie.co.il/zvuvu/mozilla/haaretz.html

we hope it will be fixed soon and I apologize for the inconvenience 

oded

From: Ron ...
Sent: Thursday, December 08, 2005 10:36 PM
To: eyal.saloniki@t...; eyal.saloniki@t...; odedy@h...
Subject: You have a bug

Hello,

I am an avid reader of your online edition and would like to point out a problem in your site.

Your website's HTTP server is using the wrong HTTP header encoding, which shows incorrectly on standard compliant browsers, such as Firefox 1.5. See

https://bugzilla.mozilla.org/show_bug.cgi?id=308187

For more details.

Please fix this issue.

Thanks,
Ron.
Comment 24 Shoshannah Forbes 2005-12-09 15:45:06 PST
*** Bug 318102 has been marked as a duplicate of this bug. ***
Comment 25 Ryan Flint [:rflint] (ping via IRC for reviews) 2005-12-10 09:00:45 PST
*** Bug 319781 has been marked as a duplicate of this bug. ***
Comment 26 Shoshannah Forbes 2005-12-21 04:36:47 PST
*** Bug 320269 has been marked as a duplicate of this bug. ***
Comment 27 Eyal Rozenberg 2005-12-24 03:56:00 PST
I've also e-mailed Haaretz, to no effect. I guess devoting 5 minutes to basic compliance with web standards is too much of a hassle for them. On the other hand they might just be employing overworked, underpaid, non-unionized monkeys (like their reporters and columnists).
Comment 28 Dan A 2005-12-26 09:53:59 PST
Apparently this has been hashed over in their forums.
What they say there is that the webmasters are aware of the issue, but need to plan ahead for such a complicated change... Who knows what may happen if their server headers match their page headers...
They promise it's on their TODO list, but not in the near future...
Keep nagging them :-)
Comment 29 Uri Bernstein (Google) 2006-06-26 10:29:14 PDT
*** Bug 342747 has been marked as a duplicate of this bug. ***
Comment 30 Uri Bernstein (Google) 2008-01-19 09:22:52 PST
*** Bug 412942 has been marked as a duplicate of this bug. ***
Comment 31 Tomer Cohen :tomer 2008-01-19 10:18:31 PST
For the record, last week I was in contact with one of Haaretz's content people, and he promised to pass the request on, but we are still far from seeing this issue resolved.

Comment 32 Simon Montagu :smontagu 2008-05-07 00:03:01 PDT
*** Bug 431066 has been marked as a duplicate of this bug. ***
Comment 33 playmobil 2009-06-01 07:01:20 PDT
Haaretz appear to have fixed this on their end, I think this issue can be closed.
Comment 34 Tomer Cohen :tomer 2009-06-01 07:08:29 PDT
(In reply to comment #33)
> Haaretz appear to have fixed this on their end, I think this issue can be
> closed.

Hi Jermy! Nice to see you around!

Well, I'm not sure they actually fixed it on every page. For example, I found that the comments popup is still reversed on captain.co.il (and probably on other sites as well, but on captain.co.il more people mix Hebrew and English).

Here is a fresh example - http://themarker.captain.co.il/captain/objects/ResponseDetails.jhtml?resNo=4877847&itemno=1089232&cont=2
Comment 35 playmobil 2009-06-01 09:00:14 PDT
Hi Tomer,
Do you have any other examples of places this bug is still showing up apart from the comment pages?
Comment 36 playmobil 2009-06-02 23:13:26 PDT
OK, captain.co.il appears to be fixed now too.

Do you have any further examples?
Comment 37 Tomer Cohen :tomer 2009-06-09 05:07:15 PDT
I've asked on Twitter, then on my blog and in some other places. So far no one has been able to reproduce the problem, so it is now safe to close this bug. :)

My blog post about this issue (Hebrew; I can translate the important information in case you would like me to do it) - 
http://tomercohen.com/2009/06/06/%d7%97%d7%93%d7%a9%d7%95%d7%aa-%d7%aa%d7%90%d7%99%d7%9e%d7%95%d7%aa-%d7%90%d7%aa%d7%a8%d7%99%d7%9d/


Post by Effie Nadiv, the author of the extension that works around this issue - http://www.effie.co.il/?p=11 (English)



Thanks go to everyone involved in solving this issue!
Comment 38 Simon Montagu :smontagu 2009-06-30 10:32:34 PDT
I still see this on http://www.haaretz.com/captain/pages/indexCaptain.jhtml
Comment 39 Tomer Cohen :tomer 2009-06-30 11:14:07 PDT
(In reply to comment #38)
> I still see this on http://www.haaretz.com/captain/pages/indexCaptain.jhtml

Indeed, but I wonder where you followed the link to this page from. captain.co.il redirects to http://themarker.captain.co.il/, which looks fine to me (XP and not PX).
Comment 40 Simon Montagu :smontagu 2009-06-30 11:18:46 PDT
(In reply to comment #39)
> Indeed, but I am wondering from where you linked to this page.

I got it as the first Google result for "קפטן אינטרנט"
Comment 41 Tomer Cohen :tomer 2009-06-30 13:05:39 PDT
I did the same search and found the following article with the same problem. (note it is on the main Haaretz domain!)

http://www.haaretz.com/captain/pages/LiArtCaptain.jhtml?contrassID=11&subContrassID=1
Comment 42 playmobil 2009-07-05 04:56:32 PDT
Looks like it's fixed
Comment 43 Tomer Cohen :tomer 2009-10-07 12:26:36 PDT
(In reply to comment #42)
> Looks like it's fixed

I found yet another domain where the same issue still occurs. Do you have a contact person you can ask to reconfigure this server as well?

https://secure.haaretz.co.il/hasite/pages/tags/index.jhtml?tag=%E3%F4%E3%F4%F0%E9%ED
Comment 44 Tomer Cohen :tomer 2009-12-01 14:35:11 PST
Seems that there are more encoding issues from old articles.

http://news.haaretz.co.il/captain/pages/ShArtCaptain.jhtml?contrassID=11&subContrassID=0&itemNo=803671
Comment 45 Tomer Cohen :tomer 2010-06-19 02:58:44 PDT
The secure.haaretz.co.il domain is still suffering from this issue...
Comment 46 Asher Levy 2011-09-11 12:35:41 PDT
I thought they fixed this issue, but lately (~6/2011) there has been a regression.
It still affects all the pages created by "HaHeadlines.jhtml", "HaSec.jhtml" & "ShArt.jhtml", but not "spages/*.html". Seems to be a problem whenever a "*.jhtml" is called. Examples:
http://www.haaretz.co.il/hasite/pages/HaHeadlines.jhtml?pageNumber=2&source=Lynx (Page 1 was .html)
http://www.haaretz.co.il/hasite/pages/HaSec.jhtml?pageNumber=2&contrassID=1&subContrassID=3 (Also pages 2 and above)
http://www.haaretz.co.il/hasite/pages/ShArt.jhtml?itemNo=1239199&contrassID=1&subContrassID=3&sbSubContrassID=0 (sometimes they link to articles using "ShArt", and not the usual "/spages/")
I have emailed "customer-htz@haaretz.co.il" & "online@haaretz.co.il", but received no reply...
Has anyone ever got them to acknowledge that they see this as a site bug?
Very annoying!
Comment 47 Asher Levy 2011-09-12 11:42:30 PDT
Ok,
Haaretz just launched a completely new site (not even a revamp, it's all completely new, and quite decent-looking). All pages are now encoded in UTF-8, and old links are redirected to the new site.
So, less than 24 hours after I joined the collective moaning about their lack of interest and non-responsiveness, I think this bug can (finally) be closed.
Phew!
Comment 48 Tomer Cohen :tomer 2011-09-12 11:58:37 PDT
(In reply to Asher Levy from comment #47)
> So, less than 24 hours after I joined the collective moaning about their
> lack of interest and non-responsiveness, I think this bug can (finally) be
> closed.
> Phew!

Thanks for the information. I've checked the previous reports, and everything seems to work well, without encoding problems. Closing now; we'll reopen if the issue reappears.


(secure.haaretz.co.il has some minor reversed-text issues, but since they redirect almost everything from there back to www.haaretz.co.il, I don't count it)
