Closed Bug 258478 Opened 17 years ago Closed 16 years ago
Fix Update to use UTF-8 (Wrong Meta Charset Information)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20040908 Firefox/0.10
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20040908 Firefox/0.10
Wrong Meta Charset Information and illegal Characters on "http://update.mozilla.org/extensions/showlist.php"
Reproducible: Always
Steps to Reproduce:
1.
2.
3.
"Sorry, I am unable to validate this document because on lines 58, 122 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication."
57: </DIV><DIV id="content">
58: #### encoding problem on this line, not shown ####
59: <SPAN class="listtitle">Firefox Extensions » All </SPAN><br>Extensions 1 - 10 of 53
Missing line is something like <DIV id="listnav"><DIV class="pagenum" style="margin-right: 95px;">Page 0 of 0</DIV>
121:
122: #### encoding problem on this line, not shown ####
123: Jump to: <A HREF="?pageid="></A>
Missing line is something like
Status: UNCONFIRMED → NEW
Ever confirmed: true
That would be the same lines that Mozilla prints as ?. It's a non-HTML-encoded bullet (which I've already fixed locally, in the new look). Though aren't there author names that are exhibiting the same problems under UTF-8?
> #2
> Though aren't there author names that are exhibiting the same problems under UTF-8?
I don't understand your question. The problem is that the page is sent with an HTTP header saying the charset is UTF-8, while the page itself is obviously encoded as iso-8859-1, as its meta hack declares: http://validator.w3.org/check?uri=http%3A%2F%2Fupdate.mozilla.org%2Fextensions%2Fshowlist.php%3Fcategory%3DAll&charset=iso-8859-1+%28Western+Europe%29&ss=1&verbose=1
Yes, but that's exactly why the meta tag is allowed. It sounds like Wolf already has this fixed locally.
The page really should not provide conflicting charset encoding info, and should probably stick to 7-bit ASCII with entities, for safety.
(In reply to comment #5)
> The page really should not provide conflicting charset encoding info,
This means the meta hack must be in accordance with the HTTP header's charset info.
> and should probably stick to 7-bit ASCII with entities, for safety.
Unicode characters in UTF-8 are ok. No need at all for char entities.
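As a rough illustration of the two options discussed above, here is a minimal Python sketch. It is not taken from the Update codebase; the sample string is just an example of non-ASCII text.

```python
# Illustrative only: the same text as 7-bit ASCII with numeric
# character references versus raw UTF-8 bytes.
name = "Víctor Fernández"

# Option 1: stay within 7-bit ASCII; non-ASCII characters become
# numeric character references such as &#237;.
ascii_with_entities = name.encode("ascii", errors="xmlcharrefreplace")

# Option 2: send UTF-8 bytes (safe only if the declared charset is UTF-8).
utf8_bytes = name.encode("utf-8")

print(ascii_with_entities)  # b'V&#237;ctor Fern&#225;ndez'
print(utf8_bytes)           # b'V\xc3\xadctor Fern\xc3\xa1ndez'
```

Either form displays correctly as long as the declared charset matches the bytes actually sent; the entity form merely survives a wrong declaration.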
*** Bug 261583 has been marked as a duplicate of this bug. ***
Even if I choose Western encoding, Firefox automagically changes back to (obviously wrong) UTF-8 on every subsequent page of the extension list, because that's what is given in the HTTP response header. Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20040913 Firefox/0.10.1
Sounds like a server config issue. Now that I think about it, I bet it's because the default charset for the new server is UTF-8 and not ISO-8859-1, which needs to be changed.
*** Bug 263969 has been marked as a duplicate of this bug. ***
Severity: trivial → minor
OS: Linux → All
Hardware: PC → All
Whiteboard: [Server Config Blocking] → [Server Config]
This problem can also be found on the page https://update.mozilla.org/themes/moreinfo.php?id=213 Author name shown is V�ctor Fern�ndez. Changing character encoding to iso-8859-1, we can see the correct name : Víctor Fernández.
Sorry, two "?" should have been shown in the place of "�" in the last comment.
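The rendering described in the two comments above is what one would expect when ISO-8859-1 bytes are decoded as UTF-8. A minimal Python sketch, assuming the author name is stored and sent as ISO-8859-1 bytes (the thread suggests this but does not confirm it):

```python
# Assumption: the author name reaches the browser as ISO-8859-1 bytes.
raw = "Víctor Fernández".encode("iso-8859-1")

# A client that trusts a "charset=utf-8" HTTP header decodes the bytes as
# UTF-8; the invalid bytes are replaced with U+FFFD (shown as "?" or "�").
as_utf8 = raw.decode("utf-8", errors="replace")

# Manually switching the encoding to ISO-8859-1 recovers the name.
as_latin1 = raw.decode("iso-8859-1")

print(as_utf8)    # V�ctor Fern�ndez
print(as_latin1)  # Víctor Fernández
```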
*** Bug 264681 has been marked as a duplicate of this bug. ***
Moving to Server Operations, where it should've been a while ago. Oops. The default character set for update.mozilla.org on iguana needs to be changed from UTF-8 to ISO-8859-1. This bug appeared when we moved to iguana from rodan, which had ISO-8859-1 as default.
Assignee: psychoticwolf → myk
Severity: minor → normal
Component: Update → Server Operations
QA Contact: mozilla.update → justdave
Whiteboard: [Server Config]
(In reply to comment #14)
> The default character set for update.mozilla.org on iguana needs to be changed
> from UTF-8 to ISO-8859-1. This bug appeared when we moved to iguana from rodan
> which had ISO-8859-1 as default.
Aren't there a bunch of folks working on making update.mozilla.org localizable right now? Fix the app to use UTF-8 instead of ISO-8859-1. Trust me. You're shooting the localization efforts in the foot if you don't.
Component: Server Operations → Update
Assignee: myk → psychoticwolf
QA Contact: justdave → mozilla.update
(In reply to comment #15)
> Fix the app to use UTF-8 instead of ISO-8859-1. Trust me.
And this especially applies to the data in the database (such as author names), which isn't going to change when the localization stuff is pulled in.
This fixes the featured update box and the ?'s on showlist.php.
Comment on attachment 163496: Fixes obvious non-UTF-8 issues.
Patch checked in to branch, applied to site.
(In reply to comment #17) Thanks a lot. Now it looks much better. But UMO pages are still in a very very bad condition. The silly wrong meta hack is still around. The validator says: The character encoding specified in the HTTP header (utf-8) is different from the value in the <meta> element (iso-8859-1). I will use the value from the HTTP header (utf-8) for this validation. http://validator.w3.org/check?verbose=1&uri=http%3A//update.mozilla.org/extensions/showlist.php
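The mismatch the validator reports can be checked mechanically. The following is a minimal Python sketch, not the validator's actual logic: the function names are hypothetical, the regular expressions are deliberately simple, and charset aliases (for example `latin1` versus `iso-8859-1`) are not normalized.

```python
import re
from typing import Optional

# Matches charset=... in a header value or inside a <meta> tag.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)
_META_RE = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    match = _CHARSET_RE.search(content_type)
    return match.group(1).lower() if match else None


def meta_charset(html: str) -> Optional[str]:
    """Return the charset declared by the first <meta> tag that names one."""
    for tag in _META_RE.findall(html):
        match = _CHARSET_RE.search(tag)
        if match:
            return match.group(1).lower()
    return None


def charset_conflict(content_type: str, html: str) -> bool:
    """True when both declarations exist and name different charsets."""
    from_header = header_charset(content_type)
    from_meta = meta_charset(html)
    return from_header is not None and from_meta is not None and from_header != from_meta


page = '<html><head><META http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head></html>'
print(charset_conflict("text/html; charset=UTF-8", page))  # True
```

A page with no `<meta>` charset at all never conflicts under this check, which is the outcome of simply removing the element.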
I'm aware of that, that patch didn't even attempt to address that issue. :-)
Summary: Wrong Meta Charset Information and illegal Characters on "http://update.mozilla.org/extensions/showlist.php" → Fix Update to use UTF-8 (Wrong Meta Charset Information)
Bulk Moving Web Site bugs to new component. (Filter: massumowebsitespam)
Component: Update → Web Site
Product: mozilla.org → Update
Version: other → unspecified
Just remove the META element altogether. It's not needed.
Still a problem with update-beta.
(In reply to comment #23) > Still a problem with update-beta. This bug isn't marked as being fixed either. :-)
Version: unspecified → 0.9
The UTF-8/ISO-8859-1 meta conflict should be solved on http://update-beta.mozilla.org. Though, incoming data isn't guaranteed to be UTF-8 in the DB. That will likely not be solved for 1.0.. Changing Target Milestone.
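One common way to cope with stored data of uncertain encoding is to try UTF-8 first and fall back to ISO-8859-1. This is only a sketch of that heuristic, not what Update does; the function name `to_utf8_text` and the fallback choice are assumptions made for illustration.

```python
def to_utf8_text(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode bytes as UTF-8, falling back to a legacy charset.

    Heuristic only: legacy bytes that happen to form valid UTF-8
    are taken as UTF-8.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# Both stored forms of the same name come out identical.
print(to_utf8_text("Víctor".encode("utf-8")))       # Víctor
print(to_utf8_text("Víctor".encode("iso-8859-1")))  # Víctor
```

Strict UTF-8 decoding rejects most ISO-8859-1 text that contains accented characters, which is why the fallback triggers reliably for names like the one above.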
Assignee: psychoticwolf → psychoticwolf
Target Milestone: 1.0 → 1.1
I would like to point out comment 22 once again, and I was wondering why the data can't be guaranteed to be UTF-8. If people are using a form to put data in the database, and that form is on a UTF-8 encoded page with UTF-8 as the value of the charset parameter, there shouldn't be a single problem.
(In reply to comment #22) > Just remove the META element altogether. It's not needed. I think w3 recommends having it.
I don't think so -- unless proved otherwise. W3 (WWW) works perfectly well without <meta>, relying on the Content-Type HTTP response header... or one of the other headers. And just in case you meant W3C... Well, that's a public organization, so either you'll find their specs recommending <meta>, or there are no such recommendations.
All pages now validate.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
This has nothing to do with validation. And, by the way, see also comment 22.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
META tags are used for supplemental information or to override server settings. Since we're controlling charset at the server level, we don't need a meta tag.
(In reply to comment #31)
> META tags are used for supplemental information or to override server settings.
Actually, it's the other way around. According to the RFCs, the Content-Type header the server sends overrides the META tag.
Unfortunately, some browsers are not RFC-compliant in this case, Dave.
I am not aware of a single widely used browser (that has something useful to do on UMO) that does not support this. Could you perhaps list them? And point to the test case you used?
Why is this assigned to me? What would you like me to do?
(In reply to comment #35)
> Why is this assigned to me? What would you like me to do?
As far as I can tell, in all pages referenced in the discussion thread the META and HTTP header content type information are equal to UTF-8. So, unless someone can throw a page where non-ASCII characters are badly displayed (I have unfortunately not seen any page showing non-English characters), this bug should be closed as fixed.
(In reply to comment #36)
> As far as I can tell, in all pages referenced in the discussion thread the META
> and HTTP header content type information are equal to UTF-8. So, unless someone
> can throw a page where non-ASCII characters are badly displayed (I have
> unfortunately not seen any page showing non-English characters), this bug should
> be closed as fixed.
No it should not. See for instance comment 31. This bug is fixed once the META element is gone.
Assignee: justdave → Bugzilla-alanjstrBugs
Patch with R+ on bug 279004
Assignee: Bugzilla-alanjstrBugs → bug
Depends on: 279004
The META element is gone in CVS. Marking FIXED per comment 37. The fix can be found in bug 279004.
Status: NEW → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
And the backend data is ensured to be UTF-8? Really? Who wrote the code for that?
Wolf, what do you mean by backend data? As long as the charset declared for the input pages is UTF-8 there will not be a problem. There is no additional action required.
Product: addons.mozilla.org → addons.mozilla.org Graveyard