Closed Bug 337451 Opened 18 years ago Closed 18 years ago

better charset support (feed preview)

Categories

(Firefox Graveyard :: RSS Discovery and Preview, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 2 beta1

People

(Reporter: ria.klaassen, Assigned: sayrer)

References

Details

(Keywords: fixed1.8.1, intl, Whiteboard: [swag: 0d])

Attachments

(4 files, 3 obsolete files)

When I look at this feed: http://planet.mozilla.org/rss20.xml , mr Smedberg seems to say weird things, so I want to change the encoding but the page doesn't react.
When I run in in the old pretty print in Firefox 1.5.3 however, I see the right text: It’s not Bias, It’s Discrimination.
Attached image minimum font-size set to 14 (obsolete) —
This becomes really ugly when you set a minimum font-size of 14 in Options > Content > Fonts & Colors > Advanced.
(In reply to comment #2)
> Created an attachment (id=221684) [edit]
> minimum font-size set to 14
> 
> This becomes really ugly when you set a minimum font-size of 14 in Options >
> Content > Fonts & Colors > Advanced.
>
Sorry, posted in the wrong bug :( 

http://benjamin.smedbergs.us/blog/feed/

wfm. planet has encoding problems. Not sure changing the encoding will help here, since the feed preview will always be utf-8.
Attachment #221684 - Attachment is obsolete: true
Attached image screenshot
This piece of French text looks alien. A big part of the words look like curses. Sounds very weird to me if this can't be solved.
Well, there's "can't be solved" and then there's "that's not the way to solve it."

Filed bug 339409 against planet.m.o, which is sending UTF-8 as text/xml with no charset param and no encoding in the XML decl, probably the very worst possible way to send XML.

Rob, am I guessing right from our mangled characters, that we're using ISO-8859-1 as a default in the absence of a charset? That doesn't seem like a very happy choice, if we can avoid it.
(In reply to comment #6)
> 
> Rob, am I guessing right from our mangled characters, that we're using
> ISO-8859-1 as a default in the absence of a charset? That doesn't seem like a
> very happy choice, if we can avoid it.

I suspect we are sniffing it, as there are cp-1252 characters near the beginning.
http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fplanet.mozilla.org%2Frss20.xml

I will try that feed with those chars edited out.
Attached file sample xml
OK, planet is totally busted, but we can do better. 
Assignee: nobody → sayrer
Summary: Can't change the encoding of the feed → better UTF-1252 support
Summary: better UTF-1252 support → better UTF-1252 support (feed preview)
> Rob, am I guessing right from our mangled characters, that we're using
> ISO-8859-1 as a default in the absence of a charset?

Phil wins. That's exactly what was happening, if I understand nsParser. This patch  will silently drop the 1252 characters (there were only a few), just like our view-source view does (fair enough, they are control characters in utf-8).
Attachment #223568 - Flags: superreview?(peterv)
Attachment #223568 - Flags: review?
Attachment #223568 - Flags: review? → review?(dbaron)
Flags: blocking-firefox2?
Summary: better UTF-1252 support (feed preview) → better charset support (feed preview)
Attachment #223568 - Flags: review?(dbaron) → review?(bugmail)
Blocks: 339599
Comment on attachment 223568 [details] [diff] [review]
Use TryDocumentCharset the way nsXMLDocument does

>Index: parser/xml/src/nsSAXXMLReader.cpp
> nsSAXXMLReader::OnStartRequest(nsIRequest *aRequest, nsISupports *aContext)
> {
>   NS_ENSURE_TRUE(mIsAsyncParse, NS_ERROR_FAILURE);
>-  nsresult rv = InitParser(mParserObserver);
>+  nsresult rv;
>+  nsCOMPtr<nsIChannel> channel = do_QueryInterface(aRequest);
>+  if (channel)
>+    rv = InitParser(mParserObserver, channel);
>+  else
>+    rv = InitParser(mParserObserver, nsnull);

This looks a lot like just doing |rv = InitParser(mParserObserver, channel);| always :)


>+nsSAXXMLReader::TryChannelCharset(nsIChannel *aChannel,
>+                                  PRInt32& aCharsetSource,
>+                                  nsACString& aCharset)
>+{
>+  if (kCharsetFromChannel <= aCharsetSource) {

Please switch these two around, that makes it easier to read IMHO.

With both those fixed, r=me
Attachment #223568 - Flags: review?(bugmail) → review+
Flags: blocking-firefox2? → blocking-firefox2+
Attached patch address comments from sicking (obsolete) — Splinter Review
Attachment #223568 - Attachment is obsolete: true
Attachment #224378 - Flags: superreview?(peterv)
Attachment #224378 - Flags: approval-branch-1.8.1?(peterv)
Attachment #223568 - Flags: superreview?(peterv)
Comment on attachment 224378 [details] [diff] [review]
address comments from sicking

>+// from nsDocument.cpp

Too bad we can't share this :-(.

>+nsSAXXMLReader::TryChannelCharset(nsIChannel *aChannel,

>+        rv = calias->GetPreferred(charsetVal,
>+                                  preferred);

This can go on one line.
Attachment #224378 - Flags: superreview?(peterv)
Attachment #224378 - Flags: superreview+
Attachment #224378 - Flags: approval-branch-1.8.1?(peterv)
Attachment #224378 - Flags: approval-branch-1.8.1+
Checked in on trunk.

Checking in parser/xml/src/Makefile.in;
/cvsroot/mozilla/parser/xml/src/Makefile.in,v  <--  Makefile.in
new revision: 1.3; previous revision: 1.2
done
Checking in parser/xml/src/nsSAXXMLReader.h;
/cvsroot/mozilla/parser/xml/src/nsSAXXMLReader.h,v  <--  nsSAXXMLReader.h
new revision: 1.3; previous revision: 1.2
done
/cvsroot/mozilla/parser/xml/src/nsSAXXMLReader.cpp,v  <--  nsSAXXMLReader.cpp
new revision: 1.7; previous revision: 1.6
done
Whiteboard: [swag: 0d]
OS: Windows XP → All
Hardware: PC → All
Target Milestone: --- → Firefox 2 beta1
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Attachment #224378 - Attachment is obsolete: true
Checking in parser/xml/src/Makefile.in;
/cvsroot/mozilla/parser/xml/src/Makefile.in,v  <--  Makefile.in
new revision: 1.2.2.3; previous revision: 1.2.2.2
done
Checking in parser/xml/src/nsSAXXMLReader.cpp;
/cvsroot/mozilla/parser/xml/src/nsSAXXMLReader.cpp,v  <--  nsSAXXMLReader.cpp
new revision: 1.6.2.4; previous revision: 1.6.2.3
done
Checking in parser/xml/src/nsSAXXMLReader.h;
/cvsroot/mozilla/parser/xml/src/nsSAXXMLReader.h,v  <--  nsSAXXMLReader.h
new revision: 1.2.2.4; previous revision: 1.2.2.3
done
Keywords: fixed1.8.1
Product: Firefox → Firefox Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: