Closed
Bug 282908
Opened 20 years ago
Closed 10 years ago
character encoding of linked RSS content always falls back to UTF-8
Categories
(MailNews Core :: Feed Reader, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: mkmelin, Unassigned)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

In a feed where RSS article content is linked, the linked content seems to always show up UTF-8 encoded. You can test using http://www.ficora.fi/suomi/tietoturva/rss/varoitukset.xml

An example link from this feed is http://www.ficora.fi/suomi/tietoturva/varoitukset/varoitus-2005-17.htm. This shows up correctly in Firefox (which recognizes it as ISO-8859-1). When the same article is displayed inside Thunderbird, the encoding is wrong. Tested with a recent nightly as well as TB 1.0.

Bug 272875 is similar but seems to deal only with inline feed content; this one is about linked content.

Reproducible: Always

Steps to Reproduce:
1. Add feed http://www.ficora.fi/suomi/tietoturva/rss/varoitukset.xml
2. Click on an article
3. Open the link in Firefox and compare

Actual Results:
The article is displayed in UTF-8.

Expected Results:
Should be displayed in ISO-8859-1 (like in Firefox).
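To illustrate the symptom described above, here is a minimal Python sketch of what happens when bytes written as ISO-8859-1 are decoded as UTF-8. The sample word is a hypothetical stand-in for the Finnish text on the example page, and this is not Thunderbird's actual rendering code.

```python
# Hypothetical sample text standing in for the Finnish article content.
text = "päivitä"

# The bytes as a server using ISO-8859-1 would send them.
raw = text.encode("iso-8859-1")

# Decoded with the right charset, the text round-trips.
as_latin1 = raw.decode("iso-8859-1")

# Decoded as UTF-8, each lone 0xE4 byte is invalid and becomes U+FFFD,
# which is the kind of garbling the report describes.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_latin1)
print(as_utf8)
```

Plain ASCII characters survive either way, which is why only the accented letters look wrong.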
Comment 1•20 years ago
(In reply to comment #0)
> In a feed where RSS article content is linked, the linked content seem to
> always show up UTF-8 encoded.

By "linked content" I'm assuming you mean that, in the Subscription edit field, the checkbox "Show the article summary..." is turned off -- such that the page content is brought in when the item is displayed.

> An example link from this is
> http://www.ficora.fi/suomi/tietoturva/varoitukset/varoitus-2005-17.htm. This
> shows up correctly in firefox (which recognize it as ISO-8859-1). When the
> same article is displayed inside thunderbird the encoding is wrong.

Yes, I see this.

The UTF-8 encoding being used by TB is implemented to apply to the "message" item that is stored in the feed. Those "messages" are stored as UTF-8 as a matter of course. If the subject info originally was encoded in ISO-8859-1, it gets converted to UTF-8 for storage in the message item's Subject header.

The "linked content" -- that is, the actual web page -- is brought in using an <iframe>. The message-item's encoding does appear to be applied to the <iframe> -- if the served page does not specify any character set on its own, as the example site does not.

Furthermore, changing the encoding via the menu does not change the display of the <iframe>'s content.

Seen with TB 1.0 and 1.0+20050218.

Note that some text of the example RSS feed -- e.g. the links in the left-hand sidebar -- is displayed correctly. This is because the HTML source for that text uses entities for the diacriticals.

Compare to (for example) this feed: http://www.ch1webdesign.com/rss/ which serves the pages as ISO-8859-1. The items in Thunderbird are still encoded in UTF-8, but the "linked content" displays correctly because those pages are served correctly.
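The fallback behavior described in this comment can be sketched roughly as follows. This is an illustrative Python approximation, not Thunderbird's or Gecko's code: the function name `resolve_charset`, its parameters, and the regular expressions are assumptions made for the example, and real browsers apply more steps (such as byte-order-mark checks) than shown here.

```python
import html
import re


def resolve_charset(content_type, body, fallback="utf-8"):
    """Pick a charset: HTTP header first, then a <meta> tag, then the fallback.

    `fallback` plays the role of the enclosing message item's encoding.
    """
    if content_type:
        m = re.search(r'charset=\s*"?([\w.:-]+)', content_type, re.IGNORECASE)
        if m:
            return m.group(1).lower()
    # Only scan the start of the document for a <meta> declaration.
    m = re.search(rb"<meta[^>]+charset=[\"']?\s*([\w.:-]+)", body[:1024], re.IGNORECASE)
    if m:
        return m.group(1).decode("ascii").lower()
    return fallback


# Page served with an explicit charset: displays correctly.
served = resolve_charset("text/html; charset=ISO-8859-1", b"<p>p\xe4ivit\xe4</p>")

# Page with no declaration at all: the UTF-8 fallback is applied.
undeclared = resolve_charset("text/html", b"<p>p\xe4ivit\xe4</p>")

# Entities are pure ASCII, so they decode the same under either charset.
entity_text = html.unescape(b"p&auml;ivit&auml;".decode(undeclared))

print(served, undeclared, entity_text)
```

The last case mirrors the sidebar links mentioned above: text written with entities is unaffected by the charset that ends up being chosen.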
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hardware: PC → All
Version: unspecified → Trunk
Updated•20 years ago
Summary: character encoding of linked RSS content is always UTF-8 → character encoding of linked RSS content always falls back to UTF-8
Comment 2•19 years ago
> The "linked content" -- that is, the actual web page -- is brought in using an
> <iframe>. The message-item's encoding does appear to be applied to the <iframe>
> -- if the served page does not specify any character set on its own, as the
> example site does not.

I can confirm that.

> Furthermore, changing the encoding via the menu does not change the display of
> the <iframe>'s content.
>
> Seen with TB 1.0 and 1.0+20050218.

Still present in TB 1.5.

> The items in Thunderbird are still
> encoded in UTF-8, but the "linked content" displays correctly because those
> pages are served correctly.

I can confirm that, too. But I think it should be possible to change the encoding via the menu.
Comment 3•18 years ago
*** Bug 357599 has been marked as a duplicate of this bug. ***
Comment 4•18 years ago
It certainly seems like a bug to me, because it ignores the user character encoding preferences. If the iframe is presenting problems then perhaps this is yet more evidence that displaying all HTML content within an iframe is a design decision that should be revisited.
Updated•18 years ago
QA Contact: rss
Comment 5•17 years ago
I can confirm the bug in 2.0.0.9+20071031 and in the latest nightly build, 3.0a1pre (2008020403).
| Reporter | ||
Comment 6•17 years ago
Sample link doesn't show this anymore due to site redesign.
Assignee: mscott → nobody
Comment 7•17 years ago
This feed can be used as a reference as well: http://aristo4bgu.bgu.ac.il/weboard/rss1.aspx?s=1jsm91r31qBV5TgxEwf1/w==
Magnus, is this still a problem? The iframe has not been used to load the web page for a very long time. If the correct encoding is served, it should work fine; otherwise, in the few outlier cases, the default from the user's settings applies.
| Reporter | ||
Comment 9•10 years ago
Haven't seen it, and seems we have no testcase anymore, so WFM.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME