Closed
Bug 277243
Opened 20 years ago
Closed 16 years ago
XML entities in RSS feeds should be replaced for the subject
Categories
(MailNews Core :: Feed Reader, defect)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: zwnj, Unassigned)
Attachments
(1 file)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0

Thunderbird should replace XML entities with their Unicode equivalents in the subject field of feed items, which is plain text in emails.

Reproducible: Always

Actual Results:
Subject of the feed is """Quest For &quot;Unbreakable Java&quot; Unites ABAP &amp; Java"""

Expected Results:
That should be """Quest For "Unbreakable Java" Unites ABAP & Java"""
Comment 1•20 years ago (Reporter)
Comment 2•20 years ago
how about using a recent release like 1.0 first....thanks
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
Comment 3•20 years ago (Reporter)
sorry for the noise. :)
Comment 4•17 years ago
(Not sure why this was INVALID; sounds like mscott was alluding to WORKSFORME in comment 2?) I'm getting reports of this with Thunderbird 2.0.0.12 (a rather recent release!). To wit, this title:

<title type="html"><![CDATA[this&#8217;ll be fun]]></title>

in an Atom 0.3 feed from http://shaver.off.net/diary/feed/atom/ is not having the character entity replaced. By my reading of http://atomenabled.org/developers/syndication/atom-format-spec.php#rfc.section.3.1.1.2 it should be. Firefox 2.0.0.12 does the right thing here, FWIW, as does trunk. http://weblog.philringnalda.com/2005/12/18/who-knows-a-title-from-a-hole-in-the-ground is probably indicating this in its "html-cdata" test, found at http://atomtests.philringnalda.com/tests/item/title/html-cdata.atom . I looked for a bug that would indicate that this was fixed on the trunk, but this was the only thing I found that was close, and it doesn't sound like it expected it to be fixed after TB2.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Updated•17 years ago
Assignee: mscott → nobody
Status: UNCONFIRMED → NEW
Ever confirmed: true
QA Contact: rss
Comment 5•17 years ago
According to the Atom RFC you cited I would think CDATA isn't even allowed there - "the content of the Text construct MUST NOT contain child elements", which the CDATA node would be, no? But even if it were, the characters within the CDATA are to be treated as is. That Firefox is displaying the resolved NCR sounds like a (minor) bug, maybe just a coincidence.
Comment 6•17 years ago
The characters within the CDATA should be treated as-is by the XML parsing of the feed, but type="html" means that the resulting data (which will contain the unreplaced entity) should then be interpreted as HTML, which means replacing the entities. It would mean that ' and would be recognized, which they wouldn't be if the entities were replaced by the XML content sink. I didn't know that CDATA created a child node; I find that surprising, I admit! Are you sure?
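The two-stage decoding described in comment 6 can be sketched in Python. This is only an illustration, not Thunderbird's feed code: the function name `atom_title_text` is hypothetical, the sample entry assumes the Atom 1.0 namespace rather than the Atom 0.3 feed from comment 4, and the NCR `&#8217;` in the sample is an assumption about what the feed carried. The XML parser leaves the CDATA characters untouched, and entity replacement happens only afterwards, and only when `type="html"`.

```python
import html
import xml.etree.ElementTree as ET

# Assumption: Atom 1.0 namespace; an Atom 0.3 feed would use a different one.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def atom_title_text(entry_xml: str) -> str:
    """Return the entry title, decoding HTML entities only for type="html"."""
    entry = ET.fromstring(entry_xml)
    title = entry.find(ATOM_NS + "title")
    # Stage 1: the XML parser has already resolved XML-level escapes.
    # Characters inside CDATA come through unchanged.
    raw = title.text or ""
    # Stage 2: type="html" means the result is itself HTML, so named
    # entities and numeric character references are replaced now.
    if title.get("type") == "html":
        return html.unescape(raw)
    return raw


sample = (
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<title type="html"><![CDATA[this&#8217;ll be fun]]></title>'
    "</entry>"
)
print(atom_title_text(sample))
```

The sketch does not strip HTML tags, which a plain-text subject line would also need.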
Comment 7•17 years ago
No, I was mistaken about that - it's not a child node.
Comment 8•17 years ago
You were actually looking for bug 320818 (to file a followup to it, really) - this bug was about the unquestionably broken way that Slashdot used to double-escape HTML in RSS 1.0 titles (the feed would have actually had "&amp;" in it, which requires a difficult decision to say "that ain't right, fix your feed" for RSS 2.0, but is an easy INVALID for RSS 1.0), while your problem is that bug 320818 fixed the test by replacing exactly and only &lt;, &gt;, and &amp;, rather than fixing the issue by replacing all HTML named entities and all NCRs, in Atom with type="html". Dunno if it's worth starting over in a new bug, though, since once I find the "switch to using the toolkit feed parser" bug which is currently hiding from me, I'd just mark the new bug dependent and call it a day :)
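The gap comment 8 describes, between replacing three escapes and replacing every named entity and NCR, can be shown with a small Python comparison. The helper `unescape_three_only` is hypothetical and only mimics the narrow behaviour attributed to the bug 320818 fix; it is not the actual Thunderbird code, and the sample title is made up for illustration.

```python
import html


def unescape_three_only(s: str) -> str:
    """Replace only &lt;, &gt; and &amp; (hypothetical narrow fix)."""
    # &amp; is handled last so that "&amp;lt;" decodes to "&lt;", not "<".
    return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")


title = "this&#8217;ll be &lt;fun&gt; &amp; caf&eacute;"

# Narrow replacement leaves the NCR and the named entity in place.
print(unescape_three_only(title))
# Full HTML unescaping resolves all of them.
print(html.unescape(title))
```

With the narrow version, `&#8217;` and `&eacute;` survive into the subject line, which matches the symptom reported in comment 4.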
Comment 9•16 years ago
I have a similar issue with Google News - Sci/Tech http://news.google.com/news?ned=us&topic=t&output=rss where the apostrophes in the title are not parsed:

Yahoo&#39;s Fire Eagle Updates Location Across Sites - PC Magazine

should be

Yahoo's Fire Eagle Updates Location Across Sites - PC Magazine

I've been looking at all the titles back to July and the ones with this issue are all like:

<title>Yahoo&amp;#39;s Fire Eagle Updates Location Across Sites - PC Magazine</title>

Note1: The only titles with this issue are those with &#39;
Note2: Firefox (3.0.1) has the very same issue with Live Bookmarks so this might be a problem with the feed.
Comment 10•16 years ago
Is a problem with the feed: as feedvalidator.org will tell you, "clients will behave unpredictably in the presence of such markup: some will interpret it as HTML, others will strip it, and still others will display the markup itself." (And double-escaping apostrophes in XML *element* content is just flat out insane.) Back to INVALID: I don't know what was up with comment 2 and comment 3, but shaver's comment 4 about Atom with type="html" is the only valid thing here, and that's totally separate from either RSS 1.0 or RSS 2.0. I filed bug 450541 for it.
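A short Python sketch of why a double-escaped RSS title shows a literal reference. It assumes the feed source carried `&amp;#39;` inside the title element, which is what "double-escaping" in comment 10 suggests; the function name `rss_title_as_parsed` is hypothetical. A conforming XML parser removes exactly one level of escaping, so a client that treats the RSS title as plain text ends up displaying `&#39;`.

```python
import html
import xml.etree.ElementTree as ET


def rss_title_as_parsed(item_xml: str) -> str:
    """Return the title text exactly as the XML parser delivers it."""
    return ET.fromstring(item_xml).find("title").text or ""


# Assumed feed source: the apostrophe is escaped twice.
item = (
    "<item><title>Yahoo&amp;#39;s Fire Eagle Updates Location "
    "Across Sites - PC Magazine</title></item>"
)

parsed = rss_title_as_parsed(item)
print(parsed)                 # one level removed: still contains &#39;
print(html.unescape(parsed))  # a client that guesses "HTML" gets the apostrophe
```

Whether to apply the second step is the unpredictable part: the client has to guess that the plain-text title is really HTML.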
Status: NEW → RESOLVED
Closed: 20 years ago → 16 years ago
Resolution: --- → INVALID
Summary: XML entities should replaced for the subject → XML entities in RSS feeds should be replaced for the subject