Closed Bug 476641 Opened 16 years ago Closed 11 years ago

parsing RSS of Planet Python outputs a lot of "(no subject)" articles

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

Thunderbird 27.0

People

(Reporter: bamanzi, Assigned: alta88)

References

Details

Attachments

(1 file)

rss1rdf.patch (11 years ago, alta88, 2.35 KB, patch; mkmelin: review+)

Ralph Young

Reporter

Description

•

16 years ago

User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.9.1b2) Gecko/20081201 Firefox/3.1b2 GTB5
Build Identifier: 2.0.0.19

When I subscribed to Planet Python's RSS feed (http://planet.python.org/rss10.xml), Thunderbird gave me a lot of articles without content.

Reproducible: Always

Steps to Reproduce:
0. Create a 'News & Blogs' account.
1. In account settings, check 'show summary of article rather than original whole page by default' (I'm using the zh-CN version and don't know the exact English string for this).
2. Subscribe to http://planet.python.org/rss10.xml
3. TB starts parsing that RSS, and the summary window shows some (not all) entries labeled '(no subject)', with no content for any of those entries (without step 2, TB can correctly connect to the original URL of that entry).

Phil Ringnalda (:philor)

Comment 1

•

16 years ago

Yep, that would be one of the reasons why RSS 1.0 is a bad idea: it's approximately RDF, but very nearly nobody other than Thunderbird parses it with an RDF parser, so people don't realize how horrible their RDF is.

The <items> rdf:Seq is like a table of contents for the feed: it lists the URIs for which the feed contains an <item> whose rdf:about is that same URI. However, that Planet feed looks like it's assembling the Seq from the original item ids/guids, and the rdf:about on items from the original item links, which can be quite different. So the feed claims it includes items about http://jessenoller.com/?p=461 when it actually has items about http://feedproxy.google.com/~r/Jessenollercom/~3/p02Yjhv_hmU, and it claims items about tag:blogger.com,1999:blog-496482.post-4627626801651497621 which are actually about http://holdenweb.blogspot.com/2009/02/on-take.html

By far the most likely way this will get fixed is by bug 450543 switching us over to the Toolkit feed parser, which dropped using an RDF parser because it gets terrible results from terrible RDF for no benefit.

(And as far as just being able to read Planet Python goes, even though they don't advertise it, they do have an RSS 2.0 feed which is likely to work out much better, at http://planet.python.org/rss20.xml)

Depends on: 450543
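
The mismatch described in comment 1 can be illustrated with a small Python sketch. This is only an illustration under assumptions: the feed below is a made-up RSS 1.0 document with placeholder `example` URLs, not the real Planet Python feed, and the helper names `seq_uris` and `item_uris` are hypothetical. It compares the URIs listed in the <items> rdf:Seq against the rdf:about values of the actual <item> elements.

```python
import xml.etree.ElementTree as ET

# Namespaces in ElementTree's "{uri}" notation.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"

# Hypothetical feed: the Seq is built from item ids, while rdf:about is
# built from item links, so the first entry does not line up.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://planet.example/">
    <title>Example Planet</title>
    <link>http://planet.example/</link>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://blog.example/?p=461"/>
        <rdf:li rdf:resource="http://other.example/post-2"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://proxy.example/~r/blog/~3/abc">
    <title>First post</title>
    <link>http://proxy.example/~r/blog/~3/abc</link>
  </item>
  <item rdf:about="http://other.example/post-2">
    <title>Second post</title>
    <link>http://other.example/post-2</link>
  </item>
</rdf:RDF>
"""


def seq_uris(root):
    """URIs that the <items> rdf:Seq claims the feed contains."""
    path = f"{RSS}channel/{RSS}items/{RDF}Seq/{RDF}li"
    return [li.get(f"{RDF}resource") for li in root.iterfind(path)]


def item_uris(root):
    """rdf:about URIs of the <item> elements actually present."""
    return [item.get(f"{RDF}about") for item in root.iterfind(f"{RSS}item")]


root = ET.fromstring(SAMPLE)
listed = set(seq_uris(root))
present = set(item_uris(root))

# Listed in the Seq, but no <item> is rdf:about that URI.
dangling = listed - present
# Present as an <item>, but never mentioned in the Seq.
unlisted = present - listed

print("listed but missing:", sorted(dangling))
print("present but unlisted:", sorted(unlisted))
```

A reader that trusts the Seq would look up each listed URI and find nothing for the dangling one, which is presumably how the empty "(no subject)" entries arise.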

timeless

Updated

•

16 years ago

Component: RSS → Feed Reader

Product: Thunderbird → MailNews Core

alta88

Assignee

Comment 2

•

11 years ago

this should be resolved by either

1. closing as invalid, as it's up to the publisher to get the <items> list right, and other rdf publishers do it right.
2. ignoring the <items> list and getting all <item>s, like Fx does, and which Tb does if there isn't an <items> to begin with.

magnus, what do you think?

Flags: needinfo?(mkmelin+mozilla)
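
Option 2 above can be sketched roughly as follows. This is not the contents of rss1rdf.patch (the patch itself is not shown in this bug, and Thunderbird's feed code is JavaScript); it is a minimal Python illustration, with a made-up feed, placeholder `example` URLs, and a hypothetical `collect_items` helper, of reading every <item> directly and never consulting the <items> list.

```python
import xml.etree.ElementTree as ET

# Namespaces in ElementTree's "{uri}" notation.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"

# Hypothetical feed whose <items> list points at a URI that no <item> uses.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://planet.example/">
    <title>Example Planet</title>
    <link>http://planet.example/</link>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://blog.example/?p=461"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://proxy.example/~r/blog/~3/abc">
    <title>First post</title>
    <link>http://proxy.example/~r/blog/~3/abc</link>
  </item>
  <item rdf:about="http://other.example/post-2">
    <title>Second post</title>
    <link>http://other.example/post-2</link>
  </item>
</rdf:RDF>
"""


def collect_items(xml_text):
    """Return every <item> in the document, ignoring the <items> rdf:Seq."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter(f"{RSS}item"):
        entries.append({
            "about": item.get(f"{RDF}about"),
            "title": item.findtext(f"{RSS}title", default="(no subject)"),
            "link": item.findtext(f"{RSS}link"),
        })
    return entries


entries = collect_items(SAMPLE)
for entry in entries:
    print(entry["title"], "->", entry["link"])
```

Because the Seq is never read, a wrong or incomplete <items> list cannot produce empty entries; the trade-off is that any ordering the Seq was meant to convey is replaced by document order.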

Magnus Melin [:mkmelin]

Comment 3

•

11 years ago

2 sounds like the more pragmatic thing to do.

Flags: needinfo?(mkmelin+mozilla)

alta88

Assignee

Comment 4

•

11 years ago

Attached patch rss1rdf.patch

Assignee: nobody → alta88

Status: UNCONFIRMED → ASSIGNED

Ever confirmed: true

Attachment #819201 - Flags: review?(mkmelin+mozilla)

Magnus Melin [:mkmelin]

Comment 5

•

11 years ago

Comment on attachment 819201: rss1rdf.patch

Review of attachment 819201:

Looks good, thx! r=mkmelin

Attachment #819201 - Flags: review?(mkmelin+mozilla) → review+

Ludovic Hirlimann [:Usul]

Updated

•

11 years ago

Keywords: checkin-needed

Ryan VanderMeulen [:RyanVM]

Comment 6

•

11 years ago

https://hg.mozilla.org/comm-central/rev/36ac7d6d04ed

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Keywords: checkin-needed

Resolution: --- → FIXED

Target Milestone: --- → Thunderbird 27.0


Categories

(MailNews Core :: Feed Reader, defect)
