Closed
Bug 309085
Opened 19 years ago
Closed 18 years ago
False positives from FeedHandler.harvestFeeds hide distinct feeds
Categories
(Firefox Graveyard :: RSS Discovery and Preview, defect)
Firefox Graveyard
RSS Discovery and Preview
Tracking
(Not tracked)
RESOLVED
FIXED
Firefox 3 alpha1
People
(Reporter: philor, Assigned: sayrer)
References
()
Details
Attachments
(1 file)
4.29 KB,
patch
|
Gavin
:
review+
beltzner
:
ui-review+
|
Details | Diff | Splinter Review |
FeedHandler.harvestFeeds() tries too hard to hide "the same feed content in different feed formats" listings, and as a result hides feeds with different content, when the distribution of number of feeds in each type happens to be equal. http://plasmastrum.org has autodiscovery links for two feeds: <link rel="alternate" type="application/atom+xml" href="/syn/" title="Full text" /> <link rel="alternate" type="application/rss+xml" href="http://del.icio.us/rss/ap/!log" title="Link log" /> Anyone trying to subscribe to the "Link log" feed will need to copy the URL out of the source, open the Bookmark Manager, find the Add Live Bookmark item in the File menu, and paste it in. Suggested alternate algorithm: 1. strip (case-insensitively) "rss" and "atom" and all numbers from link titles 2. if all the titles in each type are also in the other types, return just the first type (or, a preferred type if we actually have a reason to prefer one). e.g., for <link rel="alternate" type="application/rss+xml" title="Entries in RSS 2.0"> <link rel="alternate" type="application/atom+xml" title="Entries in Atom 1.0"> <link rel="alternate" type="application/rss+xml" title="Comments in RSS 2.0"> <link rel="alternate" type="application/atom+xml" title="Comments in Atom 1.0"> once the titles are stripped down to where feedHash["application/rss+xml"][0].title is "Entries in ", every title in feedHash["application/rss+xml"] is also in feedHash["application/atom+xml"], and we can (more) safely assume they only differ in format.
Reporter | ||
Updated•19 years ago
|
Flags: blocking1.8b5?
Comment 1•19 years ago
|
||
Is this on trunk or on branch? I can't load the example URL, and on my weblog (http://www.beltzner.ca/mike) the harvester reports ALL of the RSS feeds, seemingly not stripping out any.
Reporter | ||
Comment 2•19 years ago
|
||
Trunk and branch. harvestFeeds bails on hiding feeds if there's a number mismatch: you have two application/rss+xml and one application/atom+xml, so you don't get hidden. I can upload a trivial testcase, if you want.
Reporter | ||
Comment 3•19 years ago
|
||
D'oh, no wonder you can't load it, I typoed it in comment 0 (the URL field's correct).
Updated•19 years ago
|
Flags: blocking1.8b5? → blocking1.8b5+
Comment 4•19 years ago
|
||
Yeah, that sounds good to me. What do you think of also checking to ensure that the path is the same between them? I've run across a few sites where no title attribute is given (it's not required, I don't think) but the pathnames were different. So ..: <link rel="alternate" type="application/rss+xml" title="RSS Feed" href="/news/rss.xml"> <link rel="alternate" type="application/rss+xml" title="RSS Feed" href="/sports/rss.xml"> So: 1. Remove "RSS", "ATOM", "{0-9}" and "." from the title attribute 2. Remove everything up to the last "/" in the href attribute 3. If all the href attribute values are the same and all the title attributes are the same, then optimize the feed list, selecting ATOM* if it exists or the first available feed otherwise. 4. Else, return all feeds.
Flags: blocking1.8b5+ → blocking1.8b5?
Comment 5•19 years ago
|
||
oops - a bugzilla overwrite thingie changed the blocking request back to ? from + ... my bad! (mtschrep had granted blocking previous to my screwup.)
Updated•19 years ago
|
Flags: blocking1.8b5? → blocking1.8b5+
Comment 6•19 years ago
|
||
Unless we see a safe patch pretty quickly, this isn't gonna make it. If you do get a patch, please request reviews and approval to land on the branch. Thanks.
Flags: blocking1.8b5+ → blocking1.8b5-
Updated•19 years ago
|
Target Milestone: --- → Firefox1.6-
Comment 7•18 years ago
|
||
(In reply to comment #4) > Yeah, that sounds good to me. What do you think of also checking to ensure that > the path is the same between them? I've run across a few sites where no title > attribute is given (it's not required, I don't think) but the pathnames were > different. So ..: > > <link rel="alternate" type="application/rss+xml" title="RSS Feed" > href="/news/rss.xml"> > <link rel="alternate" type="application/rss+xml" title="RSS Feed" > href="/sports/rss.xml"> > > So: > > 1. Remove "RSS", "ATOM", "{0-9}" and "." from the title attribute > 2. Remove everything up to the last "/" in the href attribute > 3. If all the href attribute values are the same and all the title attributes > are the same, then optimize the feed list, selecting ATOM* if it exists or the > first available feed otherwise. > 4. Else, return all feeds. But your suggest number 2 negates your suggestion to check the path. If you remove all the href up to the last "/" then in fact your only going to check the filename. What if there: http://www.example.com/club/sports.xml http://www.example.com/professional/sports.xml Your step 2 would make these the same when they clearly aren't.
Comment 8•18 years ago
|
||
A test case I have used: <html> <head> <title>Page Title</title> <link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" /> <link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" /> </head> <body>Web Page</body></html> shows that currently Firefox (and Minefield) don't seem to test the titles & hrefs correctly at all. The listed feeds have completely different text and hrefs yet only the atom one is shown. What also doesn't make sense is if you add a third feed. e.g. <html> <head> <title>Page Title</title> <link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" /> <link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" /> <link rel="alternate" type="application/rss+xml" title="now.there.are.three" href="http://now.there.are.three" /> </head> <body>Web Page</body></html> Then the feed that was being hidden is now shown again, along with the others. This suggests it is more dependent on finding even numbers of feeds where one type is rss and one type is atom than anything else. This can be further confirmed by adding a fourth: <html> <head> <title>Page Title</title> <link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" /> <link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" /> <link rel="alternate" type="application/rss+xml" title="now.there.are.three" href="http://now.there.are.three" /> <link rel="alternate" type="application/atom+xml" title="oh.wait.four" href="http://oh.wait.four" /> </head> <body>Web Page</body></html> which then causes both rss feeds to disappear. Then adding a fifth: <html> <head> <title>Page Title</title> <link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" /> <link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" /> <link rel="alternate" type="application/rss+xml" title="now.there.are.three" href="http://now.there.are.three" /> <link rel="alternate" type="application/atom+xml" title="oh.wait.four" href="http://oh.wait.four" /> <link rel="alternate" type="application/rss+xml" title="man.this.is.a.lot.of.feeds" href="http://man.this.is.a.lot.of.feeds" /> </head> <body>Web Page</body></html> which shows all of them again.
Reporter | ||
Updated•18 years ago
|
QA Contact: nobody → rss.preview
Reporter | ||
Updated•18 years ago
|
Target Milestone: Firefox 2 → ---
Assignee | ||
Comment 9•18 years ago
|
||
This "AI" is causing silent dataloss all over the place. IE7 will show two discovered links when the page contains two link elements, even if they are exactly the same in every respect (including URI). We should delete the code that does this.
Assignee: nobody → sayrer
Assignee | ||
Comment 10•18 years ago
|
||
Case 1: Two link elements, different formats, different titles, different URIs ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/1.html IE7: two feeds Fx2: one feed Case 2: Three link elements, different titles, different URIs, 1 Atom, 2 RSS ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/2.html IE7: three feeds Fx2: three feeds Case 3: Four feeds, 2 Atom, 2 RSS, different URLs, different titles ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/3.html IE7: four feeds Fx2: two feeds Case 4: Five feeds, 3 RSS, 2 Atom, different titles, different URLs ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/4.html IE7: five feeds Fx2: five feeds Case 5: 2 feeds, different titles, same URL, different formats ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/5.html IE7: 2 feeds Fx2: 1 feed Case 6: http://plasmasturm.org case ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/6.html IE7: 2 feeds Fx2: 1 feed Case 7: two feeds, identical in all respects ---------------------------------------------- http://www.franklinmint.fm/2006/12/09/7.html IE7: 2 feeds Fx2: 1 feed
Assignee | ||
Comment 11•18 years ago
|
||
this will show multiple feeds in cases where we didn't used to, but it gives IE compat and no prospect of endless AI bugs.
Attachment #248124 -
Flags: ui-review?(beltzner)
Attachment #248124 -
Flags: review?(gavin.sharp)
Updated•18 years ago
|
Attachment #248124 -
Flags: review?(gavin.sharp) → review+
Comment 12•18 years ago
|
||
Comment on attachment 248124 [details] [diff] [review] remove cleverness So, this is ui-r=beltzner assuming that there will be a follow-up bug that tries to restore some type of duplicate stripping. While I agree that dataloss is bad, and would rather show duplicates than make it look like feeds are missing, we also shouldn't be showing users a bunch of the same thing and asking them to choose between meaningless data formats.
Attachment #248124 -
Flags: ui-review?(beltzner) → ui-review+
Assignee | ||
Updated•18 years ago
|
Target Milestone: --- → Firefox 3 alpha1
Assignee | ||
Comment 13•18 years ago
|
||
Checking in browser.js; /cvsroot/mozilla/browser/base/content/browser.js,v <-- browser.js new revision: 1.746; previous revision: 1.745 done
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 14•18 years ago
|
||
file bug 363968.
Updated•5 years ago
|
Product: Firefox → Firefox Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•