Closed
Bug 309085
Opened 20 years ago
Closed 19 years ago
False positives from FeedHandler.harvestFeeds hide distinct feeds
Categories
(Firefox Graveyard :: RSS Discovery and Preview, defect)
Firefox Graveyard
RSS Discovery and Preview
Tracking
(Not tracked)
RESOLVED
FIXED
Firefox 3 alpha1
People
(Reporter: philor, Assigned: sayrer)
References
()
Details
Attachments
(1 file)
4.29 KB,
patch
|
Gavin
:
review+
beltzner
:
ui-review+
|
Details | Diff | Splinter Review |
FeedHandler.harvestFeeds() tries too hard to hide "the same feed content in
different feed formats" listings, and as a result hides feeds with different
content, when the distribution of number of feeds in each type happens to be equal.
http://plasmastrum.org has autodiscovery links for two feeds:
<link rel="alternate" type="application/atom+xml" href="/syn/" title="Full text" />
<link rel="alternate" type="application/rss+xml"
href="http://del.icio.us/rss/ap/!log" title="Link log" />
Anyone trying to subscribe to the "Link log" feed will need to copy the URL out
of the source, open the Bookmark Manager, find the Add Live Bookmark item in the
File menu, and paste it in.
Suggested alternate algorithm:
1. strip (case-insensitively) "rss" and "atom" and all numbers from link titles
2. if all the titles in each type are also in the other types, return just the
first type (or, a preferred type if we actually have a reason to prefer one).
e.g., for
<link rel="alternate" type="application/rss+xml" title="Entries in RSS 2.0">
<link rel="alternate" type="application/atom+xml" title="Entries in Atom 1.0">
<link rel="alternate" type="application/rss+xml" title="Comments in RSS 2.0">
<link rel="alternate" type="application/atom+xml" title="Comments in Atom 1.0">
once the titles are stripped down to where
feedHash["application/rss+xml"][0].title is "Entries in ", every title in
feedHash["application/rss+xml"] is also in feedHash["application/atom+xml"], and
we can (more) safely assume they only differ in format.
Reporter | ||
Updated•20 years ago
|
Flags: blocking1.8b5?
Comment 1•20 years ago
|
||
Is this on trunk or on branch? I can't load the example URL, and on my weblog
(http://www.beltzner.ca/mike) the harvester reports ALL of the RSS feeds,
seemingly not stripping out any.
Reporter | ||
Comment 2•20 years ago
|
||
Trunk and branch. harvestFeeds bails on hiding feeds if there's a number
mismatch: you have two application/rss+xml and one application/atom+xml, so you
don't get hidden. I can upload a trivial testcase, if you want.
Reporter | ||
Comment 3•20 years ago
|
||
D'oh, no wonder you can't load it, I typoed it in comment 0 (the URL field's
correct).
Updated•20 years ago
|
Flags: blocking1.8b5? → blocking1.8b5+
Comment 4•20 years ago
|
||
Yeah, that sounds good to me. What do you think of also checking to ensure that
the path is the same between them? I've run across a few sites where no title
attribute is given (it's not required, I don't think) but the pathnames were
different. So ..:
<link rel="alternate" type="application/rss+xml" title="RSS Feed"
href="/news/rss.xml">
<link rel="alternate" type="application/rss+xml" title="RSS Feed"
href="/sports/rss.xml">
So:
1. Remove "RSS", "ATOM", "{0-9}" and "." from the title attribute
2. Remove everything up to the last "/" in the href attribute
3. If all the href attribute values are the same and all the title attributes
are the same, then optimize the feed list, selecting ATOM* if it exists or the
first available feed otherwise.
4. Else, return all feeds.
Flags: blocking1.8b5+ → blocking1.8b5?
Comment 5•20 years ago
|
||
oops - a bugzilla overwrite thingie changed the blocking request back to ? from
+ ... my bad! (mtschrep had granted blocking previous to my screwup.)
Updated•20 years ago
|
Flags: blocking1.8b5? → blocking1.8b5+
Comment 6•20 years ago
|
||
Unless we see a safe patch pretty quickly, this isn't gonna make it. If you do
get a patch, please request reviews and approval to land on the branch. Thanks.
Flags: blocking1.8b5+ → blocking1.8b5-
Updated•20 years ago
|
Target Milestone: --- → Firefox1.6-
Comment 7•19 years ago
|
||
(In reply to comment #4)
> Yeah, that sounds good to me. What do you think of also checking to ensure that
> the path is the same between them? I've run across a few sites where no title
> attribute is given (it's not required, I don't think) but the pathnames were
> different. So ..:
>
> <link rel="alternate" type="application/rss+xml" title="RSS Feed"
> href="/news/rss.xml">
> <link rel="alternate" type="application/rss+xml" title="RSS Feed"
> href="/sports/rss.xml">
>
> So:
>
> 1. Remove "RSS", "ATOM", "{0-9}" and "." from the title attribute
> 2. Remove everything up to the last "/" in the href attribute
> 3. If all the href attribute values are the same and all the title attributes
> are the same, then optimize the feed list, selecting ATOM* if it exists or the
> first available feed otherwise.
> 4. Else, return all feeds.
But your suggest number 2 negates your suggestion to check the path. If you remove all the href up to the last "/" then in fact your only going to check the filename. What if there:
http://www.example.com/club/sports.xml
http://www.example.com/professional/sports.xml
Your step 2 would make these the same when they clearly aren't.
Comment 8•19 years ago
|
||
A test case I have used:
<html>
<head>
<title>Page Title</title>
<link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" />
<link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" />
</head>
<body>Web Page</body></html>
shows that currently Firefox (and Minefield) don't seem to test the titles & hrefs correctly at all. The listed feeds have completely different text and hrefs yet only the atom one is shown. What also doesn't make sense is if you add a third feed.
e.g.
<html>
<head>
<title>Page Title</title>
<link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" />
<link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" />
<link rel="alternate" type="application/rss+xml" title="now.there.are.three" href="http://now.there.are.three" />
</head>
<body>Web Page</body></html>
Then the feed that was being hidden is now shown again, along with the others. This suggests it is more dependent on finding even numbers of feeds where one type is rss and one type is atom than anything else.
This can be further confirmed by adding a fourth:
<html>
<head>
<title>Page Title</title>
<link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" />
<link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" />
<link rel="alternate" type="application/rss+xml" title="now.there.are.three" href="http://now.there.are.three" />
<link rel="alternate" type="application/atom+xml" title="oh.wait.four" href="http://oh.wait.four" />
</head>
<body>Web Page</body></html>
which then causes both rss feeds to disappear. Then adding a fifth:
<html>
<head>
<title>Page Title</title>
<link rel="alternate" type="application/atom+xml" title="no.feed.here" href="http://no.feed.here" />
<link rel="alternate" type="application/rss+xml" title="this.is.a.test" href="http://this.is.a.test" />
<link rel="alternate" type="application/rss+xml" title="now.there.are.three" href="http://now.there.are.three" />
<link rel="alternate" type="application/atom+xml" title="oh.wait.four" href="http://oh.wait.four" />
<link rel="alternate" type="application/rss+xml" title="man.this.is.a.lot.of.feeds" href="http://man.this.is.a.lot.of.feeds" />
</head>
<body>Web Page</body></html>
which shows all of them again.
Reporter | ||
Updated•19 years ago
|
QA Contact: nobody → rss.preview
Reporter | ||
Updated•19 years ago
|
Target Milestone: Firefox 2 → ---
Assignee | ||
Comment 9•19 years ago
|
||
This "AI" is causing silent dataloss all over the place. IE7 will show two discovered links when the page contains two link elements, even if they are exactly the same in every respect (including URI). We should delete the code that does this.
Assignee: nobody → sayrer
Assignee | ||
Comment 10•19 years ago
|
||
Case 1: Two link elements, different formats, different titles, different URIs
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/1.html
IE7: two feeds
Fx2: one feed
Case 2: Three link elements, different titles, different URIs, 1 Atom, 2 RSS
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/2.html
IE7: three feeds
Fx2: three feeds
Case 3: Four feeds, 2 Atom, 2 RSS, different URLs, different titles
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/3.html
IE7: four feeds
Fx2: two feeds
Case 4: Five feeds, 3 RSS, 2 Atom, different titles, different URLs
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/4.html
IE7: five feeds
Fx2: five feeds
Case 5: 2 feeds, different titles, same URL, different formats
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/5.html
IE7: 2 feeds
Fx2: 1 feed
Case 6: http://plasmasturm.org case
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/6.html
IE7: 2 feeds
Fx2: 1 feed
Case 7: two feeds, identical in all respects
----------------------------------------------
http://www.franklinmint.fm/2006/12/09/7.html
IE7: 2 feeds
Fx2: 1 feed
Assignee | ||
Comment 11•19 years ago
|
||
this will show multiple feeds in cases where we didn't used to, but it gives IE compat and no prospect of endless AI bugs.
Attachment #248124 -
Flags: ui-review?(beltzner)
Attachment #248124 -
Flags: review?(gavin.sharp)
Updated•19 years ago
|
Attachment #248124 -
Flags: review?(gavin.sharp) → review+
Comment 12•19 years ago
|
||
Comment on attachment 248124 [details] [diff] [review]
remove cleverness
So, this is ui-r=beltzner assuming that there will be a follow-up bug that tries to restore some type of duplicate stripping. While I agree that dataloss is bad, and would rather show duplicates than make it look like feeds are missing, we also shouldn't be showing users a bunch of the same thing and asking them to choose between meaningless data formats.
Attachment #248124 -
Flags: ui-review?(beltzner) → ui-review+
Assignee | ||
Updated•19 years ago
|
Target Milestone: --- → Firefox 3 alpha1
Assignee | ||
Comment 13•19 years ago
|
||
Checking in browser.js;
/cvsroot/mozilla/browser/base/content/browser.js,v <-- browser.js
new revision: 1.746; previous revision: 1.745
done
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 14•19 years ago
|
||
file bug 363968.
Updated•7 years ago
|
Product: Firefox → Firefox Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•