Closed Bug 257247 Opened 17 years ago Closed 16 years ago

Live Bookmark Feed Discovery Includes Atom URI that is not a site feed

Categories

(Firefox Graveyard :: RSS Discovery and Preview, defect)

defect
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED
Firefox1.5

People

(Reporter: stephen.duncan, Assigned: philor)

References

Details

(Keywords: fixed1.8)

Attachments

(3 files, 5 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040826 Firefox/0.9.1+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040826 Firefox/0.9.1+

Links for Atom that are not site feeds are being recognized as site feeds.  On
the example site (www.stephenduncanjr.com), there are two link tags in the head
being recognized as feeds:

<link rel="service.feed" type="application/atom+xml" title="Stephen Duncan Jr's
Blog" href="http://www.stephenduncanjr.com/blog.xml" />
<link rel="service.post" type="application/atom+xml" title="Stephen Duncan Jr's
Blog" href="http://www.blogger.com/atom/3411034" /> 

Only the first one should be recognized.  User's who bookmark the second will
get a "Live Bookmark" could not be loaded message" when attempting to view posts
for that link.  The site feed autodiscovery feature should only  find the
service.feed link, I think.  

Atom link feed meanings can be found here:
http://intertwingly.net/wiki/pie/LinkTagMeaning

Almost all Blog pages produced by Blogger (www.blogger.com) will feature this
particular example problem.

Reproducible: Always
Steps to Reproduce:
1. Go to http://www.stephenduncanjr.com in Firefox Aviary nightly build
2. Click on "RSS" status bar icon
3. Notice two feeds listed.  Only one is correct.

Actual Results:  
Two "feeds" are listed, the actual feed for the site
(http://www.stephenduncanjr.com/blog.xml), and a link to the post URI for the
site (http://www.blogger.com/atom/3411034).

Expected Results:  
Only listed the site feed (http://www.stephenduncanjr.com/blog.xml)
> The site feed autodiscovery feature should only  find the
> service.feed link, I think.  

I've seen some websites use rel="alternate", so it would probably be better to
specifically exclude certain types.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch patch to exclude service.post (obsolete) — Splinter Review
This makes the RSS checker ignore 'service.post' rel values.  Can someone more
knowledgeable about ATOM say if there's any more 'application/atom+xml' types
that should be ignored?  http://intertwingly.net/wiki/pie/LinkTagMeaning has a
good list of rel values.
Assignee: vladimir → quark29
Status: NEW → ASSIGNED
Attachment #157263 - Flags: review?(vladimir)
I think, at a minimum, "service.edit" should also be ignored, based on that
Wiki.  But I'd love to get the opinion of someone more involved with Atom, as
well as knowledge about general usage of these tags.
Flags: blocking-aviary1.0?
Fixed on branch as part of vlad's checkin @ 2004-09-05 00:56
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=mozilla/browser/base/content&command=DIFF_FRAMESET&file=browser.js&rev1=1.296.2.3.2.87&rev2=1.296.2.3.2.88&root=/cvsroot

Confirmed with self build (just now).

Still needs trunk checkin, looks like.
Assignee: quark29 → vladimir
Status: ASSIGNED → NEW
Keywords: fixed-aviary1.0
Attachment #157263 - Attachment is obsolete: true
Attachment #157263 - Flags: review?(vladimir)
Flags: blocking-aviary1.0?
It's a bug in Blogger that it uses rel="service.feed" in HTML. That's the
correct rel value in a feed, in an atom:link element, but in HTML the only rel
value to point to a feed that's an alternate representation of the HTML is
"alternate" - see the autodiscovery Internet Draft at
http://www.ietf.org/internet-drafts/draft-ietf-atompub-autodiscovery-00.txt

Luckily, Blogger expands all that from a single template tag, [BlogMetaData], so
they should be able to correct it easily. I "filed a bug" (emailed their help
system).
*** Bug 259438 has been marked as a duplicate of this bug. ***
Blogger's bug with service.feed is fixed: Stephen's blog now has a
rel="alternate", and they all will as they get republished. So, as long as it
doesn't make a general release too quickly, what's been working for me, and
passes everything in http://diveintomark.org/tests/client/autodiscovery/
 (as hacked into 0.10, sorry, not a patch: ever tried pulling a tree over dialup?):

@ http://lxr.mozilla.org/aviarybranch/source/browser/base/content/browser.js#5653
(5635 in 0.10)

  var erel = event.target.rel;
  var etype = event.target.type;
  var etitle = event.target.title;

  etype = etype.replace(/^\s+/, "");
  etype = etype.replace(/\s+$/, "");
  etype = etype.toLowerCase();
  erel = erel.toLowerCase();

  if ((etype == "application/rss+xml" ||
       etype == "application/atom+xml") &&
       erel.indexOf("alternate") != -1)
  {
    const targetDoc = safeGetProperty(event.target, "ownerDocument");

Everything case-insensitive, strip space around type, rel can be a list, not
just a single value, ought to cover everything. (Strictly speaking, that would
let rel="blalternate" through, but erel.split(" ") and iterating through an
array is too much, even for me.)
> (Strictly speaking, that would let rel="blalternate" through, but erel.
> split(" ") and iterating through an array is too much, even for me.)

You don't want to catch things like

   rel="disabled-alternate"
   rel="alternates"
   rel="alternate-rock-background-music"

...and so forth.
*** Bug 259446 has been marked as a duplicate of this bug. ***
(In reply to comment #8)
> You don't want to catch things like
> 
>    rel="disabled-alternate"

True, might as well get it right the second time instead of the third. How about:

  var erel = event.target.rel;
  var etype = event.target.type;
  var etitle = event.target.title;

  etype = etype.replace(/^\s+/, "");
  etype = etype.replace(/\s+$/, "");
  etype = etype.toLowerCase();

  if ((etype == "application/rss+xml" ||
       etype == "application/atom+xml") &&
       erel.match(/(^|\s)alternate($|\s)/i))
  {
    const targetDoc = safeGetProperty(event.target, "ownerDocument");

Works with all the test-cases I could come up with, anyway.
Attached file Testcase for feed autodiscovery (obsolete) —
Assignee: vladimir → vladimir+bm
*** Bug 274234 has been marked as a duplicate of this bug. ***
The morphed "do better discovery" parts of this are maybe a little higher
priority than they were, now that www.mozilla.org uses application/rdf+xml and
titles that don't include "RSS", so we only discover the MozillaZine feed, not
either of the official m.o feeds.
Maybe the following should also be added to this patch (as per bug 259469)?

    etype == "application/rdf+xml" ||
Because being right always has to be wrong: WordPress by default links to a
heavy RSS 2.0 feed with type="application/xml+rss", a heavy Atom 0.3 feed with
type="application/atom+xml" and a nice light RSS 0.92 feed with only a
plain-text excerpt in <description> that's perfect for us with type="text/xml".

Maybe (alternate && ((rss+xml || atom+xml) || (rssInTitle && (rdf+xml || xml))))?
*** Bug 289177 has been marked as a duplicate of this bug. ***
From the dup: the HTMLization of the draft Atom spec itself gives false
positives, from links like <link rel="Chapter" title="2 Atom Documents"
href="#rfc.section.2">
This should be fixed on trunk by the aviary landing.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
It's only "fixed" by blacklisting one string. As bug 289177 demonstrates, a
blacklist is unworkable: we would need to blacklist every rel value that anyone
might use when their title value included "RSS" or "Atom," which means
blacklisting every string which is not "alternate". If we close this bug because
"service.post" is "fixed," I'll have to immediately file six new bugs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Wups, meant to take it, not shove it down Vlad's throat.
Assignee: vladimir+bm → bugzilla
Status: ASSIGNED → NEW
Attached file Expanded testcase
Attachment #159554 - Attachment is obsolete: true
Attached patch To spec, plus some wiggle (obsolete) — Splinter Review
Best we can do in an imperfect world, I think: rel must be alternate, if you
must use text/xml, application/xml, or application/rdf+xml then you have to use
" rss " in the title to separate it from all the other possible things. Works
on my testcase, Mark's test suite, and around 300 random weblogs and news sites
I tried.
Attachment #159555 - Attachment is obsolete: true
Attachment #191770 - Flags: review?(vladimir)
Looks like this is going to pick up a little bitrot from bug 303848
Component: Bookmarks → RSS Discovery and Preview
Keywords: fixed-aviary1.0
Attachment #191770 - Attachment is obsolete: true
Attachment #191770 - Flags: review?(vladimir)
Depends on: 303848
Oops, now we really need this, since bug 303848 now takes just the first type
when there's equal numbers of each, so a false positive would mean that feed
autodiscovery would only show, e.g., rel="next" type="text/html" title="RSS
Chapter".
Attachment #192484 - Flags: review?(mconnor)
Discovering HTML and ignoring RSS as a result just ain't right.
'kay, make that bug 305134 that's taking FeedView out, but leaving in the format
deduplicating code that will over-rely on autodiscovery and wind up hiding every
sort of feed if we have a non-feed false positive. I'd call that a beta-blocker.
Blocks: 305134
No longer blocks: 303848
Flags: blocking1.8b4?
Flags: blocking1.8b4? → blocking1.8b4+
Severity: normal → minor
Whiteboard: [needs review mconnor]
Flags: blocking1.8b5+
Flags: blocking1.8b5+
Attachment #192484 - Flags: review?(mconnor) → review+
Whiteboard: [needs review mconnor] → [needs approval]
Attachment #192484 - Flags: approval1.8b4?
Attached patch unrottedSplinter Review
Too much context to apply on the branch anymore; unrotted and carrying over the
r=mconnor.
Attachment #192484 - Attachment is obsolete: true
Attachment #193792 - Flags: review+
Attachment #193792 - Flags: approval1.8b4?
Attachment #192484 - Flags: approval1.8b4?
Attachment #193792 - Flags: approval1.8b4? → approval1.8b4+
1.8 branch:
Checking in browser.js;
/cvsroot/mozilla/browser/base/content/browser.js,v  <--  browser.js
new revision: 1.479.2.14; previous revision: 1.479.2.13
done

trunk:
Checking in browser.js;
/cvsroot/mozilla/browser/base/content/browser.js,v  <--  browser.js
new revision: 1.493; previous revision: 1.492
done
Keywords: fixed1.8
Whiteboard: [needs approval]
Target Milestone: --- → Firefox1.5
Status: NEW → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
Resetting QA Contact to default.
QA Contact: mconnor → rss.preview
Product: Firefox → Firefox Graveyard
You need to log in before you can comment on or make changes to this bug.