Feed sniffing should apply to fewer mime types

VERIFIED FIXED in Firefox 3.1b1



RSS Discovery and Preview
10 years ago
8 years ago


(Reporter: bz, Assigned: bz)


({html5, perf})

Firefox 3.1b1
html5, perf
Bug Flags:
blocking-firefox3 -
blocking-firefox3.5 -
wanted-firefox3 +
wanted-firefox3.5 +

Firefox Tracking Flags

(Not tracked)



(1 attachment)

Right now, we sniff all loads in the content area, subframes, or <object> for possibly being feeds.

This seems wrong to me.  I see no reason to sniff an image/png or video/whatever or application/x-msword file to see whether it might be a feed.  What we do now is bad for the web in general, not to mention a possible performance drag.

Realistically, this code should only sniff types that are likely to be feeds.  A few of the relevant XML types, text/plain, the "maybe text" type, and text/html should cover it, I suspect.
Flags: blocking-firefox3?


10 years ago
Summary: Feed sniffing too permissive → Feed sniffing should apply to fewer mime types
Flags: blocking-firefox3? → blocking-firefox3+
Keywords: perf


10 years ago
Assignee: nobody → rflint
OS: Linux → All
Hardware: PC → All
Target Milestone: --- → Firefox 3 M10


10 years ago
Target Milestone: Firefox 3 M10 → Firefox 3 M11


10 years ago
Priority: -- → P5


10 years ago
Flags: wanted-firefox3+
Flags: blocking-firefox3-
Flags: blocking-firefox3+
Target Milestone: Firefox 3 Mx → Firefox 3 M11
This also happens to violate HTML5, and is causing real-life interoperability issues...
Flags: blocking-firefox3.1?
Keywords: html5
So are you saying that you want this bug to implement HTML5 sniffing, which means only sniffing for the results of incompetent PHP programmers who send text/html while making it impossible to subscribe to static foo.rss or foo.atom if served by an Apache server with a mime.types file from before August 2007, or to subscribe to foo.xml at all?
HTML5 sniffing specifically allows you to sniff the Apache text/plain brokenness, so I have no idea what you're talking about with "impossible ... if served by an Apache server".

I think what we should be doing is what I said in comment 0: text/html, XML types, and the HTML5-allowed sniffing.
Yes, HTML5 sniffing allows you to sniff Apache text/plain: as text/plain, or as application/octet-stream. Nothing else. There is only one possible entry into feed sniffing in the HTML5 algo, "text/html". No sniffing of text/xml or application/xml or any type ending +xml, including RSS 1.0 served as application/rdf+xml, is allowed.

Comment 5

9 years ago
(In reply to comment #4)
> No sniffing of text/xml or
> application/xml or any type ending +xml, including RSS 1.0 served as
> application/rdf+xml, is allowed.

If that is correct, I don't see how we can follow HTML5's current text. Do any browsers behave that way?

Comment 6

9 years ago
I don't see how HTML5 forbids "sniffing" XML for feeds. (I don't think that's considered sniffing.)
(In reply to comment #5)
> Do any browsers behave that way?

Of course not.

(In reply to comment #6)
> I don't see how HTML5 forbids "sniffing" XML for feeds. (I don't think that's
> considered sniffing.)

Um, what? Please, please, please *read* http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#content-type0 paying attention to parts like where you are told that text/plain may only remain text/plain or become application/octet-stream, and where you are told unambiguously that if you are sniffing and the type is any XML type, you must stop instantly, and then where you are told if and only if you have made it that far without being stopped and the type is text/html, you may then proceed to the lovingly detailed CONTENT-TYPE SNIFFING: feed or HTML section, and then explain to me your theory that sniffing content served under another type and treating it as though it was served with a feed type is not sniffing. I look forward to hearing wtf that section about sniffing feeds is for, if sniffing feeds is not sniffing feeds. While you are doing that, I'll dig up Hixie's mails that clearly and unambiguously say that that's exactly what he intends to say, that nothing but application/rss+xml, application/atom+xml, and text/html may be treated as being feeds when loaded as content, rather than from autodiscovery. Whether or not at some future date HTML5 will still say that, that *is* what it says today.
That section of HTML5 used to say something much more reasonable, and it sounds like it should be changed to say that more reasonable thing again.  Please post to public-html accordingly?

The original point of comment 0 stands, in any case.

Comment 9

9 years ago
The file type is XML. What happens during XML page loading is defined here: http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#read-xml

HTML5 does indeed not allow sniffing in text/plain. I did not contest that.

Comment 10

9 years ago
(I'm not sure what you heard from Hixie and where. It does not seem to be accurate as far as I can tell. Also, reading HTML5 is part of my job...)
Ah, my apologies, I lacked sufficient imagination to expect that there would be sniffing and "sniffing" and that "sniffing" wouldn't be called sniffing.
Would be good to get, but not going to block on it.
Flags: wanted-firefox3.1+
Flags: blocking-firefox3.1?
Flags: blocking-firefox3.1-
Created attachment 335064 [details] [diff] [review]
Let's try this
Attachment #335064 - Flags: review?(rflint)


9 years ago
Attachment #335064 - Flags: review?(rflint) → review+

Comment 14

9 years ago
Comment on attachment 335064 [details] [diff] [review]
Let's try this

>+  // Don't sniff arbitrary types.  Limit sniffing to situations that
>+  // we think can reasonably arise.
>+  if (!contentType.EqualsLiteral(TEXT_PLAIN) &&

Let's take this out, per HTML5. We can put it back if it turns out not to be web compatible.

>+      !contentType.EqualsLiteral(APPLICATION_GUESS_FROM_EXT) &&
>+      !contentType.EqualsLiteral(TEXT_HTML) &&
>+      !contentType.EqualsLiteral(APPLICATION_OCTET_STREAM) &&
>+      // Same criterion as XMLHttpRequest.  Should we be checking for "+xml"
>+      // and check for text/xml and application/xml by hand instead?
>+      contentType.Find("xml") == -1)

We can do better here, let's file a follow-up.
I'll take out the APPLICATION_GUESS_FROM_EXT too, then, since that's our "text/plain that looks like binary" thing.
Assignee: rflint → bzbarsky
Pushed changeset 45d7857e2e0e with that change.
Last Resolved: 9 years ago
Resolution: --- → FIXED


9 years ago
Target Milestone: Firefox 3 beta3 → Firefox 3.1b1
Blocks: 360705
Verified that it has been fixed on build

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1pre) Gecko/20090530 Shiretoko/3.5pre (.NET CLR 3.5.30729)
You need to log in before you can comment on or make changes to this bug.