Bug 309438 - Accept: header too long on account of text types
Status: RESOLVED FIXED
Product: Core
Classification: Components
Component: Networking: HTTP
Version: Trunk
Platform: All All
Importance: -- normal
Target Milestone: mozilla1.9alpha1
Assigned To: Christian :Biesinger (don't email me, ping me on IRC)
QA Contact: Patrick McManus [:mcmanus]
Mentors:
Depends on: 364234 364352
Blocks:
Reported: 2005-09-21 01:25 PDT by Brendan Eich [:brendan]
Modified: 2007-01-17 15:02 PST
pavlov: blocking1.9-


Attachments
patch per comment 11 (1.31 KB, patch)
2006-10-10 17:58 PDT, Christian :Biesinger (don't email me, ping me on IRC)
darin.moz: review+
dbaron: superreview+

Description Brendan Eich [:brendan] 2005-09-21 01:25:58 PDT
See bug 240493.  The text types are

text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8

These were added long ago, in a couple of steps.  One big jump came in bug
83458, but bug 58040 started the bloat from */*.  The comments are funny to read
after all these years -- WAP/WML?  It is to laugh.

[Ok, amazon.com did serve content, and may still for all I know, in some such
format, as well as in text/html -- but they did so based only on a WAP client
saying it accepts WML, I'm sure.  No way would amazon risk sending that stuff to
a typical browser user.]

What can be done, at this late date?  It's particularly grating to see the two
XML types (which is it, text or application?  Grrr).  application/xhtml+xml is
important to some who think the web can be tilted away from HTML by clients such
as Firefox (a pipe dream, IMHO).

q=0.9 on text/html is evidently required so we can advertise our virtuousness to
servers who can send XML in addition to more-or-less-the-same HTML. But naughty,
dirty, sinful HTML is not going away, ever; there are billions and billions of
pages of it out there.  So why are we burning cycles and fiber on
text/html;q=0.9?  What good are we doing by that little bandwidth "sin" tax?

IE sends no text types in its Accept header, but it's hardly the thing to
compare ourselves to, I know (yeesh: it sends a bunch of Office application/*
junk in my new XP box, after too many image types that are standard now).

Safari sends */*.  Good for it.

I don't have Opera handy.

Just to fan the flames, or quell them again, I'll repeat something I wrote in
bug 240493: client-driven content negotiation on the web is badly broken, and
not just due to evil/lazy browser implementors.

The protocol does not scale, so no one wants to let too many types creep in. 
This leads to staleness and (minor) bloat, with reform coming, if possible, only
once in a blue moon.  Is it time for reform, or are we stuck with all these text
types?

/be
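
To make the preference order in that header concrete, here is a minimal sketch that splits the text types quoted above into (q, type) pairs and counts their bytes. The helper name `ranked` is made up for illustration, the parsing is deliberately naive (no quoted parameters, no error handling), and only the text portion of the header quoted in the description is used.

```python
# Text types quoted in the description (only the text portion of the header).
TEXT_TYPES = ("text/xml,application/xml,application/xhtml+xml,"
              "text/html;q=0.9,text/plain;q=0.8")


def ranked(accept):
    """Return (q, media_type) pairs, highest preference first."""
    entries = []
    for item in accept.split(","):
        media_type, _, params = item.strip().partition(";")
        q = 1.0  # an omitted q parameter means q=1
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        entries.append((q, media_type.strip()))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda entry: entry[0], reverse=True)


for q, media_type in ranked(TEXT_TYPES):
    print(f"{q:.1f}  {media_type}")
print(len(TEXT_TYPES), "bytes for the text types alone")
```

Because an omitted q counts as 1, the three XML types all outrank text/html in this header, which is the "sin tax" complained about above.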
Comment 1 David Baron :dbaron: ⌚️UTC-10 2005-09-21 01:41:58 PDT
(In reply to comment #0)
> q=0.9 on text/html is evidently required so we can advertise our virtuousness to
> servers who can send XML in addition to more-or-less-the-same HTML.

And we still don't load XHTML incrementally, so for our users, HTML is actually
better.
Comment 2 Josh Aas 2005-09-21 01:51:26 PDT
Opera 8.5 on Mac sends this:

text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg,
image/gif, image/x-xbitmap, */*;q=0.1
Comment 3 Laurens Holst 2005-09-21 03:37:57 PDT
I’d say get rid of at least text/xml and text/html.

I’d personally like application/xhtml+xml to stay there. Aside from your
personal opinion of XHTML, there are many people who want to use XHTML and I’d
say it is the most prominent case where the Accept: header is actually *used*.
Also, without this particular part of the accept header, using XHTML becomes
impossible until all major browsers have implemented XHTML and are widespread
enough to use it. That, or serve XHTML as text/html, which I think you agree is
an even worse solution.


~Grauw
Comment 4 Laurens Holst 2005-09-21 03:40:19 PDT
In contrast to that last paragraph of mine, text/html serves no purpose at
all. It is supported by everyone. No-one checks the accept header for it. Same
for text/plain. As for text/xml, application/xml is already there, I do not see
why a deprecated (is it? anyways, the application/ one is preferred) MIME type
should be in the Accept header. I do not have a particular opinion about
application/xml.

~Grauw
Comment 5 Anne (:annevk) 2005-09-21 05:21:21 PDT
The thing with regard to application/xhtml+xml is a fact, not an opinion.
text/xml is not yet deprecated. It is intended that the next XML media types RFC
will do so.

I think we should prefer text/html at least over application/xml. I have seen
some XML representations of documents on the W3C just because the HTML variant
is not preferred.
Comment 6 Christian :Biesinger (don't email me, ping me on IRC) 2005-09-21 06:23:46 PDT
we should keep the XHTML mime type in some form, imo. it allows servers to send
us mixed-namespace documents if they so desire.

If you want to rip out anything that's not used by the majority of web content,
then we might as well remove most of our CSS and DOM support.
Comment 7 Brendan Eich [:brendan] 2005-09-21 13:21:49 PDT
(In reply to comment #6)
> we should keep the XHTML mime type in some form, imo. it allows servers to send
> us mixed-namespace documents if they so desire.

Agreed.

> If you want to rip out anything that's not used by the majority of web content,
> then we might as well remove most of our CSS and DOM support.

That's fallacious, a straw man, since we're not talking about ripping out
anything substantive in the web platform.  We're talking about how to get rid of
ubiquitous and q=1 text types from Accept.  Let's stick to the subject.

How about this for text types?

  application/xml,application/xhtml+xml

Questions:

1.  Can we lose text/xml as Laurens proposes?

2.  What about q < 1 for the above?  As dbaron points out, we don't do
incremental layout for XHTML.  If the above two types are in our Accept: header,
is there some standard Apache configuration option that would tend to send us
XHTML instead of HTML in the absence of a lower q for XHTML?

/be
Comment 8 Anne (:annevk) 2005-09-21 15:08:08 PDT
We would need to mention text/html as well then. Apache knows that when there
are two documents, foo.xhtml and foo.html, Mozilla prefers the .html (text/html)
variant. If we do not mention text/html, Apache thinks Mozilla does not support
it and gives back the foo.xhtml.

I also think we should prefer text/html and application/xhtml+xml over
application/xml, as those are more semantically rich document formats.
Comment 9 Brendan Eich [:brendan] 2005-09-21 16:17:24 PDT
(In reply to comment #8)
> We would need to mention text/html as well then. Apache knows that when
> there are two documents, foo.xhtml and foo.html, Mozilla prefers the .html
> (text/html) variant.

Does it know this for Mozilla (Gecko) UAs, or all UAs?  I.e., is it hardwired to
send foo.html without any q parameter for either type in Accept:?

Should we Accept: application/xhtml+xml at q < 1 to prefer text/html to it?

> If we do not mention text/html, Apache thinks Mozilla
> does not support it and gives back the foo.xhtml.

Yeah, that makes sense.  Would it still do that with */* at q=1 at the end? 
Just asking to make sure I understand the RFC (and so does Apache).  Of course
we want q < 1 for */* at the end.

> I also think we should prefer text/html and application/xhtml+xml over
> application/xml, as those are more semantically rich document formats.

I agree that they are richer, but it still sucks to have to spell out text/html
just so we can talk about the other, far less common, types.

So we're down to eliminating text/plain and text/xml?  What a world.

/be
Comment 10 Brendan Eich [:brendan] 2005-09-21 16:36:22 PDT
(In reply to comment #2)
> Opera 8.5 on Mac sends this:
> 
> text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg,
> image/gif, image/x-xbitmap, */*;q=0.1

It sounds like we want this minus the unnecessary (png, jpeg, gif, x-xbitmap
even [who cares!]) types, and with spaces squeezed out, and q=0.5 for */* (or is
there a good reason for Opera's 0.1?).

Then we'll be more bandwidth-frugal than Opera, and as XHTML-ready.

(Yet I repeat my cheer/taunt: Go Safari! :-P)

/be
Comment 11 Anne (:annevk) 2005-09-22 01:40:10 PDT
(In reply to comment #9)
> Yeah, that makes sense.  Would it still do that with */* at q=1 at the end?

Not when application/xhtml+xml has a lower q value. This would work I guess:

# text/html,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7
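
The exchange in comments 8 through 11 can be sketched roughly as follows. This is a simplified model of Accept-based variant selection (exact type beats type/*, which beats */*, and unlisted types get q=0); it is not Apache's actual mod_negotiation algorithm, which also weighs server-side factors. The names `parse_accept`, `quality`, and `pick` are hypothetical.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs; q defaults to 1.0."""
    ranges = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((parts[0], q))
    return ranges


def quality(media_type, ranges):
    """q for one concrete type: exact match, then type/*, then */*."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        for media_range, q in ranges:
            if media_range == candidate:
                return q
    return 0.0  # not listed and no wildcard: treated as not acceptable


def pick(variants, header):
    """Choose the variant with the highest q (first one wins on a tie)."""
    ranges = parse_accept(header)
    return max(variants, key=lambda variant: quality(variant, ranges))


# Assumed server setup from comment 8: foo.html and foo.xhtml side by side.
VARIANTS = ["text/html", "application/xhtml+xml"]
COMMENT_7 = "application/xml,application/xhtml+xml"
COMMENT_11 = "text/html,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7"

print(pick(VARIANTS, COMMENT_7))   # text/html is unlisted here
print(pick(VARIANTS, COMMENT_11))  # text/html is listed at q=1
```

Under this model, dropping text/html entirely makes the XHTML variant win, while the header proposed in comment 11 keeps text/html ahead of application/xhtml+xml.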
Comment 12 Gervase Markham [:gerv] 2005-09-22 01:51:54 PDT
How's this for an argument?

The "bandwidth tax" argument doesn't hold much water unless the increased length
causes a significant proportion of our requests to exceed the size of a single
packet when otherwise they wouldn't. If 99% of our requests are single-packet
already, there's not much gain in shrinking Accept: further.

Gerv
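
The arithmetic behind this argument can be sketched as below. The 1460-byte segment size is an assumption (a 1500-byte Ethernet MTU minus 40 bytes of IPv4 and TCP headers), and the request sizes and the 40-byte saving are made-up illustrative numbers, not measurements of real Firefox requests.

```python
import math

# Assumption: 1500-byte Ethernet MTU minus 40 bytes of IPv4 and TCP headers.
ASSUMED_MSS = 1460


def packets_needed(request_bytes, mss=ASSUMED_MSS):
    """Number of TCP segments needed to carry a request of this size."""
    return math.ceil(request_bytes / mss)


def saves_a_packet(request_bytes, bytes_saved, mss=ASSUMED_MSS):
    """True only if trimming bytes_saved moves the request under a segment boundary."""
    before = packets_needed(request_bytes, mss)
    after = packets_needed(request_bytes - bytes_saved, mss)
    return after < before


# Hypothetical request sizes; 40 bytes stands in for whatever a shorter Accept: saves.
for size in (600, 1460, 1480, 3000):
    print(size, saves_a_packet(size, 40))
```

Only requests that sit just above a segment boundary benefit in packet count; for everything else the saving is a few bytes on the wire and nothing more.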
Comment 13 Jonathan Watt [:jwatt] 2005-09-22 04:52:47 PDT
(In reply to comment #12)
> If 99% of our requests are single-packet already, there's not much gain in 
> shrinking Accept: further.

To know that wouldn't you need stats on plug-in installs in Firefox and
knowledge of what, if anything, those plug-ins add to Accept:. Does anyone have
anything like that?
Comment 14 Brendan Eich [:brendan] 2005-12-29 12:01:22 PST
I think we should do something like what comment 11 proposes for 1.9a1.

/be
Comment 15 Gervase Markham [:gerv] 2005-12-30 04:06:52 PST
I guess that pointing out that we prefer PNG over GIF (as the current header does) is relevant only if some browser in the world still doesn't support PNG. In 2005, I don't know of one.

So, as comment 11 says:
Accept: text/html,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7

We should be ready to add SVG when the SVG team thinks our support is solid enough.

Gerv
Comment 16 Henri Sivonen (:hsivonen) 2006-05-26 03:23:41 PDT
I agree that the content negotiation concept as defined in HTTP is broken.

Some historical perspective:
Even though Accept: application/xhtml+xml is most often used by XHTML fans merely to deprive Firefox users of incremental display and to show an occasional yellow screen of death, that is not the use case for which the type got its place in the header.

application/xhtml+xml was added to the Accept header at the time when MathPlayer in IE did not support the real XHTML type and this Mozilla-side change made negotiating XHTML+MathML between Mozilla and IE+MathPlayer possible using the usual Apache modules without CGI. The alternatives that were suggested were much worse.

The relevant historical references are:
http://groups.google.com/group/netscape.public.mozilla.mathml/browse_thread/thread/f0d7442075946397/
http://groups.google.com/group/netscape.public.mozilla.mathml/browse_thread/thread/ab83de837ff21576/
http://groups.google.com/group/netscape.public.mozilla.mathml/browse_thread/thread/a2dd34dc398590f2/

Nowadays, AFAIK, you can serve XHTML+MathML without content negotiation as application/xhtml+xml to both.
Comment 17 Christian :Biesinger (don't email me, ping me on IRC) 2006-10-10 17:58:51 PDT
Created attachment 241904
patch per comment 11
Comment 18 David Baron :dbaron: ⌚️UTC-10 2006-11-20 16:00:53 PST
Comment on attachment 241904
patch per comment 11

sr=dbaron assuming we still send image/png for image requests.

Also, please file a bug on bumping application/xhtml+xml back to 1.0 once incremental loading of XML lands.
Comment 19 Christian :Biesinger (don't email me, ping me on IRC) 2006-11-26 12:41:03 PST
yep, the accept header for images is at:
http://lxr.mozilla.org/seamonkey/source/modules/libpr0n/src/imgLoader.cpp#214

filed bug 361892

checked in:
Checking in all.js;
/cvsroot/mozilla/modules/libpref/src/init/all.js,v  <--  all.js
new revision: 3.662; previous revision: 3.661
done
Comment 20 Aaron Leventhal 2007-01-17 13:20:37 PST
I won't pretend to understand all the issues, but this caused bug 364352. 

Should the techniques people have been using to switch between HTML/XHTML depending on the accept header still work?

For example, the Apache .htaccess rules documented in "XHTML's Dirty Little Secret" no longer work. The article URL is
http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html
Comment 21 Brendan Eich [:brendan] 2007-01-17 14:52:46 PST
(In reply to comment #20)
> I won't pretend to understand all the issues, but this caused bug 364352. 
> 
> Should the techniques people have been using to switch between HTML/XHTML
> depending on the accept header still work?

Not if it was broken.

> For example, the Apache .htaccess rules documented in "XHTML's Dirty
> Little Secret" no longer work. The article URL is
> http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html

Yes, those rules are buggy.  See http://www.intertwingly.net/blog/2006/12/12/Gran-Paradiso .

/be
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2007-01-17 15:02:44 PST
> http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html

From the article:

  RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0

As a substring match, that's spectacularly buggy.
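
A small sketch of why the substring match misfires, using Python's `re` module as a stand-in for the regex engine behind RewriteCond. Only the one condition quoted above is modelled, not the article's full rule set. `STRICT_PATTERN` is a hypothetical tightening written for this illustration; it does not come from the article or from the linked blog post.

```python
import re

# Pattern from the quoted RewriteCond, without the leading "!" (negation).
ARTICLE_PATTERN = re.compile(r"application/xhtml\+xml\s*;\s*q=0")

# Hypothetical stricter pattern: q must be exactly zero (0, 0., 0.0, 0.00, 0.000)
# and be followed by another parameter, the next media range, or the end.
STRICT_PATTERN = re.compile(
    r"application/xhtml\+xml\s*;\s*q=0(?:\.0{0,3})?\s*(?:[,;]|$)"
)

# Text types from the description, and the header proposed in comment 11.
OLD_TEXT_TYPES = ("text/xml,application/xml,application/xhtml+xml,"
                  "text/html;q=0.9,text/plain;q=0.8")
NEW_ACCEPT = "text/html,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7"


def condition_passes(pattern, accept):
    """A negated RewriteCond passes only when the pattern does NOT match."""
    return pattern.search(accept) is None


for label, accept in (("old", OLD_TEXT_TYPES), ("new", NEW_ACCEPT)):
    print(label,
          "article:", condition_passes(ARTICLE_PATTERN, accept),
          "strict:", condition_passes(STRICT_PATTERN, accept))
```

The article's pattern treats "q=0.9" as if it were "q=0", because "q=0" is a prefix of it, so the condition fails for the new header even though XHTML is still accepted.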
