"Accept:" header should do something useful.

VERIFIED FIXED in mozilla0.9.1

Status

Core
Networking: HTTP
P3
enhancement
VERIFIED FIXED
17 years ago
4 years ago

People

(Reporter: Adrian Havill, Assigned: Darin Fisher)

Tracking

Trunk
mozilla0.9.1

Details

Attachments

(5 attachments)

(Reporter)

Description

17 years ago
The comment in the source code says that the Accept: */* header is sent because
MIME-based content negotiation is dead. I believe this is wrong because
MIME-based content negotiation is a key feature of HTTP/1.1, and removing it
would encourage detection of features based on the User-Agent, which I believe
is already common practice because Accept doesn't report anything useful. See

<URL:http://www-uk.hpl.hp.com/people/ange/archives/archives-96/http-wg-archive/0087.html>

As we're going through all the work to provide a "standards" based layout mode
and a "quirk" based layout mode for legacy compatibility, we should provide the
server side an easy way to tell what standards the browser supports so it can
deliver a standards based document if the browser is built for it. By performing
correct negotiation indicating a preference for stylesheets, XML, etc., server
users can take advantage of browsers that handle standards if they say they can,
and fall back to quirk mode if they don't say anything. It should be the other
way around, but standard practice is to write to what people use, not to what
they should be using.

In addition, if Accept gave a clue to additional MIME types supported through
plugins and compiled-in features (SVG, MathML, Java), web content authoring
would be greatly simplified, as now the author has to go through convoluted
proprietary "probes" or awkward "select this page for a flash version, select
this page for a java version".

Reflecting the user preferences (using "q" values of zero when the user has
images turned off, java turned off, javascript turned off) would also simplify
web authoring in that the "select this page for a non-javascript version, a text
version of this page, etc." pages could be automated, and content producers
would benefit by getting stats as to how many people want their images, etc.
(Privacy needs to be kept in mind, but if that's a factor, the User-Agent should
be removed for it to have any teeth. Perhaps "Privacy" should be a separate
feature?)

Anyway, as long as "*/*" is present, and only standards are listed in the Accept
and not proprietary extensions (which won't happen in Mozilla), the Accept
header can be a useful flag to allow content authors on the server side to know
when it's o.k. to deliver "standard" content, as they won't deliver it now
because standards based content won't look good and/or behave well on the common
denominator browser.
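
A minimal Python sketch of the idea in the description above: an Accept value
that keeps "*/*" but marks types the user has switched off with q=0. The
function name, the preference keys, the type list, and the 0.1 weight on the
wildcard are all hypothetical choices for illustration; none of them is taken
from the attached patch.

```python
def build_accept(prefs):
    """Return an Accept header value that reflects user preferences.

    `prefs` is a hypothetical dict of booleans, e.g. {"images": False}.
    A missing key is treated as "enabled".
    """
    # (media type, preference key that controls it); illustrative list only
    types = [
        ("text/html", None),
        ("text/css", "stylesheets"),
        ("image/png", "images"),
        ("image/jpeg", "images"),
    ]
    parts = []
    for media_type, key in types:
        enabled = key is None or prefs.get(key, True)
        # q=0 means "not acceptable"; with no q parameter the default is 1
        parts.append(media_type if enabled else media_type + ";q=0")
    # Keep the wildcard, at an assumed low weight, so nothing is refused outright
    parts.append("*/*;q=0.1")
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_accept({"images": False}))
```

With images turned off, the two image types carry q=0 while everything else
keeps its default weight.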
(Reporter)

Comment 1

17 years ago
Created attachment 18025 [details] [diff] [review]
adds stds supported to Accept, uses "q" to reflect preferences

Comment 2

17 years ago
Declaring that "MIME based content negotiation has died" is foolish.  Consider
that Apache _already_ supports content negotiation.  Sure, right now the only
real use is for server-side negotiation, but the negotiation algorithm has been
written for user agents that send useful "Accept:" headers.

At least one real practical use of content negotiation is for serving XML
content with XSL stylesheets to user agents that accept those types.  I'm
certain I'm /not/ the only person that was anticipating using MIME based content
negotiation for this.
(Reporter)

Updated

17 years ago
OS: Linux → All
Hardware: PC → All

Comment 3

17 years ago
Accept headers should also reflect supported types of inline resources (like
IMG, EMBED, and OBJECT sources, or images specified in background attributes).

For example, given the HTML snippet:

...
<img src="/image" />
<object src="/object" />
...

The corresponding accept headers could be sent in their respective requests
(assuming the appropriate plugins are installed and enabled):

Accept: image/png, video/x-mng, image/x-jng, image/jpeg, image/pjpeg, \
        text/*; qs=0, application/*; qs=0
Accept: video/quicktime, audio/x-pn-realaudio, application/x-shockwave-flash, \
        application/java, text/*; qs=0

This indicates that only image types that can be displayed inline are acceptable
variants for the IMG tag, and that only embeddable objects are acceptable for
the OBJECT tag.

The basic idea here is to only specify Accept headers for what Mozilla really
can accept in the context of that request.  The most inclusive set of Accept
headers will most likely be sent when requesting the resource specified in a
link, bookmark, or a typed in URL.  But any types that can only be displayed
inline should have "qs" value of zero for these types of requests.  For example,
the Accept header when I type in http://foo.net/object into my URL textfield
would be:

Accept: text/*, image/*, ... application/x-shockwave-flash; qs=0, ...

since Flash can only be viewed inline.  If the resource http://foo.net/object is
a negotiable resource, but only the object.swf variant is available, the server
will respond with "406 Not Acceptable" and a list of variants with the single
entry "object.swf".  The user can then request the variant explicitly or Mozilla
can just display the "pick an application or save to disk" dialog directly.
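
The server-side behaviour described in this comment can be sketched roughly as
follows. This is an illustration under assumptions, not how any particular
server implements it: the Accept header is assumed to be already parsed into a
dict of media range to quality value, and the function name and return shape
are hypothetical. In an HTTP/1.1 request the quality parameter is spelled "q".

```python
def negotiate(accept, variants):
    """Pick a variant for a request, or report 406 with the variant list.

    accept:   dict mapping media range -> q value (assumed pre-parsed)
    variants: dict mapping variant name -> media type
    Returns (200, chosen_name) or (406, sorted list of variant names).
    """
    def quality(media_type):
        main = media_type.split("/")[0]
        # The most specific matching range wins: exact, then type/*, then */*
        for media_range in (media_type, main + "/*", "*/*"):
            if media_range in accept:
                return accept[media_range]
        return 0.0  # not mentioned at all: not acceptable

    scores = {name: quality(mtype) for name, mtype in variants.items()}
    acceptable = {name: q for name, q in scores.items() if q > 0}
    if not acceptable:
        return 406, sorted(variants)
    return 200, max(acceptable, key=acceptable.get)


if __name__ == "__main__":
    typed_url = {"text/*": 1.0, "image/*": 1.0,
                 "application/x-shockwave-flash": 0.0}
    print(negotiate(typed_url, {"object.swf": "application/x-shockwave-flash"}))
```

When object.swf is the only variant and Flash carries a quality of zero, the
sketch reports 406 together with the single-entry variant list, matching the
scenario in the comment.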

Comment 4

17 years ago
i personally don't like your detection algorithm for svg or mathml.

The code should do simple enumeration of plugins and components instead of 
cheating based on compile time flags (which may or may not reflect the real 
world install)

marking patch and review. I'd expect a real reviewer to have other suggestions. 
Thank you for offering this start.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: patch, review
(Reporter)

Comment 5

17 years ago
Created attachment 19160 [details] [diff] [review]
revised to enumerate plugins within Accept as per suggestions
(Assignee)

Updated

17 years ago
Blocks: 61682

Comment 6

16 years ago
Rumors of content negotiation's death are greatly exaggerated. Let's not try to
kill it with harmful and worthless headers like "Accept: */*". Even something as
simple as "text/html, */*;qs=0" would be a great improvement, for sites that
serve WML or HTML content based on the content of the Accept: header.

Updated

16 years ago
Depends on: 65092

Updated

16 years ago
No longer depends on: 65092

Updated

16 years ago
Blocks: 65092

Updated

16 years ago
Component: Networking → Networking: HTTP
(Assignee)

Comment 7

16 years ago
I'm all for the goal of this patch, but I think that the accept header
generation should be moved into the HTTP handler, with appropriate
pref change notifications.
(Assignee)

Comment 8

16 years ago
reassigning to myself.
Assignee: gagan → darin
(Assignee)

Comment 9

16 years ago
*** Bug 68425 has been marked as a duplicate of this bug. ***
text/ecmascript is not a valid MIME type; don't use it.  Also, what about
;version= parameters?  Hmm.

Some versions of mosaic, back in the infancy of the web, sent over 4K of Accept:
header value.  Do not get me wrong in what I'm about to write: real, working,
scalable content negotiation is a laudable goal that should be implemented well
and standardized based on proven practice.  However, in my opinion, sending that
>4K Accept header was more foolish than sending */*, because in the real world,
*/* performs *much* better, and *almost always* does the right thing, compared
to such exhaustive but never quite right (version? plugins?) bandwidth pigs.

Any Mozilla patch has to be hard-headed about the trade-offs here.  I do not
think we should send an Accept: header of unbounded length on every (or even on
every initial, for a kept connection) HTTP operation.  Can we find a compromise
position, perhaps not as degenerate as Robin's text/html, */*;qs=0 ?

/be
Of course, text/javascript is not a valid MIME type, either.

Looking at the latest patch, I see an attempt to enumerate plugins.  What I
meant by my "versions? plugins" parenthetical aside, the "plugins?" part, was:
what about the null plugin, or other plugin-finding mechanisms that can help you
reconfigure your browser on the fly to understand a MIME type it heretofore did
not understand?

Has anyone measured the length of Mozilla's headers with the patch applied?  Are
we about to break the 4K record?

/be
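
One rough way to answer the length question is to measure the header line that
a given list of types would produce. The Python sketch below does only that;
the function name and the generated "application/x-example-NNN" types are made
up for illustration and do not come from the patch or from any real plugin
list.

```python
def accept_header_size(mime_types):
    """Return the size in bytes of the Accept header line, including CRLF."""
    line = "Accept: " + ", ".join(mime_types) + "\r\n"
    return len(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical worst case: 200 plugin-registered types
    many = ["application/x-example-%03d" % i for i in range(200)]
    for label, types in (("*/*", ["*/*"]), ("200 types", many)):
        size = accept_header_size(types)
        print(label, size, "bytes", "(over 4K)" if size > 4096 else "")
```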
havill@redhat.com - how long is the Accept header under your patch?

Gerv
(Assignee)

Comment 13

16 years ago
Putting a future milestone on this for now... will change when we come to a
conclusion about what we're going to do. 
Target Milestone: --- → Future
(Reporter)

Comment 14

16 years ago
Gerv: it depends on the number of plugins. The primary idea was to give
content-negotiators a way to:

a) know if flash or some other heavily used plugin that could completely affect
whether a page was rendered was available or not so a text version of the page
could be displayed

b) differentiate between "standards knowing" browsers and "standards ignorant"
browsers without having to resort to the User-Agent... hence the attempt at the
"ecmascript" entry as well as javascript so that an agent would know when it
could deliver JavaScript that was standards compliant. Same goes with stylesheets.

Other than the plugins, the strings in Accept are hardcoded, so you could insert
parameters such as "; version=" in there, except I know of no
content-negotiating server (Apache comes to mind) which would understand it with
the default modules... you'd need custom code.

Also, you can't get too funky and/or deliver too long an Accept line,
because this seems to spook some web servers out there (ZDNet's Netscape
Enterprise appears to barf if you attempt to break the Accept line into a
multiline continuation, which is legal under HTTP, for example... that code is
deactivated by default via a macro in the patch).
>hence the attempt at the "ecmascript" entry as well as javascript so that an
>agent would know when it could deliver JavaScript that was standards compliant.

Don't promulgate invalid MIME types.

Don't solve problems that do not exist in the real world -- the problem with JS
is not that implementations still present in significant numbers in the field do
not understand an ECMA standard (which Edition of ECMA-262 do you mean?  You
have to say; and then there is Chapter 16 of ECMA-262 Edition 3, which *allows
for syntactic and semantic extensions so long as they have certain properties*).
The problem with "JavaScript" is all the DOM level 0, 1, and 2 bugs,
incompatible extensions, and standards ambiguities and other errata, which
cannot be summarized by a MIME type.

The same emphatically goes for style sheets, although CSS is a slightly younger,
_de jure_ standard that continues to evolve via design-by-committee.  Also,
style sheets (like JS in many cases) augment pages that can still be used quite
well without style sheet support.

In general, the problem I'm describing applies to all sufficiently evolved and
complicated types.  HTML has several official versions, lots of extensions and
bugs, and the modern forms cite a DTD or scheme in a tag in the doc.  The MIME
type is not nearly enough, wherefore the fairly degenerate text/html,
undecorated by version parameter.

Maybe this is not the problem you're trying to solve, but it is *the problem*
content authors wrestle with: varying levels of buggy conformance to
multi-versioned specs that are themselves buggy (that have unpublished or at
best unofficial errata).  I'm harping on this because you cited JavaScript and
proposed ecmascript.  Probably you have a better example.

Regarding your point (a), what about the plugin-find/download/install capability
of many modern browsers?  Why should my server not ship my compelling and
expensively developed flash content to someone who may well be able to download
and interpret it?  We need something more than an "I currently accept" header. 
Perhaps an exclusion list, a Reject: header?

Multiline continuations, 4K Accept headers, non-exclusive and therefore useless
assertions about transient and upgradeable capabilities...  This is a recipe for
much worse performance, and less user satisfaction, than Accept: */* gives.

Sorry to throw stones.  I'm not here to solve the content negotiation problem. 
But I'm not going to let Mozilla go down the early-mosaic 4K-header path, or to
naively promulgate bogo-MIME-types that cannot express the detailed, bug-wise
compatibility negotiation needed in the DOM world, which is better done by
"client sniffing" and "object sniffing" (barring yet better techniques).

/be

Comment 16

16 years ago
At the very least, the Accept header (IMO) should explicitly list those MIME
types that it can handle internally and is most likely to encounter, namely
text/html, image/png, image/jpeg, and image/gif (with the appropriate q values,
something few browsers seem to bother with), with something on the order of
*/*;q=0.1

This would at least enable servers to selectively send PNG images to Mozilla and
GIF to older versions of Navigator, or send HTML to Mozilla and WML to WAP
browsers.  This would also put it on par with Communicator's support for the header.

It would be very useful to have even this basic support, with the arguments
about whether to include plugin or helper MIME types coming after the header
actually does something.  At the moment it's basically worthless.
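
To make the role of q values concrete, here is a small Python sketch of how a
server might read such a header and rank the listed media ranges. It is a
simplified illustration (it ignores parameters other than q and does not
validate the syntax), the function name is hypothetical, and the sample header
is only an example in the spirit of this comment.

```python
def parse_accept(header):
    """Parse an Accept value into (media_range, q) pairs, best first."""
    ranges = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        if not fields[0]:
            continue  # skip empty list elements
        q = 1.0  # default quality when no q parameter is present
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q as "not acceptable"
        ranges.append((fields[0].lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = "image/gif;q=0.9, text/html, image/png, */*;q=0.1"
    for media_range, q in parse_accept(sample):
        print(media_range, q)
```

A server holding both a PNG and a GIF variant could then prefer the PNG for a
client that ranks image/png above image/gif.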

Comment 17

16 years ago
Are we going to send gigabyte accept strings, always?  Wouldn't it be saner
(albeit probably as nonstandard as everything) to offer Accept fields based
on what we're retrieving? E.g., when we try to get an <img>, send which image/*
things we accept.  When we try to get a <script>, list the things that we know
are legal script MIME types (none, or if you think application/x-javascript is
legal ...). If we are looking at <style>, send Accept: text/css. For iframe,
accept text/plain, text/html, text/xml. For object, send a list of all the
plugins that the user doesn't mind disclosing (nothing by default, because *I*
mind).

sorry, i haven't looked at the code in a while. but i do remember that we could 
send extremely long, useless and privacy violating strings.  Oh, and if the 
user disables style sheets don't send text/css in the accept header. similarly
Netscape added a progressive jpeg type to the useless */* when it started
supporting progressive jpegs (3.x?).  The same could be done for PNGs and MNGs.
I don't see the point of text/html, image/jpeg or image/gif, however.

/be

Comment 19

16 years ago
Not listing text/html in the Accept header on browsers like Mozilla and IE has
made it tricky to implement HTML vs. WML content negotiation.  Sure, I could
maintain a big long list of every WAP browser, emulator, and cell phone out
there, and update it every time someone released a new one, but I'd much prefer
to use the Accept header.  The problem is that Mozilla's */* string indicates
that it accepts both text/html and text/vnd.wap.wml EQUALLY.  Clearly, Mozilla
would prefer HTML over WML, but it expects the server to guess this rather than
telling it.  (Kind of like the classic, "If you don't know what's wrong, I'm not
going to tell you!" syndrome.)  If you list text/html explicitly, and give */* a
lower q value, you can eliminate this type of problem (at least for Mozilla).

Incidentally, the Reject: header someone mentioned isn't necessary.  If I
remember the specs correctly, you can do this in the Accept header by including
something like "text/whatever;q=0"
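
The HTML versus WML problem can be shown with a short Python sketch that looks
up the quality value a header assigns to one media type. It is a simplified
illustration: the function name is hypothetical, only the q parameter is read,
and the most specific matching range is taken to win.

```python
def q_for(media_type, accept):
    """Return the q value that an Accept header assigns to media_type."""
    media_type = media_type.lower()
    main = media_type.split("/")[0]
    best = (-1, 0.0)  # (specificity, q); 0.0 if nothing matches
    for item in accept.split(","):
        media_range, _, params = item.strip().partition(";")
        media_range = media_range.strip().lower()
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        if media_range == media_type:
            specificity = 2
        elif media_range == main + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue  # this range does not cover media_type
        if specificity > best[0]:
            best = (specificity, q)
    return best[1]


if __name__ == "__main__":
    for header in ("*/*", "text/html, */*;q=0.1"):
        print(header, "->",
              "html", q_for("text/html", header),
              "wml", q_for("text/vnd.wap.wml", header))
```

With a bare "*/*" both types come out equal, so the server has to guess; with
text/html listed explicitly and a lower q on the wildcard, HTML clearly wins.
The same lookup returns zero for a type listed with q=0, which is the
exclusion mechanism mentioned above.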

Comment 20

16 years ago
> Regarding your point (a), what about the plugin-find/download/install
> capability of many modern browsers?  Why should my server not ship my
> compelling and expensively developed flash content to someone who may
> well be able to download and interpret it?

Sending meaningful Accept headers doesn't preclude sites from continuing to
foist whatever content they want upon user's browsers.  It does mean that sites
have reliable information from which to make a decision of which version to
send.  When the choice is between sending your compelling flash content versus
sending the also expensively developed, not quite so compelling, but more widely
supported animated GIF content, having meaningful Accept headers can mean the
difference between transmitting a version that the user will see or won't
because they don't have flash and don't want to be bothered downloading.

I value being able to browse without constantly being interrupted by dialog
boxes asking me to go off on some entirely different task than the one I was
doing.  I don't buy the argument that choosing a less optimum version -- but
doing so transparently -- results in less user satisfaction than prompting the
user to go install additional software just to complete the task at hand.

> We need something more than an "I currently accept" header. 
> Perhaps an exclusion list, a Reject: header?

That's what "qs=0" is for.
> When the choice is between sending your compelling flash content versus
> sending the also expensively developed, not quite so compelling, but more
> widely supported animated GIF content, having meaningful Accept headers can 
> mean the difference between transmitting a version that the user will see or 
> won't because they don't have flash and don't want to be bothered downloading.

Most flash sites I know of do not have any expensively developed animated GIF
alternate content.  But we're trading synthetic arguments.  Someone could survey
the popular sites and report back, but that doesn't alter my fundamental point:
Mozilla should not send long, unbounded Accept headers.

Someone please propose a short, bounded, useful Accept header.

> That's what "qs=0" is for.

Right, q=0.

/be
IMO, someone should propose a set of Accept: headers we should send depending on 
context. Contexts could include:

General (<A> tag, typed in HTTP URL, FRAME src)
IMG href
OBJECT src
APPLET src
SCRIPT src

This way, we keep the length of the header down in many cases. The General one 
could be less detailed, and the other ones more specific.

However, there's a possible disadvantage to this in that a server could 
theoretically send back different content if an image is requested directly as 
opposed to as part of a page (because different Accept: headers would be sent.)

Gerv
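
As a rough sketch of the context idea above, a lookup table keyed by request
context would be enough to keep each header short. Everything in this Python
example is hypothetical: the context names, the header values, and the
function name are placeholders for discussion, not a proposal for the actual
defaults.

```python
# Hypothetical per-context Accept values; the real lists are still undecided
ACCEPT_BY_CONTEXT = {
    "general": "text/html, text/plain, image/png, image/jpeg, */*;q=0.1",
    "img": "image/png, image/jpeg, image/gif;q=0.9, */*;q=0.1",
    "object": "application/x-shockwave-flash, video/quicktime, */*;q=0.1",
}


def accept_for(context):
    """Return the Accept value for a request context.

    Unknown contexts fall back to the general (least specific) header.
    """
    return ACCEPT_BY_CONTEXT.get(context, ACCEPT_BY_CONTEXT["general"])


if __name__ == "__main__":
    for context in ("general", "img", "applet"):
        print(context, "->", accept_for(context))
```

The caller has to tell the network layer which context a request belongs to,
which is the extra plumbing this approach would need.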
(Assignee)

Comment 23

16 years ago
Gerv: in HTTP all content is referenced individually and therefore directly.

Comment 24

16 years ago
Gerv: The server side Vary header disables caching on most of today's caching
proxies. So from the application developer's view, it is much more clever to use
content negotiation only for the *containing* HTML, generating non-negotiating
links to inline objects. So the application implementor needs a full Accept
header on the *first* object she sends.

Why does nobody suggest providing an interface similar to
Preferences/Navigator/Languages to help people compose their Accept header?
(Assignee)

Comment 25

16 years ago
I like the idea of the Accept header being a preference.  I'm not sure about
making it a user friendly preference... what would that look like?  Just an
edit box perhaps?  ...but that's not very user friendly (and this preference
would have the side effect of preventing a server from doing user-agent based
content-negotiation).

Anyways, I'd really like us to *at least* settle on a simple static Accept
header that would be useful.  From the discussion, it sounds like we could
come up with something simple, static, and far better than */*.  

The issue of whether or not to expose plugins via Accept is IMO less urgent.

So just to get the ball rolling, how about:

   Accept: text/html, image/jpeg, image/png, */*;q=0.1

I've left image/gif out, thinking that image/png should take precedence...
but this is something I'm totally not sure about.

Comment 26

16 years ago
> a user friendly preference... what would that look like?

Two columns: type and preference.
Two buttons: add and edit

Clicking on a line highlights the line, enables editing. Clicking on Edit opens
a window to edit the selected preference. Doubleclicking does both at once.
Clicking on Add opens the Edit window empty. The window has only the two fields.
The type field takes MIME types. The preference field takes numbers between 0
and 1. Columns should be sortable.
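
Behind such a dialog, the two columns would have to be checked and turned into
a header value. The Python sketch below shows one possible way; the function
name, the loose media-type pattern, and the choice to omit q when it equals 1
are assumptions made for the example.

```python
import re

# Loose check for "type/subtype", "type/*" or "*/*"; not a full grammar
_MEDIA_RANGE = re.compile(r"^[\w.+-]+/[\w.+*-]+$|^\*/\*$")


def accept_from_table(rows):
    """Build an Accept value from (media type, preference) rows.

    Raises ValueError for a malformed type or a preference outside 0..1.
    """
    parts = []
    for media_type, q in rows:
        if not _MEDIA_RANGE.match(media_type):
            raise ValueError("not a media type: %r" % (media_type,))
        if not 0.0 <= q <= 1.0:
            raise ValueError("preference must be between 0 and 1: %r" % (q,))
        if q == 1.0:
            parts.append(media_type)  # q=1 is the default, so leave it out
        else:
            # Use at most three decimals and drop trailing zeros
            text = ("%.3f" % q).rstrip("0").rstrip(".")
            parts.append("%s;q=%s" % (media_type, text))
    return ", ".join(parts)


if __name__ == "__main__":
    print(accept_from_table([("text/html", 1.0), ("image/gif", 0.9),
                             ("*/*", 0.01)]))
```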

Darin: I know all content is referenced directly. That doesn't stop the 
reference having a context, though. The network library would need to be told 
the context in order to send the appropriate Accept: header.

If we make the Accept: header a preference, we can rely on it not being changed 
by the vast majority of users. So, the question then morphs into: "what will the 
default preferences be?" and we are no further forward.

Also, making it a preference may well lead to some hard-to-debug problems, if 
one person's Mozilla 1.0 gets sent different content from the same URL as 
another person's.

However, there definitely needs to be an official, generic way of e.g. MathML, 
SVG etc. and plugins to register their types for addition to the Accept: header.

Surely there was a design document for this? What did it say? Is it available?

Gerv
Indicating support for XML types is important for people who want to use MathML
and serve fallback pages to browsers that don't support XML. It is important
that XML types have a higher q value than text/html. People might have the
permissions to set up alternative pages with Apache's built-in content
negotiation but not have the permissions to execute elaborate Perl scripts.

I'd expect something like:
Accept: text/xml;q=1, text/html;q=0.9, image/png;q=1, image/jpeg;q=1,
image/gif;q=0.9, text/plain;q=0.8, text/css;q=1, */*;q=0.01

Of course application/xml and application/xhtml+xml should be added once the
relevant bugs are fixed.

Nominating for Mozilla 0.9.2 due to importance with authoring XHTML+MathML with
a HTML fallback.
Keywords: mozilla0.9.2
Right. Here's a proposal (which I will implement unless someone tells me why
it's unacceptable.) 

We make the value a pref, with _no_ user UI. This should not be exposed to the
user because its value is a function of the software they have installed, and is
not something which depends on users tastes.

I am led to believe that prefs can be updated by XPIs using signed scripts
(when that is working) so if you install MathML.xpi it could add
application/mathml+xml, or whatever the type is, by modifying the pref.

Having it as a pref also means that, in the end, people do have access to it if
they _really_ need to change it (or, at a pinch, plugin providers can provide a
script to fix it up), whereas if we hard code it, they don't. It also makes it
much easier for _us_ to update it later (e.g. when application/xml is
supported).

Eventually, I think we should be implementing Tim Taylor/timeless' suggestion
that we send a different value depending on the page context of the request.
This would enable us to be more verbose while keeping header lengths down, but
it requires substantially more work. In the meantime, unless someone wants to
improve upon it, I plan to go with Henri's suggestion to begin with, for all
requests.

Gerv
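
The pref update an installer would perform can be sketched as a pure string
operation. This Python example is only an illustration of the idea: the
function name is hypothetical, the pref is modelled as a plain string, and
"application/mathml+xml" is the placeholder type from the proposal above, not
a confirmed value.

```python
def add_accept_type(pref_value, media_type, q=1.0):
    """Return pref_value with media_type added, unless it is already listed.

    The new entry is placed before any "*/*" entry so the wildcard stays last.
    """
    entries = [e.strip() for e in pref_value.split(",") if e.strip()]
    names = [e.split(";")[0].strip().lower() for e in entries]
    if media_type.lower() in names:
        return ", ".join(entries)  # already registered; nothing to add
    new_entry = media_type if q == 1.0 else "%s;q=%g" % (media_type, q)
    if "*/*" in names:
        entries.insert(names.index("*/*"), new_entry)
    else:
        entries.append(new_entry)
    return ", ".join(entries)


if __name__ == "__main__":
    pref = "text/html, */*;q=0.1"
    pref = add_accept_type(pref, "application/mathml+xml")
    print(pref)
```

Running the update a second time leaves the value unchanged, so a reinstall
would not duplicate the entry.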

Comment 30

16 years ago
Gervase, thanks for the vote of confidence on my context based Accept headers.

The first half of my comment is the useful part.  The second half where I
recommend "q=0" for types that are typically embedded inline wasn't very well
thought out.  Specifically, flash can be viewed via a direct URL.  Generally
it's probably not a useful feature even assuming that web servers would respond
as I described.  Besides, "q=0" for a specific mimetype isn't that necessary
when it would otherwise be "q=0.01" per Henri's suggestion.
OK, patch coming up. However, it doesn't compile :-( I need some help figuring
out why.

There's a function in nsHTTPHandler: SetAcceptEncodings. There are only three
references to it in the entire source base - it's defined once and called twice.
No headers.

Yet, when I clone this function and rename it to SetAccept, the compile fails:

nsHTTPHandler.cpp:651: no `nsresult nsHTTPHandler::SetAccept (const
char *)' member function declared in class `nsHTTPHandler'

Anyone see what I'm doing wrong?

Gerv
Created attachment 33963 [details] [diff] [review]
Patch - won't compile :-(
Hmm. nsHTTPHandler.h is checked into CVS. Shouldn't it be generated from the
IDL?

Anyway, I got it to build. Patch coming...

Gerv
Created attachment 33974 [details] [diff] [review]
Patch v.1
OK, so the post-rearch patch is even simpler than the pre-rearch one :-)
Coming right up, looking for r=. There's a thread about this on n.p.m.netlib and
n.p.m.porkjockeys if people have comments (particularly about exactly what the
Accept: header should be.)

Gerv
Created attachment 34241 [details] [diff] [review]
Post-rearch patch. V.2
darin: 

Can you review this, please? Discussion in the newsgroups is moving towards
finding the "right" value for our default Accept: header, but I'd like to get
the back end plumbing in before 0.9.1. We can change the header later with ease.

Gerv
Setting milestone.

darin: Will you be able to review this, or are you able to find someone else to
do it? 

Gerv
Target Milestone: Future → mozilla0.9.1
(Assignee)

Comment 39

16 years ago
r/sr=darin

Comment 40

16 years ago
r=bbaetz

Comment 41

16 years ago
I checked in the patches for Gerv.
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
VERIFIED with ethereal.
Status: RESOLVED → VERIFIED