Closed Bug 201195 Opened 21 years ago Closed 17 years ago

Generic XML MIME Types (application/xml and text/xml) should have lower priority than specific XML MIME Types like XHTML (application/xhtml+xml) in HTTP Accept Request Headers

Categories

(Core :: Networking: HTTP, enhancement)


Tracking


RESOLVED FIXED
Future

People

(Reporter: Christian.Hujer, Unassigned)

Details

(Whiteboard: looking for new networking owner)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030401
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030401

The HTTP Accept-Header of Mozilla's HTTP Requests currently looks like this:
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
That means that text/xml, application/xml and application/xhtml+xml are all
equally preferred.
That's not really intended, and it's not convenient. The generic types
text/xml and application/xml should not be given the same preference as the more
specific type application/xhtml+xml (and image/svg+xml in SVG-enabled browsers).
Instead they should have slightly lower priority.

I suggest changing the Accept header like this:
application/xhtml+xml,text/xml;q=0.95,application/xml;q=0.95,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
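To make the effect of these q values concrete, here is a minimal self-contained sketch (in the same spirit as the J2EE snippets elsewhere in this bug, but not taken from any Mozilla or servlet API; the class and method names are illustrative) of how a server could parse an Accept header and pick the available representation with the highest q value:

```java
import java.util.*;

public class AcceptNegotiator {
    // Parse an Accept header into a map of media type -> q value.
    // Types without an explicit q parameter default to q=1.0 (per RFC 2616).
    static Map<String, Double> parse(String accept) {
        Map<String, Double> prefs = new LinkedHashMap<>();
        for (String part : accept.split(",")) {
            String[] pieces = part.trim().split(";");
            double q = 1.0;
            for (int i = 1; i < pieces.length; i++) {
                String p = pieces[i].trim();
                if (p.startsWith("q=")) {
                    q = Double.parseDouble(p.substring(2));
                }
            }
            prefs.put(pieces[0].trim(), q);
        }
        return prefs;
    }

    // Pick the available representation the client prefers most;
    // unlisted types fall back to the */* wildcard's q value.
    static String negotiate(String accept, List<String> available) {
        Map<String, Double> prefs = parse(accept);
        String best = null;
        double bestQ = -1.0;
        for (String type : available) {
            double q = prefs.getOrDefault(type,
                    prefs.getOrDefault("*/*", 0.0));
            if (q > bestQ) {
                bestQ = q;
                best = type;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String accept = "application/xhtml+xml,text/xml;q=0.95,"
                + "application/xml;q=0.95,text/html;q=0.9,*/*;q=0.1";
        // With the proposed header, a server holding both variants
        // serves XHTML rather than plain HTML.
        System.out.println(negotiate(accept,
                Arrays.asList("text/html", "application/xhtml+xml")));
        // prints: application/xhtml+xml
    }
}
```

With the current header (all three XML types at an implicit q=1.0), the same logic would have no reason to prefer application/xhtml+xml over generic XML, which is exactly the problem this bug describes.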

Bye

Reproducible: Always

Steps to Reproduce:
1. Send an HTTP request
2. Look at the Accept header


Actual Results:  
Mozilla advertises generic XML MIME types (application/xml and text/xml) with
the same precedence as specific ones (e.g. application/xhtml+xml).

Expected Results:  
Specific XML-based MIME Types (e.g. application/xhtml+xml) should be preferred
over generic XML MIME Types (application/xml and text/xml).
Confirming as an apparent non-dup valid enh req.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Gerv, this is all you
Relevant comment from n.p.m.browser:

Seairth Jacobs wrote:
> I didn't say that one should be preferred over the other.  I said that
> "text/xml" should no be preferred over "text/html".  Give them the same
> quality value.  That way, the server can make the decision.  If a server
> decides to return text/xml plus stylesheet, it can.  If it still wants to
> return text/html, it can.
> 
> If you want to give text/xhtml+xml a higher quality than text/html, that
> makes sense.  In this case, you are specifying a preference between two
> specific vocabularies.
> 
> As for when "text/xml" should have a higher preference to "text/html", my
> answer is "never".  Like it or not, legacy HTML formats are going to be on
> the web for a long, long, long time to come.  Making them second-class
> citizens to the generic text/xml format will only cause users to find
> another browser that doesn't (again, imho).

Gerv
i'm keen on anything that will reduce the length of our Accept header (the fewer
bytes we send with each and every request the better), so if i follow gerv's
point, he's suggesting that we do away with the q=0.9 following text/html?  in
fact, i don't really understand why we make such a big deal out of saying we
prefer this type more than that type.  heck, we are simply able to handle any of
these types, so why differentiate?  we don't need to tell servers that xhtml is
better than html, they already know that.  /sigh/
Target Milestone: --- → Future
Why would you want to do content negotiation with a specific type vs. a generic
type? Mozilla prefers XHTML over HTML in order to allow content negotiation of
real XHTML + MathML vs. IE + ActiveX soup.

OTOH, the XML content sink doesn't support incremental loading yet. So in that
sense it would make sense to prefer text/html.

The entire HTTP request fits in one TCP packet, and dropping " en-US;" from the
UA string would save us 7 bytes. After all, sniffing for UI language is bogus,
because servers should use Accept-Language.
The original point in this bug suggests a change to (with some edits to reduce
length):

application/xhtml+xml,text/xml;q=0.9,application/xml;q=0.9,text/html;q=0.8,
text/plain;q=0.7,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1

In other words, xhtml+xml first, then xml, then html, then plain.

The comment I quoted from the newsgroup suggests making text/xml (and presumably
application/xml) the same priority as text/html; i.e. changing the 0.8 to a 0.9
in the string above.

Re: the length, I agree (and have been working hard to make it so) that we
should keep the length down. But, we had a big discussion about this last time
round and the current set are all in there for good reasons.

Gerv
Re: length

I think saving a few bytes is not a good reason for fiddling with the q values.
There are better ways of saving a few bytes here and there such as omitting
Accept-Language on style sheet and script requests and omitting Accept-Charset
on image requests.

To put things into perspective, Mozilla's HTTP requests are tiny compared to the
HTTP requests WAP gateways can make. A Nokia cell phone accessing a Web site
through a WAP gateway can advertise dozens of MIME types in the Accept header!
(Take a look at http://nds.nokia.com/uaprof/N3650r100.xml and imagine most of
that information plus some more formulated as HTTP request headers.)

Of course, the difference is that Mozilla's HTTP headers travel end-to-end in
any case--even when Mozilla is behind an old-fashioned PPP dialup connection.


Re: what to accept and with what q values

Should Mozilla be advertising application/xml and text/xml in the Accept header
at all? As for sending out q values, does it make any sense to do content
negotiation with the other alternative being */xml?

Mozilla really should prefer text/html over arbitrary, a priori unknown in
meaning, make-up-tags-as-you-go XML. HTML tags carry meanings that are generally
known. (Arguably the meanings could be better defined, but that's not the
point.) OTOH, private XML vocabularies may seem to have meaning to whoever wrote
the document, but for something that doesn't have any idea of the meanings of
the elements a priori, the private vocabularies are effectively meaningless.
(Adding a CSS presentation to a document tree with unknown namespaces and
generic identifiers does *not* fix this basic problem.)

So *if* the generic XML content types are kept in the Accept header, it would
actually make sense to make their q values lower than the q value of text/html.

Is explicitly accepting */xml any more useful than explicitly accepting
application/octet-stream? If the resource typed */xml is the only resource
available and there's no negotiation, */* catches it anyway and the resource is
sent to Mozilla. But are there any good use cases where there are two or more
resources one of which has the type */xml and content negotiation makes sense?
We could add multipart/x-mixed-replace. That's something people are likely to
want to sniff on.

I really don't think that we're going to have people doing server side sniffing
for bmp files. We should only have html(/xhtml/etc), and then other 'uncommon'
file formats. Then in 5 years' time, when all browsers support MNGs and no one
bothers to do server side parsing, we can remove that, and add something else.

That isn't the intent of the HTTP spec, mind you, but I think that it does make
logical sense. I think.
One thing about the length / size of the header:
As long as the whole header does not exceed one MTU / TCP packet, I really wouldn't
bother making the Accept header a bit longer.
The current header of Mozilla is this: 
GET / HTTP/1.1 
Host: localhost:3129 
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030401 
Accept: 
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1 
Accept-Language: de-de,de;q=0.8,en-gb;q=0.6,en;q=0.4,pl;q=0.2 
Accept-Encoding: gzip,deflate 
Accept-Charset: UTF-8,* 
Keep-Alive: 300 
Connection: keep-alive 
 
The header size is approx. 400 bytes. That's not even half of an MTU. Whether the request
header is 400 or 800 bytes does not make much difference. Even with a 28,800 bps modem
you get 3,600 bytes per second.
 
So 100 bytes more take only 1/36 of a second on a 28,800 bps modem, which is 28 ms. For 10
subsequent requests, e.g. a page containing 9 images, it's 1,000 bytes more, which still takes
less than half a second to transmit.
 
I'd really not be picky about 10 or 20 bytes more of a header, especially if the difference
between the current size and the previous size is less than the size of some URIs.
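The back-of-the-envelope arithmetic above can be checked in a few lines; this is just the comment's own calculation reproduced in code, not a measurement of real network behavior:

```java
public class HeaderOverhead {
    public static void main(String[] args) {
        // A 28,800 bps modem moves roughly 3,600 bytes per second.
        double bytesPerSecond = 28800 / 8.0;   // = 3600.0

        // Extra upload time for 100 additional header bytes on one request.
        double perRequestMs = 100 / bytesPerSecond * 1000;   // ~27.8 ms

        // A page needing 10 requests (1 page + 9 images): 1,000 extra bytes.
        double tenRequestsMs = 1000 / bytesPerSecond * 1000; // ~277.8 ms

        System.out.printf("%.1f ms, %.1f ms%n", perRequestMs, tenRequestsMs);
        // prints: 27.8 ms, 277.8 ms
    }
}
```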
 
 
Greetings 
 
/Christian 
> WE could add multipart/x-mixed-replace.. Tahts somethign people are likly to
> want to sniff on.

Problem is, most browsers which do support it don't advertise it.

It's becoming clear to me that the Accept: header should contain non-universal
formats, and should contain them as soon as we start supporting them (for
avoidance of the above situation.) I suppose this is an argument for retaining
MNG (see bug 189872).

Henri: no-one is suggesting fiddling with q values to save length; my slightly
reduced version and the original mean exactly the same thing.

Christian Wolfgang Hujer: the fact that it fits into a single packet is not the
end of the story. If you load a complex web page, we can issue a lot of requests
at once. On a slow connection, every byte counts a little bit. We should
certainly not say that we can keep filling up Accept: until we hit the MTU limit. 

Gerv
Gerv, 
 
even if you issue several requests at once using HTTP/1.1 keep-alive, the speed from the
server to the client will always be the bottleneck, not the other way round. You issue the
first request, and after issuing the fourth request you're probably still receiving response
data from the first request.
I don't want to say okay, let's create a 1 MB request, but let's calculate the difference
between a 420 and a 440 byte header: it's 20 bytes per request. On 100 requests, that's
2,000 bytes of difference, i.e. two kB; with an old 28,800 bps modem it's still less than a
second.
The bottleneck is the responses, not the requests.
I didn't want to say let's fill up the MTU limit. I just wanted to say hey, don't be picky
about 10 or even 50 bytes in the Accept header.
 
/Chris 
Without wanting to complicate things further... is there any chance the XUL
media type could be added? ;o)

There is presently a discussion on news:netscape.public.dev.xul on how to best
detect a XUL-capable software agent -- see
http://groups.google.co.uk/groups?as_umsgid=5IpDa.10080%24lL2.142082%40news.chello.at


At present, server-side code has to check the "User-Agent:" header; here's a
Java/J2EE example:

    // Sniff the User-Agent header for the "Gecko/" token.
    String agentHeader = request.getHeader("USER-AGENT");
    if (null != agentHeader && -1 !=
          agentHeader.indexOf("Gecko/")) {
      out.println("XUL is supported.");
    } else {
      out.println("XUL is not supported.");
    }

A saner approach would be if we could examine the "Accept:" header instead:

    // Check the advertised media types instead of the product name.
    String acceptHeader = request.getHeader("ACCEPT");
    if (null != acceptHeader && -1 !=
          acceptHeader.indexOf("application/vnd.mozilla.xul+xml")) {
      out.println("XUL is supported.");
    } else {
      out.println("XUL is not supported.");
    }

Taking Gerv's suggestion, removing the presently defunct "video/x-mng" then
adding "application/vnd.mozilla.xul+xml" with the same level as
"application/xhtml+xml", we have:

application/xhtml+xml,application/vnd.mozilla.xul+xml,text/xml;q=0.9,
application/xml;q=0.9,text/html;q=0.8,text/plain;q=0.7,image/png,image/jpeg,
image/gif;q=0.2,*/*;q=0.1

Thanks

Dave

P.S. "OS" should be changed from "Linux" to "All"
Now that we have Gecko browsers without XUL support, I can see an argument for
adding XUL. The problem is, of course, that it's not there in all the previous
Mozillas, so checking for it won't be very sensible.

Checking for "Gecko/" and then !"Camino|Chimera" might do the trick, I suppose.

Have we missed the boat on adding the XUL media type?

Gerv
gerv: what about applications that embed gecko, but don't support XUL?  the list
is not limited to chimera and camino... more to the point, the list of non-XUL
apps could increase over time, so sniffing for the application name just doesn't
seem like a good solution at all to me.  sniffing for known applications works
if we can say that post |this version| of gecko, XUL will be advertized in the
Accept header.  that way, servers can read the Accept header to check for XUL,
and if it is not there, they can optionally fallback on guessing based on the UA
string.
It would be important to make sure that the Accept string really matches the
capabilities. For example, there's bug 188376, which wouldn't exist if
compile-time options could turn stuff on and off in the Accept header.

If XUL goes in the default Accept header, how do we make sure J. Random Embedder
takes it out if the embedding app doesn't do XUL?
> If XUL goes in the default Accept header, how do we make sure J. Random Embedder
> takes it out if the embedding app doesn't do XUL?

J. Random Embedder's browser doesn't work properly on sites which are sending
XUL to XUL browsers. J. Random Embedder's customers get irritated with J. Random
Embedder.

The same argument applies/applied to MNG, which could be compiled out for
embedding. Short of some very smart build-time way of building the header, we
just need to make sure it's documented that embedders should check that the
Accept header is still valid for their build.

Gerv
Sadly, it doesn't work that way. Even a product released by mozilla.org itself
(Chimera/Camino) had (still has?) a bogus Accept header for months.
Henri: so what do you suggest? Some way of assembling the Accept: header at
build time, based on the build options?

Gerv
Based on what I've seen people using or wanting to use, I'd say we need the
following types in the accept header:
XHTML (done)
MathML (unless --disable-mathml is specified)
XUL (unless --disable-XUL is specified)
SVG (if --enable-svg is specified)

Being able to detect MathML is significant because not all XHTML browsers
support MathML (in fact, only Mozilla and Firefox support MathML; Camino builds
without and no other browser has support). Since MathML is namespaced directly
into the page, there is a strong need to do server side content-type negotiation
based on the presence, or not, of MathML. At present, this means doing
user-agent detection (see e.g.
http://golem.ph.utexas.edu/~distler/blog/archives/000309.html for the type of
hacks this involves). 

Arguing against including XUL in the accept header because "some builds did
support XUL but don't have it in the accept header" seems like a fallacy to me.
Putting XUL in the accept header can't possibly make the situation worse; if
anyone is actually serving XUL, they are still free to detect based on UA or
whatever they are using at present. However putting XUL in the accept header
makes it easier for people to design cool XUL-enabled features and be sure that
people without XUL won't get junk.

If SVG support is compiled in, that should be advertised as well (perhaps with a
low q value); after all, an SVG-enabled Mozilla browser does accept SVG, even if
the support isn't perfect. It also gives those who would like to serve SVG the
opportunity to do so; at present there is no obvious way to distinguish browsers
where it is supported from those where it is not. This also means that SVG sites
are more likely to work with Mozilla if SVG ever makes the default builds.
I filed Bug 234170 on allowing the accept header to be set at build time.
> XHTML (done)
> MathML (unless --disable-mathml is specified)

Reiterating comment #5:
Mozilla prefers application/xhtml+xml over text/html, because
application/xhtml+xml can contain MathML (see old n.p.m.mathml archives). When
MathML is not enabled, preferring application/xhtml+xml is (arguably) wrong,
because then application/xhtml+xml provides no added value but text/html does
provide added value (incremental rendering).

Also, I'm inclined to think application/xml and text/xml are non-sensical as
conneg alternatives.
> Mozilla prefers application/xhtml+xml over text/html, because
> application/xhtml+xml can contain MathML

But a build containing SVG but not MathML would also 'prefer'
application/xhtml+xml over text/html because it could accept SVG content in the
XHTML. Opera or Safari might decide to 'prefer' application/xhtml+xml over
text/html, especially if support for, say, SVG appears in those browsers (or
even if it doesn't). None of this helps someone who wants to send MathML to
browsers that support it and png to browsers that don't. In fact, one might have
three versions of a page, one with MathML equations, one with SVG equations and
one with PNG equations. Content negotiation should allow one to decide which of
the three pages a browser can actually handle. 

Having said that, I agree that, in general, application/xhtml+xml should be
preferred in builds that support MathML or SVG. I just don't think that's sufficient.
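The three-variant scenario above can be sketched as server-side logic. This is hypothetical: it assumes, counterfactually for the browsers of the time, that a MathML-capable browser would advertise application/mathml+xml and an SVG-capable one image/svg+xml in its Accept header; pickVariant is an invented name, not from any real API.

```java
public class EquationVariantPicker {
    // Hypothetical sketch: choose which equation markup to serve, assuming
    // browsers advertised the formats they support in the Accept header
    // (as this comment argues they should, and as they mostly did not).
    static String pickVariant(String accept) {
        if (accept.contains("application/mathml+xml")) return "mathml";
        if (accept.contains("image/svg+xml"))          return "svg";
        return "png";   // universally safe fallback
    }

    public static void main(String[] args) {
        System.out.println(pickVariant(
            "application/xhtml+xml,image/svg+xml;q=0.8,*/*;q=0.1"));
        // prints: svg
    }
}
```

Without such advertisement, the server has no choice but the kind of User-Agent sniffing shown earlier in this bug.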
in all mozilla builds that don't use --disable-xul application/xhtml+xml can
contain xul as well...

(or so I hope. I haven't tested...)
See also bug 125682.

Gerv
Until everything is sorted out to be perfect, it would be nice if text/html were
preferred over generic XML. Example where this is annoying:
<http://www.w3.org/2001/tag/doc/versioning>. It would also be nice if
application/xhtml+xml were preferred over text/html. (Like it is today.) Not only
for SVG and MathML builds, but also because websites use it to serve Mozilla
XHTML with a somewhat more advanced style sheet and IE HTML.

Henri, are there bugs for Mozilla sending Accept-Language for style sheets,
scripts and such?
OS: Linux → All
Hardware: PC → All
Anne, I am not aware of a bug about Accept-Language. Strictly speaking, sending
Accept-Language is not wrong, because images, style sheets and scripts can
contain things that depend on the natural language skills of the reader and in
theory someone somewhere could use conneg with them.
We currently have the following:
|text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5|

How about changing that to:
|application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,application/xml;q=0.7,text/xml;q=0.6,image/png,*/*;q=0.5|

It is 12 characters longer and is a bit better IMHO. However, I think we can
drop application/xml and text/xml from the list. (At least, this solves the
problem when there is both a meaningless XML file and a semantic HTML one.)
Anne: but which is more likely - the case you mention, or the case where there's
both a legacy HTML and a modern XML version of a resource? That XML may well
not be XHTML.

I wouldn't want to rearrange the order of entries in the Accept: header (as
opposed to adding new ones) without a lot of discussion, because of potentially
breaking stuff - and it's really not a high priority right now...

Gerv
I think my case is more likely. There is no such thing as "modern XML". It is
useless for people to use XML without any semantics on the web without having an
HTML or XHTML alternative, which can be understood by most browsers. (Although
Google only understands the semantics of HTML (text/html) documents.)

With whom would this need to be discussed?
Assignee: darin → nobody
Flags: blocking1.9a1?
Flags: blocking1.8.1?
For what it's worth, the original rationale for these changes was bug 58040 comment 28.
Flags: blocking1.9a1? → blocking1.9+
This bug has been plussed for 1.9 but there doesn't seem to be any consensus on what change to make, if any...

Gerv
Whiteboard: looking for new networking owner
Fixed by checkins for bug 361892 and bug 309438
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED