Closed Bug 88614 Opened 24 years ago Closed 24 years ago

w3c.org/TR/xpath - old version of spec are displayed as just text

Categories

(Tech Evangelism Graveyard :: English US, defect, P4)

defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: nvshaji, Assigned: bc)

References

()

Details

(Whiteboard: [ETA ?] [PDT-])

Attachments

(1 file)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.2) Gecko/20010628
BuildID: 2001062815
The above URL seems to be displayed only as text.
Reproducible: Always
Steps to Reproduce:
1. Go to http://www.w3.org/TR/xpath
Confirming with 2001-06-28-15 under Win2k-SP1
Status: UNCONFIRMED → NEW
Ever confirmed: true
Are you using a build with XSLT enabled?
Confirming with: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.1+) Gecko/20010629 BuildID: 2001062921
The server is serving us XML without a stylesheet. This is a problem at the W3C end, and they are actually aware of it. Because of W3C's spec definitions, they MUST serve the HTML version because it is the normative spec. Their webserver is just confused because it sees we accept text/xml. It should ignore that and do XSLT transformation on the server to HTML or something... I believe peterv is in contact with the W3C people, reassigning.
Assignee: heikki → peterv
Component: XML → Evangelism
*** Bug 92422 has been marked as a duplicate of this bug. ***
P1--major site, major problem
Priority: -- → P1
Please don't set a priority on my bugs without consulting me. Thanks. I'll nudge my W3C contact again.
Severity: normal → major
Status: NEW → ASSIGNED
OS: Windows NT → All
Priority: P1 → P4
Hardware: PC → All
All Evangelism Bugs are now in the Product Tech Evangelism. See bug 86997 for details.
Component: Evangelism → US English
Product: Browser → Tech Evangelism
Version: other → unspecified
Max, any progress on getting the w3c server to feed the (authoritative) HTML version of the spec by default?
HTTP_ACCEPT = text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2, text/plain;q=0.8, text/css, */*;q=0.1 This is what I see us sending with 9/26/2001 Win32 branch build. You can see that the Q value for HTML is less than the Q value for text/xml. Servers honoring Q values will rank text/html lower than text/xml.
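A minimal sketch of how the Q values in the header above can be read off, assuming a simplified parser that ignores any media-type parameters other than q. The names parse_accept, ACCEPT and prefs are hypothetical and only illustrate the comparison; a type listed without q counts as 1.0.

```python
def parse_accept(header):
    """Split an Accept header into (media_type, q) pairs; q defaults to 1.0."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # HTTP/1.1: a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        entries.append((media_type, q))
    return entries


# The header quoted in the comment above.
ACCEPT = (
    "text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, "
    "image/png, image/jpeg, image/gif;q=0.2, text/plain;q=0.8, text/css, */*;q=0.1"
)

prefs = dict(parse_accept(ACCEPT))
# text/html carries an explicit 0.9, text/xml an implicit 1.0.
print(prefs["text/xml"], prefs["text/html"])
```

A server that honors these values will therefore see text/xml ranked above text/html.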
I modified my prefs.js by adding the following line, overriding the default setting: user_pref("network.http.accept.default", "text/xml, application/xml, application/xhtml+xml, text/html, image/png, image/jpeg, image/gif;q=0.2, text/plain;q=0.8, text/css, */*;q=0.1"); Note that this gives text/html the same implicit value of "1", making it as desirable as text/xml for the client browser. Under this setting, the proper HTML page is served and there is no problem. Since we are sending the accept-header with text/html ranking lower than text/xml, we are getting what we deserve -- though W3C should also serve a stylesheet in that case. This is Mozilla's bug. Please fix the default accept header by eliminating ";q=0.9".
Should we worry about this for the next Netscape release? Does anyone want to nominate this for the 0.9.4 branch?
Target Milestone: --- → mozilla0.9.4
W3C pages are very visible. We should not be having this kind of problem with standards body documents. Nominating for nsbranch. The fix should be simple and safe. pref("network.http.accept.default", "text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2, text/plain;q=0.8, text/css, */*;q=0.1"); In all.js file, just eliminate ";q=0.9" from "text/html;q=0.9" in the "network.http.accept.default" setting.
No, no, no! Mozilla is correct. W3C is wrong. Our accept header is correct: we prefer XML over HTML. The W3C server has at least two problems: 1) It serves us XML without stylesheet, which makes no sense and 2) according to W3C the HTML version of the spec is the normative version and therefore they should ALWAYS serve the HTML version of a spec.
*** Bug 101653 has been marked as a duplicate of this bug. ***
> 2) according to W3C the HTML version of the spec is the normative version > and therefore they should ALWAYS serve the HTML version of a spec. Maybe so, but sending a value of 0.9 attached to text/html and then expecting to get an HTML document served is not correct, either. I agree that 1) is a problem.
> Our accept header is correct: we prefer XML over HTML. Heikki, what is wrong with us giving servers the option to send us either XML or HTML documents by giving both types equal "implicit" Q values? This way, server admins can decide to send XML documents when they know their XML versions work well. I would think that that would be a realistic approach given the rate of XML adoption at this point.
Technically we are not expecting any special mime type from the server, Accept-Header is just our preference. The server has a problem if it sends us something that has no use, like XML without a stylesheet. And the point about Q values is moot: the server can ignore them if it wants to. If the server is misconfigured it is not our problem.
> And the point about Q values is moot: the server can ignore > them if it wants to. If the server is misconfigured it is not > our problem. I am sorry, but I cannot agree with you here. With XML, there are chances that less than perfect pages may be delivered at this point in time. We should not be taking on unnecessary evangelism problems if we can avoid it. I have a compromise plan. Why don't we put in the change to the accept-header value for the commercial branch only for now, leaving the Mozilla/commercial trunk as is? This lets us avoid the problem and other potential problems of this type in the next release. We can debate about the preference of XML over HTML later.
I am not calling this change a bug fix -- just a workaround for the problem at hand.
I'm with Heikki on this one, it's a server issue. It makes no sense to be sending out an XML file that's useless to a browser for http://www.w3.org/TR/xpath. I could understand it for http://www.w3.org/TR/xpath.xml, but http://www.w3.org/TR/xpath should point to the authoritative HTML version.
To make this an evangelism bug, will owner and qa please do the following (or send it back to browser if this is a browser bug): 1. unset the target milestone 2. Reassign to default owner and qa 3. remove the nsbranch and nsenterprise keywords, they have no meaning in evangelism. 4. Set this to P3 for further triage Adding status codes as per http://mozilla-evangelism.bclary.com/ evangelism.html Thanks, Zach
Summary: The URL in displayed as just text → w3c.org/TR/xpath - The URL in displayed as just text
Whiteboard: [USERAGENT] [DOCTYPE]
I am willing to change the product to browser and send it to either the networking or the layout group. I would like a workaround for this problem in the next release. Then we can send it back to evangelism later.
Component: English: US → Networking: HTTP
Product: Tech Evangelism → Browser
Version: unspecified → other
removing evangelism traces
Summary: w3c.org/TR/xpath - The URL in displayed as just text → The URL in displayed as just text
Whiteboard: [USERAGENT] [DOCTYPE]
Kat, the milestone field is for the owner of the bug, so I'd appreciate it if you didn't change it on my bugs. I disagree with your workaround, we should prefer XML/XHTML above HTML. This is just a server configuration issue.
Summary: The URL in displayed as just text → w3c.org/TR/xpath - The URL in displayed as just text
Whiteboard: [USERAGENT] [DOCTYPE]
Guys, let's take the religion out of this discussion. Kat's point is that in the real world we simply appear to users as a broken browser. They don't know or care that it's not our fault. One of our top goals in the current release is to minimize broken sites that we can fix either by design or by hack. You have an opportunity here to be a little flexible and help us all meet our goal.
To update, the URL mentioned in the URL field has been fixed. But so far none of the files linked under the Previous versions have been fixed. Also the links mentioned in Bug 101653 have not been fixed. I know that eventually we might be able to get W3C to fix all broken links. But ... 1. Do we know how many other pages like this one are on the W3C site? 2. It may be that given that XML and XHTML are relatively new protocols, other Apache server sites may also have similar problems -- Apache servers pay attention to many more http-headers than other servers do. 3. A compromise plan I am advocating does not disfavor XML or XHTML. It simply says that HTML is given an equal chance and it's up to servers to send what they think is best -- it is a conservative approach but I think it is a safer one.
Let's not forget the reason why the Accept header looks the way it does. Some authors want to make two versions of a document: one that contains HTML and one that contains XHTML + some other XML based stuff. Then they want to send the latter to browsers that can handle it. (Windows IE 5 is specifically considered a browser that can't handle it.) However, they don't have permission to run Perl or PHP scripts on the server they are using. Mozilla's current Accept-header allows users of clueful Web servers such as Apache to deal with the situation without scripting. Demo URLs: http://www.hut.fi/u/hsivonen/test/multitype/test.var http://www.hut.fi/~hsivonen/test/xhtml-suite/xhtml-index (There's no Perl, PHP or the like involved.) I agree with Heikki. This is a server configuration problem. It makes no sense to send unstyled XML with private element names to a Web browser even if a browser accepts XML. The Accept header has been the way it is for some time and only two real-world problems have been discovered: the W3C (who should know better) and a Danish golf site (not a top site). This is not a widespread problem. (In fact, this is significantly more rare than bug 22274 for example.) If the accept header was changed now, there would be no guarantee of the server situation getting any better. We'd just get stuck with a suboptimal Accept header. CCing Gerv, who is officially the guardian of the Accept header.
> 2. It may be that given that XML and XHTML are relatively new protocols, other Apache server sites may also have similar problems -- Apache servers pay attention to many more http-headers than other servers do.
Not a problem with Apache's default setting. One has to take special steps to configure the server the way the W3C's server is configured.
> 3. A compromise plan I am advocating does not disfavor XML or XHTML. It simply says that HTML is given an equal chance and it's up to servers to send what they think is best -- it is a conservative approach but I think it is a safer one.
Would it be, in that case, possible to do the content selection I described in my previous comment (with Apache, without scripting)? AFAIK, it isn't. The requirements for the non-scripting solution to work (AFAIK) are
q_browser_html * q_server_html < q_browser_xml * q_server_xml
and
q_server_html > q_server_xml
Those requirements can't be satisfied if
q_browser_html == q_browser_xml
In my demo cases with Mozilla the values are:
q_browser_html == 0.9
q_server_html == 1.0
q_browser_xml == 1.0
q_server_xml == 0.95
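The arithmetic in the comment above can be checked with a short sketch. It assumes a deliberately simplified model in which the server picks the variant with the highest product of browser q and server q; real Apache negotiation weighs more dimensions than this. The function choose_variant and the dictionary keys are hypothetical, while the numbers are the ones given in the comment.

```python
def choose_variant(browser_q, server_q):
    """Return the variant whose browser_q * server_q product is highest."""
    return max(server_q, key=lambda kind: browser_q.get(kind, 0.0) * server_q[kind])


# Server-side source qualities from the demo case described above.
server_q = {"html": 1.0, "xml": 0.95}

# Mozilla's current header: HTML at 0.9, XML at 1.0.
mozilla = {"html": 0.9, "xml": 1.0}

# A browser stating no preference between the two (equal q values).
no_preference = {"html": 1.0, "xml": 1.0}

print(choose_variant(mozilla, server_q))        # 0.9 * 1.0 < 1.0 * 0.95, so "xml"
print(choose_variant(no_preference, server_q))  # 1.0 * 1.0 > 1.0 * 0.95, so "html"
```

Under this model, equal browser values leave the server-side values to decide alone. That is why the second requirement (q_server_html > q_server_xml) rules out XML whenever q_browser_html == q_browser_xml.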
Sigh. mozilla0.9.4 is long gone. I don't see what this has to do with religion. Reassigning to Gerv, as this has turned into an accept header discussion. Gerv, feel free to pass it on, or hand it back to me if it turns out to be evangelism with the w3c site after all.
Assignee: peterv → gerv
Status: ASSIGNED → NEW
Target Milestone: mozilla0.9.4 → mozilla0.9.5
I can see that a lot of this is internal to your process, which I'm not part of, but I got an email back which might clear up some of the evangelism issues you were talking about. The webmaster for w3c said that it would be a bug on their end, except they did it on purpose. They put no stylesheet because they "didn't want it to be browser readable." I assume they wanted to stick with xhtml for browsers. Given that, I can only speculate that the xml version is meant to be pulled into a documentation database of some sort, probably Microsoft's/IBM's/whoever's. I did note that the page has Microsoft employees listed on it. So there's no mystery or evangelism to be concerned with, though it would be nice if they had an XML+CSS version to examine. Like I said above, maybe I'll make one.
Well, that's fine if they serve the XML from http://www.w3.org/TR/xpath.xml. http://www.w3.org/TR/xpath should point to the HTML version.
what are the chances of getting this fixed for the 0.9.4 branch?
Whiteboard: [USERAGENT] [DOCTYPE] → [USERAGENT] [DOCTYPE], [ETA ?]
The Guardian of the Accept: header speaks: ;-) The W3C and anyone else has the responsibility to make sure that anything it sends to a browser is viewable by that browser. They should know that no browser can display unstyled XML, and so should not send it to us. If it were styled and we were screwing up, then I can see that might be a problem. The URL they publish is the one to be visited by browsers to see the spec. They should not, therefore, be serving unstyled XML out of it. Having XML as a higher priority than HTML is an important forwards-compatibility move. If you change this, particularly in a high-visibility shipping product, you will make life very difficult for the ever-increasing number of webmasters who want to produce an XML version of their page for capable browsers, and an HTML version for incapable ones. Us saying we prefer XML gives us this option. This is not a religious point - it's a forwards-compatibility one. This will not be changed on the trunk. Netscape owns the branch, so they may do what they like, but I would _strongly_ advise against this change. Let's remember that the pages which are still broken are old spec versions, and so far less of a priority than the original ones. This is an evangelism bug. The W3C should either explain, given that they were the ones wanting us to send an Accept: header and had some input into what it actually was, what they think we are doing wrong, or they should change their server :-) Gerv
Assignee: gerv → peterv
Severity: major → normal
Summary: w3c.org/TR/xpath - The URL in displayed as just text → w3c.org/TR/xpath - old version of spec are displayed as just text
Whiteboard: [USERAGENT] [DOCTYPE], [ETA ?] → [ETA ?]
Target Milestone: mozilla0.9.5 → ---
Evangelism has worked! http://www.w3.org/TR/xpath is now sent as HTML. No need to break the Accept header. Thanks to the folks who fixed this at the W3C and those who persuaded them to fix it.
(Henri) > Evangelism has worked! http://www.w3.org/TR/xpath is now sent as > HTML. No need to break the Accept header. Pages mentioned in Bug 101653 have not been fixed. My point is that W3C may have to do an awful lot of work to get all the visible ones taken care of. To Henri: thank you very much for explaining in detail what Apache can do without elaborate scripting. I can at least now see part of the motivation for wanting to set the HTML value lower. Having said that, I believe Apache has implemented this very poorly on the server side. And we have changed Q values from Netscape 6.0/6.01 for that? A server should be able to deal with cases where HTML and XML are equally favored **explicitly** and send XML in that case. But at the same time, if a browser has no preference in the Accept header about XML or HTML (e.g. NN4.x, IE5.5 or IE6.0 do not include HTML or XML values), then it should be able to send an HTML document -- by a simple switch or preference setting. If the current generation of Apache servers cannot do that without elaborate scripting, then that is a poorly implemented way to use the Accept-header. (I think Henri proved that this simple, more reasonable response is not possible currently without some scripting.) Contrary to what Gerv says, I don't believe there is anything backward about what I am proposing. If Apache is the only reference you have for how Accept headers should be implemented on the client side, that is really not good. My reading of HTTP 1.1 does not tell me that the server side must implement Accept header handling the way Apache did. So there must be smarter servers out there, right? Gerv, what is backward about the client saying that I can accept either XML or HTML document and give me the best you got? Lastly, I am advocating this change for the next Netscape client release only until we can assess the situation a lot better.
The way Apache servers are implemented now, they have to do a lot of guesswork to assign the server-side Q values against different clients. (Just think through Henri's sample, and you will see how constrained this interaction will have to be. There could be other possible mishaps given this.)
I have re-opened Bug 101653 because we need to continue to evangelize W3C about the problem pages reported there while discussion is going on here.
*** Bug 101653 has been marked as a duplicate of this bug. ***
And I just duped it again: the issue is about reacting wrongly to our valid Accept-header, which is what this bug is about. Both are even about the same site...
With the w3c site fixed, are there no other (highly visited) sites to worry about?
I would also like information on if the fix at W3C needs to be done file/directory by file/directory, or by a general change to server settings. In other words, what's the likelihood that the problems can get fixed en masse.
Not that we know of (in fact, the only other site that we know of is a minor Danish site). W3C is not completely fixed: the latest versions of specs seem to work ok, but older versions give out XML without stylesheets (the specs give out spec history, and have links to all previous versions of the specs).
I attached above the XML-related directory links that were broken as of last night at W3C. There were 12 in all.
Thanks Kat, I'll see if I can get them to fix these too.
And it wasn't even necessary to contact them. I got a mail from Dominique Hazaël-Massieux (W3C Webmaster), I'll quote (I hope he doesn't mind that): "I tweaked the qs factor for those pages, so that the HTML version gets served to mozilla instead of the without stylesheet XML version. Hope this helps, Dom" A big thanks to Dominique for fixing this. I guess we can close the bug?
>Having said that I believe Apache has implemented this very poorly >on the server side. The problem is that some notable browsers--including some (if not all) versions of IE--accept */* and nothing else. A browser that only accepts */* and a browser that accepts text/html and text/xml explicitly both indicate an equal preference for text/html and text/xml for the purpose of the content negotiation algorithm. So the other browsers that only accept */* are the problem. (Or actually, allowing */* in the Accept header is a problem.) Apache just doesn't include elaborate enough workarounds for dealing with browsers that accept */* and nothing else. (Apache does include a workaround that discredits */*, but the workaround is useful only when the Accept header also contains explicitly accepted types.) It would be nice if Apache gave the author more control over content negotiation. However, if Mozilla didn't allow easy content negotiation with the current installed base of Apache, authors would likely be more vocally asking for ugly hacks for overriding the content type from within content. I think making HTTP content negotiation with the current installed base of Apache possible is better than introducing ugly hacks for overriding HTTP.
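To illustrate why the two kinds of header look alike to the server, here is a rough sketch assuming the HTTP/1.1 rule that the most specific matching range wins (exact type, then type/*, then */*). The helper effective_q and the two sample headers are hypothetical, and this is not Apache's actual code.

```python
def effective_q(accept, media_type):
    """Look up the q value that applies to media_type; most specific range wins."""
    major = media_type.split("/")[0]
    for pattern in (media_type, major + "/*", "*/*"):
        if pattern in accept:
            return accept[pattern]
    return 0.0  # not acceptable at all


# A browser that sends only */* (q defaults to 1.0).
wildcard_only = {"*/*": 1.0}

# A browser that lists text/html and text/xml explicitly with equal weight.
explicit_equal = {"text/html": 1.0, "text/xml": 1.0, "*/*": 0.1}

for header in (wildcard_only, explicit_equal):
    print(effective_q(header, "text/html"), effective_q(header, "text/xml"))
```

Both headers yield the same pair of values for text/html and text/xml. A server using only these numbers cannot tell "no stated preference" apart from "explicitly equal preference".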
As long as the pages/links that have been mentioned here and in my attachment work OK, I am willing to let this be resolved as Worksforme. I still have some issues with Accept-header Q values but that discussion can take place in another bug. For example, I have an issue with the following statement by Henri Sivonen: > So the other browsers that only accept */* are the problem. > (Or actually, allowing */* in the Accept header is a problem.) But let's do this somewhere else. For this bug, can someone other than I verify that links now all work without tinkering with the accept-header settings? And then resolve it as Worksforme.
PDT- to take it off our radar.
Whiteboard: [ETA ?] → [ETA ?] [PDT-]
all of the urls which do not end in .xml are presented to mozilla 2001100303/win2k as text/html and are displayed properly. the urls that are .xml are presented as text/xml. the site has fixed their content, so you can do whatever you want to this bug.
Changing Product category.
Component: Networking: HTTP → English: US
Product: Browser → Tech Evangelism
Version: other → unspecified
The site has fixed the problems reported. Resolving it as fixed. The HTTP accept-header value issue will be filed as a separate bug.
Assignee: peterv → bclary
QA Contact: petersen → zach
Changing the resolution to fixed.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Verified 2002022703/WinXP
Status: RESOLVED → VERIFIED
Product: Tech Evangelism → Tech Evangelism Graveyard