Closed Bug 129607 Opened 23 years ago Closed 23 years ago

XMLHttpRequest hangs with http://www.yahoo.com

Categories

(Core :: XML, defect, P2)

x86
Windows 2000
defect

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: hjtoi-bugzilla, Assigned: hjtoi-bugzilla)

References

()

Details

(4 keywords, Whiteboard: [ADT2] [DIGbug])

Attachments

(2 files, 1 obsolete file)

I believe this is a regression, should cough up a testcase... The workaround is to call overrideMimeType("text/xml").
Hmm... this test works. Could you guys find some tests where we do hang? http://www.mozilla.org/xmlextras/xget2.html
It works in my builds, but when I tested harishd's P3P enabled build I actually got a crash! Need to investigate that further.
*** Bug 134314 has been marked as a duplicate of this bug. ***
Ok, it is Yahoo that is giving us grief... Dunno what is so different about it. Mozilla.org etc. work just fine.
Summary: XMLHttpRequest hangs with HTML file → XMLHttpRequest hangs with http://www.yahoo.com
Keywords: hang
Priority: -- → P2
Target Milestone: --- → mozilla1.0
I also get the same problem trying to get http://www.sun.com and http://www.budweiser.com. I'm server admin for budweiser and can tell you there's nothing unusual being done with it. But mozilla.org still works fine.
I ran Netmon on my computer while opening a test page that did an asynchronous get with XMLHttpRequest. I'm not an expert in this, but I noticed one difference between sites that worked - mozilla.org and apache.org - and sites that didn't - yahoo.com and budweiser.com and altavista.com. Under the part of the packet that is for "TCP Flags", the sites that worked had a line that looked like this: TCP: ....0... = No Push function The sites that did not work looked like this: TCP: ....1... = Push function I don't know enough about Mozilla and TCP to comment on whether this makes a difference. This is a definition of the Push flag that I found: "When a receiving TCP sees the PUSH flag, it must not wait for more data from the sending TCP before passing the data to the receiving process."
This page is set to grab apache.org, which works. Just change the HTML so it points to other sites to see how they work.
Is http://green.nscp.aoltw.net/heikki/129607/ internal to Netscape? I get hostname unknown message.
Yes, sorry I did not mention it. *.mcom.com and *.nscp.aoltw.net are inside Netscape firewall. The URL contains the testcase attachment and it loads a local copy of the Yahoo page. This is easier for me & other Netscaper's to test than saving the attachment to local disc and editing it. harishd and I looked at this, and we think we have a solution. We should make sure that XMLHttpRequest does not call the HTML parser (by checking the mime type or something). This same problem happens with document.load(). Alternatively, or additionally, the HTML parser could detect the mime type and bail out if it is not HTML. I'll see if we need that kind of bullet proofing.
Status: NEW → ASSIGNED
I know you are busy, but I was wondering why would some sites work, and others not? For example, mozilla.org works, but not yahoo.com.
Style and script tags make a difference in processing; apache.org does not have inline style elements, yahoo.com does. I have a proof of concept fix now, cleaning up. The fix will prevent parsing althogether if the mime type is not XML, which means it is a considerable performance boost for non-XML data. And this is a regression from 0.9.4.
Severity: normal → critical
Filed bug 138071 for document.load() which suffers from the same thing.
Attached patch Partial fix (obsolete) — Splinter Review
This patch depends on the mime type, but that is a problem because there are servers out there that do not return a mime type. Also we don't have HTTP headers when reading from local disc. So unless there is another way of doing this we really need the fix in the parser as well (see bug 138071). I am not sure if it is safe to not call the parser's OnStartRequest and OnEndRequest (does everything get torn down properly, have we even constructed the parser and its supporting objects?). Harish?
Comment on attachment 79720 [details] [diff] [review] Partial fix >+ GetResponseHeader("Content-Type", getter_Copies(type)); Add check for out of memory.
Attachment #79720 - Flags: review+
Comment on attachment 79720 [details] [diff] [review] Partial fix A couple of points: 1) You can use nsIChannel::GetContentType to get the mime type, rather than directly looking at HTTP response headers. In fact, this might give you an inferred mime type (based on file extension, for example). 2) As it stands, the new code will not parse XML files delivered without a Content-Type header specifying that it is text/xml. I don't know if that's a common case, but it is a change in behavior from the original code.
Based on vidur's comments I talked to Darin, and learned that the channel indeed provides the guessed mime type (from file suffix) if there is no content-type header. I have a modified patch to do this, will attach. So I made test files that don't return mime type, but return well-formed XML with XML declaration, even: http://green/cgi-bin/hello http://green/cgi-bin/hello.cgi http://green/cgi-bin/hello.xml If you try these in the browser you see that we detect the .xml suffix and treat it as XML. We treat the others as text/plain (and that is what the channel returns). Darin said that we also do content sniffing if we can't guess the mime type from file suffix. I placed breakpoint in nsUnknownDecoder::OnStartRequest() but we don't hit that with the above test cases. I also made three testcases that use XMLHttpRequest to load the above files: http://green/heikki/129607/notypenosuffix.html http://green/heikki/129607/notype.html http://green/heikki/129607/notypexml.html The last one, which loads the file with .xml suffix, is the only one that parses the response as XML, i.e. this functionality is identical to normal document load. If we want to do content sniffing for XML (I don't recommend this), we should open a new bug for that.
Attachment #79720 - Attachment is obsolete: true
Comment on attachment 79882 [details] [diff] [review] Proposed fix sr=vidur, though I'm still mildly concerned about the expectation of finding "xml" in the mime type. What happens if there is a proliferation of mime types of XML applications that don't contain "xml"? My understanding is that they should be of the form application/<xml appl>+xml, but who knows. Also, the previous hanging behavior while parsing certain types of HTML content as XML should be fixed (a separate bug, of course).
Attachment #79882 - Flags: superreview+
I checked the default Apache mime.types file and the string "xml" was only in mime types that were really XML data. This actually allows XMLHttpRequest to deal with more XML than we can view normally, because we explicitly check only three XML mime types for browsing: text/xml, application/xml and application/xhtml+xml. There might be some that are XML but don't have that string in them; I even think the "+xml" in mime types is optional. But like I said, we wouldn't be able to view them as XML anyway, so why would you expect to be able to load them with XMLHttpRequest either? In any case, you can always call overrideMimeType("text/xml") if in doubt...
Comment on attachment 79882 [details] [diff] [review] Proposed fix r=harishd
Attachment #79882 - Flags: review+
Fixed on trunk, mailing drivers. Some people have reported that their testcases are 20-30 seconds slower in Mozilla, compared to IE. I have not yet seen their tests, but if they were loading non-XML data this should prove useful. Please note that the workaround for this hang, overrideMimeType("text/xml") forces us to always parse as XML, which is a silly performance hit if the data is not XML.
Keywords: adt1.0.0
Whiteboard: [fixed on trunk]
Whiteboard: [fixed on trunk] → [fixed on trunk][ADT2]
Is XMLHttpRequest widely used by web developer? Chris - Can you verify this is fixed, and does not cause any other regressions? Cathleen/DP - Pls note Heikki comment on perf in comment #21. Should we be concerned about this?
Keywords: topembed
Changing Q/A contact
QA Contact: petersen → rakeshmishra
We use XMLHttpRequest in P3P, and this + bug 138071 would enable us to get faster response in case there is no P3P policy (still the majority of sites). Also the DIG group uses XMLHttpRequest, and this would certainly help their performance. I am sure there are a fair number of XMLHttpRequest users since I hear from XMLHttpRequest regressions reasonably fast, or at least faster than of any other XML regression (including XHTML). We do not use XMLHttpRequest in any of our perf tests.
Verified on the trunk build 2002-04-19-06-trunk on Win 2000 machine.
adt1.0.0+ (on ADT's behalf) approval for checking in to the 1.0 branch. Pls check this in today, then add the fixed1.0.0 keyword.
Keywords: adt1.0.0adt1.0.0+
Whiteboard: [fixed on trunk][ADT2] → [fixed on trunk][ADT2] [DIGbug] [ETA 04/25]
Fixed on branch as well.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Keywords: fixed1.0.0
Resolution: --- → FIXED
Whiteboard: [fixed on trunk][ADT2] [DIGbug] [ETA 04/25] → [ADT2] [DIGbug]
rakesh verified on trunk earlier, marking as such.
Status: RESOLVED → VERIFIED
Verified on the branch build 2002-04-26-08-1.0.0 on Windows 2000 machine. Adding "verified1.0.0" in the keyword .
Keywords: verified1.0.0
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: