From Bugzilla Helper: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529 BuildID: 2002052918 To get around the annoying tar.bz2 downloading as text/plain bug, I often just hit "save target link". For some reason, though, on this site (maybe others?), Moz returns an error that the target link couldn't be found, although left-click does bring up a screen full of garble. Reproducible: Always Steps to Reproduce: 1. Go to URL. 2. Right-click on the link to "wmcms-0.3.5.tar.bz2" 3. Go to "save link target as", and watch the error. Actual Results: Error. Expected Results: Should download the file. This would be less of a problem if Moz downloaded bz2-balls correctly, or is this a web server problem?
I've been all over that page, and I can't see the link. Can you give me a clue where it is?
Oops- click on "wmcms", then the link is about halfway down. Use of tables threw me off...
Confirming, WinXP, trunk 2002060704. (OS->All) (amending URL to page with the link) left-clicking on the link loads the file into the browser window, as per bug 126782 but right click->save link target as gives an error dialog "The link could not be saved. The web page might have been removed or had its name changed". Very odd. Not sure where this belongs, I'll try file handling first...
nominating --i see this in the branch as well as the trunk, all platforms. spoke with Bill, who suggested that this should go to Boris. reassign as needed. here are a couple test cases, just involving saving of HTML files: recipe A: 1. go to http://mozilla.org/ 2. bring up the context menu for a link whose url doesn't end with an explicit filename --eg, the "Testing" link on the left side (http://mozilla.org/quality/). 3. select "save link target as" result: instead of getting the file picker, i get an error dlg, "The link could not be saved. The web page might have been removed or had its name changed." recipe B: same as A, but instead of the context menu for steps 2 and 3, shift+click the link (option+click on Mac). results: unlike A, B *sometimes* does display the file picker, but only a third of the time. two-thirds of the time, it fails like A. odd. recipe C: same as A, but go to a URL where the filename *is* appended, eg, the "Roadmap" link (http://mozilla.org/roadmap.html). results: unlike A or B, C resulted in the file picker appearing (as expected).
Assignee: law → bzbarsky
Severity: normal → major
Keywords: nsbeta1, regression
Hardware: PC → All
wacky regression info: i was looking at some older build on win2k, i "narrowed" this down a bit: with a 2/13 build i don't see this problem, but with a 2/18 build i do. this is very odd, considering how i filed 129535 on 3/7, and recipe C (in bug 129535 comment 0) worked fine (selecting "save link as" for http://www.mozilla.org/quality/ brought up file picker). perhaps what i'm looking for regression-wise is not the best test...
adding nsbeta1+ per nav triage team
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [adt3 rtm]
Created attachment 88212 [details] [diff] [review] Partial patch OK. There are actually two separate issues in this bug: 1) http://orbita.starmedia.com/~neofpo/wmcms.html site. On this site, making a HEAD request for the .bz2 file returns a "403 Forbidden" response from the server. A GET request succeeds. The server is "Apache/1.3.12 (Unix) PHP/4.0.2" and this is not normal behavior for that server -- the administrator has apparently purposefully disabled HEAD requests. 2) http://mozilla.org/. On this site, we the server is "Netscape-Enterprise/3.6". It's handling of HEAD is known to be buggy, in fact we already work around the fact that it returns a 404 for HEAD requests on top-level pages. It seems to also return a 500 for HEAD requests on directories.... what fun. The attached patch fixes case #2 (which is a bug in the server, not a configuration choice by the admin). We could attempt to fix case #1 by retrying with GET in cases when the HEAD returns a 403.... Ultimately, though, the simple fact is that this whole setup is broken. The information we get from HEAD requests is not sufficiently reliable to be used to decide whether to continue the file save operation, in my opinion....
ccing the parties that should be in on this discussion and should be reviewing changes to this code... This is nsbeta1+, but it's not even clear to me what we're trying to fix here. The mozilla.org site? The site this was reported against? Solve the problem once and for all for all sites (this means compeletely redoing the save as architecture to not rely on the URI checker and not use this weird two-pass approach)?
Personally, I would be much happier if selecting save as caused a save as dialog to pop up right away, without waiting for any checking. Yes, this would sometimes mean I would go to the trouble of picking a destination, only to get an error dialog later. But that rare case would be more than balanced by not having to wait a long time for the save as dialog in the normal case where the file does exist. Re the patch: that's odd, we used to be getting a 404 from mozilla.org, as seen by the code block immediately after the line you patched. Does it return 404 in some cases and 500 in others? I don't have a problem with your patch (looks fine, r=akkana if you want to check in that part), I'm just wondering why it's different from the case I was seeing earlier.
Yeah, we really shouldn't be using nsiurichecker. I still think that this should have been done via an explicit attribute on the channel, but since nsIChannel is frozen.. One thing I mentioned to bz in another bug was that the status arg for onstart/onstop/etc could be a new success value. This will then work for people using the NS_SUCCEEDED macro (which just checks the first bit of the response) without breaking anything, and thus we can do this to a frozen interface, just like we could be returning a new underlying faliure code from a low level layer. People (css loader, etc) can then check for NS_OK_WITH_ERROR, or something. The link checker is useful for compsoer, which doesn't actually want the result (and so wants ot play with HEAD). I just don't think its appropriate for anyone who does want the content.
Unfortunately, we use the checking as part of our algorithm to determine he suggested filename (via the HTTP "Content-Disposition" header). So we cannot put up the filepicker till the check is done. :( I just tested some more, and mozilla.org is definitely returning a 500 on all HEAD requests for URIs that end with "/". I have no idea what changed in that behavior... Punting the link checker from this code would leave us two alternatives: 1) Go back to the old way of doing a (slow) double-get 2) Move the UI code into callbacks from the persistence object's saveURI. Handle the "Web page complete" case by actually checking whether we have a document and opening a filepicker immediately if we do (there is no need to get any content to do so in this case). If we do not, then we call saveURI which handles prompting for a filename once the OnStartRequest from its GET comes in (note that this is what's already done by the ExternalHelperAppHandler). Ccing some more people who may have useful things to say. Also, we could back out the code that blocks saves on errors until our errors actually _mean_ something. So back to my question. What exactly is nsbeta1+ here? The rearch that's needed to really fix this can't be....
I think the nsbeta1+ was due to the mistaken belief that this was a recent regression. If that's not the case and the only thing "broken" are certain links to sites runing a particular server, then probably this isn't that big a deal. That said, is is possible to slightly modify the code so that an error on the HEAD request is simply ignored? That is, treat it the same way as say an ftp: link (where we can't do a http HEAD to get content-disposition or content-type). Sorry for being lazy and not digging into the code to see if that's easy to do. My recollection is that the code was structured to enable that sort of thing.
Is it possible/prudent, perhaps, to have the browser fall back to using GET in case of a HEAD error?
Possible, sure. Useful, I don't know. The primary consumer of the URIChecker is composer, so it's their call what they want to do there. Bill, do you mean basically backing out the fix to bug 89389? Or something different?
Something different, definitely. That fix solved the problem of us saving the error response page. We still don't want to save the error response page. But rather than put up an alert and save nothing, we should forget about the HEAD request and just resort to saving the link target based on its URL (ignoring content type). Looking at the patch for bug 89389, it isn't obvious why the code didn't work that way prior to landing that fix. It seems like the throw("Link check failed") should have achieved the same end result as the throw "Unknown type" a bit higher up. Maybe a better fix for that bug would have been to insert some code like: this.mData.document = null; prior to the throw("Link check failed"). I'm guessing that some sort of change like that would have coercedthe catch block that handles these throws to do the right thing.
Well.. |throw "Link check failed";| would make us do a GET for the url. Then if the GET comes back with an error page (as it did in bug 89389) we would save the error page....
I see. Sounds like we would also need some code to detect the bad status on the GET request, then. That probably is over in nsWebBrowserPersist... Would that be an improvement over the status quo (and the pre-bug 89389 behavior)?
Yes, it would; that's what we should really be doing...
OK... over to Adam for now, to decide what nsWebBrowserPersist is and is not willing to do to make this work correctly....
Assignee: bzbarsky → adamlock
This is tricky. Webbrowserpersist expects a filename and that is what it saves to. If you want to throw away the URI checker, then in order to pose the proper save as options you will have to: 1. Use the "type" attribute on the anchor tag in the rare event the content bothers to supply one. 2. Infer the type from the URL (risky) 3. Open a channel and sniff the type from the headers or the first chunk of data. Probably number 3 is going to be the most reliable (it's what IE does), however I don't know if we have any type sniffing code. Do we? Once the type is known you can close it, or alternatively perhaps the persist object could take ownership of it. I suppose this would be the most efficient thing to do, but I don't know how feasible it would be. It may be possible to implement a callback on the persist object that allows the caller to determine what filename to use after some data and the content type has arrived.
*** Bug 184044 has been marked as a duplicate of this bug. ***
*** Bug 173877 has been marked as a duplicate of this bug. ***
I can't see the bug in a Mozilla nightly of 2004-10-07 nor in a Mozilla nightly of 2004-10-18, nor in Mozilla1.6. Probably this has been fixed on the server side?
Status: NEW → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.