save target link doesn't work

RESOLVED WORKSFORME

Status

Core Graveyard
File Handling
--
major
RESOLVED WORKSFORME
16 years ago
2 years ago

People

(Reporter: Felipe, Assigned: Adam Lock)

Tracking

({regression})

Trunk
Future
regression

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [adt3 rtm], URL)

Attachments

(1 attachment)

(Reporter)

Description

16 years ago
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529
BuildID:    2002052918

To get around the annoying tar.bz2 downloading as text/plain bug, I often just
hit "save target link". For some reason, though, on this site (maybe others?),
Moz returns an error that the target link couldn't be found, although left-click
does bring up a screen full of garble.

Reproducible: Always
Steps to Reproduce:
1. Go to URL.
2. Right-click on the link to "wmcms-0.3.5.tar.bz2"
3. Go to "save link target as", and watch the error.

Actual Results:  Error.

Expected Results:  Should download the file.

This would be less of a problem if Moz downloaded bz2-balls correctly, or is
this a web server problem?

Comment 1

16 years ago
I've been all over that page, and I can't see the link.  Can you give me a clue
where it is?
(Reporter)

Comment 2

16 years ago
Oops- click on "wmcms", then the link is about halfway down. Use of tables threw
me off...

Comment 3

16 years ago
Confirming, WinXP, trunk 2002060704. (OS->All)

(amending URL to page with the link)

left-clicking on the link loads the file into the browser window, as per bug
126782 but right click->save link target as gives an error dialog "The link
could not be saved.  The web page might have been removed or had its name
changed".  Very odd.

Not sure where this belongs, I'll try file handling first...
Assignee: sgehani → law
Status: UNCONFIRMED → NEW
Component: XP Apps → File Handling
Ever confirmed: true
OS: Linux → All
QA Contact: paw → sairuh
nominating --i see this in the branch as well as the trunk, all platforms. spoke
with Bill, who suggested that this should go to Boris. reassign as needed.

here are a couple test cases, just involving saving of HTML files:

recipe A:

1. go to http://mozilla.org/
2. bring up the context menu for a link whose url doesn't end with an explicit
filename --eg, the "Testing" link on the left side (http://mozilla.org/quality/).
3. select "save link target as"

result: instead of getting the file picker, i get an error dlg, "The link could
not be saved. The web page might have been removed or had its name changed."

recipe B: same as A, but instead of the context menu for steps 2 and 3,
shift+click the link (option+click on Mac).

results: unlike A, B *sometimes* does display the file picker, but only a third
of the time. two-thirds of the time, it fails like A. odd.

recipe C: same as A, but go to a URL where the filename *is* appended, eg, the
"Roadmap" link (http://mozilla.org/roadmap.html).

results: unlike A or B, C resulted in the file picker appearing (as expected).
Assignee: law → bzbarsky
Severity: normal → major
Keywords: nsbeta1, regression
Hardware: PC → All
wacky regression info: i was looking at some older build on win2k, i "narrowed"
this down a bit: with a 2/13 build i don't see this problem, but with a 2/18
build i do.

this is very odd, considering how i filed 129535 on 3/7, and recipe C (in bug
129535 comment 0) worked fine (selecting "save link as" for
http://www.mozilla.org/quality/ brought up file picker). perhaps what i'm
looking  for regression-wise is not the best test...

Comment 6

16 years ago
adding nsbeta1+ per nav triage team
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [adt3 rtm]
Created attachment 88212 [details] [diff] [review]
Partial patch

OK.  There are actually two separate issues in this bug:

1)  http://orbita.starmedia.com/~neofpo/wmcms.html site.  On this site, making
a
    HEAD request for the .bz2 file returns a "403 Forbidden" response from the
    server.  A GET request succeeds.  The server is "Apache/1.3.12 (Unix)
    PHP/4.0.2" and this is not normal behavior for that server -- the
    administrator has apparently purposefully disabled HEAD requests.
2)  http://mozilla.org/.  On this site, we the server is
    "Netscape-Enterprise/3.6".	It's handling of HEAD is known to be buggy, in
    fact we already work around the fact that it returns a 404 for HEAD
requests
    on top-level pages.  It seems to also return a 500 for HEAD requests on
    directories.... what fun.

The attached patch fixes case #2 (which is a bug in the server, not a
configuration choice by the admin).  We could attempt to fix case #1 by
retrying with GET in cases when the HEAD returns a 403....

Ultimately, though, the simple fact is that this whole setup is broken.  The
information we get from HEAD requests is not sufficiently reliable to be used
to decide whether to continue the file save operation, in my opinion....
ccing the parties that should be in on this discussion and should be reviewing
changes to this code...

This is nsbeta1+, but it's not even clear to me what we're trying to fix here. 
The mozilla.org site?  The site this was reported against?  Solve the problem
once and for all for all sites (this means compeletely redoing the save as
architecture to not rely on the URI checker and not use this weird two-pass
approach)?

Comment 9

16 years ago
Personally, I would be much happier if selecting save as caused a save as dialog
to pop up right away, without waiting for any checking.  Yes, this would
sometimes mean I would go to the trouble of picking a destination, only to get
an error dialog later.  But that rare case would be more than balanced by not
having to wait a long time for the save as dialog in the normal case where the
file does exist.

Re the patch: that's odd, we used to be getting a 404 from mozilla.org, as seen
by the code block immediately after the line you patched.  Does it return 404 in
some cases and 500 in others?  I don't have a problem with your patch (looks
fine, r=akkana if you want to check in that part), I'm just wondering why it's
different from the case I was seeing earlier.
Yeah, we really shouldn't be using nsiurichecker.

I still think that this should have been done via an explicit attribute on the
channel, but since nsIChannel is frozen..

One thing I mentioned to bz in another bug was that the status arg for
onstart/onstop/etc could be a new success value. This will then work for people
using the NS_SUCCEEDED macro (which just checks the first bit of the response)
without breaking anything, and thus we can do this to a frozen interface, just
like we could be returning a new underlying faliure code from a low level layer.
People (css loader, etc) can then check for NS_OK_WITH_ERROR, or something.

The link checker is useful for compsoer, which doesn't actually want the result
(and so wants ot play with HEAD). I just don't think its appropriate for anyone
who does want the content.
Unfortunately, we use the checking as part of our algorithm to determine he
suggested filename (via the HTTP "Content-Disposition" header).  So we cannot
put up the filepicker till the check is done.  :(

I just tested some more, and mozilla.org is definitely returning a 500 on all
HEAD requests for URIs that end with "/".  I have no idea what changed in that
behavior...

Punting the link checker from this code would leave us two alternatives:

1)  Go back to the old way of doing a (slow) double-get
2)  Move the UI code into callbacks from the persistence object's saveURI.
    Handle the "Web page complete" case by actually checking whether we have a
    document and opening a filepicker immediately if we do (there is no need to
    get any content to do so in this case).  If we do not, then we call saveURI
    which handles prompting for a filename once the OnStartRequest from its GET
    comes in (note that this is what's already done by the
    ExternalHelperAppHandler).

Ccing some more people who may have useful things to say.

Also, we could back out the code that blocks saves on errors until our errors
actually _mean_ something.

So back to my question.  What exactly is nsbeta1+ here?  The rearch that's
needed to really fix this can't be....

Comment 12

16 years ago
I think the nsbeta1+ was due to the mistaken belief that this was a recent
regression.  If that's not the case and the only thing "broken" are certain
links to sites runing a particular server, then probably this isn't that big a deal.

That said, is is possible to slightly modify the code so that an error on the
HEAD request is simply ignored?  That is, treat it the same way as say an ftp:
link (where we can't do a http HEAD to get content-disposition or content-type).

Sorry for being lazy and not digging into the code to see if that's easy to do.
 My recollection is that the code was structured to enable that sort of thing.
(Reporter)

Comment 13

16 years ago
Is it possible/prudent, perhaps, to have the browser fall back to using GET in
case of a HEAD error?
Possible, sure.  Useful, I don't know.  The primary consumer of the URIChecker
is composer, so it's their call what they want to do there.

Bill, do you mean basically backing out the fix to bug 89389?  Or something
different?

Comment 15

16 years ago
Something different, definitely.  That fix solved the problem of us saving the
error response page.  We still don't want to save the error response page.  But
rather than put up an alert and save nothing, we should forget about the HEAD
request and just resort to saving the link target based on its URL (ignoring
content type).

Looking at the patch for bug 89389, it isn't obvious why the code didn't work
that way prior to landing that fix.  It seems like the throw("Link check
failed") should have achieved the same end result as the throw "Unknown type" a
bit higher up.

Maybe a better fix for that bug would have been to insert some code like:
     this.mData.document = null;
prior to the throw("Link check failed").  I'm guessing that some sort of change
like that would have coercedthe catch block that handles these throws to do the
right thing.
Well.. |throw "Link check failed";| would make us do a GET for the url.  Then if
the GET comes back with an error page (as it did in bug 89389) we would save the
error page....

Comment 17

16 years ago
I see.  Sounds like we would also need some code to detect the bad status on the
GET request, then.  That probably is over in nsWebBrowserPersist...

Would that be an improvement over the status quo (and the pre-bug 89389 behavior)?
Yes, it would; that's what we should really be doing...
OK...  over to Adam for now, to decide what nsWebBrowserPersist is and is not
willing to do to make this work correctly....
Assignee: bzbarsky → adamlock
(Assignee)

Comment 20

16 years ago
This is tricky. Webbrowserpersist expects a filename and that is what it saves
to. If you want to throw away the URI checker, then in order to pose the proper
save as options you will have to:

1. Use the "type" attribute on the anchor tag in the rare event the content
bothers to supply one.
2. Infer the type from the URL (risky)
3. Open a channel and sniff the type from the headers or the first chunk of data.

Probably number 3 is going to be the most reliable (it's what IE does), however
I don't know if we have any type sniffing code. Do we?

Once the type is known you can close it, or alternatively perhaps the persist
object could take ownership of it. I suppose this would be the most efficient
thing to do, but I don't know how feasible it would be. It may be possible to
implement a callback on the persist object that allows the caller to determine
what filename to use after some data and the content type has arrived.
QA Contact: sairuh → petersen
(Assignee)

Updated

16 years ago
Target Milestone: --- → Future

Comment 21

16 years ago
*** Bug 184044 has been marked as a duplicate of this bug. ***

Comment 22

16 years ago
*** Bug 173877 has been marked as a duplicate of this bug. ***
I can't see the bug in a Mozilla nightly of 2004-10-07 nor in a Mozilla nightly
of 2004-10-18, nor in Mozilla1.6.
Probably this has been fixed on the server side?
Status: NEW → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.