Closed Bug 160454 Opened 22 years ago Closed 20 years ago

Eliminate use of HEAD in "save as"

Categories

(Core Graveyard :: File Handling, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.8alpha5

People

(Reporter: bzbarsky, Assigned: Time_lord)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

Attachments

(2 files, 14 obsolete files)

33.64 KB, patch
bzbarsky
: superreview+
Details | Diff | Splinter Review
2.99 KB, patch
asaf
: review+
beltzner
: approval1.9+
Details | Diff | Splinter Review
HEAD is so poorly supported that using it is pretty much pointless.  We should
just go back to what we were doing before the "save as, complete" landing for 
"save link as" and we should expose the content-disposition on the document
object so that we honor that header for content that we have already loaded....
Indeed... moreover, fixing this bug would once again allow folks to "queue up" a
bunch of downloads, select save locations, and leave their computer.  This was
something I used to enjoy being able to do with previous versions of Mozilla...
but, in the current world, you have to wait for the HEAD request to succeed
before you're able to pick the save location :(
Oops.  Other way around.  And some dependencies.  There are probably more of
those...

As we do this, we should make sure that the persistence object throws an error
alert if it gets a 404 while doing "save link as".  I think a new flag
(DETECT_NETWORK_ERRORS or something) will be needed for that.
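To illustrate the intent of such a flag, here is a minimal sketch in plain JavaScript of the decision it would drive. The flag and the helper function are hypothetical, not the actual nsIWebBrowserPersist API:

```javascript
// Hypothetical helper: decide whether a "save link as" transfer should
// surface an error alert, mimicking what a DETECT_NETWORK_ERRORS-style
// flag on the persistence object might check. The flag name comes from
// the comment above; the real nsIWebBrowserPersist interface differs.
function shouldAlertOnStatus(httpStatus, detectNetworkErrors) {
  if (!detectNetworkErrors)
    return false;                 // legacy behavior: save whatever arrives
  // 4xx and 5xx responses mean we would be saving an error page rather
  // than the linked resource, so the persist object should report failure.
  return httpStatus >= 400;
}
```

With the flag unset, a 404 body would be silently saved to disk, which is exactly the behavior this bug wants to avoid.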
Assignee: law → bzbarsky
Blocks: exe-san
Blocks: 164433
*** Bug 164433 has been marked as a duplicate of this bug. ***
Blocks: 165636
Blocks: 172542
Blocks: 174150
Depends on: 174167
QA Contact: sairuh → petersen
Blocks: 155524
Blocks: 151126
Blocks: 188512
Keywords: helpwanted
Thought I'd mention, since this is a major site...

With a recent amazon.com purchase, I got some free digital documents (A Time
article, and two books) added to my account, which are in PDF form.

Left-clicking them opens them in my PDF viewer, and Save Link As is a no-op...
Phoenix issues two HEAD requests for the file, and there is no response.

  HEAD /exec/obidos/redirect-to-external-url/ref=mt_dl_dwn_free_3/[...]

If I weren't as savvy a user, I would have NO way to save this content I could
have potentially paid for.

Though I do appreciate the attempt to use HEAD to DTRT, this is a major problem.
Blocks: 196185
While working with some web applications, I tried using Mozilla to save a 
report that was generated by POSTing a form.  Mozilla of course tried to do the 
HEAD and the result is a server error because the ASP serving the page did not 
have some required POST information.  So I tried this on another page where I 
submitted a POST but it was not required.  The end result was that the browser 
did NOT save the content of the page I was viewing on the screen.  It saved a 
NEW page resulting from a NEW request (not even the original post!).

Don't most people want to save the page that they're already looking at when 
they choose File->Save Page As?  Or am I mistaken?

I was having a tough time coming up with a non-order-oriented situation where 
you want to save a posted page, so here's an example.  Go to epinions.com and 
when you write a review, it prompts you to confirm before it gets posted.  Save 
this page and voilà, you have CONFIRMED without CONFIRMING.

This sounds like a much bigger problem to me than anyone has made it out to be. 
This may be 2 different problems I'm representing here, but they definitely 
seem related.

The only work-around I've found to save the CURRENT page is to view source 
(which magically gives you the source of the page you wanted to save in the 
first place...from cache) and select all, copy into a text editor, and save.
Keith, the POST issue is very serious, yes.  And it has NOTHING AT ALL to do
with this bug or any of the other bugs you spammed it with.  It has a bug of its
own, and that bug has plenty of information on why it's a major problem
(everyone agrees it is).

Fixing this bug is completely orthogonal to fixing the POST bug; to fix that we
need to pass the right cache keys or history entries to the persistence object
and this requires a pretty big rewrite of all the save as callers (whereas this
bug is about the save as _back_end_, not the callers).
Blocks: 196871
Priority: -- → P1
Target Milestone: --- → mozilla1.5alpha
Blocks: 202241
Blocks: 202811
Blocks: 203204
Blocks: 182692
*** Bug 147194 has been marked as a duplicate of this bug. ***
Blocks: 205470
Blocks: 208523
Blocks: 208584
Flags: blocking1.4?
Just as a note, there is no way I, personally, am landing a change like this one
on anything being called a stable branch.  I'm not even sure I'd land it in a
beta milestone.
Blocks: 211468
This is definitely causing issues with a number of e-commerce offerings for a 
department in the Government of Canada where I work. Users are not able to save 
their work in Netscape 7.1, where they always could before. With a large 
contingent of Mac users (darn those lawyers and accountants) who are looking 
at Netscape as their only real option, they now find they cannot save or print 
their filings. Using a GET is not an option, as the systems use the file upload 
feature.
Blocks: 213510
Blocks: 213877
Bug 213647 appears to be related to this.
Any chance of seeing this fixed anytime soon? It's almost a showstopper for me...
Is there something I can do to help? (coding-wise, that is)
Blocks: 213647
This needs the following steps:

1)  Expose content-disposition on the document object in a sane way (clear it with
    jst)
2)  Remove all uses of the link checker in contentAreaUtils.js
3)  Decide what to do when saving error pages (both currently loaded, and linked
    to).
4)  Make changes to persistence object and API as needed to handle whatever
    choices were made in #3 (clear them with adamlock).
5)  Test the hell out of the whole thing.
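The payoff of step 1 can be sketched loosely: assuming the document exposed a (hypothetical) contentDisposition property carrying the header it was served with, the save-as callers could pick a filename without any extra network round-trip. The property name and helper below are assumptions for illustration only:

```javascript
// Sketch of step 1's benefit: prefer a Content-Disposition filename
// exposed on the already-loaded document (property name hypothetical),
// falling back to the last URL path segment -- no HEAD request needed.
function suggestedFileName(doc, url) {
  const cd = doc && doc.contentDisposition;
  if (cd) {
    // e.g. 'attachment; filename="report.pdf"'
    const m = /filename\s*=\s*"?([^";]+)"?/i.exec(cd);
    if (m)
      return m[1];
  }
  // Fall back to the last path segment of the URL.
  const path = new URL(url).pathname;
  return decodeURIComponent(path.substring(path.lastIndexOf("/") + 1)) || "index.html";
}
```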
mozilla1.4 shipped. unsetting blocking1.4 request.
Flags: blocking1.4?
Will this bug be fixed in 1.6?
Target Milestone: mozilla1.5alpha → mozilla1.6alpha
No (notice the "helpwanted" keyword), and please do NOT EVER touch the target
milestone of a bug you don't own.
Target Milestone: mozilla1.6alpha → mozilla1.5alpha
Boris, excuse me, that was my fault. (It was a slip of the keyboard.)
I will look into this bug.
*** Bug 222994 has been marked as a duplicate of this bug. ***
Blocks: 225069
Blocks: 233376
Blocks: 234349
(In reply to comment #11)
> 1)  Expose content-disposition on the document object in a sane way (clear it with
>     jst)

Hm... how is that required for this bug? Does File|Save Page As use HEAD as well?

if so, that actually makes my getFilenameFromChannel suggestion in another bug
much more workable :)
> Hm... how is that required for this bug? Does File|Save Page As use HEAD as
> well?

Absolutely.  That's why you can't save a 404 page you're looking at.
*** Bug 232660 has been marked as a duplicate of this bug. ***
Blocks: 232660
Blocks: 242412
Blocks: 241588
You may also have a look at bug 246633.
I have just read RFC2616.
As I understand it, the difference between a GET and a HEAD request is that in
the case of HEAD, the server must not return an HTML-formatted error page when
an error occurs.

So it is absolutely correct that Mozilla sends a HEAD request for "Save Link
As..." and "Save Page As...". Who wants to download error messages!?

Since this problem only happens with some rare servers, I think Mozilla should
keep its correct HTTP implementation. And this problem can be worked around by
disabling, for example, the PDF plugin (in the case of downloading PDFs), so
that a download request appears.
Blocks: 246633
sending HEAD before doing anything else has other drawbacks, most significantly
that there's a delay before the filepicker appears, and that users do not care
whose fault it is that they can't download a file.
> Since this problem only happens with some rare servers, I think Mozilla should
> keep its correct HTTP implementation. And this problem can be worked around by

This problem can be easily demonstrated with nearly every major server on the
internet.  Simply try to download (save to disk) three files from one server. 
You'll find that the first two prompt you for the save location, but you do not
get prompted for the third until after at least one of the preceding two
downloads completes.  This is because the HTTP/1.1 standard says that a
user-agent should open no more than 2 keep-alive connections per server.  Unless
Mozilla wanted to violate that and/or not use keep-alive connections when
downloading, there is a major problem.
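A toy model of that interaction, under the stated assumptions (two persistent connections per server, each save dialog gated on a HEAD response, and the GET starting right after the dialog appears; time units are arbitrary). This is an illustration, not Mozilla's connection scheduler:

```javascript
// Model: each "save as" needs a HEAD response before its dialog can
// appear, and at most 2 persistent connections exist per server.
// Returns, for each download, the time at which its dialog appears.
function dialogTimes(downloadDurations, headTime) {
  const free = [0, 0];             // when each of the 2 connections frees up
  const times = [];
  for (const duration of downloadDurations) {
    // take the earliest-free connection for the HEAD + subsequent GET
    free.sort((a, b) => a - b);
    const start = free[0];
    times.push(start + headTime);  // dialog appears once HEAD completes
    free[0] = start + headTime + duration;
  }
  return times;
}
```

With three 10-unit downloads and a 1-unit HEAD round-trip, the first two dialogs appear at time 1 but the third only after a whole download has finished, which is exactly the delay described above.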
As far as I know, HEAD and keep-alive do not depend on each other...
I think I remember that recent versions of the Windows download manager "Net
Transport" use keep-alive and GET (not HEAD)...
Maybe it would make sense in general to add a "Protocol" section to the Options
where such things can be user-defined...
(In reply to comment #23)

Sorry. In comment #21 I meant the fact that some servers do not, in general,
support HEAD downloads...
This problem is described in bug 246633, that turned out to be a server bug, not
a Mozilla bug.
(In reply to comment #24)
> As far as I know, HEAD and keep-alive do not depend on each other...

That's correct, but keep-alive means that Mozilla only makes 2 connections to a
server, and use of HEAD in save as means that the dialog only appears after
getting a HEAD response.

Combined, this causes the delays darin mentioned. See also comment 1.
(In reply to comment #27)
"HEAD in save as means that the dialog only appears after
getting a HEAD response"

I have not tested the problem with the delay... But I believe you... :-)

Is this really required by the protocol specs (RFC2616)? I don't think so.
I always thought HEAD should only prevent the server from sending error pages
instead of just error codes.

I think this delay could be fixed (removed) while keeping HEAD. Or could it not?
I'm not a protocol guru...

I can only read in RFC2616 that HEAD is more suitable for downloads, because it
does not return error pages...
No, you misunderstood the purpose of HEAD. It "must not" return *any* body, and
that includes the file, so you cannot download using HEAD; you can only check
the headers / meta-data (when it last changed, mimetype etc.).
<http://asg.web.cmu.edu/rfc/rfc2616.html#sec-9.4>
See the list of depending bugs for how many problems this bug would fix.
BTW: This is a tracking bug.
(In reply to comment #29)


OK. You are right. Sorry. :-)
Please forget comment #21.
You may also have a look at bug 160448 and bug 246633.
(In reply to comment #27)

> and use of HEAD in save as means that the dialog only appears after
> getting a HEAD response.
> 
> combined, this causes the delays darin mentioned. see also comment 1.

I see a big problem here!

Mozilla sends a HEAD request, followed by a GET request, right?

The file requester opens after the HEAD request's response, right?

I think it is necessary to wait for the HTTP headers before the file requester
opens, because of Content-Disposition filenames, or URL redirections that cause
filenames other than the one the original URL contains!

So I think such a delay is necessary. But I don't know whether a HEAD request
is necessary for it.

I guess that IE waits for the headers of the GET request's response, then opens
the file requester (which would also contain the correct file name in the case
of Content-Disposition), and begins caching the file before the user has
confirmed.

So I think it's impossible to eliminate that delay, because we must check the
response headers for Content-Disposition filenames, or other filenames due to a
redirected URL...
If the HEAD response is taking too long perhaps we should just give the user the
choice to not wait any longer and specify the save as location without any
suggestion.  I'm not sure what the UI would look like, but it seems to me that
there are cases where it would be nice to be able to just say "screw waiting for
the Content-Disposition header, I know where I want to put this file, and I want
to tell you (the browser) now, damn it!"
Another reason we need to wait is to get the MIME type of the file from the server.
If we don't wait, we have to guess the extension (which may not be correct).
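That guess amounts to something like the following standalone sketch (the lookup table is illustrative only; Mozilla's real MIME service is far richer):

```javascript
// Illustrative sketch: without a response header we can only map a
// known MIME type to an extension, and leave the name alone otherwise.
// This tiny table is an assumption, not Mozilla's MIME service.
const extByMime = {
  "text/html": "html",
  "application/pdf": "pdf",
  "image/jpeg": "jpg",
};

function ensureExtension(fileName, mimeType) {
  if (/\.[^./]+$/.test(fileName))
    return fileName;               // already has an extension
  const ext = extByMime[mimeType];
  return ext ? fileName + "." + ext : fileName;
}
```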
(In reply to comment #34)
> If the HEAD response is taking too long perhaps we should just give the user the
> choice to not wait any longer and specify the save as location without any
> suggestion.  I'm not sure what the UI would look like, but it seems to me that
> there are cases where it would be nice to be able to just say "screw waiting for
> the Content-Disposition header, I know where I want to put this file, and I want
> to tell you (the browser) now, damn it!"


Therefore, just have a look at the download managers "NetAnts" and "Net Transport":

They only request a directory and an _optional_ rename field that can be left
empty when adding the download job to the queue.

If the user enters a file name in the rename field, it has the highest priority.
If this field is left empty, the filename is determined while downloading...

In my opinion this is a very smart solution that should be considered for
adoption in Mozilla/Firefox...

Just have a look at "NetAnts" or "Net Transport"...
But we must not forget that we are talking about a browser, not a download
manager...

I guess Mozilla uses the HEAD request because it cannot cache the file the way
IE does until the user has confirmed the file requester.

So I agree with removing the HEAD request, because some servers don't support
it and I see no need for it.

So I think downloading could be handled in the following way:

- GET request

- waiting for server response

- open file requester with correct file name

- possibility 1:
 begin caching the file (like IE, I guess)

 possibility 2:
 just pause the stream if this is possible

- wait until the user has confirmed the file requester
(in case 2: cintinue streaming)

I think the programmers should have some good suggestions by now...
Let them decide...
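The flow above can be sketched as a minimal stand-in class in plain JavaScript. Everything here is an assumption for illustration; real code would live in the persistence object and its stream listeners:

```javascript
// Sketch of "possibility 1": issue a single GET, note the headers when
// they arrive (a file picker could open at that point), and buffer body
// chunks until the user confirms the save location.
class BufferedSave {
  constructor() {
    this.chunks = [];        // body data cached while the dialog is up
    this.confirmed = false;
    this.written = [];       // stand-in for data written to disk
  }
  onHeaders(headers) {
    // Headers available: the picker could now show the name from
    // Content-Disposition or the post-redirect URL.
    this.suggestedName = headers["content-disposition"] || "unknown";
  }
  onData(chunk) {
    if (this.confirmed) this.written.push(chunk);
    else this.chunks.push(chunk);      // cache until the user confirms
  }
  confirm() {
    this.confirmed = true;
    this.written.push(...this.chunks); // flush what arrived meanwhile
    this.chunks = [];
  }
  cancel() {
    this.chunks = [];                  // same effect as pressing Stop
  }
}
```

The point of the sketch is that no data is lost while the dialog is open, and cancelling simply discards the cached chunks, much like stopping a page load.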
(In reply to comment #37)

typing error:
"cintinue"->"continue"

Note:
The delays occur in other browsers too...

So please don't forget that the main problem with the HEAD request is that it
makes it impossible to download from some servers with save-as, because they
don't support HEAD!
Why do we need to issue a HEAD request to get the item's headers, instead of
just issuing a GET request and then popping up the download window after the
headers have arrived?

That would also allow honoring <meta http-equiv> tags in the <head>
section of an HTML file, if we detected HTML and parsed it until </head>.  It
could also give us the convenient feature that Opera has (IIRC) where it starts
downloading the file in the background while you're handling the dialogs.

If the user cancels the download, the effect could be the same as if they'd
pressed STOP while loading a Web page.
(In reply to comment #40)

>... where it starts
> downloading the file in the background while you're handling the dialogs.
> 
> If the user cancels the download, the effect could be the same as if they'd
> pressed STOP while loading a Web page.

I think this is what I mean with "possibility 1" in comment #37. I think IE also
does so...
(In reply to comment #37)
> So I think downloading could be handled in the following way:
> 
> - GET request
> 
> - waiting for server response
> 
> - open file requester with correct file name

I went through this conversation a few months ago... I suggested replacing the HEAD
with a partial GET - but this is an optional part of the spec and apparently
supported even worse than HEAD ;-) And it may still mean the server gets hit twice.

One thing to consider is network latency - regardless of the size of HEAD
req/resp, there is a delay before the save-as dialog appears. This can be
frustrating, especially for a lot of us still on dialup (although I have fibre
at work). Even with a fast connection, there can be significant network delay.
Re. comments in #37 and #40, yes rendering the document once received (even
partially) is a way to determine content type and offer 'save modes' to the
user, but again this introduces a delay before the save-as dialog appears, and
is subject to many different factors due to the varied types of
documents/images/binaries etc that can be saved (and the quality of their
content - the content once received may be of no use at all).

To cut a long story short, I pulled out all the HEAD stuff some weeks ago - see
http://bugzilla.mozilla.org/show_bug.cgi?id=238966#c16 . There may be interest
out there in tweaking it (and maybe doing parts of it more 'correctly' ;-) ),
but it is functional and I've been using it for weeks.
The patch should really have been posted here...
note: bug 238966's patch (which would fix this bug) does not address issue 1 of
comment 11, and probably should.
Find attached the second diff (for patch 2) of contentAreaUtils.js rewrite
(header sniffer code removed). There is still one outstanding issue, namely the
problem of determining the content encoding type (if any). Haven't had time to
get that sorted. But please review the remainder of this patch - have made the
suggested cleanups & tweaks (hopefully removed any rogue trailing whitespace,
named some variables better and improved commenting).
I hope to spend more time on the content encoding issue soon...
Flags: blocking1.8a3?
Comment on attachment 153933 [details] [diff] [review]
Diff contentAreaUtils.js for proposed patch

+  var fileInfo = null;// = new FileInfo(null, null, null, null);
+  if (!aChosenData)
+    fileInfo = getUriAndFileInfo(aURL, aDocument, aUserTypedUrl, contentType);

OK, so fileInfo is null if aChosenData was given.

Now:
+	 helperAppService.applyDecodingForExtension(fileInfo.fileExt,
contentEncodingType)) {

this will throw an exception (but it's caught... so that's not much of a
problem)

+    fileInfo: fileInfo,

this will also be null...

+  var source = useSaveDocument ? aDocument : fileInfo.nsiUri;

but this will throw an exception. Why not fill fileInfo from aChosenData?

++  var source = useSaveDocument ? aDocument : fileInfo.nsiUri;

hrm, this variable name (nsiUri) seems only marginally better. actually I'd
prefer uriObj to it. better would be just "uri", imo. (like in the AutoChosen
structure)

++    var nsiUri = makeURI(aURL, aOriginCharset)

please add trailing semicolons

++function makeUriAndPopulateFileInfo(aURL, aOriginCharset)

hrm, ok.... I'll let neil decide if he likes the name.

++    /* Is there a funky Mozilla way to extract an nsIURI from aDocument or do
++	 we just go straight on to (1)?

don't think there's a scriptable way

++	  return fI
trailing semicolon, please

++    if (!fI.nsiUri && (aURL == "about:blank"))
++	// NB. We will REALLY be stuffed later if we don't get a 'real' nsIURI
++	// object soon... Lets just hope for the best with aURL again...
++	fI.nsiUri = makeUriAndPopulateFileInfo(aURL, docCharset);

you called exactly the same function a few lines above. why will it succeed now
if it failed before?

+ * @param aFpP a structure (see caller) 

"see caller"? that firstly requires finding the caller. and it assumes there
can be only one.


review- because you need to take care of applyConversion as per mail
conversation
Attachment #153933 - Attachment is patch: true
Attachment #153933 - Flags: review?(cbiesinger) → review-
Attachment #153933 - Attachment is obsolete: true
Attachment #155271 - Attachment is obsolete: true
Here are 3 new attachments that address the content-encoding issue we've
discussed.
Will look at your new comments (#45) 'soon'...
P.
(In reply to comment #45)
> (From update of attachment 153933 [details] [diff] [review])
> +  var fileInfo = null;// = new FileInfo(null, null, null, null);
oops I should have trimmed that comment off

> +	 helperAppService.applyDecodingForExtension(fileInfo.fileExt,
> contentEncodingType)) {
> this will throw an exception (but it's caught... so that's not much of a
> problem)
That code has been ripped out now anyway

> +  var source = useSaveDocument ? aDocument : fileInfo.nsiUri;
> 
> but this will throw an exception. Why not fill fileInfo from aChosenData?
That was one of the original ideas behind fileInfo & the method
getUriAndFileInfo... dunno, haven't given it much thought

> ++  var source = useSaveDocument ? aDocument : fileInfo.nsiUri;
> 
> hrm, this variable name (nsiUri) seems only marginally better. actually I'd
> prefer uriObj to it. better would be just "uri", imo. (like in the AutoChosen
> structure)
I like to see some sort of distinction in the naming of the variable since we
are in a 'loosely-typed' environment, so I was looking for a name that reflects
the type of the variable, e.g. string, some sort of nsI object, etc... and since
'uri' is used all over the place I was keen to use a different name

> ++    var nsiUri = makeURI(aURL, aOriginCharset)
> please add trailing semicolons
damn. oops.

> ++function makeUriAndPopulateFileInfo(aURL, aOriginCharset)
> 
> hrm, ok.... I'll let neil decide if he likes the name.
> 
> ++    /* Is there a funky Mozilla way to extract an nsIURI from aDocument or do
> ++	 we just go straight on to (1)?
Damn, more crud I left in there

> trailing semicolon, please
****

> ++    if (!fI.nsiUri && (aURL == "about:blank"))
> ++	// NB. We will REALLY be stuffed later if we don't get a 'real' nsIURI
> ++	// object soon... Lets just hope for the best with aURL again...
> ++	fI.nsiUri = makeUriAndPopulateFileInfo(aURL, docCharset);
> 
> you called exactly the same function a few lines above. why will it succeed now
> if it failed before?
Because (aURL == "about:blank"), we trampled on fI.nsiUri that we set the first
time we called makeUriAndPopulateFileInfo(aURL, docCharset). And the things we
are about to do next are different from the first time we called
makeUriAndPopulateFileInfo(aURL, docCharset).

> + * @param aFpP a structure (see caller) 
> 
> "see caller"? that firstly requires finding the caller. and it assumes there
> can be only one.
Hopefully made that clearer.

> review- because you need to take care of applyConversion as per mail
> conversation
Done applyConversion stuff.
Have made above suggested changes (except renaming nsiUri ;-) ).
Will attach a new contentAreaUtils.js diff then mark for review.
[What's up with Bugzilla? Whenever I submit anything (new attachment OR
comment) the browser never seems to 'complete' and stays 'busy', but when I
reload the page from scratch I see my changes...]
Attachment #155274 - Attachment is obsolete: true
> ++    /* Is there a funky Mozilla way to extract an nsIURI from aDocument or do
> ++	 we just go straight on to (1)?

> don't think there's a scriptable way

You can easily get a URI string using the DOM apis.  You can also get the
nsIWebNavigation for the document and get its currentURI.
Comment on attachment 155273 [details] [diff] [review]
New nsIWebBrowserPersist.idl with extra flag

ok... no need to change the IID I guess since this interface is binary
compatible.
Attachment #155273 - Flags: review+
Comment on attachment 155276 [details] [diff] [review]
New nsWebBrowserPersist.cpp handling encoded docs

+	 nsCOMPtr<nsIExternalHelperAppService> helperAppService =
+	     do_GetService(NS_EXTERNALHELPERAPPSERVICE_CONTRACTID, &rv);


if you're ignoring the rv anyway, then you shouldn't ask for it... also, no
need to init rv, do_GetService will always write into it.

please check that this getservice succeeded.

+	 nsCOMPtr<nsIEncodedChannel> encChannel = do_QueryInterface( channel );

you MUST check that this do_QI succeeds. not all channel support this; in fact,
only HTTP does.

you can probably do all three checks in the if (sourceURL).


Oh, hm... isn't this onstartrequest called for each file in case of
saveDocument? So you should probably do this after the hashtable lookup;
probably in the
    if (data && data->mFile)
block, and use data->mOriginalLocation as uri. (or get the uri off the channel)
Attachment #155276 - Flags: review-
Comment on attachment 155280 [details] [diff] [review]
New contentAreaUtils.js handling encoded docs

> +  var source = useSaveDocument ? aDocument : fileInfo.nsiUri;
> 
> but this will throw an exception. Why not fill fileInfo from aChosenData?

this will still throw an exception.

>I like to see some sort of distinction in the naming of the variable since we

ok.

>Because (aURL == "about:blank"), we trampled on fI.nsiUri that we set the first
>time we called makeUriAndPopulateFileInfo(aURL, docCharset). And the things we
>are about to do next are different from the first time we called
>makeUriAndPopulateFileInfo(aURL, docCharset).

OK, but shouldn't the second call, inside the if (aURL == "about:blank"), have
taken care of that?

I am also not sure that this special-casing of about:blank is such a great
idea... on the other hand, it _is_ unlikely that users want to save a blank
page.


review- because I'd like to see a new patch with those changes, but it looks
good otherwise. thank you for doing this!
Attachment #155280 - Flags: review-
(In reply to comment #54)
> if you're ignoring the rv anyway, then you shouldn't ask for it...
I am checking rv now - this is what happens when one does a cut'n'paste ;-)

> also, no need to init rv, do_GetService will always write into it.
I always try to make sure things are initialised when created, regardless of
what later function changes them (even if it is a well-worn bit of code) - call
me pedantic :-). If you wish I'll not initialise it at creation time...

> please check that this getservice succeeded.
> you MUST check that this do_QI succeeds. not all channel support this;
> in fact, only HTTP does.
> you can probably do all three checks in the if (sourceURL).
> 
> Oh, hm... isn't this onstartrequest called for each file in case of
> saveDocument? So you should probably do this after the hashtable lookup;
> probably in the
>     if (data && data->mFile)
> block, and use data->mOriginalLocation as uri. (or get the uri off the channel)

Being more thorough with checks...
In fact I moved the code into a new private function, will make those last
suggestions easier in terms of moving the function call.
Adding code is one thing, but it seems adding a new function is another - I now
get a compile error:

/cygdrive/e/MozillaBuilds/BuildDir/mozilla/build/cygwin-wrapper
e:/MozillaBuilds/moz_tools/bin/nsinstall -m 644 nsIWindowCreator2.idl ../../dist/idl
make[2]: *** No rule to make target `_xpidlgen/nsIWindowCreator2.h', needed by
`export'.  Stop.
make[2]: Leaving directory
`/cygdrive/e/MozillaBuilds/BuildDir/mozilla/embedding/base'
make[1]: *** [export] Error 2
make[1]: Leaving directory `/cygdrive/e/MozillaBuilds/BuildDir/mozilla/embedding'
make: *** [all] Error 2


I was wondering if the header/interface for this class (nsWebBrowserPersist.h)
is affected in any way by the fact that many interfaces are frozen... I was
assuming not, but now I am unsure. Is that what the problem above is?
Maybe I need to do a make clean because I modified a .h file?
(In reply to comment #55)
> (From update of attachment 155280 [details] [diff] [review])
> > +  var source = useSaveDocument ? aDocument : fileInfo.nsiUri;
> > 
> > but this will throw an exception. Why not fill fileInfo from aChosenData?
> this will still throw an exception.
Fixed (fileInfo will not be null).

> OK. but shouldn't the second call, inside the if (aURL == "about:blank") have
> taken care of that?
Not if it didn't resolve the problem (i.e. not if it didn't find a 'good'
filename & extension).

> I am also not sure that this special-casing of about:blank is such a great
> idea... on the other hand, it _is_ unlikely that users want to save a blank
> page.
Maybe if ( or *when* ;-) ) this gets released in alpha, we can observe user
feedback... I've just found in practice this is of more benefit than harm. Too
late at the moment to say more :)

> review- because I'd like to see a new patch with those changes, but it looks
> good otherwise. thank you for doing this!
> 

Will submit patch when I fix WBP....
(In reply to comment #56)
> I was wondering if the header/interface for this class (nsWebBrowserPersist.h)
> is affected in any way by the fact that many interfaces are frozen... I was
> assuming not, but now I am unsure. Is that what the problem above is?
> Maybe I need to do a make clean because I modified a .h file?

strange... no, make clean should absolutely not be needed. no idea why that
error happens.

(In reply to comment #57)
> > OK. but shouldn't the second call, inside the if (aURL == "about:blank") have
> > taken care of that?
> Not if it didn't resolve the problem (i.e. not if it didn't find a 'good'
> filename & extension).

+      fI.nsiUri = makeUriAndPopulateFileInfo(aUserTypedUrl, docCharset);
+      fI.nsiUri = makeUriAndPopulateFileInfo(aURL, docCharset);

ok, nevermind me, I read aURL where you wrote aUserTypedUrl.
Flags: blocking1.8a3? → blocking1.8a3-
Attached patch All-inclusive patch (4 files) (obsolete) — Splinter Review
New patch, 4 modified files:

nsIWebBrowserPersist.idl
nsWebBrowserPersist.h
nsWebBrowserPersist.cpp
contentAreaUtils.js

This should address all issues raised in comments #54, #55.

Also had to add some special handling when saving link to a site - without
HEAD, content type unknown, so .html was missing from saved filename.

(BTW, compile problem (comment #56) went away with a make clean... strange...)
Attachment #155273 - Attachment is obsolete: true
Attachment #155276 - Attachment is obsolete: true
Attachment #155280 - Attachment is obsolete: true
Flags: blocking1.8a4?
*** Bug 257066 has been marked as a duplicate of this bug. ***
Attachment #156336 - Flags: review?
Attached patch All-inclusive patch (4 files) (obsolete) — Splinter Review
Fixed a trivial 1-char mistake ;)
Attachment #156336 - Attachment is obsolete: true
Attachment #157606 - Flags: review?(cbiesinger)
*** Bug 246633 has been marked as a duplicate of this bug. ***
Assignee: bzbarsky → Time_lord
Comment on attachment 157606 [details] [diff] [review]
All-inclusive patch (4 files)

this patch is not nice for people who want to apply it... you should have
created it like:
cd mozilla; cvs diff -u6p embedding/components xpfe/communicator  > foo.diff

nsWebBrowserPersist.cpp:
+void nsWebBrowserPersist::ModifyIfConversionRequired(nsIChannel *aChannel)
+{
+	nsresult rv = NS_OK;

wrong indentation on the nsresult line (seems to be a tab)

+    if (NS_FAILED(rv) || !encChannel)

no need to check both of these conditions. rv is a failure code if and only if
the QI returned null.

+    aChannel->GetOriginalURI(getter_AddRefs(thisURI));

you want GetURI I think, not GetOriginalURI. sorry for my misleading comment
54.

+	     NS_ASSERTION(helperAppService, "Where did the service go?");

well, ok... if you want... but it's a pointless assertion, since if the
do_GetService fails, you returned early. (above)

contentAreaUtils.js:

+ * @param aURL The String representation of the URL of the document being
saved,
+ * according to the browser window, or supplied with 'save-link-target-as'

some kind of formatting would be nice. for example:

@param aURL
       The string repre..
       according...
@param aUserTypedURL
       The URL ...

too tired for the rest of the review now... will continue tomorrow
Comment on attachment 157606 [details] [diff] [review]
All-inclusive patch (4 files)

contentAreaUtils.js:
+    fileInfo = getUriAndFileInfo(fileInfo, aURL, aDocument, aUserTypedUrl,
contentType);

why are you both passing fileInfo to this function, and also returning it from
it?


In file "contentAreaUtils.js":
------------------------------
340:   var source = useSaveDocument ? aDocument : fileInfo.nsiUri;

so, if aChosenData was passed, then nsiUri will be null. correct? that would be
not good.
it looks like you wrote this code to mainly work w/ passing documents to it if
you pass aChosenData, but it would be good to have this work w/ urls too. how
about always filling the file info from the passed data, and then setting
aDefaultFilename from aChosenData?


In file "contentAreaUtils.js":
------------------------------
370:	   var destFileName = (aChosenData ? aChosenData.file : fpParams.fp.file);
371:	   filesFolder = destFileName.clone();

destFileName is not a name, it's an object... can you name it "destFile"
instead?


In file "contentAreaUtils.js":
------------------------------
782: // Given aFileName, find the fileName without the extension on the end.
783: function getFileBaseName(aFileName, aFileExt)
784: {
785:   // Is there a better 'formal' way of doing this? Or an XPCom component?
786:   // Remove the file extension from aFileName (it was done this way prior to v1.8 also):
787:   return aFileName.replace(/\.[^.]*$/, "");
788: }
789: 

nsIURLParser::parseFileName could do it. but this is fine for now.
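For reference, the regex in that helper can be exercised on its own. A minimal standalone sketch of getFileBaseName() as quoted above (aFileExt is accepted but unused, matching the quoted patch):

```javascript
// Sketch of the getFileBaseName() helper quoted above.
// The regex strips the final ".ext" segment, if any.
function getFileBaseName(aFileName, aFileExt) {
  // Remove the file extension from aFileName.
  return aFileName.replace(/\.[^.]*$/, "");
}
```

Note that only the last dot-separated segment is removed, so "archive.tar.gz" becomes "archive.tar", and a name with no dot is returned unchanged.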


Looks good. r=me with the above changes, feel free to transfer it to a new
patch yourself.
Attachment #157606 - Flags: review?(cbiesinger) → review+
(In reply to comment #63)
> this patch is not nice for people who want to apply it... you should have
> created it like:
> cd mozilla; cvs diff -u6p embedding/components xpfe/communicator  > foo.diff
Ok... you mean so the file being diff'd has the full (relative) path to it? Ok,
will do...
You mentioned a little while back that I should put all the diffs together in
one patch - have I got that bit correct?
I'm not familiar with the way mozilla reviewers use these CVS diff 'patches' - I
guess to apply a patch you use some CVS commands? Is there a recommended set
of tools...?

> wrong indentation on the nsresult line (seems to be a tab)
Done

> no need to check both of these conditions. rv is a failure code if and only if
> the QI returned null.
ok, changed - that's my unfamiliarity with the Moz infrastructure.

> you want GetURI I think, not GetOriginalURI. sorry for my misleading comment
> 54.
Done - still seems to work ;-)

> well, ok... if you want... but it's a pointless assertion, since if the
> do_GetService fails, you returned early. (above)
Removed

> some kind of formatting would be nice.
Done

(In reply to comment #64)
> contentAreaUtils.js:
> +    fileInfo = getUriAndFileInfo(fileInfo, aURL, aDocument, aUserTypedUrl,
> contentType);
> why are you both passing fileInfo to this function, and also returning it from
> it?
Ummm... it means you can write
    myObj = SomeMethod(...)
which is easier for some to read (i.e. the method returns something), as opposed
to modifying the contents of the 'in' parameter. It's also a left-over from
when I was creating a new instance of the FileInfo object inside the method (but
now I can't, because I'm setting the default filename in it). Anyway, maybe I
could have changed it to not return the FileInfo object...

> In file "contentAreaUtils.js":
> ------------------------------
> 340:   var source = useSaveDocument ? aDocument : fileInfo.nsiUri;
> so, if aChosenData was passed, then nsiUri will be null. correct? that would be
> not good.
This bit was bothering me too. However in the scenarios I considered, this could
not occur, and I haven't managed to reproduce any problem.

> it looks like you wrote this code to mainly work w/ passing documents to it if
> you pass aChosenData, but it would be good to have this work w/ urls too. how
> about always filling the file info from the passed data, and then setting
> aDefaultFilename from aChosenData?
This is what I do, i.e with urls the default filename is put into the fileinfo
structure.

> destFileName is not a name, it's an object... can you name it "destFile"
> instead?
Yep

> Looks good. r=me with the above changes, feel free to transfer it to a new
> patch yourself.
Sorry, unfamiliar with your abbreviation "r=me". I think you mean this patch is
now ok with you (but put in the tweaks you've mentioned)?
I'm attaching the new patch... where to from here, how does it get into the main
code tree?
Attached patch All-inclusive patch (4 files) (obsolete) — Splinter Review
Attachment #157606 - Attachment is obsolete: true
Comment on attachment 158620 [details] [diff] [review]
All-inclusive patch (4 files)

Requesting final(?) review. Then a superreview? If so, from whom (or do I leave
that blank)?
Attachment #158620 - Flags: review?(cbiesinger)
(In reply to comment #65)
> Ok... you mean so the file being diff'd has the full (relative) path to it? 

Correct.

> You mentioned a little while back that I should put all the diffs together in
> one patch - have I got that bit correct?

Basically yes, but with the mentioned exception that the paths are incorrect.
Looks like your latest patch does it right - thanks.

> I'm not familiar with the way mozilla reviewers use these CVS diff 'patches' - I
> guess to apply a patch you use some CVS commands? Is there a recommended set
> of tools...?

No, they use the "patch" command :-) Like this:
cd mozilla; patch -p0 < name_of_patch_file

Recommended set of tools is, basically, cvs, diff and patch.


> This bit was bothering me too. However in the scenarios I considered, this could
> not occur, and I haven't managed to reproduce any problem.

It seems to me like it will occur if you want to save a URL, rather than a document.

> > how about always filling the file info from the passed data, and then setting
> > aDefaultFilename from aChosenData?
> This is what I do, i.e with urls the default filename is put into the fileinfo
> structure.

You only fill the default filename in those cases, I think.


> Sorry, unfamiliar with your abbreviation "r=me". I think you mean this patch is
> now ok with you (but put in the tweaks you've mentioned)?

oh sorry. yes, that's what it means - "r" means "review", "sr" would mean
"super-review". "r=me" means "you now have a review+ from me". similarly for
sr=<someone>

> I'm attaching the new patch... where to from here, how does it get into the main
> code tree?

once I mark review+ again, you ask for super-review, probably from bzbarsky@mit.edu.
Attachment #158620 - Flags: review?(cbiesinger) → review+
(In reply to comment #68)
> No, they use the "patch" command :-) Like this:
> cd mozilla; patch -p0 < name_of_patch_file
Ta :)

> > This bit was bothering me too. However in the scenarios I considered, this could
> > not occur, and I haven't managed to reproduce any problem.
> It seems to me like it will occur if you want to save a URL, rather than a
> document.
Saving URLs seems ok also. I want to have a closer look at this some time when I
can clear my brain, but for now I'm satisfied that it works...

>> > about always filling the file info from the passed data, and then setting
>> > aDefaultFilename from aChosenData?
>> This is what I do, i.e with urls the default filename is put into the fileinfo
>> structure.
> You only fill the default filename in those cases, I think.
Actually you are right, I recall the original reason I created the FileInfo
structure and pass the work off to getUriAndFileInfo(...) was to make the code
more modular (and so it could be used by other code). But regardless,
getUriAndFileInfo *does* fill FileInfo in most cases :)

> once I mark review+ again, you ask for super-review, probably from
> bzbarsky@mit.edu.
Thanks for your help, will do.
Attachment #158620 - Flags: superreview?(bzbarsky)
Comment on attachment 158620 [details] [diff] [review]
All-inclusive patch (4 files)

>Index: embedding/components/webbrowserpersist/public/nsIWebBrowserPersist.idl
>+  /**
>+   * Let the WebBrowserPersist decide during OnStartRequest

The fact that there is an OnStartRequest is really an implementation detail. 
Just remove the "during OnStartRequest" part, please.

>Index: embedding/components/webbrowserpersist/src/nsWebBrowserPersist.h

>+    void ModifyIfConversionRequired(nsIChannel *aChannel);

I'd call this method ApplyConversionIfNeeded(), myself...  Or
SetApplyConversionIfNeeded...  If you want to keep the "Modify" and
"ConversionRequired" parts, at least make it "ModifyWhetherConversionRequired"
(longer, but more grammatically correct).

>Index: embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp

>+        // If PERSIST_FLAGS_AUTODETECT_APPLY_CONVERSION is set in mPersistFlags this
>+        // is because the process that requested a save (eg save page or save link
>+        // target)

I don't think that parenthetical is needed.  In fact, I don't think we need to
explain _why_ one would set this flag; just comment on what we should do with
it.  So:

 // If PERSIST_FLAGS_AUTODETECT_APPLY_CONVERSION is set in
 // mPersistFlags, try to determine whether this channel needs to apply
 // Content-Encoding conversions

or something.

> +void nsWebBrowserPersist::ModifyIfConversionRequired(nsIChannel *aChannel)

>+    // Set the default conversion preference?
>+    encChannel->SetApplyConversion(PR_TRUE);

The code in contentAreaUtils.js defaulted to _false_ for this.	I'd like to see
this code do the same.

>+    if (extension.IsEmpty())
>+        return;

Why?  Empty extension is ok as far as this code is concerned... It's none of
its business what ApplyDecodingForExtension plans to do with the extension. 
Please remove this check

>+        if (NS_SUCCEEDED(rv) && !encType.IsEmpty())

Again, please remove the IsEmpty() check.

>+            PRBool applyConversion = PR_TRUE;
>+            helperAppService->ApplyDecodingForExtension(extension,
>+                encType, &applyConversion);

Check return value of ApplyDecodingForExtension, and only call
SetApplyConversion here if it succeeded.

Also, please line up the args to the function (one per line if needed).

>Index: xpfe/communicator/resources/content/contentAreaUtils.js

>+  internalSave(aURL, null, null,
>+    aFileName, aShouldBypassCache, aFilePickerTitleKey, null);

Again, I would be happier with lining up the args and using more lines if
needed.

>+ * internalSave: Used when saving a document or URL. This functionality
>+ * rearranged for Mozilla 1.8+ to get rid of the HeaderSniffer code.

The rearranged part will be clear from the CVS log.  Please remove that
sentence.

>+ * <ul>

Please don't put HTML in code comments...  If you want something that looks
like a list, use 'o' or '*' characters for the list items.

>+ * <li>Create a 'Persist' object (which will perform the saving in the
>+ * background) and then start it.</li>

"creates" and "starts".

>+ * @param aURL The String representation of the URL of the document being saved,
>+ *        according to the browser window or supplied with 'save-link-target-as'

No need for the "according ...." part.	Remove it, please.

>+ * @param aUserTypedUrl The URL as supplied by the user. This should be the same
>+ *        as aURL except when this document is not yet available
[etc]

Is there a reason not to push this off to callers?  For example, callers could
pass in the userTypedURL for cases when the actual URL they have is
about:blank.  Or are there cases when you would want to have the about:blank in
this function?

I have to be frank -- I'm not sure why we're doing this.  It seems to me that
it will lead to inconsistencies.  Say the user types a URL in the URL bar and
then hits "save" and we pretend that we're saving what the user typed, but we
have a document object (for the about:blank) document.... so we will let the
user save as "web page, complete", which will save the about:blank document. 
Or am I missing something?

>+ * @param aDocument The document (already cached/rendered & available)

I'd say "The document object to be saved", no more.  In fact, this document may
be neither cached nor rendered (all that's required here is that it be a DOM
document).

>+ * @param aDefaultFileName The caller-provided filename, if any (supplied for
>+ *        example when this method is called by saveURL)

No need to list example callers in the API documentation.  I'd call this "The
caller-provided suggested filename if we don't find a better one"

>+ * @param aFilePickerTitleKey Alternate Title for the FilePicker

Don't capitalize "Title".  And "file picker" should be two words, both
lower-case.

>+ * @param aChosenData If non-null this contains an instance of object AutoChosen

This is ok, I guess (apart from being unused thus far).  I can't say I like the
name, but I can't think of a better one, so....

>+function internalSave(aURL, aUserTypedUrl, aDocument, aDefaultFileName,
>+  aShouldBypassCache, aFilePickerTitleKey, aChosenData)

Again, please line up the args.

>+  // Note: aDocument == null when this code is used by the save-link-as...

No need for "the" before "save-link-as".  In fact, there may be no need for
explaining _when_ aDocument is null; just comment that it may be.

>+  // Find the URI object for aURL and the FileName/Extension to use when saving.
>+  // FileName/Extension will be ignored if aChosenData supplied.
>+  var fileInfo = new FileInfo(null, null, null, null, aDefaultFileName);
>+  if (!aChosenData)
>+    fileInfo = getUriAndFileInfo(fileInfo, aURL, aDocument, 
>+                                 aUserTypedUrl, contentType);

Why not just have a function that takes aURL, aDocument, aUserTypedUrl,
contentType, and aDefaultFileName and returns a FileInfo?

If you do want to keep this as is, the function doesn't need to return a
fileinfo (since it just writes to the fields of the one you pass in) and would
be better called initFileInfo()....

>+  // Note when this method is called by save-link-target-as, aDocument==null
...
>+  // NOTE (May 2004): This to be put on ice because nsWebBrowserPersist::SaveURI

Please don't check this whole comment in.  Just file a followup bug on this? 
Especially since I'm not sure the approach this comment describes is the best
one....

>+  var isDocument = aDocument != null && saveMode;

I know you just modified the existing code, but make that

  var isDocument = aDocument != null && saveMode != SAVEMODE_FILEONLY;

please?  That makes it clearer what's going on...

>+  // If aChosenData is null then the file picker is shown.
>+  if (!aChosenData) {
>+    if (!manageFilePicker(fpParams))

I'd call that method "poseFilePicker".

>+                        ((fpParams.saveMode & SAVEMODE_COMPLETE_DOM && fpParams.fp.filterIndex == 0) ||
>+                         (fpParams.saveMode & SAVEMODE_COMPLETE_TEXT && fpParams.fp.filterIndex == 2));

Again, I know you just slightly modified it, but please put parens around the
'&' expressions?  I can't recall whether '&' or '&&' has higher precedence in
JS, so it would be good to disambiguate this.
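For the record, in JavaScript `==` binds more tightly than `&`, and `&` more tightly than `&&`, so the quoted lines already parse as the parenthesized form; the explicit parens just remove the doubt. A quick sanity check:

```javascript
// == binds tighter than &, and & tighter than &&, so
// `a & b && c == d` parses as `(a & b) && (c == d)`.
const implicitParse = 3 & 2 && 1 == 1;      // (3 & 2) -> 2 (truthy), then && (1 == 1)
const explicitParse = (3 & 2) && (1 == 1);  // same grouping written out
// If && bound tighter than &, you would instead get 3 & true, i.e. the number 1.
const otherGrouping = 3 & (2 && 1 == 1);    // 3 & true -> 1
```

So the unparenthesized and parenthesized forms produce the boolean true, while the alternative grouping would produce the number 1.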

>   const flags = nsIWBP.PERSIST_FLAGS_NO_CONVERSION | nsIWBP.PERSIST_FLAGS_REPLACE_EXISTING_FILES;

You need to remove "nsIWBP.PERSIST_FLAGS_NO_CONVERSION" from that line.

>       var filesFolderLeafName = getStringBundle().formatStringFromName("filesFolder",
>                                                                        [nameWithoutExtension],
>                                                                        1);
>-
>       filesFolder.leafName = filesFolderLeafName;

Don't make random whitespace changes, please.

>+/**
>+ * Structure for holding info about a URL and target filename (for when it is
>+ * locally saved).


"URL and the target filename it should be saved to"

>+function FileInfo(aFileName, aFileBaseName, aFileExt, aNsiUri, aSuggestedFileName) {
>+  this.fileName = aFileName;
>+  this.fileBaseName = aFileBaseName;
>+  this.fileExt = aFileExt;
>+  this.nsiUri = aNsiUri;
>+  this.suggestedFileName = aSuggestedFileName;

What is the difference between suggestedFileName and fileName?	Please document
that.

I'd call the "nsiUri" member just "uri" and document that it should be an
nsIURI object.

>+/**
>+ * Structure for holding info about automatically supplied parameters for the
>+ * internalSave(...)

No need for "the" before "internalSave".

>+/**
>+ * Constructs a new URI from the supplied URL (string) and charset, then uses
>+ * QueryInterface is used on the resulting URI object to obtain a more specific
>+ * type of URI (containing file info).

This setup (QI magically adding more stuff to the interfaces flattened on the
XPCOM wrapper) may actually become unsupported (and stop working) in the
not-too-distant future.  I'd rather you didn't do it.  Store both a "uri" and a
"url" in the fileinfo (the latter the result of QIing "uri" to nsIURL). 
Document that that's what the fields mean, then use "url" as appropriate in the
code....

Instead of setting .uri on the file info, you could move the "QI to nsIURL and
set .url" part into a setURI method on the object, then pass URIs to that
method when you want to set them.
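A rough sketch of that setURI() idea, purely hypothetical: QueryInterface is stubbed with a string token here, since real nsIURL/Components objects only exist inside Mozilla. In the tree the call would be aUri.QueryInterface(Components.interfaces.nsIURL) inside the try/catch:

```javascript
// Hypothetical sketch: keep both the generic uri and, when the object
// supports it, its more specific nsIURL form in a separate field.
function FileInfoUriHolder() {
  this.uri = null;  // nsIURI: always set by setURI()
  this.url = null;  // nsIURL: set only when the uri can be QI'd to nsIURL
}

FileInfoUriHolder.prototype.setURI = function (aUri) {
  this.uri = aUri;
  try {
    // Real code: aUri.QueryInterface(Components.interfaces.nsIURL)
    this.url = aUri.QueryInterface("nsIURL");
  } catch (e) {
    this.url = null;
  }
};
```

Callers then use fileInfo.url when they need URL-specific fields (fileName, fileExtension) and fall back to fileInfo.uri otherwise, with no reliance on QI flattening extra interfaces onto the wrapper.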

>+/**
>+ * Given a passed FileInfo structure, determine if all its important

"whether all"

>+ * @return true if the FileInfo structure is considered 'complete' and ready.

"and false otherwise", right?

Maybe this should be called validateFileInfo()?  That's more in keeping with
what it does.  Or even fileInfoIsValid()?

For that matter, is there a reason not to make this a member isValid() function
on FileInfo?
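As a sketch, making it a member function could look like this. The field names follow the quoted patch, but the rules inside isValid() are illustrative only, not the patch's actual checkFileNameAndExt() logic:

```javascript
// Hypothetical: FileInfo with the validity check as a member function.
function FileInfo(aFileName, aFileBaseName, aFileExt, aUri, aSuggestedFileName) {
  this.fileName = aFileName;
  this.fileBaseName = aFileBaseName;
  this.fileExt = aFileExt;
  this.uri = aUri;
  this.suggestedFileName = aSuggestedFileName;
}

// Returns true only when the fields needed for saving are present
// (illustrative criteria, not the real validation rules).
FileInfo.prototype.isValid = function () {
  return !!(this.fileName && this.uri);
};
```

Callers would then write fileInfo.isValid() instead of passing the structure to a free function.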

>+function checkFileNameAndExt(aFI, aContentType)
....
>+  aFI.fileExt      = (uri && uri.fileExtension ? uri.fileExtension : null);

Why?  Don't you just want to use getDefaultExtension() here? It seems that
you're ignoring the filename in the fileinfo altogether if the URI has an
extension... is that desired?  If so, why?

>+ * Determine what the 'default' filename string is, its file extension and the
>+ * filename without the extension. This filename is used when prompting the user
>+ * for confirmation in the FilePicker dialog.

"file picker" is two lowercased words.

>+ * @param aURL The String representation of the URL of the document being saved,
>+ *        according to the browser window, or supplied with 'save-link-target-as'

Remove everything after (and including) the comma, please.  It doesn't matter
where the URL came from.

>+ * @param aDocument The document (already cached/rendered & available)

Again, remove the parenthetical (which is not even true).  Just say "the
document object being saved, if any"

>+ * @param aUserTypedUrl The URL as supplied by the user.

This is repeating a lot of the comments of internalSave.  Please just point to
those with an @see.

>+function getUriAndFileInfo(aFI, aURL, aDocument, aUserTypedUrl, aContentType)

I haven't reviewed this method in detail, thus far, because I think it can be
simplified greatly if the about:blank issue is handled up front (or not handled
at all, for now) instead of passing it all the way down into this code.

>+function manageFilePicker(aFpP)

>+  fp.init(window, bundle.GetStringFromName(titleKey),
>+    Components.interfaces.nsIFilePicker.modeSave);

Please line up the args (line up "Components" with "window").

>+  fp.defaultString = getNormalizedLeafName(aFpP.fileInfo.fileName,
>+    aFpP.fileInfo.fileExt);

And here.

>+  appendFiltersForContentType(fp, aFpP.contentType, aFpP.fileInfo.fileExt,
>+    aFpP.saveMode);

And here.

>+function getFileBaseName(aFileName, aFileExt)
>+{
>+  // Is there a better 'formal' way of doing this? Or an XPCom component?

There is not.  Just remove this comment.

>+  // Remove the file extension from aFileName (it was done this way prior to v1.8 also):

And remove the parenthetical here.

>+function getDefaultFileName(aDefaultFileName, aDocument, aDocumentURI)

You seem to have changed the order of things this function looks at.  Please
don't do that.

>+  if (aDefaultFileName && (aDefaultFileName != ""))

If the first test succeeds, the second will always succeed.  Just test
|aDefaultFileName|, since "" tests false in JS.
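This works because the empty string is falsy in JavaScript, so the single truthiness test covers null, undefined, and "" at once:

```javascript
// "" is falsy, so `if (aDefaultFileName)` already rejects the empty
// string as well as null/undefined; the extra `!= ""` check is redundant.
function hasDefaultFileName(aDefaultFileName) {
  return aDefaultFileName ? true : false;
}
```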

>+  if (aDocumentURI)

Can aDocumentURI be null here?

I did not review this following part too carefully, since it needs to be put
back in the right order.  I'd very much appreciate a diff -w (in addition to
the normal diff without -w); that would make reviewing this part much simpler.

Marking sr-, but this is great stuff, and it's pretty close to being ready to
go.  I won't be able to look again until Sunday morning, but if you can get an
updated patch up by then I'll look then.

Thanks again for doing this!
Attachment #158620 - Flags: superreview?(bzbarsky) → superreview-
ok, I finally figured out what the caller-defined filename passed to saveURL is.
it is basically the text of an <a href> when you click it, with "save link
target as" or shift-clicking. so, I fully agree w/ bz, do not change the order.

Attached patch All-inclusive patch (4 files) (obsolete) — Splinter Review
Attachment #158620 - Attachment is obsolete: true
(In reply to comment #70)
...<snip>
Made various suggested changes above

> >+    if (extension.IsEmpty())
> >+        return;
> Why?  Empty extension is ok as far as this code is concerned...
Just following the way similar code was implemented elsewhere...

> >+        if (NS_SUCCEEDED(rv) && !encType.IsEmpty())
> Again, please remove the IsEmpty() check.
Done

> Check return value of ApplyDecodingForExtension, and only call
> SetApplyConversion here if it succeeded.
good point, done

> The rearranged part will be clear from the CVS log.  Please remove that
> sentence.
> Please don't put HTML in code comments...  If you want something that looks
> like a list, use 'o' or '*' characters for the list items.
Done

> >+ *        as aURL except when this document is not yet available
> [etc]
> 
> Is there a reason not to push this off to callers?  For example, callers could
> pass in the userTypedURL for cases when the actual URL they have is
> about:blank.  Or are there cases when you would want to have the about:blank in
> this function?
> I have to be frank -- I'm not sure why we're doing this.  It seems to me that
> it will lead to inconsistencies.  Say the user types a URL in the URL bar and
> then hits "save" and we pretend that we're saving what the user typed, but we
> have a document object (for the about:blank) document.... so we will let the
> user save as "web page, complete", which will save the about:blank document. 
> Or am I missing something?
Maybe this can do with some more thought, but all I can say for now is that I
spent quite a while trying to get this code 'just right'. It seems that quite a
few issues can occur depending on when you trigger the save....

> >+ * @param aChosenData If non-null this contains an instance of object AutoChosen
> This is ok, I guess (apart from being unused thus far).
The presence of this allows internalSave to be used more easily by other code,
i.e in my case, a 'save-all-tabs' feature - but I gather I need to run that by
someone like Neil. (It all works ;-) )

> No need for "the" before "save-link-as".  In fact, there may be no need for
> explaining _when_ aDocument is null; just comment that it may be.
One of the factors behind being verbose in my comments has been to document
things that would hopefully help anyone else reading it understand just how &
why things typically get called.
Cleaned up comment :-)

> >+  var fileInfo = new FileInfo(null, null, null, null, aDefaultFileName);
> >+  if (!aChosenData)
> >+    fileInfo = getUriAndFileInfo(fileInfo, aURL, aDocument, 
> Why not just have a function that takes aURL, aDocument, aUserTypedUrl,
> contentType, and aDefaultFileName and returns a FileInfo?
> If you do want to keep this as is, the function doesn't need to return a
> fileinfo (since it just writes to the fields of the one you pass in) and would
> be better called initFileInfo()....
Cleaned up, but don't quite agree with calling the method initFileInfo because
it's much more than that...

> >+  // NOTE (May 2004): This to be put on ice because nsWebBrowserPersist::SaveURI
> 
> Please don't check this whole comment in.  Just file a followup bug on this? 
> Especially since I'm not sure the approach this comment describes is the best
> one....
... removed... as much a note to myself. Who knows - I /may/ find time after
this to implement save-link-as for complete web pages ;-)

> I know you just modified the existing code, but make that
... Done

> I'd call that method "poseFilePicker".
ok

> >   const flags = nsIWBP.PERSIST_FLAGS_NO_CONVERSION |
> > nsIWBP.PERSIST_FLAGS_REPLACE_EXISTING_FILES;
> You need to remove "nsIWBP.PERSIST_FLAGS_NO_CONVERSION" from that line.
oops, that's a good point :-[ done

> I'd call the "nsiUri" member just "uri" and document that it should be an
> nsIURI object.
aahhhh... have had this discussion a couple of times... it was originally "uriObj".
What I was trying to get across to the reader is that the variable is an nsI
object (and not just a lowly string). I.e, looking at "uri" (or "url"), how is
the reader to know, other than tracing it down to where it's initialised?
Ok, have changed it.

> >+ * Constructs a new URI from the supplied URL (string) and charset, then uses
> >+ * QueryInterface is used on the resulting URI object to obtain a more specific
> >+ * type of URI (containing file info).
> 
> This setup (QI magically adding more stuff to the interfaces flattened on the
> XPCOM wrapper) may actually become unsupported (and stop working) in the
> not-too-distant future.  I'd rather you didn't do it.  Store both a "uri" and a
> "url" in the fileinfo (the latter the result of QIing "uri" to nsIURL). 
> Document that that's what the fields mean, then use "url" as appropriate in the
> code....
> 
> Instead of setting .uri on the file info, you could move the "QI to nsIURL and
> set .url" part into a setURI method on the object, then pass URIs to that
> method when you want to set them.
This is the only thing from your comments that I haven't looked
at/actioned/changed. It works for now and I can come back to it.

> >+ * @return true if the FileInfo structure is considered 'complete' and ready.
> "and false otherwise", right?
Is there an alternative to a boolean? Sorry, couldn't resist ;-)

> For that matter, is there a reason not to make this a member isValid() function
> on FileInfo?
... because JS is not my forte ;-) So maybe I could do this some time in the future

> >+  aFI.fileExt      = (uri && uri.fileExtension ? uri.fileExtension : null);
> Why?  Don't you just want to use getDefaultExtension() here? It seems that
> you're ignoring the filename in the fileinfo altogether if the URI has an
> extension... is that desired?  If so, why?
Right after this I check for a blank filename, so setting fileExt above does no harm

> I haven't reviewed this method in detail, thus far, because I think it can be
> simplified greatly if the about:blank issue is handled up front (or not handled
> at all, for now) instead of passing it all the way down into this code.
What I was trying to achieve was one single method where the filename & ext is
figured out (which - in my tinkering - seems to need to include handling of
about:blank). The whole thing with the browser having a 'special' about:blank
page seems to create problems...
I encountered all those issues months back, but maybe I need to look at it again.
Soon. ;-)

> >+  // Is there a better 'formal' way of doing this? Or an XPCom component?
> There is not.  Just remove this comment.
... as much a note to you guys who know the code better :-) biesi mentioned
another way, but this is nice & terse - comment removed.

> >+function getDefaultFileName(aDefaultFileName, aDocument, aDocumentURI)
> You seem to have changed the order of things this function looks at.  Please
> don't do that.
Ok... and potentially I was skipping the first catch block.
Fixed.

> If the first test succeeds, the second will always succeed.  Just test
> |aDefaultFileName|, since "" tests false in JS.
cool

> >+  if (aDocumentURI)
> Can aDocumentURI be null here?
Dunno, haven't thought about it... first one removed but the second remains.

> Marking sr-, but this is great stuff, and it's pretty close to being ready to
> go.  I won't be able to look again until Sunday morning, but if you can get an
> updated patch up by then I'll look then.
> 
> Thanks again for doing this!

Ok, attaching a new patch...
Attachment #159391 - Flags: superreview?(bzbarsky)
Attached file diff -w of contentAreaUtils.js (obsolete) —
> Just following the way similar code was implemented elsewhere...

Copy-paste is a bad way to do things... ;)  You were moving existing JS code
into C++; the thing to do was to just do what the JS code does.

> but all I can say for now is that I spent quite a while trying to get this
> code 'just right'

My point is that as far as I can tell it's wrong and introduces a lot of
unnecessary complexity into this patch.  I don't think I can in good conscience
give sr to this code; I'd prefer it spun off into a separate bug and given a lot
of thought.  (needs fixing)

> but don't quite agree with calling the method initFileInfo because it's much
> more than that

Like what?  It just sets the various internal fields of the FileInfo, no?

> It works for now and I can come back to it.

Famous last words.  This really needs to be fixed before I'll mark sr+ on the
patch... (needs fixing)

> Is there an alternative to a boolean?

Well, this is JS.  You can't restrict the return values to booleans in the
function declaration, so if they are so restricted that needs to be clearly
documented.

>> is there a reason not to make this a member isValid() function on FileInfo?
> maybe I could do this some time in the future

I'd ask around on irc (maybe even catch me there?) on how to do this.  I'd
really prefer this change to happen before this patch lands.  (needs fixing)

> Right after this I check for a blank filename, so setting fileExt above does
> no harm

Actually, I think it does (or rather this code as a whole does).  This function,
as written, computes the extension solely from the URI.  The old code used the 
"default filename" (which could come from something other than the URI) to
compute the extension.  That approach was a lot better, imo.  (needs fixing)

The issues I mention in this comment (the ones marked "needs fixing") really
need to be addressed before the next round of review....
(In reply to comment #75)
> Copy-paste is a bad way to do things... ;)  You were moving existing JS code
> into C++; the thing to do was to just do what the JS code does.
It sure is... Actually, it was a c'n'p of other C++ code. Anyway, it was a
starting point since I don't know the codebase ;-)

> My point is that as far as I can tell it's wrong and introduces a lot of
> unnecessary complexity into this patch.  I don't think I can in good conscience
> give sr to this code; I'd prefer it spun off into a separate bug and given a lot
> of thought.  (needs fixing)
Ok, have pulled it out. I was having a lot of problems in the early days of this
patch, but it seems to be ok now. So maybe there isn't a need for all that stuff
(backs away sheepishly ;-) )

> > but don't quite agree with calling the method initFileInfo because it's much
> > more than that
> Like what?  It just sets the various internal fields of the FileInfo, no?
Calling something "init" seems appropriate when initialising something to zero,
or defaults, etc, but not for some code which is figuring out a whole lotta
stuff. I dunno, may be personal preference.

> Famous last words.  This really needs to be fixed before I'll mark sr+ on the
> patch... (needs fixing)
Sure, heh heh... ok, changed - see if that's what you meant.

> > Is there an alternative to a boolean?
> Well, this is JS.  You can't restrict the return values to booleans in the
> function declaration, so if they are so restricted that needs to be clearly
> documented.
Ok, true enough, but if someone normally uses a variable to hold a boolean and
in some cases puts something else in it, that's pretty cruel to others using
the code. (No flames please, I'm sure many may disagree.)
Yep, good documentation helps.

> >> is there a reason not to make this a member isValid() function on FileInfo?
Done

> > Right after this I check for a blank filename, so setting fileExt above does
> > no harm
> Actually, I think it does (or rather this code as a whole does).  This function,
> as written, computes the extension solely from the URI.  The old code used the 
> "default filename" (which could come from something other than the URI) to
> compute the extension.  That approach was a lot better, imo.  (needs fixing)
Well taking all code into context, first the URI is used (which is a replacement
for the way the HEAD info was used), /then/ if this fails the "default filename"
is used (as in the old code).
Anyway, things may be easier to follow now I've taken out the special
"about:blank" code...

> The issues I mention in this comment (the ones marked "needs fixing") really
> need to be addressed before the next round of review....

Ok, all ready to go, will attach, and a new diff -w....
Attached patch All-inclusive patch (4 files) (obsolete) — Splinter Review
Attachment #159391 - Attachment is obsolete: true
Attachment #159468 - Flags: superreview?(bzbarsky)
Attached file diff -w of contentAreaUtils.js (obsolete) —
Attachment #159393 - Attachment is obsolete: true
Attachment #159391 - Flags: superreview?(bzbarsky) → superreview-
Comment on attachment 159468 [details] [diff] [review]
All-inclusive patch (4 files)

>Index: embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp

I just realized that the flag setter here should probably assert if both the
"guess yourself" flag and the "no conversion" flag are set.

>Index: xpfe/communicator/resources/content/contentAreaUtils.js
>@@ -216,13 +216,14 @@ function findParentNode(node, parentNode
>+  internalSave(aURL, null, null,
>+               aFileName, aShouldBypassCache, aFilePickerTitleKey, null);

What's up with the weird linebreak positioning?  Push more stuff to that first
line?

>+  var browser = getBrowser();

I thought we'd decided to get rid of the aUserTypedURL stuff until we have a
chance to sort out what, if anything, it should do.  Please remove these
vestiges of it... (including the comments that no longer correspond to the
code, etc.).

> Calling something "init" seems appropriate when initialising
> something to zero, or defaults, etc

In Mozilla, init() is something that needs to be called after the constructor
to make an object usable.  That's exactly the situation here.  Read on, though.

I still fail to see why this function is not simply rolled into the
constructor (note that most of the constructor args are currently unused!)....

>+ * Structure for holding info about a URL and the target filename it should be
>+ * saved to. This structure is populated by the getUriAndFileInfo(...).

No need for "the" before function name (and again, I think the constructor
should populate the structure).

>+ * Given a passed FileInfo structure, determine whether all its important fields
>+ * are valid and hence whether we have a filename/ext and nsIURI object to use
>+ * in the internalSave method (including in the file picker if applicable).

So I checked again, read this code carefully, and this method definitely only
uses the URI to set the extension.  That's not so bad, because it's only called
when we are trying to set the extension from the URI.  But this needs to be
clearly documented.  Given that, isValid() seems to be a bad name for this
function, too....

Frankly, all this code seems very cumbersome.  Also, why is lack of default
extension considered cause for not using the filename from the URI?

The old architecture ("stick all available data into a function to get a
default filename, then pick a default extension") was much better, in my
opinion... In the new setup the flow of control is very unclear.

>+ * Constructs a new URI from the supplied URL (string) and charset, then uses
...
>+ * @return an nsIURI object based on aURL.

This function doesn't return anything, actually...

>+ * @return a FileInfo structure (for the internalSave method to use).
>+ */
>+function getUriAndFileInfo(aFI, aURL, aDocument, aUserTypedUrl, aContentType)

This function doesn't return anything.....

>+    // 1. If we can get a filename/ext from aURL then use it:
>+    makeUriAndPopulateFileInfo(aFI, aURL, docCharset);

>+    // 2. We still don't have a filename so then call the 'getDefault' methods
>+    // from the previous incantation of this code.

Won't this duplicate the work done by makeUriAndPopulateFileInfo?  Which is
what makes me question the need for makeUriAndPopulateFileInfo at all.

Also, don't comment about "previous incantation", etc.  The code is just the
code.  People who want history get it via CVS.

>+    // If aFI.fileExt is still blank, consider: aFI.suggestedFileName is supplied
>+    // if saveURL(...) was the original caller (hence both aContentType and
>+    // aDocument are blank). If they were saving a link to a website then make
>+    // the extension .html

I'd really rather not do this.  It's pretty wrong, in my opinion... All sorts
of non-HTML stuff gets served via HTTP.

Again, please leave as much of the old (carefully written and well-tested)
filename/file-extension code around as possible instead of reinventing it all.

>+function getDefaultFileName(aDefaultFileName, aDocumentURI, aDocument)

>+  if (aDocumentURI)

The old code never had this null.  Where in your code could it become null,
exactly?

Also, this is breaking the Content-Disposition headers for the case when we
_do_ have it around (when aDocument is not null).  That's OK for a followup
bug, assuming it gets fixed, I guess.
Attachment #159468 - Flags: superreview?(bzbarsky) → superreview-
(In reply to comment #79)
> I just realized that the flag setter here should probably assert if both the
> "guess yourself" flag and the "no conversion" flag are set.
Done

> >+  internalSave(aURL, null, null,
> >+               aFileName, aShouldBypassCache, aFilePickerTitleKey, null);
> What's up with the weird linebreak positioning?  Push more stuff to that first
> line?
ok... Everywhere else the method was called there were 3 args on the first
line and 4 on the second... it just so happens it looks bad when there are lots
of nulls. So I changed it.

> In Mozilla, init() is something that needs to be called after the constructor
> to make an object usable.  That's exactly the situation here.  Read on, though.
What I was trying to do was make my method name meaningful and more obvious to
the reader. Calling lots of methods initSomething and initThisnThat doesn't help
anyone a great deal, except those who already know the codebase (IMHO), but let's
not get into those discussions here, for fear of offending many.
Anyway, I changed it.

> I still fail to see why this function  is not simply rolled into the
> constructor, (note that most of the constructor args are currently unused!)....
The FileInfo structure and getUriAndFileInfo method used to be much more
flexible, but with successive reviews they are getting less and less so, and
morphing into something that they weren't before.

> No need for "the" before function name (and again, I think the constructor
> should populate the structure).
... running all the extra code and populating the structure is largely a waste
of time when aChosenData!=null.

> when we are trying to set the extension from the URI.  But this needs to be
> clearly documented.  Given that, isValid() seems to be a bad name for this
> function, too....
J.H.C, you just suggested a couple of days ago that I call it 'isValid'.

> Frankly, all this code seems very cumbersome.  Also, why is lack of default
> extension considered cause for not using the filename from the URI?
It seemed many things didn't like it (especially WBP) when saving a URI where an
extension couldn't be discovered. Save-link-as could fail. I couldn't save an
image for example.
This is all so far back in the early days of this patch and having spent so many
hours uncovering all these issues, I'm nervous seeing some of these bits pulled
apart. And I'm concerned that I'll feel the urge to run away very fast in the
opposite direction.

> The old architecture  ("stick all available data into a function to get a
> default filename, then pick a default extension") was much better, in my
> opinion... In the new setup the flow of control is very unclear.
That was the whole idea behind getUriAndFileInfo() originally - a one-stop-shop
for discovering filename & extension. But now it keeps on changing.

> >+ * @return an nsIURI object based on aURL.
> This function doesn't return anything, actually...
It did until 2 days ago

> >+ * @return a FileInfo structure (for the internalSave method to use).
> This function doesn't return anything.....
It did until 2 days ago

> Won't this duplicate the work done by makeUriAndPopulateFileInfo?  Which is
> what makes me question the need for makeUriAndPopulateFileInfo at all.
No, it doesn't do the same things. But since the middle of getUriAndFileInfo has
been gutted, makeUriAndPopulateFileInfo is now only called once (was 3 times
before), so I've got rid of it.

> >+    // If aFI.fileExt is still blank, consider: aFI.suggestedFileName is supplied
> >+    // if saveURL(...) was the original caller (hence both aContentType and
> >+    // aDocument are blank). If they were saving a link to a website then make
> >+    // the extension .html
> I'd really rather not do this.  It's pretty wrong, in my opinion... All sorts
> of non-HTML stuff gets served via HTTP.
Fine. But when in Moz 1.7 I select save-link-as for http://www.xyz.com and it
suggests a filename of "http _www.xyz.com_.htm", where does it get the .htm
from? The HEAD request?
I posed this question some weeks back, and pretty much the only answer I got
was that if you are saving a URL and there is no obvious file extension (with
which to take a guess as to the probable file type), then the odds are that the
target is html. The chances of something else are negligible.
If you can describe a better approach, I'm all ears (and would welcome some
example sites where such issues can be tested), although I'd rather see this as
a followup.
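For reference, that fallback heuristic amounts to roughly the following (a simplified standalone sketch, not the actual contentAreaUtils.js code; the function name is hypothetical, and the WHATWG URL parser stands in for the nsIURL machinery the real code uses):

```javascript
// Hypothetical sketch of the "no extension => assume html" heuristic.
function guessExtension(urlString) {
  var path;
  try {
    path = new URL(urlString).pathname;
  } catch (e) {
    return "";  // unparseable URL: make no guess at all
  }
  var leaf = path.slice(path.lastIndexOf("/") + 1);
  var dot = leaf.lastIndexOf(".");
  if (dot > 0)
    return leaf.slice(dot + 1);  // an explicit extension always wins
  // No obvious extension in the leaf name: the odds are the target is HTML.
  return "html";
}
```

So a bare "http://www.xyz.com" yields "html", while a link ending in "foo.pdf" keeps "pdf".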

> Again, please leave as much of the old (carefully written and well-tested)
> filename/file-extension code around as possible instead of reinventing it all.
So much was based around the HEAD data... I made a gradual shift away from that
code, I didn't just write stuff from scratch, so I've tried to retain what I
could....

> >+function getDefaultFileName(aDefaultFileName, aDocumentURI, aDocument)
> >+  if (aDocumentURI)
> The old code never had this null.  Where in your code could it become null,
> exactly?
... unless (assuming yesterday's patch) you were talking of this method (above).
No, I didn't go through all possible code paths to see if aDocumentURI could be
null. But rather than assume that all the calling code has done things
perfectly, I saw no harm in adding a sanity check and bypassing the code
conditional upon aDocumentURI. Afterall, none of it can function without
aDocumentURI. Just dotting the i's and crossing the t's, that's all.
Ok, I've taken that check out.

> Also, this is breaking the Content-Disposition headers for the case when we
> _do_ have it around (when aDocument is not null).  That's OK for a followup
> bug, assuming it gets fixed, I guess.
You lost me there but I don't mind looking if you can elaborate.

Have made changes but waiting for a full build, so will have to wait for the
morning.
(In reply to comment #80)
> The FileInfo structure and getUriAndFileInfo method used to be much more
> flexible, but with successive reviews they are getting less and less so, and
> morphing into something that they weren't before.

Right.  But we're talking about the code that's going to go into the tree, here,
not about possible future enhancements to it.  I'd really like to see the switch
from HEAD made with minimal functionality changes so it can be tested on its own
before piling on other changes (which would be doable easily in beta milestones
if needed).

> > when we are trying to set the extension from the URI.  But this needs to be
> > clearly documented.  Given that, isValid() seems to be a bad name for this
> > function, too....
> J.H.C, you just suggested a couple of days ago that I call it 'isValid'.

That was based on your validateFileInfo() name for it.  I had not read through
the function carefully at that point, as the comments said...

I'm sorry this is being frustrating.  :(

> It seemed many things didn't like it (especially WBP) when saving a URI where an
> extension couldn't be discovered. Save-link-as could fail. I couldn't save an
> image for example.

Hmm... Document that in the patch, and file a followup bug to test that and
remove this code if it's not needed?  I suspect that anything failing for lack
of an extension just needs to be fixed.

> That was the whole idea behind getUriAndFileInfo() originally - a one-stop-shop
> for discovering filename & extension.

But it's not.  It does some stuff based on just the URI, then calls other things
that do stuff based on just the URI, etc...

> > This function doesn't return anything, actually...
> It did until 2 days ago

I realize that.  The point is that the comment that's checked in should match
the code that's checked in.

> Fine. But when in Moz 1.7 I select save-link-as for http://www.xyz.com and it
> suggests a filename of "http _www.xyz.com_.htm", where does it get the .htm
> from? The HEAD request?

Yes.  The HEAD request.

> I posed this questions some weeks back, and pretty much the only answer I got
> was that if you are saving a URL and there is no obvious file extension (with
> which to take a guess as to the probable file type), then the odds are that the
> target is html. The chances of something else are negligible.

OK.  Let's do that for now, then, and file a followup on a better heuristic?

> > Also, this is breaking the Content-Disposition headers for the case when we
> > _do_ have it around (when aDocument is not null).  That's OK for a followup
> > bug, assuming it gets fixed, I guess.
> You lost me there but I don't mind looking if you can elaborate.

When we have a non-null aDocument, we can get the Content-Disposition filename
off of that object (may require a bit of hacking of the document object).  In
that case, we should honor that header.  Again, a followup bug is fine here; we
can elaborate on the issue there.

Thank you for your patience...
Flags: blocking1.8a4? → blocking1.8a4-
(In reply to comment #81)
> Right.  But we're talking about the code that's going to go into the tree, here,
> not about possible future enhancements to it.  I'd really like to see the switch
> from HEAD made with minimal functionality changes so it can be tested on its own
> before piling on other changes (which would be doable easily in beta milestones
> if needed).
True... I had thought getting exposure to this new code in an alpha is as
valuable as getting it just right... but getting a clean bare-bones version is
probably better.

> > It seemed many things didn't like it (especially WBP) when saving a URI where an
> > extension couldn't be discovered. Save-link-as could fail. I couldn't save an
> > image for example.
> Hmm... Document that in the patch, and file a followup bug to test that and
> remove this code if it's not needed?  I suspect that anything failing for lack
> of an extension just needs to be fixed.
Ok, true... I was just trying to make sure that what I produced worked within
the bounds of what WBP would handle correctly. Sure maybe I shouldn't put
work-arounds in this code to make up for problems elsewhere, but the overall
goal of this patch (IMHO) is also not to make more holes in the code than there
currently are.

> > That was the whole idea behind getUriAndFileInfo() originally - a one-stop-shop
> > for discovering filename & extension.
> But it's not.  It does some stuff based on just the URI, then calls other things
> that do stuff based on just the URI, etc...
From the standpoint of internalSave(), it was the one method that decided on
filename/ext (before prompting user for confirmation) and still is. Of course
internally it calls other methods, but what code doesn't? (Except where people
create r-e-a-l-l-y long methods with everything in them.)

> > target is html. The chances of something else are negligible.
> OK.  Let's do that for now, then, and file a followup on a better heuristic?
Sounds good to me.

> When we have a non-null aDocument, we can get the Content-Disposition filename
> off of that object (may require a bit of hacking of the document object).  In
> that case, we should honor that header.  Again, a followup bug is fine here; we
> can elaborate on the issue there.
I can raise a followup bug, but is this useful to do it now if nobody ever
encounters problems?

Speaking of which, cannot submit my patch yet. Moz crashes after second save,
and I suspect it's because of the extra assert put in for 'autodetect'/'no
conversion' check. How could I get that wrong?
With a debug build I can't get the console to show up any more (not since the
build process changed some weeks back). So I'll have to grub around to figure
out why (or find a new .mozconfig).

P.
assertions in mozilla don't crash the browser... not if you use NS_ASSERTION at
least, which you should
(In reply to comment #82)
> Ok, true... I was just trying to make sure that what I produced worked within
> the bounds of what WBP would handle correctly.

Agreed.  I'm OK with putting in such workarounds if we document exactly what
circumstances need them and file bugs to get WBP fixed.

> From the standpoint of internalSave(), it was the one method that decided on
> filename/ext (before prompting user for confirmation) and still is.

Sure.  I guess I'd just prefer the flow of control to be clearer in this case. 
A clear description of the algorithm we use for getting the filename, like the
old code has.

> I can raise a followup bug, but is this useful to do it now if nobody ever
> encounters problems?

People will certainly encounter it.  I just want us to have a place to dup all
the bugs that will get filed.  :)
(In reply to comment #83)
> assertions in mozilla don't crash the browser... not if you use NS_ASSERTION at
> least, which you should

I did use that (from memory, I /think/ I did.... can't bring up the code at the
moment)... I just thought that'd be the culprit as I didn't expect that changes
in the JS would be crashing it. But until I get a debug build running properly
(with a console window) I'm just guessing.
(In reply to comment #84)
> > From the standpoint of internalSave(), it was the one method that decided on
> > filename/ext (before prompting user for confirmation) and still is.
> Sure.  I guess I'd just prefer the flow of control to be clearer in this case. 
> A clear description of the algorithm we use for getting the filename, like the
> old code has.
There was a description of the old algorithm??? Or was it just easier to
follow? (I thought it was much more confusing, but some of it being driven by
the callback - triggered as a result of receiving the HEAD response - didn't help.)

> > I can raise a followup bug, but is this useful to do it now if nobody ever
> > encounters problems?
> People will certainly encounter it.  I just want us to have a place to dup all
> the bugs that will get filed.  :)
ok, I just have to be as descriptive as possible so people will find it easily
when searching for it.
Attached patch All-inclusive patch (4 files) (obsolete) — Splinter Review
Sorry, should have put this up 24 hours ago, but had stability problems
(unrelated to this patch, it seems).
Attachment #159468 - Attachment is obsolete: true
Attachment #159469 - Attachment is obsolete: true
Attachment #159840 - Flags: superreview?(bzbarsky)
Stripped out some unnecessary handling, called from initFileInfo.
Attachment #159840 - Attachment is obsolete: true
Comment on attachment 160223 [details] [diff] [review]
All-inclusive patch (4 files)

let's rock ;-)
Attachment #160223 - Flags: superreview?(bzbarsky)
Attachment #159840 - Flags: superreview?(bzbarsky)
Blocks: 263393
Care should be taken when fixing this bug to ensure that HEAD is still used
where appropriate.  See bug #121616 (comment 19).  
Comment 90 has nothing to do with this bug, really....
Comment on attachment 160223 [details] [diff] [review]
All-inclusive patch (4 files)

sr=bzbarsky
Attachment #160223 - Flags: superreview?(bzbarsky) → superreview+
Depends on: 263697
I've checked the patch in and filed bug 263697 on the remaining
content-disposition issue.  There is also bug 174167 that needs to be fixed to
restore complete parity with the old code.  Leaving bug open for Phil to resolve
if there's nothing left to be done other than those two.

Someone should go through the bugs this one blocks in tomorrow's builds and see
what's left unfixed.
(In reply to comment #93)
> I've checked the patch in
Thanks Boris

> Leaving bug open for Phil to resolve
> if there's nothing left to be done other than those two.
Sorry, you mean I should close this bug if I have nothing else to add? (I don't
think that's what you mean?)

Other than bug 263679, some issues have been raised (with this patch):
1. A lack of file extension sometimes seemed to stop WBP from working correctly
- need to recheck that.
2. Save-link-target-as for urls like http://www.xyz.com, the target filename is
automatically given the extension ".htm" (without it, the filename is "http
_www.xyz.com_"). So we need to file a followup on a better heuristic?
3. Save-link-target-as cannot save 'complete' web pages (i.e with images,
stylesheets, etc). Sorry, that's bugspam really, and there must already be a bug
for that somewhere :)

Comments...? I'll look at these again and raise bugs if it seems necessary.

> Someone should go through the bugs this one blocks in tomorrow's builds and see
> what's left unfixed.
Bags not me ;-)
(In reply to comment #94)
> Sorry, you mean I should close this bug if I have nothing else to add?

That's correct.  Either if there is nothing else to add or if there are bugs
filed on the remaining issues.

> 1. A lack of file extension sometimes seemed to stop WBP from working correctly
> - need to recheck that.

If you can, that would be great.

> 2. Save-link-target-as for urls like http://www.xyz.com, the target filename is
> automatically given the extension ".htm" (without it, the filename is "http
> _www.xyz.com_"). So we need to file a followup on a better heuristic?

Sure, but I doubt we can find one... Please file it anyway.

> 3. Save-link-target-as cannot save 'complete' web pages (i.e with images,
> stylesheets, etc). Sorry, that's bugspam really, and there must already be a bug
> for that somewhere :)

I'm not sure there is, actually.  In any case, this is an enhancement request;
I'm talking about actual remaining issues for parity with the old code.

> Bags not me ;-)

Your time is better spent on the code anyway.  ;)
(In reply to comment #94)
> Other than bug 263679 ...
oops, I meant bug 263697 :)

(In reply to comment #95)
> > 1. A lack of file extension sometimes seemed to stop WBP from working correctly
> > - need to recheck that.
> If you can, that would be great.
Will see what I can do

> Sure, but I doubt we can find one... Please file it anyway.
Will do so in next 24hrs...

> > 3. Save-link-target-as cannot save 'complete' web pages (i.e with images,
> > stylesheets, etc). Sorry, that's bugspam really, and there must already be a bug
> > for that somewhere :)
> I'm not sure there is, actually.  In any case, this is an enhancement request;
Well I consider it a deficiency (and inconsistent with being able to save
complete pages if you have actually visited them), but yes, it's effectively an
enhancement.
Comment on attachment 160223 [details] [diff] [review]
All-inclusive patch (4 files)

>+    try {
>+      aFI.uri = makeURI(aURL, docCharset);
>+      // Assuming nsiUri is valid, calling QueryInterface(...) on it will
>+      // populate extra object fields (eg filename and file extension).
>+      var url = aFI.uri.QueryInterface(Components.interfaces.nsIURL);
>+    } catch (e) {
>+    }
>+    // Get the default filename:
>+    aFI.fileName = getDefaultFileName((aFI.suggestedFileName || aFI.fileName),
>+                                      aFI.uri, aDocument);
>+    aFI.fileExt = url.fileExtension;
As for biesi's interface flattening question, I fail to see how instanceof
would work without interface flattening. But what I want to nit you on is that
you only use url to set aFI.fileExt so you should really do that inside the try
block.
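The nit above - `url` is only assigned inside the try block, so its use must stay there too - can be illustrated with a plain-JS sketch (hypothetical names; the WHATWG URL parser stands in for makeURI plus the QueryInterface to nsIURL):

```javascript
// Sketch of the corrected structure: anything that dereferences the parsed
// URL lives inside the try block, since a parse failure leaves it undefined.
function populateFileInfo(aFI, aURL) {
  try {
    var url = new URL(aURL);   // stands in for makeURI(...) + QI to nsIURL
    aFI.uri = url;
    // Moved up into the try block: if parsing threw, `url` would be
    // undefined here and this line itself would throw.
    var path = url.pathname;
    var dot = path.lastIndexOf(".");
    aFI.fileExt = dot >= 0 ? path.slice(dot + 1) : "";
  } catch (e) {
    aFI.fileExt = "";          // unparseable URL: leave the extension blank
  }
  return aFI;
}
```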
> But what I want to nit you on

Good catch.  I've checked in that change (moving that line up into the try block).
(In reply to comment #97)
> As for biesi's interface flattening question, 

bz's, if you mean the one in comment 70.
(In reply to comment #98)
> > But what I want to nit you on
> 
> Good catch.  I've checked in that change (moving that line up into the try block).

hmmmm... ok... but have you tested that?
> hmmmm... ok... but have you tested that?

I checked that the extension in the filepicker is still the one in the URI, if
that one matches the content type, yes.
I guess this patch landing means that "save link target as.." on bugzilla now
gives a filename of attachment.cgi
That's correct.  Not much we can do about that, unfortunately.
(In reply to comment #103)
> That's correct.  Not much we can do about that, unfortunately.

Does that problem occur???
I haven't tried saving a bugzilla page with 1.8xx, but I'll try it tonight...
> Does that problem occur???

Of course.  The filename used to come from the content-disposition header.

Note that we're talking about save link as on an attachment, not saving a page
one is viewing (though the latter is broken too, and that's bug 263697).
(In reply to comment #105)
> > Does that problem occur???
> 
> Of course.  The filename used to come from the content-disposition header.
> 
> Note that we're talking about save link as on an attachment,
Yes, when saving a link to this page
(https://bugzilla.mozilla.org/show_bug.cgi?id=160454):
1.7.3 suggests the filename "show_bug.cgi.htm"
1.8a4 (with this patch) suggests the filename "show_bug.cgi"
This seems to me to be the same situation as in bug 263839, except that
"https://" applies. Unless you have an alternative, Boris, I'll look at whether
that works easily for now.

> not saving a page one is viewing (though the latter is broken too, and
> that's bug 263697).
No it isn't for me - when saving this page I'm viewing:
1.7.3 suggests the filename "show_bug.cgi.htm"
1.8a4 (with this patch) also suggests the filename "show_bug.cgi.htm"
So the end result is the same.
(In reply to comment #106)
> This seems to me to be the same situation as in bug 263839, except that
> "https://" applies. Unless you have an alternate Boris, I'll look at whether
> that works easily for now.

I tried it and the problem is solved, so I'll submit a patch - but will wait
for comments first.
> > Note that we're talking about save link as on an attachment,
> Yes, saving when saving a link to this page

This page isn't an _attachment_.  Try the last patch attached to this bug.  The
filename you should be getting for it when you load it and then save is
"review9.diff.txt".  Instead you get "attachment.cgi" (or "attachment.cgi.txt",
possibly) in a current build.  That's bug 263697.

I agree that checking for https:// at the same place where we check for http://
is reasonable.
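The agreed-upon change can be sketched like this (hypothetical helper names, not the actual patch code):

```javascript
// Only web schemes get the "assume .htm when there is no extension"
// treatment; https:// is now checked alongside http://.
function isWebScheme(uriSpec) {
  return /^https?:\/\//i.test(uriSpec);
}

function defaultLeafName(uriSpec, leafName) {
  if (isWebScheme(uriSpec) && leafName.indexOf(".") < 0)
    return leafName + ".htm";
  return leafName;
}
```

With this, an extensionless leaf saved from an https:// link gets ".htm" appended, while ftp:// links and names that already carry an extension are left alone.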
(In reply to comment #108)
> This page isn't an _attachment_.
oops, I missed that magic word 'attachment', sorry ;-)

> Try the last patch attached to this bug.  The
> filename you should be getting for it when you load it and then save is
> "review9.diff.txt".  Instead you get "attachment.cgi" (or "attachment.cgi.txt",
> possibly) in a current build.  That's bug 263697.

Ah, you get prompted with "attachment.cgi.htm", but the ".htm" will be because
of the change I just made (comment #106). Shame the link doesn't have more info
- is there anything in any spec anywhere that could be used to our advantage?

The way the links are defined at the top in the attachments area and in comment
#97 (for example) are
<a href="attachment.cgi?id=160223&amp;action=view" >All-inclusive patch (4
files)</a>
and
<a href="attachment.cgi?id=160223&amp;action=view" class="" title="All-inclusive
patch (4 files)">attachment 160223 [details] [diff] [review]</a>

Is there any standard attribute that could go into the link /and/ be passed to
the save-as code for us to use?

 
> I agree that checking for https:// at the same place where we check for http://
> is reasonable.
ok, will make a patch...
> Ah. you get prompted with "attachment.cgi.htm"

Er... why?  That URI has an extension in the filename; we should not be
overriding it, since we have no clue what its MIME type is.  That sounds like a
bug in your patch, and it's a major regression that will break things all over
if I understand it correctly (eg saving a link to "foo.pdf" will save
"foo.pdf.html").  Please file it and fix it.

> Is there any standard attribute that could go into the link 

There is "type", but that would only indicate the content-type.  There is
nothing to indicate a suggested filename.
> Ah. you get prompted with "attachment.cgi.htm"

Note that my current trunk build, with this bug's patch in it, doesn't show this
problem...

In any case, I think we should be filing new bugs on the various remaining
issues and regressions (including the https:// issue)..  Marking this one fixed.
 If someone could please check the deps and resolve those that are now fixed,
that would be much appreciated.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
No longer blocks: 229787
(In reply to comment #110)
> > Ah. you get prompted with "attachment.cgi.htm"
> Er... why?
Because it's been caught by the patch as described in bug 263839. It's 
catching https pages also (on my system) as discussed in the last day or so 
(but I haven't yet submitted the change that includes https...).

> That sounds like a
> bug in your patch, and it's a major regression that will break things all over
> if I understand it correctly (eg saving a link to "foo.pdf" will save
> "foo.pdf.html").  Please file it and fix it.
The .pdf should stay a .pdf, and if not then I'll look into why.

(In reply to comment #111)
> > Ah. you get prompted with "attachment.cgi.htm"
> Note that my current trunk build, with this bug's patch in it, doesn't show this
> problem...
Like I say, it does this because of changes I've made (on my system) as
discussed in the last day or so (but I haven't yet submitted the change that
includes https...).
Blocks: 229787
Depends on: 264757
No longer blocks: 264825
Depends on: 264825
(In reply to comment #110)
> Er... why?  That URI has an extension in the filename; we should not be
> overriding it, since we have no clue what its MIME type is.  That sounds like a
> bug in your patch, and it's a major regression that will break things all over
> if I understand it correctly (eg saving a link to "foo.pdf" will save
> "foo.pdf.html").  Please file it and fix it.

No, I have just checked again (e.g. PDFs) and it's working fine when the URI
has an extension in the filename, i.e. it leaves the extension as .pdf
Probably a silly question (considering the moment), but... is it possible to
check this in on the aviary branch too?
We could, but firefox doesn't use this code.  Also, this patch caused several
regressions, if you read the bug.  So it doesn't really belong on branch.
Target Milestone: mozilla1.5alpha → mozilla1.8alpha5
Depends on: 265208
It was my understanding that the regressions were ironed out. Anyway, that's
irrelevant, since this code cannot be directly checked into aviary. Is it worth
filing a bug for incorporating it into the Firefox trunk?
> It was my understanding that the regressions had been ironed out

How, when they're still open and blocking this bug?

> Is it worth filing a bug to incorporate it into the Firefox trunk?

I did that a week and a half ago.  Bug 263698.
*** Bug 265723 has been marked as a duplicate of this bug. ***
Depends on: 252058
Depends on: 267406
Depends on: 263839
Depends on: 271981
Blocks: 294759
*** Bug 291693 has been marked as a duplicate of this bug. ***
How is this bug considered resolved? This one has been plaguing me for ages (all
versions of Win32 Firefox 1.0.x, I think, and presumably well before that too).
Today I can't save a page simply because the server is down -- even more
annoyingly, this causes the browser to appear to ignore the Save request until a
dialog finally appears denying it. View Source does work in this case, though
I've previously hit situations where even View Source doesn't work, which I
imagine is another bug.

View Source is of course not an option if I wish to save an image that has since
gone missing (or whose site has). For GIF images I can just screenshot the
image, but for JPEG images I don't especially want to do that and then re-save
the JPEG!

Presently: Firefox 1.0.4, Windows 2000 SP4
Never worked out just what makes Bugzilla add your CC...
(In reply to comment #121)
> How is this bug considered resolved?

It is resolved on the trunk, meaning Firefox 1.1 will have the fix. The 1.0.x
branch isn't taking these kinds of fixes.
(In reply to comment #121)
A complement to Gavin's comment: you can download a trunk "nightly" from
www.mozillazine.org. First you may want to check the associated forum thread at
http://forums.mozillazine.org/viewforum.php?f=23 . In any case, current Firefox
nightlies are stable enough for everyday use. You will only encounter two
issues: first, they are branded as "Deer Park"; second (and more importantly),
most 1.0.x-compatible extensions won't work with them. You may also wait for the
imminent official Deer Park alpha 2 release. But if you want absolute stability,
waiting for FF 1.1 is the way to go.
This specific patch is not for Firefox; you want some other bug.
Well, maybe, maybe not, but if I am to believe comment #123 it is resolved in
Firefox 1.1 anyhow, so I just have to hold out and wait (I can't get a nightly
unless I plan to lose all my favourite extensions). As long as I can look
forward to it being fixed, that's fine.
*** Bug 300372 has been marked as a duplicate of this bug. ***
|getFileBaseName()| improvements after the previous patch:
* Use it in an additional case.
* Remove the useless |aFileExt| parameter.
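For context, a sketch of what the simplified helper might look like after this cleanup. This is an illustration only, not the checked-in contentAreaUtils.js code: with the unused |aFileExt| parameter dropped, the base name is simply the filename with any trailing extension stripped.

```javascript
// Illustrative sketch only: return the filename with its trailing
// ".ext" removed; names without a dot are returned unchanged.
function getFileBaseName(aFileName) {
  return aFileName.replace(/\.[^.]*$/, "");
}
```

Note that only the last extension is stripped, so a double extension like "archive.tar.gz" reduces to "archive.tar".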
Attachment #291080 - Flags: superreview?(neil)
Attachment #291080 - Flags: review?(mano)
Attachment #291080 - Flags: superreview?(neil) → superreview+
Comment on attachment 291080 [details] [diff] [review]
(AAv1) cleanup |getFileBaseName()| usages
[Checkin: Comment 132]

r=mano
Attachment #291080 - Flags: review?(mano) → review+
Keywords: checkin-needed
Whiteboard: [c-n: AAv1]
Attachment #291080 - Flags: approval1.9?
All patches that touch Firefox code need approval. Please do not request checkin-needed until the patch has approval.
Keywords: checkin-needed
Whiteboard: [c-n: AAv1] → [not checked in: AAv1]
Comment on attachment 291080 [details] [diff] [review]
(AAv1) cleanup |getFileBaseName()| usages
[Checkin: Comment 132]

a=beltzner for 1.9

Serge, you have reed to thank for us noticing this. Next time, please file a follow up bug and attach your patch to that.
Attachment #291080 - Flags: approval1.9? → approval1.9+
Keywords: checkin-needed
Whiteboard: [not checked in: AAv1] → [c-n: AAv1]
Checking in toolkit/content/contentAreaUtils.js;
/cvsroot/mozilla/toolkit/content/contentAreaUtils.js,v  <--  contentAreaUtils.js
new revision: 1.101; previous revision: 1.100
done
Checking in suite/common/contentAreaUtils.js;
/cvsroot/mozilla/suite/common/contentAreaUtils.js,v  <--  contentAreaUtils.js
new revision: 1.152; previous revision: 1.151
done
Keywords: checkin-needed
Whiteboard: [c-n: AAv1]
Attachment #291080 - Attachment description: (AAv1) cleanup |getFileBaseName()| usages → (AAv1) cleanup |getFileBaseName()| usages [Checkin: Comment 132]
Product: Core → Core Graveyard
Duplicate of this bug: 308156