Open
Bug 238966
Opened 20 years ago
Updated 2 years ago
If saving two pages simultaneously, save is sometimes incomplete
Categories
(Firefox :: File Handling, defect)
Tracking
UNCONFIRMED
People
(Reporter: Time_lord, Unassigned)
Details
Attachments
(4 files)
- 30.71 KB, text/plain
- 31.23 KB, patch
- 6.22 KB, image/png
- 30.88 KB, patch (Biesinger: review-)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7b) Gecko/20040327
(Please note: my local build ID above is based on the tarball from 20040313.)

I've modified Mozilla to save two pages consecutively (two different tabs), but sometimes the first page isn't saved completely, has the name of the second page, or is not saved at all. E.g. I have 'fred.html' open in tab 1 and 'index.htm' open in tab 2; sometimes the only page that is saved is index.htm. However, if I set breakpoints in the trigger and in the callback (for the save), then it works correctly (using the JS debugger, Venkman). Depending on where the code is paused and where breakpoints are set, I've even seen the first page saved but with the name of the second page. It seems the callback (that does the writing) isn't reentrant, or there is a race condition.

Another way of looking at this: "Save as web page, complete" is fine, but not if you call it programmatically, i.e. for one page then the next. It appears that if no time is given between saving the first and second pages, the second page clobbers the first. This is unpredictable. I am working in mozilla\xpfe\communicator\resources\content\contentAreaUtils.js.

Reproducible: Sometimes

Steps to Reproduce:
Programmatically save two web pages (two tabs). A quick hack should suffice; my modifications are more substantial, so I won't include them here.

Actual Results: Unpredictable

Expected Results: Should have the same result as calling 'Save Page As' for each tab.
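The clobbering the reporter describes is what happens when the trigger stores per-save state in a shared variable that the asynchronous callback reads later. A minimal sketch (hypothetical code, not the actual contentAreaUtils.js logic; the names `startSave` and `pendingName` are invented for illustration):

```javascript
// Shared mutable state, standing in for module-level data in the save code.
let pendingName = null;
const queue = []; // stands in for deferred async callbacks (like onStopRequest)

function startSave(name, done) {
  pendingName = name;                  // trigger stores the name globally
  queue.push(() => done(pendingName)); // callback reads it *later*
}

const results = [];
startSave("fred.html", (n) => results.push(n));
startSave("index.htm", (n) => results.push(n)); // runs before first callback fires
queue.forEach((cb) => cb());           // now flush both callbacks

console.log(results); // [ 'index.htm', 'index.htm' ] - the first name is clobbered
```

Both callbacks observe the final value of the shared variable, matching the reported symptom where fred.html is saved under the name of the second page, or not at all. Pausing in a debugger (as the reporter did with Venkman) serializes the two saves, which is why the bug disappears under breakpoints.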
Comment 1•20 years ago
i.e. you are using "web page, complete" for both pages? (or, from a programmatic point of view, nsIWebBrowserPersist?)
(In reply to comment #1)
> i.e. you are using "web page, complete" for both pages? (or, from a programmatic point of view, nsIWebBrowserPersist?)

Yes, I am triggering the existing code which determines (or defaults to) "web page, complete", but I suppress the fp.show (confirmation dialog where the user confirms directory, filename and file type).
On closer investigation, I believe the code that is triggered stores some data which the callback accesses (in nsIRequestObserver:onStopRequest) which is altered when the second page is saved. This is potentially a problem when the user manually saves 2 pages simultaneously. Will decide how best to fix.
(In reply to comment #3)
I now think that assumption is false.
Comment 5•20 years ago
hmm, are you maybe using the same nsIWebBrowserPersist object for the two saves?
(In reply to comment #5)
> hmm, are you maybe using the same nsIWebBrowserPersist object for the two saves?

No, the code seems to fail in the callback before that part of the code runs.
The small change is based on a seamonkey tarball from 20040414. In
mozilla\xpfe\communicator\resources\content, edit contentAreaUtils.js:
235,240c235,243
< // In both cases here, we want to use cached data because the
< // document is currently visible.
< if (aDocument)
< saveInternal(aDocument.location.href, aDocument, false);
< else
< saveInternal(_content.location.href, null, false);
---
> var browser = getBrowser();
> var thisBrowser = browser.browsers[0];
> var ref = thisBrowser.contentDocument.location.href;
> var doc = thisBrowser.docShell.document;//null;//
> saveInternal(ref, doc, false);
> thisBrowser = browser.browsers[1];
> ref = thisBrowser.contentDocument.location.href;
> doc = thisBrowser.docShell.document;//null;//
> saveInternal(ref, doc, false);
Do a debug rebuild, then when you run the code, make sure you have two pages (two
tabs) open, then right-click and select 'Save Page As'. Two dialogs pop up (i.e.
one for each tab); very quickly click Save in both of them. If both are
running at the same time all sorts of errors can occur, for example an error
reporting 'code not reached' followed by an empty (and seemingly orphaned)
download dialog. Sometimes there are many errors but both pages are saved (and
sometimes not).
Whether errors occur may depend on the speed of your system, luck, and how fast
you click in the save-as dialogs (I'm running a 2.6 GHz P4 with 512 MB on XP SP1).
Of course the code I was working on has more smarts (e.g. checking whether
browser 2 exists) and actually suppresses the save-as dialog (which can make
the problem more pronounced). However the simple change above is enough to
cause the problem.
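A common fix for this pattern is to keep each save's state in a closure (or a per-save object) instead of shared module state, so a second save cannot overwrite the first save's data before its callback runs. A hypothetical sketch, reusing the same invented `startSave`/`queue` shape as the race illustration above:

```javascript
const queue = []; // stands in for deferred async callbacks (like onStopRequest)

function startSave(name, done) {
  const savedName = name;             // per-call state lives in the closure
  queue.push(() => done(savedName));  // each callback closes over its own copy
}

const results = [];
startSave("fred.html", (n) => results.push(n));
startSave("index.htm", (n) => results.push(n));
queue.forEach((cb) => cb());          // flush callbacks after both triggers ran

console.log(results); // [ 'fred.html', 'index.htm' ] - each save keeps its name
```

Whether this is where the actual bug lives is exactly what the thread goes on to investigate; the sketch only shows why overlapping saves behave correctly once no state is shared between them.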
Comment 8•20 years ago
So based on the thread in n.p.m.netlib it sounds like things are racing _somewhere_. I just scanned, and neither the contentAreaUtils code nor the uri checker really use any global vars.... Brendan, any idea how to debug possible races in the contentAreaUtils.js code?
Comment 9•20 years ago
How is the saved-to filename chosen? Is nsIFile.createUnique used? I don't see threads involved here, but I'm not looking hard.
/be
Comment 10•20 years ago
Hmm.... If the filepicker is used, save-to filename is chosen using the filepicker and the file is not created till later, so things _could_ end up with the same filename if the second filepicker uses the same filename and is dismissed before the first download has created the file. When not using the filepicker, I'm not sure. Phil? As for threads, there are necko threads (nsURIChecker) involved. There are also nested event queues involved (at least if filepickers go up). So there are various race opportunities.
Comment 11 (Reporter) • 20 years ago
(In reply to comment #10)
> Hmm.... If the filepicker is used, save-to filename is chosen using the filepicker and the file is not created till later, so things _could_ end up with the same filename if the second filepicker uses the same filename and is dismissed before the first download has created the file. When not using the filepicker, I'm not sure. Phil?

In my full-blown version, after selecting the destination directory (only once, before processing the tabs), for each tab (page) I see if that file already exists (and if it does, I give the page to be saved a new name). Since I suppress the 'save-as' dialog (because I am automatically choosing the name), any conflict caused by the user is avoided. The potential exception to this is if the first page is about to be saved (but nothing is yet on disk) and then the filename for the second page is checked (!exists), but it's actually the same as the first page. So a conflict occurs when the second page is saved. (That'll be a nasty one, but could occur with any application.)

One major point to consider is that I think the filepicker is a red herring, and the problem occurs before this. I.e. after the urichecker has done its thing, the callback (onStopRequest) to the code in contentAreaUtils doesn't always fire for both pages. So we haven't yet got to the filechecker code!

BTW, the point of doing all this is that I've added a 'save all pages (tabs)' feature (obviously different from 'bookmark all tabs'). It works fine except for this race condition. I'd like to submit it as a proposed enhancement. Because I'm relatively new to XPCOM and the Moz codebase, I decided to make use of the existing 'save page as' code. Nice also to re-use code. However I was reluctant to rearrange the existing code to do all this... maybe I should, although in reality not much would change.
Comment 12•20 years ago
Brendan, is the way |new nsHeaderSniffer| is used at http://lxr.mozilla.org/seamonkey/source/xpfe/communicator/resources/content/contentAreaUtils.js#255 and the way it's defined at http://lxr.mozilla.org/seamonkey/source/xpfe/communicator/resources/content/contentAreaUtils.js#411 and http://lxr.mozilla.org/seamonkey/source/xpfe/communicator/resources/content/contentAreaUtils.js#440 OK? In particular, are the assignments to this.xxxx in nsHeaderSniffer() fine, given that those properties are not declared on the prototype?

Really, we just need to remove all this code, though. See bug 160454.
Comment 13•20 years ago
> In particular, are the assignments to this.xxxx in nsHeaderSniffer() fine, given
> that those properties are not declared on the prototype?
Why not? Nothing's declared, and it's fine to set instance properties in a
ctor, whether or not they override proto-properties.
/be
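Brendan's point can be checked directly: properties assigned in a constructor live on the instance, so two sniffer objects cannot stomp on each other's state, whether or not the prototype declares them. A minimal sketch (the `Sniffer` ctor here is an invented stand-in, not the real nsHeaderSniffer):

```javascript
function Sniffer(url) {
  this.url = url;                        // instance property set in the ctor
}
Sniffer.prototype.url = "about:blank";   // proto property, shadowed per instance

const a = new Sniffer("http://example.com/fred.html");
const b = new Sniffer("http://example.com/index.htm");
console.assert(a.url !== b.url);         // each object keeps its own value
console.assert(Sniffer.prototype.url === "about:blank"); // proto untouched

delete a.url;                            // removing the instance property...
console.assert(a.url === "about:blank"); // ...falls back to the prototype
```

So the instance-property pattern is not the race; as comment 14 notes, the puzzle is why onStopRequest() isn't called on two entirely distinct objects.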
Comment 14•20 years ago
Hey, I'm just looking for possible reasons that onStopRequest() wouldn't get called on the two totally different objects we're creating here....
Comment 15 (Reporter) • 20 years ago
(In reply to comment #12)
> Really, we just need to remove all this code, though. See bug 160454

Interesting, but surely most of those things need to happen somewhere? I.e. if the page isn't cached, check the URL and get the page so it can be saved. And under what circumstances do you check the URL anyway, because it may be out of date?

A little off-topic, but I've looked at bug 160454; could you point me in the direction of more info about the nature of the HEAD problem (email me, thanks)? It seems resolving that will have a major bearing on this code.
Comment 16 (Reporter) • 20 years ago
After discussion with Boris & Beisi a few weeks back, it seems something related to this area is a desire to remove the header sniffer stuff. I've done this, and will attach the modified file (contentAreaUtils.js). I've been using these changes for several weeks and it seems fine. I see there is some discussion on this topic at the moment in bug 160454 (remove HEAD requests), so I should really be posting there ;-)

To anyone looking at the code, you'll see I broke 'internalSave' up a bit and it has a couple of extra parameters here and there. I've done this because I've written extra code to allow a 'save-all-tabs' option (where the user is NOT prompted when saving all files - I've omitted that extra code here to avoid clouding the issue). If the code meets with approval, we could also trim the comments back a little.

Getting back to the subject of this bug, I find when I attempt to save 2 (or more) tabs simultaneously (via my new 'save-all-tabs' option), I get an error:

###!!! ASSERTION: CachedChromeStream doesn't receive data: 'Not Reached', file e:/MozillaBuilds/mozilla/content/xul/document/src/nsXULDocument.cpp, line 4150

This problem still occurs in the new code (i.e. the new code without all the HEAD request stuff). After clearing the error dialog, the multiple pages are all saved, but there is an 'orphaned' save-as dialog left on the screen which never goes away. If I do save-all-tabs again (and again), there is no error. I've looked at nsXULDocument.cpp, but I got a little lost... Any ideas?

Anyway, please review the new contentAreaUtils.js..... :)
Comment 17 (Reporter) • 20 years ago
See also snapshot of 'code not reached' dialog.
Comment 18•20 years ago
(In reply to comment #16)
> Created an attachment (id=151280)
> New contentAreaUtils.js with HEAD sniffer code removed.

could you attach this as a patch? (cvs diff -u6p filename > patchname)
Comment 19 (Reporter) • 20 years ago
(In reply to comment #18)
> could you attach this as a patch?
> (cvs diff -u6p filename > patchname)

Yep, sorry about that, here it is.
Comment 20•20 years ago
Comment on attachment 151287 [details] [diff] [review]
Diff of changes

patching file contentAreaUtils.js
Hunk #2 FAILED at 216.

caused by the change of _content to content, I think
Comment 21•20 years ago
Comment on attachment 151287 [details] [diff] [review]
Diff of changes

general comment about the patch... please try to limit lines to 80 characters.

- // In both cases here, we want to use cached data because the
- // document is currently visible.
+ // In both cases here, we want to use cached data because the
+ // document is currently visible.

please don't add trailing whitespace

+ var browser = getBrowser();

getBrowser() is in navigator.js... I'm not sure if depending on browser/ from communicator/ is a good idea... oh nevermind, this function is already used in that file. grrr.

just curious... is there a reason why you renamed saveInternal to internalSave?

+// aURL - The String representation of the URL of the document being saved, according to
+//        the browser window, or supplied with 'save-link'
+// aUriObj - If available, the URL of the document being saved as an nsIURL object

hmm. I don't like this. why not use a single parameter, that is an nsIURI?

+// aUserTypedValue - The URL as supplied by the user (aURL may not match this if the document is not
+//        yet available, eg it is 'about:blank' because the page has not yet been cached)

I don't understand this comment. OK, say I want to save an http document, that is cached. what would I pass here? This comment makes it sound like this is the value of the urlbar. is it? if so, how does it differ from the uri object? ok, let me try to read the code to understand this parameter...

if (aUserTypedValue && (aUserTypedValue != aURL) && (aURL == "about:blank")) {

can you explain this? So I suspect this will cause severe problems if the user typed a new value in the urlbar but did not yet hit enter, and then decides to save the current page, since getBrowser().userTypedValue will be that new value of the urlbar then. What does this fix?

+// aShouldBypassCache - Like it says

"if true, the document will always be refetched from the server"

+// aAutoChosenData - If non-null this contains pre-determined data so that the user does not
+//        need to be prompted for a target filename.

OK, this one is always null in the callers of this patch, so it seems this is something you want to offer to extensions. fine with me, but you really should document what kind of data this expects. a string indicating the target filename? nsIFile indicating the target filename? indicating the directory? please document this in the comment.

personally I like doxygen-style comments, like:

/**
 * Used when saving a document or URL, this method:
 * [...]
 * @param aURL The string representation [...]
 */

but, I guess I have to be glad that a comment even exists ;)

+// This functionality rearranged for Mozilla 1.8+ to get rid of the HeaderSniffer code.
+// May 2004.

no need to use comments for history purposes, we have a revision control system (CVS).

+ var contentEncodingType = (aDocument ? aDocument.actualEncoding : null);

uh, document.actualEncoding is a character set (in fact, it is the very same thing as document.characterSet, at least in mozilla). this means that this is wrong.

+ getUriAndFileInfo(fileInfo, aURL, aUriObj, aDocument, aUserTypedValue, contentType);

why does this function not just return the fileInfo?

// What we should probably do is change nsWebBrowserPersist::SaveURI to download everything

I'm not sure that that's the way to go... I guess that I'd instead construct a document in the caller, and then pass that to wbp.

- // If we're saving a document, and are saving either in complete mode or
+ // If we're saving a document, and are saving either in complete mode or

trailing whitespace

+ target : (!aAutoChosenData ? makeFileURI(fpParams.fp.file) : makeFileURI(aAutoChosenData.file)),

nsIFilePicker has this nice attribute:

134  * Get the nsIFileURL for the file or directory.
135  *
136  * @return Returns the file currently selected
137  */
138  readonly attribute nsIFileURL fileURL;

oh, the old code didn't use it either... ah well. please change it while you're touching this code :)

Also... can you change it to read something like:
  target: aChosenData ? makeFileURI(aData,file) : fp.fileURI;
i.e. don't negate aChosenData

- else
+ else
-
+

I do wonder how you manage to consistently add trailing whitespace.

+ var destFileName = (!aAutoChosenData ? fpParams.fp.file : aAutoChosenData.file);

again... please aChosenData ? aChosenData.file : fp.file (that's easier to read, imo)

-
+
- encodingFlags |= nsIWBP.ENCODE_FLAGS_NOFRAMES_CONTENT;
+ encodingFlags |= nsIWBP.ENCODE_FLAGS_NOFRAMES_CONTENT;

again...

+ dl.init((!aAutoChosenData ? fileInfo.uriObj : aAutoChosenData.uri),

again (different one than right above though :) )

+ persist.saveDocument(persistArgs.source, persistArgs.target, filesFolder,

more trailing ws

+ dl.init((!aAutoChosenData ? source : aAutoChosenData.uri),

again, please reverse the condition

+ this.uriFileName = aUriFileName;
+ this.uriFileBaseName = aUriFileBaseName;
+ this.uriFileExt = aUriFileExt;
+ this.uriObj = aUriObj;

I'm not particularly happy with this naming. how about: fileName, fileBaseName, fileExt, uri

+// Structure for holding info about automatically supplied parameters for the internalSave(...).

oh, you have the structure even defined here. that really should be mentioned in the comment of internalSave.

+// aUriAutoChosen - This is the nsIURI object for the target

why do both have to be supplied? I mean, there's a bijective mapping from nsIFile <-> nsIFileURL, isn't there. so internalSave, or maybe this function, could determine the URL automatically.

+// method which utilises @mozilla.org/network/io-service;1).

I'd just mention "which uses nsIIOService", doesn't really matter though

+function getUriObj(aURL)

this method signature makes me really unhappy. just like makeURI's signature. they are broken by design. they really need to have a charset parameter for proper support of URLs containing non-ascii characters. (the character set can be gotten off the document)

I'm not convinced of the need for this function. Its name is misleading, too, since it seems redundant with makeURI. I'd just move the QI to the top of internalSave, with the change to always pass an nsIURI to internalSave. Or more to the point, I'd move it to getUriAndFileInfo (I also don't see a need for this nested try..catch)

// Special case is user selects 'stop' before anything loaded (when parameter
// aDocument is 'about:blank'):

ahh. there comes the userTyped thing from. I rather think that we should save what's displayed though (i.e. nothing). What if some document is displayed, I type a url, then hit stop soon enough. how is that different from your about:blank example?

+ aFI.uriFileName = (uO && uO.fileName ? uO.fileName : null);

aFI.uriFileName = uO && uO.fileName; (same for the next three lines)

I have to say, I hate the use of this uO abbreviation. imho, it makes the code less readable, since uO is really an abbreviation that makes no sense.

+ // decide a default extension. Should we do this?

I do think we should, since people want to be able to double-click saved web pages and open them in the browser.

+ if (aFI.uriFileName && !aFI.uriFileExt) {

hm, you only need to do this if the above if wasn't taken, since it already has a call like the one in this if, right? but maybe you should instead move the call to getDefaultExtension after the first if. something like:

// from the previous incantation of this code...
if (!aFI.uriFileName && !aFI.uriFileExt) {
  aFI.uriFileName = getDefaultFileName(aFI.uriFileName, aDocument, aFI.uriObj);
  aFI.uriFileBaseName = getFileBaseName(aFI.uriFileName, aFI.uriFileExt);
}
if (aFI.uriFileName && !aFI.uriFileExt) {
  aFI.uriFileExt = getDefaultExtension(aFI.uriFileName, aFI.uriObj, aContentType);
}

+ } catch (e) {
+ }

hmm. so failure to access (say) the filename from the URL will not set any filename hint. that seems suboptimal to me. why don't you put this catch after the uO.file* accessors?

// Ignore the supplied aFileName... well err... the old code seemed to anyway...

that doesn't seem right... it was used in case 4 of getDefaultFileName, no? can you explain your changes to getDefaultFileName?

manageFilePicker:
  fp.filterIndex = aFpP.prefs.getIntPref("save_converter_index");

hm... does setting the filterIndex work, when the filters haven't been appended yet?

sigh, this would've been easier to review if you hadn't rearranged the lines in manageFilePicker, compared to the old foundHeaderInfo

review- for now.
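The fallback ordering the reviewer suggests (derive a name only when both name and extension are missing, then derive an extension whenever a name exists without one) can be sketched as runnable code. The helper bodies below are invented stand-ins, not the real getDefaultFileName/getDefaultExtension from contentAreaUtils.js:

```javascript
// Stand-in helpers: the real ones consult the document and nsIURI.
function getDefaultFileName(name, uri) {
  return name || uri.split("/").pop() || "index";
}
function getDefaultExtension(name, contentType) {
  return contentType === "text/html" ? "html" : "";
}

// The two-step fallback from the review: first fill in a missing name,
// then fill in a missing extension for whatever name we ended up with.
function fillFileInfo(fi, uri, contentType) {
  if (!fi.fileName && !fi.fileExt) {
    fi.fileName = getDefaultFileName(fi.fileName, uri);
  }
  if (fi.fileName && !fi.fileExt) {
    fi.fileExt = getDefaultExtension(fi.fileName, contentType);
  }
  return fi;
}

const fi = fillFileInfo({ fileName: "", fileExt: "" },
                        "http://example.com/fred", "text/html");
console.log(fi); // { fileName: 'fred', fileExt: 'html' }
```

Note how the second `if` also catches the case the reviewer flags: a URL that yields a filename hint but no extension still gets a default extension, so saved pages stay double-clickable.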
Attachment #151287 - Flags: review-
Comment 22•20 years ago
I'd suggest, though, to post further patches to bug 160454 since that's the main issue this patch fixes...
Comment 23 (Reporter) • 20 years ago
(In reply to comment #21)
Thanks for the comments... sorry I don't know much about the codebase (hence my suggestion that there may be 'better' ways to do bits) - see my responses inline.

> general comment about the patch... please try to limit lines to 80 characters.

Sorry about violating that edict... I think there were already some lines longer than this, so I was happily trying to operate within the bounds of "when in someone else's source code, follow their coding style".

> please don't add trailing whitespace

Damn, that is one of my pet hates, so that embarrasses me :-[ I'm more likely to be caught trimming someone *else's* trailing whitespace.

> just curious... is there a reason why you renamed saveInternal to internalSave?

The main reason was to highlight that the signature (parameters) was completely different, should anyone be looking for the old name. We can use the old name if you prefer?

> +// aURL - The String representation of the URL of the document being saved, according to
> +//        the browser window, or supplied with 'save-link'
> +// aUriObj - If available, the URL of the document being saved as an nsIURL object
>
> hmm. I don't like this. why not use a single parameter, that is an nsIURI?

Yes, a little ugly, like denormalised data. However the nsIURI parameter in this 'standard' code is always null, whereas it is supplied in my 'save all' code. Maybe I should rethink that.

> +// aUserTypedValue - The URL as supplied by the user (aURL may not match this if the document is not
> +//        yet available, eg it is 'about:blank' because the page has not yet been cached)
>
> I don't understand this comment. OK, say I want to save an http document, that is cached. what would I pass here?
>
> This comment makes it sound like this is the value of the urlbar. is it? if so, how does it differ from the uri object?

Hmmm, maybe that isn't clear enough. What I found happens is (for example) you can enter "www.google.com", but the browser is not online (e.g. you are waiting for the PC to dial). If you choose to save at this point, the internalSave method is presented with a URL (doc) "about:blank", and the user's typed value ("www.google.com") can be found in <browser>.userTypedValue. Although in this scenario the context menu is not available (or at least not a save option), you can start the save with Ctrl-S. Yes, I guess there is a urlbar to correctly retrieve this from...

> ok, let me try to read the code to understand this parameter...
>
> if (aUserTypedValue && (aUserTypedValue != aURL) && (aURL == "about:blank")) {
>
> can you explain this?
>
> So I suspect this will cause severe problems if the user typed a new value in the urlbar but did not yet hit enter, and then decides to save the current page, since getBrowser().userTypedValue will be that new value of the urlbar then.
> What does this fix?

I was trying to target the above situation where the internalSave method is presented with a URL "about:blank", but that is in fact because the user's requested page has not yet been retrieved. So I guess if the user has typed a new URL, not pressed enter and yet has chosen save, I'd suspect that they typed the new URL for a reason?

> +// aShouldBypassCache - Like it says
>
> "if true, the document will always be refetched from the server"

ok ;-) I was getting lazy... and it wasn't documented before :-)

> +// aAutoChosenData - If non-null this contains pre-determined data so that the user does not
> +//        need to be prompted for a target filename.
>
> OK, this one is always null in the callers of this patch, so it seems this is something you want to offer to extensions. fine with me, but you really should document what kind of data this expects. a string indicating the target filename? nsIFile indicating the target filename? indicating the directory?

Well noted, I can include the prototype for aAutoChosenData. BTW, should something like 'save all open documents' be an extension or a new option in the standard Mozilla product (as I was hoping)?

> personally I like doxygen-style comments, like: <snip>

AKA javadoc-style comments - I'm happy to do that.

> + var contentEncodingType = (aDocument ? aDocument.actualEncoding : null);
>
> uh, document.actualEncoding is a character set (in fact, it is the very same thing as document.characterSet, at least in mozilla). this means that this is wrong.

Oh? Not sure where that bit came from, will look into it... but it hasn't been causing me any problems ;-)

> + getUriAndFileInfo(fileInfo, aURL, aUriObj, aDocument, aUserTypedValue, contentType);
>
> why does this function not just return the fileInfo?

Because in the early days I was having trouble creating a FileInfo object inside the method, and it was going out of scope when the method finished. JavaScript isn't really my cup of tea... but I'll look at it again.

> nsIFilePicker has this nice attribute: <snip>
> 138  readonly attribute nsIFileURL fileURL;
>
> oh, the old code didn't use it either... ah well. please change it while you're touching this code :)

Noted.

> Also... can you change it to read something like:
>   target: aChosenData ? makeFileURI(aData,file) : fp.fileURI;
> i.e. don't negate aChosenData

Yep... just trying to keep the 'standard' flow of program control first in the "if...else".

> I do wonder how you manage to consistently add trailing whitespace.

Sorry again, must be my damn microsloth editor :-) And I had my diff tool set to ignore changes in whitespace :-[

> I'm not particularly happy with this naming.
> how about: fileName, fileBaseName, fileExt, uri

okey dokey

> +// Structure for holding info about automatically supplied parameters for the internalSave(...).
>
> oh, you have the structure even defined here. that really should be mentioned in the comment of internalSave.

oops, so I do :-[

> +// aUriAutoChosen - This is the nsIURI object for the target
>
> why do both have to be supplied? I mean, there's a bijective mapping from nsIFile <-> nsIFileURL, isn't there. so internalSave, or maybe this function, could determine the URL automatically.

OK, again that's my unfamiliarity with what XPCOM can do for me (and not being able to find enough docs on what it does ;-) ).

> +function getUriObj(aURL)
>
> this method signature makes me really unhappy. just like makeURI's signature. they are broken by design. they really need to have a charset parameter for proper support of URLs containing non-ascii characters. (the character set can be gotten off the document)
>
> I'm not convinced of the need for this function. Its name is misleading, too, since it seems redundant with makeURI.
>
> I'd just move the QI to the top of internalSave, with the change to always pass an nsIURI to internalSave. Or more to the point, I'd move it to getUriAndFileInfo

OK

> // Special case is user selects 'stop' before anything loaded (when parameter
> // aDocument is 'about:blank'):
>
> ahh. there comes the userTyped thing from. I rather think that we should save what's displayed though (i.e. nothing).

It does... sort of... i.e. the contents of about:blank, "<html><head><title></title></head><body></body></html>" (interestingly, the doctype info is missing when this is saved, but I guess this is something that WBP does when saving).

> What if some document is displayed, I type a url, then hit stop soon enough.
> how is that different from your about:blank example?

It's all a big race really - I guess if the old doc is still displayed when this method is triggered then that doc is what is presented & processed, whereas if the browser is blank (in preparation for the new URL) then the 'about:blank' document is presented.

> I have to say, I hate the use of this uO abbreviation. imho, it makes the code less readable, since uO is really an abbreviation that makes no sense.

Noted. That was 'uO' for 'uri Obj' :-[

> but maybe you should instead move the call to getDefaultExtension after the first if. something like:
>
> // from the previous incantation of this code...
> if (!aFI.uriFileName && !aFI.uriFileExt) {
>   aFI.uriFileName = getDefaultFileName(aFI.uriFileName, aDocument, aFI.uriObj);
>   aFI.uriFileBaseName = getFileBaseName(aFI.uriFileName, aFI.uriFileExt);
> }
> if (aFI.uriFileName && !aFI.uriFileExt) {
>   aFI.uriFileExt = getDefaultExtension(aFI.uriFileName, aFI.uriObj, aContentType);
> }
>
> + } catch (e) {
> + }
>
> hmm. so failure to access (say) the filename from the URL will not set any filename hint. that seems suboptimal to me. why don't you put this catch after the uO.file* accessors?

Noted. This seemed to be a fiddle to get working in all cases, but I will look at it again.

> // Ignore the supplied aFileName... well err... the old code seemed to anyway...
> that doesn't seem right... it was used in case 4 of getDefaultFileName, no?

oops :-[ Maybe I got side-tracked by function saveDocument... will fix up.

> can you explain your changes to getDefaultFileName?

Sure. Obviously step 1 got removed because that was derived from header-sniffer stuff. I replaced this with the code stating that if aDefaultFileName is not blank, then use it (I mean, if it's OK and it was supplied, then why not?). Next, "if (aDocumentURI)" - i.e. why do that block of code if aDocumentURI is null? However, the 'catch' will not get executed either... but I'd assumed that aDocument and aDocumentURI are related, so should the catch get executed if aDocumentURI is null? Similarly further down, "if (aDocumentURI)", avoid exceptions if aDocumentURI is null.

> manageFilePicker:
>   fp.filterIndex = aFpP.prefs.getIntPref("save_converter_index");
> hm... does setting the filterIndex work, when the filters haven't been appended yet?

Appears to be OK (I haven't had any problems), but that is a good point - I probably got those bits around the wrong way while moving things :-[

> sigh, this would've been easier to review if you hadn't rearranged the lines in manageFilePicker, compared to the old foundHeaderInfo

Not trying to make things hard; I've simply been through so many iterations of changes that maybe I didn't leave things as tidy as possible. I will get back to the old order of things (unless something doesn't seem right, in which case I'll say so). I have been snowbound, so hopefully I can get on to this in the next day or so.
> So I guess if the user has typed a new URL, not pressed enter and yet has chosen
> save, I'd suspect that they typed the new URL for a reason?
When I do this, it's because I realize that I want to save the page before I
replace its content (usually a receipt of some kind; I don't save much else).
Comment 25 (Reporter) • 20 years ago
(In reply to comment #24)
> > So I guess if the user has typed a new URL, not pressed enter and yet has chosen save, I'd suspect that they typed the new URL for a reason?
>
> When I do this, it's because I realize that I want to save the page before I replace its content (usually a receipt of some kind; I don't save much else).

Good point.
Comment 26•20 years ago
(In reply to comment #23)
> Sorry about violating that edict... I think there were already some lines longer than this, so I was happily trying to operate within the bounds of "when in someone else's source code, follow their coding style"

hm, in the case of line length, I don't think improving upon current file style is a bad thing ;)

> > please don't add trailing whitespace
> Damn, that is one of my pet hates, so that embarrasses me :-[
> I'm more likely to be caught trimming someone *else's* trailing whitespace

heh :)

> > just curious... is there a reason why you renamed saveInternal to internalSave?
> The main reason was to highlight that the signature (parameters) was completely different, should anyone be looking for the old name. We can use the old name if you prefer?

No, the new name is fine.

> Yes, a little ugly, like denormalised data. However the nsIURI parameter in this 'standard' code is always null, whereas it is supplied in my 'save all' code.
> Maybe I should rethink that.

Oh... hm... Well, I don't want to mark review+ on a patch that has both parameters; please decide on one :) I'd prefer nsIURI, but I'd accept strings, too. The problem with strings is that they are not necessarily sufficient to identify a URI, if non-ASCII characters are involved.

> ok ;-) I was getting lazy... and it wasn't documented before :-)

have to seize the opportunity that someone is willing to write documentation :)

> BTW, should something like 'save all open documents' be an extension or a new option in the standard Mozilla product (as I was hoping)?

hm, I think that'd be nice for standard mozilla, but talk to neil about that.

> Oh? Not sure where that bit came from, will look into it... but it hasn't been causing me any problems ;-)

hmmmm.... I wonder if you actually need it... I suspect for currently-displayed pages, you always want to decode the content. On the other hand... I wonder what happens when you're saving ("save link target") a .tar.gz file, sent with Content-Encoding: gzip. It should not be decoded, and I think the current code gets that right.

> > +// aUriAutoChosen - This is the nsIURI object for the target
> >
> > why do both have to be supplied? I mean, there's a bijective mapping from nsIFile <-> nsIFileURL, isn't there. so internalSave, or maybe this function, could determine the URL automatically.
> Ok, again that's my unfamiliarity with what XPCOM can do for me (and not being able to find enough docs on what it does ;-) )

to go from nsIURI to nsIFile, you QueryInterface to nsIFileURL and get the file attribute; for the other way, you can use newFileUri like your patch does.

> > can you explain your changes to getDefaultFileName?
> Sure. [...]

I'll read that after having had more than 4 hours of sleep :)
Updated•20 years ago
Flags: blocking1.8a2? → blocking1.8a2-
Updated•20 years ago
Attachment #151287 - Attachment is patch: true
Reporter
Comment 27•20 years ago
(In reply to comment #21)
> +// aURL - The String representation of the URL of the document being saved,
> +//        according to the browser window, or supplied with 'save-link'
> +// aUriObj - If available, the URL of the document being saved as an nsIURL object
>
> hmm. I don't like this. why not use a single parameter, that is an nsIURI?

Removed aUriObj - it was redundant.

> + var contentEncodingType = (aDocument ? aDocument.actualEncoding : null);
>
> uh, document.actualEncoding is a character set (in fact, it is the very same
> thing as document.characterSet, at least in mozilla). this means that this is
> wrong.

Can you give me some hints about how to do this? I can see how to get the
encoding type in the old header sniffer code (given a channel), but not sure
how to start if the doc is already cached. I've seen the page you published
(mimetypes.html on Mozilla.org), but couldn't get a starting point as to what
to do for encoding types...

> +// aUriAutoChosen - This is the nsIURI object for the target
>
> why do both have to be supplied? I mean, there's a bijective mapping from
> nsIFile <-> nsIFileURL, isn't there. so internalSave, or maybe this function,
> could determine the URL automatically.

For the normal code, yes, but when using the new 'save all' (where the
'auto-chosen' data is actually used), the filename to use may not come directly
from the URL. Eg this is the case if the filename is already used and the
'save as' code is allowed to automatically pick a new name.

The code is barely a tiny bit smaller, but the file has 2.6k more doxygen
comments :-[ And I found an obscure bug which hopefully I should fix soon...
I'll post the next patch to bug 160454...
Comment 28•20 years ago
(In reply to comment #27)
> Can you give me some hints about how to do this? I can see how to get the
> encoding type in the old header sniffer code (given a channel), but not sure how
> to start if the doc is already cached.

I don't think you need this at all... it seems to me like we always want to
apply decoding if we are currently displaying the page. I think. bz, thoughts?
Comment 29•20 years ago
That's correct. The times we want to not apply decoding are the times when we
want to save a gzipped file because that's the "real" file (eg a .tar.gz should
not be decompressed).

One exception I can think of is a .ps.gz that we render with a plugin; that sort
of thing we _may_ want to save as a .ps.gz...
Comment 30•20 years ago
oh... re the getDefaultFilename changes... a "diff -w" would've really helped

I'm not sure if I mentioned this already, but:

  + if (aDefaultFileName && (aDefaultFileName != ""))

the second part is not necessary, "" tests as false in js
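The point about "" can be checked directly in plain JS; a minimal sketch (the function name here is made up for illustration, not from the patch):

```javascript
// "" is falsy in JavaScript, so the explicit != "" comparison is redundant.
// needsDefaultName is a hypothetical name, not from contentAreaUtils.js.
function needsDefaultName(aDefaultFileName) {
  // Equivalent to: if (aDefaultFileName && (aDefaultFileName != ""))
  return Boolean(aDefaultFileName);
}

console.log(needsDefaultName(""));        // false: "" already fails the first test
console.log(needsDefaultName(null));      // false
console.log(needsDefaultName("a.html"));  // true
```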
Reporter
Comment 31•20 years ago
(In reply to comment #29)
> That's correct. The times we want to not apply decoding are the times when we
> want to save a gzipped file because that's the "real" file (eg a .tar.gz should
> not be decompressed).
>
> One exception I can think of is a .ps.gz that we render with a plugin; that sort
> of thing we _may_ want to save as a .ps.gz...

So do you mean that we should just use the outermost encoding? And is this what
document.contentType is? (I guess it may be for rendered docs, but not for
non-document objects (eg images), and what do we do when we are saving a link?)
Reporter
Comment 32•20 years ago
(In reply to comment #30)
> oh... re the getDefaultFilename changes... a "diff -w" would've really helped
>
> I'm not sure if I mentioned this already, but:
> + if (aDefaultFileName && (aDefaultFileName != ""))
>
> the second part is not necessary, "" tests as false in js

<rant>
I'm not sure I'm keen on taking the odd shortcuts available in javascript; I'm
not overly keen on working in a weakly-typed environment like JS. Eg there was
a suggestion to change:

  aFI.uriFileName = (uO && uO.fileName ? uO.fileName : null);

to:

  aFI.uriFileName = uO && uO.fileName;

When the filename does not exist in the uO object, the 'simplified' line yields
void, which in my books is not the same as null (call me old fashioned).

Similarly, when a method is called and 1 or more parameters are left out by the
caller - this can have the ugly result of (eg) a boolean parameter having the
value of void. Almost as bad (IMHO) in the old code is where null was passed to
a method where a boolean was expected, and vice-versa... Maybe in the world of
javascript, void/null/false are generally all equivalent, but I noticed subtle
quirks in venkman when I took that approach...
</rant>

Maybe this isn't the best place for a rant :-[
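The distinction the rant draws between void (undefined) and null is real in JS, even though both are falsy. A quick sketch, using made-up objects standing in for the real uO (an nsIURL-like object):

```javascript
// Hypothetical stand-ins for uO; the real object comes from URL parsing code.
const withName = { fileName: "index.htm" };
const withoutName = {};

// The short-circuit form yields undefined when fileName is absent...
const shortForm = withoutName && withoutName.fileName;

// ...while the explicit ternary yields null, as the rant prefers.
const explicitForm = (withoutName && withoutName.fileName)
                       ? withoutName.fileName
                       : null;

console.log(shortForm === undefined);    // true
console.log(explicitForm === null);      // true
console.log(shortForm === explicitForm); // false: undefined and null differ under ===
console.log(withName && withName.fileName); // "index.htm" either way
```

Both results are falsy, so code that only ever tests truthiness cannot tell them apart; strict equality (and tools like Venkman's display) can.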
Comment 33•20 years ago
> So do you mean that we should just use the outermost encoding?

I'm not sure what you mean by that.

> And is this what document.contentType is?

document.contentType is a type. It's never an encoding.

> (I guess it may be for rendered docs, but not for non-document objects (eg
> images)

document.contentType works just fine for standalone images and plugins...

> and what do we do when we are saving a link?)

We guess the type based on the extension of the URI.
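For the "guess from the extension" case, the real lookup goes through Mozilla's MIME service; purely to illustrate the idea, here is a much-simplified sketch with a made-up extension table and helper name:

```javascript
// Hypothetical, tiny extension-to-type table; the real mapping lives in the
// MIME service, not in JS.
const typesByExtension = {
  "html": "text/html",
  "htm": "text/html",
  "png": "image/png",
  "gz": "application/x-gzip"
};

// guessTypeFromURI is a made-up helper, not a real contentAreaUtils function.
function guessTypeFromURI(uriSpec) {
  // Grab the last dot-suffix that runs to the end (or to ? / #).
  const match = /\.([^./?#]+)(?:[?#]|$)/.exec(uriSpec);
  const ext = match ? match[1].toLowerCase() : null;
  return (ext && typesByExtension[ext]) || "application/octet-stream";
}

console.log(guessTypeFromURI("http://example.org/fred.html"));   // text/html
console.log(guessTypeFromURI("http://example.org/data.tar.gz")); // application/x-gzip
```

Note how the .tar.gz case yields the gzip type: extension-based guessing only ever sees the outermost suffix, which is exactly why the decode-or-not decision for "save link" is hard.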
Comment 34•20 years ago
> I'm not overly keen working in a weakly-typed environment like JS.
yeah well. me neither. let's move this code to c++ ;) (just kidding)
Comment 35•20 years ago
> > I'm not overly keen working in a weakly-typed environment like JS.
>
> yeah well. me neither. let's move this code to c++ ;) (just kidding)
Don't write JS as if it were C -- that is bound to get you in trouble. Learn the
language, including its idioms. Use them well. Otherwise, you should confine
yourself to C or C++, and be prepared to have patches that should have been
written in JS rejected.
Programming is not about making the problem fit your favorite language,
especially not if that language is the wrong tool for the job.
Time_lord: please report "subtle quirks in Venkman" as bugs. JS, like many
other dynamic languages, is best used in ways that do not test (arg === false)
when (!arg) is shorter, more efficient, and more generic.
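The difference between (!arg) and (arg === false) can be seen directly:

```javascript
// (!arg) is true for every falsy value; (arg === false) matches only false itself.
const falsyValues = [false, null, undefined, 0, "", NaN];

const caughtByNot = falsyValues.filter(v => !v);
const caughtByStrictFalse = falsyValues.filter(v => v === false);

console.log(caughtByNot.length);         // 6: all of them
console.log(caughtByStrictFalse.length); // 1: just false
```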
Why is this bug still UNCONFIRMED, yet being patched?
/be
Reporter
Comment 36•20 years ago
(In reply to comment #33)
> > So do you mean that we should just use the outermost encoding?
>
> I'm not sure what you mean by that.
>
> > And is this what document.contentType is?
>
> document.contentType is a type. It's never an encoding.

Ok, that all came from the comments in the old code, i.e.

  if (!isDocument && !shouldDecode && contentEncodingType) {
    // The data is encoded, we are not going to decode it, and this is not a
    // document save so we won't be doing a "save as, complete" (which would
    // break if we reset the type here). So just set our content type to
    // correspond to the outermost encoding so we get extensions and the like
    // right.
    contentType = contentEncodingType;

So this is where I'm seeing a blurring between the encoding type of a document
and the content type. I guess when the document shouldn't be decoded then the
content type is more closely related to the non-decoded encoding... ;-)

> > (I guess it may be for rendered docs, but not for non-document objects (eg
> > images)
>
> document.contentType works just fine for standalone images and plugins...
>
> > and what do we do when we are saving a link?)
>
> We guess the type based on the extension of the URI.

So what I don't understand is how we get the encoding type when we have a
document (i.e similar to when we want the contentType and we have a document,
we can use document.contentType). I.e what document field do we access (nice if
there was a document.contentEncodingType), or what XPCOM object do we use, etc.
I assume we shouldn't ALWAYS just examine the extension of the URL?
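The branch quoted from the old code amounts to a small decision function. This is only a restatement of that quoted logic, with made-up example values:

```javascript
// Restatement of the old contentAreaUtils.js branch quoted above; the
// function name and argument values are illustrative only.
function effectiveContentType(isDocument, shouldDecode, contentType, contentEncodingType) {
  if (!isDocument && !shouldDecode && contentEncodingType) {
    // Saving encoded data verbatim: report the outermost encoding as the type
    // so extension guessing comes out right (e.g. application/x-gzip -> .gz).
    return contentEncodingType;
  }
  return contentType;
}

// "Save link" of a .tar.gz sent with Content-Encoding: gzip, saved undecoded:
console.log(effectiveContentType(false, false, "application/x-tar", "application/x-gzip"));
// A rendered document being saved (decoded): the document type wins.
console.log(effectiveContentType(true, true, "text/html", "gzip"));
```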
Comment 37•20 years ago
> I guess when the document shouldn't be decoded then the content type is more
> closely related to the non-decoded encoding... ;-)

Exactly.

> So what I don't understand is how we get the encoding type when we have a
> document

If we need that information, we have to add it to documents. They don't store
that right now.
Reporter
Comment 38•20 years ago
(In reply to comment #35)
> Time_lord: please report "subtle quirks in Venkman" as bugs.

Sorry, I should have been more specific and stated 'subtle behaviour I observed
in the code while I was running it in venkman'... although it is a while now
since I observed whatever it was that made me shudder and got to me...

> Why is this bug still UNCONFIRMED, yet being patched?

Probably because the bug (the original topic of this bug) IS still unconfirmed
(by someone other than me, the reporter). The patch really belongs with bug
160454 (and I'll post any further patches there). The recent discussion has
become off-topic WRT this bug - shall we take the remainder of that bit
off-line?
Comment 39•20 years ago
> Ok, that all came from the comments in the old code, i.e
ah... yeah, well, necko code maps Content-Encoding types a server sends (e.g.
"gzip") onto content types (e.g. "application/x-gzip" (iirc)) before other code
sees it. So, content-encodings can be interpreted as content types. however,
they are different things.
Reporter
Comment 40•20 years ago
(In reply to comment #29)
> That's correct. The times we want to not apply decoding are the times when we
> want to save a gzipped file because that's the "real" file (eg a .tar.gz should
> not be decompressed).
>
> One exception I can think of is a .ps.gz that we render with a plugin; that sort
> of thing we _may_ want to save as a .ps.gz...

I have a stupid question - when DO we want to apply decoding??
Comment 41•20 years ago
If you're saving a file called "index.html" that claims to be of type text/html and gzip-encoded, for example.
Reporter
Comment 42•20 years ago
(In reply to comment #41)
> If you're saving a file called "index.html" that claims to be of type text/html
> and gzip-encoded, for example.

But if we're saving a document (which was originally a gzipped text/html file)
which has been passed to this method, then it has already been decoded (if
necessary) prior to rendering, right? So for the purposes of this method, why
would we care about the original encoding type?

Is it because (a) the WebBrowserPersist loads the document out of the cache and
the cache stores the original encoded file, or (b) because the WebBrowserPersist
may be told to reload the document from the original source (even if we already
have it cached), in which case we want the re-retrieved document automatically
decoded (just as it was when it was first retrieved)?

If it's (a), then is this correct? Because after all we are passing the document
to the WebBrowserPersist...

If it's (b), is there any harm in assuming that decoding *may* be necessary
(will WBP handle it when it isn't necessary), i.e. always assume we should
decode:

  if (shouldDecode)
    persist.persistFlags &= ~nsIWBP.PERSIST_FLAGS_NO_CONVERSION;

Or is there something I'm missing? (Likely ;-) ) ... not forgetting that we
established (#29) that *generally* we don't want to decode when doing a
'save-link'...
Comment 43•20 years ago
> Is it because (a) the WebBrowserPersist loads the document out of the cache and
> the cache stores the original encoded file
For the "save as HTML only" case this is exactly what happens. There is no way
to reconstruct the original HTML source from the document object, in general.
So when saving the raw HTML we reget it as binary data from the cache.
As for save link, that's actually the hard case. We _really_ have to guess
whether to decode or not there... The easy case is when we have a document
rendered; chances are we definitely want to decode then.
Reporter
Comment 44•20 years ago
(In reply to comment #43)

So what would be the effect of always setting:

  persist.persistFlags &= ~nsIWBP.PERSIST_FLAGS_NO_CONVERSION

regardless of encoding type? Other than that, I'm really stuck at the moment
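As an aside on the mechanics of that line: `&= ~FLAG` clears a single bit in the flag word, leaving the other persist flags untouched. A sketch with made-up bit values (not the real nsIWebBrowserPersist constants):

```javascript
// Hypothetical bit values, for illustration only; the real constants are
// defined on nsIWebBrowserPersist.
const PERSIST_FLAGS_FROM_CACHE = 0x1;
const PERSIST_FLAGS_NO_CONVERSION = 0x2;

let persistFlags = PERSIST_FLAGS_FROM_CACHE | PERSIST_FLAGS_NO_CONVERSION; // 0x3

// &= ~FLAG clears just that one bit; per comment #42, clearing NO_CONVERSION
// is how the caller signals that decoding is allowed.
persistFlags &= ~PERSIST_FLAGS_NO_CONVERSION;

console.log(persistFlags === PERSIST_FLAGS_FROM_CACHE); // true: only 0x1 remains
```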
Comment 45•20 years ago
The effect would be that sometimes you would save a file called index.html that contained gzip data.
Comment 46•20 years ago
>If it's (a) then is this correct? Because afterall we are passing the document
>to the WebBrowserPersist...
not in the case when you call SaveURI - that does not get the document.
Comment 47•20 years ago
(In reply to comment #43)
> As for save link, that's actually the hard case. We _really_ have to guess
> whether to decode or not there...

Or webbrowserpersist could be hacked to ask the helperappservice in
onstartrequest whether to do decoding
Comment 48•20 years ago
> Or webbrowserpersist could be hacked to ask the helperappservice
Oh, right. That's what I wanted to do -- change applyconversion from a boolean
to a tristate: yes, no, "decide yourself as best you can".
Reporter
Comment 49•20 years ago
(In reply to comment #47)
> Or webbrowserpersist could be hacked to ask the helperappservice in
> onstartrequest whether to do decoding

ok I'll take a look...
Comment 50•19 years ago
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months
are highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain
a copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox: http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey: http://www.mozilla.org/projects/seamonkey/
Reporter
Comment 51•19 years ago
Ok, responding to that auto-resolve warning. I haven't had time to come back to
this, but I suspect this is still a problem.
Updated•15 years ago
Assignee: file-handling → nobody
QA Contact: ian → file-handling
Updated•8 years ago
Product: Core → Firefox
Version: Trunk → unspecified
Updated•2 years ago
Severity: normal → S3