Open Bug 486921 Opened 12 years ago Updated 9 years ago
For "Save As File Only" functions, reuse the docshell code to prepare the channel to be persisted
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:22.214.171.124) Gecko/2009032609 Firefox/3.0.8 (.NET CLR 3.5.30729) Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:126.96.36.199) Gecko/2009032609 Firefox/3.0.8 (.NET CLR 3.5.30729) When a page that is already displayed in a browser window needs to be saved as file only, the network channel that was used to fetch the original page must be set up again, this time with instructions to read the data from the cache, if available. This channel setup operation is similar, but not identical, to the operation performed when a currently displayed page is reloaded. Currently, this logic is reimplemented (partially) by the various save functions defined in "contentAreaUtils.js". Instead of having the logic duplicated, the code in the docshell component could be reused. Reproducible: Always The "View Source" window already does something similar to save the displayed source locally, for editing with an external editor: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/viewsource/content/viewSourceUtils.js#138 In this case, the load operation completes inside the docshell itself, then the loaded document is saved. This code also performs character set conversion while saving the file (line 233). For this method to work in a browser window, without affecting the displayed document, there should be a way for saving the raw channel data without actually creating a content viewer. In other words, the page reload code path should be followed until just before the channel is opened, then the channel should be returned to the caller, still unopened. The code path starts here [nsDocShell::LoadPage]: http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#3625 And terminates here [aURILoader->OpenURI(aChannel, ...)]: http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#7803 Note that some of the operations that are performed by the docshell code, such as possibly adding the page to session history, are not needed if the purpose is to create a channel that can be used elsewhere.
The following bugs may be related to the differences between the two procedures that set up the channel for save and for reload. Sharing the code may resolve or help to resolve them: Bug 485196 - Web page generated by POST is retried as GET when Save Frame As used, and the page is no longer in the cache Bug 479296 - Saving POST result page silently resends POST data if page no longer in cache Bug 177329 - nsIWebBrowserPersist can't persist things whose cache entries have expired (maybe, now, a duplicate of 479296) Bug 472895 - Save page, HTML only, on document.write result document (wyciwyg document) completely fails (saves document that called write() instead) Bug 120809 - Save as function refetches data or images that are in the cache
Paolo: I've given you the Bugzilla privileges required to file bugs as NEW, and edit existing bugs.
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #2) > Paolo: I've given you the Bugzilla privileges required to file bugs as NEW, and > edit existing bugs. Thanks Gavin, this is helpful :-)
This patch is an experiment, to allow an easier review of the save process. It does not reflect necessarily how I'd do the final implementation. I created a new method named newSaveChannel in the nsIWebPageDescriptor interface. This method basically follows the code path described in comment 0, and results in a channel being returned to the caller. The channel to be persisted is prepared in the same way as history navigation does (LOAD_HISTORY type). While replicating the code path, I copied the InternalLoad function and created InternalLoadForNewSaveChannel_COPY_FOR_REVIEW. I removed much of the tests done by InternalLoad; I'm interested in knowing if the remaining checks are useful for a "Save As" operation, or if the InternalLoad function can be bypassed completely. Also, I'd be interested in knowing if the load flags that are set on the channel for history reloads may be appropriate for the "Save As" case. When this patch is present, bug 479296 is fixed since the download fails if the page with the POST data is no longer in the cache. After a reload, the page can be saved. Bug 485196 and bug 472895 are apparently fixed. I don't know whether bug 120809 is affected.
So... This only helps cases when we're saving a document that's loaded in a docshell; it doesn't help with various other cases (e.g. save image as), right? Some of the code really doesn't look quite right to me (e.g. the CheckLoadURI check), but I'd need to dig more to see whether it's actually correct. I think it's worth going through exactly how docshell and web browser persist differ in their channel setup, figuring out what those differences mean in practice (e.g. the former can read from an offline app cache) and then for each difference deciding which behavior is desirable for save as. Then seeing where we stand on reusing the docshell code.
(In reply to comment #5) > So... This only helps cases when we're saving a document that's loaded in a > docshell; it doesn't help with various other cases (e.g. save image as), > right? Yes. > Some of the code really doesn't look quite right to me (e.g. the CheckLoadURI > check), but I'd need to dig more to see whether it's actually correct. Thank you! > I think it's worth going through exactly how docshell and web browser persist > differ in their channel setup, figuring out what those differences mean in > practice (e.g. the former can read from an offline app cache) and then for > each difference deciding which behavior is desirable for save as. Then > seeing where we stand on reusing the docshell code. Yes, that's nearly what I intended to ask with my specific questions in comment 4. I don't know what the desirable behavior should be; someone with more expertise should actually answer these questions. I hope this experimental patch helps by highlighting the code path on the docshell side, regardless of what the final implementation will be. Actually, a reviewer might as well proceed in the opposite direction: first go through the docshell code, which is the most complex and complete, and figure out what the desirable behavior for "Save As" could be; then cross-check with the current "Save As" behavior. Maybe it would turn out that something that is not a difference in the current channel setup should actually be done differently.
This is an actual patch implementing the newSaveChannel method. One of the goals of this work is also to separate the channel preparation from the actual load in the docshell, which may be useful in general. However, this is not the cleanest implementation that's possible; feedback is welcome :-)
I've still not analyzed the differences between the load flags that are currently set up by the "Save As" functions and the flags that this new method will set. This in another area where some advice would be useful.
Comment on attachment 391907 [details] [diff] [review] nsIWebPageDescriptor.newSaveChannel method I like the idea of this patch, but I'm sure it is outdated by now. I'll review a new version, if someone wants to produce one.
You need to log in before you can comment on or make changes to this bug.