Closed Bug 170722 Opened 22 years ago Closed 22 years ago

referrer & headers not sent by persistence object.

Categories

(Core Graveyard :: File Handling, defect)

x86
Windows 2000
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mgabriel, Assigned: adamlock)

References

Details

Attachments

(1 file, 4 obsolete files)

http://pub101.ezboard.com/fhbsgrafix38611frm1.showMessage?topicID=2606.topic try downloading the images with save linktarget as or shift-click instead of html its saving as .html.gif
similar thing happening on http://pub101.ezboard.com/fhbsgrafix38611frm1.showMessage?topicID=2599.topic .jpg gets saved as .jpg.gif
Not a bug. Those documents are being served as image/gif.
humm, seems like its some kind of protection of those images. with netscape it works tho mozilla submitting a wrong referrer ?
Mozilla does not send a referrer for "save link as". We should. The persistence object needs to support this.
Assignee: law → adamlock
Summary: file.html saved as file.html.gif → referrer not sent by persistence object.
Blocks: 99642
QA Contact: sairuh → mdunn
nsIWebBrowserPersist::saveURI api needs referrer and headers parameters.
Summary: referrer not sent by persistence object. → referrer & headers not sent by persistence object.
Attached patch Proposed patch (obsolete) — Splinter Review
Can I have feedback on this proposed patch to add headers and referrer arguments to saveURI? I'll submit a new patch including fixes to all calls to saveURI() if this meets with general approval. Note that the nsIWebNavigation::loadURI specifies the headers argument as an input stream but the docshell impl has a horrible time turning it into a string. Therefore I thought it would save a lot of effort for nsIWebBrowserPersist to just specify the headers as a string argument. Due to its optional nature, I have not used AString since I want callers to pass nsnull if they don't care the argument.
QA Contact: mdunn → depstein
QA Contact: depstein → dsirnapalli
Hey, as long as we're changing this interface could we make it take a page descriptor too? That would sorta help with the whole "repostning postdata documents" problem....
What the persist object needs is the cache key. The docshell's impl of the IWebPageDescriptor::currentDescriptor attribute returns an ISHEntry object with a cacheKey property. So whoever calls the the persist object could grab the cache key from that assuming the persist object had a cache key parameter. It may straightfoward to implement one for the saveURI method. The saveDocument method could also have a key parameter but the issue is more complicated since the document is fixed up and saved from the DOM and not refetched from the network. I don't know if the document cache key affects how URIs contained in a page are fetched but if so then it's possible it would have some use in saveDocument too. I'll have to investigate further.
> So whoever calls the the persist object could grab the cache key This is what rpotts was trying so hard to avoid when he put the nsIWebPageDescriptor stuff in place... Hence me ccing him on this bug. Note that an nsISupports is returned, not an nsISHEntry... this is not an accident. You're correct that this would be primarily useful for saveURI. Even that is a good start... This is why I'd rather webbrowserpersist took an nsIWebPageDescriptor -- that way we can maybe create a descriptor at some point that would actually do what we want for saveDocument too and callers of nsIWebBrowserPersist would not need to be updated... (or the interface, for that matter). We know for a fact that the cacheKey will not become useful for fetching sub-parts of a page...
*** Bug 177340 has been marked as a duplicate of this bug. ***
The problem is that if the persist object takes nsIWebPageDescriptor, it relies on implementation specific details to work. In this case, persist has to know that docshell's nsIWebPageDescriptor::currentDescriptor is actually an nsISHEntry and from that grab the cache key. I'd much rather the persist object were given the cache key straight and leave it up to the caller to pull it out from wherever it is stored.
*** Bug 186030 has been marked as a duplicate of this bug. ***
Attached patch Patch (obsolete) — Splinter Review
Patch adds cache key, referrer and headers args to the saveURI method. I'll have to plug in some test combinations before requesting reviews on this. The issue of how the caller gets a cache key to give to the persist object in the first place is their problem.
Attachment #102027 - Attachment is obsolete: true
Could you please coordinate this with darin and rpotts? It would be a shame if Rick spends his time making sure cache keys are not exposed anywhere outside of Necko and nsIWebBrowserPersist requires them to function...
Cache keys are exposed as opaque nsISupports objects, available from nsICachingChannel. Anyone who is interested in saving the cache key need only grab it from their web progress listener as the data arrives in the first place. Docshell does this and anyone calling the persist object could too. In other words, it is not hard for the caller to get a cache key and supply it to saveURI. The persist object then opens the channel, QIs for nsICachingChannel and sets the provided key on it, thus allowing data to be refetched from the cache if its there. The key is just some opaque object and neither persist nor the caller needs to care what it is beyond that. This is all the docshell is doing, except it buries the key in the session history entry and ties session history entries to the nsIWebPageDescriptor. The nsIWebPageDescriptor hides the current session history entry behind an opaque currentDescriptor member that would need the persist object to use internal implementation details of the docshell in order to fetch the cache key. The caller might not even be using docshell so this is no good, and if they were it is straightforward enough to supply their own web progress listener and obtain the key that way. Please look at the docshell and see what this interface is doing.
Adam, I know what that interface is doing -- I reviewed the patch that added it. Here is the problem as I see it: 1) The callers of nsIWebBrowserPersist (99% of the time this is the browser "save as" code in Mozilla and is presumably the "save as" UI in most embedding apps) are typically not web progress listeners 2) The callers of nsIWebBrowserPersist thus have to get the cache key from the docshell directly. 3) There is no API for this short of getting the webnavigation, getting the session history, getting the entry at "0", QIing to nsISHEntry, and getting the cache key from that. Now I could certainly make Mozilla's "save as" code jump through all those hoops to get the cache key, and I will do just that if needed. But the fact that saving a web page requires knowledge of webnavigation and session history for the imbedder seems really wrong... All of this is of course tangential to this bug, which is about the much simpler referrer/headers problems...
1. The nsIWebBrowserPersist interface has an "progressListener" attribute itself, so most embedders will be familiar with it and many will implement one anyway. This is true even for nsProgressDlg.js in Mozilla. 2. The docshell's cache key is only relevant if the caller wants to save a document, otherwise it is no good. It is irrelevant for saving links, images, js, etc. Neither is the page descriptor any good if you want to save a document which is not currently visible, since only the current session history entry is accessible via nsIWebPageDescriptor. 3. There is a frozen embedding api for adding a web progress listener from nsIWebBrowser as well as via docshell, and from these it is just a matter of QI'ing the request for the nsICachingChannel to store the cache key. I really do not believe the nsIWebPageDescriptor is useful substitute at all for just handing the bona fide and easily available cache key. However, I could add some convenience code to the persist object to make life easier for the 'save as' code, i.e. nsIWebBrowserPersist would still document and expect a cache key in the nsISupports value, however the implementation would first QI for nsIWebPageDescriptor and use the cache key obtained from that instead if it does. This will introduce a new build dependency to docshell, but barring any issues with that should be reasonably straightforward.
I believe the idea of nsIWebPageDescriptor was that at a later point it could transparently be made to be useful for things other than just the HTML of the page, unlike cache keys which can't be, pretty much by definition... But yes, the real problem here is the lack of a good api for getting the "cached page" with all the things hanging off it... As for web progress listeners, the progress listener on the persistence object is the one that listens to the saving process. This can't be the listener that stashes away the cache key, because the cache key is needed before that. So we come back to the fact that the only place to get the cache key is from the docshell (and there is no way to get even that for linked resources such as images...)
Blocks: 61660
does patch send only http:// referrers sending refferer for file:, mailbox, data: is useless and for https: - a security risk
All that stuff is already handled by necko. Read nsHttpChannel::SetReferrer please....
Attached patch Updated patch (obsolete) — Splinter Review
Patch accepts a cache key or a page descriptor. I've also updated contentAreaUtils.js to supply a web page descriptor when calling saveURI. The cache key / descriptor thing only kicks in when the page in question is a post request so test with something like the mozilla.org search page.
Attachment #109842 - Attachment is obsolete: true
> + * @return NS_ERROR_INVALID_ARG One or more arguments was invalid, including if > + * referrer, extraheaders or postdata are supplied > + * when the uri scheme is not http or https. Why impose the burden of checking scheme for referrer on the caller? Can't you just ignore the referrer if the channel you create from the uri is not an HTTP channel? (It may make sense to throw if postdata or extraheaders are set, as you do now). The code in contentAreaUtils is wrong for things like "save link as", "save frame as", and probably a few other callers... That really needs to be dealt with at a much higher level in the save as code (feel free to just file a separate bug on doing that). This looks pretty good, though!
Attached patch Updated patch 2 (obsolete) — Splinter Review
Patch is more lenient about the postdata, referrer, headers args. I've also removed the code in contentAreaUtils.js with the intent that this will be covered by another bug. Boris, do you want to r= this?
Attachment #110869 - Attachment is obsolete: true
Comment on attachment 110885 [details] [diff] [review] Updated patch 2 r=bzbarsky; please file separate bugs on having a necko utility function to do that header parsing stuff and on the contentAreaUtils changes that are needed...
Attachment #110885 - Flags: review+
Comment on attachment 110885 [details] [diff] [review] Updated patch 2 Darin can you sr this as it deals mainly with cache / necko issues?
Attachment #110885 - Flags: superreview?(darin)
Comment on attachment 110885 [details] [diff] [review] Updated patch 2 >Index: mozilla/embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp >+ oneHeader.StripWhitespace(); hmm... consider this header: Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT if you stripped all whitespace, then you'd have: Last-Modified:Tue,15Nov199412:45:26GMT which looks invalid to me. perhaps you should just trim whitespace after the colon? otherwise, this patch seems fine to me.
Attachment #110885 - Flags: superreview?(darin) → superreview-
Attached patch Updated patch 3Splinter Review
Hmm, I copied that header parsing code from nsDocshell.cpp, so I have produced an updated patch that corrects docshell too. Calls to StripWhitespace are replaced with Trim on the header name and its value.
Attachment #110885 - Attachment is obsolete: true
Comment on attachment 110947 [details] [diff] [review] Updated patch 3 sr? New patch uses Trim instead and fixes code in docshell that the headers parser was based off
Attachment #110947 - Flags: superreview?(darin)
Comment on attachment 110947 [details] [diff] [review] Updated patch 3 sr=darin
Attachment #110947 - Flags: superreview?(darin) → superreview+
Thanks all, fix is checked in, bug 188253 opened to deal with callers supplying a cache key or descriptor
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
*** Bug 188726 has been marked as a duplicate of this bug. ***
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: