Closed Bug 170722 Opened 22 years ago Closed 22 years ago

referrer & headers not sent by persistence object.

Categories

(Core Graveyard :: File Handling, defect)

x86
Windows 2000
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mgabriel, Assigned: adamlock)

References

Details

Attachments

(1 file, 4 obsolete files)

http://pub101.ezboard.com/fhbsgrafix38611frm1.showMessage?topicID=2606.topic
try downloading the images with save linktarget as or shift-click
instead of html its saving as .html.gif
similar thing happening on
http://pub101.ezboard.com/fhbsgrafix38611frm1.showMessage?topicID=2599.topic
.jpg gets saved as .jpg.gif
Not a bug.  Those documents are being served as image/gif.
humm, seems like its some kind of protection of those images.
with netscape it works tho
mozilla submitting a wrong referrer ?
Mozilla does not send a referrer for "save link as".  We should.  The
persistence object needs to support this.
Assignee: law → adamlock
Summary: file.html saved as file.html.gif → referrer not sent by persistence object.
Blocks: 99642
QA Contact: sairuh → mdunn
nsIWebBrowserPersist::saveURI api needs referrer and headers parameters.
Summary: referrer not sent by persistence object. → referrer & headers not sent by persistence object.
Attached patch Proposed patch (obsolete) — Splinter Review
Can I have feedback on this proposed patch to add headers and referrer
arguments to saveURI? I'll submit a new patch including fixes to all calls to
saveURI() if this meets with general approval.

Note that the nsIWebNavigation::loadURI specifies the headers argument as an
input stream but the docshell impl has a horrible time turning it into a
string. Therefore I thought it would save a lot of effort for
nsIWebBrowserPersist to just specify the headers as a string argument. Due to
its optional nature, I have not used AString since I want callers to pass
nsnull if they don't care the argument.
QA Contact: mdunn → depstein
QA Contact: depstein → dsirnapalli
Hey, as long as we're changing this interface could we make it take a page 
descriptor too?  That would sorta help with the whole "repostning postdata 
documents" problem....
What the persist object needs is the cache key. The docshell's impl of the
IWebPageDescriptor::currentDescriptor attribute returns an ISHEntry object with
a cacheKey property. So whoever calls the the persist object could grab the
cache key from that assuming the persist object had a cache key parameter. 

It may straightfoward to implement one for the saveURI method. The saveDocument
method could also have a key parameter but the issue is more complicated since
the document is fixed up and saved from the DOM and not refetched from the
network. I don't know if the document cache key affects how URIs contained in a
page are fetched but if so then it's possible it would have some use in
saveDocument too.

I'll have to investigate further.
> So whoever calls the the persist object could grab the cache key

This is what rpotts was trying so hard to avoid when he put the
nsIWebPageDescriptor stuff in place... Hence me ccing him on this bug.  Note
that an nsISupports is returned, not an nsISHEntry... this is not an accident.

You're correct that this would be primarily useful for saveURI.  Even that is a
good start... This is why I'd rather webbrowserpersist took an
nsIWebPageDescriptor -- that way we can maybe create a descriptor at some point
that would actually do what we want for saveDocument too and callers of
nsIWebBrowserPersist would not need to be updated... (or the interface, for that
matter).  We know for a fact that the cacheKey will not become useful for
fetching sub-parts of a page...

*** Bug 177340 has been marked as a duplicate of this bug. ***
The problem is that if the persist object takes nsIWebPageDescriptor, it relies
on implementation specific details to work. In this case, persist has to know
that docshell's nsIWebPageDescriptor::currentDescriptor is actually an
nsISHEntry and from that grab the cache key.

I'd much rather the persist object were given the cache key straight and leave
it up to the caller to pull it out from wherever it is stored.
*** Bug 186030 has been marked as a duplicate of this bug. ***
Attached patch Patch (obsolete) — Splinter Review
Patch adds cache key, referrer and headers args to the saveURI method. I'll
have to plug in some test combinations before requesting reviews on this.

The issue of how the caller gets a cache key to give to the persist object in
the first place is their problem.
Attachment #102027 - Attachment is obsolete: true
Could you please coordinate this with darin and rpotts?  It would be a shame if 
Rick spends his time making sure cache keys are not exposed anywhere outside of 
Necko and nsIWebBrowserPersist requires them to function...
Cache keys are exposed as opaque nsISupports objects, available from
nsICachingChannel. Anyone who is interested in saving the cache key need only
grab it from their web progress listener as the data arrives in the first place.
Docshell does this and anyone calling the persist object could too. 

In other words, it is not hard for the caller to get a cache key and supply it
to saveURI. The persist object then opens the channel, QIs for nsICachingChannel
and sets the provided key on it, thus allowing data to be refetched from the
cache if its there. The key is just some opaque object and neither persist nor
the caller needs to care what it is beyond that.

This is all the docshell is doing, except it buries the key in the session
history entry and ties session history entries to the nsIWebPageDescriptor. The
nsIWebPageDescriptor hides the current session history entry behind an opaque
currentDescriptor member that would need the persist object to use internal
implementation details of the docshell in order to fetch the cache key. The
caller might not even be using docshell so this is no good, and if they were it
is straightforward enough to supply their own web progress listener and obtain
the key that way. Please look at the docshell and see what this interface is doing.
Adam, I know what that interface is doing -- I reviewed the patch that added 
it.  Here is the problem as I see it:

1)  The callers of nsIWebBrowserPersist (99% of the time this is the
    browser "save as" code in Mozilla and is presumably the "save as" UI in
    most embedding apps) are typically not web progress listeners
2)  The callers of nsIWebBrowserPersist thus have to get the cache key from the
    docshell directly.
3)  There is no API for this short of getting the webnavigation, getting the
    session history, getting the entry at "0", QIing to nsISHEntry, and getting
    the cache key from that.

Now I could certainly make Mozilla's "save as" code jump through all those 
hoops to get the cache key, and I will do just that if needed.  But the fact 
that saving a web page requires knowledge of webnavigation and session history 
for the imbedder seems really wrong...

All of this is of course tangential to this bug, which is about the much 
simpler referrer/headers problems...
1. The nsIWebBrowserPersist interface has an "progressListener" attribute
itself, so most embedders will be familiar with it and many will implement one
anyway. This is true even for nsProgressDlg.js in Mozilla.

2. The docshell's cache key is only relevant if the caller wants to save a
document, otherwise it is no good. It is irrelevant for saving links, images,
js, etc. Neither is the page descriptor any good if you want to save a document
which is not currently visible, since only the current session history entry is
accessible via nsIWebPageDescriptor.

3. There is a frozen embedding api for adding a web progress listener from
nsIWebBrowser as well as via docshell, and from these it is just a matter of
QI'ing the request for the nsICachingChannel to store the cache key.

I really do not believe the nsIWebPageDescriptor is useful substitute at all for
just handing the bona fide and easily available cache key.

However, I could add some convenience code to the persist object to make life
easier for the 'save as' code, i.e. nsIWebBrowserPersist would still document
and expect a cache key in the nsISupports value, however the implementation
would first QI for nsIWebPageDescriptor and use the cache key obtained from that
instead if it does. This will introduce a new build dependency to docshell, but
barring any issues with that should be reasonably straightforward.
I believe the idea of nsIWebPageDescriptor was that at a later point it could 
transparently be made to be useful for things other than just the HTML of the 
page, unlike cache keys which can't be, pretty much by definition...

But yes, the real problem here is the lack of a good api for getting 
the "cached page" with all the things hanging off it...

As for web progress listeners, the progress listener on the persistence object 
is the one that listens to the saving process.  This can't be the listener that 
stashes away the cache key, because the cache key is needed before that.  So we 
come back to the fact that the only place to get the cache key is from the 
docshell (and there is no way to get even that for linked resources such as 
images...)
Blocks: 61660
does patch send only http:// referrers
sending refferer for file:, mailbox, data: is useless
and for https: - a security risk
All that stuff is already handled by necko.  Read nsHttpChannel::SetReferrer
please....
Attached patch Updated patch (obsolete) — Splinter Review
Patch accepts a cache key or a page descriptor. I've also updated
contentAreaUtils.js to supply a web page descriptor when calling saveURI. The
cache key / descriptor thing only kicks in when the page in question is a post
request so test with something like the mozilla.org search page.
Attachment #109842 - Attachment is obsolete: true
> +   * @return NS_ERROR_INVALID_ARG One or more arguments was invalid, including if
> +   *                              referrer, extraheaders or postdata are supplied
> +   *                              when the uri scheme is not http or https.

Why impose the burden of checking scheme for referrer on the caller?  Can't you
just ignore the referrer if the channel you create from the uri is not an HTTP
channel?  (It may make sense to throw if postdata or extraheaders are set, as
you do now).

The code in contentAreaUtils is wrong for things like "save link as", "save
frame as", and probably a few other callers...  That really needs to be dealt
with at a much higher level in the save as code (feel free to just file a
separate bug on doing that).

This looks pretty good, though!
Attached patch Updated patch 2 (obsolete) — Splinter Review
Patch is more lenient about the postdata, referrer, headers args. I've also
removed the code in contentAreaUtils.js with the intent that this will be
covered by another bug.

Boris, do you want to r= this?
Attachment #110869 - Attachment is obsolete: true
Comment on attachment 110885 [details] [diff] [review]
Updated patch 2

r=bzbarsky; please file separate bugs on having a necko utility function to do
that header parsing stuff and on the contentAreaUtils changes that are
needed...
Attachment #110885 - Flags: review+
Comment on attachment 110885 [details] [diff] [review]
Updated patch 2

Darin can you sr this as it deals mainly with cache / necko issues?
Attachment #110885 - Flags: superreview?(darin)
Comment on attachment 110885 [details] [diff] [review]
Updated patch 2

>Index: mozilla/embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp

>+                oneHeader.StripWhitespace();

hmm... consider this header:

  Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT

if you stripped all whitespace, then you'd have:

  Last-Modified:Tue,15Nov199412:45:26GMT

which looks invalid to me.  perhaps you should just trim whitespace after the
colon?

otherwise, this patch seems fine to me.
Attachment #110885 - Flags: superreview?(darin) → superreview-
Attached patch Updated patch 3Splinter Review
Hmm, I copied that header parsing code from nsDocshell.cpp, so I have produced
an updated patch that corrects docshell too. Calls to StripWhitespace are
replaced with Trim on the header name and its value.
Attachment #110885 - Attachment is obsolete: true
Comment on attachment 110947 [details] [diff] [review]
Updated patch 3

sr?

New patch uses Trim instead and fixes code in docshell that the headers parser
was based off
Attachment #110947 - Flags: superreview?(darin)
Comment on attachment 110947 [details] [diff] [review]
Updated patch 3

sr=darin
Attachment #110947 - Flags: superreview?(darin) → superreview+
Thanks all, fix is checked in, bug 188253 opened to deal with callers supplying
a cache key or descriptor
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
*** Bug 188726 has been marked as a duplicate of this bug. ***
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: