Closed Bug 172279 Opened 22 years ago Closed 12 years ago

nsHttpChannel::SetCacheAsFile ignores HTTP content-encoding header

Categories

(Core :: Networking, defect)

Type: defect
Priority: Not set
Severity: major

RESOLVED INVALID

People

(Reporter: darin.moz, Unassigned)

Details

(Keywords: helpwanted, topembed-, Whiteboard: info needed)

nsHttpChannel's implementation of nsICachingChannel::cacheAsFile is broken if
the HTTP response headers include a "Content-Encoding: gzip" header.  The
problem is that we store the file in the cache in compressed form, but the
expectation of cache-as-file is that the file will match the document that would
be streamed via OnDataAvailable.  well, in this case, it doesn't match.

this potentially affects jar:http:// URLs and plugins (since those are the only
consumers of the cache-as-file feature).  fortunately, jar:http:// corresponds
to a http:// URL referencing a .jar file, which is itself compressed.  hence, it
is very uncommon to see a jar file compressed again, but as Frank Tang can
attest to, it does happen! :(

solving this bug for plugins is easy.  we can just make SetCacheAsFile fail if a
content-encoding header is set (SetCacheAsFile can only be called inside an
OnStartRequest event).
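
A minimal sketch of the plugin-side fix described above. The types here (SketchChannel, SketchResponseHead, SketchResult) are hypothetical stand-ins invented for illustration, not the real Necko classes or error codes; the only idea taken from this bug is "fail SetCacheAsFile when a Content-Encoding header is present".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; these are NOT the real
// Necko classes or error codes.
enum class SketchResult { Ok, NotAvailable };

struct SketchResponseHead {
    // Assumption: header names are stored lowercased.
    std::map<std::string, std::string> headers;

    bool HasHeader(const std::string& lowercaseName) const {
        auto it = headers.find(lowercaseName);
        return it != headers.end() && !it->second.empty();
    }
};

class SketchChannel {
public:
    explicit SketchChannel(SketchResponseHead head) : mHead(std::move(head)) {}

    // Refuse cache-as-file when the response carries a Content-Encoding,
    // because the cached bytes would not match the decoded stream.
    SketchResult SetCacheAsFile(bool value) {
        if (value && mHead.HasHeader("content-encoding")) {
            return SketchResult::NotAvailable;
        }
        mCacheAsFile = value;
        return SketchResult::Ok;
    }

    bool CacheAsFile() const { return mCacheAsFile; }

private:
    SketchResponseHead mHead;
    bool mCacheAsFile = false;
};

// Usage: a gzip-encoded response is rejected, a plain one is accepted.
inline void RunSetCacheAsFileSketch() {
    SketchResponseHead gzipHead;
    gzipHead.headers["content-encoding"] = "gzip";
    SketchChannel gzipped(gzipHead);
    assert(gzipped.SetCacheAsFile(true) == SketchResult::NotAvailable);

    SketchChannel plain{SketchResponseHead{}};
    assert(plain.SetCacheAsFile(true) == SketchResult::Ok);
}
```

With a failure return like this, a caller such as the plugin code could fall back to its own temp file, which is what already happens for no-store content according to the later comments.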

however, solving this bug for jar:http:// is going to be much more tricky.  i'm
thinking about the possibility of introducing a new cache client for
uncompressed downloads.  this would be used to store the uncompressed versions
of those files served with a Content-Encoding in the disk cache for the purposes
of cache-as-file.  HTTP could do this transparently under the hood so as to
maximize the likelihood of storing documents in the disk cache (remember that
plugins have their own temp dir w/ all its issues).  storing temporary downloads in
the disk cache is a good thing from the point of view of resource management, i
think.  nsDownloader and possibly the JAR channel also need to be taught to
reference the cache token until the cache file is completely read (something
plugins should also do).

if done right, this could eliminate the need for the plugin temp dir. i'm just
not sure how to handle a "cache-control: no-store" or HTTPS data.  do we stream
it to disk? ...and then maybe delete it as soon as the cache token is released??
Status: NEW → ASSIGNED
How do we handle no-store data that needs to go to a helper or plugin right now?
We dump it to disk (plugins do their own detection of content-encoding for
precisely the reason you describe).... Dumping it to disk in a different
directory is not really substantively different, is it?
with helper apps, the data is always streamed to a separate file, and we have
the problem of having to keep the file around for an unspecified length of time.

with plugins the situation is much different.  we know the lifetime of the
plugin (sort of), so we can delete the downloaded data if need be after the
plugin has shut down (user loads a new page).  in the case of no-store content,
etc., nsHttpChannel::SetCacheAsFile(PR_TRUE) returns an error, which causes
plugins to fall back to using their plugin-specific temp dir.  if we made necko
smarter about implementing cache-as-file, then we could avoid the plugin
specific temp dir.
another solution for jar:http:// might be to do away with the downloader concept
and just fix jar to load data from a channel instead.  not sure how much work
would be involved, since it probably means big changes to libjar.
Hmm.. What if the incoming data is bigger than the disk cache limit?  Would
SetCacheAsFile() still succeed (and temporarily exceed the cache limit)?  I'm
fine with it if it does, but we should make sure the interface makes this
clear.  ;)
nope, by default the cache will error out a call to nsIOutputStream::Write when
the single entry grows too big.  we could, i suppose, add an option to the cache
that would allow us to temporarily exclude an entry from being included in the
size calculations... that is until its descriptor is released.  hmm..
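
A rough sketch of the two behaviours mentioned above: a write that errors out once a single entry grows too big, and a possible option that exempts an entry from the limit until its descriptor is released. SketchCacheEntry and all of its members are hypothetical and do not correspond to the real cache interfaces. What happens to an oversized entry once the descriptor is released is not settled in this bug; reporting it as "should not be kept" is only an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model for illustration; NOT the real cache entry or
// descriptor interfaces.
class SketchCacheEntry {
public:
    SketchCacheEntry(std::size_t maxEntrySize, bool exemptFromLimit)
        : mMaxEntrySize(maxEntrySize), mExempt(exemptFromLimit) {}

    // Returns false (the "error out" case) if the write would push a
    // non-exempt entry past the per-entry limit; the entry is unchanged.
    bool Write(std::size_t count) {
        if (!mExempt &&
            (mSize > mMaxEntrySize || count > mMaxEntrySize - mSize)) {
            return false;
        }
        mSize += count;
        return true;
    }

    // Ends the exemption. Returns true if the entry still fits within the
    // limit, false if it is oversized (assumption: the caller would then
    // delete it rather than keep it in the cache).
    bool ReleaseDescriptor() {
        mExempt = false;
        return mSize <= mMaxEntrySize;
    }

    std::size_t Size() const { return mSize; }

private:
    std::size_t mMaxEntrySize;
    bool mExempt;
    std::size_t mSize = 0;
};

// Usage: the default entry rejects the oversized write, the exempt one
// accepts it but is flagged as oversized when the descriptor is released.
inline void RunCacheEntrySketch() {
    SketchCacheEntry normal(10, false);
    assert(normal.Write(8));
    assert(!normal.Write(3));

    SketchCacheEntry exempt(10, true);
    assert(exempt.Write(25));
    assert(!exempt.ReleaseDescriptor());
}
```
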
My point is, plugins are a lot more likely to run into this problem, especially
on multi-user systems with low quotas....
Severity: normal → major
Keywords: nsbeta1, topembed
Priority: -- → P2
Target Milestone: --- → mozilla1.3beta
Discussed in edt team meeting.  Minusing because we don't see a case where this
hurts users.  If anyone can point out a concrete case where users are harmed
please renominate.
Keywords: topembed → topembed-
It hurts users any time plugin content is served encoded (eg with gzip).  In
those cases, the plugin will fail to run and is likely to crash, in fact,
bringing down the browser.
Keywords: topembed- → topembed
Actually it should not hurt any users: plugin code saves the already-decoded
data into a tmp location if there is a "Content-Encoding" header
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/modules/plugin/base/src/nsPluginHostImpl.cpp&rev=1.453&root=/cvsroot#2533
Is there a crasher test case for any particular plugin?

on darin's comment #2
>if we made necko smarter about implementing cache-as-file, then we could avoid
>the plugin specific temp dir.
unfortunately that's not enough to eliminate the plugin tmp dir: some plugins
require the file extension to handle the content properly (bug 90558)
do we have any test cases in the wild where we see this happening for any
protocol? Where did FTang run into it?
Whiteboard: info needed
saari: yes, the china folks got hit by this bug.
discussed in edt.  Looks like there is end user/embedding client impact so plussing.
Keywords: topembed → topembed+
isn't this cache or http?
QA Contact: benc → tever
yes and no.  the related code spans various necko components (downloader, http,
cache, jar).
Target Milestone: mozilla1.3beta → mozilla1.4alpha
Priority: P2 → P5
Target Milestone: mozilla1.4alpha → mozilla1.4beta
Discussed in topembed bug triage.  Minusing.
Keywords: topembed+ → topembed-
adt: nsbeta1-
Keywords: nsbeta1 → nsbeta1-
Blocks: jar:https
Keywords: helpwanted
Priority: P5 → --
Target Milestone: mozilla1.4beta → Future
Assignee: darin → nobody
Status: ASSIGNED → NEW
QA Contact: tever → networking
Target Milestone: Future → ---
A JAR file on www.mozilla.org is suffering from this bug.

"Signed Scripts & Privileges: An Example"
http://www.mozilla.org/projects/security/components/signed-script-example.html
has a link to the page in a JAR file.
jar:http://www.mozilla.org/projects/security/components/signed-script-demo.jar!/signed-script-demo.html
You cannot see the page ("Signed Script Example") because www.mozilla.org sends the JAR file with "Content-Encoding: gzip".
The reporter of Bug 358436 is confused by this bug.

NOTE: Possibly because of proxies, you may sometimes get a non-gzip'ed file after someone has accessed it with "network.http.accept-encoding" emptied. If that happens, please try again after a while to see this problem.
nsHttpChannel::SetCacheAsFile may be going away according to bug 725993.
Depends on: 725993
This is now invalid because SetCacheAsFile is gone.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID