Closed Bug 56599 Opened 24 years ago Closed 24 years ago

Image cache never used when VALIDATE_ALWAYS set

Categories

(Core :: Networking: Cache, defect, P3)

x86
Other
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: hyatt, Assigned: hyatt)

References

()

Details

(Keywords: perf, Whiteboard: [rtm++] r=pnunn, r=be, sr=sfraser)

Attachments

(1 file)

Platform: Win32
Build: 10/13 N6 branch build
Steps
1. Go to the above URL. Watch as the background slowly loads
and as the navigation menu (it looks like a piece of parchment)
loads each individual menu item.  This is all quite slow so you
can watch the images come in.
2. Start moving through the pages of the walkthrough by clicking
on the "Next" links at the bottom.
3. Watch as most of the time the background happily reloads (along
with the navigation menu images).

Expected Results:
With the default mem/disk cache size of 1024/7680 MB, these images
should stay in the cache through every page of the walkthrough.
Test this with IE4, IE5, or NS4, and you'll see that the images
never reload throughout the walkthrough.

Actual Results:
Images should be fetched from cache.  Either our caching algorithm
is screwing this up, or something else is going wrong.
Nominating for rtm.
Keywords: rtm
Rick,

I find that for image urls, for normal loads, the channel attribute passed to 
the cache is 0. This causes an If-Modified-Since request being made for all 
images for normal loads because the user pref settings of 
VALIDATE_ONCE_PER_SESSION, VALIDATE_ALWAYS, VALIDATE_NEVER are never applied to 
image urls. Now, in the case of non-image urls, these attributes are set on the 
channel in nsDocShell::DoChannelLoad(..) method. Where should these prefs be set 
for image urls? In the image cache?, or should the nsDocShell::DoChannelLoad(..) 
method, be also called for image urls, or should netwerk cache set these user 
pref setting for all urls it processes, which have their attibute set to zero?

Thanks,
Neeti
Even with VALIDATE_ALWAYS on, the images shouldn't be reloading.  They seem not
to be in the cache at all.
 
David,

Do you find that the images are not in the cache, even if you start with a clean 
cache directory? We had a bug http://bugzilla.mozilla.org/show_bug.cgi?id=54630,
where we stopped writing files to the disk cache after 512 files were reached. 
The fix was checked on 10/12.
FWIW, on a current trunk build on Win2k, this seems to be working just fine.  On
a Win98 branch build, I see the problem.

Let me try flushing the cache on my WIn98 box when I get home.  I'll let you
knwo if the problem goes away.
This happens on both the branch and trunk, and it seems to be Win98 only.
Nope, it occurs even starting fresh with a new install and profile.
Reassigning to Pam for now, as per discussion.
Assignee: neeti → pnunn
Status: NEW → ASSIGNED
Target Milestone: --- → M18
*** Bug 57015 has been marked as a duplicate of this bug. ***
I'm not convinced that 57015 is really a dup of this bug.  Anyway, if a fix is
eventually produced, we should verify the thread pane scrolling problem
mentioned in 57015 before declaring this bug fixed.
It isn't Win98!  The problem is with the setting in the cache.  If you set it to
"Every time" instead of "Once per session", we reload images everywhere and
never use the cache.
ok.
imgcache is working as advertised.
The problem isn't a win98 problem.
The bug occurs when 'validate every time'
is set as a preference.

In 4.x, only the top level doc was validated
for this preference. Now, every item under the
doc (or everything on the page) is validated.

So unfortunately, this is working as designed.
I will reopen the chrome portion of this bug and
see what I can do to isolate the chrome images from 
this attribute's affect on the cache.

-pn
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Ok, I'm going to be a stinker about this.  I can't let this go. :)

It may be working as advertised, but I think we should reconsider the policy 
decision to alter the meaning of this pref from Netscape 4.x to 6.0.  The 
performance degradation is dramatic.

I'll go ahead and take the bug and then kick off a discussion of the policy.


Status: RESOLVED → REOPENED
Assignee: pnunn → hyatt
Status: REOPENED → NEW
Accepting.
rtm need info: 'working as designed' begs the question: why was it designed that
way?
Severity: normal → major
Whiteboard: [rtm need info]
I have found the offending line.  It's in gfx.  Attaching a one-line patch that
fixes this bug as well as 57015.
Status: NEW → ASSIGNED
Note that this patch doesn't break the reload case, since FORCE_VALIDATION and
FORCE_RELOAD still bypass the image cache.  This will basically handle both the
chrome case and the case where you browse among multiple pages with the same
images (or go back or forward to pages with the same images).
Summary: Cache doesn't work well with images → Image cache never used when VALIDATE_ALWAYS set
*** Bug 57015 has been marked as a duplicate of this bug. ***
Jump-happy code, maybe a decent optimizer will simplify it, but why not simplify
the source too?  Instead of

      if((nsIChannel::FORCE_VALIDATION & defchan_attribs)||   
-        (nsIChannel::VALIDATE_ALWAYS & defchan_attribs) ||
         (nsIChannel::INHIBIT_PERSISTENT_CACHING & defchan_attribs)||
         (nsIChannel::FORCE_RELOAD & defchan_attribs)) {

We'd have

      if (defchan_attribs & (nsIChannel::FORCE_VALIDATION |
-                            nsIChannel::VALIDATE_ALWAYS |
                             nsIChannel::INHIBIT_PERSISTENT_CACHING |
                             nsIChannel::FORCE_RELOAD)) {

Let the compiler fold the | expressions into a constant mask.

r=brendan@mozilla.org; rpotts should bless.

/be
I'll do that for the trunk if you like, but I'm keeping the original patch for
the branch, since it's 1 line. :)
Is the image cache stored on disk at any point?  I had thought not, so I wonder
why that code is conditioned for INHIBIT_PERSISTENT_CACHING.

Seeming like removing that flag as well will let SSL pages benefit from the
recovered grail, too.

In the interest of "measure once, cut twice" ...

This test was to create a new profile, set the content area to the same
size for 6.0/4.7, set the pref to ONCE per session, set the home page
to be the first of the test pages, (three identical pseudo
Netscape.com home pages), and then cycle loading these pages.

Then I would set the pref to be EVERY time, shutdown, restart and
repeat the times.

A JS routine reports the time until the onload handler fires, which is
after all images have loaded.

  http://jrgm.mcom.com/adhoc/time-to-load/home-netscape-com-1.html
  http://jrgm.mcom.com/adhoc/time-to-load/home-netscape-com-2.html
  http://jrgm.mcom.com/adhoc/time-to-load/home-netscape-com-3.html

This page has a total of 98 IMG tags, 76 of which are 'pixel.gif' used
as spacer images. The page has no background image and is almost
verbatim the contents of home.netscape.com from this summer (Changes
were to add a JS timer routine, and remove all other <script> from the
page).  Aside from the images, it is loading no other external
resources (scripts, style, object, etc.), and uses no CSS.

        msec to onload() firing
                 ONCE 4.73   EVERY 4.73   ONCE Moz      EVERY Moz
                    1609         1359         2657          2812               
                     687         1125         1547          2766               
                    1031         1000          875          2172               
                    1110          906         1375          2797               
                     922          813          844          2516               
                     860          860          859          2547               
                     860          844          859          2219               
                     844          375          860          2750               
                     812          375          843          2204               
                     812          844          860          2813               
                     907          907          860          5562               
                     375          375          860          2203               
                     375          391          891          2828               
                     375          875          891          2656               
                     843          812          860          2594               

                  ONCE 4.73    EVERY 4.73    ONCE Moz      EVERY Moz     
avg. first  3       1109          1161         1693          2583          
avg. next  12        757           698          905          2807          


Overall, Moz is 25 to 50% slower than 4.73 when checking ONCE per session
(which in itself needs investigation -- 4.7 would occasionally have some 
really fast loads  ~375msec (?)).

But, Moz is 230% to 380% slower when checking EVERY time.

----------------------------------------------------------------------
-- 2000101810 MN6 build on win2k 500MHz 128MB RAM 100Mbps LAN

----------------------------------------------------------------------
[Also of note, when I set the pref to be ONCE per session, I failed to 
actually fire the onload handler in greater than 50% of page clicks/loads).
Anybody, aware of that. Do we need a bug for that.]
I just tested this patch on a linux *debug* build.

The thread pane scrolling performance is greatly improved.  I only wish I had an
optimized build to try it out with.

I hope we get this into RTM.
Still looking for an a=...
Wait a minute.

If you set VALIDATE_ALWAYS and you change the following line
to always use the image cache if you find the image in the image cache,
you will never never update the image.

This will screw up all sorts of dynamic pages that count on us actually
validating the image.

Yes its faster, but it is also wrong.

-p
If you really want to fix this bug, the right thing to do is
1)verify the neckocache data is fresh
2)verify the imgcache version is the same as the necko cache version
  currently, we have no way to do that. This is on my list for ImgLibReturns.
3)Use the imgcache version if fresh
  or retrieve new image data and redecode if not

-pn
defer adding an image container to the imgcache until
(One small correction: the test url was 
http://jrgm.mcom.com/adhoc/time-to-load/v1.html and v2.html and v3.html)
(And the time difference is 130% to 280%, or 2.3 to 3.8 times)
I thought that Netscape 4.x did not perform validation on images.  It treated 
them as if they were "Once per session".  The proposed patch is reverting us to 
the way 4.x behaved.  Am I wrong when I say that 4.x behaved this way, or are 
you saying you disagree with reverting to 4.x's behavior?
Or is it that pages can somehow indicate that they don't want images on their 
page cached, and we'd end up ignoring that?
I just got off the phone with Pam, and I think I understand things a lot better
now.

Our current situation is that we have these options:
 - Never
 - Once per session
 - Always (every URL causes a server round-trip)

The problems this causes are:
 - We don't have a 4.x-esque setting of ``Always, except for non-top-level 
   images''
 - We migrate the 4.x ``Always, except...'' pref to ``Always, and we mean it''

So users that had their 4.x setting to always -- so that things like tinderbox
refreshes and reloading slashdot would work ``correctly'' -- are for the first
time paying a high performance price in the form of _tons_ of server round-trips
for images, like mouse-overs.

Pam and I discussed a handful of options:
 - Migrate the 4.x Always pref to be once-per-session.  Since ``reload'' will 
   always validate, regardless of cache setting, this isn't as painful for 
   dynamic content sites as it would have been in 4.x.  Pam is investigating the 
   effects on refresh-based sites like Tinderbox.
 - Add a fourth setting, ``Always (purist)'', which does the server round-trip
   each time, and modify our current Always value to mean ``Almost Always, like 
   4.x''.  This could well be a hidden pref value for RTM, since UI changes are,
   I've been led to believe, blood from the RTM stone.

(That chrome images should not go to the ``server'' every time is not
controversial: Pam believes that they should be in a separate load group, or
something, and that said group should never set VALIDATE_ALWAYS.)

With Dave's patch, we don't quite follow the 4.x behaviour, since we don't
validate for top-level images when Always is selected, but 4.x did.  When his
porn sites of choice next updated their content, I'm sure he'd have noticed. =)

Pragma: no-cache issues are distinct from this.  Nothing marked with Pragma:
no-cache should ever appear in the memory, disk or image caches, and should
therefore never cause a load to skip a server trip, right?

Pam: please correct my summary if you think it's incorrect.

Personally (and I know you're all dying to know what I think), I believe that
RTM should:
 - Make the current Always setting mimic 4.x (which requires the top-level-image
   check to be added, unless we think that our superior reload behaviour saves
   is from problems there).
And, if they want,
 - Add a hidden pref value (4, I think) to indicate the Always-purist setting.

Mozilla should visually expose both options in the prefs panel for 1.0.
I disagree with us going with wrong behavior (4.x) and
that the patch simulates 4.x behaviour.

I agree that chrome should not be validated. Chrome should be
in a different load group from the page load group and shouldn't
inherit the page's load attributes. I think that is what is now
happening with chrome.

Validating the objects on a page, take time. It will be slower.
Saying you are going to validate, and then not validating is a 
bad solution.

I set the pref to validate once per session and tinderbox is updating 
fine. The page doesn't stop dead, frozen in time.

When we migrate the user's prefs to the new browser, we should,
by default, set the pref to 'validate once per session'and add a release
note. It simulates the 4.7 behaviour much more closely. Most users will
be happy as punch, and more discerning users will understand that they
can change their pref to 'validate always' if thats what they really want.

Try setting your prefs to 'validate once' and take a look at tinderbox.
Your page does up date when the page is updated. It does the Right Thing.
-p
once someone desides how we want to migrate the pref, let me know and I'll add
it to the other pref migration code.
adding gagan to cc:. We need his input here on 
the meaning of the prefs and their application.
-p
Shaver, can you define what you mean by "top-level image"?
Chrome URLs belong to any number of load groups.  They can occur in a XUL doc, 
on a Web page, etc. etc.  I don't believe that they should be placed in special 
load groups (doing so is too risky for RTM anyway).  

Why can't we just check to see if the URI is a chrome URI in the same place 
where I applied my patch, and if it is a chrome URI, we ignore VALIDATE_ALWAYS?
When does chrome appear in a web page?

and image as toplevel doc is when you view-image rather
than access it in a page.

-p
Ok. Gordon, Gagan, Hyatt and I talked this out. 
I can give an r:pnunn, if we get a big performance
win AND we document that the pref so that the user
knows VALIDATE EVERY TIME only means VALIDATE the
top document (page) not every item on the page,
AND we work on the right solution for the next release.

Everyone agreed that the best solution is a 4th pref so
we have the following choices:
1)Verify once per session
2)Verify the page every time 
3)Verify the page and everything on the page everytime
4)Verify never

Hyatt's patch doesn't quite duplicate 4.x behaviour when 
an image is the top document, however. So we are looking
at ways to detect the case where a view-image occurs and can
provide for that case.

-p
Let me summarize the current situation before I
burn out tonight.

Use the synthdoc test from the image container, because
we only know we are using a synthdoc on the second pass.
We need to catch it on the first pass to see if we should
ignore or yield to the VALIDATE_ALL attribute.

I'm having trouble testing if the default channel and the channel
are the same, which would also tell us if the image is the
top document. I have the default channel, but I can't see how to
get the channel from the loadgroup... only the default channel.

Gordon, Hyatt, feel free to jump in at any time.
-p
Pam: what are the problems with checking the default channel on the loadgroup?
post a patch here or email so that we can solve that...
Refer to
http://lxr.mozilla.org/seamonkey/source/gfx/src/nsImageNetContextAsync.cpp#862

At this point, I have the LoadContext. I can get the loadgroup and the defload 
channel.

I don't see a way to get access to the channel. I'm sure its obvious, but I 
don't see it.

-p
thats weird. Why are we not using the necko_attribs? I guess since we don't have 
a channel at this stage the check would have to be deferred till we do. Look for 
areas where you do construct a new channel for that image. 
*** Bug 53921 has been marked as a duplicate of this bug. ***
reassigning to self while I figure out this 
 #&*%! view-image problem.

-p
!@(#$@#$*!
Assignee: hyatt → pnunn
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
I spent several days testing Hyatt's patch and
other alternative fixes. I think Hyatt's patch is
our best choice for RTM. 

Its benefits far outweigh its limitations.
With the patch, page loading for users with the
VALIDATE_FOR_EVERY_PAGE will speed up alot. (See
John's data in the bug. thanks, John)

The patch does cause view-image pages (where the image
itself is the document) to not update properly. Users
will need to shift-reload to update view-images.  This
bug will also exist when an image is the only thing in
an html frame.

I think this bug is a very small price to pay for the
benefit of increased performance.

Gagan, Gordon, Neeti and I have discussed how this bug
can be fixed in later revisions of the necko cache api's
(post RTM). I will file a separate bug to track it.

-pn


It doesn't completely mimic 4.x behaviour.
Whiteboard: [rtm need info] → [rtm+]
arrrgh.
That last line is cruft from an earlier buffer. 
Please ignore.
----------------------------------
The bug to handle imgcache/netcache interactions with
preference settings is #57730.

-pn
Hmm..all these comments about performance, and no one added the "perf" keyword
to the bug. I've built with this patch and it does DRAMATICALLY increase
performance, so I've added that to the status whiteboard so the PDT can see how
much of a win this would be.
Keywords: perf
Whiteboard: [rtm+] → [rtm+] DRAMATICALLY increases performance.
Move back to [rtm+] when the r= and sr= have noted their approval in the bug.
Whiteboard: [rtm+] DRAMATICALLY increases performance. → [rtm need info] DRAMATICALLY increases performance.
We have two r='s in the comments for this bug, but I guess we need a sr (unless
pnunn can act as a sr).

we have a r=brendan, and r=pnunn. Was pnunn supposed to be an a=, not a r=?
brendan also said something about rpotts being the a=.
Whiteboard: [rtm need info] DRAMATICALLY increases performance. → [rtm need info] DRAMATICALLY increases performance. r=pnunn, need sr=
Reordered status whiteboard comments so the "need sr=" is visible immediately
upon load of this bug.
Whiteboard: [rtm need info] DRAMATICALLY increases performance. r=pnunn, need sr= → [rtm need info] r=pnunn, need sr=, DRAMATICALLY increases performance.
Since we have two r= already, I'll sr this: sr=sfraser
cool beans.
Hyatt, we're ready to go. Awaiting an extra PLUS from pdt.

-p
Whiteboard: [rtm need info] r=pnunn, need sr=, DRAMATICALLY increases performance. → [rtm+] r=pnunn, r=be, sr=sfraser
reassigning to hyatt, cause he has fresh
files with clean change to check in.
-p
Assignee: pnunn → hyatt
Status: ASSIGNED → NEW
I can give a=, sr=, or whatever you want.  Alphabet soup, yuck.

Please take this for rtm.

/be
rtm++
Whiteboard: [rtm+] r=pnunn, r=be, sr=sfraser → [rtm++] r=pnunn, r=be, sr=sfraser
Fix checked in on branch and trunk.
Status: NEW → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
verified on branch:
WinNT4 1024
Linux rh6 1024
Mac os9 1024

needs verified on trunk
Keywords: vtrunk
*** Bug 57015 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: