Closed Bug 2860 Opened 27 years ago Closed 25 years ago

WebSite-browser sends numerous repeated requests to download layer images when cache set to 0/0

Categories

(Core :: Networking: Cache, defect, P1)

All
Solaris
defect

VERIFIED INVALID

People

(Reporter: chao, Assigned: warrensomebody)

Details

(This bug imported from BugSplat, Netscape's internal bugsystem.  It
was known there as bug #86390
http://scopus.netscape.com/bugsplat/show_bug.cgi?id=86390
Imported into Bugzilla on 02/03/99 11:49)

ESC WS-summary of the problem (chofmann told us to mark this in the bug report;
don't know what it means)

This is a cross-platform, all Communicator 4.0.x bug.
I don't know what happened to the bugsplat interface, since I can only find
version number 4.0b4 for Communicator.  Shouldn't it be 4.04b instead?
Anyway, this should be a bug against all 4.0.3 releases.

The browser can generate numerous repeated server requests for images that are
either in the layer or are the layer background images on a forced RELOAD or a
RESIZE when cache is set to 0/0.  For example, with the current Netscape home
page, there are a set of 10 images used by the menu on the top left corner.
Depending on the version/platform of the browser, the repeated download requests
for those 10 images can be anywhere from 50 to 1300.  This problem is
particularly severe with the Sun Solaris browser:  It generated 1334 download
requests for those 10 images, and 1139 of the 1334 requests are for one particular
image, mcombo.gif.

Another strange thing I observed is that the repeated request numbers are
different depending also on the server releases.  For example, with Sun Solaris
browser, the request number 1334 is against the Enterprise 3.0 server.  I also
tried it with a very old server on my Sun machine, I think it's the 1.2 release,
and the request number is 613 for those 10 images.  Strangely enough, this appears
to be a much less severe problem with the IRIX browser, though repeated requests
are also seen with the IRIX browser.

If you want to take a look at those repeated requests with your own eyes, you
can try the URL http://chaos/www/index4.html.  It is very close to what we have
on the Netscape home page.  The server access log is at
chaos:/home1/siteadm/suitespot/https-chaos/logs, and it is for an Enterprise 3.0
server on chaos.

According to fur, this problem is caused by the browser IMAGE layer, since it
has no link with the NETWORK layer, and has no sense of whether an image file
has just been loaded within the context of a network connection and/or a page.

fur also suggested a possible workaround:

<LAYER VISIBILITY="show">
<IMG NAME=work SRC=bogus WIDTH=200 HEIGHT=200>
<SCRIPT LANGUAGE="JavaScript1.2">
document.images.work.src="/images/jb.gif";
</SCRIPT>
</LAYER>

That is, to set the image SRC to a bogus thing, and then use JavaScript
to assign the SRC of the image.  I tried this, and it does not work as a
workaround.  I found out that the SRC attribute has to be dropped in the <IMG>
tag to make it somewhat work; otherwise, the browser would still send a bogus
request to load a bogus file, and that would cause a server error and overhead
to the server to handle the error, though it could save the actual download of
the image.  However, the browser would crash on a RESIZE.
This is perhaps yet another bug!?

Please mark this bug to be fixed in the next immediate release.

The Sun Solaris 4.0.x browser bug 81663 I filed in the past seems to be
related to this bug.
dp, another 4.04 Layers bug.  The work-around that fur suggested didn't quite
work.  Can you ask fur or vidur to look at this again?  Thanks.
This is an image cache problem, which is not directly related to layers.
(You should see the same sorts of problems on non-layers pages.)
I have a safe fix for the image library part of this problem.
Once I get an ok to check it into 4.04, I will.
-pn
I have checked this into Gromit tip.
When 4.04 tree opens on Thursday I'll get this checked in.
pn
pnunn's fix is going into 4.04
fenella can you help test?
The fix is now checked into 4.04 Branch and 5.0 Tip
Since the component is marked netlib, I had not seen this bug until tonight,
when Sharon came and asked me to verify it.  Based on the information in
the bug report, I do not have the resources to verify this bug.  It requires
someone with access to the server to verify it.  I talked to the bug
reporter and the developer lead, and they agree.  I will talk to pnunn on
Monday about how to verify it.
I just did a quick test with the Nov. 7 Sun Solaris 2.4 briano build,
and it does not look fixed.  In one sense, it cut down the repeated
image requests from over 600 (or at one time with some build, 1,300) for
a set of 10 images to 120 now.  Though the time spent waiting for the browser
to finish its redundant file/image requests is now cut down, I still have
to wait well over 30 seconds for it to finish the repeated download
of images with our home page, and note this is on Netscape's LAN.
In some other sense, it introduced a couple of new bugs from what I can tell:

1. In the home page, we have an HTML file that's SRC'ed from a layer:
   <LAYER NAME="menu" SRC="&{/*STOKEN*/menu_file/*ETOKEN*/};" HEIGHT=450
WIDTH=50 LEFT=0 TOP=0 Z-INDEX=100 VISIBILITY="show">
   </LAYER>

   The JS variable 'menu_file' is usually defined as 'menu4.html'.
   Before this bug fix, the browser only generated repeated image requests
   for images in 'menu4.html', but now it is generating numerous repeated
   requests for menu4.html as well on a RESIZE when the cache is 0,0.
   In the case of Netscape's home page, it generated 20 repeated requests for
   menu4.html out of 120 repeated requests for objects already downloaded on
   a RESIZE when the cache is 0,0.

2. The menu tab now disappears after a RESIZE.  I suspect it is related to
   some layout and/or image code change in the browser.

Client QA please follow these steps to verify the bug:

The easiest way to see what HTTP requests are sent to the server
from the browser is to verify the access log of the server.
You can use the web site page and the server on my Sun Solaris
machine 'chaos' as the target test server.

1. Start the test browser, make sure the cache is set to 0,0.
2. rlogin to chaos, and cd to /home1/siteadm/suitespot/https-chaos/logs.
3. Do a 'tail -f access'.
4. Go to the URL http://chaos/www/index4.html.  Note this is an
   older version of the home page on our live site, it may contain some
   broken images, and please ignore that.
5. From the output of 'tail' in step 3, check the requests sent to the server
   are for what files.  Specifically, look for the requests for image files
   after the request for "GET /menu4.html HTTP/1.0", and look for this after
   the last occurrence of the requests for "menu4.html".
6. Stop 'tail', and do a 'tail -f access' again.
7. RESIZE the browser.  Now you should see the term window with
   'tail -f access' displaying those new requests after a RESIZE.
   Alternatively, instead of doing a 'tail -f access', you can do a
   'tail -f access > <temp_file>' before each RESIZE.
   This way you can check from the <temp_file> exactly how many requests
   are sent to the server after a RESIZE.  Compare the results gathered
   here with the results gathered from step 5, and you'll see what
   duplicated requests for the same files are sent to the server.
   A few examples of those are blank.gif, big_cell.gif, mcombo.gif etc.,
   and now also menu4.html itself.

To claim this bug as fixed/verified, you must not see any request
for the same file happen more than once in server's access log after
a RESIZE, and the 'ad' layer on the lower right corner of the browser
window should follow the browser window RESIZE in a reasonably responsive
manner, say, within a few seconds on the LAN.  Note that the ad layer movement
is controlled by the JavaScript onResize event handler, which is not triggered
until the download of the browser objects is completed.
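
The pass/fail check above (no file requested more than once after a RESIZE)
could be automated with a small script run over the captured <temp_file>.
This is only a sketch: it assumes each access log line contains the quoted
request field in the form shown in this report, e.g.
"GET /images/mcombo.gif HTTP/1.0", and the function names are made up for
illustration.

```python
import re
from collections import Counter

# Matches the quoted request field, e.g. "GET /images/mcombo.gif HTTP/1.0".
# Assumption: the log quotes the request line this way; adjust if it does not.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/\d\.\d"')


def count_requests(log_lines):
    """Return a Counter mapping each requested path to its request count."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def repeated_requests(log_lines):
    """Return only the paths that were requested more than once."""
    return {path: n for path, n in count_requests(log_lines).items() if n > 1}


def report(log_path):
    """Print every repeated path in the given log file with its count."""
    with open(log_path) as log:
        for path, n in sorted(repeated_requests(log).items()):
            print(f"{n:6d}  {path}")
```

Calling report() on the file captured during a RESIZE would list each file
requested more than once; by the criterion above, an empty listing is what a
fixed build should produce.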
The menu and its tab disappearing after a RESIZE of the browser window
is correct behavior.  We did that on purpose to get around some other
problem we had with browser RESIZE.  There must be too many of them; even I
forgot what we did.
pnunn:

A correction is needed here. A file will be re-requested from the
network if it is requested with a different size. A cached image is
only a match if it is the same image with the same width and height
and the same pixel depth.
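
To make the matching rule concrete, here is a minimal sketch of a decoded
image cache keyed on URL, width, height and pixel depth. It is not the actual
image library code; ImageKey and DecodedImageCache are hypothetical names
used only for illustration.

```python
from typing import Dict, NamedTuple, Optional


class ImageKey(NamedTuple):
    """Hypothetical cache key: all four fields must match for a hit."""
    url: str
    width: int
    height: int
    depth: int  # pixel depth in bits


class DecodedImageCache:
    """Toy stand-in for a cache of decoded, sized images."""

    def __init__(self) -> None:
        self._entries: Dict[ImageKey, bytes] = {}

    def store(self, url: str, width: int, height: int, depth: int,
              pixels: bytes) -> None:
        self._entries[ImageKey(url, width, height, depth)] = pixels

    def lookup(self, url: str, width: int, height: int,
               depth: int) -> Optional[bytes]:
        # Same URL at a different size or depth is a miss, so the caller
        # has to go back to the network layer for the image bits.
        return self._entries.get(ImageKey(url, width, height, depth))
```

With this keying, a request for the same URL at a different width, height or
depth misses the cache even though the image itself is unchanged.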
pnunn, does your comment have anything to do with the test case for this
bug?  The test case -- Netscape's home page, does not have any image that
gets resized to a different width, height or depth after a browser window
RESIZE, at least not by anything, such as JavaScript code in the page.
My comments described how the image cache works. In Chao's note for testing, it
said:
"To claim this bug as fixed/verified, you must not see any request
for the same file happen more than once in server's access log after
a RESIZE...."
I would imagine we might want to test it on other pages as well.
Ok, so maybe I'm not doing this right, but the only
images I see requested repeatedly from the server logs are ibd_signup.gif
and netcent.gif. These images are not in the specified directory,
and so they can't be cached.

I am using 5.0 tip, which should be exactly the same in terms of the
image library cross format stuff. I've set breakpoints where I look up
the image in the image cache to see if I have a match. It's looking fine.
Could this be in netlib? I'm adding montulli and valeski to the stew.
adding valeski and joki to the cc list to see if additional fixes might
be required from the code they work on.

Based on the comments I see in the report, we have improved things
in 4.04 but have not fixed all the problems. Here is what I think we need
to do:

 ship 4.04 with the improvement.
 move the bug to 4.05 for continued investigation.
 if the investigation turns up additional fixes, we need to test
  to confirm we have fixed the entire problem.

marking tvf 4.05  -> let me know if this results in significant problems
 and we need to pull the release off the site for this problem
Please ignore my comment on step 5 of the QA notes I posted earlier:
'look for the requests for image files after the request for "GET /menu4.html
HTTP/1.0", and look for this after the last occurrence of the requests for
"menu4.html"'.  Below is the set of image files loaded with menu4.html, and they
get repeatedly reloaded after a RESIZE:

"GET /images/big_light_minus.gif HTTP/1.0"
"GET /images/light_minus.gif HTTP/1.0"
"GET /images/big_cell.gif HTTP/1.0"
"GET /images/light_plus.gif HTTP/1.0"
"GET /images/big_light_plus.gif HTTP/1.0"
"GET /images/no_sign.gif HTTP/1.0"
"GET /images/blank.gif HTTP/1.0"
"GET /images/mcombo.gif HTTP/1.0"
"GET /images/vgripper.gif HTTP/1.0"
"GET /images/gripper.gif HTTP/1.0"
I ran across this comment from Brendan on a related issue, which seems to
describe the example given by chao above....

     > The currently shipping layout engine sucks.  It has no persistent
     > model, which is why it reloads the original document when you resize
     > horizontally (and which would, without wysiwyg:, reset all JS variables
     > and rerun all scripts).  It has been hacked over for three major
     > releases since its author told management to find someone else to
     > rewrite it.
this one sounds like target fix version = 5.0.
can I get a witness?
I'm your witness. This is one of the billions of bugs that are related to cache
being set to 0/0 (turned off). setting to 5.0
There is a new netlib. And Gagan is interested in addressing this problem.
Here's a quick summary of the problem.

When the image library gets an image request, it checks to see if a decoded,
sized version exists in the image cache. If the page has requested a different
size for the same image, it's not a match.

The imglib then talks to the netlib. If netlib has the image bits in its
cache, then the imglib can get the cached bits to recode to the new, requested
size. We shouldn't have to get those bits from the server again.

There is a glitch when the imglib issues 2 or more requests for an image close
enough together that the first request is not finished before the second request
for the same image is issued. Another request is then sent to the server.
That request is not really needed: the image data is available in the netlib
cache.

This part of the problem is on the netlib side,
and so ... is Gagan's bug.
Gagan, check my facts: I recall talking to montulli long ago about a general
cache design flaw, which might show up for other racing loads: the cache fails
to make an entry for a cache-fill-in-progress.  With such an entry, racing
loads that lost to the first one (the NET_GetURL that knows it wants to cache,
but has not yet got data back from the server) could find the entry and wait
for the fill to complete.

Of course, you have to deal with errors, and pseudo-threading the waiting fill
requests so they hit when the fill completes, and continue running with the
cached data in hand.  But this problem is not peculiar to image fetches, it just
happens to be more likely to bite those kinds of requests.

/be
Another problem that pops to mind: no-cache pragmas in HTTP headers would need
to make a nominal-cache-fill turn into a don't-fill.  I'm sure you can think of
more wrinkles; let's list them here.

/be
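
The in-progress entry idea from the two comments above, including the
no-cache wrinkle, might look roughly like the sketch below. It is a toy,
single-threaded, callback-based model, not netlib code: CoalescingCache, the
fetch(url, on_done) callable and the on_done(data, cacheable) signature are
all assumptions made for illustration, with data set to None standing in for
a failed load.

```python
class CoalescingCache:
    """Toy cache that records fills in progress so racing loads can wait."""

    def __init__(self, fetch):
        # Hypothetical hook: fetch(url, on_done) starts one network load and
        # later calls on_done(data, cacheable); data is None on error.
        self._fetch = fetch
        self._data = {}      # url -> bytes for completed, cacheable fills
        self._waiters = {}   # url -> callbacks waiting on a fill in progress

    def get(self, url, callback):
        """Deliver the data for url to callback, loading it at most once."""
        if url in self._data:
            callback(self._data[url])            # already cached
            return
        if url in self._waiters:
            self._waiters[url].append(callback)  # join the fill in progress
            return
        # First request: record the in-progress entry before starting the
        # load, so a racing request finds it instead of hitting the server.
        self._waiters[url] = [callback]
        self._fetch(
            url,
            lambda data, cacheable=True: self._complete(url, data, cacheable),
        )

    def _complete(self, url, data, cacheable):
        waiters = self._waiters.pop(url)
        # Errors and no-cache responses turn the fill into a don't-fill,
        # but every waiter still gets the result of this single load.
        if data is not None and cacheable:
            self._data[url] = data
        for callback in waiters:
            callback(data)
```

Two get() calls for the same URL made before the first load completes result
in a single fetch, and both callbacks run when that load finishes. A no-cache
response is still handed to the waiters but is not stored, so the next
request goes back to the server.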
setting paulmac as QA contact for all gagan's bugs (sorry for the spam)
Target Milestone: M6
Per DP's suggestion marking these till M8. Though Necko lands with M7, we will
be able to verify it for M8.
Target Milestone: M8 → M9
Moving to M9
Changing all Networking Library/Browser bugs to Networking-Core component for
Browser.

Occasionally, Bugzilla will burp and cause Verified bugs to reopen when I do
this in a bulk change.  If this happens, I will fix. ;-)
Component: Networking-Core → Cache
Marking component cache. Deferred till cache lands in Necko.
Target Milestone: M9 → M10
Blocks: 14050
Target Milestone: M10 → M12
Deferring till cache lands. CC'ng fur.
Moving Assignee from gagan to warren since he is away.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → INVALID
This is a bug on the old MozillaClassic codebase concerning layer-related
features that we don't even support anymore.  Marking INVALID
Status: RESOLVED → REOPENED
I thought this was a dup of one Cathleen reported just recently. She left the
browser up overnight on the epinions site, and the next day they had gotten
50,000 hits from her. I'm going to reopen until we verify.
As I recall, this 4.x bug was specific to repeated images in layers, i.e. the
same image used in more than one layer and it only happened during a
resize-reload.  I would be rather surprised if the Gecko team managed to
duplicate exactly this problem in a completely new codebase, especially now that
resize-reloads don't even exist, and I'm sure that Cathleen's bug, whatever it
is, is not directly related.  But, if you want to leave this bug open, feel
free...
Resolution: INVALID → ---
Clearing INVALID resolution due to reopen.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → INVALID
Given that this bug originates from '97, and Cathleen isn't around to tell us
what steps we have to do to reproduce it today (if indeed it still happens),
I'm marking this invalid.
Bulk move of all Cache (to be deleted component) bugs to new Networking: Cache
component.
Sorry for the spam, changing QA contact.
QA Contact: paulmac → tever
verified invalid
Status: RESOLVED → VERIFIED