big regression in page loading times on Mac

Status: RESOLVED WONTFIX
Product: Core
Component: Networking: Cache
Severity: major
Version: Trunk
Keywords: perf
Reporter: John Morrison
Assignee: John Morrison
Opened: 17 years ago
Last modified: 4 years ago
Attachments: 5

Description

17 years ago
Today's tests showed a big slowdown in reported page load times. The tinderbox
on coffee (which may be reliable now) showed the increase happening with a 
build pulled at ~8pm.

It almost looks like the improvements noted in bug 75868 were unwound.
(Assignee)

Comment 1

17 years ago
Created attachment 30831 [details]
the time series of page load times, per platform
(Assignee)

Comment 2

17 years ago
Created attachment 30836 [details]
page load times for 33 urls; apr-6 to apr-13; win98
(Assignee)

Comment 3

17 years ago
Created attachment 30837 [details]
page load times for 33 urls; apr-6 to apr-13; linux
(Assignee)

Comment 4

17 years ago
Created attachment 30838 [details]
page load times for 33 urls; apr-6 to apr-13; mac

Comment 5

17 years ago
Could be that we now actually wait for all the images to load before firing the
onload handler?
(Assignee)

Comment 6

17 years ago
That's what I was wondering (this is what got mid-air collided).

For win98 and linux, this looks very much like the mirror image of 
bug 75868. On mac, well, I don't know what to make of those results.

So, am I measuring some kind of artifact here? Was mozilla telling me 
it was done loading at a different point in the process of loading a 
page and its images? Or, was the other result bona fide, and somehow 
it's been nullified by another checkin?

Comment 7

17 years ago
Keyword soup: let's not let this one get buried
Severity: normal → major
Keywords: nsbeta1, nsCatFood, perf

Updated

17 years ago
Blocks: 71668

Comment 8

17 years ago
It's probably worth noting that I'm not *feeling* an
appreciable slowdown.  So whatever it is would seem
to be something that affects benchmarks rather
than humans.  Anyone else?
(Assignee)

Comment 9

17 years ago
Created attachment 31657 [details]
comparison of the 32 url's that consistently loaded: mar19 to apr18
(Assignee)

Comment 10

17 years ago
*** Bug 75868 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 11

17 years ago
So, I confirmed that my test script was being notified of document.onload
before all images had loaded for the builds 4/10, 4/11, and 4/12. (I rigged a
few pages to dump out additional times for <img>.onload events, and compared
with the time reported for document.onload). That is why times were apparently 
better on windows and linux (and that basically makes bug 75868 a dup of this
one).
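
A minimal sketch of that comparison, for illustration only: the function name, image names, and timings below are hypothetical and are not taken from the actual test script. It flags a page where any <img>.onload time is later than the time reported for document.onload.

```python
def images_after_document_onload(document_onload_ms, image_onload_ms):
    """Return {image: lag in msec} for images whose onload fired
    after document.onload was reported."""
    return {
        name: t - document_onload_ms
        for name, t in image_onload_ms.items()
        if t > document_onload_ms
    }


# Made-up timings (msec since the page load started) for one rigged page.
late = images_after_document_onload(
    1200,
    {"logo.gif": 900, "banner.jpg": 1650, "footer.png": 1210},
)

# If any image finished later, document.onload fired "early" for this page.
fired_early = bool(late)
print(late, fired_early)
```

A page with a non-empty result would have had its load time under-reported by at least the largest lag.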

However, at the same time that this was happening, Mac was not also showing
slower times. Don't know why exactly, but here's some more information on
what times were measured on the three platforms.

To explain what that graph represents:

1) all times are (effectively) the average "already cached" page load
   time for a set of URLs (time in msec).

2) for each platform, there are two sets of URL's shown:

  (a) whatever URL's loaded on a given day. (Note: as libpr0n was turned on,
      first on windows (3/23), then Mac (3/26), and finally linux (3/30), 5
      of the slowest pages stopped loading; this artificially lowered this
      average).

  (b) only a 32 URL subset, which were the 32 URL's that consistently loaded
      each day of the testing period.

3) the left most three days are prior to the new cache and libpr0n (i.e.,
   that is where we were).

4) I've omitted 3/22 and 3/23, when the cache and libpr0n/win32 landed (or
   sort of landed). It just confuses the issue (i.e., some platforms did or
   did not have a cache, and/or libpr0n on a given day, and the results were
   all over the map).

5) 4/9 to 4/10 shows a drop on windows and linux, but no change on mac. This
   is also when document.onload began firing "early" (i.e., before all images
   on the page were loaded). So, these times are incorrectly low.

6) 4/12 to 4/13 shows a rise on windows and linux, and a rise on mac. This is
   also when document.onload stopped firing "early" (as much as I can tell
   from testing).

7) between the build of 4/9 and the build of 4/13, the times for mac have
   gone up in total by 0.38 seconds, or 13%, while (modulo the false drop in
   the middle) times on windows and linux have stayed effectively level.

I'll note that there is one "smoking gun" on the mac: the times for
www.microsoft.com went from 2.0 sec. to 5.0 sec., which accounts for about
25% of the total increase.
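
The two averages in (2) and the "about 25%" figure can be sketched roughly as follows. This is an illustration under stated assumptions: the per-URL times in `daily` are made up, the helper names are hypothetical, and the share calculation assumes the 0.38 second rise is the per-URL average over the 32-URL subset.

```python
def consistent_subset(daily_times):
    """URLs that loaded on every day of the testing period."""
    days = list(daily_times.values())
    common = set(days[0])
    for day in days[1:]:
        common &= set(day)
    return common


def mean_ms(times, urls=None):
    """Average load time in msec over the given URLs (default: all)."""
    urls = list(times) if urls is None else list(urls)
    return sum(times[u] for u in urls) / len(urls)


# Made-up data: one slow page stops loading on the second day.
daily = {
    "day1": {"a": 1000, "b": 2000, "slow": 9000},
    "day2": {"a": 1100, "b": 2100},
}
subset = consistent_subset(daily)
all_day1 = mean_ms(daily["day1"])             # average (a): whatever loaded
subset_day1 = mean_ms(daily["day1"], subset)  # average (b): consistent subset
subset_day2 = mean_ms(daily["day2"], subset)

# Share of the Mac increase attributable to www.microsoft.com,
# assuming a 0.38 s average rise across 32 URLs.
total_increase_s = 0.38 * 32
microsoft_increase_s = 5.0 - 2.0
share = microsoft_increase_s / total_increase_s
print(all_day1, subset_day1, subset_day2, round(share, 3))
```

With these assumptions, a 3.0 second rise on one URL is a bit under a quarter of a 12.16 second total rise, which matches the "about 25%" quoted above.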

Summary: big regression in page loading times → big regression in page loading times on Mac

Comment 12

17 years ago
sfraser or saari, can you guys help investigate why we're much slower on loading
www.microsoft.com on the Mac?  jrgm's last comment said it accounts for 25% of 
total page-load regression on Mac.

Thanks!!  :-)

Comment 13

17 years ago
this is fast for me on my powerbook. not sure what the problem could be. i 
see about 1-2 seconds, not 5.
(Assignee)

Comment 14

17 years ago
I ran the builds for Apr 9th and Apr 13th once again on 3 different G4
Macs, to see whether (a) the smoketest machine could reproduce the times
that were previously measured (i.e., had that machine changed), and/or 
(b) whether that machine was somehow showing a response that other similar 
machines were not showing. Here are those results.

                                   First Visit         Subsequent Visits
                              Apr9    Apr13    %       Apr9   Apr13     %
----------------------------------------------------------------------------
500MHz/128MB/G4               2450    2377    97%      2101    2083    99%
450MHz/128MB/G4               2793    2709    97%      2366    2365   100%
450MHz/128MB/G4 - smoketest   3452    3718   108%      2905    3314   114%

All machines 256MB RAM, VM ON (257), File Sharing OFF.

So, the answer is (b): that machine is flat out showing something that 
is not happening on two similar machines, but is doing so reliably (i.e.,
this does not appear to be a case of the machine having changed in some way).

Note that even before this regression, it was running this test about 25%
slower than another G4/450. Now it is 40% slower on the same basic hardware.

sfraser or pinkerton: is there a good time for you to come poke around at this 
machine to figure out what is "wrong" with it? I don't want to "tune
for the test", but I also don't want to test in a way that isn't a mainstream 
configuration.
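
For reference, the percentages in the table and the "25% slower" / "40% slower" comparison can be recomputed from the raw times like this. The numbers are copied from the table above; the dictionary layout and function name are just for illustration.

```python
# (Apr9, Apr13) average page load times in msec, from the table above.
times = {
    "500MHz G4": {"first": (2450, 2377), "subsequent": (2101, 2083)},
    "450MHz G4": {"first": (2793, 2709), "subsequent": (2366, 2365)},
    "450MHz G4 smoketest": {"first": (3452, 3718), "subsequent": (2905, 3314)},
}


def pct(new_ms, old_ms):
    """new as a whole-number percentage of old."""
    return round(100 * new_ms / old_ms)


# Apr13 as a percentage of Apr9, per machine and visit type.
change = {
    machine: {visit: pct(apr13, apr9) for visit, (apr9, apr13) in visits.items()}
    for machine, visits in times.items()
}

# Smoketest machine relative to the other 450MHz G4, on each build.
smoke = times["450MHz G4 smoketest"]
other = times["450MHz G4"]
relative = {
    visit: (pct(smoke[visit][0], other[visit][0]),   # Apr9
            pct(smoke[visit][1], other[visit][1]))   # Apr13
    for visit in ("first", "subsequent")
}
print(change)
print(relative)
```

On these figures the smoketest machine goes from roughly 123-124% of the other G4/450 on Apr 9 to roughly 137-140% on Apr 13.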

Comment 15

17 years ago
My top guess: the virus checker on the machine is killing file I/O. I'd be happy 
to vet it.

Comment 16

17 years ago
so um, do we know why this is assigned to me?

Comment 17

17 years ago
cuz yer a loozer.
(Assignee)

Comment 18

17 years ago
sfraser had a look through the mac where these tests are run, and identified
a few things that were suboptimal. We decided to just try the most likely, 
and temporarily disabled the virus software that is in place on that Mac, but
not on the other two that I tested yesterday. 

So, rather dully, it turns out that this was the cause. With that disabled, the 
times show that the measured slowdown between apr09 and apr13 was a "bad 
reaction" between a change in mozilla and the virus checking software. 

                                   First Visit         Subsequent Visits        
                              Apr9    Apr13    %       Apr9   Apr13     %
----------------------------------------------------------------------------
smoketest - no virus checker  2824    2907   103%      2441    2397   98%

I've re-enabled the virus checking on that machine, pending some discussion
on the right configuration to test on the mac. 

Given that this is tied to the virus checking, it would hint at some change
in either file or network I/O that mac is sensitive about. Any mac-heads 
motivated to hunt this down further? (Taking bug from pav for now, so alecf
doesn't taunt him further ...).
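
One rough way to see the "bad reaction" is to compare the smoketest times with and without the virus checker for each build. This is only a sketch: the times are copied from the two tables in this bug, the runs were made at different times, and the function name is hypothetical.

```python
# Smoketest Mac average page load times in msec, from the tables in this bug.
with_checker = {
    "Apr9": {"first": 3452, "subsequent": 2905},
    "Apr13": {"first": 3718, "subsequent": 3314},
}
without_checker = {
    "Apr9": {"first": 2824, "subsequent": 2441},
    "Apr13": {"first": 2907, "subsequent": 2397},
}


def overhead_pct(with_ms, without_ms):
    """Extra time with the checker on, as a whole-number percentage."""
    return round(100 * (with_ms / without_ms - 1))


overhead = {
    build: {
        visit: overhead_pct(with_checker[build][visit],
                            without_checker[build][visit])
        for visit in ("first", "subsequent")
    }
    for build in ("Apr9", "Apr13")
}
print(overhead)
```

By this measure the checker cost about 19-22% on the Apr 9 build and about 28-38% on the Apr 13 build, while the builds differ by only a few percent with the checker off.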

Assignee: pavlov → jrgm

Comment 19

17 years ago
To what end? To show that virus checkers make file I/O performance really suck? I 
think we know that already; we should just file that in the back of our minds, 
turn off the virus checking on this machine, and get on with life.

jrgm: any plans to rerun the page loading tests for a set of older builds with 
virus checking off? It would be nice to re-write the page-loading history now we 
know about this.
(Assignee)

Comment 20

17 years ago
Yes, virus checkers make file I/O performance suck. However, the win98 machine
that is tested is also running a virus checker (in continuous scan mode) and
it did not show a 14% increase during the same period, for the same code 
changes. My question was directed towards "how come? Why only the Mac? Did
someone make a Mac-unfriendly assumption in their code?"

But, if Mac people are comfortable with letting that anomaly pass, then there
is nothing further to look at. (Yeah, and I'm tired of this bug too).

As far as disabling the virus checker on the Mac, I'll go with what you and/or
pinkerton, etc. recommend as the best test environment. However, since I don't 
absolutely "control" that test machine, and given recent woes with win32 virii, 
I'd rather get consent before just doing this unannounced. chofmann, twalker:
is it ok if we disable the virus checking on the Mac smoketest machine? 

As far as redoing some portion of the Mac tests, I think we'll just have to 
live with a Roger Maris asterisk for the existing results.

Comment 21

17 years ago
we could look at not running the checker during test runs
but don't disable the checker altogether unless you want to
help granrose scan ton-o-gigabytes during the next infection.

if we are looking to normalize all the results I'd just as soon
normalize with virus checking in place

Comment 22

17 years ago
well, i'm not absolutely comfortable with just letting this lie. Some of our 
users will be running virus checkers, and they will be running with file sharing 
on. What i'd like to know is does IE share the same % slowdown with these two 
variables? If so, discussion over. If not, then we have serious work to do.

Comment 23

17 years ago
The question is, what do we really want the page-loading data to tell us? Should 
it represent performance on the average user's machine? Or should it reveal 
performance changes 'in isolation', on a carefully-vetted machine that removes as 
many other variables as possible?

Enabling virus checking, and other background-process software on the machine 
will do two things: i) slow things down, and ii) add more noise. I think we need 
to reduce noise as much as possible, but continue to keep software loaded that 
"most users" will have on their machines. I don't think that most users have 
virus software scanning every file when it's opened.

So my vote is to disable the virus software (no screams about the danger of 
viruses, please; Mac is much less susceptible to viruses than Windows), and turn 
off other software that could kick in in the background and add noise (Timbuktu, 
time synch, Software update, Sherlock indexing).

I do agree with pinkerton's suggestion that we should compare our performance 
with virus-scanners enabled against IE, but I don't think we should do this as 
part of the regular page-loading tests.

Comment 24

17 years ago
Agree, the biggest part of this decision should be made on
the grounds of "what constitutes a standard configuration."

I'm guessing by the end of this year, or in the very short term,
it might be hard to find a user that doesn't want to run a fairly
high level of virus checking or OS vendors that provide it as
a standard feature.
http://netscape.zdnet.com/zdnn/stories/news/0,4586,5081825,00.html

Also agree that we should redo the IE and 4.x numbers with
virus checking on for a sanity check.

Comment 25

17 years ago
i know of no mac users that run with virus checking. macs just aren't infected 
like win32 systems are.

Comment 26

17 years ago
Could you set the software to _ignore_ activity by APPL MOZZ (assuming that's 
mozilla's appname) and have the same software scan mozilla (set to kill 
application if it detects a virus) before running mozilla?  I know this would 
affect total testing time, but imo it would be a safe approach.

Also, could someone find out if it's TCP/IP, Disk I/O, Cache management or 
something else about the scanning that is killing us?  I don't see any comments 
from cache people, but it does seem like a reasonable possibility.

Does the scanner check files of all types or only certain types, and does cache 
correctly label images as something likely to be treated as non executable?

Comment 27

17 years ago
on second thought, from
Additional Comments From John Morrison 2001-04-26 01:25  in Bug 77002:
(Running over internal LAN, 500/128 win98, 500/128 Linux, 450/256 G4).
                      First visit   Subseq. visits
mac (with virus on)        3%              8%
mac (with virus off)      13%             16%

So it appears to be cache. If this box has 256MB of RAM, could we disable disk 
cache and set memory cache to, say, 64MB?

These tests are manual right? sorry to ask you to do more work...
Component: Browser-General → Networking: Cache

Updated

17 years ago
Blocks: 104166
(Assignee)

Comment 28

15 years ago
Too much water under the bridge to make this worthwhile anymore.
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → WONTFIX