Today's tests showed a big slowdown in reported page load times. The tinderbox on coffee (which may be reliable now) showed the increase happening with a build pulled at ~8pm. It almost looks like the improvements noted in bug 75868 were unwound.
Could be that we now actually wait for all the images to load before firing the onload handler?
That's what I was wondering (this is what got mid-air collided). For win98 and linux, this looks very much like the mirror image of bug 75868. On mac, well, I don't know what to make of those results. So, am I measuring some kind of artifact here? Was mozilla telling me it was done loading at a different point in the process of loading a page and its images? Or was the other result bona fide, and it has somehow been nullified by another checkin?
Keyword soup: let's not let this one get buried
Severity: normal → major
Keywords: nsbeta1, nsCatFood, perf
It's probably worth noting that I'm not *feeling* an appreciable slowdown. So whatever it is would seem to be something that affects benchmarks rather than humans. Anyone else?
Created attachment 31657 [details] comparison of the 32 url's that consistently loaded: mar19 to apr18
*** Bug 75868 has been marked as a duplicate of this bug. ***
So, I confirmed that my test script was being notified of document.onload before all images had loaded for the builds of 4/10, 4/11, and 4/12. (I rigged a few pages to dump out additional times for <img>.onload events, and compared those with the time reported for document.onload.) That is why times were apparently better on windows and linux (and that basically makes bug 75868 a dup of this one). However, while this was happening, Mac was not showing the same drop in times. I don't know why exactly, but here is some more information on the times measured on the three platforms. To explain what that graph represents:

1) All times are (effectively) the average "already cached" page load time for a set of URLs (time in msec).

2) For each platform, there are two sets of URLs shown:
   (a) whatever URLs loaded on a given day. (Note: as libpr0n was turned on, first on windows (3/23), then Mac (3/26), and finally linux (3/30), 5 of the slowest pages stopped loading; this artificially lowered this average.)
   (b) a 32-URL subset: the 32 URLs that consistently loaded on each day of the testing period.

3) The leftmost three days are prior to the new cache and libpr0n (i.e., that is where we were).

4) I've omitted 3/22 and 3/23, when the cache and libpr0n/win32 landed (or sort of landed). It just confuses the issue (i.e., some platforms did or did not have a cache and/or libpr0n on a given day, and the results were all over the map).

5) 4/9 to 4/10 shows a drop on windows and linux, but no change on mac. This is also when document.onload began firing "early" (i.e., before all images on the page were loaded). So, these times are incorrectly low.

6) 4/12 to 4/13 shows a rise on windows and linux, and a rise on mac. This is also when document.onload stopped firing "early" (as far as I can tell from testing).
7) between the build of 4/9 and the build of 4/13, the times for mac have gone up in total by 0.38 seconds, or 13%, while (modulo the false drop in the middle) times on windows and linux have stayed effectively level. I'll note that there is one "smoking gun" on the mac: the times for www.microsoft.com went from 2.0 sec. to 5.0 sec., which accounts for about 25% of the total increase.
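For illustration, the rig described in the comment above (extra timestamps for each <img>.onload, compared against document.onload) can be sketched roughly like this. This is a hypothetical reconstruction, not the actual test script; the name makeLoadRecorder and the wiring shown in the trailing comment are illustrative assumptions.

```javascript
// Hypothetical sketch: record a timestamp for each <img>.onload and for
// document.onload, then ask whether the document reported itself loaded
// before the last image had finished loading.
function makeLoadRecorder() {
  const imageTimes = [];
  let docTime = null;
  return {
    recordImageLoad(t) { imageTimes.push(t); },
    recordDocumentLoad(t) { docTime = t; },
    // true if document.onload fired before every image had loaded
    firedEarly() {
      if (docTime === null || imageTimes.length === 0) return false;
      return docTime < Math.max(...imageTimes);
    },
  };
}

// In a rigged test page this would be wired up roughly as:
//   const rec = makeLoadRecorder();
//   for (const img of document.images)
//     img.onload = () => rec.recordImageLoad(Date.now());
//   window.onload = () => rec.recordDocumentLoad(Date.now());
```

If firedEarly() returns true after a page settles, the onload notification arrived before all images, which is the behavior described for the 4/10 to 4/12 builds.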
Summary: big regression in page loading times → big regression in page loading times on Mac
sfraser or saari, can you guys help investigate why we're much slower loading www.microsoft.com on the Mac? jrgm's last comment said it accounts for 25% of the total page-load regression on Mac. Thanks!! :-)
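As a quick sanity check of the 25% figure, under the assumption that the reported averages are taken over the 32-URL subset mentioned earlier:

```javascript
// Rough arithmetic check of the "~25% of the total increase" claim,
// assuming the averages are computed over the 32-URL subset.
const urls = 32;
const msftIncreaseSec = 5.0 - 2.0;              // www.microsoft.com: 2.0s -> 5.0s
const perUrlShareSec = msftIncreaseSec / urls;  // its contribution to the average
const totalIncreaseSec = 0.38;                  // measured mac increase, 4/9 -> 4/13
const share = perUrlShareSec / totalIncreaseSec;
console.log((share * 100).toFixed(1) + "%");    // roughly a quarter of the increase
```

Which comes out to about 25%, consistent with jrgm's number.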
this is fast for me on my powerbook. not sure what the problem could be. i see about 1-2 seconds, not 5.
I ran the builds for Apr 9th and Apr 13th once again on 3 different G4 Macs, to see whether (a) the smoketest machine could reproduce the times that were previously measured (i.e., had that machine changed), and/or (b) that machine was somehow showing a response that other similar machines were not showing. Here are those results.

                                  First Visit           Subsequent Visits
                               Apr9    Apr13    %      Apr9    Apr13    %
----------------------------------------------------------------------------
500MHz/128MB/G4                2450    2377    97%     2101    2083    99%
450MHz/128MB/G4                2793    2709    97%     2366    2365   100%
450MHz/128MB/G4 - smoketest    3452    3718   108%     2905    3314   114%

All machines 256MB RAM, VM ON (257), File Sharing OFF.

So, the answer is (b): that machine is flat out showing something that is not happening on two similar machines, but is doing so reliably (i.e., this does not appear to be a case of the machine having changed in some way). Note that even before this regression, it was running this test about 25% slower than another G4/450. Now it is 40% slower on the same basic hardware. sfraser or pinkerton: is there a good time for you to come poke around at this machine to figure out what is "wrong" with it? I don't want to "tune for the test", but I don't want to test in a way that isn't a mainstream configuration.
My top guess: the virus checker on the machine is killing file I/O. I'd be happy to vet it.
so um, do we know why this is assigned to me?
cuz yer a loozer.
sfraser had a look through the mac where these tests are run, and identified a few things that were suboptimal. We decided to just try the most likely, and temporarily disabled the virus software that is in place on that Mac, but not on the other two that I tested yesterday. So, rather dully, it turns out that this was the cause. With that disabled, the times show that the measured slowdown between apr09 and apr13 was a "bad reaction" between a change in mozilla and the virus checking software.

                                  First Visit           Subsequent Visits
                               Apr9    Apr13    %      Apr9    Apr13    %
----------------------------------------------------------------------------
smoketest - no virus checker   2824    2907   103%     2441    2397    98%

I've re-enabled the virus checking on that machine, pending some discussion of the right configuration to test on the mac. Given that this is tied to the virus checking, it hints at some change in either file or network I/O that the mac is sensitive to. Any mac-heads motivated to hunt this down further? (Taking the bug from pav for now, so alecf doesn't taunt him further ...)
Assignee: pavlov → jrgm
To what end? To show that virus checkers make file I/O performance really suck? I think we know that already; we should just file that in the back of our minds, turn off the virus checking on this machine, and get on with life. jrgm: any plans to rerun the page loading tests for a set of older builds with virus checking off? It would be nice to rewrite the page-loading history now that we know about this.
Yes, virus checkers make file I/O performance suck. However, the win98 machine that is tested is also running a virus checker (in continuous scan mode), and it did not show a 14% increase during the same period, for the same code changes. My question was directed towards "How come? Why only the Mac? Did someone make a Mac-unfriendly assumption in their code?" But if Mac people are comfortable with letting that anomaly pass, then there is nothing further to look at. (Yeah, and I'm tired of this bug too.) As far as disabling the virus checker on the Mac goes, I'll go with what you and/or pinkerton, etc., recommend as the best test environment. However, since I don't absolutely "control" that test machine, and given recent woes with win32 viruses, I'd rather get consent before just doing this unannounced. chofmann, twalker: is it ok if we disable the virus checking on the Mac smoketest machine? As far as redoing some portion of the Mac tests, I think we'll just have to live with a Roger Maris asterisk for the existing results.
We could look at not running the checker during test runs, but don't disable the checker altogether unless you want to help granrose scan ton-o-gigabytes during the next infection. If we are looking to normalize all the results, I'd just as soon normalize with virus checking in place.
Well, I'm not absolutely comfortable with just letting this lie. Some of our users will be running virus checkers, and they will be running with file sharing on. What I'd like to know is: does IE show the same % slowdown with these two variables? If so, discussion over. If not, then we have serious work to do.
The question is, what do we really want the page-loading data to tell us? Should it represent performance on the average users machine? Or should it reveal performance changes 'in isolation', on a carefully-vetted machine that removes as many other variables as possible? Enabling virus checking, and other background-process software on the machine will do two things: i) slow things down, and ii) add more noise. I think we need to reduce noise as much as possible, but continue to keep software loaded that "most users" will have on their machines. I don't think that most users have virus software scanning every file when it's opened. So my vote is to disable the virus software (no screams about the danger of viruses, please; Mac is much less susceptible to viruses than Windows), and turn off other software that could kick in in the background and add noise (Timbuktu, time synch, Software update, Sherlock indexing). I do agree with pinkerton's suggestion that we should compare our performance with virus-scanners enabled against IE, but I don't think we should do this as part of the regular page-loading tests.
Agree, the biggest part of this decision should be made on the grounds of "what constitutes a standard configuration." I'm guessing that by the end of this year, or in the very short term, it might be hard to find a user who doesn't want to run a fairly high level of virus checking, or OS vendors that don't provide it as a standard feature. http://netscape.zdnet.com/zdnn/stories/news/0,4586,5081825,00.html Also agree that we should redo the IE and 4.x numbers with virus checking on for a sanity check.
I know of no mac users that run with virus checking. Macs just aren't infected like win32 systems are.
Could you set the software to _ignore_ activity by APPL MOZZ (assuming that's mozilla's appname) and have the same software scan mozilla (set to kill application if it detects a virus) before running mozilla? I know this would affect total testing time, but imo it would be a safe approach. Also, could someone find out if it's TCP/IP, Disk I/O, Cache management or something else about the scanning that is killing us? I don't see any comments from cache people, but it does seem like a reasonable possibility. Does the scanner check files of all types or only certain types, and does cache correctly label images as something likely to be treated as non executable?
On second thought, from Additional Comments From John Morrison 2001-04-26 01:25 in Bug 77002 (running over internal LAN; 500/128 win98, 500/128 Linux, 450/256 G4):

                         First visit    Subseq. visits
mac (with virus on)           3%              8%
mac (with virus off)         13%             16%

So it appears to be cache. If this box has 256MB of RAM, could we disable the disk cache and set the memory cache to, say, 64MB? These tests are manual, right? Sorry to ask you to do more work...
Component: Browser-General → Networking: Cache
Too much water under the bridge to make this worthwhile anymore.
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → WONTFIX