Further adventures in page loading times (beyond style resolution)

VERIFIED FIXED in mozilla0.8.1

Status

defect
VERIFIED FIXED
Opened 19 years ago
Last modified 15 years ago

People

(Reporter: jrgmorrison, Assigned: darin.moz)

Tracking

({perf, topperf})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: nsbeta1+)

Attachments

(3 attachments)

The page loading times for Mac and Linux increased over and above the
known problems with style resolution and another cross-platform (XP) bug
that affected page load times for one page in a known way (see below for
details).

Overall, Mac page load times are now averaging ~7800msec, up from
~6300msec on Monday and Tuesday, and ~5500msec last week.
Linux is up to ~6800msec, up from ~5300msec on Tuesday, and ~4500msec
last week.

There isn't yet a verification build that includes the backout of 
the style resolution problem code, but twalker will be running that
build in the morning. Unless this is a transient build problem with
the builds I tested, I expect that tomorrow's tests for Mac and Linux
will still show an increase over times for last week.

[Tracy: can you update this bug with a copy/paste of the first row of
the table from the Mac test tomorrow, the line "All Pages"].

I don't know whom to assign this to, so I'll take it myself for now.
CC'ing some folks in the interim, in case they want to look at recent
checkins that may have affected Mac and Linux (but apparently not Win32).

Details from post to porkjockeys/performance: 
-----

An updated time series: 

1) The times for 01/24/2001 are for builds before the style checkin
   was backed out (with the exception of Windows, which has times for
   both the a.m. and the p.m. build, post-backout). In general, the time
   on Windows _has returned_ to the level of previous tests.

2) However, today's builds on all platforms were affected by 
          http://bugzilla.mozilla.org/show_bug.cgi?id=63750
   a.k.a. http://bugzilla.mozilla.org/show_bug.cgi?id=52798
   where slashdot.org gets into a state where layout oscillates,
   with multiple fetches of the same image from the server, and
   finally completes the page load in >20s. I cancel a page load if
   it takes >30s, but in this case the ~20s figure is included
   in the overall average (increasing the average load time by
   >400msec). I'm not sure how to deal with it otherwise.

3) Over and above the effect of the style checkin, and the effect of
   the slashdot page, Mac times and Linux times are showing an
   additional increase from Monday and Tuesday. 

   I re-ran today's Mac build two more times with similar
   numbers, and re-ran yesterday's Mac build, and I can reproduce the
   increase [i.e., machine and profile are the same, but today's build
   is slower than yesterday's]. I also ran the Linux test again, back
   to back, same machine, same profile (but with an emptied cache), a
   dormant server, and a quiet network, and there's a clear difference
   (unless, of course, I'm wrong, which is, of course, conclusion #2).
Times are lower after the backout of code for 

 1) http://bugzilla.mozilla.org/show_bug.cgi?id=66263

but are still up from last week, due to:

 2) http://bugzilla.mozilla.org/show_bug.cgi?id=63750, but that is 
    not of major concern (affects one page in a quirky way).

 3) http://bugzilla.mozilla.org/show_bug.cgi?id=66516, which
    (apparently) only affects Mac and Linux, but I don't know much
    else about the reason for the slowdown.


perf keyword
Keywords: perf
Looks like, minus the style checkin, there is still about a 400-500ms increase
that is unaccounted for. I am in the process of comparing load times with and
without my necko changes made on 1/23 around 2pm PST.
It looks like about 1200-1300ms still unaccounted for under linux and mac.
adding top perf
Keywords: topperf
I've identified a 500ms slowdown (on Linux) in my changes from 1/23.  I'm not
sure yet whether my changes also account for the other 700-800ms.
Comparing today's trunk with today's necko vs. pre-1/23's necko, I see a ~900ms
slowdown.  I have been able to account for ~500ms of this, and I am working on
a patch for this.  But, I am still in the process of tracking down the cause of
the other ~400ms.

But, my changes aside, there still seems to be about a 300-400ms slowdown
elsewhere in the code.

-> assigning to myself, setting target as 0.9 (hopefully to land much sooner
though).
Assignee: jrgm → darin
Target Milestone: --- → mozilla0.9
See attinasi's post to n.p.m.seamonkey

  news://news.mozilla.org/3A81FFD6.1090401@netscape.com

which indicates there may be some other slowdowns in layout...
There are two contributors to the slowdown.

1) My patch made it so that writing data to the socket requires posting a
message back to the client of the socket transport.  This contributes about
1/2 of the slowdown.

2) The other 1/2 results from the fact that I am only trying to read from the
socket if PR_Poll sets PR_POLL_READ.  Previously, we tried to read every chance
we got.

I am close to having patches for these.
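The second contributor above (only reading when PR_Poll reports the socket readable, versus attempting a read on every pass) can be illustrated with a minimal sketch. This uses the POSIX poll()/pipe() equivalents rather than the actual NSPR-based nsSocketTransport code, and the 8k read size mirrors the fixed buffer mentioned later in this bug:

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Standalone illustration (POSIX poll/read, not the real NSPR code)
// of the behavioral change described above: only attempt a read when
// poll() reports the descriptor readable, instead of reading on every
// pass through the transport's event loop.
ssize_t readIfReadable(int fd) {
    struct pollfd pd = { fd, POLLIN, 0 };
    // Zero timeout: just check readiness, don't block.
    if (poll(&pd, 1, 0) <= 0 || !(pd.revents & POLLIN))
        return 0;  // not readable yet; skip the read() entirely
    char buf[8192];  // fixed 8k read, analogous to the socket transport
    return read(fd, buf, sizeof(buf));
}
```

The trade-off the comment describes follows directly: the poll-gated version avoids wasted read() calls, but if the readiness notification lags the actual data arrival, each skipped pass adds latency relative to the old read-every-chance behavior.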
I am waiting for dougt's branch to land before finalizing my fix.
Blocks: 64833
dougt's branch has landed, right?
Whiteboard: waiting for dougt landing
The branch has landed, yes.
Added keyword nsbeta1.
Keywords: nsbeta1
Whiteboard: waiting for dougt landing → nsbeta1+
My patch does three things:

1) It allows a nsIStreamProvider's OnDataWritable to be run on the socket
transport thread.  This happens automatically if a client uses
NS_AsyncWriteFromStream().

2) It eliminates any need to copy the data to a temporary buffer before writing
it to the socket.  Previously (and even before my 1/23 landing) it was necessary
to read the source into a temporary buffer and then flush that data to the
socket when possible.  This buffer copy can be avoided by calling ReadSegments
on the source nsIInputStream.

3) Removes the call to PR_Available on the socket, in favor of a fixed value
of 8192 (as was previously the case before my 1/23 landing).  This effectively
means, read as much as you can (not exceeding 8k).  This has no direct effect
on clients of the socket transport.
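Point 2 above (avoiding the temporary buffer by calling ReadSegments on the source nsIInputStream) hinges on the stream handing its own internal buffer to a writer callback. A toy model of that pattern, with hypothetical names (this is not the real nsIInputStream interface):

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the ReadSegments pattern described in
// point 2: rather than the caller copying stream data into a
// temporary buffer and then flushing that buffer to the socket, the
// stream passes a pointer into its own internal buffer straight to a
// writer callback, eliminating the intermediate copy.
using SegmentWriter = size_t (*)(const char* segment, size_t len, void* closure);

struct MiniInputStream {
    std::string data;   // stands in for the stream's internal buffer
    size_t offset = 0;

    // Feed up to maxLen bytes directly from our buffer to the writer.
    size_t ReadSegments(SegmentWriter writer, void* closure, size_t maxLen) {
        size_t avail = data.size() - offset;
        size_t len = avail < maxLen ? avail : maxLen;
        size_t written = writer(data.data() + offset, len, closure);
        offset += written;
        return written;
    }
};
```

In the real patch the writer would push the segment to the socket; here a test writer just accumulates the bytes it is handed.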
nominating for mozilla 0.8.1
Keywords: mozilla0.8.1
Target Milestone: mozilla0.9 → mozilla0.8.1
Still looking at the patch, wondered about the unconditional QI of listener:

+            if (!(aFlags & nsITransport::DONT_PROXY_STREAM_PROVIDER)) {
+                rv = NS_NewStreamListenerProxy(getter_AddRefs(listener),
+                                               aListener, nsnull,
+                                               mBufferSegmentSize,
+                                               mBufferMaxSize);
+                observer = do_QueryInterface(listener);
+            }

Won't do_QueryInterface crash on a null pointer?

/be
brendan: good catch... thanks!
do_QueryInterface handles null pointers safely.  See nsCOMPtr.cpp line 30
(nsQueryInterface::operator() returns NS_ERROR_NULL_POINTER if |!mRawPtr|) and
nsCOMPtr.h line 970 (nsCOMPtr<T>::assign_from_helper assigns 0 if helper fails).
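The two-part safety net dbaron points to can be shown with a toy model (simplified stand-ins, not the real nsCOMPtr code): the query helper bails out on a null raw pointer, and the receiving smart pointer assigns null when the helper fails.

```cpp
#include <cassert>

// Toy model of the behavior described above: querying through a null
// pointer fails gracefully instead of crashing.  Names and the
// nsresult value mirror the Mozilla conventions but this is a
// simplified sketch, not the actual nsCOMPtr implementation.
using nsresult = unsigned int;
const nsresult NS_OK = 0;
const nsresult NS_ERROR_NULL_POINTER = 0x80004003;

struct MiniSupports {
    nsresult QueryInterface(int /*iid*/, void** result) {
        *result = this;
        return NS_OK;
    }
};

// Mirrors nsQueryInterface::operator(): check for a null raw pointer
// first; the caller (assign_from_helper) then stores null on failure.
nsresult queryInterfaceSafe(MiniSupports* raw, int iid, void** result) {
    if (!raw) {
        *result = nullptr;  // smart pointer ends up null, not garbage
        return NS_ERROR_NULL_POINTER;
    }
    return raw->QueryInterface(iid, result);
}
```

This is why the unconditional do_QueryInterface in the patch is safe even when NS_NewStreamListenerProxy is skipped and the listener is null.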
dbaron: thanks, good to know (I should have checked).  It simplifies the code in
callers, as in darin's patch.

/be
cool. i like these changes. esp. the part about controlling whether a proxy is made.

sr=mscott
dougt gave me some suggestions for improving the comments in nsNetUtil.h...
i've added those to my tree and will be checking in soon.
Fix checked in.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Blocks: 71668
No longer blocks: 64833
jrgm, was this intended as an ongoing tracker bug? If so, please reopen.
I'm going to mark it verified.
Status: RESOLVED → VERIFIED
No, it wasn't a tracking bug. (Hey, ancient history!)
Product: Browser → Seamonkey