Closed Bug 337516 Opened 18 years ago Closed 16 years ago

Txul regression resulting from threadmanager branch landing

Categories

(Core :: Widget: Win32, defect, P3)

x86
Windows XP
defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: darin.moz, Unassigned)

References

Details

(Keywords: perf, regression)

Attachments

(1 file, 1 obsolete file)

Txul regression resulting from threadmanager branch landing

I'm seeing an increase from about 140 to 200 milliseconds for min(Txul) on Windows.  Interestingly enough, on other platforms Txul improved.
This problem can be seen on the Firefox pacifica tinderbox and the Firefox-Cairo gaius tinderbox.
Status: NEW → ASSIGNED
Priority: -- → P1
This regression turns out to be entirely due to the fact that I disabled the idle timer code.  The slight rise in Tp on pacifica is probably similarly explained.

I'm not sure if it makes sense to re-enable idle timers to solve this or to try to attack the timers that are running during xul window open and suppress / delay them in some other manner.
OK, after some investigation, I've determined that I can fix Txul by making nsBrowserStatusHandler's onLocationChange method return early.  There are a couple setTimeout calls made by it that are 0 ms, and when they do their work, it takes on the order of 40 - 50 ms.  There is another similar timer that costs about 10 ms.  These timers run after Txul's child-window.html has loaded, but before we fire onload.  We are still finishing some things in layout.  It seems that the correct solution is to defer nsBrowserStatusHandler's onLocationChange.  Maybe the work it is doing should actually happen when it receive onStateChange STATE_STOP | STATE_IS_NETWORK :)
In fact it is the setting of gURLBar's value property that makes up most of the expense.  I propose setting that on a timer w/ an interval greater than 0 ms.
And, here's what boris found:

bz SetValue does a sync paint
bz of the whole window
bz Which is probably why it's 40ms
bz oh, it flushes reflow too
bz in case that hasn't happened by then
* bz loves editor dearly
Sounds like there's some possible wins there :-)
Bug 174823 covers the editor issues, fwiw.
Attached patch v1 patch (obsolete) — Splinter Review
This patch works around the problem by deferring update of the location bar.
Attachment #221667 - Flags: review?
Attachment #221667 - Flags: review? → review?(mconnor)
I went ahead and landed this patch to clear the Txul regression (w/ r=bz).  I will gladly back it out for something better.
Comment on attachment 221667 [details] [diff] [review]
v1 patch

This patch seemed to make pacifica quite unhappy.  I think something else is needed.
Attachment #221667 - Attachment is obsolete: true
Attachment #221667 - Flags: review?(mconnor)
So is this a regression that actually affects when the new window is usable for the end user, or is it only a regression in what we're measuring because the point that we're measuring moved within the process of getting a new window ready?  If it's not a regression that actually affects user-percieved performance, why should we care about fixing it?  The performance tests are simply a tool for us to help measure what we care about; they're not actually what we care about.
(In reply to comment #11)
> So is this a regression that actually affects when the new window is usable for
> the end user, or is it only a regression in what we're measuring because the
> point that we're measuring moved within the process of getting a new window
> ready?

Very good question.  This is something that we were discussing on IRC yesterday in fact.  I plan on trying to answer that question.  The setting of the URL bar happens from onLocationChange, which occurs when we figure out which DOM window the new load should be targeted at.  For the XUL window open test, the size of the child-window.html is so small that there isn't much time between onLocationChange and onload.  It's unclear to me when the browser window paints relative to these events, so I want to try to study that.
From what I can tell, the browser window is already visible by the time onLocationChange fires.  I'm not sure if that means we should ignore the cost of updating the URL bar in terms of Txul.  Afterall, the browser window may be visible but it isn't _usable_ until after it updates the URL bar.  If updating the value of a <textbox> is this  expensive, then I think we should focus our efforts on making it less expensive.  I would expect that updating the URL bar negatively impacts Tp since we fire onLocationChange well before onload for even moderately sized web pages.
> If updating the value of a <textbox> is this  expensive

It really shouldn't be, on its own.  I guess I should profile Txul and see what all we run under that SetValue call....
I made the following change to browser.js:

             setTimeout(function(loc) {
+                         JProfStartProfiling();
                          gURLBar.value = ""; // hack for bug 249322
                          gURLBar.value = loc;
                          SetPageProxyState("valid");
-                       }, 50, location);
+                         JProfStopProfiling();
+                       }, 0, location);

and this is the resulting profile of running Txul.  Note that the profile has 462 hits; the RTZ frequence is 8192, so that's about 56ms, spread over 10 window.open() calls, for about 5-6ms per window.  All the time seems to be spent in painting.

Not sure what the Windows equivalent would look like, of course....
-> reassign to default owner
Assignee: darin.moz → nobody
Status: ASSIGNED → NEW
Flags: blocking1.9?
not sure if we should block on this or not.  we certainly want to get back the regression but not sure how we'd do it...  schrep, damon got anyone who can take a look at this?
Flags: wanted1.9+
Priority: P1 → --
Target Milestone: mozilla1.9alpha1 → ---
Benjamin can we draft you off moz2 for a second to take a look at this?
Moving to blocker list since we are trying to do a perf push...
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
I must be stupid today, but I'm having trouble understanding where to look to figure out what the problem might be. Why did the patch "make pacifica unhappy"? Should I be instrumenting when we do painting versus the timers in question?

I can take the bug if somebody points me at what I should do to measure it.
Benjamin - can you take this?
Assignee: nobody → benjamin
dbaron, I know what Txul is...

* I don't know what an idle timer was
* How do I tell if something is kicking off a sync paint and a reflow?

Is this a case where I just need to move some action after the txul measurement point, or actually make setting the urlbar not cause a syncpaint/reflow, or just make sure that we're not reflowing twice when we should only be reflowing once?
Reassigning to nobody, per damons.
Assignee: benjamin → nobody
From Darin:

 Ts regressed?  I'm surprised to hear that.  I remember Txul 
 regressed, and my investigation of that problem revealed that the 
 Txul test was sensitive to whether or not updating of the location 
 bar happened before or after the test thought it was done.

 I recall discussing the issue with Stuart since updating the 
 location bar seemed to take ~40ms on Windows.  Looking into it, it 
 seemed that the Windows code for the editor (used to implement the 
 location bar) does some synchronous painting when its contents are 
 changed that can lead to significant delays.

 Maybe the startup test is also subject to the same issue since it 
 merely tests the time it takes for the browser to load the page 
 contents w/o actually ensuring that the entire browser has loaded 
 and finished processing related to browser startup.  The test just 
 waits for the load event to fire.  It should really wait for the 
 page to render and for all associated processing to complete.

 I guess I'm blaming the tests, which probably isn't the answer you 
 wanted :-)

 The deal w/ the thread manager patch is that it changed the way 
 events are ordered, which could cause some UI updating to happen 
 before it would have previously.  The goal of the thread manager was 
 to ensure that UI updates are not starved by a flood of other events 
 (PLEvents dispatched by necko and layout), so it is not surprising 
 that some UI events might now run before necko events that trigger the "load"
 event.

 Hope this helps!

 -Darin
Flags: wanted1.9.0.x+
Priority: P2 → P3
Flags: tracking1.9+ → blocking1.9-
I'm going through and marking old performance regression bugs as INCOMPLETE
that are likely too old to be valid or get any traction on them.

Please re-open if you have more information or can demonstrate the regression
still exists.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → INCOMPLETE
Flags: wanted1.9.0.x+ → wanted1.9.0.x-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: