337516 - Txul regression resulting from threadmanager branch landing

Reporter

Description

•

19 years ago

Txul regression resulting from threadmanager branch landing I'm seeing an increase from about 140 to 200 milliseconds for min(Txul) on Windows. Interestingly enough, on other platforms Txul improved.

Darin Fisher

Reporter

Comment 1

•

19 years ago

This problem can be seen on the Firefox pacifica tinderbox and the Firefox-Cairo gaius tinderbox.

Darin Fisher

Reporter

Updated

•

19 years ago

Status: NEW → ASSIGNED

Priority: -- → P1

Darin Fisher

Reporter

Updated

•

19 years ago

Blocks: nsIThreadManager

Darin Fisher

Reporter

Comment 2

•

19 years ago

This regression turns out to be entirely due to the fact that I disabled the idle timer code. The slight rise in Tp on pacifica is probably similarly explained. I'm not sure if it makes sense to re-enable idle timers to solve this or to try to attack the timers that are running during xul window open and suppress / delay them in some other manner.

Darin Fisher

Reporter

Comment 3

•

19 years ago

OK, after some investigation, I've determined that I can fix Txul by making nsBrowserStatusHandler's onLocationChange method return early. There are a couple setTimeout calls made by it that are 0 ms, and when they do their work, it takes on the order of 40 - 50 ms. There is another similar timer that costs about 10 ms. These timers run after Txul's child-window.html has loaded, but before we fire onload. We are still finishing some things in layout. It seems that the correct solution is to defer nsBrowserStatusHandler's onLocationChange. Maybe the work it is doing should actually happen when it receive onStateChange STATE_STOP | STATE_IS_NETWORK :)

Darin Fisher

Reporter

Comment 4

•

19 years ago

In fact it is the setting of gURLBar's value property that makes up most of the expense. I propose setting that on a timer w/ an interval greater than 0 ms.

Darin Fisher

Reporter

Comment 5

•

19 years ago

And, here's what boris found: bz SetValue does a sync paint bz of the whole window bz Which is probably why it's 40ms bz oh, it flushes reflow too bz in case that hasn't happened by then * bz loves editor dearly

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 6

•

19 years ago

Sounds like there's some possible wins there :-)

Boris Zbarsky [:bzbarsky]

Comment 7

•

19 years ago

Bug 174823 covers the editor issues, fwiw.

Darin Fisher

Reporter

Comment 8

•

19 years ago

Attached patch v1 patch (obsolete) — Details — Splinter Review

This patch works around the problem by deferring update of the location bar.

Attachment #221667 - Flags: review?

Darin Fisher

Reporter

Updated

•

19 years ago

Attachment #221667 - Flags: review? → review?(mconnor)

Darin Fisher

Reporter

Comment 9

•

19 years ago

I went ahead and landed this patch to clear the Txul regression (w/ r=bz). I will gladly back it out for something better.

Darin Fisher

Reporter

Comment 10

•

19 years ago

Comment on attachment 221667 [details] [diff] [review] v1 patch This patch seemed to make pacifica quite unhappy. I think something else is needed.

Attachment #221667 - Attachment is obsolete: true

Attachment #221667 - Flags: review?(mconnor)

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 11

•

19 years ago

So is this a regression that actually affects when the new window is usable for the end user, or is it only a regression in what we're measuring because the point that we're measuring moved within the process of getting a new window ready? If it's not a regression that actually affects user-percieved performance, why should we care about fixing it? The performance tests are simply a tool for us to help measure what we care about; they're not actually what we care about.

Darin Fisher

Reporter

Comment 12

•

19 years ago

(In reply to comment #11) > So is this a regression that actually affects when the new window is usable for > the end user, or is it only a regression in what we're measuring because the > point that we're measuring moved within the process of getting a new window > ready? Very good question. This is something that we were discussing on IRC yesterday in fact. I plan on trying to answer that question. The setting of the URL bar happens from onLocationChange, which occurs when we figure out which DOM window the new load should be targeted at. For the XUL window open test, the size of the child-window.html is so small that there isn't much time between onLocationChange and onload. It's unclear to me when the browser window paints relative to these events, so I want to try to study that.

Darin Fisher

Reporter

Comment 13

•

19 years ago

From what I can tell, the browser window is already visible by the time onLocationChange fires. I'm not sure if that means we should ignore the cost of updating the URL bar in terms of Txul. Afterall, the browser window may be visible but it isn't _usable_ until after it updates the URL bar. If updating the value of a <textbox> is this expensive, then I think we should focus our efforts on making it less expensive. I would expect that updating the URL bar negatively impacts Tp since we fire onLocationChange well before onload for even moderately sized web pages.

Boris Zbarsky [:bzbarsky]

Comment 14

•

19 years ago

> If updating the value of a <textbox> is this expensive It really shouldn't be, on its own. I guess I should profile Txul and see what all we run under that SetValue call....

Boris Zbarsky [:bzbarsky]

Comment 15

•

19 years ago

Attached file Profile of the function in question — Details

I made the following change to browser.js: setTimeout(function(loc) { + JProfStartProfiling(); gURLBar.value = ""; // hack for bug 249322 gURLBar.value = loc; SetPageProxyState("valid"); - }, 50, location); + JProfStopProfiling(); + }, 0, location); and this is the resulting profile of running Txul. Note that the profile has 462 hits; the RTZ frequence is 8192, so that's about 56ms, spread over 10 window.open() calls, for about 5-6ms per window. All the time seems to be spent in painting. Not sure what the Windows equivalent would look like, of course....

Nelson Bolyard (seldom reads bugmail)

Updated

•

18 years ago

Blocks: 381699

Darin Fisher

Reporter

Comment 16

•

18 years ago

-> reassign to default owner

Assignee: darin.moz → nobody

Status: ASSIGNED → NEW

Jesse Ruderman

Updated

•

18 years ago

Flags: blocking1.9?

Stuart Parmenter

Comment 17

•

18 years ago

not sure if we should block on this or not. we certainly want to get back the regression but not sure how we'd do it... schrep, damon got anyone who can take a look at this?

Flags: wanted1.9+

Priority: P1 → --

Target Milestone: mozilla1.9alpha1 → ---

Mike Schroepfer

Comment 18

•

18 years ago

Benjamin can we draft you off moz2 for a second to take a look at this?

Mike Schroepfer

Comment 19

•

18 years ago

Moving to blocker list since we are trying to do a perf push...

Flags: blocking1.9? → blocking1.9+

Priority: -- → P2

Benjamin Smedberg

Comment 20

•

18 years ago

I must be stupid today, but I'm having trouble understanding where to look to figure out what the problem might be. Why did the patch "make pacifica unhappy"? Should I be instrumenting when we do painting versus the timers in question? I can take the bug if somebody points me at what I should do to measure it.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 21

•

18 years ago

http://build-graphs.mozilla.org/graph/query.cgi?tbox=pacifica&testname=xulwinopen&autoscale=&size=&days=&units=&ltype=&points=&avg=&showpoint= shows the regression. That runs the Txul test, documented at http://wiki.mozilla.org/Performance:Tinderbox_Tests#Txul:_XUL_window_open_time

Mike Schroepfer

Comment 22

•

18 years ago

Benjamin - can you take this?

Assignee: nobody → benjamin

Benjamin Smedberg

Comment 23

•

18 years ago

dbaron, I know what Txul is... * I don't know what an idle timer was * How do I tell if something is kicking off a sync paint and a reflow? Is this a case where I just need to move some action after the txul measurement point, or actually make setting the urlbar not cause a syncpaint/reflow, or just make sure that we're not reflowing twice when we should only be reflowing once?

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 24

•

18 years ago

I don't know either.

Benjamin Smedberg

Comment 25

•

18 years ago

Reassigning to nobody, per damons.

Assignee: benjamin → nobody

Mike Schroepfer

Comment 26

•

18 years ago

From Darin: Ts regressed? I'm surprised to hear that. I remember Txul regressed, and my investigation of that problem revealed that the Txul test was sensitive to whether or not updating of the location bar happened before or after the test thought it was done. I recall discussing the issue with Stuart since updating the location bar seemed to take ~40ms on Windows. Looking into it, it seemed that the Windows code for the editor (used to implement the location bar) does some synchronous painting when its contents are changed that can lead to significant delays. Maybe the startup test is also subject to the same issue since it merely tests the time it takes for the browser to load the page contents w/o actually ensuring that the entire browser has loaded and finished processing related to browser startup. The test just waits for the load event to fire. It should really wait for the page to render and for all associated processing to complete. I guess I'm blaming the tests, which probably isn't the answer you wanted :-) The deal w/ the thread manager patch is that it changed the way events are ordered, which could cause some UI updating to happen before it would have previously. The goal of the thread manager was to ensure that UI updates are not starved by a flood of other events (PLEvents dispatched by necko and layout), so it is not surprising that some UI events might now run before necko events that trigger the "load" event. Hope this helps! -Darin

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Updated

•

17 years ago

Flags: wanted1.9.0.x+

Priority: P2 → P3

Stuart Parmenter

Updated

•

17 years ago

Flags: tracking1.9+ → blocking1.9-

Shawn Wilsher :sdwilsh

Comment 27

•

17 years ago

I'm going through and marking old performance regression bugs as INCOMPLETE that are likely too old to be valid or get any traction on them. Please re-open if you have more information or can demonstrate the regression still exists.

Status: NEW → RESOLVED

Closed: 17 years ago

Resolution: --- → INCOMPLETE

Daniel Veditz [:dveditz]

Updated

•

17 years ago

Flags: wanted1.9.0.x+ → wanted1.9.0.x-

v1 patch 19 years ago Darin Fisher 1.29 KB, patch		Details \| Diff \| Splinter Review
Profile of the function in question 19 years ago Boris Zbarsky [:bzbarsky] 285.25 KB, text/html		Details