Closed
Bug 381537
Opened 18 years ago
Closed 18 years ago
large performance regression on fx-win32-tbox perf
Categories
(Firefox :: General, defect)
Tracking
Status: RESOLVED WONTFIX
People
(Reporter: sayrer, Unassigned)
Details
(Keywords: perf)
Tp: 582ms -----------> 651ms
Tp2: 471.6375ms ------> 502.7ms
Tdhtml: 1212ms ----------> 1256ms
Txul: 532ms -----------> 641ms
Ts: 1890ms ----------> 2641ms
Dunno if this is a problem with the box or a real regression. Either way, we should figure it out ASAP.
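For scale, those deltas range from roughly 3.6% (Tdhtml) to almost 40% (Ts). A minimal Python sketch of the arithmetic, with the values copied from the numbers above (illustrative only, not part of any test harness):

    # Percent change for each metric in this report (values in ms).
    before_after = {
        "Tp": (582, 651),
        "Tp2": (471.6375, 502.7),
        "Tdhtml": (1212, 1256),
        "Txul": (532, 641),
        "Ts": (1890, 2641),
    }
    for name, (before, after) in before_after.items():
        pct = 100.0 * (after - before) / before
        print(f"{name}: {before}ms -> {after}ms ({pct:+.1f}%)")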
Updated•18 years ago (Reporter)
Severity: normal → critical
OS: Linux → Windows XP
Comment 1•18 years ago
For comparison, the same box's tests on mozilla1.8, after the same outage, a period when there were no 1.8 checkins:
Tp: 421ms --------> 442ms
Tp2: 343.6625ms --> 391.05ms
Tdhtml: 1225ms ---> 1334ms
Txul: 359ms ------> 375ms
Ts: 1375ms -------> 1968ms
Comment 2•18 years ago
Note that Tdhtml moved on the Mac yesterday too: 504--->516. All the other numbers remained about constant, however.
Comment 3•18 years ago
I'm going to try rebooting the perf machine. If this helps, I think we should start rebooting between tests (or at least fairly often).
Comment 4•18 years ago
Post-reboot (and after a clobber to fix a CVS failure), the numbers still show the regression:
Tp: 642ms
Tp2: 500.0875ms
Tdhtml: 1237ms
Txul: 640ms
Ts: 2031ms
It may still be the box, but the reboot didn't seem to fix it.
Comment 5•18 years ago (Reporter)
This did fix the large Ts jump, though there is still a small one.
What's the checkin window?
Comment 7•18 years ago
Oh, it looks like this was a change on the testing box. We've had yardsticks change all the time -- we don't hold the tree closed for it. If we did, the tree would still be closed from when btek was moved in 2003.
Comment 8•18 years ago (Reporter)
(In reply to comment #7)
> Oh, it looks like this was a change on the testing box. We've had yardsticks
> change all the time -- we don't hold the tree closed for it. If we did, the
> tree would still be closed from when btek was moved in 2003.
This comment doesn't make much sense to me. If the testing box did change, we should back out the patches that went in while it was missing, and re-add them. More than 7000 lines of code went in while it was missing.
Comment 9•18 years ago (Reporter)
Checkins to Firefox code in the range:
mrbkap%gmail.com
Mark the overwritten scope property in the space between where we remove it and re-add it in its changed form. bug 381374, r=igor
sdwilsh%shawnwilsher.com
Bustage fix for Bug 380250. (Windows)
jonas%sicking.cc
Bug 380872: Forgot to address bzs review comment to remove this assertion. r/sr=bz
sdwilsh@shawnwilsher.com
Bug 380250 - Convert Download Manager's RDF backend to mozStorage. r=cbiesinger,r=mconnor
mrbkap%gmail.com
Protect the number from GC, even if it was originally a number. bug 375976, r=crowder
masayuki%d-toybox.com
Bug 381426 Can't be activated Input Method in the Bookmark Properties. r+sr=roc
crowder%fiverocks.com
Bug 380998: StackGrowthDirection is not reliable with Sun Studio 11, patch by Ginn Chen <ginn.chen@sun.com>, r=brendan
mrbkap%gmail.com
Don't assume that the parser is still enabled after we've returned to the event loop. bug 380590, r+sr=sicking
jonas%sicking.cc
Bug 380872: Call BindToTree on anonymous children too when BindToTree is called on an element. r/sr=bz
jonas%sicking.cc
Bug 53901: Make sure to also release controllers when unbinding xul elements from the DOM. r/sr=bz
Comment 10•18 years ago
Checkin window for the regression on http://tinderbox.mozilla.org/Firefox/ :
http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&date=explicit&mindate=1179785640&maxdate=1179795899
Checkin window for the regression on http://tinderbox.mozilla.org/Mozilla1.8/ :
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-05-21+14%3A59&maxdate=2007-05-21+19%3A33&cvsroot=%2Fcvsroot
The latter (especially the yellow showing that the tinderbox was interrupted in its run on Mozilla1.8) suggests that this isn't a real regression, but a configuration change in the machine.
Therefore, I don't think this should hold the tree closed.
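Aside: the mindate/maxdate parameters in the first bonsai query are plain Unix timestamps. A quick Python snippet to decode them, purely for the reader's convenience (the two values are taken from the URL above):

    from datetime import datetime, timezone
    # mindate/maxdate from the PhoenixTinderbox bonsai query above.
    for t in (1179785640, 1179795899):
        print(datetime.fromtimestamp(t, tz=timezone.utc))
    # -> 2007-05-21 22:14:00+00:00 and 2007-05-22 01:04:59+00:00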
Comment 11•18 years ago (Reporter)
(In reply to comment #10)
>
> The latter (especially the yellow showing that the tinderbox was interrupted in
> its run on Mozilla1.8) suggests that this isn't a real regression, but a
> configuration change in the machine.
>
> Therefore, I don't think this should hold the tree closed.
>
I agree that it is more likely not to be a real regression at all. However, we don't know that no regressions occurred on Windows at the same time as the changes on the box (which contributed to the large Ts spike, for sure).
Comment 12•18 years ago
See http://build-graphs.mozilla.org/graph/query.cgi?tbox=bl-bldxp01_head&testname=startup&autoscale=1&size=&units=ms&ltype=&points=&showpoint=2007%3A05%3A22%3A10%3A30%3A58%2C2031&avg=1&days=40 for an example of how we are (or aren't, who can tell?) hiding perf regressions in these periods where bl-bldxp01 spikes after a restart. Either the ceiling was raised 5% by completely unknown and random factors (calling it a "configuration change" seems wrong, since the change consists of killing hung processes and restarting tinderbox), or we had a 5% Ts regression between 04/29 and 05/05. Either way, I was probably wrong to resolve bug 379257.
Comment 13•18 years ago
There've been no configuration changes on the perf machines. We don't install updates, or otherwise make any unannounced modifications. The only thing that's happened is that the machine has rebooted.
I think we really need to look into how to get reliable performance numbers, in a way that's reproducible on more than one machine :) That's actually a non-trivial task.
I think we should consider rebooting regularly, if rebooting affects the numbers so much.
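To make the suggestion above concrete, here is one common shape for a noise-tolerant check, strictly a sketch and not how the tinderbox harness actually works: take several runs per build, compare medians, and only flag a regression past a threshold. The function name and threshold are made up for illustration.

    import statistics

    # Hypothetical helper: flag a regression only when the median of the
    # new runs exceeds the median baseline by more than threshold_pct.
    def is_regression(baseline_runs, new_runs, threshold_pct=5.0):
        base = statistics.median(baseline_runs)
        new = statistics.median(new_runs)
        return 100.0 * (new - base) / base > threshold_pct

    # Ts-like numbers from this bug: a ~40% jump trips the check.
    print(is_regression([1890, 1905, 1875], [2641, 2600, 2700]))  # True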
Comment 14•18 years ago (Reporter)
(In reply to comment #12)
> See
> http://build-graphs.mozilla.org/graph/query.cgi?tbox=bl-bldxp01_head&testname=startup&autoscale=1&size=&units=ms&ltype=&points=&showpoint=2007%3A05%3A22%3A10%3A30%3A58%2C2031&avg=1&days=40
> for an example of how we are (or aren't, who can tell?) hiding perf regressions
(In reply to comment #13)
> I think we really need to look into how to get reliable performance numbers, in
> a way that's reproducible on more than one machine :) That's actually a
> non-trivial task.
I'm sure it is non-trivial. But until it's fixed, we'll face the unpleasant choice of unwittingly piling on performance regressions for 95% of our users, or not getting work done. :(
Comment 15•18 years ago
This seems to have sorted itself out, with the exception of Tp. The tree is reopened now; we will be monitoring Tp to see whether it stabilizes. WORKSFORME?
Comment 16•18 years ago
Do I have the timeline right? rhelmer rebooted it ~8:30, it did two more runs with bad numbers, then something unspecified was done to it between 10:30 and 11:00, and now it's all better? WORKSFORME if that's reproducible, the steps are known, every single on-call IT person knows what they are and how to do them, and every sheriff and likely bug filer (okay, that's just me) knows what to ask IT to do when bl-bldxp01 hangs. This isn't at all an isolated incident; it's either the 11th or 12th time since last summer.
Comment 17•18 years ago (Reporter)
There may have been a small performance regression left over after this large perf regression was remedied by rebooting the box a bunch of times. That's covered in bug 381782.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → WONTFIX