Closed Bug 372870 Opened 18 years ago Closed 17 years ago

"Linux bl-bldlnx01 Dep argo-vm test perf" tests are incredibly noisy

Categories

(Webtools Graveyard :: Tinderbox, defect, P3)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: bzbarsky, Unassigned)

Details

(Keywords: regression)

I've been trying to figure out why I couldn't get a feel for what the numbers should be (like I had with btek, etc) and finally realized what the issue is. Tp has over 10% noise. Tdhtml has at least 6%. Tp2 has over 10%. Txul has about 3% noise. So does Ts. Those numbers are almost impossible to work with, basically -- it's _very_ hard to spot a minor perf regression in a cycle or two. In fact, I doubt they're spottable at all except over very long timeframes when they look like steady increases...
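A rough way to see why noise at these levels hides small regressions. This is only a sketch. It assumes "noise" means the relative standard deviation of independent, roughly normally distributed runs, which the numbers above do not strictly establish. The function name `runs_needed` and the threshold `z` are hypothetical, chosen for illustration.

```python
import math

def runs_needed(noise_pct, regression_pct, z=2.0):
    """Estimate runs per side needed to resolve a regression of the given size.

    Assumption: comparing the mean of n runs before a change with the mean of
    n runs after it. The difference of the two means has standard error
    noise * sqrt(2 / n), so requiring regression >= z * standard error gives
    n >= 2 * (z * noise / regression) ** 2.
    """
    if noise_pct < 0 or regression_pct <= 0:
        raise ValueError("noise must be >= 0 and regression must be > 0")
    return math.ceil(2 * (z * noise_pct / regression_pct) ** 2)

if __name__ == "__main__":
    # Noise levels quoted in this bug, checked against a 1% regression.
    for name, noise in [("Tp", 10), ("Tdhtml", 6), ("Txul", 3), ("target", 1)]:
        print(name, runs_needed(noise, 1))
```

Under these assumptions, 10% noise needs about 800 runs on each side to resolve a 1% regression, while 1% noise needs about 8, which is consistent with regressions only showing up over very long timeframes.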
CC'ing robcee, as this is relevant to the new perf testing farm (bl-bldlnx01 was the first prototype XP "test-only" tinderbox). The only docs on reducing noise I know of are: http://wiki.mozilla.org/index.php?title=Performance:Tinderbox_Tests I'll make sure bl-bldlnx01's setup conforms to those suggestions. If anyone knows of additional ways to track down and reduce noise, please add to the doc and/or comment here.
I should note that noise for Tp on this box jumped by a factor of 4 or so on March 1. So maybe a good start is hunting that down...
Keywords: regression
It's worth making sure the Time::HiRes perl module is installed. (Somebody should really make the relevant tinderbox tests give an error if it's not.)
(In reply to comment #3)
> It's worth making sure the Time::HiRes perl module is installed.
It is.
[root@bl-bldlnx01 ~]# hostname && perl -e 'use Time::HiRes;' && echo "Yay"
bl-bldlnx01.office.mozilla.org
Yay
[root@bl-bldlnx01 ~]#
So do we know what caused the noise to jump on March 1?
Assignee: build → rhelmer
(In reply to comment #5)
> So do we know what caused the noise to jump on March 1?
X was crashing a lot around that time and had to be restarted a few times. I don't see any extra services running, or anything like that. Is this still happening? We could try rebooting it.
Status: NEW → ASSIGNED
Could it be running a desktop environment now, when it wasn't before?
(In reply to comment #7)
> Could it be running a desktop environment now, when it wasn't before?
Nope, same as before; it has a custom xinitrc which loads vino (VNC service for the normal X server) and blackbox (a minimalistic window manager). I don't see any unusual services running (it's RHEL4 at init level 3). I am checking into the disk IO stats (maybe hard drive going bad?). Nothing unusual in the logs. I don't see any reason why a reboot would help, but it's probably worth a shot if nothing else helps.
Ah, it looks like the Tp noise got better around March 8 or 9. See the graph.
Current noise numbers:
Tp: about 3%.
Tdhtml: at least 6%.
Tp2: over 10%.
Txul: about 3%.
Ts: about 3%.
which is still really bad (the Tp numbers on btek were stable to within 1%, and so were these originally, no?), but the regression I mention in comment 2 is gone...
What changed to make it go away? See graph or raw data for exact times...
(In reply to comment #9)
> Ah, it looks like the Tp noise got better around March 8 or 9. See the graph.
>
> Current noise numbers:
>
> Tp: about 3%.
> Tdhtml: at least 6%.
> Tp2: over 10%.
> Txul: about 3%.
> Ts: about 3%.
>
> which is still really bad (the Tp numbers on btek were stable to within 1%, and
> so were these originally, no?), but the regression I mention in comment 2 is
> gone...
>
> What changed to make it go away? See graph or raw data for exact times...
>
I'll take a closer look, but there were no logins to the machine between March 6th and March 11th, and nothing odd in the logs.
I'd also still like us to get the noise down to 1% or less. Perhaps we need to make some changes to the tests to achieve this?
(In reply to comment #11)
> I'd also still like us to get the noise down to 1% or less. Perhaps we need to
> make some changes to the tests to achieve this?
Please file a bug in Core/Testing for any ideas you have on this; we'll be rolling out some new perf machines soon. I think that our tests (and test platforms) could use a lot of work, having a specific target like "reduce standard deviation in identical" would be a great start. Still not sure what caused the regression in this bug, reassigning to build alias.
Assignee: rhelmer → build
Status: ASSIGNED → NEW
(In reply to comment #12)
> I think that our tests (and test platforms) could use a lot of work, having a
> specific target like "reduce standard deviation in identical" would be a great
> start.
That should read "reduce standard deviation across identical runs", or something along those lines. That seems like the most obvious place to start that I can think of.
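One way such a target could be checked, as a minimal sketch: compute the relative standard deviation of timings from identical runs and compare it with the 1% goal mentioned in comment 11. The helper names `noise_percent` and `meets_target` and the sample timings are hypothetical; nothing here comes from the Tinderbox or Talos code.

```python
import statistics

def noise_percent(samples):
    """Relative sample standard deviation, in percent, of identical-run timings."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate noise")
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean timing is zero; relative noise is undefined")
    return 100.0 * statistics.stdev(samples) / mean

def meets_target(samples, target_pct=1.0):
    """True if run-to-run noise is at or below the target percentage."""
    return noise_percent(samples) <= target_pct

if __name__ == "__main__":
    # Hypothetical timings in milliseconds from five runs of the same build.
    timings = [1000, 1004, 998, 1002, 996]
    print(round(noise_percent(timings), 3), meets_target(timings))
```

Using the sample standard deviation (n - 1 in the denominator) is a deliberate choice here, since a handful of runs is only a sample of the machine's behaviour.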
Severity: critical → normal
Priority: -- → P3
No one is putting any effort into the old Tinderbox-style tests at this point, since the focus has moved to Talos, and the bl-bld* machines will go away once Talos takes over. Is Talos stable enough, or is the stability being worked on? I don't think there's much hope of anyone finding time to do this for Tinderbox, at this point. We should file an equivalent bug in Core/Testing if necessary and WONTFIX this one.
From IRC with alice/robcee/rhelmer, it looks like our situation is:
Currently run on tinderbox only: Tp, Tp2.
Currently run on talos only: Tp3, tgfx, tsvg. New tjss, sunspider suites rolling out.
Currently run on *both* tinderbox and talos: txul, ts, tdhtml.
Let's add this to tomorrow's perf infra and perf meetings, and see what we still need to run, what we can migrate from tinderbox->talos, and what we can end-of-life because it's no longer needed.
After today's perf_infra and perf meetings, it seems that we no longer need to run the Tp and Tp2 suites, and all the other perf suites are already running in talos. This means we can now mothball these tinderbox perf machines, after appropriate public forewarning. The mothballing work is being tracked in bug 413695. Therefore closing this bug as WONTFIX. If these test suites are still noisy when running under talos, please file bugs with Core/Testing.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
Component: Tinderbox Configuration → Tinderbox
Product: mozilla.org → Webtools
Product: Webtools → Webtools Graveyard