The qm-centos5 VM has severe performance issues, resulting in very long test times: the reftest portion takes about 12 min and the mochitest portion about 27 min. In contrast, the Mac takes about 4 min to run reftest and 4-5 min to run mochitest, and win32 takes about 4 and 6 min respectively. (These numbers are straight off the buildbot waterfall, where each individual section is separated out.) Can we take a look at the config on that machine? My patch in bug 387132 should speed up reftests across the board, but that won't help the mochitests.
We're also seeing some spurious errors (I saw a random necko test failure this morning on a README file checkin). This might be better moved to real hardware. That would also give us the benefit of real video hardware; I suspect the lack of it is the reason the tests are taking so long. VMware video drivers aren't exactly high-speed.
Morphing this a bit... something is very odd. qm-xserve01 is now taking *1 hour* to run mochitests; the centos machine is now taking 8 min for mochitests. What's odd is that it's taking almost exactly 1 hour. The 1 hr mochitest runs started right after crowder checked in the patch for bug 121183, but that was just an error message change.
Reassigning. If this is not appropriate, please reassign back, ok?
Isn't "it takes exactly an hour on Mac" the result of a timeout? We don't kill the test run on Mac when it finishes (I forget the bug number), and cf tweaked that timeout down a month or two ago.
I adjusted the Tp, Tp2, etc. timeouts for the Mac nightly tinderbox, which are specified in tinder-config.pl. I don't know how the mochitests are set up.
No, there's code that knows how to quit on the macs now; in any case, the timeout is supposed to be 20 minutes, not 60.
It's not a timeout issue, because we'd be seeing failures on the machine if that were the case: "Process killed" messages and orange on the tinderbox. I looked at this yesterday and backed out crowder's patch on the machine, and the tests were still taking way too long. They're executing really slowly even though the machine's processors are barely moving. All the RAM that's supposed to be there is showing up, and the hard drive has plenty of space. I'm at a loss to explain this, as none of the other checkins around the time of the test slowdown appeared suspect. It's also not happening on the other machines. Lastly, cf, your Tp* timeouts wouldn't affect this machine. Should we schedule some hardware diagnostics?
From the logs:

*** 35685 INFO Running /tests/toolkit/themes/pinstripe/tests/test_bug371080.xul...
*** 35686 INFO PASS | mac button width
*** 35687 INFO PASS | mac button height
*** 35689 INFO SimpleTest FINISHED
*** 35690 INFO Passed: 32755
*** 35691 INFO Failed: 0
*** 35692 INFO Todo: 929
kill TERM 6208
Process killed. Took 1 second to die.
started: Wed Aug 8 06:50:52 2007
finished: Wed Aug 8 07:53:22 2007

So it's not taking *exactly* an hour to run, but close to it. For the record, I meant we'd be seeing different "Process killed" messages. Those ones are normal. :)
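For anyone who wants to check the arithmetic on the started/finished timestamps in the log excerpt above, here is a quick sketch. The format string is my assumption about how the log prints dates, and it relies on English month and weekday names (the default C locale).

```python
from datetime import datetime

# Assumed layout of the "started:" / "finished:" timestamps in the log.
FMT = "%a %b %d %H:%M:%S %Y"

# Values copied verbatim from the log excerpt.
started = datetime.strptime("Wed Aug 8 06:50:52 2007", FMT)
finished = datetime.strptime("Wed Aug 8 07:53:22 2007", FMT)

# Wall-clock duration of the whole mochitest step.
elapsed = finished - started
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"{minutes} min {seconds} s")  # prints "62 min 30 s"
```

That is a couple of minutes over the hour, so it does not line up with either a 60-minute or a 20-minute timeout.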
I haven't heard anyone complaining about this lately, but I could be wrong.