Bug 392788
Opened 18 years ago
Closed 18 years ago
Intermittent reftest failures on "qm-centos5-01" Tinderbox
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: MatsPalmgren_bugz, Assigned: mrz)
Details
I looked at 4 logs, 3 had these errors:
REFTEST UNEXPECTED FAIL (LOADING): file:///builds/slave/trunk_centos5/mozilla/layout/reftests/bugs/28811-2a.html
REFTEST UNEXPECTED FAIL: file:///builds/slave/trunk_centos5/mozilla/layout/reftests/bugs/28811-2b.html
1 had these errors:
REFTEST UNEXPECTED FAIL (LOADING): file:///builds/slave/trunk_centos5/mozilla/layout/reftests/bugs/382600-1.html
REFTEST UNEXPECTED FAIL: file:///builds/slave/trunk_centos5/mozilla/layout/reftests/bugs/384576-1.html
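For anyone repeating this comparison across more logs, here is a minimal sketch that tallies how many logs contain each unexpected failure. It assumes only the line format quoted above; the function name `tally_failures` and the idea of passing each log in as a string are illustrative assumptions, not part of any Tinderbox tooling.

```python
import re
from collections import Counter

# Matches lines of the form quoted in this bug, e.g.
#   REFTEST UNEXPECTED FAIL (LOADING): file:///.../28811-2a.html
#   REFTEST UNEXPECTED FAIL: file:///.../28811-2b.html
FAIL_RE = re.compile(
    r"^REFTEST UNEXPECTED FAIL(?: \((?P<phase>[A-Z]+)\))?: (?P<url>\S+)"
)


def tally_failures(logs):
    """Return a Counter mapping (url, phase) to the number of logs it failed in.

    `logs` is an iterable of log texts, one string per log (hypothetical input
    shape). `phase` is "LOADING" when the line carries that tag, otherwise "".
    """
    counts = Counter()
    for text in logs:
        seen = set()  # count each failure at most once per log
        for line in text.splitlines():
            match = FAIL_RE.match(line.strip())
            if match:
                seen.add((match.group("url"), match.group("phase") or ""))
        counts.update(seen)
    return counts
```

A failure that shows up in most logs but not all, as with the 28811-2a/2b pair here, points at an intermittent problem rather than a deterministic regression.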
Since there is no exception for Orange on this Tinderbox,
this bug blocks me from doing checkins as far as I understand it.
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox
Updated•18 years ago (Reporter)
Summary: Intermittent reftest failures on "centos5-01" Tinderbox → Intermittent reftest failures on "qm-centos5-01" Tinderbox
Comment 2•18 years ago (Reporter)
Apparently people are checking in anyway...
Severity: blocker → major
bug 381765 can't be the problem because that's Mac-only.
Comment 4•18 years ago
I made a checkin yesterday to increase the reftest load timeout from 10s to 30s, and that seems to have mostly fixed the problem. The question is, what the heck is causing these simple pages to take more than 10s to load? Is something weird going on with the VM?
Comment 5•18 years ago
It's entirely possible, if not likely. We see intermittent problems still on qm-winxp01 related to slow VM performance. I think there are around 8 VMs running on that vhost so there's a lot of competition for the hardware. If we continue to see problems we can try to move qm-centos5-01 to physical hardware.
Comment 6•18 years ago
The VM is apparently still not entirely its usual self. Just before I checked in a mochitest (bug 380595), the box was already orange due to a strange netwerk unittest failure, and afterwards it failed one of my mochitests; I couldn't find a logical explanation for that failure. I've backed out the test for now, but it'd be good to figure out what's up with the box and/or how it could be fixed.
Comment 7•18 years ago (Assignee)
(In reply to comment #5)
> If we
> continue to see problems we can try to move qm-centos5-01 to physical hardware.
What about another ESX host as a first try? As mentioned in bug 394051, there will be another QA ESX host coming online probably this week.
This would be a lot easier than trying to build a new host (which would be a reinstall and some effort on someone's behalf).
Comment 8•18 years ago
Not sure. How many machines are we going to be running on the new host? These machines seem to be really sensitive to hardware availability. We could try it, but I'm worried about running into the same problem down the road as we add more machines to the vm host.
Comment 9•18 years ago (Assignee)
Justin had a good point in the other bug - if this is perf stuff it should be a separate box. He mentioned setting up a mini - will that be fine (and can these two be combined?)?
Comment 10•18 years ago
Any progress on the setting up of another machine?
Comment 11•18 years ago
We've set up lots of new machines, just not this one. :)
Is this reftest still failing intermittently?
Matt, is the new vmhost ready? If so, we can clone this machine or move it over.
Comment 12•18 years ago (Assignee)
Which ESX server do you mean? qm-vmware01 and qm-vmware02 are the two QA ESX servers.
Comment 13•18 years ago
(In reply to comment #11)
> We've set up lots of new machines, just not this one. :)
>
> Is this reftest still failing intermittently?
> <snip>
Not that I know of. However, I had to comment out the most important of the tests for bug 380595 *again*, back in December, because it intermittently failed on this box.
Comment 14•18 years ago
Matthew: I'm not sure who lives on which server. Are they fairly balanced or does one have more cycles available than the other? If so, I'd like qm-centos5-01 moved to the less-occupied box.
I think this is going to be a temporary solution at best as we fill up both of these servers.
Gijs: I feel your pain.
Comment 15•18 years ago (Assignee)
qm-vmware02 has capacity. Who's doing the clone, me or you?
Comment 16•18 years ago
I don't think they let me play with clones. That'll have to be you. Let me know when you're ready to do it and I'll take the machine down.
Comment 17•18 years ago (Assignee)
qm-centos5-01 is on dev-vmware01 probably because it was made before qm-vmware02 existed.
The right place is really qm-vmware01 but it's short on disk space - will set up iSCSI and hot clone.
Assignee: nobody → mrz
Comment 18•18 years ago (Assignee)
This got buried.
I'll have to shift a whole bunch of VMs around to make room on qm-vmware01 and I don't know if that will have any performance benefit over where it is now. I can move it to qm-vmware02 but that requires downtime.
Comment 19•18 years ago
If it's more work than setting up a dedicated box, why don't we do that instead? The whole plan was originally to just move this to (or set up a new instance of it on) dedicated hardware, but it looks like it's more of a pain than you originally thought.
Comment 20•18 years ago (Assignee)
What's the action plan then?
Comment 21•18 years ago
I'll file a bug/reopen the existing bug to order hardware with specs. Do we have standard-issue server-grade hardware that can run with a reasonable graphics card in Linux?
Comment 22•18 years ago (Assignee)
Rob - can I take a downtime hit on qm-centos5-01 to move it to the SAN and onto an unloaded ESX host? I also want to up the memory and CPU.
Comment 23•18 years ago
yup, do it up.
Comment 24•18 years ago
I should have asked first: how long will it take? And when would you like to do it, so I can give a heads-up?
Comment 25•18 years ago
The tree will need to be closed for any downtime to qm-centos5-01, so doing it sometime out of normal hours would be nice.
Comment 26•18 years ago
We have changes in bug 393413 that need to land too and will also require minimal downtime. Let's coordinate the two.
Comment 27•18 years ago (Assignee)
(In reply to comment #25)
> The tree will need to be closed for any downtime to qm-centos5-01, so doing it
> sometime out of normal hours would be nice.
Oh, in that case, let me do a hot clone and let you know when that's done. From your perspective, it'll be a reboot so it should be quick-ish.
Let's plan on Thursday morning around 10am to do the "reboot" ?
Comment 28•18 years ago
(In reply to comment #27)
> Let's plan on Thursday morning around 10am to do the "reboot" ?
Works for me.
Comment 29•18 years ago
Me too. We'll meet up with you then.
Comment 30•18 years ago (Assignee)
Moved this morning. The original VM was paused and I'll leave it there for a while before removing it.
The new image has twice the memory and a second virtual CPU.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Comment 31•18 years ago
This isn't working. It might be the additional CPU or something else, but we're having a bunch of inexplicable problems.
Could we revert to the original image?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 32•18 years ago (Assignee)
Dropped the extra CPU last night.
Status: REOPENED → RESOLVED
Closed: 18 years ago → 18 years ago
Resolution: --- → FIXED
Comment 33•18 years ago
Updated•17 years ago
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Updated•12 years ago
Product: mozilla.org → Release Engineering