Closed
Bug 384035
Opened 18 years ago
Closed 17 years ago
Upgrade qa and build machines to reflect latest linux runtime requirements
Categories
(Release Engineering :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rcampbell, Assigned: preed)
References
()
Details
Attachments
(1 file)
3.87 KB, patch | coop: review+
Vlad attempted to land an upgrade to the in-repository version of Cairo last night and saw some reftest errors due to platform incompatibility. We need to upgrade the machines to reflect the new platform requirements proposed at: http://wiki.mozilla.org/Linux/Runtime_Requirements
Reporter | ||
Comment 1•18 years ago
Preed and Ben have raised good points about this hitting a number of extra machines, the perftest machines for example. While it'll be fairly easy to upgrade qm-rhel02,3, the others might be a little trickier. There might be staging issues if we don't upgrade these at the same time.
Comment 2•18 years ago
Some notes extracted from email and previous discussions:

We tried to replace argo-vm (an undocumented install currently producing Firefox trunk nightlies) with fx-linux-tbox (the current Linux refplatform), but the builds produced by fx-linux-tbox were 5-10% slower than those from argo-vm when tested under identical conditions and identical configs. I don't know how to diagnose this, so I was going to propose that we ignore it and switch to fx-linux-tbox anyway, because it's a known config and with --enable-libxul it's within a few % of argo-vm.

Now, as for switching to CentOS5 as a build refplatform: builds produced by CentOS5 will not run on CentOS4. This means that we would have to upgrade the performance test boxes to a newer runtime (I believe that the new perf farm is already running some modern version of Ubuntu, right?). If we decide to switch to CentOS5, we should switch the perf box first, and get some historical data. Then we can switch the build machine.

This doesn't really affect the unit-test boxes specifically, because they test their own builds; so you could theoretically upgrade them independently. But that means that our unit tests are testing an entirely different environment than our production builds, which isn't especially helpful.
Reporter | ||
Comment 3•18 years ago
(In reply to comment #2)
> Now, as for switching to CentOS5 as a build refplatform: builds produced by
> CentOS5 will not run on CentOS4. This means that we would have to upgrade the
> performance test boxes to a newer runtime (I believe that the new perf farm is
> already running some modern version of Ubuntu, right?). If we decide to switch
> to CentOS5, we should switch the perf box first, and get some historical data.
> Then we can switch the build machine.

Yup. The linux perf boxes (qm-plinux01-05) are running Ubuntu Feisty, iirc.

> This doesn't really affect the unit-test boxes specifically, because they test
> their own builds; so you could theoretically upgrade them independently. But
> that means that our unit tests are testing an entirely different environment
> than our production builds, which isn't especially helpful.

Right. One nice option, since we've got a new machine on dedicated hardware, is that we can install CentOS5 on it, bring up a test box, and run it alongside qm-rhel02 for a couple of days to compare results. This might give us some useful data before we upgrade the build machines.
Blocks: 383960
Comment 4•18 years ago
I would love to stop using the old perf machine setup and use the new qm-plinux boxes instead, but I'm concerned about when they're going to be ready. Would it be faster to commission a new tinderbox-based perftest machine?
Flags: blocking1.9+
Target Milestone: --- → mozilla1.9alpha6
Comment 5•18 years ago
From offline discussions:
1) We've already moved from argo-vm (an undocumented install that had been producing Firefox trunk nightlies) to fx-linux-tbox (the current Linux refplatform, same as linux refplatform rel4).
2) We need to create a new refplatform rel5. Assigning to preed, based on yesterday's build meeting.
3) We need to roll this new refplatform rel5 out to the build and QA machines to close this bug.
Assignee: nobody → build
Component: Build Config → Build & Release
Flags: blocking1.9+
Product: Core → mozilla.org
QA Contact: build-config → preed
Target Milestone: mozilla1.9alpha6 → ---
Version: Trunk → other
Updated•18 years ago
Assignee: build → preed
Comment 6•18 years ago
I noticed that fx-linux-tbox, from the current generation of the ref vm, didn't have TBOX_CLIENT_CVS_DIR set, so it wasn't updating its tinderbox code. This line needs to be added to ~cltbld/.bash_profile:

export TBOX_CLIENT_CVS_DIR="/builds/tinderbox/mozilla/tools"

Bug 384626 might change that a little.
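For anyone scripting the profile change rather than editing the file by hand, here is a minimal sketch of adding the export line idempotently. The helper name ensure_line is hypothetical and is not part of the tinderbox tools; only the variable name and value come from this bug.

```python
from pathlib import Path

# The line requested in this bug for ~cltbld/.bash_profile.
EXPORT_LINE = 'export TBOX_CLIENT_CVS_DIR="/builds/tinderbox/mozilla/tools"'


def ensure_line(profile, line):
    """Append `line` to `profile` unless an identical line already exists.

    Returns True if the file was modified, False if nothing had to change.
    (Hypothetical helper, for illustration only.)
    """
    profile = Path(profile)
    text = profile.read_text() if profile.exists() else ""
    if line in (existing.strip() for existing in text.splitlines()):
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with profile.open("a") as handle:
        handle.write(prefix + line + "\n")
    return True
```

Running it twice against the same file should leave a single copy of the line, which makes it safe to re-run when the reference VM is cloned.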
Blocks: 333126
Assignee | ||
Comment 7•18 years ago
I'm actually working on this nowish, so resetting priority.
Status: NEW → ASSIGNED
Priority: -- → P1
Assignee | ||
Comment 8•18 years ago
Attachment #269265 - Flags: review?(ccooper)
Updated•18 years ago
Attachment #269265 - Flags: review?(ccooper) → review+
Assignee | ||
Comment 9•18 years ago
Alright, ref-vm is created. It does *NOT* (as of now) include the buildbot dependencies (I will add those on Monday). It's reporting into the MozillaExperimental tinderbox page (http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaExperimental), and builds are being uploaded in the experimental directory under linux-newref. We'll test this VM out for a few days, and then look at when it makes sense to switch it so it's the nightly Linux build machine for trunk (probably next week)?
It would probably be good for the reference build VM (if this is what we're going to be using for tinderboxes) to have some debuginfo packages, which help for generating stack traces and performance data when needed. The ones I have installed on my machine are:
cairo-debuginfo
expat-debuginfo
fontconfig-debuginfo
freetype-debuginfo
glib2-debuginfo
glibc-debuginfo
gnome-vfs2-debuginfo
gtk2-debuginfo
hal-debuginfo
libX11-debuginfo
libXft-debuginfo
libgnome-debuginfo
pango-debuginfo
scim-bridge-debuginfo
Comment 11•18 years ago
gcc-debuginfo is also useful sometimes (it has symbols for libstdc++)
Based on some trace-malloc stacks I took recently, the following also show up in some stacks:
dbus-debuginfo
libselinux-debuginfo
gtk2-engines-debuginfo
ORBit2-debuginfo
libXcursor-debuginfo
libXext-debuginfo
libXfixes-debuginfo
libXi-debuginfo
libXinerama-debuginfo
libXrender-debuginfo
atk-debuginfo
libbonobo-debuginfo
dbus-glib-debuginfo
GConf2-debuginfo
popt-debuginfo
gcc-debuginfo
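The package suggestions above overlap (gcc-debuginfo is named twice), so a small sketch of merging them into one install command may help whoever builds the image. The function names are hypothetical, and the sketch assumes a yum repository carrying the debuginfo packages is already enabled on the VM, which this bug does not confirm.

```python
# Package names copied from the comments above; nothing else is assumed
# about which versions or repositories provide them.
FIRST_LIST = """
cairo-debuginfo expat-debuginfo fontconfig-debuginfo freetype-debuginfo
glib2-debuginfo glibc-debuginfo gnome-vfs2-debuginfo gtk2-debuginfo
hal-debuginfo libX11-debuginfo libXft-debuginfo libgnome-debuginfo
pango-debuginfo scim-bridge-debuginfo
""".split()

GCC_SUGGESTION = ["gcc-debuginfo"]

TRACE_MALLOC_LIST = """
dbus-debuginfo libselinux-debuginfo gtk2-engines-debuginfo ORBit2-debuginfo
libXcursor-debuginfo libXext-debuginfo libXfixes-debuginfo libXi-debuginfo
libXinerama-debuginfo libXrender-debuginfo atk-debuginfo libbonobo-debuginfo
dbus-glib-debuginfo GConf2-debuginfo popt-debuginfo gcc-debuginfo
""".split()


def unique_in_order(*groups):
    """Merge several package lists, dropping repeats but keeping first-seen order."""
    seen = set()
    merged = []
    for group in groups:
        for name in group:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


def yum_install_argv(packages):
    """Build (but do not run) a yum command line for the given packages."""
    return ["yum", "install", "-y", *packages]


if __name__ == "__main__":
    packages = unique_in_order(FIRST_LIST, GCC_SUGGESTION, TRACE_MALLOC_LIST)
    # Print the command for review instead of executing it.
    print(" ".join(yum_install_argv(packages)))
```

Printing the command rather than running it keeps the sketch harmless on machines where the repository setup differs.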
Assignee | ||
Comment 13•17 years ago
Sorry for the bugspam; these are now P2 in the New View of the World (tm).
Priority: P1 → P2
Reporter | ||
Comment 14•17 years ago
This is holding up trunk development, which was the original reason for filing this bug. If there's some way that we can build a later version of Cairo on the existing refplatform, then that's OK, but I understood this was blocking some important patches. Vlad, Dbaron?
Assignee | ||
Comment 15•17 years ago
(In reply to comment #14)
> This is holding up trunk development, which was the original reason for filing
> this bug. If there's some way that we can build a later version of Cairo on the
> existing refplatform, then that's OK, but I understood this was blocking some
> important patches.

robcee: I'm planning on deploying this for nightlies on Thursday, 5 July.

There was a question about whether we need to coordinate the deployment together, and I think the answer technically is no, but a) I could be wrong, and b) it doesn't actually solve the problem until reftest is running on this version.

So, will you have time on Thursday to deploy the new image?
Reporter | ||
Comment 16•17 years ago
Hey preed, I will be around all day Thursday and can help set this up under your expert tutelage. I'm not sure whether we'll want to replace the existing machine (qm-rhel02; a scary thought, as it's still running the master), bring up a new VM with the reference image, or install it on the mac mini we had set aside for this. We'll have to discuss. Have a good 4th!
Assignee | ||
Comment 17•17 years ago
(In reply to comment #16)
> new VM with the reference image or install it on the mac mini we had set aside
> for this. We'll have to discuss.

The quickest/easiest way to do this is to run it in a VM. The unit test/ref test doesn't have performance requirements, does it?

I was going to make the switch tonight (just now, actually), but I ran into a couple of problems getting all the extra packages people wanted. I also don't want to make the switch until you're around (and for others reading the bug, to be clear, robcee did bug me about it today, but I was distracted, looking at a couple of other fires).

robcee: let's do this on Friday during the day; you gonna be around?
Assignee | ||
Comment 18•17 years ago
Update: bug 387128 requests cloning the new VM for the reftest. I'll be switching the nightly tinderbox over this afternoon; we'll see what happens.
Assignee | ||
Comment 19•17 years ago
Update: We attempted to switch over to the new refplatform for nightly builds on Friday; that went fine. However, when the performance testing machine (not unit test or ref test) tried to run the build, it failed to find a (pango) shared library. So, I reverted us back to the older ref platform, since keeping us on the new refplatform would have meant no performance data.

Reed and rhelmer pulled some heroics on Friday to get them up and I just pointed the new test machine to the new ref-test builds; they're both reporting to MozillaExperimental: http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaExperimental
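A missing shared library like the pango one described above can be spotted before a switchover by running ldd against the build on the test machine and looking for "not found" entries. The following is a rough sketch of that check; the function names are hypothetical and the sample output is illustrative, not captured from the perf machine.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as "not found"."""
    missing = []
    for line in ldd_output.splitlines():
        # Typical ldd lines look like "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
        # or "libfoo.so.1 => not found" when the loader cannot resolve them.
        name, separator, target = line.strip().partition(" => ")
        if separator and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path):
    """Run ldd on `path` and return its unresolved libraries (requires ldd)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)


# Illustrative sample only; addresses and paths are made up.
SAMPLE_LDD_OUTPUT = (
    "\tlibgtk-x11-2.0.so.0 => /usr/lib/libgtk-x11-2.0.so.0 (0x00b5c000)\n"
    "\tlibpango-1.0.so.0 => not found\n"
    "\tlibc.so.6 => /lib/libc.so.6 (0x00a3d000)\n"
)

if __name__ == "__main__":
    print(missing_libraries(SAMPLE_LDD_OUTPUT))
```

Running check_binary against the unpacked nightly on each perf box before repointing it would show whether the older runtime can load everything the new build links against.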
Reporter | ||
Comment 20•17 years ago
New machine's up and running, and reporting to: http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox

Just waiting to check in configuration files.
Comment 21•17 years ago
Are there bugs filed on the failing tests on qm-centos5-01? I didn't look at the unit/chrome tests, but I did check out the reftests yesterday, and 3/4 of them were due to font kerning. The other one looked like it could have been rounding; it looked like a 1 pixel difference in the size of a rect.
Reporter | ||
Comment 22•17 years ago
I blogged about them and roc saw it, does that count? ;) I don't believe individual bugs have been filed yet. We should do that.
Assignee | ||
Comment 23•17 years ago
So are we ready to make this switch again during Thursday's (12 July) nightly outage? Or do we need to wait for something else? I'm going to re-assign this bug to robcee, since I'm ready to go, to let him comment. There are bugs and test results reporting into MozillaExperimental now (although, they're currently down due to a VMware migration; should be back up shortly, though): http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaExperimental
Assignee | ||
Updated•17 years ago
Assignee: preed → rcampbell
Status: ASSIGNED → NEW
Reporter | ||
Comment 24•17 years ago
I believe we're ready to roll with this. I'll file individual bugs on the failures on qm-centos5-01 tomorrow if they haven't been filed already by then and send out pleas and bribes to try to get people looking at the errors.
Reporter | ||
Updated•17 years ago
Assignee: rcampbell → preed
Assignee | ||
Comment 25•17 years ago
fxnewref-linux-tbox is reporting into the Firefox page, cycle times look good, test results look good, I've posted to m.d.planning, m.d.a.firefox, and m.d.platform about its existence. Bug 387167 tracks renaming the machine to fx-linux-tbox; bug 388054 tracks an issue with spikes in the Tp/Ts graphs, which rhelmer is dealing with/has addressed.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering