Closed
Bug 820235
Opened 12 years ago
Closed 12 years ago
Perform verification of new Linux64 and Linux32 test reference platform on iX node
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: coop, Assigned: rail)
Attachments
(5 files, 3 obsolete files)
4.01 KB, patch | rail: review+ | dustin: checked-in+
2.55 KB, patch | catlee: review+ | rail: checked-in+
1.36 KB, patch | catlee: review+ | rail: checked-in+
6.19 KB, patch | catlee: review+ | rail: checked-in+
26.35 KB, patch | catlee: review+ | rail: checked-in+
Like the other new test reference platforms, we need to get the evaluation node hooked up to a dev buildbot master, run the full suite of tests against it, and document the test failure delta from the current ref platform.
Comment 1•12 years ago
(In reply to Chris Cooper [:coop] from comment #0)
> Like the other new test reference platforms, we need to get the evaluation
> node hooked up to a dev buildbot master, run the full suite of tests against
> it, and document the test failure delta from the current ref platform.
Hey Coop, do we have that test failure delta data from the current ref platform somewhere?
Reporter | ||
Comment 2•12 years ago
(In reply to Carsten Book [:Tomcat] from comment #1)
> Hey Coop, do we have that test failure delta data from the current ref
> platform somewhere?
Once the new test machine is ready for verification, it's as simple as pulling down the log from a current test run on mozilla-central and making sure you test the same packaged build on the new machine, i.e. uploading that build to dev-stage so the new machine can grab it. Alternatively, you could run the same build against both the old ref platform and the new ref platform in staging. That would eliminate any production vs. staging differences.
Not really something we can do in advance unless you preserve the logs & build... best to wait until the new machine is set up to minimize result drift.
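A minimal sketch of that comparison, assuming both logs come from the same packaged build and that failures are flagged with "TEST-UNEXPECTED-FAIL | <test> | <message>" lines (the usual convention in Mozilla test logs, but not something stated in this bug). The function names are hypothetical.

```python
import re

# Assumed failure marker; adjust the pattern if your logs use a different one.
FAIL_RE = re.compile(r"TEST-UNEXPECTED-FAIL \| ([^|\n]+)")


def failing_tests(log_text):
    """Return the set of test names reported as unexpected failures."""
    return {m.group(1).strip() for m in FAIL_RE.finditer(log_text)}


def failure_delta(old_log, new_log):
    """Compare the old and new ref platform logs for the same build."""
    old_fails = failing_tests(old_log)
    new_fails = failing_tests(new_log)
    return {
        "only_on_new": sorted(new_fails - old_fails),
        "only_on_old": sorted(old_fails - new_fails),
        "on_both": sorted(old_fails & new_fails),
    }
```

The "only_on_new" list is the part worth documenting: failures that appear on the new machine but not on the current ref platform.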
Comment 3•12 years ago
(In reply to Chris Cooper [:coop] from comment #2)
> Not really something we can do in advance unless you preserve the logs &
> build... best to wait until the new machine is set up to minimize result drift.
Thanks! The machine is up and I'm working on the Puppet stuff; I'll update the bug as I get news/results.
Comment 4•12 years ago
Tomcat, are you working on getting this to use X11 on the external graphics card? I can take a whack at it if you're not. That should probably be in a sub-bug anyway, since this bug is about evaluating the result.
Comment 5•12 years ago
So bug 838351 is for the graphics and audio, which I and/or someone else from IT will take care of. Tomcat, do you know of anything else that will need to be done before we put these into production?
Flags: needinfo?(cbook)
Comment 6•12 years ago
Tomcat, I'm going to grab ix-mn-linux64-001 for this purpose.
Comment 7•12 years ago
As I mentioned on IRC, the disableservices line is disabling services that aren't even installed in the Base group. Puppet fails to stop a service it doesn't know about, so instead this patch just uninstalls them where they're installed.
This also adds support for ipmitools on Ubuntu, since it's required on this hardware.
Graphics work will be in a patch on bug 838351.
Attachment #710451 -
Flags: review?(cbook)
Assignee | ||
Comment 8•12 years ago
Comment on attachment 710451 [details] [diff] [review]
bug820235.patch
s/modem-manager/modemmanager/
Otherwise looks good.
Attachment #710451 -
Flags: feedback+
Comment 9•12 years ago
Thanks for checking - I fixed that but I think I forgot to push the diff update.
Comment 10•12 years ago
Comment on attachment 710451 [details] [diff] [review]
bug820235.patch
r+ also see the notes from rail :)
Attachment #710451 -
Flags: review?(cbook) → review+
Flags: needinfo?(cbook)
Comment 11•12 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> So bug 838351 is for the graphics and audio, which I and/or someone else
> from IT will take care of. Tomcat, do you know of anything else that will
> need to be done before we put these into production?
No, so far I don't know of anything else. Thanks for checking; I'll update when I have something.
Comment 12•12 years ago
Comment on attachment 710451 [details] [diff] [review]
bug820235.patch
Per IRC conversations with Rail, this needs a bit more (installing ubuntu-desktop) and some changes to disableservices (install, but disable, anacron, since it's required for ubuntu-desktop)
Attachment #710451 -
Flags: review-
Assignee | ||
Comment 14•12 years ago
Comment on attachment 710788 [details] [diff] [review]
bug820235-r2.patch
Review of attachment 710788 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm
Attachment #710788 -
Flags: review?(rail) → review+
Updated•12 years ago
Attachment #710788 -
Flags: checked-in+
Updated•12 years ago
Attachment #710451 -
Attachment is obsolete: true
Comment 15•12 years ago
Close! I landed this to avoid including linux_desktop on Darwin:
diff --git a/modules/toplevel/manifests/slave/test.pp b/modules/toplevel/manifests/slave/test.pp
index 4416dbd..ab1e9aa 100644
--- a/modules/toplevel/manifests/slave/test.pp
+++ b/modules/toplevel/manifests/slave/test.pp
@@ -6,11 +6,15 @@ class toplevel::slave::test inherits toplevel::slave {
     # so we get the GUI for free and just need to ensure VNC is enabled.
     include vnc
     include screenresolution::talos
-    include packages::linux_desktop
     include users::builder::autologin
     include talos
     include ntp::atboot
     include packages::fonts
     include tweaks::fonts
     include tweaks::cleanup
+
+    # this will get fixed in a subsequent patch for bug 838351
+    if ($::operatingsystem == 'Ubuntu') {
+        include packages::linux_desktop
+    }
 }
Updated•12 years ago
Comment 16•12 years ago
I think this is ready to go, and in fact may already be done. Rail?
Flags: needinfo?(rail)
Assignee | ||
Comment 17•12 years ago
We still need to hook up these machines to one of the non-staging branches and run them in parallel with the fedora slaves. This would require some changes in buildbot-configs (evil loops, of course) and probably a person from the a-team to look at the possible failures.
We attached some of the machines to my staging master, replacing existing fedora slaves, but evaluating results without TBPL is hard.
Flags: needinfo?(rail)
Reporter | ||
Comment 18•12 years ago
(In reply to Rail Aliiev [:rail] from comment #17)
> We attached some of the machines to my staging master, replacing existing
> fedora slaves, but evaluating results without TBPL is hard.
Is cedar being specifically used for Win8 or could we also hook up the Fedora slaves there and just have the Windows guys ignore the Linux results and vice versa? Seems like the best use of existing resources to me.
Assignee | ||
Comment 19•12 years ago
Yeah, Cedar is my favorite too. :)
Assignee | ||
Comment 20•12 years ago
Attachment #729206 -
Flags: review?(catlee)
Assignee | ||
Comment 21•12 years ago
Assignee | ||
Updated•12 years ago
Attachment #729207 -
Flags: review?(catlee)
Assignee | ||
Comment 22•12 years ago
* use talos_slave_platforms by default and slave_platforms as fallback
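A rough illustration of that fallback, assuming each platform's config is a plain dict. Only the two key names come from the note above; the helper name and the sample values are hypothetical, not taken from the patch.

```python
def pick_slave_platforms(platform_config):
    """Prefer talos_slave_platforms; fall back to slave_platforms."""
    if "talos_slave_platforms" in platform_config:
        return platform_config["talos_slave_platforms"]
    return platform_config["slave_platforms"]


# Illustrative data only: one platform with a talos-specific list, one without.
with_talos = {
    "slave_platforms": ["fedora64", "ubuntu64_vm"],
    "talos_slave_platforms": ["ubuntu64_hw"],
}
without_talos = {"slave_platforms": ["fedora64"]}

print(pick_slave_platforms(with_talos))
print(pick_slave_platforms(without_talos))
```

Platforms that never define the talos-specific key keep their existing behaviour, which is what keeps the generated builder diffs small.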
Assignee | ||
Updated•12 years ago
Attachment #729211 -
Flags: review?(catlee)
Assignee | ||
Comment 23•12 years ago
It generates sane diffs:
config.py: https://gist.github.com/rail/5240897
builders: https://gist.github.com/rail/5240891
Attachment #729212 -
Flags: review?(catlee)
Assignee | ||
Comment 24•12 years ago
The only difference is the number of slaves (100 vs 50) in production_config.py
Attachment #729212 -
Attachment is obsolete: true
Attachment #729212 -
Flags: review?(catlee)
Attachment #729232 -
Flags: review?(catlee)
Updated•12 years ago
Attachment #729206 -
Flags: review?(catlee) → review+
Updated•12 years ago
Attachment #729207 -
Flags: review?(catlee) → review+
Updated•12 years ago
Attachment #729211 -
Flags: review?(catlee) → review+
Comment 25•12 years ago
Comment on attachment 729232 [details] [diff] [review]
configs
Review of attachment 729232 [details] [diff] [review]:
-----------------------------------------------------------------
::: mozilla-tests/BuildSlaves.py.template
@@ +25,5 @@
> "ubuntu64_vm-b2g": "pass",
> "ubuntu64_vm": "pass",
> + "ubuntu32_hw": "pass",
> + "ubuntu64_hw-b2g": "pass",
> + "ubuntu64_hw": "pass",
nit: can you sort these platforms? maybe group all the ubuntu32 variants together, and then all the ubuntu64 variants together?
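For what it's worth, a plain lexicographic sort already produces that grouping for the keys quoted above. A small sketch (the dict name is hypothetical and the values are the template placeholders):

```python
# Keys copied from the quoted hunk; "pass" is the template's placeholder value.
slave_passwords = {
    "ubuntu64_vm-b2g": "pass",
    "ubuntu64_vm": "pass",
    "ubuntu32_hw": "pass",
    "ubuntu64_hw-b2g": "pass",
    "ubuntu64_hw": "pass",
}

# Sorting the keys puts every ubuntu32 variant before every ubuntu64 variant.
for platform in sorted(slave_passwords):
    print('    "%s": "%s",' % (platform, slave_passwords[platform]))
```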
Attachment #729232 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 26•12 years ago
Comment on attachment 729232 [details] [diff] [review]
configs
http://hg.mozilla.org/build/puppet-manifests/rev/b4887372734d
with platforms sorted
Attachment #729232 -
Flags: checked-in+
Assignee | ||
Comment 27•12 years ago
Comment on attachment 729207 [details] [diff] [review]
buildapi
http://hg.mozilla.org/build/buildapi/rev/d6c6f0fe65b1
Attachment #729207 -
Flags: checked-in+
Assignee | ||
Comment 28•12 years ago
Comment on attachment 729211 [details] [diff] [review]
buildbotcustom
http://hg.mozilla.org/build/buildbotcustom/rev/1b90a741fdd9
Attachment #729211 -
Flags: checked-in+
Assignee | ||
Updated•12 years ago
Attachment #729206 -
Flags: checked-in+
Assignee | ||
Comment 29•12 years ago
Comment on attachment 729232 [details] [diff] [review]
configs
http://hg.mozilla.org/build/buildbot-configs/rev/fa82bb34dd25 actually
Assignee | ||
Comment 30•12 years ago
Back out: http://hg.mozilla.org/build/buildbot-configs/rev/c5631ad322a0
INFO - created "bm18-tests1-linux" master, running checkconfig
INFO - starting to print log file '/builds/buildbot/preproduction/slave/test-masters/buildbot-configs/test-output/bm18-tests1-linux-jp7wI4-checkconfig.log'
INFO - /builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/twisted/mail/smtp.py:10: DeprecationWarning: the MimeWriter module is deprecated; use the email package instead
INFO - import MimeWriter, tempfile, rfc822
INFO - Traceback (most recent call last):
INFO - File "/builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/buildbot-0.8.2_hg_41fc8a9db7a0_production_0.8-py2.6.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
INFO - ConfigLoader(configFileName=configFileName)
INFO - File "/builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/buildbot-0.8.2_hg_41fc8a9db7a0_production_0.8-py2.6.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
INFO - self.loadConfig(configFile, check_synchronously_only=True)
INFO - File "/builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/buildbot-0.8.2_hg_41fc8a9db7a0_production_0.8-py2.6.egg/buildbot/master.py", line 808, in loadConfig
INFO - % (b['name'], n))
INFO - ValueError: builder Ubuntu HW 12.04 x64 cedar talos svgr uses undefined slave talos-linux64-ix-001
INFO - finished printing log file '/builds/buildbot/preproduction/slave/test-masters/buildbot-configs/test-output/bm18-tests1-linux-jp7wI4-checkconfig.log'
ERROR - TEST-FAIL bm18-tests1-linux failed to run checkconfig
Hmmm. It worked fine on my <s>laptop</s> dev-master...
Assignee | ||
Updated•12 years ago
Attachment #729232 -
Flags: checked-in+ → checked-in-
Assignee | ||
Comment 31•12 years ago
Hmmm, it turns out that running dump_master.py and builder_list.py doesn't check the configs...
A trivial interdiff: https://gist.github.com/rail/5248201
It passes test-master.sh now and has the same diff of list of builders.
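For reference, the kind of consistency check that checkconfig tripped over can be sketched as below. This is a standalone approximation, not buildbot's actual code; it assumes builders are dicts with "name" and "slavenames" keys, and the helper name is hypothetical.

```python
def undefined_slaves(builders, slaves):
    """Yield (builder name, slave name) pairs that reference unknown slaves."""
    known = set(slaves)
    for builder in builders:
        for name in builder["slavenames"]:
            if name not in known:
                yield builder["name"], name


# Builder and slave names taken from the traceback in comment 30.
builders = [
    {
        "name": "Ubuntu HW 12.04 x64 cedar talos svgr",
        "slavenames": ["talos-linux64-ix-001"],
    },
]

for builder_name, slave_name in undefined_slaves(builders, slaves=[]):
    print("builder %s uses undefined slave %s" % (builder_name, slave_name))
```

Running a check like this alongside the builder-list diff would catch the mismatch before the configs reach the preproduction master.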
Attachment #729232 -
Attachment is obsolete: true
Attachment #729700 -
Flags: review?(catlee)
Updated•12 years ago
Attachment #729700 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 32•12 years ago
Comment on attachment 729700 [details] [diff] [review]
configs v2
http://hg.mozilla.org/build/buildbot-configs/rev/dfb9ba1d7083
Attachment #729700 -
Flags: checked-in+
Comment 33•12 years ago
random aside - can we make the hostnames consistent between HW and VMs? We currently have
tst-linux32-ec2-XXX
talos-linux32-ix-YYY
Does it still make sense to have 'talos' in the hostname?
Assignee | ||
Comment 34•12 years ago
b2g emulator tests are enabled on cedar in parallel with the old ones
Depends on: 858214
Assignee | ||
Comment 35•12 years ago
(In reply to Chris AtLee [:catlee] from comment #33)
> random aside - can we make the hostnames consistent between HW and VMs? We
> currently have
>
> tst-linux32-ec2-XXX
> talos-linux32-ix-YYY
>
> Does it still make sense to have 'talos' in the hostname?
Yeah, I think this will be better.
Amy, how long might it take to rename the slaves?
From our side this will require changes in puppet patterns and buildbot configs.
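To give an idea of what a hostname pattern has to cover today, here is a sketch of a matcher for the two schemes quoted above. The regex and helper are hypothetical and are not the patterns actually used in puppet or buildbot-configs.

```python
import re

# Hypothetical pattern covering both current schemes:
#   tst-linux32-ec2-XXX  and  talos-linux32-ix-YYY
HOST_RE = re.compile(r"^(tst|talos)-linux(32|64)-(ec2|ix)-(\d+)$")


def parse_test_host(hostname):
    """Split a test slave hostname into its parts, or return None."""
    m = HOST_RE.match(hostname)
    if m is None:
        return None
    prefix, bits, provider, index = m.groups()
    return {
        "prefix": prefix,
        "bits": int(bits),
        "provider": provider,
        "index": int(index),
    }


print(parse_test_host("talos-linux64-ix-001"))
```

With a single prefix for both HW and VMs, the first alternation would collapse to one literal and every consumer of the pattern would get a little simpler.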
Flags: needinfo?(arich)
Reporter | ||
Updated•12 years ago
Assignee: cbook → rail
Comment 36•12 years ago
So this is a significant couple of days' worth of work to reimage all of the linux hosts we already have. We would also need to change dns, inventory, and nagios. We also would need to change all of inventory and dns for the windows xp and w7 hosts as well, since the hardware is on order and all of these things have already been pre-populated.
I thought that talos was still a useful designator, because the whole reason we have physical hardware (vs the AWS machines) was because we *needed* it for talos (and graphics) tests (eg AWS can't do talos). So to me it makes sense to still differentiate.
Flags: needinfo?(arich)
Comment 37•12 years ago
(In reply to Amy Rich [:arich] [:arr] from comment #36)
> So this is a significant couple of days worth of work to reimage all of the
> linux hosts we already have. We would also need to change dns, inventory,
> and nagios. We also would need to change all of inventory and dns for the
> windows xp and w7 hosts as well since the hardware is on order and all of
> these things have already been pre-populated.
>
> I thought that talos was still a useful designator, because the whole reason
> we have physical hardware (vs the AWS machines) was because we *needed* it
> for talos (and graphics) tests (eg AWS can't do talos). So to me it makes
> sense to still differentiate.
This probably got clarified somewhere else but in-house test machines will be doing mainly two types of jobs:
* unit tests that need graphics card support
* talos jobs
We also established a hostname naming convention that is going through review by Relops and DCops.
Assignee | ||
Comment 38•12 years ago
ATM this platform can be used for Talos without any problems. B2G emulator failures are tracked by bug 850105.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 39•12 years ago
BTW, (almost) everything here applies to 32-bit platform as well. Updating the summary accordingly.
Summary: Perform verification of new Linux64 test reference platform on iX node → Perform verification of new Linux64 and Linux32 test reference platform on iX node
Updated•11 years ago
Product: mozilla.org → Release Engineering
Updated•7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard