Closed Bug 820235 Opened 12 years ago Closed 11 years ago

Perform verification of new Linux64 and Linux32 test reference platform on iX node

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: coop, Assigned: rail)

Attachments

(5 files, 3 obsolete files)

bug820235.patch, 11 years ago, Dustin J. Mitchell [:dustin] (he/him), 1.90 KB, patch; flags: cbook: review+, dustin: review-, rail: feedback+
bug820235-r2.patch, 11 years ago, Dustin J. Mitchell [:dustin] (he/him), 4.01 KB, patch; flags: rail: review+, dustin: checked-in+
sekrets, 11 years ago, Rail Aliiev [:rail], 2.55 KB, patch; flags: catlee: review+, rail: checked-in+
buildapi, 11 years ago, Rail Aliiev [:rail], 1.36 KB, patch; flags: catlee: review+, rail: checked-in+
buildbotcustom, 11 years ago, Rail Aliiev [:rail], 6.19 KB, patch; flags: catlee: review+, rail: checked-in+
configs, 11 years ago, Rail Aliiev [:rail], 23.71 KB, patch; flags: none
configs, 11 years ago, Rail Aliiev [:rail], 23.71 KB, patch; flags: catlee: review+, rail: checked-in-
configs v2, 11 years ago, Rail Aliiev [:rail], 26.35 KB, patch; flags: catlee: review+, rail: checked-in+

Chris Cooper [:coop] (he/him)

Reporter

Description

•

12 years ago

Like the other new test reference platforms, we need to get the evaluation node hooked up to a dev buildbot master, run the full suite of tests against it, and document the test failure delta from the current ref platform.

Chris Cooper [:coop] (he/him)

Reporter

Updated

•

12 years ago

Depends on: 820238

Chris Cooper [:coop] (he/him)

Reporter

Updated

•

12 years ago

Blocks: 820243

Carsten Book [:Tomcat]

Comment 1

•

12 years ago

(In reply to Chris Cooper [:coop] from comment #0)
> Like the other new test reference platforms, we need to get the evaluation
> node hooked up to a dev buildbot master, run the full suite of tests against
> it, and document the test failure delta from the current ref platform.

Hey Coop, do we have that test failure delta data from the current ref platform somewhere?

Chris Cooper [:coop] (he/him)

Reporter

Comment 2

•

12 years ago

(In reply to Carsten Book [:Tomcat] from comment #1) 
> Hey Coop, do we have that test failure delta data from the current ref
> platform somewhere?

Once the new test machine is ready for verification, it's as simple as pulling down the log from a current test run on mozilla-central and making sure you test the same packaged build on the new machine, i.e. uploading that build to dev-stage so the new machine can grab it. Alternatively, you could run the same build against both the old ref platform and the new ref platform in staging. That would eliminate any production vs. staging differences.

Not really something we can do in advance unless you preserve the logs & build...best to wait until the new machine is set up to minimize result drift.
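
A minimal sketch of the comparison described above, assuming both runs produce plain-text logs in which each unexpected failure appears on a line of the form `TEST-UNEXPECTED-FAIL | <test> | <message>`. The marker string, the function names, and the layout of the log lines are assumptions for illustration; none of them are taken from this bug.

```python
def failed_tests(log_text, marker="TEST-UNEXPECTED-FAIL"):
    """Return the set of test names reported as failing in one log."""
    failures = set()
    for line in log_text.splitlines():
        if marker not in line:
            continue
        # Assumed layout: "<marker> | <test name> | <message>"
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 2 and fields[1]:
            failures.add(fields[1])
    return failures


def failure_delta(old_log, new_log):
    """Compare two runs of the same packaged build on two platforms."""
    old, new = failed_tests(old_log), failed_tests(new_log)
    return {
        "only_on_new": sorted(new - old),  # candidate regressions on the new platform
        "only_on_old": sorted(old - new),  # failures that went away
        "on_both": sorted(old & new),      # known failures, not platform-specific
    }
```

Running both logs from the same packaged build is what makes the `only_on_new` list meaningful: any difference can then be attributed to the platform rather than to the build.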

Carsten Book [:Tomcat]

Comment 3

•

11 years ago

(In reply to Chris Cooper [:coop] from comment #2)

> Not really something we can do in advance unless you preserve the logs &
> build...best to wait until the new machine is set up to minimize result drift.

Thanks! The machine is up and I'm working on the puppet stuff; I will update the bug as I get news/results.

Dustin J. Mitchell [:dustin] (he/him)

Comment 4

•

11 years ago

Tomcat, are you working on getting this to use X11 on the external graphics card?  I can take a whack at it if you're not.  That should probably be in a sub-bug anyway, since this bug is about evaluating the result.

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

11 years ago

Depends on: 838351

Dustin J. Mitchell [:dustin] (he/him)

Comment 5

•

11 years ago

So bug 838351 is for the graphics and audio, which I and/or someone else from IT will take care of.  Tomcat, do you know of anything else that will need to be done before we put these into production?

Flags: needinfo?(cbook)

Dustin J. Mitchell [:dustin] (he/him)

Comment 6

•

11 years ago

Tomcat, I'm going to grab ix-mn-linux64-001 for this purpose.

Dustin J. Mitchell [:dustin] (he/him)

Comment 7

•

11 years ago

Attached patch bug820235.patch (obsolete) — Details — Splinter Review

As I had mentioned in irc, the disableservices line is disabling services which aren't even installed in the Base group.  Puppet fails to stop a service it doesn't know about, so instead this just uninstalls them where they're installed.

This also adds support for ipmitools on Ubuntu, since it's required on this hardware.

Graphics work will be in a patch on bug 838351.

Attachment #710451 - Flags: review?(cbook)

Rail Aliiev [:rail]

Assignee

Comment 8

•

11 years ago

Comment on attachment 710451 [details] [diff] [review]
bug820235.patch

s/modem-manager/modemmanager/
Otherwise looks good.

Attachment #710451 - Flags: feedback+

Dustin J. Mitchell [:dustin] (he/him)

Comment 9

•

11 years ago

Thanks for checking - I fixed that but I think I forgot to push the diff update.

Carsten Book [:Tomcat]

Comment 10

•

11 years ago

Comment on attachment 710451 [details] [diff] [review]
bug820235.patch

r+ also see the notes from rail :)

Attachment #710451 - Flags: review?(cbook) → review+

Flags: needinfo?(cbook)

Carsten Book [:Tomcat]

Comment 11

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> So bug 838351 is for the graphics and audio, which I and/or someone else
> from IT will take care of.  Tomcat, do you know of anything else that will
> need to be done before we put these into production?

No, so far I don't know of anything else. Thanks for checking; I will update when I have something.

Dustin J. Mitchell [:dustin] (he/him)

Comment 12

•

11 years ago

Comment on attachment 710451 [details] [diff] [review]
bug820235.patch

Per IRC conversations with Rail, this needs a bit more (installing ubuntu-desktop) and some changes to disableservices (install, but disable, anacron, since it's required for ubuntu-desktop)

Attachment #710451 - Flags: review-

Dustin J. Mitchell [:dustin] (he/him)

Comment 13

•

11 years ago

Attached patch bug820235-r2.patch — Details — Splinter Review

This should do the trick.

Attachment #710788 - Flags: review?(rail)

Rail Aliiev [:rail]

Assignee

Comment 14

•

11 years ago

Comment on attachment 710788 [details] [diff] [review]
bug820235-r2.patch

Review of attachment 710788 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm

Attachment #710788 - Flags: review?(rail) → review+

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

11 years ago

Attachment #710788 - Flags: checked-in+

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

11 years ago

Attachment #710451 - Attachment is obsolete: true

Dustin J. Mitchell [:dustin] (he/him)

Comment 15

•

11 years ago

Close!  I landed this to avoid including linux_desktop on Darwin:

diff --git a/modules/toplevel/manifests/slave/test.pp b/modules/toplevel/manifests/slave/test.pp
index 4416dbd..ab1e9aa 100644
--- a/modules/toplevel/manifests/slave/test.pp
+++ b/modules/toplevel/manifests/slave/test.pp
@@ -6,11 +6,15 @@ class toplevel::slave::test inherits toplevel::slave {
     # so we get the GUI for free and just need to ensure VNC is enabled.
     include vnc
     include screenresolution::talos
-    include packages::linux_desktop
     include users::builder::autologin
     include talos
     include ntp::atboot
     include packages::fonts
     include tweaks::fonts
     include tweaks::cleanup
+
+    # this will get fixed in a subsequent patch for bug 838351
+    if ($::operatingsystem == 'Ubuntu') {
+        include packages::linux_desktop
+    }
 }

Armen [:armenzg]

Updated

•

11 years ago

Blocks: linux64-ix-releng
No longer blocks: 820243

Dustin J. Mitchell [:dustin] (he/him)

Comment 16

•

11 years ago

I think this is ready to go, and in fact may already be done.  Rail?

Flags: needinfo?(rail)

Rail Aliiev [:rail]

Assignee

Comment 17

•

11 years ago

We still need to hook up these machines to one of the non-staging branches and run them in parallel with fedora slaves. This would require some changes in buildbot-configs (evil loops, of course) and probably a person from a-team to look at the possible failures.

We attached some of the machines to my staging master, replacing existing fedora slaves, but evaluating results without TBPL is hard.

Flags: needinfo?(rail)

Chris Cooper [:coop] (he/him)

Reporter

Comment 18

•

11 years ago

(In reply to Rail Aliiev [:rail] from comment #17)
> We attached some of the machines to my staging master, replacing existing
> fedora slaves, but evaluating results without TBPL is hard.

Is cedar being specifically used for Win8 or could we also hook up the Fedora slaves there and just have the Windows guys ignore the Linux results and vice versa? Seems like the best use of existing resources to me.

Rail Aliiev [:rail]

Assignee

Comment 19

•

11 years ago

Yeah, Cedar is my favorite too. :)

Rail Aliiev [:rail]

Assignee

Comment 20

•

11 years ago

Attached patch sekrets — Details — Splinter Review

Attachment #729206 - Flags: review?(catlee)

Rail Aliiev [:rail]

Assignee

Comment 21

•

11 years ago

Attached patch buildapi — Details — Splinter Review

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Attachment #729207 - Flags: review?(catlee)

Rail Aliiev [:rail]

Assignee

Comment 22

•

11 years ago

Attached patch buildbotcustom — Details — Splinter Review

* use talos_slave_platforms by default and slave_platforms as fallback
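
A rough sketch of the lookup this note describes, assuming each platform's settings are a plain dict. The function name and the `fedora64` value in the usage line are hypothetical and are not taken from buildbotcustom.

```python
def pick_talos_platforms(platform_config):
    """Prefer 'talos_slave_platforms'; fall back to 'slave_platforms'."""
    if "talos_slave_platforms" in platform_config:
        return platform_config["talos_slave_platforms"]
    # Configs that predate the new key only define 'slave_platforms'.
    return platform_config["slave_platforms"]


# Usage: a config without the new key keeps its existing behaviour.
print(pick_talos_platforms({"slave_platforms": ["fedora64"]}))
```

Keeping the fallback means configs that never mention `talos_slave_platforms` do not need to change.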

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Attachment #729211 - Flags: review?(catlee)

Rail Aliiev [:rail]

Assignee

Comment 23

•

11 years ago

Attached patch configs (obsolete) — Details — Splinter Review

It generates sane diffs:

config.py: https://gist.github.com/rail/5240897
builders: https://gist.github.com/rail/5240891

Attachment #729212 - Flags: review?(catlee)

Rail Aliiev [:rail]

Assignee

Comment 24

•

11 years ago

Attached patch configs (obsolete) — Details — Splinter Review

The only difference is the number of slaves (100 vs 50) in production_config.py.

Attachment #729212 - Attachment is obsolete: true

Attachment #729212 - Flags: review?(catlee)

Attachment #729232 - Flags: review?(catlee)

Chris AtLee [:catlee]

Updated

•

11 years ago

Attachment #729206 - Flags: review?(catlee) → review+

Chris AtLee [:catlee]

Updated

•

11 years ago

Attachment #729207 - Flags: review?(catlee) → review+

Chris AtLee [:catlee]

Updated

•

11 years ago

Attachment #729211 - Flags: review?(catlee) → review+

Chris AtLee [:catlee]

Comment 25

•

11 years ago

Comment on attachment 729232 [details] [diff] [review]
configs

Review of attachment 729232 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-tests/BuildSlaves.py.template
@@ +25,5 @@
>      "ubuntu64_vm-b2g": "pass",
>      "ubuntu64_vm": "pass",
> +    "ubuntu32_hw": "pass",
> +    "ubuntu64_hw-b2g": "pass",
> +    "ubuntu64_hw": "pass",

nit: can you sort these platforms? maybe group all the ubuntu32 variants together, and then all the ubuntu64 variants together?
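
One way to get this grouping, as a sketch: a plain lexicographic sort of the keys already puts every `ubuntu32` variant ahead of every `ubuntu64` variant. The helper name is hypothetical and the dict holds only the five entries quoted above.

```python
def render_sorted_entries(entries, indent="    "):
    """Render dict entries as template lines, sorted by platform name."""
    lines = []
    for platform, password in sorted(entries.items()):
        lines.append(f'{indent}"{platform}": "{password}",')
    return "\n".join(lines)


entries = {
    "ubuntu64_vm-b2g": "pass",
    "ubuntu64_vm": "pass",
    "ubuntu32_hw": "pass",
    "ubuntu64_hw-b2g": "pass",
    "ubuntu64_hw": "pass",
}
print(render_sorted_entries(entries))
```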

Attachment #729232 - Flags: review?(catlee) → review+

Rail Aliiev [:rail]

Assignee

Comment 26

•

11 years ago

Comment on attachment 729232 [details] [diff] [review]
configs

http://hg.mozilla.org/build/puppet-manifests/rev/b4887372734d
with platforms sorted

Attachment #729232 - Flags: checked-in+

Rail Aliiev [:rail]

Assignee

Comment 27

•

11 years ago

Comment on attachment 729207 [details] [diff] [review]
buildapi

http://hg.mozilla.org/build/buildapi/rev/d6c6f0fe65b1

Attachment #729207 - Flags: checked-in+

Rail Aliiev [:rail]

Assignee

Comment 28

•

11 years ago

Comment on attachment 729211 [details] [diff] [review]
buildbotcustom

http://hg.mozilla.org/build/buildbotcustom/rev/1b90a741fdd9

Attachment #729211 - Flags: checked-in+

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Attachment #729206 - Flags: checked-in+

Rail Aliiev [:rail]

Assignee

Comment 29

•

11 years ago

Comment on attachment 729232 [details] [diff] [review]
configs

http://hg.mozilla.org/build/buildbot-configs/rev/fa82bb34dd25 actually

Rail Aliiev [:rail]

Assignee

Comment 30

•

11 years ago

Back out: http://hg.mozilla.org/build/buildbot-configs/rev/c5631ad322a0


INFO  - created  "bm18-tests1-linux" master, running checkconfig
INFO  - starting to print log file '/builds/buildbot/preproduction/slave/test-masters/buildbot-configs/test-output/bm18-tests1-linux-jp7wI4-checkconfig.log'
INFO  - /builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/twisted/mail/smtp.py:10: DeprecationWarning: the MimeWriter module is deprecated; use the email package instead
INFO  -   import MimeWriter, tempfile, rfc822
INFO  - Traceback (most recent call last):
INFO  -   File "/builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/buildbot-0.8.2_hg_41fc8a9db7a0_production_0.8-py2.6.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
INFO  -     ConfigLoader(configFileName=configFileName)
INFO  -   File "/builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/buildbot-0.8.2_hg_41fc8a9db7a0_production_0.8-py2.6.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
INFO  -     self.loadConfig(configFile, check_synchronously_only=True)
INFO  -   File "/builds/buildbot/preproduction/slave/test-masters/sandbox/lib/python2.6/site-packages/buildbot-0.8.2_hg_41fc8a9db7a0_production_0.8-py2.6.egg/buildbot/master.py", line 808, in loadConfig
INFO  -     % (b['name'], n))
INFO  - ValueError: builder Ubuntu HW 12.04 x64 cedar talos svgr uses undefined slave talos-linux64-ix-001
INFO  - finished printing log file '/builds/buildbot/preproduction/slave/test-masters/buildbot-configs/test-output/bm18-tests1-linux-jp7wI4-checkconfig.log'
ERROR - TEST-FAIL bm18-tests1-linux failed to run checkconfig


Hmmm. It worked fine on my <s>laptop</s> dev-master...

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Attachment #729232 - Flags: checked-in+ → checked-in-

Rail Aliiev [:rail]

Assignee

Comment 31

•

11 years ago

Attached patch configs v2 — Details — Splinter Review

Hmmm, it turns out that running dump_master.py and builder_list.py doesn't check the configs...

A trivial interdiff: https://gist.github.com/rail/5248201

It passes test-master.sh now and has the same diff of list of builders.
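
The failure in comment 30 is a consistency check: every slave a builder names has to be defined on the master. A minimal sketch of that kind of check follows, assuming builders are dicts with `name` and `slavenames` keys and the defined slaves are a set of hostnames. It illustrates the idea only and is not Buildbot's checkconfig code; `talos-linux64-ix-002` is a made-up hostname.

```python
def undefined_slaves(builders, defined_slaves):
    """Return (builder name, slave name) pairs that reference unknown slaves."""
    problems = []
    for builder in builders:
        for slave in builder["slavenames"]:
            if slave not in defined_slaves:
                problems.append((builder["name"], slave))
    return problems


builders = [
    {
        "name": "Ubuntu HW 12.04 x64 cedar talos svgr",
        "slavenames": ["talos-linux64-ix-001", "talos-linux64-ix-002"],
    },
]
for name, slave in undefined_slaves(builders, {"talos-linux64-ix-002"}):
    print(f"builder {name} uses undefined slave {slave}")
```

A check like this could run next to the builder-list diff, since the diff alone does not validate the config.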

Attachment #729232 - Attachment is obsolete: true

Attachment #729700 - Flags: review?(catlee)

Chris AtLee [:catlee]

Updated

•

11 years ago

Attachment #729700 - Flags: review?(catlee) → review+

Rail Aliiev [:rail]

Assignee

Comment 32

•

11 years ago

Comment on attachment 729700 [details] [diff] [review]
configs v2

http://hg.mozilla.org/build/buildbot-configs/rev/dfb9ba1d7083

Attachment #729700 - Flags: checked-in+

Chris AtLee [:catlee]

Comment 33

•

11 years ago

random aside - can we make the hostnames consistent between HW and VMs? We currently have

tst-linux32-ec2-XXX
talos-linux32-ix-YYY

Does it still make sense to have 'talos' in the hostname?

Rail Aliiev [:rail]

Assignee

Comment 34

•

11 years ago

b2g emulator tests are enabled on cedar in parallel with the old ones

Depends on: 858214

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Depends on: 858587

Rail Aliiev [:rail]

Assignee

Comment 35

•

11 years ago

(In reply to Chris AtLee [:catlee] from comment #33)
> random aside - can we make the hostnames consistent between HW and VMs? We
> currently have
> 
> tst-linux32-ec2-XXX
> talos-linux32-ix-YYY
> 
> Does it still make sense to have 'talos' in the hostname?

Yeah, I think this will be better.

Amy, how long might it take to rename the slaves?

From our side this will require changes in puppet patterns and buildbot configs.

Flags: needinfo?(arich)

Chris Cooper [:coop] (he/him)

Reporter

Updated

•

11 years ago

Assignee: cbook → rail

Amy Rich [:arr] [:arich]

Comment 36

•

11 years ago

So this is a significant couple of days' worth of work to reimage all of the linux hosts we already have.  We would also need to change dns, inventory, and nagios.  We also would need to change all of inventory and dns for the windows xp and w7 hosts as well since the hardware is on order and all of these things have already been pre-populated.

I thought that talos was still a useful designator, because the whole reason we have physical hardware (vs the AWS machines) was because we *needed* it for talos (and graphics) tests (eg AWS can't do talos).  So to me it makes sense to still differentiate.

Flags: needinfo?(arich)

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Updated

•

11 years ago

Depends on: 859972

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Depends on: 859867

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Depends on: 861580

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Depends on: 862327

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

Depends on: 863022

Armen [:armenzg]

Comment 37

•

11 years ago

(In reply to Amy Rich [:arich] [:arr] from comment #36)
> So this is a significant couple of days worth of work to reimage all of the
> linux hosts we already have.  We would also need to change dns, inventory,
> and nagios.  We also would need to change all of inventory and dns for the
> windows xp and w7 hosts as well since the hardware is on order and all of
> these things have already been pre-populated.
> 
> I thought that talos was still a useful designator, because the whole reason
> we have physical hardware (vs the AWS machines) was because we *needed* it
> for talos (and graphics) tests (eg AWS can't do talos).  So to me it makes
> sense to still differentiate.

This probably got clarified somewhere else but in-house test machines will be doing mainly two types of jobs:
* unit tests that need graphics card support
* talos jobs

We also established a hostname naming convention that is going through review by Relops and DCops.

Rail Aliiev [:rail]

Assignee

Updated

•

11 years ago

No longer depends on: 859867

Rail Aliiev [:rail]

Assignee

Comment 38

•

11 years ago

ATM this platform can be used for Talos without any problems. B2G emulator failures are tracked by bug 850105.

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Rail Aliiev [:rail]

Assignee

Comment 39

•

11 years ago

BTW, (almost) everything here applies to the 32-bit platform as well. Updating the summary accordingly.

Summary: Perform verification of new Linux64 test reference platform on iX node → Perform verification of new Linux64 and Linux32 test reference platform on iX node

Nobody; OK to take it and work on it

Updated

•

11 years ago

Product: mozilla.org → Release Engineering

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: Platform Support → Buildduty

Product: Release Engineering → Infrastructure & Operations

BMO Automation

Updated

•

4 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
