Closed Bug 1007967 Opened 6 years ago Closed 6 years ago

Unable to puppetize a fresh tst-linux64-ec2 slave for loan

Categories

(Infrastructure & Operations :: RelOps: Puppet, task, blocker)

x86_64
Linux

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coop, Assigned: rail)


I've been banging my head against this all afternoon trying to get an ec2 slave loaned to jmaher using the instructions in the wiki[1]. 

While running aws_create_instance.py, puppet on the slave just loops indefinitely retrying because dependency issues are preventing the python install.

Log snippet:

The following packages have unmet dependencies:
 python2.7-dev : Depends: libexpat1-dev but it is not going to be installed
                 Depends: libssl-dev but it is not going to be installed
                 Recommends: libc6-dev but it is not going to be installed or
                             libc-dev
E: Unable to correct problems, you have held broken packages.

Full log is here:

https://coop.pastebin.mozilla.org/5104697

1. https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#Test_machines
Looking at the syslog on tst-linux64-ec2-jmaher, there was an earlier failure trying to update authorized_keys:

May  8 20:58:03 ip-10-3-55-244 puppet-agent[2108]: (/Stage[main]/Users::Root::Setup/Ssh::Userconfig[root]/Concat[/root/.ssh/authorized_keys]/Exec[concat_/root/.ssh/authorized_keys]) Triggered 'refresh' from 3 events
May  8 20:58:03 ip-10-3-55-244 puppet-agent[2108]: Could not back up /root/.ssh/authorized_keys: uninitialized constant Puppet::FileSystem::File
May  8 20:58:03 ip-10-3-55-244 puppet-agent[2108]: Could not back up /root/.ssh/authorized_keys: uninitialized constant Puppet::FileSystem::File
May  8 20:58:03 ip-10-3-55-244 puppet-agent[2108]: (/Stage[main]/Users::Root::Setup/Ssh::Userconfig[root]/Concat[/root/.ssh/authorized_keys]/File[/root/.ssh/authorized_keys]/content) change from {md5}0bac70bb8d4f61e6f38538be3638bdea to {md5}944908a4c2bba4dd624c05474c14b33a failed: Could not back up /root/.ssh/authorized_keys: uninitialized constant Puppet::FileSystem::File

Full syslog here:

https://people.mozilla.org/~coop/tst-linux64-ec2-jmaher.syslog
I've given up with this particular instance and terminated it. If anyone wants to try again for a 4th time, be my guest, otherwise I'll try again tomorrow AM.
I tackled this a bit and it looks like we have some issues here :/

We have missing dependencies in the releng-updates repo:

 - python2.7-dev : Depends: libexpat1-dev
 - libexpat1-dev : Depends: libc6-dev
 - libc6-dev : Depends: libc6 (= 2.15-0ubuntu10.2) but 2.15-0ubuntu10.5 is to be installed

$ apt-cache policy libc6
libc6:
  Installed: 2.15-0ubuntu10.5
  Candidate: 2.15-0ubuntu10.5
  Version table:
 *** 2.15-0ubuntu10.5 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/releng-updates/precise/ precise-updates/all amd64 Packages
        100 /var/lib/dpkg/status
     2.15-0ubuntu10.2 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/ubuntu/precise/ precise-security/main amd64 Packages
     2.15-0ubuntu10 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/ubuntu/precise/ precise/main amd64 Packages
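To check many hosts for this condition, the `apt-cache policy` output above can be parsed mechanically. The following is a minimal sketch, not anything taken from the PuppetAgain tooling: the function names `parse_policy` and `installed_origins` are hypothetical, and the parser assumes the classic layout shown above (a `***` marker on the installed version, with source lines following each version line).

```python
SAMPLE = """\
libc6:
  Installed: 2.15-0ubuntu10.5
  Candidate: 2.15-0ubuntu10.5
  Version table:
 *** 2.15-0ubuntu10.5 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/releng-updates/precise/ precise-updates/all amd64 Packages
        100 /var/lib/dpkg/status
     2.15-0ubuntu10.2 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/ubuntu/precise/ precise-security/main amd64 Packages
     2.15-0ubuntu10 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/ubuntu/precise/ precise/main amd64 Packages
"""


def _is_int(token):
    """Return True if the token parses as an integer (a pin priority)."""
    try:
        int(token)
        return True
    except ValueError:
        return False


def parse_policy(text):
    """Parse `apt-cache policy <pkg>` output into a small dict (hypothetical helper)."""
    info = {"installed": None, "candidate": None, "versions": []}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "Installed:" and len(tokens) > 1:
            info["installed"] = tokens[1]
        elif tokens[0] == "Candidate:" and len(tokens) > 1:
            info["candidate"] = tokens[1]
        elif tokens[0] == "***" and len(tokens) == 3:
            # Version line for the currently installed version.
            info["versions"].append(
                {"version": tokens[1], "current": True, "sources": []})
        elif len(tokens) == 2 and _is_int(tokens[1]):
            # Version line for a version that is available but not installed.
            info["versions"].append(
                {"version": tokens[0], "current": False, "sources": []})
        elif _is_int(tokens[0]) and info["versions"]:
            # Source line: "<priority> <origin...>", attached to the last version.
            info["versions"][-1]["sources"].append(" ".join(tokens[1:]))
    return info


def installed_origins(info):
    """Return the repository origins (not the dpkg status file) of the installed version."""
    for entry in info["versions"]:
        if entry["current"]:
            return [s for s in entry["sources"] if s.startswith("http")]
    return []


if __name__ == "__main__":
    info = parse_policy(SAMPLE)
    print(info["installed"], installed_origins(info))
```

A host whose installed libc6 comes from the releng-updates origin is one that has already picked up the incomplete eglibc set.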

If you browse to http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/releng-updates/precise/pool/main/e/eglibc/ there are only 4 files in the directory. If you look at http://archive.ubuntu.com/ubuntu/pool/main/e/eglibc/ you can find 30 files with version 2.15-0ubuntu10.5.

I think we added the file in bug 994061 (I cannot access it, though). Callek may have more insight into what happened there (I see his r+ in http://hg.mozilla.org/build/puppet/rev/6802e5b7ff5f).
Blocks: 986477
Severity: normal → blocker
Flags: needinfo?(bugspam.Callek)
Creating slave AMIs from base AMIs without incremental puppet updates would have caught this earlier. Soon!
Assignee: relops → rail
It looks like the broken eglibc packages aren't used in production:

$ apt-cache policy libc6
libc6:
  Installed: 2.15-0ubuntu10.2
  Candidate: 2.15-0ubuntu10.5
  Version table:
     2.15-0ubuntu10.5 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/releng-updates/precise/ precise-updates/all amd64 Packages
 *** 2.15-0ubuntu10.2 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/ubuntu/precise/ precise-security/main amd64 Packages
        100 /var/lib/dpkg/status
     2.15-0ubuntu10 0
        500 http://puppetagain-apt.pvt.build.mozilla.org/repos/apt/ubuntu/precise/ precise/main amd64 Packages
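The difference between the two hosts comes down to the exact-version pin reported earlier: libc6-dev depends on `libc6 (= 2.15-0ubuntu10.2)`. A fresh slave that already has libc6 2.15-0ubuntu10.5 cannot satisfy that pin, while a production slave still on 2.15-0ubuntu10.2 can. A rough sketch of that check follows. The function name `exact_pin_conflicts` is made up for illustration, and it only handles `pkg (= version)` dependencies, not the full Debian dependency syntax.

```python
import re

# Matches a single exact-version dependency such as "libc6 (= 2.15-0ubuntu10.2)".
DEP_RE = re.compile(r"^(?P<name>\S+) \(= (?P<version>\S+)\)$")


def exact_pin_conflicts(depends, installed):
    """Return (name, required, installed) for each exact pin that is not met.

    depends:   list of dependency strings, e.g. ["libc6 (= 2.15-0ubuntu10.2)"]
    installed: dict mapping package name to the installed version string
    """
    conflicts = []
    for dep in depends:
        match = DEP_RE.match(dep.strip())
        if match is None:
            continue  # unversioned or non-exact dependencies are out of scope here
        have = installed.get(match["name"])
        if have is not None and have != match["version"]:
            conflicts.append((match["name"], match["version"], have))
    return conflicts


if __name__ == "__main__":
    libc6_dev_depends = ["libc6 (= 2.15-0ubuntu10.2)"]
    # Fresh slave: libc6 already upgraded from releng-updates.
    print(exact_pin_conflicts(libc6_dev_depends, {"libc6": "2.15-0ubuntu10.5"}))
    # Production slave: libc6 still at the precise-security version.
    print(exact_pin_conflicts(libc6_dev_depends, {"libc6": "2.15-0ubuntu10.2"}))
```

The first call reports a conflict and the second returns an empty list, which matches what the two `apt-cache policy` outputs show.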


I'm going to (safely) remove them from the repo.
Flags: needinfo?(bugspam.Callek)
I killed the directory and regenerated the package indexes as described at
https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Packages#Adding_a_Single_Package

I rsynced the change to releng-puppet2.scl3 and am now waiting for it to rsync to the other masters so I can bump the repo flag.
it worked!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED