Closed Bug 1454911 Opened 7 years ago Closed 7 years ago

Run-Puppet :: puppet agent summary contains error

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: apop, Assigned: dividehex)

Attachments

(1 file)

Today, while I was monitoring the #buildduty channel, this notification appeared: [sns alert] Apr 18 10:16:11 y-2008-ec2-golden.try.releng.use1.mozilla.com Userdata: Run-Puppet :: puppet agent summary contains error#015 Can you please help with this issue, or point me to someone who knows how it can be solved?
I tried to log on to the server via ssh, but received the following error: Unable to negotiate with UNKNOWN port 65535: no matching cipher found. Their offer: 3des-cbc,blowfish-cbc,cast128-cbc,idea-cbc
Assignee: relops → jwatkins
This seems to be the commit that causes the issue: https://hg.mozilla.org/build/puppet/rev/d3a68c812267
The golden AMI should be recreated after the problem with Puppet has been resolved: Wed 13:41:06 UTC [7891] [] aws-manager2.srv.releng.scl3.mozilla.com:procs age - golden AMI is CRITICAL: ELAPSED CRITICAL: 15 crit, 0 warn out of 15 processes with args 'ec2-golden' (http://m.mozilla.org/procs+age+-+golden+AMI)
This looks like it is affecting all
Off the top of my head, there are 2 things this could be:
1: Golden AMI instances include puppet::periodic, which causes them to also include puppet::renew, which in turn fails because it doesn't support Windows. My assumption was that golden images only run puppet during creation and therefore do not include puppet::periodic.
2: Instance data JSON files contain the puppetmasters to puppetize against. They were changed yesterday to point only to MDC1/2, USE1, and USW2; the SCL3 masters were removed completely. This means golden images tried to puppetize elsewhere but may be blocked by flows.
This is the commit that changes the puppetmasters available for AWS instances to puppetize against: https://github.com/mozilla-releng/build-cloud-tools/commit/0a5449b2a03640a77b4022b50a58cbe0a68a4c82
Oops, didn't finish a thought. This looks like it is affecting all WINDOWS golden ec2 instances.
This facter module should explicitly define the OSes it is confined to:
2018-04-18 01:45:42 -0700 /File[C:/ProgramData/PuppetLabs/puppet/var/lib/facter/puppet_agent_cert.rb]/ensure (notice): defined content as '{md5}34ac2cb59cbf1af8f0213b8df5bc879d'
2018-04-18 01:45:47 -0700 Puppet (err): Could not retrieve local facts: uninitialized constant Facter::Core
2018-04-18 01:45:47 -0700 Puppet (err): Failed to apply catalog: Could not retrieve local facts: uninitialized constant Facter::Core
This prevents puppet::renew_cert and the puppet_agent_cert.rb facter module from running on Windows.
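For readers unfamiliar with confining a custom fact, here is a minimal sketch of the general technique. It is not the contents of the attached patch or of the real puppet_agent_cert.rb: the fact name example_agent_cert_supported, the constant SUPPORTED_KERNELS, and the helper cert_fact_supported? are all hypothetical, and the registration is skipped when the facter gem is not installed.

```ruby
# Hypothetical sketch only; names below are illustrative assumptions,
# not taken from the real puppet_agent_cert.rb.

# Kernels on which the fact is allowed to resolve (assumed from the
# patch title "Linux and Darwin only").
SUPPORTED_KERNELS = %w[Linux Darwin].freeze

# Plain-Ruby helper mirroring the confine check, so the rule can be
# exercised without Facter being present.
def cert_fact_supported?(kernel)
  SUPPORTED_KERNELS.include?(kernel)
end

begin
  require 'facter'
rescue LoadError
  # Facter is not installed here; the registration below is skipped.
end

if defined?(Facter) && Facter.respond_to?(:add)
  Facter.add(:example_agent_cert_supported) do
    # confine keeps the fact from resolving on any other kernel,
    # e.g. on Windows hosts.
    confine :kernel => SUPPORTED_KERNELS
    setcode { 'true' }
  end
end
```

With a confine in place, the fact's setcode block is not evaluated on kernels outside the list, which is the behaviour the comment above asks for.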
Attachment #8968979 - Flags: review?(dhouse)
Attachment #8968979 - Flags: review?(dhouse) → review+
See Also: → 1444467
(In reply to Jake Watkins [:dividehex] from comment #9)
> Comment on attachment 8968979 [details] [diff] [review]
> Limit cert renew manifest and facter to Linux and Darwin only
>
> https://hg.mozilla.org/build/puppet/rev/c1f4dc76da6979282dbd0fc631d736aff9a94489
> https://hg.mozilla.org/build/puppet/rev/0309058909d8525b7020ec61e183b9e1e7806887
This changeset has propagated to all the puppetmasters by now. It should be safe to kill and restart the golden image generation for the Windows hosts that failed earlier.
y-2008-ec2-golden.try.releng.use1.mozilla.com Userdata: Run-Puppet :: puppet agent summary contains error#015.
^^Still coming in even after I terminated y-2008-ec2-golden and the daily cron re-created it.
Any updates on the #015 error?
Flags: needinfo?(jwatkins)
(In reply to Zsolt Fay [:zsoltfay] from comment #11)
> y-2008-ec2-golden.try.releng.use1.mozilla.com Userdata: Run-Puppet :: puppet agent summary contains error#015.
> ^^Still coming in even after I terminated y-2008-ec2-golden and the daily cron re-created it.
>
> Any updates on the #015 error?
The patch I applied fixed the bustage caused by puppet_agent_cert.rb. After that, it looks like it was still suffering from problems caused by another patch (https://hg.mozilla.org/build/puppet/rev/07e4d452e404). I pinged bhearsum in #buildduty and he pushed more patches that should have fixed the issue. As of right now, I've seen y-2008-ec2-golden.try.releng.use1.mozilla.com successfully complete a puppet run. Therefore, I believe this issue is fixed.
Flags: needinfo?(jwatkins)
I'll mark this as fixed then. Will re-open it if y-2008-ec2-golden misbehaves.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED