Closed Bug 1450035 Opened 7 years ago Closed 6 years ago

Package and deploy python 2.7.15 to release infrastructure

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: bhearsum)

References

Details

(Whiteboard: [keep-open][leave-open][releng:q22018])

Attachments

(8 files, 3 obsolete files)

- 78.48 KB, patch (rail: review+, bhearsum: checked-in+)
- 11.42 KB, patch (rail: review+, bhearsum: checked-in+)
- 3.08 KB, patch (rail: review+, bhearsum: checked-in+)
- 5.31 KB, patch (rail: review+, bhearsum: checked-in+)
- 995 bytes, patch (mtabara: review+, bhearsum: checked-in+)
- 764 bytes, patch (mtabara: review+, bhearsum: checked-in+)
- 868 bytes, patch (Callek: review+, bhearsum: checked-in+)
- 3.85 KB, patch (mozilla: review+, bhearsum: checked-in+)
No description provided.
Assignee: nobody → bhearsum
Because build & test machines aren't in scope here, I think that we only need to worry about CentOS. It looks like we have some packages that may depend on Python, so we may need to build new versions of the Mercurial and Virtualenv packages as well -- I'll have to test this on one machine first to be sure.
I was able to get 2.7.14 RPMs built with this.
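For anyone reproducing this, the build boils down to a standard rpmbuild invocation (a sketch only; the actual spec file lives in the now-obsolete attachment, and the spec filename here is hypothetical):

    # Default _topdir on CentOS is ~/rpmbuild; adjust if yours differs.
    mkdir -p ~/rpmbuild/SOURCES ~/rpmbuild/SPECS
    curl -L -o ~/rpmbuild/SOURCES/Python-2.7.14.tar.xz \
        https://www.python.org/ftp/python/2.7.14/Python-2.7.14.tar.xz
    cp python27.spec ~/rpmbuild/SPECS/    # hypothetical spec name
    rpmbuild -ba ~/rpmbuild/SPECS/python27.spec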
Blocks: 1434730
This doesn't quite work yet (it doesn't gracefully shut down anything, which causes various issues with most services), but it does rebuild things correctly whenever it manages to remove the virtualenv!
This patch should cause all of our puppet-managed virtualenvs to rebuild whenever the Python version they depend on changes. For virtualenvs that are part of a running service, the service is shut down first (this isn't needed for services that only run from cron or at startup). I haven't tested this on every possible machine, but I have tested various dev machines of different types. This patch should be a no-op until Python versions actually change, which we'll be doing with production canaries anyway.
Attachment #8967873 - Attachment is obsolete: true
Attachment #8968533 - Flags: review?(rail)
Comment on attachment 8968533 [details] [diff] [review]
rebuild virtualenvs with graceful shutdowns

Review of attachment 8968533 [details] [diff] [review]:
-----------------------------------------------------------------

Overall it looks great! I have a couple of questions before I can r+ this.

::: modules/aws_manager/manifests/install.pp
@@ +16,5 @@
>
>     python::virtualenv {
>         $aws_manager::settings::root:
> +            python          => $packages::mozilla::python27::python,
> +            rebuild_trigger => Class['packages::mozilla::python27'],

Hmm, in some places you use `rebuild_trigger` with the same value as `require`, and in some places it is an `Exec`. Probably the former is the right option. Am I correct?

::: modules/buildbot_bridge/manifests/init.pp
@@ +17,5 @@
>
> +    # If the Python installation changes, we need to rebuild the virtualenv
> +    # from scratch. Before doing that, we need to stop the running instance.
> +    exec {
> +        "stop-for-rebuild-bblistener":

Soon it will be dead!

::: modules/python/manifests/virtualenv.pp
@@ +57,5 @@
> +        exec {
> +            "rebuild $virtualenv":
> +                user      => $ve_user,
> +                logoutput => on_failure,
> +                command   => "/bin/rm -rf $virtualenv",

Wow! I wonder what happens to services that use their home directory as the virtualenv root. For example, releaserunner uses /build/releaserunner for configs, the venv, logs, and the tools checkout. I wonder if it'd be safer to remove only particular directories (bin, lib, etc.)?

::: modules/python27/manifests/virtualenv.pp
@@ +57,5 @@
> +        exec {
> +            "rebuild $virtualenv":
> +                user      => $ve_user,
> +                logoutput => on_failure,
> +                command   => "/bin/rm -rf $virtualenv",

The same here.

::: modules/python35/manifests/virtualenv.pp
@@ +47,5 @@
> +    if ($::operatingsystem != Windows) {
> +        exec {
> +            "rebuild $virtualenv":
> +                logoutput => on_failure,
> +                command   => "/bin/rm -rf $virtualenv",

And here.
(In reply to Rail Aliiev [:rail] ⌚️ET from comment #5)
> Hmm, in some places you use `rebuild_trigger` with the same value as
> `require`, and in some places it is an `Exec`. Probably the former is the
> right option. Am I correct?

For services that need to be stopped, we depend on an Exec (which itself depends on the python package). For services that don't, we depend directly on the python package (which causes the virtualenv to be rebuilt immediately after the python version changes).

> +    exec {
> +        "stop-for-rebuild-bblistener":
>
> Soon it will be dead!

:)

> Wow! I wonder what happens to services that use their home directory as the
> virtualenv root. For example, releaserunner uses /build/releaserunner for
> configs, the venv, logs, and the tools checkout. I wonder if it'd be safer
> to remove only particular directories (bin, lib, etc.)?

Hm, maybe. virtualenv appears to be OK creating a virtualenv in an existing directory, and the virtualenv type uses "bin/pip" for creates, so this might work. Puppet _should_ rebuild anything that gets destroyed, though :).
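To make the distinction concrete, the two patterns look roughly like this (a sketch based on the diff context quoted above; the bblistener stop command and the second virtualenv path are assumptions, not the actual patch):

    # Pattern 1: no running service -- rebuild as soon as the package changes.
    python::virtualenv {
        $aws_manager::settings::root:
            python          => $packages::mozilla::python27::python,
            rebuild_trigger => Class['packages::mozilla::python27'],
    }

    # Pattern 2: a running service -- stop it when the package changes, and
    # hang the rebuild off that Exec instead.
    exec {
        'stop-for-rebuild-bblistener':
            # Illustrative command; the real manifest may stop it differently.
            command     => '/usr/bin/supervisorctl stop bblistener',
            refreshonly => true,
            subscribe   => Class['packages::mozilla::python27'],
    }

    python::virtualenv {
        '/builds/bblistener':
            python          => $packages::mozilla::python27::python,
            rebuild_trigger => Exec['stop-for-rebuild-bblistener'],
    }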
Attachment #8968533 - Attachment is obsolete: true
Attachment #8968533 - Flags: review?(rail)
Attachment #8968899 - Flags: review?(rail)
Comment on attachment 8968899 [details] [diff] [review]
rebuild virtualenvs without wiping out other things in the directory

Review of attachment 8968899 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM!
Attachment #8968899 - Flags: review?(rail) → review+
Depends on: 1455030
I built new RPMs and a DMG last week. I've tried it on a few dev hosts without issue. I think rolling it out to low risk production hosts is the best next step. This should verify the virtualenv rebuild logic in production, too.
Attachment #8968988 - Flags: review?(rail)
Attachment #8968988 - Flags: review?(rail) → review+
Attachment #8967150 - Attachment is obsolete: true
Today's rollout is done, but we had numerous issues to work through:
* Some virtualenvs couldn't be removed because we were removing them as the virtualenv user, and they contained root-owned files. My suspicion is that someone manually did something in those virtualenvs as root. https://hg.mozilla.org/build/puppet/rev/79d182641a3d was landed to fix this, and I removed the broken virtualenvs by hand. (A quick check for this failure mode is sketched below.)
* Some virtualenvs failed to install due to missing Python dependencies. These were probably ancient virtualenvs that were built against a different pypi host. I landed https://hg.mozilla.org/build/puppet/rev/48c0f608153b to fix them.
* aws-manager2 had a stale cronjob. It had been removed in https://github.com/mozilla/build-puppet/commit/36cb47edc73abab7b38c5fe857538af4c586db7a#diff-3a5a43e6c2741272f8266595308fcef9, but the cron.d file was not, so when the virtualenv was rebuilt, the script it ran disappeared. I removed the cron.d entry by hand to fix it.
* Some virtualenvs failed to install some modules due to the existence of a "build" directory. I removed it by hand, and I have an incoming patch to remove it as part of the rebuild process.
Most of these issues _shouldn't_ come up in future rollouts once all the fixes have landed. However, we should audit all of the pinned python module versions we depend on and make sure they're all available on the Puppet python package mirrors.
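For the first failure mode, something like this (illustrative; the path and user are assumptions, not actual hosts) would flag affected virtualenvs before a rollout:

    # List anything under a puppet-managed virtualenv that isn't owned by the
    # expected virtualenv user -- such files break a non-root "rm -rf".
    find /builds/myservice ! -user cltbld -ls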
See the previous comment for the "build" directory issues. This patch also includes a typo fix for the treescriptworker prod hostname, which hadn't been upgraded yet because of the typo.
Attachment #8969307 - Flags: review?(rail)
I also noticed that PyYAML kept getting reinstalled over and over again on the aws-manager machines - Puppet claimed the install exited successfully, but PyYAML 3.11 remained installed (instead of upgrading to 3.12). I wasn't able to reproduce this by hand, even with the exact same command that puppet ran. I suspect something similar will happen on other hosts at some point, but I don't know how to debug it further at the moment :(. Something to watch out for.
Attachment #8969307 - Flags: review?(rail) → review+
(In reply to Ben Hearsum (:bhearsum) from comment #14)
> I also noticed that PyYAML kept getting reinstalled over and over again on
> the aws-manager machines - Puppet claimed the install exited successfully,
> but PyYAML 3.11 remained installed (instead of upgrading to 3.12).

I tracked this down to cloud-tools pinning 3.11. I've opened https://github.com/mozilla-releng/build-cloud-tools/pull/339 to fix that.
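For posterity, that failure mode is easy to reproduce in any virtualenv (paths are hypothetical; the point is just that an exact pin in a dependency wins on every reinstall):

    # PyYAML is upgraded...
    pip install 'PyYAML==3.12'
    # ...but reinstalling the pinned dependency drags it straight back down.
    pip install -e ./build-cloud-tools   # its setup.py pinned PyYAML==3.11
    pip freeze | grep -i pyyaml          # -> PyYAML==3.11 again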
(In reply to Ben Hearsum (:bhearsum) from comment #12)
> * Some virtualenvs failed to install due to missing Python dependencies.
> These were probably ancient virtualenvs that were built against a different
> pypi host. I landed https://hg.mozilla.org/build/puppet/rev/48c0f608153b to
> fix them.

I dug into this a bit more and looked at all of the pinned package versions in puppet -- I did not find any additional missing modules. Any future rebuilds *should* work as far as Python package availability goes.
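One way to run that audit (a sketch; the grep pattern, repo layout, and $PYPI_MIRROR variable are assumptions about the puppet checkout):

    # Collect every "pkg==version" pin in the manifests, then check that the
    # internal mirror can serve each one.
    grep -rhoE '[A-Za-z0-9_.-]+==[0-9][0-9A-Za-z.]*' modules/ | sort -u |
    while read -r pin; do
        pip download --no-deps --index-url "$PYPI_MIRROR" -d /tmp/pin-audit "$pin" \
            >/dev/null 2>&1 || echo "MISSING: $pin"
    done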
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/2cf09f68de4f remove 'build' directory from virtualenvs; fix treescriptworker hostname. r=rail
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [keep-open][leave-open]
Depends on: 1457996
I tried an upgrade to Python 2.7.14 on a brand new signing server today. The good news is that the signing server continued to function correctly after Python was upgraded and the virtualenv rebuilt (but before I restarted the signing server instance). The bad news is that the signing servers don't work after restarting them with Python 2.7.14. It looks like the fix for this will be non-trivial, so I've filed bug 1457996 to track.
Whiteboard: [keep-open][leave-open] → [keep-open][leave-open][releng:q22018]
Summary: Package and deploy python 2.7.14 to release infrastructure → Package and deploy python 2.7.15 to release infrastructure
I think it should be safe to upgrade any hosts that already got 2.7.14 to 2.7.15.
Attachment #8972531 - Flags: review?(rail)
Comment on attachment 8972531 [details] [diff] [review]
update 2.7.14 -> 2.7.15

Review of attachment 8972531 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/packages/manifests/mozilla/python27-dmg.sh
@@ +55,3 @@
>
>  # %prep
> +tar -jxf Python-$pyver.$pyrel.tar.xz

Shouldn't this be a capital J? Or just `tar -xf ...`, if the Mac's tar supports guessing.
Attachment #8972531 - Flags: review?(rail) → review+
(In reply to Rail Aliiev [:rail] ⌚️ET from comment #21)
> ::: modules/packages/manifests/mozilla/python27-dmg.sh
> @@ +55,3 @@
> >  # %prep
> > +tar -jxf Python-$pyver.$pyrel.tar.xz
>
> Shouldn't this be a capital J? Or just `tar -xf ...`, if the Mac's tar
> supports guessing.

-j worked when I built it yesterday, actually. Maybe it was ignored...
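That's almost certainly what happened: macOS ships bsdtar, which ignores compression flags on extract and sniffs the format instead, whereas GNU tar would fail with -j (bzip2) on an xz archive. For example:

    tar -Jxf Python-2.7.15.tar.xz   # correct flag for .tar.xz (xz = -J, bzip2 = -j)
    tar -xf Python-2.7.15.tar.xz    # portable: both GNU tar and bsdtar auto-detect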
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/0968a3163854 upgrade aws-manager, cruncher, buildduty-tools, and treescriptworkers to Python 2.7.15. r=rail
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/19b072d95912 add python 2.7.15 virtual resource. r=bustage
Depends on: 1458557
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #8972612 - Flags: review?(mtabara) → review+
Keywords: leave-open
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/6afd53772fbf upgrade dev balrogworkers to python 2.7.15. r=mtabara
Attachment #8972612 - Flags: checked-in+
Depends on: 1458692
It looks like the dev upgrade went fine, and I reran a couple of tasks on maple to ensure that submission still works.
Attachment #8972898 - Flags: review?(mtabara)
Comment on attachment 8972898 [details] [diff] [review]
upgrade prod balrogworker hosts to 2.7.15

Sweet, let's do this!
Attachment #8972898 - Flags: review?(mtabara) → review+
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/f5e388f6e946 upgrade prod balrogworker hosts to 2.7.15. r=mtabara
Attachment #8972898 - Flags: checked-in+
I did a quick test with dev against my environment -- slaveapi rebuilt and restarted properly.
Attachment #8972967 - Flags: review?(bugspam.Callek)
Comment on attachment 8972967 [details] [diff] [review]
upgrade slaveapi to new python

Review of attachment 8972967 [details] [diff] [review]:
-----------------------------------------------------------------

r+, but let's make sure ciduty/jordan is aware this is happening.
Attachment #8972967 - Flags: review?(bugspam.Callek) → review+
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/06f4fef9a7b8 upgrade slaveapi to python 2.7.15. r=callek
Attachment #8972967 - Flags: checked-in+
This patch should have the net effect of upgrading all of the signing servers and signingworkers to Python 2.7.15. I've removed the big conditional host block for CentOS because, AFAIK, everything except the signing servers and workers is already on 2.7.15. I've probably missed something here, so I'm prepared to look for and handle any bustage that may happen. The Mac side keeps a conditional block to avoid upgrading the Yosemite machines, which we expect to decommission in the summer, so upgrading them isn't worth the trouble. When I land this, I intend to pin a few canaries to my environment first, and then upgrade the rest of the signingserver/worker pools. Most likely I'll wait until Monday to do it.
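The remaining Mac conditional has roughly this shape (a sketch, not the actual manifest; the fact names follow standard Facter conventions, and the pinned Yosemite version is illustrative):

    case $::operatingsystem {
        'CentOS': {
            $python27_version = '2.7.15'
        }
        'Darwin': {
            if $::macosx_productversion_major == '10.10' {
                # Yosemite machines are expected to be decommissioned this
                # summer, so they stay on the old build (version illustrative).
                $python27_version = '2.7.10'
            } else {
                $python27_version = '2.7.15'
            }
        }
    }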
Attachment #8974775 - Flags: review?(aki)
Attachment #8974775 - Flags: review?(aki) → review+
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/build/puppet/rev/385d54481e46 upgrade everything we're going to to python 2.7.15. r=aki
Attachment #8974775 - Flags: checked-in+
No longer depends on: 1458557
This is almost done. The only things left are:
- Some in-tree docker images whose base images haven't yet updated to 2.7.15. This is tracked in bug 1455061.
- The merge day instance, which we need to build from scratch with puppet. This is tracked in bug 1459005.
Depends on: 1455061, 1459005
QA Contact: catlee
(In reply to Ben Hearsum (:bhearsum) from comment #37)
> This is almost done. The only things left are:
> - Some in-tree docker images whose base images haven't yet updated to
> 2.7.15. This is tracked in bug 1455061.
> - The merge day instance, which we need to build from scratch with puppet.
> This is tracked in bug 1459005.

The merge day instance is now done. Most of the Docker stuff is done too - just trying to figure out what to do about the Snap container, then we're all done here (although we're still dealing with the last bits of package upgrades in https://bugzilla.mozilla.org/show_bug.cgi?id=1458329).
I'm calling this bug fixed -- we've upgraded everything that we're going to upgrade to 2.7.15. The Snap image is probably staying as it is until Ubuntu upgrades Python in the LTS. Dependency upgrades are tracked elsewhere. Thank you to everyone who helped out with work/reviews/research!
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED