Closed Bug 1458329 Opened 2 years ago Closed Last year

update and pin python dependencies everywhere

Categories

(Release Engineering :: General, enhancement, P1)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

Details

(Whiteboard: [releng:q22018])

Attachments

(12 files, 7 obsolete files)

5.09 KB, text/plain
Details
48.34 KB, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
34.61 KB, patch
bhearsum
: review+
Details | Diff | Splinter Review
8.38 KB, patch
bhearsum
: review+
Details | Diff | Splinter Review
1.88 KB, patch
bhearsum
: review+
Details | Diff | Splinter Review
7.96 KB, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
6.01 KB, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
8.56 KB, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
814 bytes, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
25.88 KB, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
299 bytes, patch
aki
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
3.91 KB, patch
sfraser
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
Per our new Python standards (https://docs.google.com/document/d/1tS0xfE9sOCCJOBr2J3nY2zo0nn9B0fj0MLzXe704n4g/edit#heading=h.xq8wl9ttmsxn) we must update and pin all of our Python dependencies.

This bug will primarily serve as a tracking bug. We can deal with easy upgrades/pinning directly here, and farm out anything more complex to separate bugs.
Whiteboard: [releng:q22018]
Catlee suggested looking through our dependencies for higher priority things to update. I ran the full set of our Python requirements through https://pyup.io/safety/, here's the report. I'll reply in line with notes on what services/apps are actually affected.
tl;dr -- the highest priority updates to make are:
* Cryptography (balrog_scriptworker, signingserver, pushapk_scriptworker)
* Pycrypto (releaserunner, signingworker, slaveapi, aws_manager, buildduty_tools, cruncher)
* Pylons (selfserve agent)
* PyOpenssl (buildbot masters)
* Requests (slaveapi, buildbot bridge, signing workers)

I should note that this only covers stuff in Puppet, so there may be additional things to update in services or other non-puppet locations.


> │ beaker                     │ 1.5.4     │ <1.6.4                   │ 25636    │

This is only used by selfserve agent, and it looks like we're not affected by the CVE. According to https://www.cvedetails.com/vulnerability-list/vendor_id-10210/product_id-23210/version_id-134370/Python-Beaker-1.6.4.html, it only affects beaker when pycrypto is in use. I don't see pycrypto installed in selfserve agent's virtualenv, so I *think* we're in the clear on this one.

> │ cryptography               │ 0.6       │ <1.5.3                   │ 25680    │

This has one vulnerability that might be relevant to us (https://www.cvedetails.com/cve/CVE-2016-9243/). This module is used by releaserunner, balrog_scriptworker, signingserver, pushapk_scriptworker, buildduty_tools, and cruncher.

I think the scriptworkers and signing servers are the most urgent ones to update.

> │ gevent                     │ 0.13.8    │ <1.2a1                   │ 25837    │

The vulnerability that was flagged here is only around pywsgi logging potentially leaking secrets. We do use pywsgi, but none of the logs are publicly available, so I *think* it's safe to say that this is lower priority. With that said, we're taking a gevent upgrade when upgrading python on the servers anyways. The only other place gevent is used is slaveapi.

> │ jinja2                     │ 2.5.5     │ <2.7.3                   │ 25866    │

The only sec bug here is https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734747, which may cause jinja2 to use an insecure directory (/tmp) for some cache, which can let other local users affect the app. Probably low priority for us because we don't have untrusted users on the systems that matter here.

buildbot masters, releaserunner, beetmover_scriptworker, slaveapi, buildbot, buildduty_tools, and cruncher use this module.

> │ pastescript                │ 1.7.3     │ <1.7.5                   │ 25925    │

This is a bug around not dropping group root privileges when started as root. It's only used by selfserve, which we don't run as root.

> │ pycrypto                   │ 2.6.1     │ <=2.6.1                  │ 35015    │

There are a couple of highly technical vulnerabilities in Pycrypto that I don't fully understand. We just need a minor version upgrade to fix them (2.6/2.6.1 -> 2.6.2), so we should probably just do it soon.

releaserunner, signingworker, slaveapi, aws_manager, buildduty_tools, and cruncher are affected.

> │ pylons                     │ 1.0       │ <1.0.2                   │ 26046    │

There's a potential XSS vulnerability here. It's only used by selfserve agent (which I don't think is publicly accessible). We should probably just upgrade though.

> │ pyopenssl                  │ 0.10      │ <0.13.1                  │ 35460    │

<highly technical vulnerability that I don't understand>. buildbot masters, cruncher, and buildduty_tools are affected by this. Only the buildbot masters are high priority, I think.

> │ python-jose                │ 0.5.2     │ <1.3.2                   │ 35682    │

Another vulnerability that I don't fully understand (https://www.cvedetails.com/cve/CVE-2016-7036/). Only releaserunner and signingworker are on an affected version.

> │ requests                   │ 1.2.3     │ <2.6.0                   │ 26102    │

There's a vulnerability when accessing a url that redirects and sets a cookie. Only slaveapi, buildbot bridge, and signingworker are on an affected version.

> │ werkzeug                   │ 0.9.3     │ <0.11.11                 │ 35661    │

The only vulnerability present happens when the debugger is turned on. We only use this on slaveapi, and we don't enable the debugger.
I've been pondering how to keep our dependencies up to date for all of our puppet managed applications. This proof of concept allows us to do that by moving requirements out to separate files, and reading them into the manifests.

If we were to do this, we'd need to either move puppet manifests to github, or perhaps set pyup to make pull requests against the mirror and manually pull those in ourselves.

It's also notable that as written now, this doesn't support comments or any other fancy requirements.txt syntax. If we go this route, it probably needs to be enhanced to at least support comments (for readability, and so we can pin stuff for pyup).
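If comment support were added, the parsing might look roughly like this. This is a hypothetical Python sketch of the idea, not the actual Puppet code; the function name and exact behavior are illustrative only:

```python
# Hypothetical sketch of requirements parsing with comment support.
# Not the actual Puppet function; names are illustrative only.

def parse_requirements(text):
    """Return a list of 'package==version' pins, ignoring blank lines,
    whole-line comments, and inline comments."""
    pins = []
    for line in text.splitlines():
        # Drop '#' and everything after it, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line:
            pins.append(line)
    return pins

print(parse_requirements("flask==0.12  # web framework\n# a comment\nrequests==2.6.0\n"))
# → ['flask==0.12', 'requests==2.6.0']
```

Supporting comments this way also leaves room for the pyup pinning comments mentioned above, since they ride along in the ignored portion of the line.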
Attachment #8973851 - Flags: feedback?(catlee)
Attachment #8973851 - Flags: feedback?(aki)
Comment on attachment 8973851 [details] [diff] [review]
proof of concept for pyup-able requirements in puppet

I think this works.

Adding requirements.txt support might be nice, since we'd then get other things like hashes. I think our current method in puppet is good, though, and this allows us to use pyup.

About pyup: Callek had to add in extra logic to avoid updating certain packages past upstream pins (e.g., scriptworker pins aiohttp to <3; pyup bumped aiohttp to 3.x anyway, despite scriptworker being one of the upstream deps). So we'll need to somehow deal with that type of scenario.
Attachment #8973851 - Flags: feedback?(aki) → feedback+
(In reply to Aki Sasaki [:aki] from comment #4)
> Comment on attachment 8973851 [details] [diff] [review]
> proof of concept for pyup-able requirements in puppet

We may or may not need to be able to specify that certain requirements files are py3 specific, and others are py2. These will be uploaded to different python root dirs, and there may be cases of a package dropping py2 support at a certain version. (We might be able to solve for the former by the requirements filename, and latter by manually pinning the version.)
Comment on attachment 8973851 [details] [diff] [review]
proof of concept for pyup-able requirements in puppet

Review of attachment 8973851 [details] [diff] [review]:
-----------------------------------------------------------------

PRs to the github mirror are probably fine for now.

One possible enhancement would be to save requirements.txt onto the remote machine, and change the puppet virtualenv management to re-run pip when the file changes. That may be more than you're willing to do for this bug though :)

You could also compare `pip freeze` on the remote to requirements.txt, and then run pip install if there are differences.
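That comparison could be sketched as a pure function over the two texts (a hypothetical illustration of the suggestion, not the actual Puppet virtualenv logic — names are made up):

```python
# Hypothetical sketch: compare `pip freeze` output against a requirements
# file and decide whether a pip install needs to re-run. Illustrative only.

def needs_install(freeze_output, requirements):
    """True if any pinned requirement is missing or at a different version."""
    installed = dict(
        line.split("==", 1) for line in freeze_output.splitlines() if "==" in line
    )
    for line in requirements.splitlines():
        line = line.split("#", 1)[0].strip()  # tolerate comments
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        if installed.get(name) != version:
            return True
    return False

print(needs_install("requests==2.6.0\n", "requests==2.6.0\ngevent==1.3.1\n"))
# → True (gevent is pinned but not installed)
```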
Attachment #8973851 - Flags: feedback?(catlee) → feedback+
(In reply to Aki Sasaki [:aki] from comment #4)
> Comment on attachment 8973851 [details] [diff] [review]
> proof of concept for pyup-able requirements in puppet
> 
> I think this works.
> 
> Adding requirements.txt support might be nice, since we'd then get other
> things like hashes. I think our current method in puppet is good, though,
> and this allows us to use pyup.

As written now, I'm not sure this would allow us to use hash pinning -- our virtualenv module doesn't support it. I do intend to see how difficult it would be to add, though.

(In reply to Chris AtLee [:catlee] from comment #6)
> One possible enhancement would be to save requirements.txt onto the remote
> machine, and change the puppet virtualenv management to re-run pip when the
> file changes. That may be more than you're willing to do for this bug though
> :)

Yeah, I don't think I want to go that far -- at least not until we have a better idea what our long term deployment plan is. Eg: if we're moving to containers in 2019, I don't think it's worth making larger changes to Puppet.
(In reply to Ben Hearsum (:bhearsum) from comment #7)
> (In reply to Chris AtLee [:catlee] from comment #6)
> > One possible enhancement would be to save requirements.txt onto the remote
> > machine, and change the puppet virtualenv management to re-run pip when the
> > file changes. That may be more than you're willing to do for this bug though
> > :)
> 
> Yeah, I don't think I want to go that far -- at least not until we have a
> better idea what our long term deployment plan is. Eg: if we're moving to
> containers in 2019, I don't think it's worth making larger changes to Puppet.

+1, this is probably a good enough solution if we're likely to move to containers.
Duplicate of this bug: 1376862
Attached patch improved pyup-able pinning patch (obsolete) — Splinter Review
I was able to get both inline and whole line comments supported, which was rather tricky because our puppet doesn't support lambdas...

The only consumer of python27::virtualenv is balrog_scriptworker, so I think that's a good first choice to roll this out to.

If that goes well I intend to copy this logic to python::virtualenv and python35::virtualenv, and move all of the package lists to requirements files.
Attachment #8974477 - Flags: review?(aki)
Comment on attachment 8974477 [details] [diff] [review]
improved pyup-able pinning patch

I'm stamping the latter portions; if this works it lgtm.
Attachment #8974477 - Flags: review?(aki) → review+
This is the previous patch + a conversion of all of the python::virtualenv callers to requirements files. I think we may as well do them at the same time - it should be pretty safe (and I'll be watching closely).
Attachment #8974477 - Attachment is obsolete: true
Attachment #8974826 - Flags: review?(aki)
Comment on attachment 8974826 [details] [diff] [review]
pyup pinning for python27 + python

r+ as long as you intended to delete buildbot_master/simple.
Attachment #8974826 - Flags: review?(aki) → review+
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/74648e3a555e
bump signing scriptworker deps to latest. r=versionbump
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Keywords: leave-open
QA Contact: catlee
Resolution: FIXED → ---
(In reply to Aki Sasaki [:aki] from comment #13)
> Comment on attachment 8974826 [details] [diff] [review]
> pyup pinning for python27 + python
> 
> r+ as long as you intended to delete buildbot_master/simple.

Yeah, that was only used by "bors", which I deleted awhile back.
Attachment #8973851 - Attachment is obsolete: true
Attachment #8974826 - Flags: checked-in+
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/3a521200571d
convert python and python27 virtualenvs to requirements files. r=aki
I had to land this bustage fix because I accidentally put variables in one of the requirements files: https://hg.mozilla.org/build/puppet/rev/540df11ca331
Attached patch py3-req.diffSplinter Review
Attachment #8975129 - Flags: review?(bhearsum)
Attachment #8975129 - Flags: review?(bhearsum) → review+
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/252407a6e248
python3 requirements files. r=bhearsum
Attached patch py3-venv.diffSplinter Review
I'm pretty much copying what you did with py27. Tested on signing-linux-dev1.
Attachment #8979353 - Flags: review?(bhearsum)
Comment on attachment 8979353 [details] [diff] [review]
py3-venv.diff

Review of attachment 8979353 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/python3/manifests/virtualenv.pp
@@ -82,5 @@
> -            }
> -            $os = $::operatingsystem ? {
> -                        windows => "${virtualenv}/Scripts/pip.exe",
> -                        default => "${virtualenv}/bin/pip"
> -            }

Looks like this removal is the main difference, which I see is because "virtualenv" is smart enough to pull its deps from a "virtualenv_support" dir. Nice! This should mean we can just unpack any new virtualenv tarballs into their own dir, and they'll "just work".
Attachment #8979353 - Flags: review?(bhearsum) → review+
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/dceb846d9224
update pip, wheel, virtualenv in python3. r=bhearsum
Attachment #8979759 - Flags: review?(bhearsum) → review+
Depends on: 1463774
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/15e53d56fab7
install xz-devel on balrog scriptworkers. r=bhearsum
Attached patch upgrade signing server deps (obsolete) — Splinter Review
I've done testing of this on an mdc2 signing server (which isn't used in production). The only bump I hit was needing to add the newly required ptyprocess module (via the latest pexpect version).

If this looks OK I plan to canary it with one linux and one mac server tomorrow morning, and then roll it out to the rest if things look fine.
Attachment #8980397 - Flags: review?(aki)
Depending on how things go with the signing servers, I may try to land this tomorrow, too. This should include all of the other python 2.7 virtualenvs, except the buildbot master ones (I consider those higher risk, so I intend to do them separately).
Attachment #8980400 - Flags: review?(aki)
Comment on attachment 8980397 [details] [diff] [review]
upgrade signing server deps

Let's grab the wheels whenever possible - we'll need the mac and manylinux wheels for many of these.

I use `pip download --no-deps -r FILE` to help get a lot of this going... after populating FILE with the new deps.
Attachment #8980397 - Flags: review?(aki) → review+
(In reply to Aki Sasaki [:aki] from comment #27)
> Comment on attachment 8980397 [details] [diff] [review]
> upgrade signing server deps
> 
> Let's grab the wheels whenever possible - we'll need the mac and manylinux
> wheels for many of these.
> 
> I use `pip download --no-deps -r FILE` to help get a lot of this going...
> after populating FILE with the new deps.

Will do, thanks for the tip!
Attachment #8980400 - Flags: review?(aki) → review+
I pinned signing4, mac-v2-signing3, and signingworker-1 to my environment today. I restarted the dep instances on the signing servers, and they seem to be working fine. The signingworker required a whole bunch of new packages, but after fixing all of that it looks like it's working well too.

I didn't have time to do canaries for any other machines today, so I'll be continuing this work on Tuesday. I'll need to finish doing canaries, post an updated patch with the signingworker and any other necessary fixes, and then roll out to the full set of machines.
This is an updated patch with a bunch of fixes after canary testing. All of the newly added packages are new dependencies after updating existing packages. Other notes:
* Fabric 2.0 is a very big breaking change, so I locked aws_manager to the latest fabric 1 -- I don't think it's worth upgrading the cloud tools scripts.
* Locked to Twisted 10.1.0 in places that use it with buildbot -- the latest Twisted breaks in all sorts of ways with our buildbot stuff. Again, not worth fixing.
* I explicitly did not do any hosts that have both Python 2 and Python 3 dependencies, because I thought it would be better to canary the Python 2 and 3 stuff at the same time.

At this point, I _think_ everything on all of my canaries is working fine -- there are no more nagios alerts firing, and I've inspected the running stuff on the machines and it seems OK. Unless something comes up, I'd like to do these upgrades tomorrow.
Attachment #8980397 - Attachment is obsolete: true
Attachment #8980400 - Attachment is obsolete: true
Attachment #8981605 - Flags: review?(aki)
Comment on attachment 8981605 [details] [diff] [review]
updated patch with upgraded 2.7 deps

>diff --git a/modules/slaveapi/files/requirements.txt b/modules/slaveapi/files/requirements.txt
>index 21957b05..bc994630 100644
>--- a/modules/slaveapi/files/requirements.txt
>+++ b/modules/slaveapi/files/requirements.txt
>@@ -1,45 +1,45 @@
>-gevent==0.13.8 # test inline comment
>-greenlet==0.4.1
>+gevent==1.3.1 # test inline comment

do we still need this comment?
Attachment #8981605 - Flags: review?(aki) → review+
(In reply to Aki Sasaki [:aki] from comment #31)
> Comment on attachment 8981605 [details] [diff] [review]
> updated patch with upgraded 2.7 deps
> 
> >diff --git a/modules/slaveapi/files/requirements.txt b/modules/slaveapi/files/requirements.txt
> >index 21957b05..bc994630 100644
> >--- a/modules/slaveapi/files/requirements.txt
> >+++ b/modules/slaveapi/files/requirements.txt
> >@@ -1,45 +1,45 @@
> >-gevent==0.13.8 # test inline comment
> >-greenlet==0.4.1
> >+gevent==1.3.1 # test inline comment
> 
> do we still need this comment?

Nope, I'll remove it.
Turns out I missed a couple of things on aws manager - the failures weren't obvious though. Adding ipaddress and enum34 fixed things up.
Attachment #8981605 - Attachment is obsolete: true
Attachment #8981643 - Flags: review?(aki)
Attachment #8981643 - Flags: review?(aki) → review+
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/7397ee9694c8
upgrade dependencies on python2 apps. r=aki
Attachment #8981643 - Flags: checked-in+
Not really sure what happened here, but I managed to not include a whole bunch of 2.7 dep upgrades in my first patch. This patch includes all of them except buildslave/, which I doubt we're going to want to do...
Attachment #8981873 - Flags: review?(aki)
Attachment #8981873 - Flags: review?(aki) → review+
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/a6be2c823f13
finish up 2.7 dep upgrades. r=aki
Attachment #8981873 - Flags: checked-in+
There's two parts to this:
1) Make update.sh support git
2) Switch puppetmaster manifests to github

You'll also notice that I'm switching to the master branch of the Github repo, _not_ production. As far as I know, the production branch adds zero value these days since we always immediately merge default -> production.

I tested this as much as possible on bhearsum-fake-puppet.srv.releng.usw2.mozilla.com -- you should have access to that as well if you want to poke around.

Landing this will be a little bit interesting. All of the manifest changes should be a no-op because /etc/puppet/production already exists - so the initial landing will just change update.sh. From there, I can clobber a single /etc/puppet/production directory, rerun puppet, and it should repopulate from the git repo. Once that's done on all of the masters we'll be pulling from git via vcs2vcs. When we're ready, we can disable that and start landing via PRs to the Github repo, and enable Pyup.
Attachment #8982024 - Flags: review?(aki)
More prep for pyup.io enabling.
Attachment #8982028 - Flags: review?(aki)
Comment on attachment 8982024 [details] [diff] [review]
add support for puppet manifests in a git repo

Do we want some way to automatically switch from hg to git? Or will we just nuke/clone in a one-time manual fix?
Attachment #8982024 - Flags: review?(aki) → review+
Attachment #8982028 - Flags: review?(aki) → review+
(In reply to Aki Sasaki [:aki] from comment #39)
> Comment on attachment 8982024 [details] [diff] [review]
> add support for puppet manifests in a git repo
> 
> Do we want some way to automatically switch from hg to git? Or will we just
> nuke/clone in a one-time manual fix?

Ah, you answered that above :)
(In reply to Aki Sasaki [:aki] from comment #40)
> (In reply to Aki Sasaki [:aki] from comment #39)
> > Comment on attachment 8982024 [details] [diff] [review]
> > add support for puppet manifests in a git repo
> > 
> > Do we want some way to automatically switch from hg to git? Or will we just
> > nuke/clone in a one-time manual fix?
> 
> Ah, you answered that above :)

Yeah, I thought about implementing an automatic switch, but I don't think it's worth the effort, and it feels a bit more risky (we lose a bit of control about when it happens).
Attachment #8982028 - Flags: checked-in+
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/543c6bd043cd
add a .pyup.yml and make it ignore buildslave requirements. r=aki
This is mostly a wrapper around "pip download". It requires the latest pip to work correctly, so I'm creating a python2 and python3 virtualenv to make that available (as described in the comments, we can't download python3 packages with python2 pip). It seems to work quite well on my test puppetmaster -- the only error that comes up is:
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-download-Twxse2/MySQL-python/

...which happens after the package is downloaded, so it doesn't break anything.

As it's written now, it finds all of the requirements files on the puppetmaster and uses those to download. This means that we don't download new packages until _after_ the manifests are in production, and then it takes even more time for non-distinguished puppetmasters to receive them. As a result, we're likely to have install failures that fix themselves whenever we update a dependency. I'm trying to think of a way to make this less likely/impossible.
Attachment #8982308 - Flags: feedback?(aki)
Comment on attachment 8982308 [details] [diff] [review]
add script to automatically download required python dependencies

I have slight preferences for:

- 2.7 and 3.6 over 27 and 36 for legibility
- two spaces in front of the # comment, similar to flake8

I don't think either one blocks, especially if we're moving away from puppet eventually.
Attachment #8982308 - Flags: feedback?(aki) → feedback+
(In reply to Aki Sasaki [:aki] from comment #44)
> Comment on attachment 8982308 [details] [diff] [review]
> add script to automatically download required python dependencies
> 
> I have slight preferences for:
> 
> - 2.7 and 3.6 over 27 and 36 for legibility

I don't think we can do this because of what "pip download --python-version" wants:
  --python-version <python_version>
                              Only download wheels compatible with Python interpreter version <version>. If not specified, then the current system interpreter minor version is used. A major version (e.g. '2') can be specified to match all minor revs of
                              that major version.  A minor version (e.g. '34') can also be specified.

I think it uses whatever you pass when trying to figure out if wheels are compatible with your target version. I _might_ be able to just remove that since we're running pip with the target version, but being explicit seems like it would be preferable.

> - two spaces in front of the # comment, similar to flake8

Easy, let's do it!
Changes since the previous version:
* Install mysql+mysql_devel on distinguished puppet masters, to avoid spurious error messages during "pip download". Seems a bit like overkill, but the alternative is those spurious errors getting sent to Puppet Mail every 5 minutes.
* Run pip download every 5 minutes, to minimize the time between new manifests being pulled and new dependencies being downloaded.
* Some changes to reduce pip download runtime:
** Use separate cache dirs for 27 and 36, because pip refuses to share them.
** Parse and deduplicate requirements rather than using "pip download -r"
* Two spaces before inline comments in requirements files.
* More comments; improved e-mail subject

With all of this, I think this is now in landable shape. There's still a possibility that some installs could fail with missing packages, but it's greatly reduced. Even if non-distinguished puppet masters don't have the files (because they only sync every ~30 minutes), as soon as the distinguished master gets them, they'll be available.
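The parse-and-deduplicate step described above might look roughly like this (an illustrative sketch with made-up names; the real script then hands the resulting pins to `pip download` rather than running `pip download -r` per file):

```python
# Illustrative sketch: gather pins from several requirements files and
# deduplicate them before downloading. Not the actual script.

def dedupe_requirements(files_contents):
    """files_contents: list of requirements-file texts.
    Returns a sorted list of unique 'package==version' pins."""
    pins = set()
    for text in files_contents:
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            if line:
                pins.add(line)
    return sorted(pins)

reqs = ["requests==2.6.0\ngevent==1.3.1\n", "requests==2.6.0  # dupe\n"]
print(dedupe_requirements(reqs))
# → ['gevent==1.3.1', 'requests==2.6.0']
```

Deduplicating first means each pinned version is fetched once, even when many modules share a dependency, which helps with the runtime reduction mentioned above.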
Attachment #8982549 - Flags: review?(aki)
Attachment #8982549 - Flags: review?(aki) → review+
All the work for switching our Puppet masters to Github is written, tested, and reviewed. I'm planning to begin the rollout on Monday; here's the plan:

Early Monday ET:
- Land https://bugzilla.mozilla.org/attachment.cgi?id=8982024&action=edit, which will essentially be a no-op (it will alter update.sh, but won't change where the manifests are pulled)
- Clobber /etc/puppet/production on one of the non-distinguished masters; let puppet rebuild it (to ensure the new manifests work).
- Land https://bugzilla.mozilla.org/attachment.cgi?id=8982549&action=edit, which will enable automatic package downloads from PyPI. It will also serve to ensure that the puppet master that's been switched to Github is properly picking up changes.

Monday afternoon ET:
- Clobber /etc/puppet/production on releng-puppet2.srv.releng.scl3.mozilla.com (the distinguished master); let it rebuild itself. Continue watching to make sure looks OK - this will mainly consist of making sure machines are puppetizing OK, and landing a small dummy change to make sure it is picked up.
- If everything looks okay, clobber /etc/puppet/production on the remaining puppet masters (a couple at a time, to be careful), and make sure they all rebuild properly.
- Once we're sure everything is still OK, disable vcssync for the puppet repo

After the above is completed, https://github.com/mozilla/build-puppet will be considered the Repository of Record for the RelEng Puppet manifests. Changes will need to go in as PRs and get merged.
dhouse tells me that vcssync pulls mozharness automatically before running, so landing + merging this to the production branch should be all that's needed to disable syncing.
Attachment #8982669 - Flags: review?(aki)
Attachment #8982669 - Flags: review?(aki) → review+
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/e2bbbba1e625
add support for puppet manifests in a git repo. r=aki
Attachment #8982024 - Flags: checked-in+
(In reply to Ben Hearsum (:bhearsum) from comment #47)
> All the work for switch our Puppet masters to Github is written, tested, and
> reviewed. I'm planning to begin the rollout on Monday, here's the plan:
> 
> Early Monday ET:
> - Land https://bugzilla.mozilla.org/attachment.cgi?id=8982024&action=edit,
> which will essentially be a no-op (it will alter update.sh, but won't change
> where the manifests are pulled)
> - Clobber /etc/puppet/production on one of the non-distinguished masters;
> let puppet rebuild it (to ensure the new manifests work.

I had to tweak this a bit after discovering that non-distinguished masters puppetize against themselves -- so clobbering /etc/puppet/production breaks them without a way to recover. (Distinguished masters puppetize against a different master, for whatever reason.) I ended up manually cloning and setting up /etc/puppet/production on all of the puppet masters to work around this issue. This takes care of the first two points of the afternoon work, too.

I'm paused here at the moment while I wait for some "Import loop" errors to clear. These are the result of /etc/puppet/production/environment.conf disappearing for a brief time, and should clear up on their own. I'll continue once they have.
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/2f772867ed63
automatically pull python dependencies from pypi. r=aki
Attachment #8982549 - Flags: checked-in+
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/c46741bf3dfd
Bustage fix for bug 1458329 - install the same version of mysql and mysql-devel. r=aki
Attachment #8982669 - Flags: checked-in+
(In reply to Ben Hearsum (:bhearsum) from comment #50)
> (In reply to Ben Hearsum (:bhearsum) from comment #47)
> > All the work for switch our Puppet masters to Github is written, tested, and
> > reviewed. I'm planning to begin the rollout on Monday, here's the plan:
> > 
> > Early Monday ET:
> > - Land https://bugzilla.mozilla.org/attachment.cgi?id=8982024&action=edit,
> > which will essentially be a no-op (it will alter update.sh, but won't change
> > where the manifests are pulled)
> > - Clobber /etc/puppet/production on one of the non-distinguished masters;
> > let puppet rebuild it (to ensure the new manifests work.
> 
> I had to tweak this a bit after discovering that non-distinguished masters
> puppetize against themselves -- so clobbering /etc/puppet/production breaks
> them without a way to recover. (Disgtinuished masters puppetize against a
> different master for whatever reason). I ended up manually cloning and
> setting up /etc/puppet/production all of the puppet masters to workaround
> this issue. This takes care of the first two points on the afternoon work,
> too.
> 
> I'm paused here at the moment while I wait for some "Import loop" errors to
> clear. These are the result of /etc/puppet/production/environment.conf
> disappearing for a brief time, and should clear up on their own. I'll
> continue once they have.

These slowed down greatly over time. I'm pretty sure there's some funky caching that made them last longer than expected.

> - Land https://bugzilla.mozilla.org/attachment.cgi?id=8982549&action=edit,
> which will enable automatic package downloads from Pypi. It will also serve
> to ensure that the puppet master that's been switched to Github is properly
> picking up changes.

I had a bit of trouble with this on the distinguished master - it caused errors like:
Error: /Stage[main]/Packages::Mysql/Package[mysql]/ensure: change from 5.1.73-8.el6_8 to 5.1.73-3.el6_5 failed: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y downgrade mysql-5.1.73-3.el6_5' returned 1: Error: Package: mysql-devel-5.1.73-8.el6_8.x86_64 (@mysql)
           Requires: mysql = 5.1.73-8.el6_8
           Removing: mysql-5.1.73-8.el6_8.x86_64 (@mysql)
               mysql = 5.1.73-8.el6_8
           Downgraded By: mysql-5.1.73-3.el6_5.x86_64 (updates)
               mysql = 5.1.73-3.el6_5
           Available: mysql-5.1.71-1.el6.x86_64 (base)
               mysql = 5.1.71-1.el6
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

This is because we're pinning different distribution versions of mysql and mysql-devel, so installing both ends up with unmet dependencies. This was fixed with https://github.com/mozilla/build-puppet/commit/cda743baa920b80ee8deecffa03efb3613d2f991.

> - Once we're sure everything is still OK, disable vcssync for the puppet repo
> After the above is completed, https://github.com/mozilla/build-puppet will
> be considered the Repository of Record for the RelEng Puppet manifests.
> Changes will need to go in as PRs and get merged.

This is done! I have a few followups to take care of, including:
* Make the hg repo read only (https://bugzilla.mozilla.org/show_bug.cgi?id=1466637)
* Move dxr indexing to use the Github repo
* Update a few more things on the PuppetAgain wiki
* Update phabricator to point at the Github repo
Depends on: 1466637
Comment on attachment 8982669 [details] [diff] [review]
stop syncing puppet repo

merged default into production (mozharness)
Depends on: 1466644
Depends on: 1466646
Attachment #8982308 - Attachment is obsolete: true
Hello Ben,
Today (05 June 2018) this alert [1] started showing up in the #buildduty channel at 11:55 AM (GMT+3):


[1] - [sns alert] Jun 05 01:53:11 releng-puppet1.srv.releng.use1.mozilla.com puppetmaster_git_sync: error: Pull is not possible because you have unmerged files.

I think it might be related to this bug.

Please take a look.
Flags: needinfo?(bhearsum)
(In reply to Radu Iman[:riman] from comment #55)
> Hello Ben,
> Today (05 June 2018) this alert [1] started showing up in #buildduty channel
> at 11:55 AM  (GMT+3) 
> 
> 
> [1] - [sns alert] Jun 05 01:53:11 releng-puppet1.srv.releng.use1.mozilla.com
> puppetmaster_git_sync: error: Pull is not possible because you have unmerged
> files.
> 
> I think that it might be related with this bug. 
> 
> Please take a look.

As far as I can tell, this wasn't related to my work. I found this in the ssl git repo:
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")

Changes to be committed:

	new file:   agent-certs/releng-puppet1.srv.releng.mdc1.mozilla.com/av-linux64-ec2-golden.build.releng.use1.mozilla.com.crt
	new file:   agent-certs/releng-puppet1.srv.releng.mdc1.mozilla.com/tst-linux32-ec2-golden.test.releng.use1.mozilla.com.crt
	modified:   agent-certs/releng-puppet1.srv.releng.mdc2.mozilla.com/tst-linux64-ec2-golden.test.releng.use1.mozilla.com.crt
	deleted:    agent-certs/releng-puppet1.srv.releng.use1.mozilla.com/tst-emulator64-ec2-golden.test.releng.use1.mozilla.com.crt
	new file:   revocation-requests/releng-puppet1.srv.releng.use1.mozilla.com/tst-emulator64-ec2-golden.test.releng.use1.mozilla.com-for-releng-puppet1.srv.releng.use1.mozilla.com.crt
	renamed:    agent-certs/releng-puppet2.srv.releng.mdc1.mozilla.com/tst-linux32-ec2-golden.test.releng.use1.mozilla.com.crt -> revocation-requests/releng-puppet2.srv.releng.mdc1.mozilla.com/tst-linux32-ec2-golden.test.releng.use1.mozilla.com-for-releng-puppet1.srv.releng.mdc1.mozilla.com.crt

Unmerged paths:
  (use "git add/rm <file>..." as appropriate to mark resolution)

	both deleted:    agent-certs/releng-puppet2.srv.releng.mdc2.mozilla.com/av-linux64-ec2-golden.build.releng.use1.mozilla.com.crt
	added by them:   revocation-requests/releng-puppet2.srv.releng.mdc2.mozilla.com/av-linux64-ec2-golden.build.releng.use1.mozilla.com-for-releng-puppet1.srv.releng.mdc1.mozilla.com.crt
	added by us:     revocation-requests/releng-puppet2.srv.releng.mdc2.mozilla.com/av-linux64-ec2-golden.build.releng.use1.mozilla.com-for-releng-puppet1.srv.releng.use1.mozilla.com.crt


It's a weird conflict because there are no differences between what each side did (both removed the same file, and each added one of their own with a unique name).

To "fix" it, I just ran "git add ." and "git commit". Then it happened again, and I did it again, and it stopped.

I have no idea what caused it, but my work didn't directly touch the ssl git repo in any way. There could be some indirect effect here that I don't understand. https://github.com/mozilla/build-puppet/commit/61fc4bf2007eba2941c75aa0d36cbf1d90121e37 also landed recently, which may have caused it? I don't see how, but it's the only other explanation I can think of.
Flags: needinfo?(bhearsum)
We're nearly done here. The only things left are:
* Land my patch for the last round of signing server upgrades (https://github.com/mozilla/build-puppet/pull/43).
* Move the build-puppet repo to the mozilla-releng organization.
* Enable pyup (which will immediately find a few minor dependency upgrades).

I hope to get to all of these on Monday.

It would be nice to get to https://bugzilla.mozilla.org/show_bug.cgi?id=1467255 (requirements file verification in CI) before too long, but I don't consider it a blocker.
Depends on: 1467254
(In reply to Ben Hearsum (:bhearsum) from comment #57)
> We're nearly done here. The only things left are:
> * Land my patch for the last round of signing server upgrades
> (https://github.com/mozilla/build-puppet/pull/43).
> * Move the build-puppet repo to the mozilla-releng organization.

These are both done.

> * Enable pyup (which will immediately find a few minor dependency upgrades).

I'm not sure I can do this. Pyup doesn't show the repositories for me, and if I start to grant read-only access, I get shown a screen that wants me to grant a ton of permissions (https://screenshotscdn.firefoxusercontent.com/images/67353b27-35fd-4df0-a054-1ad0cdc9aa87.png) - including *write* access to numerous organizations.

Chris, are you able to enable Pyup without granting all this extra access? If not, I should probably consult with Hal or ulfr to make sure we do this correctly.
Flags: needinfo?(catlee)
Haven't tested this yet, and I suspect the aiohttp upgrade may break something (unless we're only using it indirectly). I'm not sure if this is testable on try, but a staging release might do the trick. It's risky enough that I don't think we should land it without some sort of verification.
Attachment #8985023 - Flags: feedback?(sfraser)
(In reply to Ben Hearsum (:bhearsum) from comment #59)
> Created attachment 8985023 [details] [diff] [review]
> update funsize python dependencies
> 
> Haven't tested this yet, and I suspect the aiohttp upgrade may break
> something (unless we're only using it indirectly). I'm not sure if this is
> testable on try, but a staging release might do the trick. It's risky enough
> that I don't think we should land it without some sort of verification.

I tried to do something similar to tom's recent release-on-try work: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d1559d46117ad1bf61d5a08a20fbec76e01eac6f&selectedJob=182937716

It didn't work, unfortunately. It did build the image successfully (after a couple of fixes due to https://github.com/peterjc/backports.lzma/pull/33), but it didn't run funsize at all.
(In reply to Ben Hearsum (:bhearsum) from comment #60)
> (In reply to Ben Hearsum (:bhearsum) from comment #59)
> > Created attachment 8985023 [details] [diff] [review]
> > update funsize python dependencies
> > 
> > Haven't tested this yet, and I suspect the aiohttp upgrade may break
> > something (unless we're only using it indirectly). I'm not sure if this is
> > testable on try, but a staging release might do the trick. It's risky enough
> > that I don't think we should land it without some sort of verification.

When I've run tests in the past I've built the docker image locally and run it with the taskID of an existing partials task, which catches most things. I agree that release-on-try would be an improvement, though.

The README.md has an example command.
I found a couple of other things that needed dealing with in addition to enabling pyup:
* Update & enable pyup for mozapkpublisher. I filed https://github.com/mozilla-releng/mozapkpublisher/issues/64 for this.
* Update funsize-update-generator dependencies, which should be done with the patch I posted earlier today. I haven't figured out how to do automatic dependency updates for it yet, as it uses a requirements file embedded in the gecko tree.
Thanks for all of the pointers, that was very helpful in testing! I managed to generate partials for a recent nightly with this version of the patch. Should we test releases explicitly, too, or is this enough to have confidence?
Attachment #8985023 - Attachment is obsolete: true
Attachment #8985023 - Flags: feedback?(sfraser)
Attachment #8985055 - Flags: review?(sfraser)
(In reply to Ben Hearsum (:bhearsum) from comment #62)
> * Update funsize-update-generator dependencies, which should be done with
> the patch I posted earlier today. I haven't figure out how to do automatic
> dependency updates for it yet, as it uses a requirements file embedded into
> the gecko tree.

Tom and Simon had some good ideas about this on IRC. I think we settled on:
* Set up something, run through './mach python-test', that runs the safety check against the funsize-update-generator requirements file.
* It should be tier 3, and send e-mail if anything is vulnerable (maybe the best way to do this is to fail if things are out of date, and use regular Taskcluster notifications).

This won't catch all out-of-date deps, but it will catch any that have vulnerabilities. We might be able to enhance this test to find other out-of-date dependencies. The way I did the initial update for funsize was to tweak the requirements file to remove pinning, then install it into a virtualenv and run "pip freeze". We might be able to use this strategy here as well.
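The unpin-and-freeze step is easy to script. Here's a rough sketch of the unpinning half (the package names below are illustrative examples, not the actual funsize requirements):

```python
import re

def unpin(requirement):
    """Strip an exact '==x.y.z' pin from a requirements line, leaving
    the bare package name so 'pip install' resolves the newest release."""
    return re.sub(r"\s*==.*$", "", requirement.strip())

# Example requirements lines (illustrative only).
pinned = ["aiohttp==2.3.10", "requests==2.18.4", "mar==2.1.2"]
unpinned = [unpin(line) for line in pinned]
print(unpinned)  # ['aiohttp', 'requests', 'mar']
```

After unpinning, the rest of the workflow would be: install the unpinned file into a fresh virtualenv, then run "pip freeze" to capture the newly resolved versions back into the pinned requirements file.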
Attachment #8985055 - Flags: review?(sfraser) → review+
Attachment #8985055 - Flags: checked-in+
Pushed by ncsoregi@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/a963e4afed1e
update funsize update generator deps. r=sfraser
(In reply to Ben Hearsum (:bhearsum) from comment #58)
> > * Enable pyup (which will immediately find a few minor dependency upgrades).
> 
> I'm not sure I can do this. Pyup doesn't show the repositories for me, and
> if I start to grant read-only access, I get shown a screen that wants me to
> grant a ton of permissions
> (https://screenshotscdn.firefoxusercontent.com/images/67353b27-35fd-4df0-
> a054-1ad0cdc9aa87.png) - including *write* access to numerous organizations.

I believe I was able to enable it without actually granting access to other orgs... https://github.com/mozilla-releng/build-puppet/pull/76#partial-pull-merging
Enough modules have released new versions that we should probably go back and update things piecemeal again... I'll start with the scriptworkers.
Flags: needinfo?(catlee)
(In reply to Aki Sasaki [:aki] (pto til july16) from comment #67)
> (In reply to Ben Hearsum (:bhearsum) from comment #58)
> > > * Enable pyup (which will immediately find a few minor dependency upgrades).
> > 
> > I'm not sure I can do this. Pyup doesn't show the repositories for me, and
> > if I start to grant read-only access, I get shown a screen that wants me to
> > grant a ton of permissions
> > (https://screenshotscdn.firefoxusercontent.com/images/67353b27-35fd-4df0-
> > a054-1ad0cdc9aa87.png) - including *write* access to numerous organizations.
> 
> I believe I was able to enable without actually granting to other orgs...
> https://github.com/mozilla-releng/build-puppet/pull/76#partial-pull-merging
> Enough modules have released new versions that we should probably go back
> and update things piecemeal again... I'll start with the scriptworkers.

Thank you! I'll look at updating some of the other apps/services.
(In reply to Ben Hearsum (:bhearsum) from comment #62)
> * Update funsize-update-generator dependencies, which should be done with
> the patch I posted earlier today. I haven't figure out how to do automatic
> dependency updates for it yet, as it uses a requirements file embedded into
> the gecko tree.

Looks like this will be covered by https://bugzilla.mozilla.org/show_bug.cgi?id=1468394.
(In reply to Ben Hearsum (:bhearsum) from comment #62)
> I found a couple of other things that needed dealing with in addition to
> enabling pyup:
> * Update & enable pyup for mozapkpublisher. I filed
> https://github.com/mozilla-releng/mozapkpublisher/issues/64 for this.

This got done at some point.
I went over and reviewed things today. It looks like there are a few things that don't have automatic dependency upgrades yet:
1) releng/services apps, which is tracked in https://github.com/mozilla-releng/services/issues/1084
2) the in-tree images, which i just filed https://bugzilla.mozilla.org/show_bug.cgi?id=1477021

Other than that, I think we're done?
(In reply to Ben Hearsum (:bhearsum) from comment #71)
> I went over and reviewed things today. It looks like there's a few things
> that don't have automatic dependency upgrades yet:
> 1) releng/services apps, which is tracked in
> https://github.com/mozilla-releng/services/issues/1084
> 2) the in-tree images, which i just filed
> https://bugzilla.mozilla.org/show_bug.cgi?id=1477021
> 
> Other than that, I think we're done?

Nobody has objected, so I'm going with yes. We'll track the two stragglers in their own bugs.
Status: REOPENED → RESOLVED
Closed: Last year
Resolution: --- → FIXED