Closed Bug 912970 Opened 9 years ago Closed 9 years ago

Create a resilient Python package repository apart from PuppetAgain

Categories

(Infrastructure & Operations :: IT-Managed Tools, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

Attachments

(2 files, 1 obsolete file)

c.f. bug 900663 for the similar approach to NPM repositories

PuppetAgain has a Python packages repository at /data/python/packages/.  This was originally intended to support installs of Python packages *by* Puppet (Buildbot, Mozpool, etc.).  It grew organically to support installs during builds, via mozharness.  This added to the tree-criticality of the service -- now a broken httpd will cause running builds to burn, rather than just causing puppet runs to fail and retry.  It has also added a lot of churn to the puppetagain repos, and it mixes something we want to control strictly (system installs) with something that needs to be fluid and fast (build components).

So, I think a more flexible solution is to host a python repository on the releng cluster.  Something like --find-links=http://python-packages.pvt.build.mozilla.org and a corresponding http://python-packages.pub.build.mozilla.org with the same content.  The releng cluster is redundant and already known to be tree-closing, so client-side resiliency isn't required.  Relengers already have access to upload files there.  We could also add pypi-style indexes at some point, if that's helpful.

Aki: do we need to support "private" python packages that are only available on the releng network?

Jake: I will get some rough figures on bandwidth for you.  Do you have any additional questions or concerns?
Assignee: relops → dustin
Component: RelOps → WebOps: IT-Managed Tools
QA Contact: arich → nmaul
Pardon my weak Apache analysis skills.  For all of Monday, September 2nd, across the 8 puppet masters, we saw the following total response sizes, in bytes, for URLs matching "python/packages":

releng-puppet1.srv.releng.scl3.mozilla.com 2360148147
releng-puppet2.srv.releng.scl3.mozilla.com 19074532193
releng-puppet1.srv.releng.use1.mozilla.com 5046006772
releng-puppet2.srv.releng.use1.mozilla.com 5231304921
releng-puppet1.srv.releng.usw2.mozilla.com 4679691096
releng-puppet2.srv.releng.usw2.mozilla.com 4637072933
releng-puppet2.build.mtv1.mozilla.com      4487349899
releng-puppet2.build.scl1.mozilla.com      7094840732
                                           ----------
                                           52610946693

or about 600 KBps, just under 5 Mbps.  Of course, that doesn't count TCP overhead, but it's a ballpark.  To me, that doesn't sound problematic.  What do you think, Jake?
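As a rough check on that arithmetic, here is a small Python sketch that sums the per-host byte counts above and converts them to an average rate. It assumes the totals cover one full 24-hour day and, like the estimate, ignores TCP overhead; the variable names are only illustrative.

```python
# Bytes served for URLs matching "python/packages" on September 2nd,
# copied from the per-host table above.
bytes_per_host = {
    "releng-puppet1.srv.releng.scl3.mozilla.com": 2360148147,
    "releng-puppet2.srv.releng.scl3.mozilla.com": 19074532193,
    "releng-puppet1.srv.releng.use1.mozilla.com": 5046006772,
    "releng-puppet2.srv.releng.use1.mozilla.com": 5231304921,
    "releng-puppet1.srv.releng.usw2.mozilla.com": 4679691096,
    "releng-puppet2.srv.releng.usw2.mozilla.com": 4637072933,
    "releng-puppet2.build.mtv1.mozilla.com": 4487349899,
    "releng-puppet2.build.scl1.mozilla.com": 7094840732,
}

# Assumption: the logs span exactly one day.
SECONDS_PER_DAY = 24 * 60 * 60

total_bytes = sum(bytes_per_host.values())
bytes_per_second = total_bytes / SECONDS_PER_DAY
megabits_per_second = bytes_per_second * 8 / 1_000_000

print(f"total: {total_bytes} bytes")
print(f"average: {bytes_per_second / 1000:.0f} KB/s, {megabits_per_second:.2f} Mbit/s")
```

This is a daily average only; peak load during busy build periods would be higher.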
Flags: needinfo?(aki)
(In reply to Dustin J. Mitchell [:dustin] from comment #0)
> c.f. bug 900663 for the similar approach to NPM repositories
> Aki: do we need to support "private" python packages that are only available
> on the releng network?

At this moment, we do not; in fact, we'd like the python packages to be available to the public at some url like http://puppetagain.pub.build.mozilla.org/data/python/packages/ .

I don't foresee needing a private python package, but I can't guarantee that.
Flags: needinfo?(aki)
Great!  I'll leave in the possibility of supporting private-only packages, but for now everything will appear on both hostnames.
Et voila, http://pypi.pub.build.mozilla.org/pub/ and http://pypi.pvt.build.mozilla.org/pub/ were born, with all of the current contents of puppetagain's /python/packages.
Attached patch bug912970.patch
I'm not sure how best to go about landing this.  If I commit to mozpool's default branch, will that immediately go into production?  I'm worried both about correctness and load.

Also, I'll need to make sure everyone is aware of the new location for adding Python packages.  Aside from an email to release@, what wiki pages should I adjust?
Attachment #815062 - Flags: review?(aki)
Attached patch bug912970.patch (obsolete)
Point pip.conf to the mirrors, too, except for Puppet-driven Python installs.
Attachment #815106 - Flags: review?(bugspam.Callek)
(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> Created attachment 815062
> bug912970.patch
> 
> I'm not sure how best to go about landing this.  If I commit to mozpool's
> default branch, will that immediately go into production?  I'm worried both
> about correctness and load.

Landing on `default` won't go live automatically; we would have to merge it to the production branch.
Comment on attachment 815106
bug912970.patch

Review of attachment 815106:
-----------------------------------------------------------------

Withholding r+ until my question on the second hunk is answered, but overall this looks good.

::: modules/python/manifests/virtualenv/package.pp
@@ +24,5 @@
> +    $pip_options = inline_template("--no-deps --no-index <%
> +servers = [ @data_server ] + Array(@data_servers)
> +servers.uniq.each do |mirror_server| -%> --find-links=http://<%= mirror_server %>/python/packages <%
> +end
> +-%>")

This syntax is hard to read as written. I'd love an alternate approach (even using template() rather than inline_template()), but I can't think of a better suggestion at present, so "this will work" if you can't either.
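For readers who find the ERB hard to follow, the loop in the hunk above can be sketched in Python. This is only an illustration of the logic, not part of the patch: `pip_options` is a hypothetical helper, the hostnames are made up, and the whitespace in the result differs slightly from what the template emits.

```python
def pip_options(data_server, data_servers=None):
    """Build pip options with one --find-links per unique mirror.

    Follows the inline_template logic: the primary server comes first,
    then any additional servers, de-duplicated in first-seen order
    (the same order Ruby's Array#uniq keeps).
    """
    # Rough stand-in for Ruby's Array(): None -> [], scalar -> [scalar].
    if data_servers is None:
        extra = []
    elif isinstance(data_servers, str):
        extra = [data_servers]
    else:
        extra = list(data_servers)

    servers = list(dict.fromkeys([data_server] + extra))
    options = ["--no-deps", "--no-index"]
    options += [f"--find-links=http://{s}/python/packages" for s in servers]
    return " ".join(options)


# Hypothetical hostnames, for illustration only.
print(pip_options("repo1.example.com", ["repo2.example.com", "repo1.example.com"]))
```

With `--no-index`, pip consults only the `--find-links` locations, so every mirror in the list has to carry the packages being installed.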

::: modules/python/templates/user-pip-conf.erb
@@ +6,3 @@
>  find-links =
> +    http://pypi.pvt.build.mozilla.org
> +    http://pypi.pub.build.mozilla.org

To be clear, we *want* SeaMonkey and other orgs to also attempt the .pvt. link?  That actually surprises me a bit.

If it's not expected to work for those, I'd rather not have it here. But if it is, then by all means leave it.
Attachment #815106 - Flags: feedback+
I don't want to use template() since that removes what is essentially a for loop from its related context.

As for pypi.pvt, no, that won't work from other orgs, but nor will it hurt, which is why I left it there.  I could conditionalize it on $org == "moco", but that seems like unnecessary work.
(In reply to Dustin J. Mitchell [:dustin] from comment #10)
> I don't want to use template() since that removes what is essentially a for
> loop from its related context.
> 
> As for pypi.pvt, no, that won't work from other orgs, but nor will it hurt,
> which is why I left it there.  I could conditionalize it on $org == "moco",
> but that seems like unnecessary work.

But then how will other orgs provide a "secret python packages dir" when we need one?

So my suggestion/request is to either omit it or create a config var (unset by default) for a "private python package url", which is only included here when it is set.
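One way to read that suggestion, sketched in Python rather than ERB: the private URL is an optional setting, and the find-links stanza lists it only when it is set. `render_find_links` and its parameter names are hypothetical; the URLs are the two hosts discussed in this bug.

```python
def render_find_links(public_url, private_url=None):
    """Return a pip.conf find-links stanza.

    The private URL is listed (first) only when it is set; orgs that
    leave it unset get just the public mirror.
    """
    urls = [u for u in (private_url, public_url) if u]
    lines = ["find-links ="] + [f"    {u}" for u in urls]
    return "\n".join(lines)


# An org with a private mirror configured.
print(render_find_links("http://pypi.pub.build.mozilla.org",
                        "http://pypi.pvt.build.mozilla.org"))

# An org with the variable left unset.
print(render_find_links("http://pypi.pub.build.mozilla.org"))
```

Defaulting to unset keeps the template usable by other orgs without pointing them at a host they cannot reach.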
Comment on attachment 815106 [details] [diff] [review]
bug912970.patch

Fair enough - I'll revise.
Attachment #815106 - Flags: review?(bugspam.Callek) → review-
Attachment #815106 - Attachment is obsolete: true
Attachment #816568 - Flags: review?(bugspam.Callek)
Attachment #816568 - Flags: review?(bugspam.Callek) → review+
This is in mozharness production.
Coop, FYI, I'm planning to land the Puppet part of this tomorrow.  I don't think there's any impact, and Callek doesn't either, but I'll ping you beforehand anyway.
Attachment #816568 - Flags: checked-in+
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED