Closed Bug 749471 Opened 12 years ago Closed 10 years ago

dev, staging environments for autoland

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task, P4)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: dustin, Unassigned)

Details

(Whiteboard: [triaged 20120907][waiting][releng])

Autoland is already set up as a webapp, so deployment of new nodes is easy.  It currently only has a production instance in phx1.

This should get dev and staging instances, each a single VM.  If these are in scl3, they'll need their own onboard rabbitmq servers (since there's no generic celery cluster in scl3).  They should also have a *different* SSH key that is not scm level 1.

This can wait until after the sjc1 evac is complete, or at least until the virtualization team is less overloaded.
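
For illustration, a minimal shell sketch of what standing up one such VM might involve, assuming RHEL6-style packaging; the package names and key path below are assumptions, not the actual provisioning steps:

  # Node-local RabbitMQ broker, since scl3 has no shared celery cluster
  sudo yum -y install rabbitmq-server
  sudo chkconfig rabbitmq-server on
  sudo service rabbitmq-server start

  # Dedicated dev/stage deploy key that is NOT the scm level 1 key
  ssh-keygen -t rsa -b 4096 -C "autoland dev/stage deploy key" -f ~/.ssh/autoland_dev_id_rsa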
Autoland isn't dev services.
Assignee: server-ops-devservices → server-ops-releng
Component: Server Operations: Developer Services → Server Operations: RelEng
QA Contact: shyam → arich
Hal, neither Lukas nor Marc is working for releng anymore.  Who, if anyone, has taken over this project?  My understanding is that it's currently non-functional, even in prod.  Should we work on removing the service entirely, like you've done with briar patch, WONTFIX this until there are releng resources available to maintain it, or...?
Autoland is, at the moment, community supported, coordinated by Lukas. (None of the production machinery is on the build-vpn.) Adding Lukas to comment on dev/staging machine needs & timetables.
This was not set up as a community service.  Community members have no access to these machines, since they sit inside the Mozilla internal infrastructure.
Bad choice of terms - I didn't mean the traditional "community" - I meant "non-releng folk who already have access to the internal infrastructure".
Stealing this into WebOps, because...

This is hosted on 2 servers, autoland1.shared.phx1 and autoland2.shared.phx1.

autoland1 is a webapp module admin node. It's got one env configured, "autoland". Members are itself and autoland2. It has a slightly weird "update" script, but a normal "deploy" script, and uses the normal src/www git deployment mechanism. There appears to be no auto-deploy mechanism, so I presume folks are logging in to this node to do an update/deploy.
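
For the record, a sketch of the presumed manual flow from the admin node; the /data paths below are assumptions based on the generic webapp layout, not verified on this host:

  # On autoland1 (the admin node): refresh the checkout, then push it to the env members
  ssh autoland1.shared.phx1
  /data/autoland/update    # pulls the latest code into src/www via git (script path assumed)
  /data/autoland/deploy    # syncs the tree out to the env members, autoland1 and autoland2 (script path assumed)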

autoland2 is a more normal web node. It lacks an /etc/motd, however, which makes me wonder how much of it is really puppetized like a normal webapp node. It might need some massaging.

I found some documentation for the app here:
https://wiki.mozilla.org/ReleaseEngineering:Autoland

I don't know what the domain/URL for this app is; it may not have anything obvious (user-accessible), because it's intended to pull from bugzilla and push to hg, and it doesn't necessarily have a user interface of its own (as far as I can tell from those docs).


Adding a single dev and a single stage node should be *relatively* straightforward, provided the basic webapp stuff is actually puppetized. They could potentially be identical, with -dev having a cron to update itself; stage would work just like prod, providing a way to test a deployment before it actually goes to prod. A rough sketch of such a cron follows.
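
What that -dev cron might look like, reusing the same update/deploy pair described above; the cron file name, paths, and interval are assumptions:

  # /etc/cron.d/autoland-dev-autodeploy (hypothetical)
  # Pull and redeploy every 15 minutes on the dev environment only
  */15 * * * * root /data/autoland-dev/update && /data/autoland-dev/deploy >> /var/log/autoland-dev-deploy.log 2>&1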


One thing I don't know about is dependencies. Can anyone provide some documentation on what this app requires? Is everything local to a web node in puppet, and are the external dependencies (bugzilla, mysql, whatever) documented somewhere?
Assignee: server-ops-releng → server-ops-webops
Component: Server Operations: RelEng → Server Operations: Web Operations
QA Contact: arich → cshields
I set this up - I had a meeting with dev services a while back about it, but they didn't want to take it on.

Both nodes are puppetized, but I don't think I added an /etc/motd to the files section.

Yes, pushes have been manual so far.

There's no website - the webheads aren't even running Apache.  Which makes it a bit odd for webops :)

The better docs are on mana - kinda surprised you didn't look there!
  https://mana.mozilla.org/wiki/display/websites/Autoland

Yeah, setting up new environments should be easy, although I'm not sure it's worth it.
We do plan to have a web component for this eventually, though: both a dashboard viewable by users and an API that we can script against.
Whiteboard: [pending triage]
@dustin: why do you say you're not sure it's worth having dev/stage environments? I don't have any prior knowledge of this service, so I'm not in a position to say how much it would or wouldn't benefit.

I'm perfectly willing to bow to RelOps' input on this... if you think it's not worthwhile, we can WONTFIX this. :)
I'd actually defer to Lukas on the need for dev/stage - realistically, how much work is this going to see over the next, say, 6mo, and how likely is it that such work will need to be segregated into environments, vs. just going into the only-sorta-working production environment?

The work involved is minimal, but I hate to think we'd spin up two new hosts that won't be used.  That's why I had stalled on this initially.
Priority: -- → P4
Whiteboard: [pending triage] → [triaged 20120907][waiting][releng]
Lukas, we need feedback from you on this before it will get moved into a work queue.

Thanks
autoland1 went down after failing to get a DHCP lease. After bringing it back up manually via the console, it is now barfing on puppet/package configs. It's also way out of date on package updates.

Trying to get DHCP helper/relay addresses fixed to bring these boxes back up properly.

info: Applying configuration version '48962'
err: /Stage[main]/Webapp::Python/Package[python-libs]/ensure: change from 2.6.6-29.el6_3.3 to 2.6.6-29.el6_2.2 failed: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y downgrade python-libs-2.6.6-29.el6_2.2' returned 1: Error: Package: python-libs-2.6.6-29.el6_2.2.x86_64 (rhel-x86_64-server-6)
           Requires: python = 2.6.6-29.el6_2.2
           Installed: python-2.6.6-29.el6_3.3.x86_64 (@rhel-x86_64-server-6)
               python = 2.6.6-29.el6_3.3
           Available: python-2.6.5-3.el6.i686 (rhel-x86_64-server-6)
               python = 2.6.5-3.el6
           Available: python-2.6.5-3.el6_0.2.i686 (rhel-x86_64-server-6)
               python = 2.6.5-3.el6_0.2
           Available: python-2.6.6-20.el6.x86_64 (rhel-x86_64-server-6)
               python = 2.6.6-20.el6
           Available: python-2.6.6-29.el6.x86_64 (mozilla)
               python = 2.6.6-29.el6
           Available: python-2.6.6-29.el6_2.2.x86_64 (rhel-x86_64-server-6)
               python = 2.6.6-29.el6_2.2
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
 at /etc/puppet/modules/webapp/manifests/python.pp:21
err: /Stage[main]/Webapp::Python/Package[python]/ensure: change from 2.6.6-29.el6_3.3 to 2.6.6-29.el6_2.2 failed: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y downgrade python-2.6.6-29.el6_2.2' returned 1: Error: Package: python-libs-2.6.6-29.el6_3.3.x86_64 (@rhel-x86_64-server-6)
           Requires: python = 2.6.6-29.el6_3.3
           Removing: python-2.6.6-29.el6_3.3.x86_64 (@rhel-x86_64-server-6)
               python = 2.6.6-29.el6_3.3
           Downgraded By: python-2.6.6-29.el6_2.2.x86_64 (rhel-x86_64-server-6)
               python = 2.6.6-29.el6_2.2
           Available: python-2.6.5-3.el6.i686 (rhel-x86_64-server-6)
               python = 2.6.5-3.el6
           Available: python-2.6.5-3.el6_0.2.i686 (rhel-x86_64-server-6)
               python = 2.6.5-3.el6_0.2
           Available: python-2.6.6-20.el6.x86_64 (rhel-x86_64-server-6)
               python = 2.6.6-20.el6
           Available: python-2.6.6-29.el6.x86_64 (mozilla)
               python = 2.6.6-29.el6
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
 at /etc/puppet/modules/webapp/manifests/python.pp:21
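
For reference, the failure above is yum refusing to downgrade python-libs while the matching python package stays at the newer release (and vice versa); the two have to move in lockstep. Two usual ways out, sketched here; the manifest path comes from the error output, and the exact version to pin is an assumption:

  # Option 1: downgrade both packages in one transaction so their versions match
  sudo yum -y downgrade python-2.6.6-29.el6_2.2 python-libs-2.6.6-29.el6_2.2

  # Option 2: bump the version pinned in /etc/puppet/modules/webapp/manifests/python.pp (line 21)
  # to the installed 2.6.6-29.el6_3.3, then re-run the agent
  sudo puppet agent --test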
Wrong bug for that, but I'll take a look.  The host was only built, what, 5 months ago? How can it be that out of date?
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Autoland is no more. I will be filing a bug to decommission the prod server. Therefore there is no need to complete this work.

Thanks everybody.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard