Closed Bug 1233154 Opened 8 years ago Closed 8 years ago

Stand up stage instance of Elmo (l10n) using python 2.7

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ericz, Assigned: ericz)

References

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2334] )

Attachments

(1 file)

Elmo / l10n dashboard is migrating to the python cluster, starting here with the staging site.
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2334]
Eric, where are we on this?
Flags: needinfo?(eziegenhorn)
Swamped in other things.  I have to fix a stage site that is down first, and I'm on triage for the next 6 days so I'll be busy but will get to this as soon as I can.  I haven't forgotten.
Flags: needinfo?(eziegenhorn)
Eric, can I bump this back up on your list?
Flags: needinfo?(eziegenhorn)
Hi all, is there anything we can do to speed this up? 1.4 was EOL 6 months ago and we need to find a way to get this updated.
Yes, sorry for the slowness, this is back at the top of my list for early next week.
Flags: needinfo?(eziegenhorn)
Because of NFS and other complex dependencies, we're going to stand this up on a relatively new unused VM, l10.stage.webapp.scl3.mozilla.com, rather than the python cluster.
l10n.allizom.org is running on l10n.stage.webapp.scl3.mozilla.com (name was typoed in previous comment).  There is still an issue where logins are not working due to CSRF tokens being invalid for unknown reasons.  The rest of the site seems good so far.
oy, there are local changes. That makes it tricky
Comment on attachment 8731631 [details] [diff] [review]
found local changes

Review of attachment 8731631 [details] [diff] [review]:
-----------------------------------------------------------------

::: .gitmodules
@@ +1,3 @@
>  [submodule "vendor-local/src/compare-locales"]
>         path = vendor-local/src/compare-locales
> +       url = https://github.com/Pike/compare-locales.git

This shouldn't be necessary, and is commonly a sign of missing network routes?

::: requirements/compiled.txt
@@ +6,4 @@
>  
>  # doesn't compile on bm-l10n-dashboard01, disabled
>  # sha256: H6pJd-mFBzHot_ih1YuGJzk6N0RUN4jsQ3-0XgbP5AE
> +MySQL-python==1.2.3c1

This...

@@ +16,4 @@
>  
>  # doesn't compile on bm-l10n-dashboard01, disabled
>  # sha256: QXrj9uL2gEYWEdxgyVrJ_LPFurLgDgb54FcrhZA-zJs
> +python-ldap==2.3.13

... and this change need to go on a branch while we're still using the old VM for prod.

We also need a completely new deployment script. We're currently using --target to get per-install python libraries for dev and prod, and pip broke that, https://github.com/pypa/pip/pull/3450.

Also, that new deployment scheme needs to go on a branch that's not master for now.
I had to take an excess "elmo/" out of the config for static in 
/etc/httpd/mozilla/domains/l10n.allizom.org.conf.

I also had to reinstall Pillow after blowing the pip cache away. Apparently libz wasn't installed when it was first compiled on the VM.
Side note, /var/log/messages has lots of

Mar 17 12:11:28 l10n.stage.webapp.scl3.mozilla.com audisp-json: Couldn't send JSON message (message is lost):  HTTP error code 0.

No idea if that's bad.
I think that the problem with auth is that we're using memcache cache in our configs quite a bit, and there's no memcached running on the VM.

Eric, can you set one up and integrate that into the process?
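
For checking whether anything is listening where the cache config expects memcached, here is a minimal sketch. It assumes memcached's default of 127.0.0.1:11211 (the thread does not say what host or port the Elmo configs point at), is written for Python 3, and the function name is made up for illustration. It only tests that a TCP connection can be opened, not that the service on the other end is really memcached.

```python
import socket


def memcached_reachable(host="127.0.0.1", port=11211, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened.

    Host and port default to memcached's usual values; adjust them to
    whatever the site's cache settings actually use (assumption).
    """
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `memcached_reachable()` on the VM before and after the service is set up should flip from False to True if the defaults apply.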
Depends on: 1257512
(In reply to Axel Hecht [:Pike] from comment #9)
> Comment on attachment 8731631 [details] [diff] [review]
> found local changes
> 
> Review of attachment 8731631 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: .gitmodules
> @@ +1,3 @@
> >  [submodule "vendor-local/src/compare-locales"]
> >         path = vendor-local/src/compare-locales
> > +       url = https://github.com/Pike/compare-locales.git
> 
> This shouldn't be necessary, and is commonly a sign of missing network
> routes?
> 
> ::: requirements/compiled.txt
> @@ +6,4 @@
> >  
> >  # doesn't compile on bm-l10n-dashboard01, disabled
> >  # sha256: H6pJd-mFBzHot_ih1YuGJzk6N0RUN4jsQ3-0XgbP5AE
> > +MySQL-python==1.2.3c1
> 
> This...
> 
> @@ +16,4 @@
> >  
> >  # doesn't compile on bm-l10n-dashboard01, disabled
> >  # sha256: QXrj9uL2gEYWEdxgyVrJ_LPFurLgDgb54FcrhZA-zJs
> > +python-ldap==2.3.13
> 
> ... and this change need to go on a branch while we're still using the old
> VM for prod.
> 
> We also need a completely new deployment script. We're currently using
> --target to get per-install python libraries for dev and prod, and pip broke
> that, https://github.com/pypa/pip/pull/3450.
> 
> Also, that new deployment scheme needs to go on a branch that's not master
> for now.

Yes, the git protocol is not allowed through the firewall, and we typically use https for git repos, treating the webhead as a read-only copy of it.  Feel free to blow away my changes to the requirements files (or put them on a branch); I was just trying to get it up and running.
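
To illustrate the kind of rewrite involved, a small sketch that turns a git:// remote into its https:// equivalent. The helper name is hypothetical, and the git:// form of the compare-locales URL in the usage note is an assumption about what the submodule resolved to before the local change; only the https form appears in the diff.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Rewrite a git:// URL to https://, leaving other URLs untouched."""
    parts = urlsplit(url)
    if parts.scheme != "git":
        return url
    # Keep host, path, query and fragment; swap only the scheme.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))
```

For example, `to_https("git://github.com/Pike/compare-locales.git")` gives the https URL that was added to .gitmodules.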
(In reply to Axel Hecht [:Pike] from comment #10)
> I had to take an excess "elmo/" out of the config for static in 
> /etc/httpd/mozilla/domains/l10n.allizom.org.conf.
> 
> I also had to reinstall Pillow after blowing the pip cache away. Apparently
> libz wasn't installed when it was first compiled on the VM.

The Apache config is controlled by puppet and likely already reverted your change.  Can you tell me what line the extra "elmo/" is on and I'll fix it?
(In reply to Axel Hecht [:Pike] from comment #11)
> Side note, /var/log/messages has lots of
> 
> Mar 17 12:11:28 l10n.stage.webapp.scl3.mozilla.com audisp-json: Couldn't
> send JSON message (message is lost):  HTTP error code 0.
> 
> No idea if that's bad.

That is unfortunately normal for audisp-json and no cause for concern.
(In reply to Axel Hecht [:Pike] from comment #12)
> I think that the problem with auth is that we're using memcache cache in our
> configs quite a bit, and there's no memcached running on the VM.
> 
> Eric, can you set one up and integrate that into the process?

Sure, I'll get memcached running.
Thanks.

    Alias /favicon.ico ~{APP_PATH}/elmo/static/img/favicon.ico
    Alias /static ~{APP_PATH}/elmo/collected/static
    Alias /robots.txt ~{APP_PATH}/elmo/static/robots.txt

should be 

    Alias /favicon.ico ~{APP_PATH}/elmo/static/img/favicon.ico
    Alias /static ~{APP_PATH}/collected/static
    Alias /robots.txt ~{APP_PATH}/elmo/static/robots.txt

It's the middle line.
Fixed the path to static, thanks.
Summary: Stand up stage instance of Elmo (l10n) on python 2.7 cluster → Stand up stage instance of Elmo (l10n) using python 2.7
memcached is running now.
Sweet, site works now, log-in does the right thing, too. Perf is OK, too, even for pages that heavily use the mounted repos there's not much difference. Runs successfully with and without DEBUG, too.

I think we need two more things, and then we're ready for the next phase.

There is one web-related cron job on bm-l10n-dashboard:

5       *       *       *       *       cd /home/dashboard/site/elmo;./manage.py progress

This updates a cached image we show on team pages.

And then there's another rule to add to the apache config, to actually show a summary log file we generate on bm-l10n-dashboard01, 
    Alias /logs.txt /mnt/logs/logs.txt

Could you add that, too? Either there or as /static/logs.txt, which is where I'm currently showing that: https://l10n.mozilla.org/static/logs.txt. Poor man's logging :-/

I've opened a thread on https://discourse.mozilla-community.org/t/dropping-redirect-to-https-l10n-mozilla-community-org-pt-br-seamkoneybr/7689 to discuss if we can drop the old seamonkey pt-BR redirect.
We'll not need the redirect. Flagging Eric for the rest of comment 20.

Note, I updated the git remote upstream, and changed the update_site script.

Creating a deployment now follows the flow in https://github.com/mozilla/elmo/wiki/Running-locally, up to and including the
  ./scripts/update_site.py
command.

Updating a deployment is just

  ./scripts/update_site.py

All requirements are in requirements/env.txt, and are installed into the virtualenv now. The command also fails if you don't have a virtualenv activated.
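
For reference, one common way a script can refuse to run outside a virtualenv is sketched below. This is not taken from update_site.py, whose actual check is not shown in this bug; the function names are hypothetical. It relies on two known behaviours: the `activate` script exports VIRTUAL_ENV, and an interpreter living in a virtualenv has `sys.real_prefix` set (older virtualenv tool) or a `sys.base_prefix` that differs from `sys.prefix` (venv).

```python
import os
import sys


def virtualenv_active(environ=None):
    """True if an `activate` script exported VIRTUAL_ENV in this environment."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("VIRTUAL_ENV"))


def running_inside_virtualenv():
    """True if this interpreter itself lives in a virtualenv or venv."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


def require_virtualenv(environ=None):
    """Exit with a message unless some virtualenv is detected."""
    if not (virtualenv_active(environ) or running_inside_virtualenv()):
        raise SystemExit("No virtualenv detected; activate one first.")
```

The second check matters when a job calls the virtualenv's python binary directly without activating, since VIRTUAL_ENV is not set in that case.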

(The old bm-l10n-dashboard set-up is still supported by passing --vendor.)
Flags: needinfo?(eziegenhorn)
Cron job created.
Flags: needinfo?(eziegenhorn)
Sorry, too literal copy and paste.

# Bug 1233154: Update cached image
5 * * * * cd /data/www/l10n.allizom.org/elmo; /data/www/l10n.allizom.org/venv/bin/python manage.py progress

is what we actually need, so that the virtualenv is picked up.

The 

    Alias /logs.txt /mnt/logs/logs.txt

to add to /etc/httpd/mozilla/domains/l10n.allizom.org.conf is correct like that, though.
Flags: needinfo?(eziegenhorn)
python path added to cron job and logs.txt alias created, will roll out to the site in the next hour.
Flags: needinfo?(eziegenhorn)
/etc/cron.d/elmo still isn't running successfully. I wonder if that's because it's the only one of the files that is mode 755.

I tried to set it to 644, but puppet might have reset the permissions quicker than the cron job touched elmo.
Good catch, set it to mode 644 in puppet.
I see that the permissions got fixed, but the cron job still doesn't run.

If it helps, feel free to let the cron job run every 5 minutes or so. The load it generates isn't all that big, and you can just check the modification time with

  ls -l /data/www/l10n.allizom.org/elmo/collected/static/l10nstats/progress.png

to see if the cronjob actually ran.
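
The same check can be scripted. A minimal sketch, with a hypothetical function name and an age threshold that is an assumption (a bit over an hour for the hourly schedule; shorten it if the job runs every 5 minutes):

```python
import os
import time


def was_modified_recently(path, max_age_seconds=3900, now=None):
    """Return True if path exists and its mtime is within max_age_seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file counts as "the job did not run".
        return False
    return (now - mtime) <= max_age_seconds
```

On stage this would be called with /data/www/l10n.allizom.org/elmo/collected/static/l10nstats/progress.png.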
Flags: needinfo?(eziegenhorn)
Working now, I needed to add a username to the cron file.
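
Background for anyone hitting the same thing: files in /etc/cron.d use the system crontab format, which has a user field between the five time fields and the command, unlike a per-user crontab. A small sketch that assembles such a line and rejects a missing user follows. The helper name is hypothetical, and "someuser" in the usage note is a placeholder because the thread does not say which account the job runs as.

```python
def cron_d_line(schedule, user, command):
    """Build one /etc/cron.d entry: five time fields, a user, then the command."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("schedule must have exactly five time fields")
    if not user or not user.strip():
        raise ValueError("/etc/cron.d entries need a user field")
    return " ".join(fields + [user.strip(), command])
```

For example, `cron_d_line("5 * * * *", "someuser", "cd /data/www/l10n.allizom.org/elmo; /data/www/l10n.allizom.org/venv/bin/python manage.py progress")` yields the stage entry with the user field in place.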
Flags: needinfo?(eziegenhorn)
Anything else to do here Axel?
Nope, stage is all done and good.

Thanks a lot.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Blocks: 1264404
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard