[bedrock] demo php repo (mozilla.com) is not updating

RESOLVED FIXED

Status

P1
major
RESOLVED FIXED
4 years ago
2 years ago

People

(Reporter: pmac, Assigned: nmaul)

Tracking

Details

(Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/1362] )

It should be auto-updating every 15 minutes or so (I believe). This also impacts translations in the "locales" directory on the webhead (/data/www/www-dev.allizom.org/locales), which is what all of the bedrock instances (demos and dev) use for localizations. It seems to be only a few hours behind at this point.

Updated

4 years ago
Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/1362]
Just in case: if the OS for the servers was upgraded from 12.04 to 14.04 (we had such an upgrade on our l10n community server last week), you probably have to run an 'svn upgrade' command in the repo root to get your 'svn update' commands to keep working, because the svn client in 14.04 was updated to a more recent version (1.8) and the data storage format changed between 1.6 and 1.8.
These servers are Red Hat Enterprise Linux (RHEL), I believe. Those sound like Ubuntu version numbers. That said, an upgrade of SVN on the admin node could have caused this. I don't know whether that's happened here or not. Thanks Pascal.
(Assignee)

Updated

4 years ago
Assignee: server-ops-webops → nmaul
Has anyone managed to look into this?

It's definitely updating, but not at the right pace.

Example: I'm not seeing a page from 2 hours ago on demo5 (r132466). This page should appear localized
https://www-demo3.allizom.org/de/contribute/stories/ruben/

But I'm seeing strings that I committed 17 hours ago (r132432).
I'm wondering if this is a side effect of some work I think jakem was doing to try to cut down on unnecessary deployments. I'm guessing that the PHP side is only being deployed when there's something to deploy on the Python side. I have no real evidence yet, but this would explain the seemingly random timing and the fact that it's still working some of the time.
I'm bumping the importance since we've got a lot of l10n in the pipeline right now with several major efforts launching in 2 weeks.
Severity: normal → major
(Assignee)

Comment 6

4 years ago
Not sure what the problem here is. This is still independent of the python code... it deploys the php side whenever the SVN version gets bumped, via cron. This happens every 10 minutes.

Separately, it also updates the locales that are housed inside of the python code (www.mozilla.org-django/bedrock/locale). This happens every 15 minutes. It does *not* restart Apache... this is fine for the actual PHP code, but might mean that the locales inside of bedrock don't get picked up... honestly not sure how that works.


In any case I've updated both by hand... there were a couple SVN revisions, but I see there's been changes this morning so I'm not sure if I actually did anything useful or just pulled stuff in a few minutes faster than normal.



Is there some way we can see the problem? A page we can visit, for example, and the matching commit in the locales SVN repo?
This page https://www-demo3.allizom.org/sr/contribute/stories/ruben/

Should be localized because of r132501, which happened about an hour ago, but it's still mostly in English.
(Assignee)

Comment 8

4 years ago
I don't see this commit in the SVN log:


[root@bedrockadm.private.phx1 locales]# svn log | head
------------------------------------------------------------------------
r132506 | flodolo@mozilla.com | 2014-10-03 11:52:02 -0700 (Fri, 03 Oct 2014) | 1 line

l10n: translation update
------------------------------------------------------------------------
r132497 | flodolo@mozilla.com | 2014-10-03 10:25:03 -0700 (Fri, 03 Oct 2014) | 1 line

l10n: translation update
------------------------------------------------------------------------
r132491 | flodolo@mozilla.com | 2014-10-03 08:24:40 -0700 (Fri, 03 Oct 2014) | 1 line



Production reads from: URL: http://svn.mozilla.org/projects/mozilla.com/tags/production/locales. Perhaps it's on the wrong branch?
Demo servers and www-dev must read /trunk

Besides that, the file is also in /production just with a different commit (should be r132506)
Demos and dev do indeed read trunk.
(Assignee)

Comment 11

4 years ago
Doh, I completely misread this and was looking at production, not dev/demo. Probably keyed off of the summary.

In any case, yes, the PHP side of the dev/demo server was indeed out of date. Not sure why exactly, except that "svnversion" was returning odd results. I suspect that was confusing the update script.

I updated it manually and deployed it. That page is much more translated now. :)
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Thanks, I'll keep an eye on it and reopen if needed (pages were updating, just extremely slowly).
Could it be because the "locales" directory is inside of the "mozilla.com" repo? So perhaps the two jobs are competing? My theory:

1. "mozilla.com/trunk" is updated
2. "mozilla.com/trunk/locales" doesn't appear to have any changes, because step 1 has already updated it

In this case it'd be a race, so only if locales changes were committed at the right time (between steps 1 and 2) would they be recognized and deployed.
Unfortunately the problem is not solved, and we're approaching a bunch of deadlines (Oct 14).

Translation for the new home page landed in r132696 (3 hours ago), no trace of it on demo5
https://www-demo5.allizom.org/tr/

4 strings landed in r132695 (caption under mozillians photo), but they're not visible here
https://www-demo3.allizom.org/tr/contribute/

r132692 (7 hours ago), same thing
https://www-demo3.allizom.org/es-CL/contribute/

I see r132685, which is now 16 hours old.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Francesco Lodolo [:flod] from comment #14)
> Translation for the new home page landed in r132696 (3 hours ago), no trace
> of it on demo5
> https://www-demo5.allizom.org/tr/

All changes are now visible. So it's updating, but definitely not within the usual 15 minutes.
Trying with a NI since the problem persists.
Flags: needinfo?(nmaul)
Note: if you want an updated reference.

r132954 (about 60 minutes ago)
https://www-demo5.allizom.org/it/ (green box about Firefox should be localized, not reading "Try the best Firefox yet").

Updated

4 years ago
Blocks: 1083590, 1071965
Marking this a P1 since it's blocking the l10n work for the Firefox 10th Anniversary and putting the project at risk. We also have the Android product page launching next week and l10n is blocked.

We need to get this resolved ASAP.

Jake, the webprod and l10n folks are available to test anything. Let us know.
Priority: -- → P1

Comment 20

4 years ago
I don't have the permissions necessary to fix this, but I have read-only access to the bedrock admin node, and here's what I'm seeing so far:

in /etc/cron.d/bedrock-dev, there is a typo in the MAILTO that sends the cron mail to "cron-bedorck@mozilla.com" so I can't see what errors may be happening, unlike the cron jobs in prod (/etc/cron.d/bedrock-stage has the same typo).

the cron job itself looks correct to me:

*/15 * * * * root /data/bedrock-dev/src/update-www-dev.allizom.org-svn-locale.sh


For the purposes of discussion, I'm pasting the contents of the script below:

$ cat /data/bedrock-dev/src/update-www-dev.allizom.org-svn-locale.sh
#!/bin/sh

UPSTREAM=$(svn info http://svn.mozilla.org/projects/mozilla.com/trunk/locales | grep Revision | cut -f2 -d' ')
cd /data/bedrock-dev/src/www-dev.allizom.org-django/bedrock

pushd locale > /dev/null
RUNNING=$(svnversion ./ | tr -d '[A-Z]')
UPDATED=0
if [ $RUNNING -ne $UPSTREAM ]; then
    svn -q up
    UPDATED=1
fi
popd > /dev/null

if [ $UPDATED = 1 ]; then
    /data/bedrock-dev/deploy -q www-dev.allizom.org-django
else
#    echo "Nothing to deploy."
    exit 0
fi


--- end script ---

My best guess at this point, from doing a little testing on my laptop, is that the error is in the comparison between $RUNNING and $UPSTREAM: $RUNNING is not an integer, so the "if [ $RUNNING -ne $UPSTREAM ]" fails:

~/bedrock/locale$ RUNNING=$(svnversion ./ | tr -d '[A-Z]')
~/bedrock/locale$ echo $RUNNING
118893:133406
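
A minimal sketch of why the failed comparison would go unnoticed, assuming the mixed-revision value shown above: `[` returns a non-zero status when an operand is not an integer, `if` treats that as false, and UPDATED stays 0, so the script exits cleanly without deploying. The helper names are hypothetical and are not part of the deployed script.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the comparison; not part of the deployed script.

# Integer comparison, as in the original script. With a mixed revision such as
# "118893:133406", `[` reports an error and returns non-zero, so this reports
# "no" even though the working copy is out of date.
needs_update_integer() {
    if [ "$1" -ne "$2" ] 2>/dev/null; then
        echo yes
    else
        echo no
    fi
}

# String comparison: any difference between the two values triggers an update.
needs_update_string() {
    if [ "$1" != "$2" ]; then
        echo yes
    else
        echo no
    fi
}

needs_update_integer "118893:133406" 133412   # prints "no" (the comparison errored)
needs_update_string  "118893:133406" 133412   # prints "yes"
```

With a string comparison, a mixed-revision value always compares as different from the upstream revision, so the update would run on every pass until the working copy is back at a single revision.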
(Assignee)

Comment 21

4 years ago
Hmm... maybe. It seems to be working at this exact moment:

[root@bedrockadm.private.phx1 locale]# UPSTREAM=$(svn info http://svn.mozilla.org/projects/mozilla.com/trunk/locales | grep Revision | cut -f2 -d' ')
[root@bedrockadm.private.phx1 locale]# RUNNING=$(svnversion ./ | tr -d '[A-Z]')
[root@bedrockadm.private.phx1 locale]# if [ $RUNNING -ne $UPSTREAM ]; then echo hi; fi
[root@bedrockadm.private.phx1 locale]# echo $RUNNING
133412
[root@bedrockadm.private.phx1 locale]# echo $UPSTREAM
133412
[root@bedrockadm.private.phx1 locale]# svnversion ./ | tr -d '[A-Z]'
133412

But perhaps it got updated somehow... maybe someone pushed the button.
Flags: needinfo?(nmaul)
(Assignee)

Comment 22

4 years ago
I just committed the fix to the MAILTO in puppet... should be live in an hour or less.


Testing myself, that does indeed appear to be a possibility:

[root@bedrockadm.private.phx1 locale]# RUNNING="118893:133406"
[root@bedrockadm.private.phx1 locale]# echo $RUNNING
118893:133406
[root@bedrockadm.private.phx1 locale]# UPSTREAM=133406
[root@bedrockadm.private.phx1 locale]# echo $UPSTREAM
133406
[root@bedrockadm.private.phx1 locale]# if [ $RUNNING -ne $UPSTREAM ]; then echo hi; fi
-bash: [: 118893:133406: integer expression expected

"svnversion --help" implies the <int>:<int> response indicates we have a "mixed revision working copy". I'm not sure how we get into that state, but it should be easy enough to get past. I'll change it to a string comparison.
(Assignee)

Comment 23

4 years ago
I've made the change. The relevant lines now look like this:

if [ "$RUNNING" != "$UPSTREAM" ]; then
    svn -q up
    UPDATED=1
fi


Marking as resolved, please let me know if it seems to be not fixed.
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Nope, still not working :-\ 

At this point I'd suggest leaving this open until we're sure it's fixed, instead of going back and forth.

r133418 is over an hour old, cronjob is set up to run every 15 minutes, but still no trace of it on demo server: https://www-demo1.allizom.org/sl/firefox/android/
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It should be possible to simplify the 1st part without pushd/popd 

RUNNING=$(svnversion locale | tr -d '[A-Z]')
UPDATED=0
if [ $RUNNING -ne $UPSTREAM ]; then
    svn -q up locale
    UPDATED=1
fi

Besides that, I don't see anything weird in the script. Could it be that the issue is in /data/bedrock-dev/deploy instead? It could be useful to know the current revision on SVN while pages are not updating (or log this script somewhere to see if SVN is updating).
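
One possible way to log each run, as suggested above, is a single timestamped line recording both revisions and the action taken. This is a sketch only: the log path and the log_revisions name are hypothetical, and the values in the example call are taken from this bug for illustration.

```shell
#!/bin/sh
# Hypothetical log location; override LOG_FILE to write elsewhere.
LOG_FILE="${LOG_FILE:-/tmp/update-locale.log}"

# Append one timestamped line recording both revisions and the action taken.
log_revisions() {
    running=$1
    upstream=$2
    action=$3
    printf '%s running=%s upstream=%s action=%s\n' \
        "$(date '+%Y-%m-%dT%H:%M:%S')" "$running" "$upstream" "$action" >> "$LOG_FILE"
}

# Example call; in the update script these would be "$RUNNING" and "$UPSTREAM".
log_revisions "118893:133406" "133418" "update"
```

Comparing the logged revisions against the time a commit landed would show whether SVN itself is stale or whether the deploy step is the slow part.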
(In reply to Francesco Lodolo [:flod] from comment #25)
> if [ $RUNNING -ne $UPSTREAM ]; then

This should obviously be the updated version
if [ "$RUNNING" != "$UPSTREAM" ]; then
There are several things in this bash script that I think may be wrong:

1/ The remote grep assumes that the server's locale is English (likely, but we don't know) and that the line with the revision number is 'Revision: 12345'. On my machine this line is 'Révision : 133423' (note the accented letter and the punctuation difference). AFAIK, there is no guarantee that all svn versions give the same output for svn info, which is why there is an XML output option that is stable over time. So I believe the UPSTREAM line should be:
UPSTREAM=$(svn info --xml https://svn.mozilla.org/projects/mozilla.com/trunk/locales | grep 'revision' | head -1 | grep -Eo '[0-9]{1,}')

I also changed http to https out of habit. That new line should also only match the first number in the line, so no mixed revision output.

2/ The check on updated should probably not use the equal sign to test a number:

if [ $UPDATED -eq 1 ]; then

3/ This is called as a shell script but it uses built-in bash commands (pushd and popd):
The shebang is:
#!/bin/sh

It probably should be:
#!/usr/bin/env bash

With #!/bin/sh I get a 'pushd: not found' error when running the script locally.

4/ If we have a mixed revision number (something that happens if an 'svn up' command is interrupted), we should grep only the first revision number and not the range because this is the significant part
RUNNING=$(svnversion ./ | grep -Eo '[0-9]{1,}' | head -1)

5/ If we have a mixed revision copy, something probably went wrong in a previous update and an 'svn cleanup' command should be used, otherwise 'svn up' may not work because our local repository is locked. We should actually always run 'svn cleanup' before svn up

Here is a pastebin of my proposal:
http://pastebin.mozilla.org/6837840
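
The pastebin contents are not reproduced here. As a rough sketch of how points 1 to 5 above could fit together: the function names are hypothetical, update_locale assumes the current directory is the one containing the "locale" checkout, and it is defined but not called because it needs svn and the checkout.

```shell
#!/usr/bin/env bash
# Sketch only: combines points 1-5 above. Function names are hypothetical.

SVN_URL="https://svn.mozilla.org/projects/mozilla.com/trunk/locales"

# Point 1: read the revision from `svn info --xml` output on stdin.
first_xml_revision() {
    grep 'revision' | head -1 | grep -Eo '[0-9]{1,}'
}

# Point 4: keep only the first number of svnversion output read from stdin.
first_revision() {
    grep -Eo '[0-9]{1,}' | head -1
}

# Points 2 and 5: run cleanup before updating, then test the flag numerically.
# Assumes the current directory contains the "locale" checkout.
update_locale() {
    upstream=$(svn info --xml "$SVN_URL" | first_xml_revision)
    running=$(svnversion locale | first_revision)
    updated=0
    if [ "$running" != "$upstream" ]; then
        svn cleanup locale
        svn -q up locale
        updated=1
    fi
    if [ "$updated" -eq 1 ]; then
        /data/bedrock-dev/deploy -q www-dev.allizom.org-django
    fi
}

# Parsing checks with sample input (no svn required):
printf '<entry\n   kind="dir"\n   revision="133423">\n' | first_xml_revision   # prints 133423
echo "118893:133406" | first_revision                                          # prints 118893
```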
(In reply to Josh Mize [:jgmize] from comment #20)
> My best guess at this point from doing a little testing on my laptop is that
> the error is in the comparison between $RUNNING and $UPSTREAM, because
> $RUNNING is not an integer so the "if [ $RUNNING -ne $UPSTREAM ]" fails
> 
> ~/bedrock/locale$ RUNNING=$(svnversion ./ | tr -d '[A-Z]')
> ~/bedrock/locale$ echo $RUNNING
> 118893:133406

I'll bet that's it. When making an auto-update script for webwewant-dev I had to use this to get the currently checked out svn version:

RUNNING=$(svnversion -cn ./ | cut -d ':' -f 2)
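
A small sketch of the text processing in that line, using sample strings instead of a real svnversion call. The highest_revision name is hypothetical, and stripping the trailing letters is an addition borrowed from the tr step in the original script.

```shell
#!/bin/sh
# Hypothetical helper: reduce svnversion-style output (read from stdin) to the
# number after the colon, dropping trailing modifier letters such as M or S.
highest_revision() {
    cut -d ':' -f 2 | tr -d 'A-Z'
}

echo "118893:133406" | highest_revision   # prints 133406
echo "133412"        | highest_revision   # prints 133412 (no ':' present, so the line passes through)
echo "133406M"       | highest_revision   # prints 133406
```

Because cut passes through lines that contain no delimiter, the same pipeline handles both single-revision and mixed-revision output.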
(Assignee)

Comment 29

4 years ago
Ugh. We missed something hiding in plain sight. The demoX sites are peculiar in that they're not complete... they share the PHP codebase, and only have separate Django codebases. They also share a locale checkout... from the PHP version. The file we're working on only affects www-dev.allizom.org-django. It does not have any bearing on the demoX sites.

The demoX sites get their locales updated by the cron that updates the PHP codebase:

0-59/10 * * * * root /data/bedrock-dev/src/update-www-dev.allizom.org.sh

*This* script had the same sort of [ $RUNNING -ne $UPSTREAM ] logic error that the other script did... that's probably the real problem.

I've fixed this script now too, similar to comment 23. I've also pulled several of the fixes in comment 27, though in this case none of them have an effect on this specific problem.


I actually verified this time that a missing commit appeared on www-demo1.allizom.org after making this change and running the script, so I'm confident that it's fixed now. Anyone else care to verify?
We can check with r133440. This landed 20 seconds ago
https://www-demo1.allizom.org/ga-IE/firefox/android/
(Assignee)

Comment 31

4 years ago
Looks good to me!
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Thanks Jake, I'll keep an eye on it, hopefully it's fixed now :-)
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard