investigate missing symlink on prod webheads

RESOLVED FIXED

Status

Socorro
Infra
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: phrawzty, Assigned: phrawzty)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

4 years ago
/data/www/crash-stats.mozilla.org/socorro/webapp-django/media/symbols_upload is supposed to be a symlink to /mnt/socorro/symbols_upload/ on the Prod webheads. For an indeterminate period of time it was not [1]. I fixed the problem manually [2], but this should not have happened in the first place. I suspect deployment nastiness - we'll need to investigate further.

[1] https://errormill.mozilla.org/webtools/socorro-prod/group/172759/
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1071019
Comment 1

4 years ago
We could ship a symlink inside the tarball (and RPM, etc.) if that's helpful - we do this for local.py already. It might not be the friendliest thing to do to outside people using the packages, though.
(Assignee)

Comment 2

4 years ago
Given that we're trying to move towards genericisation, I'd be against shipping such an element as part of our default configuration - it's something that we could include in our deployment script.  In fact, we already do this for Stage, so there's precedent for the same in Prod.
Assignee: nobody → dmaher
Status: NEW → ASSIGNED
(Assignee)

Comment 3

4 years ago
I have made two modifications to the Prod deployment script[1]:
* Added and implemented function "techo" which timestamps echoed strings.
* Added a step which destroys any pre-deploy symbols_upload dir and replaces it with the appropriate symlink. (In fact, there was a rudimentary step for this in place already, but it was both incomplete and incorrect.)
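
For illustration only, the two changes listed above could look roughly like the sketch below. The actual update script is not reproduced in this bug, so the function bodies, the timestamp format, and the helper name ensure_symlink are assumptions; only the name techo and the intent of the symlink step come from this comment.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the actual contents of the Prod update script.

# Echo a message prefixed with a UTC timestamp (the format is an assumption).
techo() {
    printf '[%s] %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S')" "$*"
}

# Make sure $1 is a symlink pointing at $2.
# Anything else found at $1 (directory, file, wrong link) is destroyed first.
ensure_symlink() {
    link_path=$1
    target=$2

    # Nothing to do if the link already exists and points at the right place.
    if [ -L "$link_path" ] && [ "$(readlink "$link_path")" = "$target" ]; then
        techo "ok: $link_path -> $target"
        return 0
    fi

    techo "replacing $link_path with symlink to $target"
    # No trailing slash on $link_path, so a stale symlink is removed
    # itself rather than the directory it points to.
    rm -rf -- "$link_path"
    ln -s -- "$target" "$link_path"
}
```

In a deployment script this would be called once with the symbols_upload path as the first argument and /mnt/socorro/symbols_upload as the second. The early return keeps the step idempotent, so repeated deployments do not recreate a link that is already correct.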

This *should* solve the problem - let's keep an eye on it during the next Prod deployment.

As an aside, I continue to be concerned that this script doesn't appear to be in Puppet anywhere.  We may wish to do something about that, eventually...

[1] socorroadm.private.phx1:/data/crashstats/src/crash-stats.mozilla.org/update
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
(Assignee)

Updated

4 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 4

4 years ago
see also bug 1043777
(Assignee)

Comment 5

4 years ago
The prod webheads have this cron job:

  */5 * * * * root /data/bin/update-www.sh > /dev/null 2>&1


Which effectively does this:

  cd /data/www
  /usr/bin/git fetch -q && /usr/bin/git reset -q --hard origin/master && /usr/bin/git clean -f -q


Which pulls from this repo:

  git://socorroadm.private.phx1.mozilla.com/data/crashstats/www


The contents of which contained this:

  drwxr-xr-x 3 root root 4096 Aug  7 19:07 symbols_upload
  lrwxrwxrwx 1 root root   27 Sep 12 19:57 symbols-upload -> /mnt/socorro/symbols-upload


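Which was exactly what was clobbering the symlink in Prod: the repo carried symbols_upload as a real directory, while the symlink sat under the hyphenated name symbols-upload, so every five-minute "git reset --hard" put the directory back on the webheads.  I have fixed the contents of this repo, and the changes have propagated to the webheads as expected: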

  [dmaher@socorro4.webapp.phx1 media]$ ls -l /data/www/crash-stats.mozilla.org/socorro/webapp-django/media
  lrwxrwxrwx 1 root root 27 Sep 24 05:18 symbols_upload -> /mnt/socorro/symbols_upload
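
The manual ls check above could be turned into a repeatable check along the following lines. This is a minimal sketch, not something that exists on the webheads: the helper name check_symlink and its output format are made up for illustration, and the comparison is an exact string match, so a target written with a trailing slash would be reported as a mismatch.

```shell
#!/bin/sh
# Hypothetical helper; the name and messages are assumptions.
# Succeeds only if $1 is a symlink whose target is exactly $2.
check_symlink() {
    link_path=$1
    expected=$2

    # A real directory or file at this path is the failure seen in this bug.
    if [ ! -L "$link_path" ]; then
        echo "FAIL: $link_path is not a symlink" >&2
        return 1
    fi

    # A link with the wrong target (e.g. hyphen instead of underscore) also fails.
    actual=$(readlink "$link_path")
    if [ "$actual" != "$expected" ]; then
        echo "FAIL: $link_path -> $actual (expected $expected)" >&2
        return 1
    fi

    echo "OK: $link_path -> $actual"
}
```

Run against the media path and /mnt/socorro/symbols_upload after each push, a non-zero exit status would flag a regression before the next five-minute sync makes it visible as an error.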


This represents the solution to the immediate problem (see also bug 1071019); however, what I'm not clear on is why the contents of that repo were incorrect in the first place.  Normally that repo is generated from the "src" directory - which, in this case, is totally correct.  We'll have to keep an eye on this next time we push in case there are other moving parts involved that aren't immediately evident at this stage.


(p.s. The bug in comment #4 was unrelated, as it turns out.)
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Summary: investigate missing symilnk on prod webheads → investigate missing symlink on prod webheads
(Assignee)

Updated

4 years ago
See Also: → bug 1074684