Closed Bug 980082 Opened 10 years ago Closed 10 years ago

NFS mount for webheads to write to

Categories

(Socorro :: Infra, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 994128

People

(Reporter: peterbe, Assigned: dmaher)


As of bug 977216 we now have the possibility to upload files (actually zip files containing symbols) into the webapp. These are stored in the webapp's `settings.MEDIA_ROOT/symbols-uploads` directory.

That needs to be an NFS mounted directory so that the server that runs the crontabber can read from that same directory. 

There's some more explanation in comment 6 if that helps https://bugzilla.mozilla.org/show_bug.cgi?id=977216#c6
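
To make the intended handoff concrete, here is a minimal shell sketch of the cron side picking up files from the shared directory. It is an illustration only, not the crontabber job itself: `process_uploads` is a hypothetical name, the "push" step is just an echo, deleting each file after it is handled is an assumption, and a temporary directory stands in for `settings.MEDIA_ROOT/symbols-uploads`.

```shell
#!/bin/sh
# Hypothetical cron-side pickup: handle every zip in the shared directory,
# then remove it (assumed here, since the directory is only a staging area).
process_uploads() {
    upload_dir=$1
    for f in "$upload_dir"/*.zip; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        # Placeholder for the real transfer to the symbol server.
        echo "would push $(basename "$f") to the symbol server"
        rm -- "$f"
    done
}

# Demo: a throwaway directory stands in for the NFS-backed upload directory.
upload_dir=$(mktemp -d)
touch "$upload_dir/a.zip"      # what a webhead would have written
process_uploads "$upload_dir"  # what the cron host would run
```

The point of the shared mount is simply that the `touch` and the `process_uploads` call can happen on different machines.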
Blocks: 977211
How much storage are we talking about here? 10GB, 100GB, 1TB?
Flags: needinfo?(peterbe)
Less than 1GB. Actually, even less than that. It's just a temporary way to get a file from the webhead to the crontabber and on to the eventual symbol server, which is already in place.
Flags: needinfo?(peterbe)
Okay, thanks... out of curiosity, long-term, would you be happier with an object store? Ceph, S3, DDN WOS, etc... all we have right now is S3, and since none of Socorro is in AWS that seems ill-advised at this time. But if we had something local... ?


Dan / Greg:

I'm thinking a 5GB volume... r/w access to:
socorro1.webapp.phx1
socorro2.webapp.phx1
socorro3.webapp.phx1
socorro4.webapp.phx1
sp-admin01.phx1

:peterbe, does that seem like the right list of nodes? Been a while since I touched Socorro, this is only my guess based on my last recollection...

Perhaps something like "symbols-uploads" for the name, since that'd match the directory it's being mounted to... just for symmetry and obviousness.
Assignee: server-ops-webops → server-ops-virtualization
Component: WebOps: Socorro → Server Operations: Virtualization
Flags: needinfo?(peterbe)
Product: Infrastructure & Operations → mozilla.org
QA Contact: nmaul → dparsons
I'll have to refer to :lonnen regarding the list of webheads. 

We could rewrite it to use S3 instead. Technically that's not much harder for me, but it adds [deployment] complexity: we'd need a secret + key and another dependency (boto).

I chose NFS based on discussion with :solarce, who thinks it would be slightly simpler with NFS.
Flags: needinfo?(peterbe) → needinfo?(chris.lonnen)
@peterbe https://mana.mozilla.org/wiki/display/websites/crash-stats.mozilla.com+%28Socorro%29 has all the info

@jakem - that looks right
Flags: needinfo?(chris.lonnen)
Assignee: server-ops-virtualization → gcox
I'm assuming no snapshots since the data is transient.
What's the path where it should be mounted on the servers?
(In reply to Greg Cox [:gcox] (plz don't needinfo me) from comment #6)
> I'm assuming no snapshots since the data is transient.

Data is transient. If all webheads and all cron processes were on the same machine we wouldn't have this challenge. 
I'm not sure what "no snapshots" means. 

> What's the path where it should be mounted on the servers?

I don't know where on the webheads the webapp is located exactly. :jakem, :lonnen?
When you've located that, the exact path needs to be `<path>/media/symbols-uploads`
I believe <path> is /data/www/crash-stats.mozilla.org/socorro/webapp-django

so the path should be /data/www/crash-stats.mozilla.org/socorro/webapp-django/media/symbols-uploads


@peterbe -- those directories don't exist. Do they need to be created?
Flags: needinfo?(peterbe)
/data/www/crash-stats.mozilla.org/socorro/webapp-django/media/ might need to be created. The `symbols-uploads` part is, I guess, where the mount starts.

Honestly, I don't really know how NFS works.
Flags: needinfo?(peterbe)
(In reply to Greg Cox [:gcox] (plz don't needinfo me) from comment #6)
> I'm assuming no snapshots since the data is transient.
> What's the path where it should be mounted on the servers?

Based on this:
[dmaher@sp-admin01.phx1 ~]$ hostname
sp-admin01.phx1.mozilla.com
[dmaher@sp-admin01.phx1 ~]$ mount | grep symbol
10.8.75.14:/symbols on /mnt/socorro/symbols type nfs (rw,noatime,rsize=32768,wsize=32768,addr=10.8.75.14)

I would submit that a reasonable mount point would be /mnt/socorro/symbols-upload


(In reply to Peter Bengtsson [:peterbe] from comment #9)
> /data/www/crash-stats.mozilla.org/socorro/webapp-django/media/ might need to
> be created. the `symbols-uploads` part is, I guess, where the mount starts. 
> 
> Honestly, I don't know how NFS works really.

In general what happens is that there is a system mount (see above) which is then symlinked to from the application directory.  In this case:
[...]socorro/webapp-django/media/symbols-upload -> /mnt/socorro/symbols-upload

This symlink is managed on the admin node and pushed out to the webheads just like the rest of the content.
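
For illustration, the mount-plus-symlink layout described above can be sketched in shell. This is a minimal sketch, not the actual deployment script: `link_upload_dir` is a hypothetical helper name, and the demo uses temporary directories standing in for /mnt/socorro/symbols-upload and the webapp's media directory so it runs without root or a real NFS mount.

```shell
#!/bin/sh
# Hypothetical helper: make <media_dir>/<link_name> a symlink to <mount_point>.
link_upload_dir() {
    mount_point=$1
    media_dir=$2
    link_name=$3

    # The media directory may not exist yet on a fresh checkout.
    mkdir -p "$media_dir" || return 1

    # -s: symbolic link; -f: replace an existing link; -n: do not descend
    # into an existing link that points at a directory (GNU and BSD ln).
    ln -sfn "$mount_point" "$media_dir/$link_name"
}

# Demo with throwaway directories instead of the real paths.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/mnt/socorro/symbols-upload"
link_upload_dir "$demo_root/mnt/socorro/symbols-upload" \
    "$demo_root/webapp-django/media" "symbols-upload"

# A file written through the application path lands in the mount directory.
echo "fake symbols" > "$demo_root/webapp-django/media/symbols-upload/test.zip"
ls "$demo_root/mnt/socorro/symbols-upload"
```

The application only ever sees its own media path; where the bytes actually live is decided by the link target.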
Sorry for the delay on this.  I'm really slow at figuring out puppet references.

Vol created.  It's mountable as 10.8.75.14:/symbols_upload or 10.8.81.10:/symbols_upload depending on your VLAN.

Updated manifests/nodes/socorro.pp and created modules/socorro/manifests/symbols_upload.pp as change 85142 to do the mounts.  /mnt/socorro/symbols-upload is mounted on sp-admin01 and socorro[1234].webapp.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Pardon my ignorance of deployment stuff but I ssh'ed in to socorroadm.private.phx1.mozilla.com and looked in /data/crashstats/www/crash-stats.mozilla.org/socorro/webapp-django/media/ and there's no symbols-uploads there :(

And there's no /mnt/socorro/symbols-upload either. 

I did notice that there is a /mnt/socorro/symbols-upload on the admin node (sp-admin01.phx1.mozilla.com) which is great because that's where we'll read from in the cron job.
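
A check along these lines can tell, per host, whether a given path really is an NFS mount rather than a plain directory. It is a sketch under assumptions: `is_nfs_mounted` is a hypothetical helper, it reads a Linux-style mounts table (/proc/mounts by default), and the sample table below is the existing /mnt/socorro/symbols mount rewritten into that format purely for the demo.

```shell
#!/bin/sh
# Hypothetical helper: succeed if <path> is listed as an NFS mount in a
# mounts table. The table defaults to /proc/mounts; pass a file to test.
is_nfs_mounted() {
    path=$1
    table=${2:-/proc/mounts}
    # Table fields: device, mount point, fs type, options, dump, pass.
    awk -v p="$path" \
        '$2 == p && $3 ~ /^nfs/ { found = 1 } END { exit !found }' "$table"
}

# Demo against a sample table instead of the live system.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10.8.75.14:/symbols /mnt/socorro/symbols nfs rw,noatime,rsize=32768,wsize=32768,addr=10.8.75.14 0 0
EOF

for p in /mnt/socorro/symbols /mnt/socorro/symbols-upload; do
    if is_nfs_mounted "$p" "$sample"; then
        echo "$p: mounted"
    else
        echo "$p: NOT mounted"
    fi
done
```

On a real host the second argument would be omitted so that the live mounts table is consulted.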
socorroadm wasn't asked for back in comment 3.  Any others needed?
:phrawzty mentioned symlinks in comment 10, in more of the theoretical realm, so I didn't act on changing that since it was deep in someone-else's-data-structure.
Status: RESOLVED → REOPENED
Component: Server Operations: Virtualization → Server Operations: Storage
Resolution: FIXED → ---
Added mount to socorroadm (manifests/nodes/socorro.pp, change 85158).
(In reply to Greg Cox [:gcox] (plz don't needinfo me) from comment #13)
> socorroadm wasn't asked for back in comment 3.  Any others needed?
> :phrawzty mentioned symlinks in comment 10, in more of the theoretical
> realm, so I didn't act on changing that since it was deep in
> someone-else's-data-structure.

Pardon me. I was vague. I meant we need a place to write to the NFS mount (the webheads) and a place to read from the NFS mount (admin node). 

Regarding the puppet stuff, is that something you can help with :phrawzty? I.e. make a symlink inside the webapp-django/media/ called "symbols-uploads" to point to the NFS mount?
Flags: needinfo?(dmaher)
:peterbe and I went over this in person.  The tl;dr is:

# socorroadm.private.phx1:/data/crashstats/src/crash-stats.mozilla.org/update
ln -s /mnt/socorro/symbols-upload $APP/webapp-django/media/symbols-upload      # line 29

This only affects prod.  I'm not sure how stage gets deployed (yet) so I haven't made a similar modification there (yet).
Assignee: gcox → dmaher
Status: REOPENED → ASSIGNED
Component: Server Operations: Storage → Infra
Product: mozilla.org → Socorro
QA Contact: dparsons
Can you attempt the same thing for stage? That way, we can actually start testing the actual upload part.
Currently, socorro[1-2].stage.webapp.phx1 do not have *any* netapp mounts, and socorroadm.stage.private.phx1 only has 10.8.75.14:/symbols .  I will proceed with the necessary adjustments.
08:31:43 < phrawzty> peterbe: i'm unsure as to whether the stage webheads need *only* /mnt/socorro/symbols-upload, or both that *and* /mnt/crashanalysis
08:32:42 < peterbe> *only*
Committed to Puppet (r85536); however:

Error: /Stage[main]/Socorro::Symbols_upload/Mount[/mnt/socorro/symbols-upload]: Could not evaluate: Execution of '/bin/mount -o rw,hard,nointr,rsize=65536,wsize=65536,bg,proto=tcp,vers=3,noatime /mnt/socorro/symbols-upload' returned 32: mount.nfs: access denied by server while mounting 10.8.81.10:/symbols_upload

I'll open a bug to deal with that.
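
For reference, the mount that Puppet attempted corresponds roughly to an /etc/fstab entry like the one printed by the sketch below. This is illustrative only: `fstab_line` is a hypothetical helper, the server, export, mount point and options are copied from the error message above, and the trailing `0 0` dump/pass fields are an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab-style line for an NFS mount using the
# options that appear in the failed mount command.
fstab_line() {
    server=$1
    export_path=$2
    mount_point=$3
    opts="rw,hard,nointr,rsize=65536,wsize=65536,bg,proto=tcp,vers=3,noatime"
    # fstab fields: device, mount point, fs type, options, dump, pass.
    printf '%s:%s %s nfs %s 0 0\n' "$server" "$export_path" "$mount_point" "$opts"
}

fstab_line 10.8.81.10 /symbols_upload /mnt/socorro/symbols-upload
```

Nothing here touches the system; it only prints the line, which makes it easy to compare the stage and prod definitions side by side.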
Flags: needinfo?(dmaher)
Depends on: 991775
Status: ASSIGNED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → DUPLICATE