Closed
Bug 821004
Opened 12 years ago
Closed 12 years ago
Set up metrics1.dmz.scl3.mozilla.com for use by Metrics volunteers
Categories
(mozilla.org :: Server Operations: Community IT, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: davidwboswell, Assigned: tom)
References
Details
(Whiteboard: [2013q3] webops mtg)
Opening this bug to request a 1U HP server that will be used for volunteers working with the Metrics team. Annie can provide more details about the requirements.
Comment 1•12 years ago
Please provide details such as:
CPU/RAM/Disk
OS
Network access needed (who can reach it, and what can it reach?)
Also, to help with the details, please add some info about the nature of the work that will be done on it, who all will have access, and what type of vouching will be done before access is given, etc.
Comment 2•12 years ago
Hi Justin. This is meant to be outside of the metrics data center, in an IT-owned space.
David Boswell is the person who would need to answer the vouching question, but basically, it should reach currently publicly available volunteer submission repos (for coding, L10n, webdev, sumo), as well as David Boswell's volunteer management data resources.
The type of work that will be done on it is basic ETL.
Comment 3•12 years ago
Specs should support a good amount of disk space (1.5-2 TB, it's cheap) and enough processor and memory to handle substantial ETL and a MySQL instance.
Reporter
Comment 4•12 years ago
For who can reach it, we're looking to be able to give access to people without LDAP or VPN accounts.
It wouldn't be public either though -- the Metrics team will have a vetting process to identify who to give access to (this sort of vetting process is similar to how the Security team onboards volunteers). We'd document the vetting process and look for advice/feedback from IT before putting it into practice.
In terms of the type of work, here is one example (Annie will have more details or better examples):
The Metrics team is creating a dashboard about Coding contributions, and that data is spread across several data sources (Hg, Git, Bugzilla). We need space for someone to create some MySQL queries that pull relevant information from this publicly available data (for example, the number of people who have a Bugzilla account and have committed code to Hg). The Metrics team could then integrate that information into a Coding contribution dashboard.
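To make the example concrete, here is a rough sketch of that kind of overlap query. Everything in it is an assumption for illustration: the table names (`bugzilla_accounts`, `hg_commits`), the column names, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the MySQL instance so the snippet runs on its own. The `SELECT` itself is plain SQL that should carry over to MySQL, but the real schemas would need to be checked first.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
# These are NOT the real Bugzilla or Hg data layouts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bugzilla_accounts (email TEXT PRIMARY KEY);
    CREATE TABLE hg_commits (rev TEXT PRIMARY KEY, author_email TEXT);
""")
conn.executemany(
    "INSERT INTO bugzilla_accounts VALUES (?)",
    [("a@example.com",), ("b@example.com",), ("c@example.com",)],
)
conn.executemany(
    "INSERT INTO hg_commits VALUES (?, ?)",
    [
        ("r1", "a@example.com"),
        ("r2", "a@example.com"),  # same author twice, counted once
        ("r3", "b@example.com"),
        ("r4", "z@example.com"),  # committer with no Bugzilla account
    ],
)


def count_committers_with_bugzilla_account(db):
    """Count distinct commit authors whose email also has a Bugzilla account."""
    row = db.execute("""
        SELECT COUNT(DISTINCT c.author_email)
        FROM hg_commits AS c
        JOIN bugzilla_accounts AS b ON b.email = c.author_email
    """).fetchone()
    return row[0]


overlap = count_committers_with_bugzilla_account(conn)
print(overlap)
```

Matching on email address is itself an assumption; contributors who use different addresses in Hg and Bugzilla would need some other way of linking identities.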
Comment 5•12 years ago
For completeness' sake, these disparately sourced, pre-aggregated data would likely also be loaded into the metrics data warehouse (uni-directional pull, no data pushed back out).
Comment 6•12 years ago
All -
Missing some details and everyone who's talking to me is utterly confused. The ask is very specific, but I'd rather start with the requirements and application (vs. specific hardware).
We don't generally have a network for arbitrary users to access and doing so has security ramifications.
Also sounds like a call would be easier than bugmail. Can you (David) take point in scheduling this?
Comment 7•12 years ago
Please include :joes, :kang, :jabba in a meeting request.
Reporter
Comment 8•12 years ago
(In reply to matthew zeier [:mrz] from comment #6)
> Also sounds like a call would be easier than bugmail. Can you (David) take
> point in scheduling this?
Sure, I'll set up a call as soon as everyone is available. Look for a Zimbra invite.
Reporter
Comment 9•12 years ago
The only time we were all free over the next couple of weeks is this afternoon at 3. Apologies for the late notice -- if this doesn't work I'll look for another time.
Comment 10•12 years ago
Hmm, I can't actually make that meeting. My main concerns are the specific network access required to and from the host, the access methods, and the vouching methods, all of which joes or kang would be better suited to discuss. So don't block the meeting just because I can't make it. If either joes or kang can attend, I can find out from them how and where to put this host and work out the remaining specs from comment 1.
Comment 11•12 years ago
Host needs to run MySQL. Metrics will push/pull content. Is that a VM or physical host?
Could this be on an external cloud provider or self hosted?
Comment 12•12 years ago
Cloud provisioning is the way to go, much easier to scale up in the face of future demand than racked metal.
Updated•12 years ago
Assignee: server-ops-infra → server-ops
Component: Server Operations: Infrastructure → Server Operations
QA Contact: jdow → shyam
Comment 13•12 years ago
Hi -
Checking in... Has this been spun up, and if so, do we have MySQL available yet? Who is our go-to for cloud hosting? EC2? Do we have a URI yet?
Flags: needinfo?(tom)
Assignee
Comment 14•12 years ago
I'm waiting for mrz to send me details to log in to EC2. He's just checking up on some security practices before we proceed.
What exactly do you need? I.e., how much disk? CPU power? RAM? We can set your threshold just above that so that you can take what you need as you need it, but still have room to grow.
We're planning on using EC2.
Flags: needinfo?(tom)
Comment 15•12 years ago
Could we start with a 600 GB disk, 8 GB RAM, and a quad core?
This is a bit of uncharted territory. Metrics volunteers will trickle in, but they could be doing some very beefy ETL, and Metrics itself will also be pulling from the machine. We will likely need to bump the memory before anything else.
Assignee
Comment 16•12 years ago
Hey Annie,
This should be fine, but we should get this confirmed from a member of staff. Do you need a slightly higher threshold in case you burst above that amount of RAM/Disk just so you've got some space to operate past the limits you stated, before we have to upgrade your threshold?
I've been told to assume EC2 for this, so it looks like we'll put this server there, unless you have anything that means we shouldn't.
Comment 17•12 years ago
To my thinking, in the cloud on EC2 would be perfect. This machine is meant to be accessed by vetted volunteers, and I prefer not to have to open up access to our data centers.
I think the disk space is likely just fine, but as I said, I am more worried about the possibility of needing more memory than anything. The purpose of this machine is to run MySQL and ETL processes with it.
Component: Server Operations → Server Operations: Community IT
QA Contact: shyam → mrz
Comment 18•12 years ago
Let's see if we can't make this happen in Q3. It's uncharted territory, but there's no reason we shouldn't be able to do this. We should be putting addons failovers for Marketplace in EC2 also, so this is not the only use case.
Whiteboard: [2013q3]
Assignee
Comment 19•12 years ago
This is happening. mrz is on PTO, but when he's back I'll see what we can do. Once some level of user account system is set up within AWS, we're pretty much set to make this happen.
Status: NEW → ASSIGNED
Updated•12 years ago
Whiteboard: [2013q3] → [2013q3] webops mtg
Comment 20•12 years ago
Matthew, Corey - what are the blockers to allocating an instance to the BI team for Community Access? We want to give the community Tableau Public access to the data.
Adding Joe Stephenson for OpsSec review.
Comment 21•12 years ago
After meeting with mrz and Annie yesterday (separately), I have a good grasp on the next steps. I have made dependent bug 892598 to spin up a machine* and dependent bug 892600 to install MySQL and any other software folks want.
If anyone on this bug has software requests, please put them in bug 892600. Thank you.
* With the expected continual CPU usage, AWS would become prohibitively expensive. With disk space requirements, a VM would use NFS, which MySQL does not play nicely with. After meeting with Jake's team this morning (webops), we decided a physical machine is better.
Comment 22•12 years ago
Will there be more VPN requirements because of this?
Comment 23•12 years ago
If this host lives in the community network, a VPN will not be required. There are other concerns if the host will need to access any Mozilla internal resources. I don't know whether this was answered.
Comment 24•12 years ago
It will need to have a connection to the metrics assets in our protected data center [net] to pull data (read only) from our DW.
Comment 25•12 years ago
Can you enumerate the flows then per <https://mana.mozilla.org/wiki/display/NOC/ACL+Requests> to jump start any questions Opsec may have?
Updated•12 years ago
Group: mozilla-corporation-confidential
Comment 26•12 years ago
Annie - I can do that as we talked about yesterday. (My notes are in the office and I'm at home today - did you want to copy files or have this MySQL instance replicate the metrics ones?)
We will do a one-way flow - so that the data can go from the DW to this community instance, but folks can't get from the community instance to the protected data center.
Comment 27•12 years ago
Moving the ACL discussion to a private bug so we can keep this one open for the community to see.
Group: mozilla-corporation-confidential
Updated•12 years ago
Summary: Set up a 1U HP server for use by Metrics volunteers → Set up metrics1.dmz.scl3.mozilla.com for use by Metrics volunteers
Comment 28•12 years ago
I've set this up, and Daniel Einspanjer, Annie Elliott, and I have root access. It is currently managed by Puppet, so if you have package requests, please note them in bug 892600 and we'll get them installed ASAP.
Backups and monitoring are in other bugs, so I'm going to close this one out. I will be in the office on Thursday and will check my notes about what netflows and such we should ask to be open.
tl;dr - MySQL is on metrics1.dmz.scl3.mozilla.com but it's not 100% ready for use just yet.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED