Closed Bug 1139242 Opened 11 years ago Closed 11 years ago

figure out how to get ADI into Socorro in AWS

Categories

(Socorro :: Infra, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rhelmer, Assigned: rhelmer)

References

Details

ADI data comes from a Hive database hosted in PHX, which contains parsed Apache logs from the blocklist server. Right now we just query Hive directly. We'll need to figure out how this is going to work once we've moved everything out of PHX.
Assignee: nobody → rhelmer
Status: NEW → ASSIGNED
Sheeri, is there a plan to move ADI out of PHX? Currently we query Hive in PHX. Should we continue doing this for the time being?
Flags: needinfo?(scabral)
The Data Team's Hadoop infrastructure, including Hive, is completely in scl3. Can you give more details about where in phx you are connecting to for the querying? It may be as simple as moving a gateway. Also, there *is* a plan to move ADI processing from SCL3 to the cloud, which is happening this quarter. So that workflow will change. Depending on what you need, we will have some kind of Amazon EMR (their "Hadoop as a service" offering, "Elastic MapReduce") so it may be just querying that directly.
Flags: needinfo?(scabral)
(In reply to Sheeri Cabral [:sheeri] from comment #2)
> The Data Team's Hadoop infrastructure, including Hive, is completely in
> scl3. Can you give more details about where in phx you are connecting to for
> the querying? It may be as simple as moving a gateway.

Oh ok! Good to know, thanks. I was wrong: we are actually using SCL now. Socorro is moving to AWS this quarter, so I need to figure out how to get access to it. I might have to leave a VM in SCL3 to do this.

> Also, there *is* a plan to move ADI processing from SCL3 to the cloud, which
> is happening this quarter. So that workflow will change. Depending on what
> you need, we will have some kind of Amazon EMR (their "Hadoop as a service"
> offering, "Elastic MapReduce") so it may be just querying that directly.

That would work well for us. I am interested in knowing more details when they are available.
OK, we have an interim plan: a VM in SCL that can read from Hive and write to RDS. I have this running now.
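For anyone picking this up later, here is a minimal sketch of what a read-from-Hive, write-to-RDS job of this shape could look like. It is not the job actually running on the VM. The table names (`adi_source`, `adi_target`), the column names, and the function name `copy_adi_for_day` are all hypothetical, and the connections are plain Python DB-API objects so that `sqlite3` can stand in for both ends when trying it out. A real Hive or RDS driver would likely use a different placeholder style than the `?` shown here.

```python
import sqlite3  # stand-in driver for local experiments; not what the VM uses


def copy_adi_for_day(source_conn, target_conn, day, batch_size=1000):
    """Copy one day's ADI rows from the source to the target database.

    Both arguments are DB-API connections. Table and column names are
    hypothetical. Returns the number of rows copied.
    """
    src = source_conn.cursor()
    # Assumption: qmark ("?") placeholders, as sqlite3 uses. Other drivers
    # may need "%s" or named parameters instead.
    src.execute(
        "SELECT adi_date, product, version, adi_count "
        "FROM adi_source WHERE adi_date = ?",
        (day,),
    )

    dst = target_conn.cursor()
    copied = 0
    try:
        # Delete first so that re-running the job for the same day does not
        # produce duplicate rows.
        dst.execute("DELETE FROM adi_target WHERE adi_date = ?", (day,))
        while True:
            rows = src.fetchmany(batch_size)
            if not rows:
                break
            dst.executemany(
                "INSERT INTO adi_target "
                "(adi_date, product, version, adi_count) "
                "VALUES (?, ?, ?, ?)",
                rows,
            )
            copied += len(rows)
        target_conn.commit()
    except Exception:
        # Leave the target untouched if anything fails mid-copy.
        target_conn.rollback()
        raise
    return copied
```

Doing the delete and the inserts in one transaction keeps the job safe to re-run for a given day, which seems useful for a cron-style interim setup, though whether the real job works this way is not stated in this bug.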
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED