Closed Bug 1182063 Opened 10 years ago Closed 7 years ago

Find home for some daily scripts that use the BMO ES cluster

Categories

(Testing :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: ekyle, Unassigned)

References

(Blocks 1 open bug)

Details

I have 6 cron jobs (in Python) running on a machine in the Toronto office. They are responsible for updating indexes that are derivatives of the BMO ES index. They have been running for over a year without changes, and since the machine in the office is exhibiting unusual behavior, I believe it is a good time to find them a permanent home.

* Private bug leak checker (every 10min)
* Private reviews (run daily)
* Private bug hierarchy (run daily, 2gig memory required)
* Private phone book (run daily)
* Public reviews (run daily)
* Public bug hierarchy (run daily, 2gig memory required)

The first four require access to the private cluster, and must run in a secure environment.
bugzilladm maybe? that's where bmo's cron jobs are executed. how resource intensive are these scripts?
The bug hierarchy jobs are the most demanding: each consumes 100% of one CPU for about 3 hours and requires 2gig of memory. In practice, I split those two jobs into 4 so they run twice as fast.
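The thread does not say how the hierarchy jobs were split. As a rough sketch only, assuming the work can be partitioned by bug ID (which the thread does not confirm), one job could be divided into two halves that run in separate processes. The names `split_into_shards`, `process_shard`, and `run_sharded` are hypothetical and are not taken from the actual scripts.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_shards(bug_ids, num_shards):
    """Partition bug IDs into num_shards lists using bug_id modulo num_shards."""
    shards = [[] for _ in range(num_shards)]
    for bug_id in bug_ids:
        shards[bug_id % num_shards].append(bug_id)
    return shards


def process_shard(shard):
    """Hypothetical stand-in for the CPU-heavy hierarchy work on one shard.

    Here it only counts the bugs; the real job would update an ES index.
    """
    return len(shard)


def run_sharded(bug_ids, num_shards=2):
    """Run process_shard on each shard in its own worker process."""
    shards = split_into_shards(bug_ids, num_shards)
    with ProcessPoolExecutor(max_workers=num_shards) as pool:
        return list(pool.map(process_shard, shards))
```

Calling `run_sharded(...)` should happen under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Separate processes are used rather than threads because the work is described as CPU-bound. Each half would still need its own share of memory.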
i retract bugzilladm as a suggestion then :)
Can these be run in AWS, or do they need to stay inside Mozilla?
I believe AWS can be used to extend a private subnet into the cloud, but I need help from someone with more access and experience. http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario4.html
Glob, do you know how to set up an instance with the necessary magic? If not, let's file a bug with IT and describe our requirements, and they can choose to either create an AWS instance or a VM at SCL3.
> do you know how to setup an instance with the necessary magic?

i do not i'm sorry. i suspect most of it would be moco network work, not aws. given these scripts need to connect to the bmo database it's probably wise to involve opsec as well as infra.
These scripts do not connect to the BMO database, but they do connect to the private ES cluster.
getting AWS access to the private cluster is probably a no-go. it'd almost certainly be more effort than it's worth. offhand, I'd be inclined to add a little more memory to etl1.bugs.scl3 since it already has access to the ES cluster and runs the current ETL jobs. :ekyle, would that be likely to adversely interfere with the ETL jobs?
That sounds like a fine solution. The jobs are run in the early morning, so if they slow down the ETL, few will notice.
If we need post-ETL logic, we can now add it to the main ETL, which is packaged in Docker.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID