Bug 1173465 (Closed) - do not update socorroadmin node automatically
Opened 9 years ago • Closed 9 years ago
Component: Socorro :: Infra (task)
Status: RESOLVED FIXED
Tracking: Not tracked
Reporter: rhelmer • Assignee: Unassigned
The socorroadmin node's crontabber has long-running jobs and can't safely be run on multiple boxes, so I think for the time being we should treat it as a stateful service (like Consul or ES) and not push auto-updates to it.
Work is ongoing to make it safe to run multiple crontabber instances, but we'll still need to figure out the "long-running jobs" issue.
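For illustration, here is a minimal sketch of one way to enforce a single running crontabber, assuming the job bookkeeping already lives in Postgres: take a session-level advisory lock at startup and bail if another instance holds it. The DSN and lock key are made up for the example; this is not how Socorro actually does it.

    import sys
    import psycopg2

    CRONTABBER_LOCK_KEY = 1173465  # arbitrary key, made up for this example

    def run_all_jobs():
        """Placeholder for the real crontabber job loop."""

    def main():
        conn = psycopg2.connect("dbname=breakpad")  # assumed DSN
        cur = conn.cursor()
        # pg_try_advisory_lock returns immediately: True if this session got
        # the lock, False if another session (another crontabber) holds it.
        cur.execute("SELECT pg_try_advisory_lock(%s)", (CRONTABBER_LOCK_KEY,))
        if not cur.fetchone()[0]:
            print("another crontabber instance is already running; exiting")
            sys.exit(0)
        try:
            run_all_jobs()
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (CRONTABBER_LOCK_KEY,))
            conn.close()

    if __name__ == "__main__":
        main()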
Comment 1 • 9 years ago (Peter Bengtsson [:peterbe])
Crontabber rarely changes. What is changing often is the configuration (the .ini file and the `jobs` option). So, perhaps crontabber should permanently be run as a single master (per configuration that is!) and we just git pull on the socorro package and reload the crontabber.ini "periodically".
Sorry if that doesn't make sense. What I'm trying to say is: let's not make the crontabber node one of those subject to the chaos monkey like the web heads are. That's fine as long as it's easy to re-deploy the socorro code (since the crontabber .ini file loads in from the socorro python package).
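A hypothetical sketch of that "single master, periodic reload" idea; load_jobs_spec and run_due_jobs are illustrative placeholders, not real Socorro functions:

    import time

    RELOAD_INTERVAL = 5 * 60  # seconds; made up for the example

    def load_jobs_spec():
        """Placeholder: re-read crontabber.ini (or whatever holds the config)."""
        return ""

    def run_due_jobs(jobs_spec):
        """Placeholder: run whichever configured jobs are due this tick."""

    def crontabber_master_loop():
        # One long-lived master per configuration: pick up config changes on
        # a fixed interval instead of recycling the node on every deploy.
        while True:
            run_due_jobs(load_jobs_spec())
            time.sleep(RELOAD_INTERVAL)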
Comment 2 (Reporter) • 9 years ago
(In reply to Peter Bengtsson [:peterbe] from comment #1)
> Crontabber rarely changes. What is changing often is the configuration (the
> .ini file and the `jobs` option). So, perhaps crontabber should permanently
> be run as a single master (per configuration that is!) and we just git pull
> on the socorro package and reload the crontabber.ini "periodically".
The crontabber config in AWS comes from consul now, so we don't even need the crontabber.ini! We will need to keep the Socorro package up to date, true - I think it'd be helpful to just start thinking of "socorro crontabber" as a separate app/service from collector/processor/webapp and so on.
> Sorry if that doesn't make sense. What I'm trying to say is: let's not make
> the crontabber node one of those subject to the chaos monkey like the web
> heads are. That's fine as long as it's easy to re-deploy the socorro code
> (since the crontabber .ini file loads in from the socorro python package).
Makes total sense, and this is what I was thinking of. We treat ES/Consul nodes this way, and also the ancillary services like crash-analysis and symbolapi, so it's not unprecedented.
I think we'll get a better opportunity next quarter to fix this as we split Socorro up into separate apps - the crontabber node(s) would only be recycled when there's a change in the "socorro crontabber" repo. That reduction in push frequency, plus figuring out how to make multiple crontabbers run safely, should make the problem effectively go away, I think.
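As a sketch of what "config comes from Consul" can look like, assuming the python-consul client (the KV key name is illustrative, not Socorro's actual key):

    import consul

    def load_jobs_spec(host="localhost", port=8500):
        """Fetch the crontabber jobs list from Consul's KV store."""
        c = consul.Consul(host=host, port=port)
        # kv.get returns (index, item); item is None when the key is absent.
        index, item = c.kv.get("socorro/crontabber.jobs")  # illustrative key
        if item is None:
            raise RuntimeError("crontabber jobs key not found in Consul")
        return item["Value"].decode("utf-8")  # the comma-separated jobs string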
Comment 3 (Reporter) • 9 years ago
jp, can we exclude socorroadmin from getting updates, like we do for socorroanalysis and symbolapi? We'll be able to re-enable this next quarter, but socorro crontabber has long-running jobs and isn't safe to run multiple instances of, so it doesn't fit our deployment strategy quite yet.
We're going to work to fix this next quarter.
Flags: needinfo?(jschneider)
Comment 4 • 9 years ago (Peter Bengtsson [:peterbe])
(In reply to Robert Helmer [:rhelmer] from comment #2)
> (In reply to Peter Bengtsson [:peterbe] from comment #1)
> > Crontabber rarely changes. What is changing often is the configuration (the
> > .ini file and the `jobs` option). So, perhaps crontabber should permanently
> > be run as a single master (per configuration that is!) and we just git pull
> > on the socorro package and reload the crontabber.ini "periodically".
>
>
> The crontabber config in AWS comes from consul now, so we don't even need
> the crontabber.ini! We will need to keep the Socorro package up to date,
> true - I think it'd be helpful to just start thinking of "socorro
> crontabber" as a separate app/service from collector/processor/webapp and so
> on.
>
So, do we still rely on this hardcoded list https://github.com/mozilla/socorro/blob/master/socorro/cron/crontabber_app.py#L9-L54 ?
Aren't the correlation jobs going to run off a different configuration so you can run parallel (unrelated dependency trees) crontabbers?
If so, how do you manage the timing of new config and new socorro python code?
Comment 5 (Reporter) • 9 years ago
(In reply to Peter Bengtsson [:peterbe] from comment #4)
> (In reply to Robert Helmer [:rhelmer] from comment #2)
> > (In reply to Peter Bengtsson [:peterbe] from comment #1)
> > > Crontabber rarely changes. What is changing often is the configuration (the
> > > .ini file and the `jobs` option). So, perhaps crontabber should permanently
> > > be run as a single master (per configuration that is!) and we just git pull
> > > on the socorro package and reload the crontabber.ini "periodically".
> >
> >
> > The crontabber config in AWS comes from consul now, so we don't even need
> > the crontabber.ini! We will need to keep the Socorro package up to date,
> > true - I think it'd be helpful to just start thinking of "socorro
> > crontabber" as a separate app/service from collector/processor/webapp and so
> > on.
> >
>
> So, do we still rely on this hardcoded list
> https://github.com/mozilla/socorro/blob/master/socorro/cron/crontabber_app.py#L9-L54 ?
>
> Aren't the correlation jobs going to run off a different configuration so
> you can run parallel (unrelated dependency trees) crontabbers?
> If so, how do you manage the timing of new config and new socorro python
> code?
No, we don't rely on the hardcoded list - we use Consul to configure the list of jobs as a comma-separated value; it looks like this:
crontabber.jobs=socorro.cron.jobs.laglog.LagLog|5m, socorro.cron.jobs.weekly_reports_partitions.WeeklyReportsPartitionsCronApp|7d, ...
We don't run the correlation jobs anymore, we decided not to use PG JSON for that.
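The value format above is straightforward to parse: comma-separated entries, each a dotted job class path and a run frequency joined by "|". A minimal sketch (the helper name is made up for the example):

    def parse_jobs(spec):
        """Split a crontabber.jobs value into (class_path, frequency) pairs."""
        jobs = []
        for entry in spec.split(","):
            entry = entry.strip()
            if not entry:
                continue
            class_path, frequency = entry.split("|", 1)
            jobs.append((class_path, frequency))
        return jobs

    spec = ("socorro.cron.jobs.laglog.LagLog|5m, "
            "socorro.cron.jobs.weekly_reports_partitions"
            ".WeeklyReportsPartitionsCronApp|7d")
    assert parse_jobs(spec) == [
        ("socorro.cron.jobs.laglog.LagLog", "5m"),
        ("socorro.cron.jobs.weekly_reports_partitions"
         ".WeeklyReportsPartitionsCronApp", "7d"),
    ]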
Comment 6 • 9 years ago
https://github.com/mozilla/socorro-infra/pull/174 should resolve this!
Flags: needinfo?(jschneider)
Updated • 9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 7 • 9 years ago
OOPS! Closed the wrong bug - sorry.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 8 • 9 years ago
This PR got merged, so we're set here.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED