Closed
Bug 1173465
Opened 10 years ago
Closed 9 years ago
do not update socorroadmin node automatically
Categories
(Socorro :: Infra, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rhelmer, Unassigned)
Crontabber on the socorroadmin node has long-running jobs and can't handle being run on multiple boxes, so I think for the time being we should treat it as a stateful service (like Consul or ES) and not push auto-updates to it.
Work is ongoing to make it safe to run multiple crontabber instances, but we'll still need to figure out the "long-running jobs" issue.
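For illustration only, here's a minimal sketch of one way to enforce that kind of single-instance constraint, using a PostgreSQL advisory lock; the DSN, lock key, and run_jobs() stub are placeholders, not Socorro's actual mechanism:

# Purely illustrative: keep a crontabber-style process from running on more
# than one box at a time by taking a PostgreSQL advisory lock first.
# The DSN, lock key, and run_jobs() stub below are placeholders.
import sys
import psycopg2

LOCK_KEY = 1173465  # arbitrary application-level lock id

def run_jobs():
    # stand-in for the long-running crontabber jobs
    pass

def main():
    conn = psycopg2.connect("dbname=breakpad")  # placeholder DSN
    cur = conn.cursor()
    # pg_try_advisory_lock() returns immediately; True means we hold the lock.
    cur.execute("SELECT pg_try_advisory_lock(%s)", (LOCK_KEY,))
    (got_lock,) = cur.fetchone()
    if not got_lock:
        print("another instance already holds the lock; exiting")
        sys.exit(0)
    try:
        run_jobs()
    finally:
        cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_KEY,))
        conn.close()

if __name__ == "__main__":
    main()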
Comment 1•9 years ago
Crontabber rarely changes. What is changing often is the configuration (the .ini file and the `jobs` option). So, perhaps crontabber should permanently be run as a single master (per configuration that is!) and we just git pull on the socorro package and reload the crontabber.ini "periodically".
Sorry if that doesn't make sense. What I'm trying to say is: let's not subject the crontabber node to the chaos monkey the way the web heads are, as long as it's easy to re-deploy the socorro code (since the crontabber .ini file is loaded from the socorro python package).
Reporter
Comment 2•9 years ago
(In reply to Peter Bengtsson [:peterbe] from comment #1)
> Crontabber rarely changes. What is changing often is the configuration (the
> .ini file and the `jobs` option). So, perhaps crontabber should permanently
> be run as a single master (per configuration that is!) and we just git pull
> on the socorro package and reload the crontabber.ini "periodically".
The crontabber config in AWS comes from consul now, so we don't even need the crontabber.ini! We will need to keep the Socorro package up to date, true - I think it'd be helpful to just start thinking of "socorro crontabber" as a separate app/service from collector/processor/webapp and so on.
> Sorry, if that doesn't make sense. What I'm trying to say is: Let's not make
> the crontabber node one of those privy to the chaos monkey like the web
> heads are. As long as it's easy to re-deploy the socorro code (since the
> crontabber .ini file loads in from the socorro python package).
Makes total sense, and this is what I was thinking of. We treat ES/Consul nodes this way, and also the ancillary services like crash-analysis and symbolapi, so it's not unprecedented.
I think we'll get a better opportunity next quarter to fix this as we split Socorro up into separate apps - the crontabber node(s) would only be recycled when there's a change in the "socorro crontabber" repo. That reduction in push frequency, plus figuring out how to make multiple crontabbers run safely, should make the problem effectively go away, I think.
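For illustration, a minimal sketch of how a jobs setting might be read out of Consul's KV store over its HTTP API; the agent address and the key name "socorro/crontabber.jobs" are assumptions, not Socorro's real key layout:

# Illustrative sketch of pulling a crontabber setting from Consul's KV store
# over the HTTP API. The agent address and key name are assumptions.
import base64
import json
import urllib.request

CONSUL_KEY_URL = "http://127.0.0.1:8500/v1/kv/socorro/crontabber.jobs"

def fetch_jobs_spec():
    with urllib.request.urlopen(CONSUL_KEY_URL) as resp:
        entries = json.loads(resp.read().decode("utf-8"))
    # Consul returns a JSON list of matching keys; values are base64-encoded.
    return base64.b64decode(entries[0]["Value"]).decode("utf-8")

if __name__ == "__main__":
    print(fetch_jobs_spec())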
Reporter
Comment 3•9 years ago
jp, can we exclude socorroadmin from getting updates, like we do for socorroanalysis and symbolapi? We'll be able to re-enable this next quarter, but socorro crontabber has long-running jobs and isn't safe to run multiple instances of, so it doesn't fit with our deployment strategy quite yet.
We're going to work to fix this next quarter.
Flags: needinfo?(jschneider)
Comment 4•9 years ago
(In reply to Robert Helmer [:rhelmer] from comment #2)
> (In reply to Peter Bengtsson [:peterbe] from comment #1)
> > Crontabber rarely changes. What is changing often is the configuration (the
> > .ini file and the `jobs` option). So, perhaps crontabber should permanently
> > be run as a single master (per configuration that is!) and we just git pull
> > on the socorro package and reload the crontabber.ini "periodically".
>
>
> The crontabber config in AWS comes from consul now, so we don't even need
> the crontabber.ini! We will need to keep the Socorro package up to date,
> true - I think it'd be helpful to just start thinking of "socorro
> crontabber" as a separate app/service from collector/processor/webapp and so
> on.
>
So, do we still rely on this hardcoded list https://github.com/mozilla/socorro/blob/master/socorro/cron/crontabber_app.py#L9-L54 ?
Aren't the correlation jobs going to run off a different configuration so you can run parallel crontabbers (with unrelated dependency trees)?
If so, how do you manage the timing of new config and new socorro python code?
Reporter
Comment 5•9 years ago
(In reply to Peter Bengtsson [:peterbe] from comment #4)
> (In reply to Robert Helmer [:rhelmer] from comment #2)
> > (In reply to Peter Bengtsson [:peterbe] from comment #1)
> > > Crontabber rarely changes. What is changing often is the configuration (the
> > > .ini file and the `jobs` option). So, perhaps crontabber should permanently
> > > be run as a single master (per configuration that is!) and we just git pull
> > > on the socorro package and reload the crontabber.ini "periodically".
> >
> >
> > The crontabber config in AWS comes from consul now, so we don't even need
> > the crontabber.ini! We will need to keep the Socorro package up to date,
> > true - I think it'd be helpful to just start thinking of "socorro
> > crontabber" as a separate app/service from collector/processor/webapp and so
> > on.
> >
>
> So, do we still rely on this hardcoded list
> https://github.com/mozilla/socorro/blob/master/socorro/cron/crontabber_app.py#L9-L54 ?
>
> Aren't the correlation jobs going to run off a different configuration so
> you can run parallel crontabbers (with unrelated dependency trees)?
> If so, how do you manage the timing of new config and new socorro python
> code?
No, we don't rely on the hardcoded list - we use consul to configure the list of jobs as a comma-separated value; it looks like this:
crontabber.jobs=socorro.cron.jobs.laglog.LagLog|5m, socorro.cron.jobs.weekly_reports_partitions.WeeklyReportsPartitionsCronApp|7d, ...
We don't run the correlation jobs anymore; we decided not to use PG JSON for that.
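For illustration, a minimal sketch of turning a crontabber.jobs value like the one above into (job class path, frequency) pairs; it just mirrors the "class|frequency" shape shown here and is not crontabber's own parser:

# Illustrative parser for a comma-separated "class|frequency" jobs spec.
def parse_jobs(spec):
    jobs = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry or entry == "...":
            continue
        class_path, frequency = entry.split("|", 1)
        jobs.append((class_path, frequency))
    return jobs

spec = ("socorro.cron.jobs.laglog.LagLog|5m, "
        "socorro.cron.jobs.weekly_reports_partitions.WeeklyReportsPartitionsCronApp|7d")
print(parse_jobs(spec))
# -> [('socorro.cron.jobs.laglog.LagLog', '5m'),
#     ('socorro.cron.jobs.weekly_reports_partitions.WeeklyReportsPartitionsCronApp', '7d')]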
Comment 6•9 years ago
https://github.com/mozilla/socorro-infra/pull/174 is to resolve this!
Flags: needinfo?(jschneider)
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 7•9 years ago
OOPS! Closed the wrong bug - sorry.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 8•9 years ago
This PR got merged, so we're set here.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED