Closed Bug 1439546 Opened 7 years ago Closed 7 years ago

configuration cleanup (2/20/2018)

Categories

(Socorro :: Infra, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

We landed a PR that changed how behavior configuration is specified in Socorro and reduced the number of places we specify configuration (yay!). Soon, we're going to land another PR that changes configuration in the processor. I'm working on a PR that re-fixes the telemetry bucket_name configuration setting. As a result of these landings, we need to make configuration changes in -stage and -prod. We'll need to reflect these in -new-stage and -new-prod.
This is a stub at the moment. I think we'll have a small number of changes that *need* to be made now and a large number of changes that remove redundant configuration, which we can make but don't have to make now. Next step is to go through all the configuration everywhere to figure out exactly what changes need to be made.
In the investigation into bug #1439544, we discovered the crontabber jobs config that doesn't start with "crontabber." is all junk. We can remove those keys from -stage and -prod. One thing we might want to do is run crontabber and processor and remove any keys from consul that don't get listed in the logs at startup. That's probably an easy pass to do and it'd identify 95%+ of the junk keys.
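A minimal sketch of that pass, assuming the startup logs have already been parsed into a set of option names and the Consul keys for one app are available as a list. The function name and the `example.used_option` key are hypothetical, and nothing here talks to Consul or reads real logs.

```python
def find_unlisted_keys(consul_keys, logged_options, prefix):
    """Return keys under `prefix` whose option name is absent from the startup log."""
    unlisted = []
    for key in consul_keys:
        if not key.startswith(prefix):
            continue  # belongs to another app; not part of this pass
        option = key[len(prefix):]
        if option not in logged_options:
            unlisted.append(key)
    return sorted(unlisted)


# Sample data for illustration only.
consul_keys = [
    "socorro/processor/destination.storage_classes",
    "socorro/processor/example.used_option",  # hypothetical key
]
logged_options = {"example.used_option"}  # assumed to be parsed from startup logs

candidates = find_unlisted_keys(consul_keys, logged_options, "socorro/processor/")
print(candidates)
```

The result is only a list of candidates. A key that the app reads without logging it at startup would show up here too, so the list still needs a manual review before anything is deleted.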
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #2)
> In the investigation into bug #1439544, we discovered the crontabber jobs
> config that doesn't start with "crontabber." is all junk. We can remove
> those keys from -stage and -prod.

Note that, on -stage, there are a few crontabber config keys that correctly have the "crontabber." prefix, namely:

socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.nsp_path
socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.package_json_path
socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.safety_api_key
socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.safety_path

These should not be removed.
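The prefix rule from the two comments above can be sketched as a simple partition. This is an illustration under assumptions: the function name is hypothetical and the key list is a small sample taken from this bug, not a dump of the real configuration.

```python
CRONTABBER_ROOT = "socorro/crontabber/"


def split_crontabber_keys(keys):
    """Split crontabber keys into (prefixed, unprefixed) by the "crontabber." rule."""
    prefixed, unprefixed = [], []
    for key in keys:
        if not key.startswith(CRONTABBER_ROOT):
            continue  # not a crontabber key at all
        name = key[len(CRONTABBER_ROOT):]
        if name.startswith("crontabber."):
            prefixed.append(key)
        else:
            unprefixed.append(key)
    return prefixed, unprefixed


sample = [
    "socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.nsp_path",
    "socorro/crontabber/class-MissingSymbolsCronApp.bucket_name",
]
keep, junk = split_crontabber_keys(sample)
print("keep:", keep)
print("junk:", junk)
```

Having the prefix only means a key is not junk by this rule. Whether a prefixed key is actually used still has to be checked some other way.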
socorro/common in -stage has these keys which are not used:

"""
socorro/common/secrets.exacttarget.exacttarget_password
socorro/common/secrets.exacttarget.exacttarget_user
"""

No clue what those are, but they aren't used anywhere.

socorro/crontabber in -stage has these keys which are not used:

"""
socorro/crontabber/class-ElasticsearchCleanupCronApp.retention_policy
socorro/crontabber/class-MissingSymbolsCronApp.bucket_name
socorro/crontabber/class-ReprocessingJobsApp.filter_on_legacy_processing
socorro/crontabber/class-ReprocessingJobsApp.queue_class
socorro/crontabber/class-ReprocessingJobsApp.routing_key
socorro/crontabber/class-ServerStatusCronApp.queue_class
socorro/crontabber/class-UploadCrashReportJSONSchemaCronApp.bucket_name
"""

We need to set this one correctly:

"""
socorro/crontabber/crontabber.class-UploadCrashReportJSONSchemaCronApp.bucket_name
"""

socorro/processor in -stage has these keys which are not used:

"""
socorro/processor/destination.storage-1.benchmark_tag
socorro/processor/destination.storage0.active_list
socorro/processor/destination.storage0.backoff_delays
socorro/processor/destination.storage0.benchmark_tag
socorro/processor/destination.storage0.crashstorage_class
socorro/processor/destination.storage0.statsd_prefix
socorro/processor/destination.storage0.transaction_executor_class
socorro/processor/destination.storage0.wrapped_object_class
socorro/processor/destination.storage1.active_list
socorro/processor/destination.storage1.benchmark_tag
socorro/processor/destination.storage1.crashstorage_class
socorro/processor/destination.storage1.statsd_prefix
socorro/processor/destination.storage1.wrapped_object_class
socorro/processor/destination.storage2.active_list
socorro/processor/destination.storage2.crashstorage_class
socorro/processor/destination.storage2.es_redactor.forbidden_keys
socorro/processor/destination.storage2.statsd_prefix
socorro/processor/destination.storage2.wrapped_object_class
socorro/processor/destination.storage3.active_counters_list
socorro/processor/destination.storage3.active_list
socorro/processor/destination.storage3.crashstorage_class
socorro/processor/destination.storage3.prefix
socorro/processor/destination.storage3.statsd_prefix
socorro/processor/destination.storage4.active_list
socorro/processor/destination.storage4.bucket_name
socorro/processor/destination.storage4.crashstorage_class
socorro/processor/destination.storage4.statsd_prefix
socorro/processor/destination.storage4.wrapped_object_class
socorro/processor/destination.storage_classes
"""

The bulk of those were obsoleted by Mike's recent PolyCrashStorage change. One was a junk key to begin with.

For socorro/webapp-django:

"""
socorro/webapp-django/CACHE_MIDDLEWARE
socorro/webapp-django/CACHE_MIDDLEWARE_FILES
socorro/webapp-django/MWARE_BASE_URL
socorro/webapp-django/MWARE_HTTP_HOST
"""

We also have these that should get removed:

"""
socorro/collector/application
socorro/collector/collector.accept_submitted_crash_id
socorro/collector/metrics.metrics_class
socorro/collector/metrics.statsd_prefix
socorro/collector/storage.crashstorage_class
socorro/collector/web_server.wsgi_server_class
socorro/crashmover/destination.crashstorage_class
socorro/crashmover/destination.storage0.crashstorage_class
socorro/crashmover/destination.storage0.statsd_prefix
socorro/crashmover/destination.storage0.wrapped_crashstore
socorro/crashmover/destination.storage0.wrapped_object_class
socorro/crashmover/destination.storage1.benchmark_tag
socorro/crashmover/destination.storage1.crashstorage_class
socorro/crashmover/destination.storage1.statsd_prefix
socorro/crashmover/destination.storage1.wrapped_crashstore
socorro/crashmover/destination.storage1.wrapped_object_class
socorro/crashmover/destination.storage2.active_counters_list
socorro/crashmover/destination.storage2.active_list
socorro/crashmover/destination.storage2.crashstorage_class
socorro/crashmover/destination.storage2.statsd_prefix
socorro/crashmover/destination.storage_classes
socorro/crashmover/estination.storage2.active_list
socorro/crashmover/producer_consumer.maximum_queue_size 24
socorro/crashmover/producer_consumer.number_of_threads 12
socorro/middleware/elasticsearch.elasticsearch_class
socorro/middleware/hbase.hbase_class
socorro/middleware/implementations.implementation_list
socorro/middleware/rabbitmq.rabbitmq_class
socorro/middleware/web_server.wsgi_server_class
socorro/web-django/resource.elasticsearch.elasticsearch_urls
"""

We no longer have collector, crashmover, or middleware services and "web-django" was probably a typo.

For processor and crontabber, I compared the list of keys with what the app outputs at startup. For webapp-django and common, I just looked through it.

After we do that, we can do a pass and remove configuration where the value is the same as the default in code. I think there are probably a few like that.

Lonnen, Osmose: Does the method of figuring this out look sane? Does the list look ok?
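The follow-up pass mentioned above (dropping configuration whose value equals the default in code) could look roughly like this. It is a sketch under assumptions: the option names and values are hypothetical, the defaults are assumed to have been collected from the code by hand, and stored values are treated as strings.

```python
def keys_matching_defaults(configured, defaults):
    """Return configured option names whose stored value equals the code default."""
    redundant = []
    for name, value in configured.items():
        # Compare as strings, since the stored configuration values are assumed to be text.
        if name in defaults and str(defaults[name]) == value:
            redundant.append(name)
    return sorted(redundant)


# Hypothetical option names and values, for illustration only.
configured = {
    "example.number_of_threads": "12",
    "example.bucket_name": "my-stage-bucket",
}
defaults = {
    "example.number_of_threads": 12,
    "example.bucket_name": "default-bucket",
}

redundant = keys_matching_defaults(configured, defaults)
print(redundant)
```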
Flags: needinfo?(chris.lonnen)
needinfo'ing Osmose, too.
Flags: needinfo?(mkelly)
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Looks great to me!
Flags: needinfo?(mkelly)
I removed all those keys and fixed the configuration for UploadCrashReportJSONSchemaCronApp and we're down to a total of 68 config variables in -stage. Yay! We need to do a system check on -stage and make any additional changes. After that, we can look at doing the same for -prod.
Flags: needinfo?(chris.lonnen)
Since the work was done, we've done a system check and run it for 23 days; -stage is fine. I'm auditing -prod configuration now. We're about to switch infrastructures, so we don't really need to clean up configuration, but it'd be good to know what's missing in -new-prod, if anything, and an audit will help. Doing that this week.
Here's the list of things to delete from -prod. I used -stage as a template of what to keep and double-checked the logs for the processor and crontabber for those keys just like I did with -stage.

No more collector, crashmover, reprocessor, or middleware components, so we don't need their configuration either:

"""
socorro/collector/application
socorro/collector/metrics.metrics_class
socorro/collector/metrics.statsd_prefix
socorro/collector/storage.crashstorage_class
socorro/collector/web_server.wsgi_server_class
socorro/reprocessor/destination.storage0.active_list
socorro/reprocessor/destination.storage1.active_list
socorro/reprocessor/destination.storage2.active_list
socorro/reprocessor/destination.storage3.active_list
socorro/reprocessor/destination.storage4.active_list
socorro/reprocessor/resource.rabbitmq.reprocessing_queue_name
socorro/reprocessor/resource.rabbitmq.standard_queue_name
socorro/reprocessor/resource.statsd.active_list
socorro/reprocessor/resource.statsd.statsd_prefix
socorro/crashmover/destination.crashstorage_class
socorro/crashmover/destination.storage0.benchmark_tag
socorro/crashmover/destination.storage0.crashstorage_class
socorro/crashmover/destination.storage0.filter_on_legacy_processing
socorro/crashmover/destination.storage0.statsd_prefix
socorro/crashmover/destination.storage0.wrapped_crashstore
socorro/crashmover/destination.storage0.wrapped_object_class
socorro/crashmover/destination.storage1.benchmark_tag
socorro/crashmover/destination.storage1.crashstorage_class
socorro/crashmover/destination.storage1.filter_on_legacy_processing
socorro/crashmover/destination.storage1.statsd_prefix
socorro/crashmover/destination.storage1.wrapped_crashstore
socorro/crashmover/destination.storage1.wrapped_object_class
socorro/crashmover/destination.storage2.active_list
socorro/crashmover/destination.storage2.crashstorage_class
socorro/crashmover/destination.storage2.statsd_prefix
socorro/crashmover/destination.storage3.active_list
socorro/crashmover/destination.storage3.crashstorage_class
socorro/crashmover/destination.storage3.filter_on_legacy_processing
socorro/crashmover/destination.storage3.host
socorro/crashmover/destination.storage3.priority_queue_name
socorro/crashmover/destination.storage3.routing_key
socorro/crashmover/destination.storage3.standard_queue_name
socorro/crashmover/destination.storage3.statsd_prefix
socorro/crashmover/destination.storage3.throttle
socorro/crashmover/destination.storage3.virtual_host
socorro/crashmover/destination.storage3.wrapped_object_class
socorro/crashmover/destination.storage_classes
socorro/crashmover/estination.storage2.active_list
socorro/crashmover/producer_consumer.maximum_queue_size
socorro/crashmover/producer_consumer.number_of_threads
socorro/crashmover/source.crashstorage_class
socorro/middleware/elasticsearch.elasticsearch_class
socorro/middleware/hbase.hbase_class
socorro/middleware/implementations.implementation_list
socorro/middleware/implementations.service_overrides
socorro/middleware/rabbitmq.rabbitmq_class
socorro/middleware/web_server.wsgi_server_class
"""

These are no longer used because of recent changes:

"""
socorro/processor/destination.storage0.benchmark_tag
socorro/processor/destination.storage0.crashstorage_class
socorro/processor/destination.storage0.statsd_prefix
socorro/processor/destination.storage0.transaction_executor_class
socorro/processor/destination.storage0.wrapped_crashstore
socorro/processor/destination.storage0.wrapped_object_class
socorro/processor/destination.storage1.active_list
socorro/processor/destination.storage1.benchmark_tag
socorro/processor/destination.storage1.crashstorage_class
socorro/processor/destination.storage1.statsd_prefix
socorro/processor/destination.storage1.use_mapping_file
socorro/processor/destination.storage1.wrapped_crashstore
socorro/processor/destination.storage1.wrapped_object_class
socorro/processor/destination.storage2.active_list
socorro/processor/destination.storage2.benchmark_tag
socorro/processor/destination.storage2.crashstorage_class
socorro/processor/destination.storage2.es_redactor.forbidden_keys
socorro/processor/destination.storage2.statsd_prefix
socorro/processor/destination.storage2.use_mapping_file False
socorro/processor/destination.storage2.wrapped_crashstore
socorro/processor/destination.storage2.wrapped_object_class
socorro/processor/destination.storage3.active_list
socorro/processor/destination.storage3.crashstorage_class
socorro/processor/destination.storage3.statsd_prefix
socorro/processor/destination.storage4.active_list
socorro/processor/destination.storage4.bucket_name
socorro/processor/destination.storage4.crashstorage_class
socorro/processor/destination.storage4.statsd_prefix
socorro/processor/destination.storage4.wrapped_object_class
socorro/processor/destination.storage_classes
socorro/processor/processor.raw_to_processed_transform.BreakpadStackwalkerRule2015.symbol_cache_path
socorro/processor/processor.raw_to_processed_transform.BreakpadStackwalkerRule2015.symbol_tmp_path
socorro/processor/processor.raw_to_processed_transform.BreakpadStackwalkerRule2015.symbols_urls
socorro/processor/destination.storage_classes_tmp
socorro/processor/processor.jit_classifers.JitCrashCategorizeRule.chatty
socorro/processor/resource.signature.collapse_arguments
socorro/processor/sdCounterdestination.storage3.active_list
socorro/crontabber/class-ElasticsearchCleanupCronApp.retention_policy
socorro/crontabber/class-MissingSymbolsCronApp.bucket_name
socorro/crontabber/class-ReprocessingJobsApp.filter_on_legacy_processing
socorro/crontabber/class-ReprocessingJobsApp.queue_class
socorro/crontabber/class-ReprocessingJobsApp.routing_key
socorro/crontabber/class-ServerStatusCronApp.queue_class
socorro/crontabber/class-UploadCrashReportJSONSchemaCronApp.bucket_name
socorro/crontabber/crontabber.class-BugzillaCronApp.days_into_past
socorro/crontabber/database_hostname
"""

We don't have the middleware, symbols, or a data service anymore:

"""
socorro/webapp-django/CACHE_MIDDLEWARE
socorro/webapp-django/CACHE_MIDDLEWARE_FILES
socorro/webapp-django/DATABASE_USER
socorro/webapp-django/DATASERVICE_DATABASE_HOSTNAME
socorro/webapp-django/DATASERVICE_DATABASE_NAME
socorro/webapp-django/DATASERVICE_DATABASE_PASSWORD
socorro/webapp-django/DATASERVICE_DATABASE_PORT
socorro/webapp-django/DATASERVICE_DATABASE_USERNAME
socorro/webapp-django/MWARE_BASE_URL
socorro/webapp-django/MWARE_HTTP_HOST
socorro/webapp-django/SYMBOLS_BUCKET_DEFAULT_LOCATION
socorro/webapp-django/SYMBOLS_BUCKET_DEFAULT_NAME
socorro/webapp-django/SYMBOLS_BUCKET_EXCEPTIONS
"""

Process:

1. Backup existing -prod configuration.
2. Delete all the keys listed above.
3. Watch graphs, sentry, and logs for errors.

Mike: Does that look ok to you?
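Using -stage as a template of what to keep amounts to a set difference between the two environments' key names. A minimal sketch, assuming both key lists have already been exported; the function name is hypothetical and the sample lists are tiny stand-ins, including the assumption that the kept key exists in both environments.

```python
def deletion_candidates(prod_keys, stage_keys):
    """Return -prod keys that have no counterpart in the cleaned-up -stage config."""
    return sorted(set(prod_keys) - set(stage_keys))


# Small stand-in lists for illustration; not the real configuration.
stage_keys = [
    "socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.safety_path",
]
prod_keys = [
    "socorro/crontabber/crontabber.class-DependencySecurityCheckCronApp.safety_path",
    "socorro/reprocessor/resource.statsd.statsd_prefix",
    "socorro/crontabber/database_hostname",
]

to_delete = deletion_candidates(prod_keys, stage_keys)
print(to_delete)
```

The difference only produces candidates. As described above, the processor and crontabber keys were also double-checked against the startup logs before being listed for deletion.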
Flags: needinfo?(mkelly)
Looks good to me
Flags: needinfo?(mkelly)
I converted it to a bash script, went through it again, and then ran it in -prod. I'll keep an eye on things over the course of today. If everything's fine tomorrow, I'll close this out.
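The actual script isn't attached to this bug. As a rough sketch of the backup-then-delete steps, assuming the configuration lives in Consul's key/value store and the `consul` CLI is available, the commands could be generated like this. The function names, the `socorro/` backup prefix, and the dry-run default are assumptions for illustration.

```python
import subprocess


def build_commands(keys, backup_prefix="socorro/"):
    """Return the argv lists: one export for the backup, then one delete per key."""
    commands = [["consul", "kv", "export", backup_prefix]]
    commands += [["consul", "kv", "delete", key] for key in keys]
    return commands


def run(commands, backup_path, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    export_cmd, delete_cmds = commands[0], commands[1:]
    if dry_run:
        for argv in commands:
            print(" ".join(argv))
        return
    # Step 1: write the exported JSON to a backup file before touching anything.
    with open(backup_path, "w") as fh:
        subprocess.run(export_cmd, stdout=fh, check=True)
    # Step 2: delete the listed keys one at a time, stopping on the first failure.
    for argv in delete_cmds:
        subprocess.run(argv, check=True)


keys_to_delete = [
    "socorro/collector/application",
    "socorro/crontabber/database_hostname",
]
commands = build_commands(keys_to_delete)
run(commands, backup_path="prod-config-backup.json", dry_run=True)
```

Keeping the dry run as the default makes it easy to go through the generated commands once more before running them for real. Step 3 of the process (watching graphs, sentry, and logs) stays a manual step.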
I watched Datadog and sentry and everything looks fine. I'm going to close this out as FIXED. Yay!
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED