Bug 1347183
Opened 7 years ago
Closed 7 years ago
no startup errors in Sentry [antenna]
Product: Socorro
Component: Antenna
Type: task
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: willkg
Assignee: willkg
Both Antenna -dev and -stage throw lots of startup errors for about 10 minutes after they recycle. There's lots of interleaved nonsense in the logs like this:

  app.add_route('breakpad', '/submit', BreakpadSubmitterResource(config))
  client_config=config, api_version=api_version)
  File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 70, in create_client
  File "/app/antenna/ext/s3/connection.py", line 139, in _build_client
  File "/app/antenna/app.py", line 251, in get_app
  timeout=(new_config.connect_timeout, new_config.read_timeout))
  ValueError: Invalid endpoint: https://s3..amazonaws.com
  File "/app/antenna/ext/s3/connection.py", line 99, in __init__
  File "/app/antenna/app.py", line 251, in get_app
  ret = fun(*args, **kwargs)
  client_config=config, api_version=api_version)
  File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 224, in _get_client_args
  File "/usr/local/lib/python3.5/site-packages/boto3/session.py", line 263, in client
  self.crashstorage = self.config('crashstorage_class')(config.with_namespace('crashstorage'))
  Traceback (most recent call last):
  timeout=(new_config.connect_timeout, new_config.read_timeout))
  ret = fun(*args, **kwargs)
  ValueError: Invalid endpoint: https://s3..amazonaws.com
  File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 224, in _get_client_args
  ret = fun(*args, **kwargs)
  File "/app/antenna/ext/s3/connection.py", line 99, in __init__
  timeout=(new_config.connect_timeout, new_config.read_timeout))
  self.crashstorage = self.config('crashstorage_class')(config.with_namespace('crashstorage'))
  File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 224, in _get_client_args
  File "/usr/local/lib/python3.5/site-packages/boto3/session.py", line 263, in client
  File "/app/antenna/ext/s3/connection.py", line 99, in __init__
  File "/app/antenna/app.py", line 251, in get_app
  ValueError: Invalid endpoint: https://s3..amazonaws.com
  File "/usr/local/lib/python3.5/site-packages/boto3/session.py", line 263, in client
  client_config=config, api_version=api_version)
  verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/app/antenna/ext/s3/crashstorage.py", line 47, in __init__
  File "/app/antenna/ext/s3/connection.py", line 139, in _build_client
  self.client = self._build_client()
  File "/app/antenna/ext/s3/connection.py", line 139, in _build_client
  self.client = self._build_client()
  verify, credentials, scoped_config, client_config, endpoint_bridge)
  app.add_route('breakpad', '/submit', BreakpadSubmitterResource(config))
  aws_session_token=aws_session_token, config=config)
  [2017-03-13 23:45:41 +0000] [ANTENNA ip-172-31-57-191 25] [ERROR] antenna.app: Unhandled startup exception

That's logged by this line:

https://github.com/mozilla/antenna/blob/e212b5fbf76fb6de148ae315e138f66166633427/antenna/app.py#L265

This bug covers figuring out why these errors aren't making it to Sentry.
Comment 1 • 7 years ago (Assignee)
What's going on here is that the app creates a BreakpadSubmitterResource, which creates an S3 connection, which tries to HEAD the S3 bucket and then fails. All of that should happen inside the "capture_unhandled_exceptions" context manager and thus be sent to Sentry, but we don't see anything.

Since this is a startup error, Gunicorn kills this process and starts a new one (I think). It might do it really fast; it's hard to tell from the logs.

Maybe it happens so fast that Sentry doesn't have a chance to send the data?

Maybe the code is wrong somewhere? It's been hard to test locally, so it's possible.

Maybe the -dev and -stage environments are configured wrong? How do we verify the configuration is correct?

Grabbing this to look into because if this means we're not getting *any* errors to Sentry, then that's bad.
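
For anyone following along, here is a minimal sketch of the kind of context manager described above. It is not Antenna's actual code: the module-level `_sentry_client`, the `set_sentry_client` helper, and the client's `capture_exception` method are all hypothetical stand-ins. The point is only to show where a startup error can get logged and still never reach Sentry, namely when no client is set.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger("antenna.app")

# Hypothetical module-level client; None means "Sentry is not configured".
_sentry_client = None


def set_sentry_client(client):
    """Install the (hypothetical) error-reporting client."""
    global _sentry_client
    _sentry_client = client


@contextmanager
def capture_unhandled_exceptions():
    """Log any exception raised in the block, report it if possible, re-raise."""
    try:
        yield
    except Exception:
        # This is the log line that always shows up.
        logger.exception("Unhandled startup exception")
        if _sentry_client is not None:
            # Assumed method name on the stand-in client, not a real API.
            _sentry_client.capture_exception()
        # Re-raise so the process still fails startup.
        raise
```

With this shape, wrapping app construction in `with capture_unhandled_exceptions():` produces the log line in every case, but the report is silently skipped whenever the client was never installed.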
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Comment 2 • 7 years ago (Assignee)
Miles pointed out we have an endpoint that tests Sentry. So Sentry is configured correctly. I'll go through the code again and see if I can't suss out issues.
Comment 3 • 7 years ago (Assignee)
I landed a patch to change the tracebacks so they're on a single line, which will reduce the interleaving.

I also landed a patch to log whether or not the error got sent to Sentry. Then we'll know whether the exception-capturing code is even kicking in at all.

The only other thing I can think of doing is adding a gevent.sleep(1) to the exception handling, on the theory that it gives Sentry a chance to send the data (assuming that's the problem). I'll wait until the current set of things lands and then re-evaluate.
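
A rough sketch of what those two logging changes could look like, using only the standard library. This is not the patch that landed: `single_line_traceback`, `log_unhandled`, and the client's `capture_exception` method are hypothetical names for illustration.

```python
import logging
import traceback

logger = logging.getLogger("antenna.app")


def single_line_traceback(exc):
    """Render an exception's traceback as one line with escaped newlines."""
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    # Replace real newlines with a literal backslash-n so that concurrent
    # workers cannot interleave the pieces of one traceback in the log.
    return "".join(lines).strip().replace("\n", "\\n")


def log_unhandled(exc, sentry_client=None):
    """Log the exception on one line and note whether it was reported."""
    reported = False
    if sentry_client is not None:
        # Assumed method name on a stand-in client, not a real API.
        sentry_client.capture_exception()
        reported = True
    logger.error(
        "Unhandled exception (sentry=%s): %s",
        reported,
        single_line_traceback(exc),
    )
    return reported
```

Putting the `sentry=True/False` flag in the same log line as the traceback means one log entry answers both questions: what failed, and whether the reporting path ran at all.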
Comment 4 • 7 years ago (Assignee)
The code is saying there's no Sentry client set up, so that's what's going on. I'll spend some quality time trying to figure out why that might be the case.
Comment 5 • 7 years ago (Assignee)
We see startup errors now, so we're good here. Marking as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 6 • 7 years ago (Assignee)
Switching Antenna bugs to Antenna component.
Component: General → Antenna