Closed Bug 1347183 Opened 7 years ago Closed 7 years ago

no startup errors in Sentry [antenna]

Categories

(Socorro :: Antenna, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Both Antenna -dev and -stage throw lots of startup errors for about 10 minutes after they recycle. There's lots of interleaved nonsense in the logs like this:

app.add_route('breakpad', '/submit', BreakpadSubmitterResource(config))
client_config=config, api_version=api_version)
File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 70, in create_client
File "/app/antenna/ext/s3/connection.py", line 139, in _build_client
File "/app/antenna/app.py", line 251, in get_app
timeout=(new_config.connect_timeout, new_config.read_timeout))
ValueError: Invalid endpoint: https://s3..amazonaws.com
File "/app/antenna/ext/s3/connection.py", line 99, in __init__
File "/app/antenna/app.py", line 251, in get_app
ret = fun(*args, **kwargs)
client_config=config, api_version=api_version)
File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 224, in _get_client_args
File "/usr/local/lib/python3.5/site-packages/boto3/session.py", line 263, in client
self.crashstorage = self.config('crashstorage_class')(config.with_namespace('crashstorage'))
Traceback (most recent call last):
timeout=(new_config.connect_timeout, new_config.read_timeout))
ret = fun(*args, **kwargs)
ValueError: Invalid endpoint: https://s3..amazonaws.com
File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 224, in _get_client_args
ret = fun(*args, **kwargs)
File "/app/antenna/ext/s3/connection.py", line 99, in __init__
timeout=(new_config.connect_timeout, new_config.read_timeout))
self.crashstorage = self.config('crashstorage_class')(config.with_namespace('crashstorage'))
File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 224, in _get_client_args
File "/usr/local/lib/python3.5/site-packages/boto3/session.py", line 263, in client
File "/app/antenna/ext/s3/connection.py", line 99, in __init__
File "/app/antenna/app.py", line 251, in get_app
ValueError: Invalid endpoint: https://s3..amazonaws.com
File "/usr/local/lib/python3.5/site-packages/boto3/session.py", line 263, in client
client_config=config, api_version=api_version)
verify, credentials, scoped_config, client_config, endpoint_bridge)
File "/app/antenna/ext/s3/crashstorage.py", line 47, in __init__
File "/app/antenna/ext/s3/connection.py", line 139, in _build_client
self.client = self._build_client()
File "/app/antenna/ext/s3/connection.py", line 139, in _build_client
self.client = self._build_client()
verify, credentials, scoped_config, client_config, endpoint_bridge)
app.add_route('breakpad', '/submit', BreakpadSubmitterResource(config))
aws_session_token=aws_session_token, config=config)
[2017-03-13 23:45:41 +0000] [ANTENNA ip-172-31-57-191 25] [ERROR] antenna.app: Unhandled startup exception


That's logged by this line:

https://github.com/mozilla/antenna/blob/e212b5fbf76fb6de148ae315e138f66166633427/antenna/app.py#L265


This bug covers figuring out why these errors aren't making it to Sentry.
What's going on here is that the app creates a BreakpadSubmitterResource, which creates an S3 connection, which tries to HEAD the S3 bucket and then fails. All of that should happen inside the "capture_unhandled_exceptions" context manager and thus be sent to Sentry, but we don't see anything.
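
For readers unfamiliar with the pattern, here is a minimal sketch of a context manager in the spirit of "capture_unhandled_exceptions". It is not Antenna's actual code: FakeSentryClient and build_app are hypothetical stand-ins, and the only assumption about the real client is that it exposes a captureException() method the way the legacy raven client does.

```python
import contextlib
import logging
import sys

logger = logging.getLogger("antenna.app")


class FakeSentryClient:
    """Hypothetical stand-in for a Sentry client; it only records exceptions."""

    def __init__(self):
        self.captured = []

    def captureException(self):
        # A real client would serialize sys.exc_info() and send it to Sentry.
        self.captured.append(sys.exc_info()[1])


@contextlib.contextmanager
def capture_unhandled_exceptions(client):
    """Log any exception raised in the block, report it, then re-raise it."""
    try:
        yield
    except Exception:
        logger.exception("Unhandled startup exception")
        if client is not None:
            client.captureException()
        raise


def build_app(client):
    """Hypothetical app factory whose setup fails the way the S3 client did."""
    with capture_unhandled_exceptions(client):
        raise ValueError("Invalid endpoint: https://s3..amazonaws.com")
```

Note that with client set to None the exception is still logged and re-raised, but nothing is reported, which is one way to get the log line without a matching Sentry event.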

Since this is a startup error, Gunicorn kills this process and starts a new one (I think). It might do it really fast--it's hard to tell from the logs.

Maybe it happens so fast that Sentry doesn't have a chance to send the data?

Maybe the code is wrong somewhere? It's been hard to test locally, so it's possible.

Maybe the -dev and -stage environments are configured wrong? How do we verify the configuration is correct?


Grabbing this to look into because if this means we're not getting *any* errors to Sentry, then that's bad.
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Miles pointed out we have an endpoint that tests Sentry. So Sentry is configured correctly.

I'll go through the code again and see if I can't suss out issues.
I landed a patch to change the tracebacks so they're on a single line which will reduce the interleaving.
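
One way to get single-line tracebacks, sketched here with the standard logging module only, is a formatter that escapes newlines in the finished log line. This is an illustration under assumptions, not the patch that landed: OneLineFormatter and format_one_line are hypothetical names.

```python
import logging


class OneLineFormatter(logging.Formatter):
    """Formatter that keeps each record, traceback included, on one line."""

    def format(self, record):
        # The base class appends the traceback after a newline; escape every
        # newline so concurrent workers cannot interleave partial tracebacks.
        text = super().format(record)
        return text.replace("\n", "\\n")


def format_one_line(exc, message="Unhandled startup exception"):
    """Build a log record for exc and return it formatted as a single line."""
    record = logging.LogRecord(
        name="antenna.app",
        level=logging.ERROR,
        pathname="app.py",
        lineno=0,
        msg=message,
        args=(),
        exc_info=(type(exc), exc, exc.__traceback__),
    )
    return OneLineFormatter().format(record)
```

In a real setup the formatter would be attached to a handler with handler.setFormatter(OneLineFormatter()); format_one_line exists only to make the behavior easy to check in isolation.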

I landed a patch to log whether or not the error got logged to Sentry. Then we'll know whether that exception catching thing is even kicking off at all.
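
A rough sketch of that kind of diagnostic logging follows. It is illustrative rather than the landed patch: report_exception and StubClient are hypothetical names, and the client is assumed to expose a captureException() method as the legacy raven client does.

```python
import logging

logger = logging.getLogger("antenna.sentry")


class StubClient:
    """Hypothetical client used for illustration; it counts capture calls."""

    def __init__(self):
        self.calls = 0

    def captureException(self):
        self.calls += 1


def report_exception(client):
    """Report the active exception and log whether a client was available.

    Intended to be called from inside an except block. Returns True if the
    exception was handed to a client and False if no client was set up.
    """
    if client is None:
        logger.warning("No Sentry client set up; exception not sent")
        return False
    client.captureException()
    logger.info("Exception sent to Sentry")
    return True
```

With this in place, a missing client shows up in the logs as an explicit warning instead of as a silent absence of Sentry events.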

The only other thing I can think of doing is adding a gevent.sleep(1) to the exception handling, on the theory that it would give Sentry a chance to send the data (assuming that's the problem).

I'll wait until the current set of things land and then re-evaluate.
The code is saying there's no sentry client set up, so that's what's going on. I'll spend some quality time trying to figure out why that might be the case.
Depends on: 1342619
We see startup errors now, so we're good here. Marking as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Switching Antenna bugs to Antenna component.
Component: General → Antenna