Closed Bug 1381879 Opened 7 years ago Closed 5 years ago

[traceback] SerializationError: (big long string that's 7.3mb long)

Categories

(Socorro :: General, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: willkg, Unassigned)

References

(Blocks 1 open bug)

Details

From: https://sentry.prod.mozaws.net/operations/socorro-stage/issues/619628/

"""
SerializationError: (u'{"took":971,"timed_out":false,"_shards":{"total":20,"successful":20,"failed":0},"hits":{"total":15800,"max_score":0.0,"hits":[]},"aggregations":{"signature":{"doc_count_error_upper_bound":30,"sum_other_doc_count":9565,"buckets":[{"key":"OOM | small","doc_count":1045,"startup_crash":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":0,"key_as_string":"false","doc_count":1001},{"key":1,"key_as_string":"true","doc_count":37}]},"histogram_uptime":{"buckets":[{"key":0.0,"doc_count":139},{"key":60.0,"doc_count":24},{"key":7440.0,"doc...
  File "django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "newrelic/hooks/framework_django.py", line 499, in wrapper
    return wrapped(*args, **kwargs)
  File "crashstats/crashstats/decorators.py", line 101, in inner
    return view(request, *args, **kwargs)
  File "session_csrf/__init__.py", line 158, in wrapper
    response = f(request, *args, **kw)
  File "crashstats/crashstats/decorators.py", line 69, in inner
    return view(request, *args, **kwargs)
  File "crashstats/topcrashers/views.py", line 297, in topcrashers
    _range_type=range_type,
  File "crashstats/topcrashers/views.py", line 59, in get_topcrashers_results
    search_results = api.get(**params)
  File "crashstats/supersearch/models.py", line 214, in get
    return super(SuperSearch, self).get(**kwargs)
  File "crashstats/crashstats/models.py", line 341, in get
    return self._get(expect_json=expect_json, **kwargs)
  File "crashstats/crashstats/models.py", line 399, in _get
    expect_json=expect_json,
  File "crashstats/crashstats/models.py", line 182, in inner
    result = method(*args, **kwargs)
  File "crashstats/crashstats/models.py", line 272, in fetch
    result = implementation_method(**params)
  File "socorro/external/es/supersearch.py", line 450, in get
    results = search.execute()
  File "elasticsearch_dsl/search.py", line 606, in execute
    **self._params
  File "newrelic/hooks/datastore_elasticsearch.py", line 70, in _nr_wrapper_Elasticsearch_method_
    return wrapped(*args, **kwargs)
  File "elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "elasticsearch/client/__init__.py", line 625, in search
    doc_type, '_search'), params=params, body=body)
  File "elasticsearch/transport.py", line 348, in perform_request
    data = self.deserializer.loads(data, headers.get('content-type'))
  File "elasticsearch/serializer.py", line 76, in loads
    return deserializer.loads(s)
  File "elasticsearch/serializer.py", line 40, in loads
    raise SerializationError(s, e)
"""

Peter pointed out that this is likely to be Elasticsearch-migration related. We should look at this before we finish up the Elasticsearch migration project.
Summary: [traceback] → [traceback] SerializationError: (u'{"took":971,"timed_out":false
This is irksome. The problem is the way they're creating SerializationError:

https://github.com/elastic/elasticsearch-py/blob/57362aa8c7c739cc46d57f380f984ab972e6d08f/elasticsearch/serializer.py#L40

That makes the string it failed to parse the exception message, so Sentry shows only the first x characters of it and that's all we get. The actual error is captured in the variable "e":

JSONDecodeError("Expecting ',' delimiter or '}': line 1 column 7534060 (char 7534059)",)
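The pattern is easy to see with a minimal sketch. The `SerializationError` and `loads` below are simplified stand-ins for the elasticsearch-py 5.x code linked above, not the real classes: the raw response body becomes `args[0]` (which is what Sentry truncates and displays), and the underlying decode error ends up in `args[1]`.

```python
import json

class SerializationError(Exception):
    """Stand-in for elasticsearch.exceptions.SerializationError:
    args[0] is the raw body, args[1] is the underlying decode error."""
    pass

def loads(s):
    # Simplified version of elasticsearch.serializer.JSONSerializer.loads
    try:
        return json.loads(s)
    except (ValueError, TypeError) as e:
        raise SerializationError(s, e)

# A response body cut off mid-object, like the 7.3mb one in this bug
truncated = '{"took":971,"timed_out":false,"hits":{"total":15800'

try:
    loads(truncated)
except SerializationError as exc:
    body, cause = exc.args           # the giant string comes first...
    print(type(cause).__name__)      # prints "JSONDecodeError"
```

So to get at the real problem you have to dig the second element out of `args` rather than trusting the exception message.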

Off the top of my head, I can think of a few possibilities:

1. It's a bug/problem in Elasticsearch. Maybe this is a lot of data to return and there's a configurable cap?

2. It's a bug/problem in nginx which sits in front of Elasticsearch. Maybe there's a configurable amount of data to send through? Or a timeout was hit? Or something like that?
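One way to tell these two apart, assuming we can capture the raw response body: if the `JSONDecodeError` position is at the very end of the payload, the body was probably cut off in transit (a size cap or timeout in nginx or Elasticsearch); if the error is mid-stream, the JSON itself is malformed. A hypothetical helper (`diagnose_json` is not part of Socorro or elasticsearch-py) might look like:

```python
import json

def diagnose_json(body):
    """Hypothetical helper: distinguish a truncated payload from
    genuinely malformed JSON by where the decode error lands."""
    try:
        json.loads(body)
        return "valid"
    except json.JSONDecodeError as e:
        # An error at (or within a byte of) the end of the body points
        # at truncation, e.g. a proxy buffer or response-size cap.
        if e.pos >= len(body) - 1:
            return "likely truncated (error at byte %d of %d)" % (e.pos, len(body))
        return "malformed mid-stream at byte %d" % e.pos
```

Running that against the saved response body would tell us whether the error position at char 7534059 lines up with the end of the payload.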


Since this only affects topcrashers, I'm inclined to let it go for now; if it persists when we go to -prod, we can fix it there.
As a side note, elasticsearch-py 5.4 is out and we're using 5.3. I don't see any big differences between the two, so I doubt upgrading the Python library would fix this issue.
Summary: [traceback] SerializationError: (u'{"took":971,"timed_out":false → [traceback] SerializationError: (big long string that's 7.3mb long)
The Sentry issue data is gone. I'm not sure what I can do here, so I'm going to close it out as INCOMPLETE.

If it turns out to be a problem in the next migration attempt, it'll rear its ugly head and we can figure it out then.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE