Closed Bug 1306310 Opened 9 years ago Closed 9 years ago

[traceback] RequestError: TransportError(400, u'RemoteTransportException[[i-450d96b2]

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 946921

People

(Reporter: willkg, Assigned: adrian)

References

Details

I'm verifying the processor in prod is working and saw a few of these in the logs:

Sep 29 14:04:29 prod-processor-i-3df27225 bash: 2016-09-29 14:04:29,181 CRITICAL - processor - - Thread-4 - Submission to Elasticsearch failed for 2331d75f-877c-4826-bd21-061fb2160929 (TransportError(400, u'RemoteTransportException[[i-450d96b2][inet[/172.31.44.16:9300]][indices:data/write/index]]; nested: IllegalArgumentException[Document contains at least one immense term in field="raw_crash.AsyncShutdownTimeout.full" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: \'[123, 34, 112, 104, 97, 115, 101, 34, 58, 34, 112, 114, 111, 102, 105, 108, 101, 45, 98, 101, 102, 111, 114, 101, 45, 99, 104, 97, 110, 103]...\', original message: bytes can be at most 32766 in length; got 40324]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 40324]; '))
Sep 29 14:04:29 prod-processor-i-3df27225 bash: Traceback (most recent call last):
Sep 29 14:04:29 prod-processor-i-3df27225 bash:   File "/data/socorro/socorro-virtualenv/lib/python2.7/site-packages/socorro-master-py2.7.egg/socorro/external/es/crashstorage.py", line 146, in _submit_crash_to_elasticsearch
Sep 29 14:04:29 prod-processor-i-3df27225 bash:     id=crash_id
Sep 29 14:04:29 prod-processor-i-3df27225 bash:   File "/data/socorro/socorro-virtualenv/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
Sep 29 14:04:29 prod-processor-i-3df27225 bash:     return func(*args, params=params, **kwargs)
Sep 29 14:04:29 prod-processor-i-3df27225 bash:   File "/data/socorro/socorro-virtualenv/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 213, in index
Sep 29 14:04:29 prod-processor-i-3df27225 bash:     _make_path(index, doc_type, id), params=params, body=body)
Sep 29 14:04:29 prod-processor-i-3df27225 bash:   File "/data/socorro/socorro-virtualenv/lib/python2.7/site-packages/elasticsearch/transport.py", line 284, in perform_request
Sep 29 14:04:29 prod-processor-i-3df27225 bash:     status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
Sep 29 14:04:29 prod-processor-i-3df27225 bash:   File "/data/socorro/socorro-virtualenv/lib/python2.7/site-packages/elasticsearch/connection/http_requests.py", line 54, in perform_request
Sep 29 14:04:29 prod-processor-i-3df27225 bash:     self._raise_error(response.status_code, raw_data)
Sep 29 14:04:29 prod-processor-i-3df27225 bash:   File "/data/socorro/socorro-virtualenv/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 97, in _raise_error
Sep 29 14:04:29 prod-processor-i-3df27225 bash:     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Sep 29 14:04:29 prod-processor-i-3df27225 bash: RequestError: TransportError(400, u'RemoteTransportException[[i-450d96b2][inet[/172.31.44.16:9300]][indices:data/write/index]]; nested: IllegalArgumentException[Document contains at least one immense term in field="raw_crash.AsyncShutdownTimeout.full" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: \'[123, 34, 112, 104, 97, 115, 101, 34, 58, 34, 112, 114, 111, 102, 105, 108, 101, 45, 98, 101, 102, 111, 114, 101, 45, 99, 104, 97, 110, 103]...\', original message: bytes can be at most 32766 in length; got 40324]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 40324]; ')
Crash ids I saw were these:

fbd057d1-2a20-445a-8969-4d0b42160929
2331d75f-877c-4826-bd21-061fb2160929
64d8cc36-fee4-49f5-a4c1-fb4802160929

There might be others--I'm just glancing and not doing any kind of thorough analysis.
Adrian: Can you look into this?
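For context (an editor's sketch, not Socorro code): the error above comes from Lucene's hard cap on indexed term size. Any single term whose UTF-8 encoding exceeds 32766 bytes is rejected, which is exactly what happened to the 40324-byte value in raw_crash.AsyncShutdownTimeout.full. A minimal check looks like this:

```python
# Lucene's hard limit on the UTF-8 byte length of a single indexed term.
MAX_TERM_BYTES = 32766


def is_immense(value):
    """Return True if a field value would trip Lucene's term-length check.

    The limit applies to the UTF-8 *byte* length, not the character count,
    so multi-byte characters hit the limit sooner.
    """
    return len(value.encode("utf-8")) > MAX_TERM_BYTES


# The failing document carried a 40324-byte value, well over the limit.
print(is_immense("x" * 40324))  # → True
print(is_immense("x" * 32766))  # → False
```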
Assignee: nobody → adrian
JP points out that loggly shows the same error periodically: https://www.dropbox.com/s/srz584qqboo784u/Screenshot%202016-09-29%2009.30.55.jpg The screenshot shows 240 occurrences of the same "UTF8 encoding is longer than the max length 32766" error spread across the period 2016-09-15 through 2016-09-27.
So this is a known error. It happens with a few different numeric fields as well, and we estimated back in the day that the rate was low enough that we could just ignore it. However, there is a bug about solving this problem: bug 946921. Shall we keep ignoring it, or do we want to actively solve it now?
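One possible per-field fix (an illustrative sketch, not what either bug ultimately implemented) is to truncate oversized values to fit under the limit before submitting the document, taking care not to cut a multi-byte UTF-8 character in half:

```python
# Lucene's hard limit on the UTF-8 byte length of a single indexed term.
MAX_TERM_BYTES = 32766


def truncate_utf8(value, limit=MAX_TERM_BYTES):
    """Truncate a string so its UTF-8 encoding fits within `limit` bytes.

    Slicing the encoded bytes can land in the middle of a multi-byte
    character; decoding with errors="ignore" drops that partial trailing
    character instead of raising UnicodeDecodeError.
    """
    encoded = value.encode("utf-8")
    if len(encoded) <= limit:
        return value
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Truncation keeps the crash indexed (and findable in super search) at the cost of losing the tail of the field. Elasticsearch's `ignore_above` mapping parameter is another option, but it silently drops the whole value rather than a suffix, which feeds directly into the "are we blinded to classes of crashes" concern below.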
Seems like bug #946921 starts out with specific issues, but then the scope opens up to cover the general problem. This is new to me, so I'd want to talk through both the general problem and the specific issues we've seen, and mull over the various options and what we can say about our data after we implement them. Off the top of my head, my concern is this: it's probably not a big deal if 10 crashes don't get indexed, but what happens if those 10 crashes don't get indexed because they're all about one specific problem that happens to produce values ES is sad about? Are we unintentionally blinded to entire classes of crashes? Given that, my urge is to be really methodical about how we fix invalid-value problems and do it one-by-one, so we can know in our hearts we're doing the right thing and not creating a new problem.
It's hard to separate this bug from https://bugzilla.mozilla.org/show_bug.cgi?id=946921 Can we just make this a dupe of https://bugzilla.mozilla.org/show_bug.cgi?id=946921 and put our energy into that one? That bug's summary is generic and not tied to a specific exception.
Flags: needinfo?(adrian)
Peter: Like I said in comment #5, I think duping this bug to bug #946921 is a bad idea. I'd rather solve these data issues one-by-one than do a general solution.
See Also: → 946921
Bug 946921 is effectively going to cover this. We can file more bugs if other ranges of errors happen.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(adrian)
Resolution: --- → DUPLICATE