Closed Bug 1173178 Opened 10 years ago Closed 10 years ago

[traceback] UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte

Categories

(developer.mozilla.org Graveyard :: Code Cleanup, defect)

Type: defect
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: stephend, Assigned: jezdez)

Details

(Keywords: in-triage, Whiteboard: [fuzzer])

https://developer.mozilla.org/en-US/search?q=%BF%27%22%28&topic=api&skill=intermediate&type=howto%2F throws:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte

Stacktrace (most recent call last):

  File "django/core/handlers/base.py", line 111, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "newrelic/hooks/framework_django.py", line 499, in wrapper
    return wrapped(*args, **kwargs)
  File "django/views/decorators/csrf.py", line 57, in wrapped_view
    return view_func(*args, **kwargs)
  File "django/views/generic/base.py", line 69, in view
    return self.dispatch(request, *args, **kwargs)
  File "newrelic/hooks/component_djangorestframework.py", line 27, in _nr_wrapper_APIView_dispatch_
    return wrapped(*args, **kwargs)
  File "rest_framework/views.py", line 403, in dispatch
    response = self.handle_exception(exc)
  File "rest_framework/views.py", line 400, in dispatch
    response = handler(request, *args, **kwargs)
  File "rest_framework/generics.py", line 451, in get
    return self.list(request, *args, **kwargs)
  File "kuma/search/views.py", line 68, in list
    return super(SearchView, self).list(request, *args, **kwargs)
  File "rest_framework/mixins.py", line 98, in list
    return Response(serializer.data)
  File "rest_framework/serializers.py", line 576, in data
    self._data = self.to_native(obj)
  File "rest_framework/serializers.py", line 355, in to_native
    value = field.field_to_native(obj, field_name)
  File "rest_framework/fields.py", line 1041, in field_to_native
    value = getattr(self.parent, self.method_name)(obj)
  File "kuma/search/serializers.py", line 124, in get_filters
    many=True
  File "rest_framework/serializers.py", line 574, in data
    self._data = [self.to_native(item) for item in obj]
  File "rest_framework/serializers.py", line 355, in to_native
    value = field.field_to_native(obj, field_name)
  File "rest_framework/serializers.py", line 420, in field_to_native
    return [self.to_native(item) for item in value]
  File "rest_framework/serializers.py", line 355, in to_native
    value = field.field_to_native(obj, field_name)
  File "rest_framework/serializers.py", line 409, in field_to_native
    value = get_component(value, component)
  File "rest_framework/fields.py", line 61, in get_component
    return val()
  File "kuma/search/queries.py", line 15, in urls
    self.url.merge_query_param(self.group_slug, self.slug)),
  File "kuma/search/utils.py", line 40, in merge_query_param
    params = self.query.multi_dict
  File "urlobject/query_string.py", line 47, in multi_dict
    for name, value in self.list:
  File "urlobject/query_string.py", line 35, in list
    value = qs_decode(value)
  File "urlobject/query_string.py", line 138, in _qs_decode_py2
    return urllib.unquote_plus(s).decode('utf-8')
  File "encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)

Errormill: https://errormill.mozilla.org/mdn/mdn/group/398332/
Blocks: 1174209
Component: General → Code Cleanup
Keywords: in-triage
So I'm not sure where this request comes from, but this query is an ISO-8859-1 encoded value instead of the UTF-8 that the search expects. Can you clarify where that data comes from? Is that a legitimate request or someone fuzzing our forms?
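For reference, the encoding mismatch described above can be reproduced in isolation. This is only a sketch, assuming Python 3 (the traceback comes from Python 2); decode_strict_utf8 is a hypothetical helper and not part of urlobject or kuma. It shows that the q value from the reported URL fails strict UTF-8 decoding at byte 0, while the same bytes decode cleanly as ISO-8859-1.

```python
from urllib.parse import unquote_plus, unquote_to_bytes

# Value of the q parameter in the reported search URL.
RAW_Q = "%BF%27%22%28"


def decode_strict_utf8(value):
    """Percent-decode a query value and decode it strictly as UTF-8.

    Hypothetical helper, not urlobject's API. Python 3's unquote_plus
    defaults to errors="replace", so strict decoding is requested explicitly
    to mirror urllib.unquote_plus(s).decode('utf-8') from the traceback.
    """
    return unquote_plus(value, encoding="utf-8", errors="strict")


# The raw bytes behind the percent-encoding: 0xBF 0x27 0x22 0x28.
raw_bytes = unquote_to_bytes(RAW_Q)
first = raw_bytes[0]

# 0xBF is 0b10111111. A leading "10" marks a UTF-8 continuation byte,
# which can never begin a character, hence "invalid start byte".
is_continuation = (first & 0b11000000) == 0b10000000

# ISO-8859-1 maps every byte to the code point of the same value,
# so the same bytes decode without error under that encoding.
as_latin1 = raw_bytes.decode("iso-8859-1")

try:
    decode_strict_utf8(RAW_Q)
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

print(is_continuation, repr(as_latin1), utf8_error)
```

A clean ISO-8859-1 decode does not prove the sender intended that encoding, since every byte sequence is valid ISO-8859-1. It only shows the input is not UTF-8.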
Flags: needinfo?(stephen.donner)
(In reply to Jannis Leidel [:jezdez] from comment #1)
> So I'm not sure where this request comes from, but this query is an ISO-8859-1
> encoded value instead of the UTF-8 that the search expects. Can you clarify
> where that data comes from? Is that a legitimate request or someone fuzzing
> our forms?

Pretty sure this was/is from a fuzzer -- not sure if it's from PowerFuzzer [0] or from Netsparker [1].

[0] http://sourceforge.net/projects/powerfuzzer/
[1] https://www.netsparker.com
Flags: needinfo?(stephen.donner)
Whiteboard: [fuzzer]
:stephend Would you argue that this type of error should be handled at the app level in our code, or is it an acceptable response when providing data that is encoded in a way that we don't expect?
Flags: needinfo?(stephen.donner)
(In reply to Jannis Leidel [:jezdez] from comment #3)
> :stephend Would you argue that this type of error should be handled at the app
> level in our code, or is it an acceptable response when providing data that
> is encoded in a way that we don't expect?

Our team was taught throughout the years by other Webdevers that "the server must gracefully deal with crappy requests". How that plays out in individual situations isn't always clear, but fuzzed input that currently returns an ISE (HTTP 500 Internal Server Error) could be handled better, with a 404 or another one of the 4xx codes: https://developer.mozilla.org/en-US/docs/Web/HTTP/Response_codes
Flags: needinfo?(stephen.donner)
Thanks Stephen, that makes sense. I'll make sure to wrap the code that triggered the bad behavior in the search's URL parameter handling in a try/except block and return an appropriate HTTP response instead.
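The approach described above might look roughly like the following. This is a framework-free sketch under stated assumptions, not the actual kuma change: NotFound, parse_search_params and handle_search are hypothetical names, and it uses Python 3's urllib.parse.parse_qsl with errors="strict" in place of the urlobject code path from the traceback. In a Django view the NotFound stand-in would typically be django.http.Http404.

```python
from urllib.parse import parse_qsl


class NotFound(Exception):
    """Hypothetical stand-in for a framework's 404 exception."""

    status_code = 404


def parse_search_params(query_string):
    """Parse a raw query string, turning undecodable input into NotFound."""
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being replaced.
        return parse_qsl(
            query_string,
            keep_blank_values=True,
            encoding="utf-8",
            errors="strict",
        )
    except UnicodeDecodeError:
        raise NotFound("search query parameters are not valid UTF-8")


def handle_search(query_string):
    """Return a (status, body) pair instead of letting the error escape."""
    try:
        params = parse_search_params(query_string)
    except NotFound as exc:
        return exc.status_code, str(exc)
    return 200, params


# Query string from the reported URL: yields a 404 rather than a 500.
print(handle_search("q=%BF%27%22%28&topic=api&skill=intermediate&type=howto%2F"))
# A well-formed query still parses normally.
print(handle_search("q=css&topic=api"))
```

Catching only UnicodeDecodeError keeps unrelated bugs visible as server errors, so the 404 is reserved for input the client encoded incorrectly.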
Assignee: nobody → jezdez
Status: NEW → ASSIGNED
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/798bb8c270f6aecad837bb90d830d9d8cf3b6a76
Fix bug 1173178 - Catch Unicode decode errors in the search API.

This shows a 404 for the search if it's not working because of a decoding error of search query parameters.

https://github.com/mozilla/kuma/commit/0ce11d9aa9ffc09c3f85fbffe63e658195104282
Merge pull request #3347 from mozilla/bug1173178

Fix bug 1173178 - Catch Unicode decode errors in the search API.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: developer.mozilla.org → developer.mozilla.org Graveyard