[traceback] UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte

RESOLVED FIXED

Status

--
major
3 years ago
3 years ago

People

(Reporter: stephend, Assigned: jezdez)

Tracking

({in-triage})

Details

(Whiteboard: [fuzzer], URL)

(Reporter)

Description

3 years ago
https://developer.mozilla.org/en-US/search?q=%BF%27%22%28&topic=api&skill=intermediate&type=howto%2F throws:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte

Stacktrace (most recent call last):

  File "django/core/handlers/base.py", line 111, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "newrelic/hooks/framework_django.py", line 499, in wrapper
    return wrapped(*args, **kwargs)
  File "django/views/decorators/csrf.py", line 57, in wrapped_view
    return view_func(*args, **kwargs)
  File "django/views/generic/base.py", line 69, in view
    return self.dispatch(request, *args, **kwargs)
  File "newrelic/hooks/component_djangorestframework.py", line 27, in _nr_wrapper_APIView_dispatch_
    return wrapped(*args, **kwargs)
  File "rest_framework/views.py", line 403, in dispatch
    response = self.handle_exception(exc)
  File "rest_framework/views.py", line 400, in dispatch
    response = handler(request, *args, **kwargs)
  File "rest_framework/generics.py", line 451, in get
    return self.list(request, *args, **kwargs)
  File "kuma/search/views.py", line 68, in list
    return super(SearchView, self).list(request, *args, **kwargs)
  File "rest_framework/mixins.py", line 98, in list
    return Response(serializer.data)
  File "rest_framework/serializers.py", line 576, in data
    self._data = self.to_native(obj)
  File "rest_framework/serializers.py", line 355, in to_native
    value = field.field_to_native(obj, field_name)
  File "rest_framework/fields.py", line 1041, in field_to_native
    value = getattr(self.parent, self.method_name)(obj)
  File "kuma/search/serializers.py", line 124, in get_filters
    many=True
  File "rest_framework/serializers.py", line 574, in data
    self._data = [self.to_native(item) for item in obj]
  File "rest_framework/serializers.py", line 355, in to_native
    value = field.field_to_native(obj, field_name)
  File "rest_framework/serializers.py", line 420, in field_to_native
    return [self.to_native(item) for item in value]
  File "rest_framework/serializers.py", line 355, in to_native
    value = field.field_to_native(obj, field_name)
  File "rest_framework/serializers.py", line 409, in field_to_native
    value = get_component(value, component)
  File "rest_framework/fields.py", line 61, in get_component
    return val()
  File "kuma/search/queries.py", line 15, in urls
    self.url.merge_query_param(self.group_slug, self.slug)),
  File "kuma/search/utils.py", line 40, in merge_query_param
    params = self.query.multi_dict
  File "urlobject/query_string.py", line 47, in multi_dict
    for name, value in self.list:
  File "urlobject/query_string.py", line 35, in list
    value = qs_decode(value)
  File "urlobject/query_string.py", line 138, in _qs_decode_py2
    return urllib.unquote_plus(s).decode('utf-8')
  File "encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)

Errormill: https://errormill.mozilla.org/mdn/mdn/group/398332/
Blocks: 1174209
Component: General → Code Cleanup
Keywords: in-triage
(Assignee)

Comment 1

3 years ago
So I'm not sure where this request comes from, but this query is an ISO-8859-1 encoded value instead of the UTF-8 that the search expects. Can you clarify where that data comes from? Is that a legitimate request or someone fuzzing our forms?
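
To illustrate the encoding mismatch described above, here is a minimal sketch that percent-decodes the `q` value from the failing URL and tries both encodings. It uses Python 3's `urllib.parse.unquote_to_bytes` rather than the Python 2 `urllib.unquote_plus` call shown in the traceback, so it is a reproduction of the same byte-level problem, not the code path kuma actually ran.

```python
from urllib.parse import unquote_to_bytes

# The q= value from the failing search URL in the description.
raw = unquote_to_bytes("%BF%27%22%28")

# 0xBF is a UTF-8 continuation byte, so it cannot start a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    reason = exc.reason  # the same "invalid start byte" as in the traceback

# Every byte is valid in ISO-8859-1, where 0xBF is the inverted question mark.
latin1_text = raw.decode("iso-8859-1")
```

The ISO-8859-1 decode succeeds only because that encoding assigns a character to every byte value, so it does not prove the sender intended ISO-8859-1.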
(Assignee)

Updated

3 years ago
Flags: needinfo?(stephen.donner)
(Reporter)

Comment 2

3 years ago
(In reply to Jannis Leidel [:jezdez] from comment #1)
> So I'm not sure where this request comes from, but this query is an
> ISO-8859-1 encoded value instead of the UTF-8 that the search expects. Can
> you clarify where that data comes from? Is that a legitimate request or
> someone fuzzing our forms?

Pretty sure this was/is from a fuzzer -- not sure if it's from PowerFuzzer[0] or from Netsparker[1]

[0] http://sourceforge.net/projects/powerfuzzer/
[1] https://www.netsparker.com
Flags: needinfo?(stephen.donner)
Whiteboard: [fuzzer]
(Assignee)

Comment 3

3 years ago
:stephend Would you argue that this type of error should be handled on app level in our code or is it an acceptable response when providing data that is encoded in a way that we don't expect?
Flags: needinfo?(stephen.donner)
(Reporter)

Comment 4

3 years ago
(In reply to Jannis Leidel [:jezdez] from comment #3)
> :stephend Would you argue that this type of error should be handled on app
> level in our code or is it an acceptable response when providing data that
> is encoded in a way that we don't expect?

Our team was taught throughout the years by other Webdevers that "the server must gracefully deal with crappy requests" - how that plays out in individual situations isn't always clear, but returning an ISE (500 Internal Server Error) for fuzzed input could be handled better, for example with a 404 or another one of the 4xx codes: https://developer.mozilla.org/en-US/docs/Web/HTTP/Response_codes
Flags: needinfo?(stephen.donner)
(Assignee)

Comment 5

3 years ago
Thanks Stephen, that makes sense. I'll wrap the code in the search's URL parameter handling that triggered the error in a try/except block and return an appropriate HTTP error response instead.
Assignee: nobody → jezdez
Status: NEW → ASSIGNED
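
The try/except approach described in comment 5 could look roughly like the sketch below. It is not the code from the actual kuma commit: `BadQueryEncoding`, `decode_query`, and `search_status` are hypothetical names, and the function returns a bare status code so that the example runs without Django installed.

```python
from urllib.parse import unquote_to_bytes


class BadQueryEncoding(Exception):
    """Hypothetical error: a query parameter is not valid UTF-8."""


def decode_query(query_string):
    """Decode 'a=b&c=d' into (name, value) pairs, strictly as UTF-8."""
    pairs = []
    for part in query_string.split("&"):
        if not part:
            continue
        name, _, value = part.partition("=")
        try:
            pairs.append((
                unquote_to_bytes(name.replace("+", " ")).decode("utf-8"),
                unquote_to_bytes(value.replace("+", " ")).decode("utf-8"),
            ))
        except UnicodeDecodeError as exc:
            # Turn the low-level decode failure into one well-defined error.
            raise BadQueryEncoding(part) from exc
    return pairs


def search_status(query_string):
    """Hypothetical stand-in for the search view: return an HTTP status."""
    try:
        decode_query(query_string)
    except BadQueryEncoding:
        return 404  # client error instead of an unhandled 500
    return 200
```

In a real Django view the `except` branch would more likely raise `django.http.Http404` than return an integer. Whether 404 or 400 is the better status for undecodable input is a judgment call; the commit for this bug chose 404.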

Comment 7

3 years ago
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/798bb8c270f6aecad837bb90d830d9d8cf3b6a76
Fix bug 1173178 - Catch Unicode decode errors in the search API.

This shows a 404 for the search if it's not working because of a decoding error of search query parameters.

https://github.com/mozilla/kuma/commit/0ce11d9aa9ffc09c3f85fbffe63e658195104282
Merge pull request #3347 from mozilla/bug1173178

Fix bug 1173178 - Catch Unicode decode errors in the search API.

Updated

3 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED