Detect dodgy character encoding

RESOLVED WONTFIX

Status

P5
enhancement
6 years ago
5 years ago

People

(Reporter: andy+bugzilla, Assigned: davidbgk)

Tracking

x86
Mac OS X
Points:
---

Details

(Whiteboard: p=2)

(Reporter)

Description

6 years ago
This is an implementation of

https://www.owasp.org/index.php/AppSensor_DetectionPoints#EE2:_Unexpected_Encoding_Used

The goal is to make sure we detect requests that use an unexpected encoding. This should go into django-paranoia so it can be reused.

https://django-paranoia.readthedocs.org/
Severity: normal → enhancement
(Reporter)

Updated

5 years ago
Assignee: nobody → david
(Assignee)

Comment 1

5 years ago
Different ways to detect encoding in Python:

* BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#Beautiful%20Soup%20Gives%20You%20Unicode,%20Dammit) on top of chardet (https://pypi.python.org/pypi/chardet)
* python-magic (https://pypi.python.org/pypi/python-magic) on top of libmagic (http://linux.die.net/man/3/libmagic)
* PyICU (https://pypi.python.org/pypi/PyICU) on top of ICU (http://site.icu-project.org/)
* python-libguess (https://bitbucket.org/barro/python-libguess/wiki/Home) on top of libguess (http://www.atheme.org/project/libguess)
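As a rough illustration of the first option, here is a minimal sketch that calls chardet directly. It assumes chardet is installed; the helper name guess_encoding is made up for this example, and it returns None when the library is missing. Detection is statistical, so short inputs can be misidentified.

```python
try:
    import chardet  # third-party package; assumed to be installed
except ImportError:
    chardet = None


def guess_encoding(raw: bytes):
    """Return (encoding, confidence) as guessed by chardet, or None.

    guess_encoding is a hypothetical helper, not part of chardet.
    """
    if chardet is None:
        return None
    # chardet.detect returns a dict with "encoding" and "confidence" keys.
    result = chardet.detect(raw)
    return result["encoding"], result["confidence"]


if __name__ == "__main__":
    print(guess_encoding("héllo wörld, ça va ?".encode("utf-8")))
```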

Django evaluates submitted content lazily: https://docs.djangoproject.com/en/dev/ref/unicode/#form-submission

Both Django forms and testing client use the DEFAULT_CHARSET setting to decode data: https://docs.djangoproject.com/en/dev/ref/settings/#default-charset

Logging encoding errors will probably require monkeypatching both django.http.HttpRequest.body and django.http.QueryDict.
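Independent of where the hook ends up living, the check itself could look something like the sketch below: strictly decode the raw bytes with the expected charset and log anything that fails. This uses only the standard library and does not touch Django. The function names, the logger name, and the "utf-8" default (standing in for DEFAULT_CHARSET) are all assumptions made for illustration.

```python
import logging
from urllib.parse import unquote_to_bytes

logger = logging.getLogger("encoding_check")  # hypothetical logger name


def body_decodes(raw: bytes, charset: str = "utf-8") -> bool:
    """Return True if raw decodes strictly with the expected charset."""
    try:
        raw.decode(charset, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def undecodable_fields(query_string: str, charset: str = "utf-8") -> list:
    """Return names of query parameters whose bytes fail a strict decode."""
    bad = []
    for pair in query_string.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        for part in (name, value):
            # Undo form encoding: "+" means space, %XX is a raw byte.
            raw = unquote_to_bytes(part.replace("+", " "))
            if not body_decodes(raw, charset):
                bad.append(name)
                logger.warning("Unexpected encoding in parameter %r", name)
                break
    return bad
```

For example, undecodable_fields("q=caf%E9") flags "q" because a lone 0xE9 byte is not valid UTF-8, while "q=caf%C3%A9" passes.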

To be discussed.
(Assignee)

Comment 2

5 years ago
After discussing this with Andy, we concluded that it's not worth the investment to log that kind of error.
Feel free to reopen this bug if you want to do something valuable with the logged data.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → WONTFIX