[prod] ISE: IOError: request data read error

Status

VERIFIED WONTFIX
8 years ago
8 years ago

People

(Reporter: mbrandt, Assigned: brez)

Tracking

unspecified

Details

(Whiteboard: [prod], URL)

(Reporter)

Description

8 years ago
Requested url /requests/save_mark

Traceback (most recent call last):

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/views/decorators/http.py", line 37, in inner
    return func(request, *args, **kwargs)

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/markup/requests.py", line 106, in save_mark
    if 'points_obj' in request.POST and 'points_obj_simplified' in request.POST:

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 171, in _get_post
    self._load_post_and_files()

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 151, in _load_post_and_files
    self._post, self._files = http.QueryDict(self.raw_post_data, encoding=self._encoding), datastructures.MultiValueDict()

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 205, in _get_raw_post_data
    size=content_length)

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 69, in safe_copyfileobj
    buf = fsrc.read(min(length, size))

IOError: request data read error
Assignee: nobody → jbresnik
Blocks: 628811
(Assignee)

Comment 1

8 years ago
What are the steps to recreate? 

IOError is just a generic error that comes from Apache / wsgi, i.e. it is not specifically app related.

Which versions of wsgi / Apache are running in production?
(Assignee)

Comment 2

8 years ago
Also, just to clarify: this is really an issue with the configuration (a bug inside wsgi), not the app (see the stack trace).
(Assignee)

Comment 3

8 years ago
And I don't have access to the production environment
(Assignee)

Comment 4

8 years ago
We believe this is related to load, and have made a significant change so that RAW mark data isn't sent unless the feature is enabled:

https://github.com/mozilla/markup/commit/e17cd7b9acb0b2619cf0c6f6cd61f5632c540147#L2R39
and
https://github.com/mozilla/markup/commit/b6290ae8a5bcef9938c4a418c04024ba45ee03e6

Closing for now; please reopen if this persists.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Assignee)

Comment 5

8 years ago
Some additional changes pending
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 6

8 years ago
This needed additional backend support, which has now been added:

https://github.com/mozilla/markup/commit/07a8c771832c3290a3847ac3863500abd5e1967d
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Reporter)

Comment 7

8 years ago
Hmm, this is strange: it appears the ISE is still being hit. If I understand correctly, this patch has already landed on prod. If that is the case, let's continue to dig into this one.
Here's the latest traceback; a similar one is sent every couple of hours.

Traceback (most recent call last):

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/views/decorators/http.py", line 37, in inner
    return func(request, *args, **kwargs)

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/markup/requests.py", line 106, in save_mark
    if 'points_obj_simplified' in request.POST:

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 171, in _get_post
    self._load_post_and_files()

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 151, in _load_post_and_files
    self._post, self._files = http.QueryDict(self.raw_post_data, encoding=self._encoding), datastructures.MultiValueDict()

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 205, in _get_raw_post_data
    size=content_length)

  File "/data/www/python/markup.mozilla.org/markup/ffdemo/vendor/src/django/django/core/handlers/wsgi.py", line 69, in safe_copyfileobj
    buf = fsrc.read(min(length, size))

IOError: request data read error
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 8

8 years ago
OK, I'll set up a load test for save_mark. Can you tell me what versions of the OS / Apache / wsgi you are running so I can recreate the same environment?
(In reply to comment #8)
> Ok Ill set a load test for save mark - can you tell me what versions of the
> OS / apache / wsgi you are running so I can recreate the same environment?

RHEL6
httpd-2.2.15-9.el6.i686
mod_wsgi-3.2-1.el6.i686
(Assignee)

Comment 10

8 years ago
Just to confirm, we are *not* writing marks to the drive, correct? I.e. is this what is in settings_local.py:

ENABLE_RAW_MARKS = False

?
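
For anyone following along, here is a minimal sketch of the kind of server-side gating that flag implies. This is not the code from the commits linked in comment 4: `select_mark_fields` and its `raw_enabled` parameter are hypothetical names, and `post` stands in for `request.POST`.

```python
# Hypothetical stand-in for the value read from settings_local.py.
ENABLE_RAW_MARKS = False


def select_mark_fields(post, raw_enabled=ENABLE_RAW_MARKS):
    """Return only the POST fields the server should act on.

    `post` is any dict-like object, e.g. request.POST in a Django view.
    """
    fields = {}
    if 'points_obj_simplified' in post:
        fields['points_obj_simplified'] = post['points_obj_simplified']
    # Raw stroke data is only kept when the feature is switched on.
    if raw_enabled and 'points_obj' in post:
        fields['points_obj'] = post['points_obj']
    return fields
```

With the flag off, any `points_obj` value a client still sends is simply ignored.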
(Assignee)

Comment 11

8 years ago
Not having any luck recreating that error. Using 1000 simultaneous threads (500 loop count), I wasn't able to get an IOError, but I did manage to get this:

[error] [client 127.0.0.1] (61)Connection refused: mod_wsgi (pid=68634): Connection attempt #1 to WSGI daemon process 'django' on '/private/var/run/wsgi.67411.0.1.sock' failed, sleeping before retrying again.

but that's reasonable given the amount of load I was hitting it with.

Next step is to run on RHEL6 (I've been running on OS X).
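
One way a test like this could be scripted, assuming plain Python threads rather than whatever tool was actually used here. `run_load` and `send_request` are hypothetical names; `send_request` would be whatever function POSTs a mark to /requests/save_mark. The defaults mirror the 1000 threads / 500 loops mentioned above.

```python
import threading


def run_load(send_request, n_threads=1000, loops=500):
    """Call send_request() from n_threads threads, `loops` times each.

    Returns a dict mapping exception class name to how often it occurred.
    """
    errors = {}
    lock = threading.Lock()

    def worker():
        for _ in range(loops):
            try:
                send_request()
            except Exception as exc:  # tally every failure type
                with lock:
                    name = type(exc).__name__
                    errors[name] = errors.get(name, 0) + 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

An empty result means no request raised. Note that a harness like this only exercises well-behaved clients, so it would not reproduce errors caused by connections being dropped mid-upload.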
(Assignee)

Comment 12

8 years ago
Actually, I don't have access to RHEL6 (because I don't have a license for it), so I'm using CentOS 5.6.
(In reply to comment #12)
> Actually I don't have access to RHEL6 (because I dont have a license for it)
> - using CentOS 5.6

I believe this was outlined (or should have been) when we signed a contract for TBG to build this site.  This should be a business expense for you, and I'm pretty sure our contract was enough to cover the cost of a development server.

Sorry, I just don't buy that excuse.
(Assignee)

Comment 14

8 years ago
Well, CentOS is the open source version of Red Hat, but sure, let me find out about a license.
(Assignee)

Comment 15

8 years ago
I also need the version of Python that is running on production.
(Assignee)

Comment 16

8 years ago
After some research, our sysadmin found some interesting observations about this error:

* Browser: you can get this when a user just clicks the "stop" button 
or navigates away. Some browsers close the connection cleanly; others 
don't. 
* Browser: some versions of IE, especially on Vista, randomly 
terminate uploads. Most of the time it's seemed to be related to the 
windows firewall thing. 
* Browser: various development versions of Firefox, Chrome, and Safari 
have all had upload/disconnect issues. To my knowledge none of these 
issues have made it into shipping versions, but many users of these 
browsers use the cutting edge. 
* Dodgy 'net connections, especially public ones. For example, I can 
pretty consistently trigger an IOError by uploading a file from Kansas 
City International Airport. 
* Badly-configured intermediary caches, like those run by ISPs. AOL 
dialup users (yes, there are still some of these!) seem to throw lots 
of IOErrors. (I traced this down by noting that certain /24 and /32 IP 
spaces seemed more prevalent in the error tracebacks and tracked down 
those block owners.) [1]

In other words, it's entirely possible that this is caused on the client side. To confirm or rule that out, we're going to need complete access logs / app logs to compare the type of access when these errors occur, i.e. we need to be able to recreate the exact same circumstances.

We're setting up RHEL6 now but will need those logs to continue.

[1] http://groups.google.com/group/django-developers/browse_thread/thread/71cd0bbc76113ac8
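
If the cause does turn out to be clients dropping the connection, one common way to cope is to catch the IOError at the point where the view first touches request.POST. A minimal sketch, assuming a Django-style request object whose .POST attribute triggers the body read; `read_post_or_none` is a hypothetical helper, not something in the markup codebase.

```python
def read_post_or_none(request):
    """Return request.POST, or None if the body could not be read.

    `request` is assumed to behave like a Django HttpRequest, where
    touching .POST triggers the read of the request body.
    """
    try:
        return request.POST
    except IOError:
        # The client most likely went away mid-upload; nothing to save.
        return None
```

In save_mark this would let the view return an error response instead of an ISE when the result is None. The catch also hides genuine server-side read failures, so it is probably worth logging the exception if this route is taken.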
(Assignee)

Comment 17

8 years ago
On hold until [:cshields] can get back to me re. python version
(Assignee)

Comment 18

8 years ago
Python 2.6.6

Comment 19

8 years ago
Corey - are you guys able to get John the logs today?  If not can you let us know when we can expect them?
(Assignee)

Comment 20

8 years ago
We're all set to test..
(Assignee)

Comment 21

8 years ago
Let me know if we want to continue with this. We would need those logs to recreate the exact environment in order to duplicate the error. Thanks
(Reporter)

Comment 22

8 years ago
As I understand the conversation (fwenzel please jump in), this is a spurious error that is bubbled up through the app layer from Django. Bumping to wontfix for two reasons: 1) the failure occurs infrequently, and 2) the fix isn't within the scope of this project.
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → WONTFIX
All right. If it becomes more frequent, we will need to reopen, but until then I am fine with wontfixing this.
(Reporter)

Comment 24

8 years ago
wenzel, thx for commenting on this. We'll reopen if the traceback emails increase in frequency... verifying as wontfix.
Status: RESOLVED → VERIFIED