Getting broken pipe running talos in --develop mode

RESOLVED FIXED

Status

Testing
Mozbase
P2
normal
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: Jeff Hammel, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [mozharness+talos])

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

6 years ago
Running:

 python PerfConfigurator.py -a tsvg -e `which firefox` --develop
 python run_tests.py 20111209_1441_config.yml

I get:

RETURN:s: qm-pxp01
RETURN:id:20111209031218
RETURN:<a href =
"http://hg.mozilla.org/mozilla-central/rev/9e7239c0f557">rev:9e7239c0f557</a>
qm-pxp01: 
                Started Fri, 09 Dec 2011 14:47:08
Running test tsvg: 
             Started Fri, 09 Dec 2011 14:47:08
----------------------------------------
Exception happened during processing of request from ('127.0.0.1',
57986)
Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 582, in
  process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 323, in
  finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 641, in __init__
    self.finish()
  File "/usr/lib/python2.7/SocketServer.py", line 694, in finish
    self.wfile.flush()
  File "/usr/lib/python2.7/socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 32] Broken pipe
----------------------------------------
        Screen width/height:1600/900
        colorDepth:24
        Browser inner width/height: 1024/609

Completed test tsvg: 
               Stopped Fri, 09 Dec 2011 14:52:15
RETURN: cycle time: 00:05:06<br>
qm-pxp01: 
                Stopped Fri, 09 Dec 2011 14:52:15

I haven't had a chance to diagnose this.  After the broken pipe tests
do seem to run, so I'm not sure what the issue is.  Anyone see this
before?  I can dive in and debug, but don't plan on anytime soon
unless this is a high priority (mostly recording for posterity and to
see if anyone else has had the same error).
I see this all the time.  It is that we have an open connection to the server and we terminate the client in the middle of it (or so my theory is).
(Reporter)

Comment 2

6 years ago
Good to know its not just me.  I think your theory is likely correct, :jmaher.  I wonder if we should just catch on Errno 32 here or if that's too heavy handed.
Summary: Getting broken pipe running talso in --develop mode with tsvg → Getting broken pipe running talos in --develop mode with tsvg
(Reporter)

Updated

6 years ago
Summary: Getting broken pipe running talos in --develop mode with tsvg → Getting broken pipe running talos in --develop mode
(Reporter)

Updated

6 years ago
Blocks: 713055
(Reporter)

Updated

6 years ago
Priority: -- → P2
Whiteboard: [mozharness+talos]
This can be avoided using a try-except block in mozhttpd module though I doubt about its necessity since a try block will definitely introduce some overhead. If you say you're okay with this, I'll start working on the bug.
I say lets go ahead and do this.  A perf hit for a --develop mode feature is ok.  Mozhttpd is used for other tools, so maybe others might disagree.
(In reply to Joel Maher (:jmaher) from comment #4)
> I say lets go ahead and do this.  A perf hit for a --develop mode feature is
> ok.  Mozhttpd is used for other tools, so maybe others might disagree.

Yes, this seems to be the right thing to do. See:

http://stackoverflow.com/questions/180095/how-to-handle-a-broken-pipe-sigpipe-in-python
http://stackoverflow.com/questions/6063416/python-basehttpserver-how-do-i-catch-trap-broken-pipe-errors

I don't think there should be much, if any, perf hit. If possible (I'm pretty sure it is), I *would* like to see a unit test which deliberately severs the connection before a request is completed so we can be sure this bug is fixed (and doesn't regress in the future). I just added some preliminary unit tests to mozhttpd that may be used as a model:

https://github.com/mozilla/mozbase/blob/master/mozhttpd/tests/filelisting.py
Created attachment 586380 [details] [diff] [review]
Fix to the "EasyServer" base class for error handling.

I found out that handling the exception in the "RequestHandler" is not enough since the BaseServer class has a "handle_error" method which is triggered even if the exception is handled so I overridden that method which now skips this error(and its Windows variant) and logs others using the standard Python logging mechanism.

I couldn't write a unit test since I couldn't find a proper way to "cancel" the request when it was not finished using urllib2.
Attachment #586380 - Flags: review?(wlachance)
Comment on attachment 586380 [details] [diff] [review]
Fix to the "EasyServer" base class for error handling.

Overall this looks except for some changes that aren't strictly to do with this issue.

>--- mozhttpd/mozhttpd/mozhttpd.py	(date 1325706503000)
>+++ mozhttpd/mozhttpd/mozhttpd.py	(revision )
>@@ -39,23 +39,36 @@

>-import urllib
>-import re

This is a valid change, but not strictly part of the patch. Let's apply this in a seperate commit (I've been meaning to do a "pyflakes" run against mozbase and fix the resulting issues, which would include this one). 

>-            # ignore query string, otherwise SimpleHTTPRequestHandler 
>+            # ignore query string, otherwise SimpleHTTPRequestHandler

Looks like a whitespace change, could you pull this out of your patch?

>     # This produces a LOT of noise
>     def log_message(self, format, *args):
>         pass
>+

Ditto.

>             self.server = threading.Thread(target=self.httpd.serve_forever)
>             self.server.setDaemon(True) # don't hang on exit
>             self.server.start()
>-        
>+

Ditto.

> def main(args=sys.argv[1:]):
>-    
>+
>     # parse command line options
>     from optparse import OptionParser
>     parser = OptionParser()
>-    parser.add_option('-p', '--port', dest='port', 
>+    parser.add_option('-p', '--port', dest='port',
>                       type="int", default=8888,
>                       help="port to run the server on [DEFAULT: %default]")
>     parser.add_option('-H', '--host', dest='host',

Ditto.
Attachment #586380 - Flags: review?(wlachance) → review+
(In reply to Burak Yiğit Kaya [:BYK] from comment #6)

> I couldn't write a unit test since I couldn't find a proper way to "cancel"
> the request when it was not finished using urllib2.

We don't have to block committing this fix on a test, but it should be possible to cancel the request as needed by using a raw socket or something and sending the request manually over the wire (e.g. "GET /").
(In reply to William Lachance (:wlach) from comment #8)
> it should be possible to cancel the request as needed by using a raw socket
> or something and sending the request manually over the wire (e.g. "GET /").

Sounds like an option. Also I checked httplib and it seems to have a similar functionality.
Created attachment 586469 [details] [diff] [review]
Fix to the "EasyServer" base class for error handling.

Removed code formatting changes.
Attachment #586380 - Attachment is obsolete: true
Attachment #586469 - Flags: review?(wlachance)
where are we on this bug?  Do we just need the review and land it, or do we need to get a unit test?
(Reporter)

Comment 12

6 years ago
IMHO, reviewing + landing is sufficient (and possibly staging)
(Reporter)

Updated

6 years ago
Component: Talos → Mozbase
QA Contact: talos → mozbase
Comment on attachment 586469 [details] [diff] [review]
Fix to the "EasyServer" base class for error handling.

Looks good now, thanks!
Attachment #586469 - Flags: review?(wlachance) → review+
(Reporter)

Comment 14

6 years ago
FWIW, that patch is a bit wonky ;)

curl https://bug709349.bugzilla.mozilla.org/attachment.cgi?id=586469 | patch -p1 --dry-run

I'll just copy the code by hand but really the patch format should be correct
(In reply to Jeff Hammel [:jhammel] from comment #14)
> I'll just copy the code by hand but really the patch format should be correct

Which part creates problems? What error do you get when you try to merge it in? I'll try to fix/avoid that next time ;)
(Reporter)

Comment 16

6 years ago
pushed: https://github.com/mozilla/mozbase/commit/9d11662ae8001d591aded336fe199c4337c99232
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Reporter)

Comment 17

6 years ago
(In reply to Burak Yiğit Kaya [:BYK] from comment #15)
> (In reply to Jeff Hammel [:jhammel] from comment #14)
> > I'll just copy the code by hand but really the patch format should be correct
> 
> Which part creates problems? What error do you get when you try to merge it
> in? I'll try to fix/avoid that next time ;)

The very confusing output:

(mozmill)│curl https://bug709349.bugzilla.mozilla.org/attachment.cgi?id=586469 | patch -p1 --dry-run
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6403  100  6403    0     0  11278      0 --:--:-- --:--:-- --:--:-- 14787
can't find file to patch at input line 10
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|Index: mozhttpd/mozhttpd/mozhttpd.py
|IDEA additional info:
|Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
|<+>UTF-8
|Subsystem: com.intellij.openapi.diff.impl.patch.BaseRevisionTextPatchEP
|<+>#!/usr/bin/env python\n\n# ***** BEGIN LICENSE BLOCK *****\n# Version: MPL 1.1/GPL 2.0/LGPL 2.1\n#\n# The contents of this file are subject to the Mozilla Public License Version\n# 1.1 (the \"License\"); you may not use this file except in compliance with\n# the License. You may obtain a copy of the License at\n# http://www.mozilla.org/MPL/\n#\n# Software distributed under the License is distributed on an \"AS IS\" basis,\n# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License\n# for the specific language governing rights and limitations under the\n# License.\n#\n# The Original Code is mozilla.org code.\n#\n# The Initial Developer of the Original Code is\n# the Mozilla Foundation.\n# Portions created by the Initial Developer are Copyright (C) 2011\n# the Initial Developer. All Rights Reserved.\n#\n# Contributor(s):\n#   Joel Maher <joel.maher@gmail.com>\n#\n# Alternatively, the contents of this file may be used under the terms of\n# either the GNU General Public License Version 2 or later (the \"GPL\"), or\n# the GNU Lesser General Public License Version 2.1 or later (the \"LGPL\"),\n# in which case the provisions of the GPL or the LGPL are applicable instead\n# of those above. If you wish to allow use of your version of this file only\n# under the terms of either the GPL or the LGPL, and not to allow others to\n# use your version of this file under the terms of the MPL, indicate your\n# decision by deleting the provisions above and replace them with the notice\n# and other provisions required by the GPL or the LGPL. If you do not delete\n# the provisions above, a recipient may use your version of this file under\n# the terms of any one of the MPL, the GPL or the LGPL.\n#\n# ***** END LICENSE BLOCK *****\n\nimport BaseHTTPServer\nimport SimpleHTTPServer\nimport threading\nimport sys\nimport os\nimport urllib\nimport re\nfrom SocketServer import ThreadingMixIn\n\nclass EasyServer(ThreadingMixIn, BaseHTTPServer.HTTPServer):\n    allow_reuse_address = True\n\nclass MozRequestHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):\n    docroot = os.getcwd() # current working directory at time of import\n\n    def parse_request(self):\n        retval = SimpleHTTPServer.SimpleHTTPRequestHandler.parse_request(self)\n        if '?' in self.path:\n            # ignore query string, otherwise SimpleHTTPRequestHandler \n            # will treat it as PATH_INFO for `translate_path`\n            self.path = self.path.split('?', 1)[0]\n        return retval\n\n    def translate_path(self, path):\n        path = path.strip('/').split()\n        if path == ['']:\n            path = []\n        path.insert(0, self.docroot)\n        return os.path.join(*path)\n\n    # I found on my local network that calls to this were timing out\n    # I believe all of these calls are from log_message\n    def address_string(self):\n        return \"a.b.c.d\"\n\n    # This produces a LOT of noise\n    def log_message(self, format, *args):\n        pass\n\nclass MozHttpd(object):\n\n    def __init__(self, host=\"127.0.0.1\", port=8888, docroot=os.getcwd(), handler_class=MozRequestHandler):\n        self.host = host\n        self.port = int(port)\n        self.docroot = docroot\n        self.httpd = None\n\n        class MozRequestHandlerInstance(handler_class):\n            docroot = self.docroot\n\n        self.handler_class = MozRequestHandlerInstance\n\n    def start(self, block=False):\n        \"\"\"\n        start the server.  If block is True, the call will not return.\n        If block is False, the server will be started on a separate thread that\n        can be terminated by a call to .stop()\n        \"\"\"\n        self.httpd = EasyServer((self.host, self.port), self.handler_class)\n        if block:\n            self.httpd.serve_forever()\n        else:\n            self.server = threading.Thread(target=self.httpd.serve_forever)\n            self.server.setDaemon(True) # don't hang on exit\n            self.server.start()\n        \n    def stop(self):\n        if self.httpd:\n            self.httpd.shutdown()\n        self.httpd = None\n\n    __del__ = stop\n\n\ndef main(args=sys.argv[1:]):\n    \n    # parse command line options\n    from optparse import OptionParser\n    parser = OptionParser()\n    parser.add_option('-p', '--port', dest='port', \n                      type=\"int\", default=8888,\n                      help=\"port to run the server on [DEFAULT: %default]\")\n    parser.add_option('-H', '--host', dest='host',\n                      default='127.0.0.1',\n                      help=\"host [DEFAULT: %default]\")\n    parser.add_option('-d', '--docroot', dest='docroot',\n                      default=os.getcwd(),\n                      help=\"directory to serve files from [DEFAULT: %default]\")\n    options, args = parser.parse_args(args)\n    if args:\n        parser.print_help()\n        parser.exit()\n\n    # create the server\n    kwargs = options.__dict__.copy()\n    test = kwargs.pop('test')\n    server = MozHttpd(**kwargs)\n\n    if test:\n        server.start()\n        server.testServer()\n    else:\n        server.start(block=True)\n\nif __name__ == '__main__':\n    main()\n
|===================================================================
|--- mozhttpd/mozhttpd/mozhttpd.py	(date 1325860736000)
|+++ mozhttpd/mozhttpd/mozhttpd.py	(revision )
--------------------------
File to patch: ^C
Also we had this pull request for this bug: https://github.com/mozilla/mozbase/pull/3
(Reporter)

Comment 19

6 years ago
my version of errno doesn't have one of the codes:

    import mozhttpd
  File "/home/jhammel/mozmill/src/mozbase/mozhttpd/mozhttpd/__init__.py", line 38, in <module>
    from mozhttpd import MozHttpd, MozRequestHandler, main
  File "/home/jhammel/mozmill/src/mozbase/mozhttpd/mozhttpd/mozhttpd.py", line 52, in <module>
    class EasyServer(ThreadingMixIn, BaseHTTPServer.HTTPServer):
  File "/home/jhammel/mozmill/src/mozbase/mozhttpd/mozhttpd/mozhttpd.py", line 54, in EasyServer
    acceptable_errors = (errno.EPIPE, errno.WSAECONNABORTED)
AttributeError: 'module' object has no attribute 'WSAECONNABORTED'
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Reporter)

Comment 20

6 years ago
I do have 

>>> [i for i in dir(errno) if 'ABORT' in i]
['ECONNABORTED']

Is this what is desired?  I'm not sure what works on what python :/ (here I'm on 2.7/linux)

I think for now I will take out this error code and point to this bug until there is more time to investigate
(In reply to Jeff Hammel [:jhammel] from comment #19)
> my version of errno doesn't have one of the codes:
> AttributeError: 'module' object has no attribute 'WSAECONNABORTED'

That is a Windows specific code but I thought it would be included. I'll put that in a try-except block and submit another patch.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago6 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Reporter)

Comment 22

6 years ago
https://github.com/mozilla/mozbase/commit/87ffccdc5cf10eb998d86e17a85e0665ea04dd03
(Reporter)

Comment 23

6 years ago
FWIW, this doesn't seem to fix things :(

----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 56180)
Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 582, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 323, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 641, in __init__
    self.finish()
  File "/usr/lib/python2.7/SocketServer.py", line 694, in finish
    self.wfile.flush()
  File "/usr/lib/python2.7/socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 32] Broken pipe
----------------------------------------

Note: >>> [(i, getattr(errno, i)) for i in dir(errno) if getattr(errno, i) == 32]
[('EPIPE', 32)]

So we are doing something wrong :(
Will give it another shot. In the mean time can you put a command in that "handle_error" to confirm that it gets called?
Here are a few more changes which fix the WSAECONNABORTED error and a possible fix for the last problem :jhammel mentioned: http://github.com/mozilla/mozbase/pull/3/files#diff-0
(Reporter)

Comment 26

6 years ago
Created attachment 589231 [details] [diff] [review]
unbitrot pull request

I haven't been able to reproduce the issue on this computer, but ABICT this is a faithful unbitrot of the github pull request. If/when I do see the issue again, I will comment on this bug. Otherwise, I am content to land the follow-up and close the bug unless/until the issue shows its ugly head again.
Attachment #589231 - Flags: review?(madbyk)
Attachment #589231 - Flags: feedback?(jmaher)
Comment on attachment 589231 [details] [diff] [review]
unbitrot pull request

Review of attachment 589231 [details] [diff] [review]:
-----------------------------------------------------------------

Seems like it doesn't have the line continuators("\") on if statements' comparison parts which causes a syntax error.
Attachment #589231 - Flags: review?(madbyk) → review-
(Reporter)

Comment 28

6 years ago
(In reply to Burak Yiğit Kaya [:BYK] from comment #27)
> Comment on attachment 589231 [details] [diff] [review]
> unbitrot pull request
> 
> Review of attachment 589231 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Seems like it doesn't have the line continuators("\") on if statements'
> comparison parts which causes a syntax error.

I used (very intentionally) parentheses instead of line continuators. I do not get a syntax error.
Comment on attachment 589231 [details] [diff] [review]
unbitrot pull request

Review of attachment 589231 [details] [diff] [review]:
-----------------------------------------------------------------

OH okay, I didn't notice the additional paranthesis. I think they make the code somewhat uglier though since it works, I'll go with an r+ =)
Attachment #589231 - Flags: review- → review+
(Reporter)

Comment 30

6 years ago
pushed: https://github.com/mozilla/mozbase/commit/2719883473110519cdaf963c808985230fe3d91d
(Reporter)

Comment 31

6 years ago
Closing for now; will reopen if i see the symptom again
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago6 years ago
Resolution: --- → FIXED
Attachment #589231 - Flags: feedback?(jmaher)
You need to log in before you can comment on or make changes to this bug.