Closed
Bug 781451
Opened 12 years ago
Closed 12 years ago
Log greenlets that block the event loop for too long
Categories
(Cloud Services :: Server: Core, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: rfkelly, Assigned: rfkelly)
Details
(Whiteboard: [qa+])
Attachments
(3 files)
4.24 KB,
patch
|
rmiller
:
review+
|
Details | Diff | Splinter Review |
894 bytes,
patch
|
rmiller
:
review+
|
Details | Diff | Splinter Review |
2.66 KB,
patch
|
rmiller
:
review+
|
Details | Diff | Splinter Review |
This patch uses the new trace functionality in greenlet==0.4.0 to implement a very simple blocking-greenlets detector. It sets a function to be called on every greenlet switch, and just keeps track of the time beteen each switch. If too much time has elapsed, then it logs an error with a stack trace of the offending greenlet.
I've implemented it against server-core so that it would be enabled by default on all products.
Attachment #650462 -
Flags: review?(rmiller)
Assignee | ||
Comment 1•12 years ago
|
||
Also need to update deps to greenlet 0.4
Attachment #650463 -
Flags: review?(rmiller)
Assignee | ||
Comment 2•12 years ago
|
||
I should note that this seems to work correctly in my local tests, and the logic seems sound to me, but I won't be terribly surprised if some edge-case means it gives false positives on stage. If it does so, the checking can be disabled by setting the GEVENT_MAX_BLOCKING_TIME environment variable to zero.
Comment 3•12 years ago
|
||
Comment on attachment 650462 [details] [diff] [review]
helper to detect blocking calls in gevent
Review of attachment 650462 [details] [diff] [review]:
-----------------------------------------------------------------
This is a bit more import-time-side-effect'y than I usually like, but as it's server-core it's probably not worth a lot of effort to wire it up to the app initialization code.
Attachment #650462 -
Flags: review?(rmiller) → review+
Updated•12 years ago
|
Attachment #650463 -
Flags: review?(rmiller) → review+
Assignee | ||
Comment 4•12 years ago
|
||
(In reply to Rob Miller [:RaFromBRC :rmiller] from comment #3)
> Comment on attachment 650462 [details] [diff] [review]
> helper to detect blocking calls in gevent
>
> Review of attachment 650462 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> This is a bit more import-time-side-effect'y than I usually like, but as
> it's server-core it's probably not worth a lot of effort to wire it up to
> the app initialization code.
Yeah, it is a bit icky like that, but no more so than the import-time gevent monkey-patching we do elsewhere IMO. One other option is to put things like this into a custom gunicorn worker class, so they are only apply at execution time to the processes that need them.
This was committed in http://hg.mozilla.org/services/server-core/rev/8284b086fbd5, I am leaving the bug open while we see if it actually works...
Updated•12 years ago
|
Whiteboard: [qa+]
Assignee | ||
Comment 5•12 years ago
|
||
After trying this out on stage, I've made some adjustments to how it works.
First, I have removed the traceback from the original error logs. The error is only logged when the code eventually yields back to gevent, which probably happens in some unrelated piece of well-behaved code. The traceback was thus rather useless in tracking down the problem.
To go with this, I've added a check_for_gevent_blocking() context manager. This can be used to determine whether a particular piece of code actually yields to gevent, and log an error if it does not. The error message from this one includes a traceback because it will actually point to somewhere near the code that is at fault.
Assignee: nobody → rfkelly
Attachment #650756 -
Flags: review?(rmiller)
Updated•12 years ago
|
Attachment #650756 -
Flags: review?(rmiller) → review+
Assignee | ||
Comment 6•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•