Closed Bug 1140882 Opened 10 years ago Closed 10 years ago

Use prefork scheduling instead of gevent scheduling for pushlog workers

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(1 file)

54 bytes, text/x-github-pull-request (mdoglio: review+)
In recent Treeherder meetings we have discussed the possibility of dropping gevent entirely in favour of Celery's default prefork scheduling, given (a) the Python 2.7.9 issues and, most importantly, (b) the bad feedback we heard from catlee/rail regarding Celery and gevent. Bug 1123479 already moved the log parser tasks back to prefork, but we should consider doing the same for the pushlog worker: https://github.com/mozilla/treeherder-service/blob/40b01c751d4e2500cb069933ac646ea32249eafc/bin/run_celery_worker_pushlog#L31 and for the socketio runner (not that we're using it at present): https://github.com/mozilla/treeherder-service/blob/70330c5842a51d06a81a2f1f60cc1b1988c0ff02/treeherder/events/run_socketio.py#L10 I'm wondering if this will help with issues like: Bug 1131059 - Determine why there were zombie celery processes on some nodes
I actually chatted with Rail about this on Friday and he didn't seem to feel that there was anything inherently wrong with gevent in particular (aside from it being a kind of evil hack). He did have some broader complaints about how difficult celery is to manage, which seem valid from my limited experience. :) Catlee, could you go into more detail on what you meant when you said that gevent "tends to fall over"? In particular, do you think dropping it might help with the issue Ed pointed at above? http://logs.glob.uno/?c=mozilla%23treeherder&s=3+Mar+2015&e=3+Mar+2015&h=gevent#c36663
Chatted with catlee about this over lunch today. He also didn't think there was anything about gevent that would cause particular problems in our case. Its concurrency model (switch threads on every i/o function!) can be difficult to reason about in some cases. Personally I'd advise switching to prefork scheduling for everything, as it seems like a saner default unless there are really compelling reasons not to use it. But I wouldn't expect doing so to solve any of our problems.
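To make the "difficult to reason about" point concrete, here is a minimal sketch of how cooperative switching at an I/O point can lose an update. It uses the standard library's asyncio rather than gevent (an assumption made so the example runs without extra packages and on modern Python, not the 2.7 of this bug); the difference is that asyncio marks switch points with `await`, whereas gevent switches implicitly inside patched I/O calls, so the same hazard is harder to spot there. The names `counter`, `increment` and `run_demo` are purely illustrative.

```python
import asyncio

counter = 0


async def increment():
    """Read-modify-write with a task switch in the middle."""
    global counter
    seen = counter          # read the shared value
    await asyncio.sleep(0)  # stand-in for an I/O call: other tasks run here
    counter = seen + 1      # write back, based on a possibly stale read


async def main():
    global counter
    counter = 0
    # Both tasks read 0 before either one writes, so one update is lost.
    await asyncio.gather(increment(), increment())
    return counter


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

With prefork, each task runs in its own worker process, so this kind of in-process interleaving does not arise within a single task.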
(In reply to Ed Morley [:edmorley] from comment #0)
> and socketio running (not that we're using it at present):
> https://github.com/mozilla/treeherder-service/blob/70330c5842a51d06a81a2f1f60cc1b1988c0ff02/treeherder/events/run_socketio.py#L10
run_socketio.py has since been removed from the repo, so this just leaves switching the pushlog worker over: https://github.com/mozilla/treeherder-service/blob/master/bin/run_celery_worker_pushlog#L31
This bug is even more relevant now: once it is fixed, we can upgrade from Python 2.7.8 to Python 2.7.9 again, which makes things simpler for th-dev and for our Heroku prototyping (and means we get the Python 2.7.9 security fixes).
Assignee: nobody → emorley
Summary: Decide if we should stop using gevent with Celery → Use prefork scheduling instead of gevent scheduling for pushlog retrieval
Summary: Use prefork scheduling instead of gevent scheduling for pushlog retrieval → Use prefork scheduling instead of gevent scheduling for pushlog workers
Status: NEW → ASSIGNED
Attached file Stop using gevent
Attachment #8591745 - Flags: review?(mdoglio)
Attachment #8591745 - Flags: review?(mdoglio) → review+
Commits pushed to master at https://github.com/mozilla/treeherder-service
https://github.com/mozilla/treeherder-service/commit/19d0b51d2ae162b0f20c2f5be57abc5aea74851f
Bug 1140882 - Use prefork scheduling for pushlog workers
Use prefork scheduling instead of gevent scheduling, to avoid issues we've had with gevent - both with zombie tasks & also incompatibilities with Python 2.7.9.
https://github.com/mozilla/treeherder-service/commit/0c9203555fb3323b4729ef4fde996b47c2d60bd2
Bug 1140882 - Remove gevent & greenlet from requirements
Since they are now unused.
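For readers unfamiliar with Celery, the kind of change described in the first commit comes down to which execution pool the worker is started with. The sketch below is not the contents of run_celery_worker_pushlog; it is a hypothetical helper that assembles a worker command line using Celery's `-A` (app), `-P` (pool), `-c` (concurrency) and `-Q` (queues) options. The app name `treeherder`, the queue name `pushlog` and the concurrency of 3 are assumptions chosen for illustration.

```python
import shlex

# Only the two pools discussed in this bug are accepted by this helper.
KNOWN_POOLS = ("prefork", "gevent")


def celery_worker_argv(pool="prefork", concurrency=3,
                       queue="pushlog", app="treeherder"):
    """Build an argv list for starting a Celery worker with the given pool.

    All default values are illustrative assumptions, not the project's
    real settings.
    """
    if pool not in KNOWN_POOLS:
        raise ValueError("unsupported pool: %r" % (pool,))
    return [
        "celery", "-A", app, "worker",
        "-P", pool,               # execution pool: prefork is Celery's default
        "-c", str(concurrency),   # number of worker processes (or greenlets)
        "-Q", queue,              # only consume from this queue
    ]


def celery_worker_command(**kwargs):
    """Render the argv as a shell-safe string, e.g. for logging."""
    return " ".join(shlex.quote(arg) for arg in celery_worker_argv(**kwargs))


if __name__ == "__main__":
    print(celery_worker_command(pool="gevent"))
    print(celery_worker_command(pool="prefork"))
```

Because prefork is Celery's default pool, omitting `-P` altogether would have the same effect; it is spelled out here only to make the contrast with gevent visible.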
Thanks for the review :-)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED