The single-push ingestion command doesn't run all of the ingestion tasks synchronously. Tasks like log parsing must currently be handled by a Celery worker. However, it turns out this worker must be started *before* the tasks are scheduled: unlike normal ingestion, ingest_push runs with CELERY_ALWAYS_EAGER set to True, which makes Celery throw away the jobs if the worker isn't running (rather than persisting them in the RabbitMQ queue). http://treeherder.readthedocs.org/installation.html#ingesting-a-single-push-at-a-time
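For reference, the required ordering looks roughly like this. The exact worker arguments are illustrative (from memory of the linked docs, not verified), and `<changeset>` is a placeholder:

```shell
# Terminal 1: start the Celery worker FIRST (arguments are illustrative).
celery -A treeherder worker --loglevel info

# Terminal 2: only then run the ingestion; any tasks it schedules
# (e.g. log parsing) need the worker to already be running.
./manage.py ingest_push mozilla-central <changeset>
```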
I've filed an issue against Celery, since IMO it shouldn't silently fail *and* throw away the job. It should either raise an exception, or else still put the jobs in the RabbitMQ queue: https://github.com/celery/celery/issues/2910
Created attachment 8683090 [details] [review] Docs: Emphasise starting the worker before ingest_push
Attachment #8683090 - Flags: review?(wlachance)
Comment on attachment 8683090 [details] [review] Docs: Emphasise starting the worker before ingest_push FWIW, I don't think doc updates like this need review. Looks good though!
Attachment #8683090 - Flags: review?(wlachance) → review+
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/ed8498710c3794c94552014e29ea09815f2178e6 Bug 1221536 - Docs: Emphasise starting the worker before ingest_push If the worker is not running, any `apply_async()` calls are silently thrown away, due to `ingest_push`'s use of `CELERY_ALWAYS_EAGER` and: https://github.com/celery/celery/issues/2910 As such, running the worker after ingest_push doesn't help (since the rabbitmq queues are empty) and so if people are interested in perf/log data, then they must start the worker first instead.
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
(In reply to Ed Morley [:emorley] from comment #1) > I've filed an issue against Celery since IMO it shouldn't silently fail > *and* throw away the job. It should either give an exception, or else still > put the jobs in the rabbitmq: > https://github.com/celery/celery/issues/2910 Ah, so I've finally figured this out. Whilst we were setting the "always eager" setting, it only took effect in the management command's process, whereas log ingestion was triggered via the API call to /jobs/, which ran in the gunicorn process and so had no idea "always eager" was set. Now that we no longer submit to our own API, this isn't an issue. I'll file a bug to simplify the docs.
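A minimal sketch of that failure mode, purely illustrative: none of these names are real Treeherder or Celery code, it just models two processes each holding their own copy of the settings, so flipping "always eager" in one leaves the other unchanged.

```python
class ProcessSettings:
    """Hypothetical stand-in for per-process Django/Celery settings."""
    def __init__(self):
        self.always_eager = False

def dispatch(settings, task, queue):
    """Run the task inline when eager; otherwise enqueue it for a worker."""
    if settings.always_eager:
        return task()
    queue.append(task)
    return None

# Each process sees only its own settings object.
management_cmd = ProcessSettings()
gunicorn = ProcessSettings()

# The ingest_push management command flips its own copy...
management_cmd.always_eager = True

broker_queue = []
# ...but a task triggered via the /jobs/ API runs under the gunicorn
# settings, which never saw the override, so it is enqueued rather
# than run inline.
result = dispatch(gunicorn, lambda: "log parsed", broker_queue)
print(result, len(broker_queue))  # None 1
```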