Closed
Bug 1087343
Opened 10 years ago
Closed 10 years ago
Treeherder stopped ingesting pushes after the production deploy
Categories
(Tree Management :: Treeherder: Infrastructure, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cbook, Assigned: fubar)
Somehow Treeherder is stuck at https://treeherder.mozilla.org/ui/#/jobs?repo=mozilla-inbound&revision=6c094e2b6e57 when it should show https://hg.mozilla.org/integration/mozilla-inbound/rev/dedc769ea8a8. I wonder if this is the pushlog problem we have seen before?
Comment 1•10 years ago
The trees are also closed now because of this problem.
Comment 2•10 years ago
15:15 <•mdoglio> edmorley: the etltasks stopped working 15 minutes ago, I guess because of the deployment
15:16 <•mdoglio> fubar: hey there, can you please have a look at the celery worker on the etl nodes?
15:17 <fubar> mdoglio: processes are running
15:17 <fubar> [treeherder-etl1.private.scl3.mozilla.com] out: celery RUNNING pid 2013, uptime 0:17:37
15:17 <fubar> [treeherder-etl2.private.scl3.mozilla.com] out: celery RUNNING pid 22710, uptime 0:18:51
15:17 <fubar> [treeherder-etl1.private.scl3.mozilla.com] out: 495 2034 2013 0 13:59 ? 00:00:04 /usr/bin/python /usr/bin/celery -A treeherder worker -c 3 -Q default -E --maxtasksperchild=500 --logfile=/var/log/celery/celery_worker.log -l INFO -n default.%h
15:18 <•mdoglio> fubar: I'm on etl and I see the worker is started with the wrong script
15:19 <•mdoglio> s/etl/etl1
15:20 <•edmorley> mdoglio: fubar: this is the first deploy after bug 1086934 I guess?
15:20 <firebot> https://bugzil.la/1086934 — FIXED, klibby@mozilla.com — Production's commander_settings.py is missing treeherder-etl from CELERY_HOSTGROUP
15:21 <•mdoglio> fubar: this is the supervisord conf for the etl nodes https://github.com/mozilla/treeherder-service/blob/master/deployment/supervisord/etl_node.conf
15:22 <fubar> mdoglio: I think we missed a step between building the etl nodes and actually using those queues
15:23 <•mdoglio> oh okey
15:23 — fubar adds those to puppet
15:23 <•mdoglio> thanks fubar
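For anyone checking this by hand later: the process listing in the log shows the worker on etl1 started with `-Q default`, which is what gave away the wrong script. A minimal Python sketch of that check follows. It assumes the queues are passed with celery's `-Q`/`--queues` option; the helper name `worker_queues` is made up for illustration and is not part of Treeherder.

```python
import shlex


def worker_queues(cmdline):
    """Return the queue names given via -Q/--queues in a celery worker command line.

    Returns an empty list when no queue option is present.
    """
    args = shlex.split(cmdline)
    for i, arg in enumerate(args):
        if arg in ("-Q", "--queues") and i + 1 < len(args):
            return args[i + 1].split(",")
        if arg.startswith("--queues="):
            return arg.split("=", 1)[1].split(",")
    return []


# Command line copied from the process listing on treeherder-etl1 above.
cmd = (
    "/usr/bin/python /usr/bin/celery -A treeherder worker -c 3 -Q default -E "
    "--maxtasksperchild=500 --logfile=/var/log/celery/celery_worker.log "
    "-l INFO -n default.%h"
)

queues = worker_queues(cmd)
print(queues)
```

On an ETL node one would expect the etl queues to show up here rather than only `default`; which exact queue names belong there is defined by the etl_node.conf linked in the log, not by this sketch.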
Summary: Treeherder missing https://hg.mozilla.org/integration/mozilla-inbound/rev/dedc769ea8a8 and following → Treeherder has stopped ingesting pushes
Updated•10 years ago
Summary: Treeherder has stopped ingesting pushes → Treeherder stopped ingesting pushes after the production deploy
Comment 3•10 years ago
The pushlog and buildapi ingestion services are running now. Thanks fubar!
Updated•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
Assignee: nobody → klibby
Comment 4•10 years ago
This was presumably a combination of:
1) The recent change to unplug the default celery worker from the etl (buildapi,pushlog) queues (https://github.com/mozilla/treeherder-service/commit/5df6bd4212778425adff0e743dd9a09c63bb01c4).
2) The recent fix to the Chief deploy script, which had not been including the new ETL nodes in the machines it was updating (bug 1086934).
3) The ETL nodes using the wrong supervisord conf. This wasn't noticed until now because, before #1, the default worker was doing the ETL work, and the deploy script problem in #2 meant that even after #1 landed, the ETL workers didn't get the new code until the first deploy after #2 was fixed (which was the deploy 30 mins ago).
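The three factors above boil down to a queue ending up with no worker consuming it. A rough Python sketch of a post-deploy sanity check for that situation is below. The worker labels and the before/after queue assignments are illustrative assumptions reconstructed from this bug, not data taken from the real configs, and `unconsumed_queues` is a hypothetical helper.

```python
def unconsumed_queues(required, workers):
    """Return the required queues that no worker in `workers` consumes, sorted."""
    consumed = set()
    for queues in workers.values():
        consumed.update(queues)
    return sorted(set(required) - consumed)


# Queue names as mentioned in this bug.
required = ["default", "buildapi", "pushlog"]

# Assumed state before change #1: the default worker also served the etl queues,
# which hid the fact that the ETL nodes were running the wrong conf.
before = {
    "default worker": ["default", "buildapi", "pushlog"],
    "etl1 worker": ["default"],
    "etl2 worker": ["default"],
}

# Assumed state after the deploy: the default worker is unplugged from the
# etl queues, and the ETL nodes still only listen on "default".
after = {
    "default worker": ["default"],
    "etl1 worker": ["default"],
    "etl2 worker": ["default"],
}

print(unconsumed_queues(required, before))
print(unconsumed_queues(required, after))
```

Running a check of this kind after each deploy, fed from the actual worker command lines, would flag the gap before pushes stop being ingested.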