Bugzilla Elasticsearch Cluster is Stale (has ETL stopped?)

RESOLVED FIXED

Status

bugzilla.mozilla.org
Infrastructure
3 years ago
3 years ago

People

(Reporter: ekyle, Assigned: fubar)

Tracking

(Blocks: 1 bug)

Production
x86_64
Windows 7

Details

(Reporter)

Description

3 years ago
The public and private clusters have stopped receiving updates from the ETL processes. The last update was around Mar 29 @ 14:46 EDT.

Please check that the ETL processes, both public and private, are running. I may require logs if they are.
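
A staleness check like the one described above can be sketched in Python. This is a minimal sketch, assuming the newest modification timestamp has already been read from the cluster; the name `is_stale`, the one-hour threshold, and the year in the sample timestamp are assumptions for illustration and are not taken from Bugzilla-ETL.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_update, now, max_age=timedelta(hours=1)):
    """Return True when the newest record is older than max_age (hypothetical threshold)."""
    return now - last_update > max_age


# EDT is UTC-4. The year is an assumption; the report only gives "Mar 29 @ 14:46 EDT".
EDT = timezone(timedelta(hours=-4))
last_update = datetime(2015, 3, 29, 14, 46, tzinfo=EDT)
```

Calling `is_stale(last_update, datetime.now(timezone.utc))` compares the two timezone-aware values directly, so the check does not depend on the local timezone of the host running it.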

[1] Diagram with bug numbers:  https://github.com/klahnakoski/Bugzilla-ETL/blob/master/docs/Architecture%20%28bug%20879822%29.png
(Reporter)

Updated

3 years ago
Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/194]
(Reporter)

Updated

3 years ago
Flags: needinfo?(klibby)
(Assignee)

Comment 1

3 years ago
There was a hung private process from March 29; I killed it, and the next cron job ran and did stuff. Judging from the logs, the public one looks like it's been running fine.
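
One possible guard against a hung run blocking later cron runs is to put a time limit around the job. The following is only a sketch in Python, assuming the ETL can be launched as a child process; `run_with_timeout` and `SLEEPER` are hypothetical names, and `SLEEPER` is a stand-in for a wedged process, not the real ETL command.

```python
import subprocess
import sys

# Hypothetical stand-in for a wedged job: a child process that just sleeps.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(5)"]


def run_with_timeout(command, timeout_seconds):
    """Run command; return True if it exited in time, False if it was killed."""
    try:
        subprocess.run(command, timeout=timeout_seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before raising.
        return False
```

With this wrapper, `run_with_timeout(SLEEPER, 0.5)` gives up after half a second instead of leaving the process around for days. A real limit would need to be longer than the slowest legitimate ETL run.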
Flags: needinfo?(klibby)
(Assignee)

Comment 2

3 years ago
And now it looks like the private one may be wedged again?  It's still running, and the logs are showing:

2015-04-08 13:24:49.233966 - Waiting on thread "etl"

2015-04-08 13:24:56.555011 - Waiting on thread "etl"

2015-04-08 13:25:10.284315 - Waiting on thread "etl"

strace doesn't help much:
[pid 23430] futex(0x3113020, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 23482] futex(0x3113020, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 23430] <... futex resumed> )       = 0
[pid 23430] futex(0x3113020, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 23482] futex(0x3113020, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 23430] <... futex resumed> )       = 0
[pid 23482] futex(0x3113020, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 23430] futex(0x3113020, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 23482] <... futex resumed> )       = 0
[pid 23430] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid 23430] futex(0x3113020, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 23482] futex(0x3113020, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 23430] <... futex resumed> )       = 0
[pid 23430] futex(0x3113020, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 23482] futex(0x3113020, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 23430] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid 23482] <... futex resumed> )       = 0
(Reporter)

Comment 3

3 years ago
The ETL should be generating a log file (the exact location is in the setting.json file), which may give more info on what's happening (a problem connecting?).
(Reporter)

Comment 4

3 years ago
Repeated "Waiting on thread \"etl\"" messages with no other log lines in between mean the etl thread is slow (or has failed). Again, the log file should tell us more.
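
To illustrate why those messages repeat, here is a minimal Python sketch of a main thread polling a worker thread. It is not the Bugzilla-ETL implementation; `wait_on_thread` and `demo` are hypothetical names, and the worker simply sleeps to simulate a slow etl thread.

```python
import threading
import time


def wait_on_thread(thread, poll_seconds, log):
    """Block until thread finishes, logging once per poll interval that expires."""
    while thread.is_alive():
        thread.join(timeout=poll_seconds)
        if thread.is_alive():
            # The worker is still busy (or stuck), so report and keep waiting.
            log('Waiting on thread "%s"' % thread.name)


def demo():
    """Simulate a slow worker named "etl" and collect the wait messages."""
    messages = []
    worker = threading.Thread(target=time.sleep, args=(0.35,), name="etl")
    worker.start()
    wait_on_thread(worker, 0.1, messages.append)
    return messages
```

If the worker logged its own progress, those lines would appear between the wait messages. A run of wait messages alone suggests the worker is making no visible progress.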
(Reporter)

Comment 5

3 years ago
The ETL appears to have caught up fine. This bug will be kept open until fubar and I are satisfied that the problem will not recur.
(Reporter)

Comment 6

3 years ago
Looks good!
Assignee: nobody → klibby
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(Reporter)

Updated

3 years ago
Blocks: 1156329