Open Bug 1493575 Opened 6 years ago Updated 6 years ago

hgssh3(and 1).dmz.mdc1.mozilla.com:hg push data aggregator pending lag is CRITICAL

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

defect
Not set
normal

Tracking

(Not tracked)

People

(Reporter: Derek, Unassigned)

Details

Mon 03:58:25 UTC [9631] [Unknown] hgssh3.dmz.mdc1.mozilla.com:hg push data aggregator pending lag is CRITICAL: CRITICAL - 95 messages from 8 partitions behind (http://m.mozilla.org/hg+push+data+aggregator+pending+lag)
8:59 PM Mon 03:59:47 UTC [9634] [Unknown] hgssh1.dmz.mdc1.mozilla.com:hg push data aggregator pending lag is CRITICAL: CRITICAL - 151 messages from 8 partitions behind (http://m.mozilla.org/hg+push+data+aggregator+pending+lag)
9:08 PM Mon 04:08:52 UTC [9640] [Unknown] hgssh1.dmz.mdc1.mozilla.com:procs - hg vcsreplicator aggregator is CRITICAL: PROCS CRITICAL: 1 process with regex args 'vcsreplicator-aggregator' (http://m.mozilla.org/procs+-+hg+vcsreplicator+aggregator)
Here's what that looked like:

rchilds@hgssh1.dmz.mdc1.mozilla.com ~ ❯ sudo systemctl restart pushdataaggregator-pending.service
rchilds@hgssh1.dmz.mdc1.mozilla.com ~ ❯ sudo systemctl status pushdataaggregator-pending.service
● pushdataaggregator-pending.service - Aggregate replicated pending messages to a new topic
   Loaded: loaded (/etc/systemd/system/pushdataaggregator-pending.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2018-09-24 04:17:59 UTC; 6s ago
 Main PID: 16783 (vcsreplicator-a)
   CGroup: /system.slice/pushdataaggregator-pending.service
           └─16783 /var/hg/venv_tools/bin/python2.7 /var/hg/venv_tools/bin/vcsreplicator-aggregator --max-polls 1800 /etc/mercurial/pushdataaggregator-pending.ini

Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying hg-changegroup-2 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying hg-heads-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: vcsreplicator.aggregator copying heartbeat-1 from partition 7
Sep 24 04:18:02 hgssh1.dmz.mdc1.mozilla.com pushdataaggregator-pending[16783]: kafka.producer producer.stop() called, but producer is not async

rchilds@hgssh3.dmz.mdc1.mozilla.com ~ ❯ sudo systemctl restart pushdataaggregator-pending.service
Assertion failed on job for pushdataaggregator-pending.service.
rchilds@hgssh3.dmz.mdc1.mozilla.com ~ ❯ sudo systemctl status pushdataaggregator-pending.service
● pushdataaggregator-pending.service - Aggregate replicated pending messages to a new topic
   Loaded: loaded (/etc/systemd/system/pushdataaggregator-pending.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
   Assert: start assertion failed at Mon 2018-09-24 04:19:44 UTC; 6s ago
           AssertPathExists=/repo/hg/master.hgssh3.dmz.mdc1.mozilla.com was not met

Sep 13 14:38:46 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 18 17:36:18 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 18 17:58:58 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 20 14:49:27 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 20 20:52:20 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 20 21:00:24 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 21 00:08:58 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 24 04:18:36 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
Sep 24 04:19:44 hgssh3.dmz.mdc1.mozilla.com systemd[1]: Assertion failed for Aggregate replicated pending messages to a new topic.
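
The hgssh3 restart above was refused because the unit's AssertPathExists= path was missing, so it may help to check that precondition before restarting. Below is a minimal Python sketch, not part of vcsreplicator or the deployed tooling: the function name `unmet_path_assertions` and the injectable `exists` argument are made up for illustration, and only the plain and "!"-negated forms of the directive are handled.

```python
import os
import re

# Matches "AssertPathExists=<value>" lines, as found in a systemd unit file.
ASSERT_RE = re.compile(r"^\s*AssertPathExists=(\S+)", re.MULTILINE)


def unmet_path_assertions(unit_text, exists=os.path.exists):
    """Return the AssertPathExists values in unit_text that would fail.

    `exists` is injectable so the check can be exercised without touching
    the real filesystem (a hypothetical convenience, not a systemd feature).
    """
    unmet = []
    for value in ASSERT_RE.findall(unit_text):
        negated = value.startswith("!")
        path = value[1:] if negated else value
        # A plain assertion fails when the path is missing;
        # a "!"-negated one fails when the path is present.
        if exists(path) == negated:
            unmet.append(value)
    return unmet
```

Feeding it the text of /etc/systemd/system/pushdataaggregator-pending.service on hgssh3 would presumably report /repo/hg/master.hgssh3.dmz.mdc1.mozilla.com, matching the "was not met" line in the status output.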
00:18:14 <@nagios-mdc1> Mon 04:18:14 UTC [9643] [Unknown] hgssh1.dmz.mdc1.mozilla.com:procs - hg vcsreplicator aggregator is OK: PROCS OK: 2 processes with regex args 'vcsreplicator-aggregator' (http://m.mozilla.org/procs+-+hg+vcsreplicator+aggregator)
00:18:14 <@nagios-mdc1> Mon 04:18:14 UTC [9646] [Unknown] hgssh1.dmz.mdc1.mozilla.com:hg push data aggregator pending lag is OK: OK - aggregator has copied all fully replicated messages (http://m.mozilla.org/hg+push+data+aggregator+pending+lag)
00:19:48 <@nagios-mdc1> Mon 04:19:47 UTC [9649] [Unknown] hgssh3.dmz.mdc1.mozilla.com:hg push data aggregator pending lag is OK: OK - 8 messages from 8 partitions behind (http://m.mozilla.org/hg+push+data+aggregator+pending+lag)
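
For comparing the lag before and after the restart, the counts can be pulled out of the alert text. This is a rough Python sketch that assumes the "N messages from M partitions behind" wording seen in the alerts above; the name `parse_lag` is hypothetical, and WARNING is included only on the assumption that the check can report that state (none of the alerts here do).

```python
import re

# Pattern based on the alert wording quoted in this bug, e.g.
# "CRITICAL - 95 messages from 8 partitions behind".
LAG_RE = re.compile(
    r"(?P<state>OK|WARNING|CRITICAL) - "
    r"(?P<messages>\d+) messages from (?P<partitions>\d+) partitions behind"
)


def parse_lag(alert):
    """Return (state, messages, partitions), or None if no counts are present."""
    match = LAG_RE.search(alert)
    if match is None:
        return None
    return (
        match.group("state"),
        int(match.group("messages")),
        int(match.group("partitions")),
    )
```

Applied to the alerts in this bug, hgssh3 goes from 95 messages behind at 03:58:25 to 8 at 04:19:47. The hgssh1 recovery line carries no counts ("aggregator has copied all fully replicated messages"), so the sketch returns None for it.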