chief bots & #mdndev: automatically rejoin?

RESOLVED FIXED

Status

Infrastructure & Operations
WebOps: Community Platform
5 years ago
4 years ago

People

(Reporter: groovecoder, Assigned: solarce)

'mdnstagepush' and 'mdnprodpush' bots are missing from #mdndev. Can we make them re-join, and make them automatically re-join after they disconnect?

Comment 1

5 years ago
The bot code is from:

https://github.com/jbalogh/bots

Which uses the npm irc stuff from:

https://node-irc.readthedocs.org/en/latest/

Since the autoRejoin setting defaults to true, I suspect the problem is the error handling, as noted in the docs here:

https://node-irc.readthedocs.org/en/latest/#help-it-keeps-crashing
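
For anyone picking this up, here is a minimal sketch of the failure mode those docs describe. It assumes the bot never binds an 'error' listener (I have not checked the bots source, so that is a guess). It uses a plain Node EventEmitter as a stand-in for the IRC client instead of the real irc module, and emitWithoutListener / emitWithListener are made-up names for illustration only.

```javascript
// Stand-in for the IRC client: node-irc's client is an EventEmitter,
// so a bare EventEmitter is enough to show the behaviour.
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: emit 'error' with no listener bound.
// EventEmitter throws in this case; uncaught, that would kill the bot.
function emitWithoutListener() {
  var client = new EventEmitter();
  try {
    client.emit('error', new Error('simulated server error'));
    return 'survived';
  } catch (err) {
    return 'threw: ' + err.message;
  }
}

// Hypothetical helper: bind an 'error' listener first.
// The same event is now just recorded and nothing is thrown.
function emitWithListener() {
  var client = new EventEmitter();
  var seen = [];
  client.on('error', function (err) {
    seen.push(err.message);
  });
  client.emit('error', new Error('simulated server error'));
  return seen;
}

console.log(emitWithoutListener());
console.log(emitWithListener());
```

In the real bot the equivalent change, per the linked docs, would be binding a listener with client.addListener('error', ...) on the irc client so that server errors get logged instead of taking the process down.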

So..., I guess what is needed is for someone who knows how to program node to fork the above repository, make the enhancements and issue a pull request. After that is done we can ping oremj who historically has been the one to turn the github repo into an rpm for consumption[1].

I am unable to code this so we (webops) will be rather unable to move forward with this request until someone steps up with a patch.


[1] From puppet/modules/webapp/<cluster>/pushbots.pp:
    package {
        "nodejs-pushbots":
            ensure => latest,
            notify => Service["${name_prefix}-pushbots"];
    }

Comment 2

5 years ago
Nothing we can do here. Perhaps you can open a github issue, or even a pull request if you have time to figure it out and fix it yourself? Sorry. :(

CC'ing oremj and jthomas, who might have more insight. This bot originated from jbalogh, on the team they support... they might be able to help a bit.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → INCOMPLETE
Sorry to bug-spam, but we lost the bots from #mdndev again. If we can't make the bots automatically re-join, can MDN devs have access to whatever tool webops uses to restart them or make them join #mdndev?
(Assignee)

Comment 4

5 years ago
I restarted the service

[root@developeradm.private.scl3 ~]# service mdn-pushbots status
mdn-pushbots                     RUNNING    pid 2069, uptime 33 days, 17:04:42
[root@developeradm.private.scl3 ~]# service mdn-pushbots restart
Restarting mdn-pushbots: mdn-pushbots: stopped
mdn-pushbots: started
                                                           [  OK  ]
As :jd outlined in c1, we could use some dev resources to improve the reliability of pushbots

I've given those of you with SSH access sudo privileges so you can restart the mdn-pushbots service now, in case it dies again.

You should be able to SSH to developeradm.private.scl3.mozilla.com from MPT-VPN. Let me know if you can't.
sudo asked for my sudo password and my LDAP password didn't work?
(Assignee)

Comment 6

5 years ago
(In reply to Luke Crouch [:groovecoder] from comment #5)
> sudo asked for my sudo password and my LDAP password didn't work?

It should be passwordless; I forgot the correct way to make this change. I've updated the configs and now it works:

Info: Applying configuration version '67147'
Notice: /Stage[main]/Sudoers::Users/File[/etc/sudoers.d/users]/ensure: defined content as '{md5}4835ef7bb05febc9da42b14013a423ac'
Notice: Finished catalog run in 41.38 seconds
[root@developeradm.private.scl3 ~]# su - lorchard
[lorchard@developeradm.private.scl3 ~]$ sudo ls /root/
anaconda-ks.cfg  another-kuma-migration-20120710  another-kuma-migration-20120711  by-hand-output  cshieldstmp	install.log  install.log.syslog  keys  post_install.log
[lorchard@developeradm.private.scl3 ~]$
got it, thanks.
I just tried to restart the pushbots but got a "no such file" error?

[lcrouch@developeradm.private.scl3 ~]$ sudo service mdn-pushbots restart
Restarting mdn-pushbots: mdn-pushbots: ERROR (not running)
mdn-pushbots: ERROR (no such file)
                                                           [  OK  ]
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
(Assignee)

Comment 9

4 years ago
(In reply to Luke Crouch [:groovecoder] from comment #8)
> I just tried to restart the pushbots but got a "no such file" error?
> 
> [lcrouch@developeradm.private.scl3 ~]$ sudo service mdn-pushbots restart
> Restarting mdn-pushbots: mdn-pushbots: ERROR (not running)
> mdn-pushbots: ERROR (no such file)
>                                                            [  OK  ]

The problem is actually a node.js package upgrade.

[root@developeradm.private.scl3 ~]# service mdn-pushbots start
Starting mdn-pushbots: mdn-pushbots: ERROR (no such file)
                                                           [  OK  ]
[root@developeradm.private.scl3 ~]# service mdn-pushbots status
mdn-pushbots                     FATAL      can't find command '/usr/bin/nodejs'

The name of the binary in the package changed from /usr/bin/nodejs to /usr/bin/node

We'd dealt with this by making a symlink, since a number of things we manage assume the older path, but unfortunately it looks like we missed this admin node when rolling out that fix.

I am fixing it now.
Assignee: server-ops-webops → bburton
Status: REOPENED → ASSIGNED
Component: Server Operations: Web Operations → WebOps: Community Platform
Product: mozilla.org → Infrastructure & Operations
(Assignee)

Comment 10

4 years ago
-> % svn diff modules/webapp/manifests/admin/developer.pp
Index: modules/webapp/manifests/admin/developer.pp
===================================================================
--- modules/webapp/manifests/admin/developer.pp	(revision 73882)
+++ modules/webapp/manifests/admin/developer.pp	(working copy)
@@ -31,6 +31,13 @@
             ensure => present;
     }

+    file {
+      '/usr/bin/node':
+        ensure => link,
+        target => "/usr/bin/nodejs",
+    }
+
+
     include supervisord::base
     include webapp::developer::pushbots
     include gunicorn

bburton@althalus [01:08:50] [~/code/mozilla/sysadmins/puppet/trunk]
-> % svn ci -m "symlink to older path for nodejs works for mdn pushbots, bug 865669" modules/webapp/manifests/admin/developer.pp
Sending        modules/webapp/manifests/admin/developer.pp
Transmitting file data .
Committed revision 73884.
(Assignee)

Comment 11

4 years ago
Deployed and working

Info: Caching catalog for developeradm.private.scl3.mozilla.com
Info: Applying configuration version '73891'
Notice: /Stage[main]/Webapp::Admin::Developer/File[/usr/bin/nodejs]/ensure: created
Notice: /Stage[main]/Webapp::Developer::Pushbots/Package[nodejs-pushbots]/ensure: created 

[root@developeradm.private.scl3 ~]# /usr/bin/nodejs -v
v0.10.14
[root@developeradm.private.scl3 ~]# /usr/bin/node -v
v0.10.14
[root@developeradm.private.scl3 ~]# service mdn-pushbots status
mdn-pushbots                     RUNNING    pid 16805, uptime 0:00:55
[root@developeradm.private.scl3 ~]# service mdn-pushbots stop
Stopping mdn-pushbots: mdn-pushbots: stopped
                                                           [  OK  ]
[root@developeradm.private.scl3 ~]# service mdn-pushbots start
Starting mdn-pushbots: mdn-pushbots: started
                                                           [  OK  ]
[root@developeradm.private.scl3 ~]# service mdn-pushbots status
mdn-pushbots                     RUNNING    pid 17238, uptime 0:01:00

Sorry for the breakage. https://wiki.mozilla.org/Websites/Captain_Shove should help improve the situation over the next couple of months.
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago → 4 years ago
Resolution: --- → FIXED