Closed Bug 1311847 Opened 9 years ago Closed 9 years ago

It looks like the ETL ingestion stopped on both the private and public clusters around Oct 7th

Categories

(bugzilla.mozilla.org :: Infrastructure, defect)

Production
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: ekyle, Assigned: dhouse)

References

(Blocks 1 open bug)

Details

No description provided.
1. I confirmed from /var/log/cron that cron continues to try running the script.
2. I tried running the script directly and I get an error about pymysql.

:ekyle requests that I upgrade pymysql.
By setting the proxy in the env, I was able to upgrade PyMySQL to 0.7.9. I'm watching the next run of the cron to see if the problem is fixed.
The same failure "'SocketIO' object has no attribute '_checkClosed'" appears with the new PyMySQL:

2016-10-21 20:52:40.359106 - ERROR: Can not start
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 490, in start
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 496, in <module>
    at File app_main.py, line 72, in run_toplevel
    at File app_main.py, line 642, in run_command_line
    at File app_main.py, line 712, in entry_point
caused by ERROR: Problem with main ETL loop
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 399, in main
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 488, in start
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 496, in <module>
    at File app_main.py, line 72, in run_toplevel
    at File app_main.py, line 642, in run_command_line
    at File app_main.py, line 712, in entry_point
caused by ERROR: Failure to connect
    at File /data/www/Bugzilla-ETL/bzETL/util/sql/db.py, line 92, in _open
    at File /data/www/Bugzilla-ETL/bzETL/util/sql/db.py, line 76, in __init__
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 365, in main
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 488, in start
    at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 496, in <module>
    at File app_main.py, line 72, in run_toplevel
    at File app_main.py, line 642, in run_command_line
    at File app_main.py, line 712, in entry_point
caused by ERROR: (2003, "Can't connect to MySQL server on u'db-bugs-ro' ('SocketIO' object has no attribute '_checkClosed')")
    at File /usr/lib/python2.7/site-packages/pymysql/connections.py, line 818, in _connect
    at File /usr/lib/python2.7/site-packages/pymysql/connections.py, line 634, in __init__
    at File /usr/lib/python2.7/site-packages/pymysql/__init__.py, line 88, in Connect
    at File /data/www/Bugzilla-ETL/bzETL/util/sql/db.py, line 88, in _open
:ekyle, is "db-bugs-ro" correct? I'm guessing the "ro" could mean read-only and might not be wanted.
Flags: needinfo?(klahnakoski)
The settings are entirely under your control. I do not have access to the machine, nor do I know about its environment. I am guessing this is an ACL issue? Is db-bugs-ro defined in hosts?
Flags: needinfo?(klahnakoski)
db-bugs-ro is correct; esfrontline/etl is only pulling data from the db, and should never be able to write. it's not a network acl:

etl1.bugs.scl3# nc -vz db-bugs-ro 3306
Connection to db-bugs-ro 3306 port [tcp/mysql] succeeded!

(In reply to Dave House [:dhouse] from comment #2)
> By setting the proxy in the env, I was able to upgrade PyMySQL to 0.7.9
> I'm watching the next run of the cron to see if the problem is fixed.

looks like this is a python version issue. use pip-2.7 to upgrade and it should fix. the etl cron jobs all run via pypy, which is based on python-2.7.3, so the system has both python26 and python27 installed. there's really no way to avoid that, which leads to confusion. :-\
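For reference, the nc check above can be approximated from Python, which helps when the question is whether the interpreter itself can open the socket. This is a minimal sketch only: the function name port_open is hypothetical and not part of Bugzilla-ETL, and the host and port in the usage note are the ones quoted in this thread.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Roughly what `nc -vz host port` reports. socket.error covers refused
    connections, timeouts, and name-resolution failures.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        return False
    sock.close()
    return True
```

Calling port_open("db-bugs-ro", 3306) on the ETL box would be expected to return True if the ACLs are fine, which points the investigation at the driver or interpreter rather than the network.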
We found that pypy had been upgraded away from a moz package to a redhat one when I performed the patching. We were able to downgrade pypy to the moz package and the etl cron script is working correctly again.

1. running the script directly with pypy, we got the same failure as the cron did
2. running the script directly with python2.7, the etl script ran without problems
3. /usr/bin/pypy was linked to /usr/lib64/pypy-2.0.2/pypy which had no site-packages installed
4. /usr/lib64/pypy-2.2.1/ existed and had site-packages including PyMySQL, but no pypy binary (it had been uninstalled in the update)
5. pypy-2.2.1 existed as an available moz package
6. "downgraded" pypy to the pypy-2.2.1 moz package from the pypy-2.0.2 redhat package. pypy link was switched
7. confirmed that the cron ran successfully
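The comparison in steps 1 to 4 can be sketched as a small helper that asks each interpreter where it would import a module from. This is an illustration under assumptions, not what was actually run on etl1: the function name module_location is hypothetical, and the interpreter names passed to it are whatever is on the PATH of the box being checked.

```python
import subprocess


def module_location(interpreter, module):
    """Return the file `interpreter` imports `module` from, or None.

    None means the interpreter could not be started or the import failed,
    for example because that interpreter's site-packages lacks the module.
    """
    code = "import {0}; print({0}.__file__)".format(module)
    try:
        proc = subprocess.Popen(
            [interpreter, "-c", code],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError:
        # Interpreter binary is missing or not executable.
        return None
    out, _ = proc.communicate()
    if proc.returncode != 0:
        return None
    return out.decode().strip()
```

Comparing module_location("pypy", "pymysql") with module_location("python2.7", "pymysql") shows which copy of PyMySQL each interpreter picks up, if any, without reading through the ETL traceback.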
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
fubar: re: elasticsearch. ekyle and I found that my update had replaced a custom moz package of pypy with the redhat one. I downgraded, but how/where would I add that to puppet to lock the version for the etl box?

I'm assuming that nothing will automatically update it for us, so we could simply be careful/aware next time: I could check to make sure no packages are moving from a moz package to non-moz.

Also it may be that the etl script will now work with the new redhat pypy package, and that we just need to install the Python packages for that pypy. That may be a simpler long-term solution. What do you think?
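One way to do the "moz to non-moz" check is to diff package vendors before and after patching. The sketch below assumes the moz packages carry a distinct RPM vendor field, which has not been confirmed for this box, and that snapshots are taken with something like rpm -qa --queryformat '%{NAME}\t%{VENDOR}\n'. The function names and the vendor strings in the usage note are hypothetical.

```python
def parse_snapshot(text):
    """Parse lines of 'name<TAB>vendor' into a {name: vendor} dict."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, vendor = line.partition("\t")
        packages[name.strip()] = vendor.strip()
    return packages


def vendor_changes(before, after):
    """Return {name: (old_vendor, new_vendor)} for packages whose vendor changed.

    Packages present in only one snapshot are ignored; only replacements
    of an installed package by another vendor's build are reported.
    """
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

With a pre-patch snapshot listing pypy under a vendor such as "Mozilla" and a post-patch snapshot listing it under "Red Hat, Inc.", vendor_changes would return a single entry for pypy, which is the situation that broke the cron here.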
Flags: needinfo?(klibby)
In manifests/nodes/bugzilla.pp, under the node def for etl1, add this:

util::lock_package {
    'pypy':
        version => '2.2.1',
        epoch   => '0';
}

I'm inclined to pin the version, as the RHEL version is older.
Flags: needinfo?(klibby)