It looks like the ETL ingestion stopped on both the private and public clusters around Oct 7th

RESOLVED FIXED

Status

bugzilla.mozilla.org
Infrastructure
2 years ago
2 years ago

People

(Reporter: ekyle, Assigned: dhouse)

Tracking

(Blocks: 1 bug)

Production

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Comment 1

2 years ago
1. I confirmed from /var/log/cron that cron continues to try running the script
2. I tried running the script directly and I get an error about pymysql
:ekyle requests that I upgrade pymysql
(Assignee)

Comment 2

2 years ago
By setting the proxy in the env, I was able to upgrade PyMySQL to 0.7.9
I'm watching the next run of the cron to see if the problem is fixed.
(Assignee)

Comment 3

2 years ago
The same failure "'SocketIO' object has no attribute '_checkClosed'" appears with the new PyMySQL:

2016-10-21 20:52:40.359106 - ERROR: Can not start
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 490, in start
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 496, in <module>
	at File app_main.py, line 72, in run_toplevel
	at File app_main.py, line 642, in run_command_line
	at File app_main.py, line 712, in entry_point

caused by
	ERROR: Problem with main ETL loop
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 399, in main
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 488, in start
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 496, in <module>
	at File app_main.py, line 72, in run_toplevel
	at File app_main.py, line 642, in run_command_line
	at File app_main.py, line 712, in entry_point

caused by
	ERROR: Failure to connect
	at File /data/www/Bugzilla-ETL/bzETL/util/sql/db.py, line 92, in _open
	at File /data/www/Bugzilla-ETL/bzETL/util/sql/db.py, line 76, in __init__
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 365, in main
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 488, in start
	at File /data/www/Bugzilla-ETL/bzETL/bz_etl.py, line 496, in <module>
	at File app_main.py, line 72, in run_toplevel
	at File app_main.py, line 642, in run_command_line
	at File app_main.py, line 712, in entry_point

caused by
	ERROR: (2003, "Can't connect to MySQL server on u'db-bugs-ro' ('SocketIO' object has no attribute '_checkClosed')")
	at File /usr/lib/python2.7/site-packages/pymysql/connections.py, line 818, in _connect
	at File /usr/lib/python2.7/site-packages/pymysql/connections.py, line 634, in __init__
	at File /usr/lib/python2.7/site-packages/pymysql/__init__.py, line 88, in Connect
	at File /data/www/Bugzilla-ETL/bzETL/util/sql/db.py, line 88, in _open
(Assignee)

Comment 4

2 years ago
:ekyle is "db-bugs-ro" correct? I'm guessing the "ro" could mean read-only and might not be wanted.
Flags: needinfo?(klahnakoski)
(Reporter)

Comment 5

2 years ago
The settings are entirely under your control.  I do not have access to the machine, nor do I know about its environment.  I am guessing this is an ACL issue?  Is db-bugs-ro defined in hosts?
Flags: needinfo?(klahnakoski)
db-bugs-ro is correct; esfrontline/etl is only pulling data from the db, and should never be able to write. 

it's not a network acl:

etl1.bugs.scl3# nc -vz db-bugs-ro 3306
Connection to db-bugs-ro 3306 port [tcp/mysql] succeeded!
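
The same reachability check can be scripted where nc is not available. A minimal Python sketch, assuming only the host and port shown above; the function name `can_connect` and the 3-second timeout are illustrative choices, not part of the ETL code. Like `nc -vz`, it only shows that the TCP port accepts connections and says nothing about the MySQL handshake or credentials.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    conn.close()
    return True


# Example (hypothetical invocation on the etl host):
#   can_connect("db-bugs-ro", 3306)
```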

(In reply to Dave House [:dhouse] from comment #2)
> By setting the proxy in the env, I was able to upgrade PyMySQL to 0.7.9
> I'm watching the next run of the cron to see if the problem is fixed.

looks like this is a python version issue. use pip-2.7 to upgrade and it should fix.

the etl cron jobs all run via pypy, which is based on python-2.7.3, so the system has both python26 and python27 installed. there's really no way to avoid that, which leads to confusion. :-\
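
With several interpreters on one box, it can help to ask each of them directly what it is able to import. A small sketch intended to run unchanged under python2.7 and pypy; the names `describe_module` and `report` are made up for illustration, and `pymysql` is simply the module named in the traceback above.

```python
import importlib
import sys


def describe_module(name):
    """Report whether `name` imports under the running interpreter, and from where."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return {"found": False, "error": str(exc)}
    return {
        "found": True,
        "path": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
    }


def report(name):
    """Print the interpreter path and what describe_module() found."""
    print("interpreter: %s" % sys.executable)
    print("%s -> %r" % (name, describe_module(name)))


if __name__ == "__main__":
    report("pymysql")
```

Running the file once per interpreter (for example with python2.7 and then with pypy) and comparing the two reports shows which interpreter actually sees the upgraded package.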
(Assignee)

Comment 7

2 years ago
We found that pypy had been upgraded away from a moz package to a redhat one when I performed the patching. We were able to downgrade pypy to the moz package and the etl cron script is working correctly again.

1. running the script directly with pypy, we got the same failure as the cron did
2. running the script directly with python2.7, the etl script ran without problems
3. /usr/bin/pypy was linked to /usr/lib64/pypy-2.0.2/pypy which had no site-packages installed
4. /usr/lib64/pypy-2.2.1/ existed and had site-packages including PyMySQL, but no pypy binary (it had been uninstalled in the update)
5. pypy-2.2.1 existed as an available moz package
6. "downgraded" pypy to the pypy-2.2.1 moz package from the pypy-2.0.2 redhat package. pypy link was switched
7. confirmed that the cron ran successfully
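
Steps 3 and 4 amount to resolving the interpreter symlink and looking at what sits next to the real binary. A rough sketch, assuming site-packages lives in the same directory as the resolved binary (as the paths above suggest); `inspect_interpreter` is a hypothetical helper name.

```python
import os


def inspect_interpreter(link_path):
    """Resolve an interpreter symlink and list the site-packages beside it."""
    target = os.path.realpath(link_path)
    site_packages = os.path.join(os.path.dirname(target), "site-packages")
    if os.path.isdir(site_packages):
        packages = sorted(os.listdir(site_packages))
    else:
        # A missing directory is reported the same way as an empty one.
        packages = []
    return {
        "target": target,
        "target_exists": os.path.exists(target),
        "site_packages": site_packages,
        "packages": packages,
    }


# Example (path taken from the steps above):
#   inspect_interpreter("/usr/bin/pypy")
```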
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
(Assignee)

Comment 8

2 years ago
fubar: re: elasticsearch, ekyle and I found that my update had replaced a custom moz package of pypy with the redhat one. I downgraded, but how/where would I add that to puppet to lock the version for the etl box?

I'm assuming that nothing will automatically update it for us, so we could simply be careful next time: I could check that no packages are moving from a moz package to a non-moz one.

Also, it may be that the etl script would work with the new redhat pypy package and we just need to install the packages for that pypy. That may be a simpler long-term solution. What do you think?
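
The idea of checking that no package moves from a moz build to a non-moz one could be scripted by snapshotting package vendors before and after patching. A sketch under assumptions: the snapshots are taken with `rpm -qa --qf '%{NAME}\t%{VENDOR}\n'`, and the helper names are hypothetical. The thread does not say how the moz packages are labelled, so the vendor strings would need to be checked on the box itself.

```python
def parse_vendors(rpm_output):
    """Parse lines of 'name<TAB>vendor' into a {name: vendor} dict."""
    vendors = {}
    for line in rpm_output.splitlines():
        if not line.strip():
            continue
        name, _, vendor = line.partition("\t")
        vendors[name.strip()] = vendor.strip()
    return vendors


def vendor_changes(before, after):
    """Return {name: (old_vendor, new_vendor)} for packages whose vendor changed."""
    changes = {}
    for name, old_vendor in before.items():
        new_vendor = after.get(name)
        if new_vendor is not None and new_vendor != old_vendor:
            changes[name] = (old_vendor, new_vendor)
    return changes
```

Feeding it the snapshot saved before the update and the one taken after gives a short list to review by hand. Packages installed in several versions under one name collapse to a single entry in this simple form.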
Flags: needinfo?(klibby)
In manifests/nodes/bugzilla.pp, under the node def for etl1, add this:

    util::lock_package {
      'pypy':
        version => '2.2.1',
        epoch   => '0';
    }

I'm inclined to pin the version, as the RHEL version is older.
Flags: needinfo?(klibby)