Database and web server update for datazilla.mozilla.org

RESOLVED FIXED

Status

task
P3
normal
7 years ago
6 months ago

People

(Reporter: jeads, Assigned: cturra)

Attachments

(1 attachment, 1 obsolete attachment)

Posted file: SQL statements for rollout (obsolete)
We have an updated repository and a variety of database changes that need to be rolled out on datazilla.mozilla.org.
I listed out a set of operations that we need performed on the production database and web server.

1.) Disable the cron jobs in crontab.txt on the server where they run

2.) Update the repository https://github.com/mozilla/datazilla/ 

3.) Install new compiled dependencies in requirements/compiled.txt on the server that runs crontab.txt

    'pip install -r requirements/compiled.txt'

    This will attempt to install the python modules numpy and scipy.  Confirm that they install cleanly 
    before proceeding; there might be some issues on RHEL6.

4.) Execute database rollout.sql script on the master database.
    
5.) Install new crontab.txt

6.) Enable the cron jobs

7.) Restart httpd.  This should be the last step.  We can continue ingesting data objects throughout this process, since they are stored in a separate objectstore that is not affected by any of these database changes.
Hrm....can we have a brief (5-10 minutes) meeting about this? Specifically, I'd like to spell out which machines (web/db) get each step. Also, aren't steps 5 and 6 the same?

Also, is there a timeframe on this?
Sure, a meeting would be great. Is there a time today that would work for you? Vidyo or Skype (jeads3) is fine with me.
Steps 1-3 and 5-6 need to be run on the machine that runs the cron jobs.  I'm not sure how this was configured in production; Chris Turra [:cturra] or maybe Shyam Mani [:fox2mike] would know.  I think the machine that runs the cron jobs might also be the web server, since it needed access to the memcache, but I don't have access to the production environment so I can't confirm how it's configured.

Steps 2 and 7 would also need to be run on the web server. If this is the same machine that runs the crons then 1-3 and 5-7 would all be done on the same machine.

Step 4 would need to be run on the master database.

Steps 5 and 6 may or may not be the same depending on how you enable/disable running cron jobs.  Basically the crontab.txt file in https://github.com/mozilla/datazilla/ needs to be set up as the cron file that gets run after steps 1-4 are complete.

In terms of a time frame, Q3 would be great...
Given that this bug was entered today and we have 1 week left in the quarter we will do the best we can.  However, I will not guarantee this will be done within Q3.

Melissa
Added SQL to remove all corrupted data from stoneridge.
Attachment #663073 - Attachment is obsolete: true
There is one additional step that needs to be added.  The databases xperf_perftest_1 and xperf_objectstore_1 are being dropped in the rollout.sql script.  Once this is done, the memcache will need to be updated.  There is a custom manage command in https://github.com/mozilla/datazilla/ for doing this. 

'python manage.py reset_cached_datasources'

This will remove the stale memcache xperf entries.  This should be done after the rollout.sql script is executed in step 4 and before httpd is restarted in step 7.  Chris Turra [:cturra] might have already incorporated the command in an httpd startup script, but if not, the command would need to be executed explicitly.
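
As a rough sketch of where the extra step falls, the full sequence could be written out as below. The step labels are paraphrased from the list in this bug, and the `ROLLOUT_STEPS` name and the "4a" label are only illustrative, not anything in the datazilla repository. Putting the cache reset directly after step 4 is one possible choice; the only requirement stated is that it runs after step 4 and before step 7.

```python
# Rollout order with the cache reset slotted in. Labels are paraphrased
# from the steps in this bug; the names used here are illustrative only.
ROLLOUT_STEPS = [
    ("1", "disable cron jobs"),
    ("2", "update repository"),
    ("3", "install compiled dependencies"),
    ("4", "run rollout.sql on master database"),
    ("4a", "python manage.py reset_cached_datasources"),
    ("5", "install new crontab.txt"),
    ("6", "enable cron jobs"),
    ("7", "restart httpd"),
]


def position(step_id):
    """Return the index of a step in the rollout order."""
    ids = [sid for sid, _ in ROLLOUT_STEPS]
    return ids.index(step_id)


def cache_reset_is_ordered():
    """True when the cache reset runs after step 4 and before step 7."""
    return position("4") < position("4a") < position("7")


if __name__ == "__main__":
    for sid, label in ROLLOUT_STEPS:
        print(f"{sid:>2}. {label}")
    print("cache reset ordering ok:", cache_reset_is_ordered())
```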
:jeads - since a couple groups (dba's/webops) need to work through this, i think we should update the instructions to note which group should do what so one group is not waiting on the other without knowing the next steps.

from what i can tell, the owners of your initial task list look to be...

1.) webops

2.) webops

3.) webops *note, we do not use pip packages at this point per our security policy. we would need to make sure there is an rpm available for any new requirements. if not, you will need to include it/them as [a] git submodule[s]. 

4.) dba's
    
5.) webops

6.) webops

7.) webops & dba's
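
to make the hand-offs between groups easier to see, here is a small sketch that collapses consecutive steps with the same owner into one block. the owner labels are taken from the list above; the names `STEP_OWNERS` and `handoffs` are made up for illustration.

```python
from itertools import groupby

# Owners as listed above; step 7 is recorded as both groups, as written.
STEP_OWNERS = [
    (1, "webops"),
    (2, "webops"),
    (3, "webops"),
    (4, "dba"),
    (5, "webops"),
    (6, "webops"),
    (7, "webops+dba"),
]


def handoffs(step_owners):
    """Collapse consecutive steps with the same owner into (owner, [steps])."""
    return [
        (owner, [step for step, _ in group])
        for owner, group in groupby(step_owners, key=lambda pair: pair[1])
    ]


if __name__ == "__main__":
    for owner, steps in handoffs(STEP_OWNERS):
        print(owner, steps)
```

each change of owner in the output is a point where one group has to tell the other that it is their turn.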
(In reply to Chris Turra [:cturra] from comment #7)
> :jeads - since a couple groups (dba's/webops) need to work through this, i
> think we should update the instructions to note which group should do what so
> one group is not waiting on the other without knowing the next steps.

Thanks Chris, this helps clarify things.
 
> 3.) webops *note, we do not use pip packages at this point per our security
> policy. we would need to make sure there is an rpm available for any new
> requirements. if not, you will need to include it/them as [a] git
> submodule[s]. 

There are rpms available for numpy and scipy.

 
> 7.) webops & dba's

Not sure if the dba would need to be involved in restarting httpd... I don't
think there are any changes in rollout.sql that require restarting mysql.
Looks like this is mostly webops, and I'll defer to them for schedule. I don't need a call, I understand what's going on now :)
deferring to webops. I'm still point for the db team on this.
Assignee: server-ops-database → server-ops-webops
Component: Server Operations: Database → Server Operations: Web Operations
Priority: -- → P3
Whiteboard: [pending triage]
Any update here?
Josh - sorry for not providing an update on this sooner! i just spoke with :sherri and we're both going to be in sfo next week so will plan to grab a room and get this done for you guys early next week! stay tuned for further updates.
Assignee: server-ops-webops → cturra
Status: NEW → ASSIGNED
Whiteboard: [pending triage]
I changed SET FOREIGN KEY_CHECKS to SET FOREIGN_KEY_CHECKS btw, but other than that, there were these errors:

[root@datazilla1.db.scl3 scabral]# mysql -f < 792912_release.sql 
ERROR 1146 (42S02) at line 3: Table 'stoneridge_perftest_1.test_page_metric' doesn't exist
ERROR 1146 (42S02) at line 514: Table 'datazilla.datasource' doesn't exist
ERROR 1146 (42S02) at line 518: Table 'datazilla.datasource' doesn't exist
[root@datazilla1.db.scl3 scabral]#
datasource is in the datazilla_mozilla_org db, not datazilla. Those were the last 2 statements of the file, so I ran them manually:

MariaDB [datazilla_mozilla_org]> DELETE FROM `datasource` WHERE `project` = 'xperf';
Query OK, 2 rows affected (0.11 sec)

MariaDB [datazilla_mozilla_org]> UPDATE datasource SET cron_batch='small' where project='talos';
Query OK, 2 rows affected (0.09 sec)
Rows matched: 2  Changed: 2  Warnings: 0
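
For anyone re-running a script like this with `mysql -f`, a quick way to see which schema-qualified tables were missing is to parse the error lines. This is only a sketch: the regular expression is written against the three error lines shown above, and the names `ERROR_RE`, `OUTPUT` and `missing_tables` are illustrative.

```python
import re

# Matches the "table doesn't exist" lines printed by the mysql client above.
ERROR_RE = re.compile(
    r"ERROR (?P<code>\d+) \((?P<sqlstate>\w+)\) at line (?P<line>\d+): "
    r"Table '(?P<schema>[^.']+)\.(?P<table>[^']+)' doesn't exist"
)

# The error output copied from this comment.
OUTPUT = """\
ERROR 1146 (42S02) at line 3: Table 'stoneridge_perftest_1.test_page_metric' doesn't exist
ERROR 1146 (42S02) at line 514: Table 'datazilla.datasource' doesn't exist
ERROR 1146 (42S02) at line 518: Table 'datazilla.datasource' doesn't exist
"""


def missing_tables(output):
    """Return (line, schema, table) for each 'table doesn't exist' error."""
    return [
        (int(m["line"]), m["schema"], m["table"])
        for m in ERROR_RE.finditer(output)
    ]


if __name__ == "__main__":
    for line, schema, table in missing_tables(OUTPUT):
        print(f"line {line}: {schema}.{table}")
```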
updates complete per our discussion on #datazilla.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
MariaDB [(none)]> SELECT `processed_flag`, COUNT(`processed_flag`) AS 'processed_flag_count', `error_flag` FROM `talos_objectstore_1`.`objectstore` GROUP BY `processed_flag`, `error_flag`;
+----------------+----------------------+------------+
| processed_flag | processed_flag_count | error_flag |
+----------------+----------------------+------------+
| ready          |                11840 | N          |
| ready          |                    8 | Y          |
| loading        |                   52 | N          |
| complete       |               411611 | N          |
+----------------+----------------------+------------+
4 rows in set (4 min 0.67 sec)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
MariaDB [(none)]> SELECT * FROM `talos_perftest_1`.`application_log`;

| id | revision     | test_run_id | msg_type                 | msg | msg_date   |

|  1 | 1c6b5cae9dc1 |      411421 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349822385 |
|  2 | e2439f189feb |      411429 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349823298 |
|  3 | 1c6b5cae9dc1 |      411437 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349824008 |
|  4 | 1c6b5cae9dc1 |      411478 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349825096 |
|  5 | 1c6b5cae9dc1 |      411484 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349827266 |
|  6 | bb0e7efed1bd |      411498 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349828782 |
|  7 | ec34a79837f6 |      411516 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349829919 |
|  8 | 1c6b5cae9dc1 |      411519 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349831537 |
|  9 | 544e994dc2b7 |      411528 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349836106 |
| 10 | d4425bce8b09 |      411531 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349836524 |
| 11 | bb0e7efed1bd |      411545 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349841089 |
| 12 | bb0e7efed1bd |      411559 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349842737 |
| 13 | ec34a79837f6 |      411587 | compute_test_run_metrics | Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 82, in compute_test_run_metrics
    test_name, debug
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/metrics/perftest_metrics.py", line 327, in _run_metrics
    child_test_data[mkey]['values']
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 850, in get_parent_test_data
    data = self.get_test_values_by_revision(revision)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/metrics.py", line 162, in get_test_values_by_revision
    return_type='tuple',
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 307, in __execute
    self.connection[host_type]['con_obj'].commit()
OperationalError: (2006, 'MySQL server has gone away')

Test type: Talos tp5n
 Exception Name: OperationalError: (2006, 'MySQL server has gone away') | 1349846962 |

13 rows in set (0.00 sec)
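
To get a feel for how long these errors went on, the msg_date values can be converted to timestamps. This sketch assumes msg_date holds Unix epoch seconds, which the output above does not state; the two constants are the first and last msg_date values from the rows above, and the function names are illustrative.

```python
from datetime import datetime, timezone

# msg_date of the first and last rows above, assumed to be Unix epoch seconds.
FIRST_MSG_DATE = 1349822385
LAST_MSG_DATE = 1349846962


def to_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def span(first, last):
    """Return the elapsed time between two epoch values as a timedelta."""
    return to_utc(last) - to_utc(first)


if __name__ == "__main__":
    print("first error:", to_utc(FIRST_MSG_DATE).isoformat())
    print("last error: ", to_utc(LAST_MSG_DATE).isoformat())
    print("elapsed:    ", span(FIRST_MSG_DATE, LAST_MSG_DATE))
```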
MariaDB [(none)]> SELECT COUNT(`product_id`) FROM `talos_perftest_1`.`metric_threshold`;
+---------------------+
| COUNT(`product_id`) |
+---------------------+
|                 187 |
+---------------------+
1 row in set (0.03 sec)
I was able to identify the source of the resource consumption problem and have fixed the offending query.  The fix is now in the datazilla repository.  Please update the repository in production and test the command:

$DATAZILLA_HOME/manage.py process_objects --cron_batch small --loadlimit 15


This should run much faster, consume less memory, and "query killer" should not get invoked.  Thanks for all of the help.
(In reply to Jonathan Eads ( :jeads ) from comment #19)
> 
> $DATAZILLA_HOME/manage.py process_objects --cron_batch small --loadlimit 15

looks like we're getting a mysql error with this run (after updating)...

Starting for projects: talos, b2g, stoneridge, jetperf, test
Processing project talos
Traceback (most recent call last):
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/manage.py", line 13, in <module>
    execute_from_command_line(sys.argv)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/django/core/management/__init__.py", line 429, in execute_from_command_line
    utility.execute()
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/django/core/management/__init__.py", line 379, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/django/core/management/base.py", line 191, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/django/core/management/base.py", line 351, in handle
    return self.handle_noargs(**options)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/management/commands/base.py", line 137, in handle_noargs
    self.handle_project(p, **options)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/controller/admin/management/commands/process_objects.py", line 53, in handle_project
    test_run_ids = ptm.process_objects(loadlimit)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/base.py", line 1043, in process_objects
    rows = self.claim_objects(loadlimit)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/datazilla/model/base.py", line 1085, in claim_objects
    debug_show=self.DEBUG,
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/RDBSHub.py", line 71, in wrapper
    return func(self, **kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 136, in execute
    return self.__execute(sql, kwargs)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 291, in __execute
    tmsg = t.timeit(1)
  File "/usr/lib64/python2.6/timeit.py", line 193, in timeit
    timing = self.inner(it, self.timer)
  File "/usr/lib64/python2.6/timeit.py", line 99, in inner
    _func()
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 288, in timewrapper
    self.__cursor_execute(sql, kwargs, cursor)
  File "/data/datazilla/src/datazilla.mozilla.org/datazilla/vendor/datasource/bases/SQLHub.py", line 317, in __cursor_execute
    cursor.execute(sql, kwargs['placeholders'])
  File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 175, in execute
    if not self._defer_warnings: self._warning_check()
  File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 89, in _warning_check
    warn(w[-1], self.Warning, 3)
_mysql_exceptions.Warning: Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. The statement is unsafe because it uses a LIMIT clause. This is unsafe because the set of rows included cannot be predicted.
I think the query producing this message is https://github.com/mozilla/datazilla/blob/master/datazilla/model/sql/objectstore.json#L93

I was not able to reproduce the warning message so I cannot be sure.

In SQL an UPDATE with a LIMIT but no ORDER BY clause is unsafe for statement-based replication because the set of rows it touches is not guaranteed to be the same each time it is applied, even on identical data sets.  In the case of this particular query it doesn't cause a real problem but it does produce an annoying warning message in the log.  I added an ORDER BY clause to fix this.  

I found this reference to the error: http://www.dbasquare.com/kb/warning-the-statement-is-unsafe-because-it-uses-a-limit-clause/

According to that reference, the warning may still persist even though the possibility of returning a non-deterministic result set has been removed.  So it's possible we may need to disregard this message, but first let's see if this works. 
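
To illustrate the idea of making a limited claim deterministic, here is a small stand-in using Python's sqlite3 module. This is not the query in objectstore.json: the table layout is a cut-down, hypothetical version of the objectstore, and because SQLite builds often lack UPDATE ... LIMIT, the ordering and limit live in a subquery rather than on the UPDATE itself.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with five 'ready' rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objectstore (id INTEGER PRIMARY KEY, processed_flag TEXT)"
    )
    conn.executemany(
        "INSERT INTO objectstore (id, processed_flag) VALUES (?, ?)",
        [(i, "ready") for i in range(1, 6)],
    )
    conn.commit()
    return conn


def claim_ready_rows(conn, limit):
    """Mark up to `limit` 'ready' rows as 'loading', lowest id first."""
    conn.execute(
        """
        UPDATE objectstore
           SET processed_flag = 'loading'
         WHERE id IN (
               SELECT id FROM objectstore
                WHERE processed_flag = 'ready'
                ORDER BY id
                LIMIT ?
         )
        """,
        (limit,),
    )
    conn.commit()
    rows = conn.execute(
        "SELECT id FROM objectstore WHERE processed_flag = 'loading' ORDER BY id"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    print(claim_ready_rows(conn, 2))
```

With the ORDER BY in place, the same rows are claimed every time the statement runs against the same data, which is the property the warning is about.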

Please update the repository and try running 

$DATAZILLA_HOME/manage.py process_objects --cron_batch small --loadlimit 15

Thanks!
It doesn't look like json objects are being processed.  Could someone confirm that the command:

"$DATAZILLA_HOME/manage.py process_objects --cron_batch small --loadlimit 15" 

is showing up in the process list and that the crons are running?

I would also like to confirm that json objects are still being submitted. We can compare the output of the SQL query below with the results in https://bugzilla.mozilla.org/show_bug.cgi?id=792912#c16 to answer that question.

SELECT `processed_flag`, COUNT(`processed_flag`) AS 'processed_flag_count', `error_flag` FROM `talos_objectstore_1`.`objectstore` GROUP BY `processed_flag`, `error_flag`;

If objects are being processed we should see the 'complete' count increasing.  We should also see the counts returned from this web service method increasing.

https://datazilla.mozilla.org/talos/refdata/perftest/runs_by_branch?days_ago=3

If it looks like the crons are running and objects are being submitted but are not being processed then the next step would be running: 

"$DATAZILLA_HOME/manage.py process_objects --cron_batch small --loadlimit 15 --debug"

from the command line and analyzing the debug output written to stdout.
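
The comparison against comment 16 could be done by hand, or with something like the sketch below. The baseline counts are the ones from comment 16; the later snapshot in the example is made up purely to exercise the functions, and the function names are illustrative.

```python
# Counts from comment 16, keyed by (processed_flag, error_flag).
BASELINE = {
    ("ready", "N"): 11840,
    ("ready", "Y"): 8,
    ("loading", "N"): 52,
    ("complete", "N"): 411611,
}


def count_deltas(before, after):
    """Return the change in each (processed_flag, error_flag) count."""
    keys = sorted(set(before) | set(after))
    return {key: after.get(key, 0) - before.get(key, 0) for key in keys}


def objects_are_processing(before, after):
    """True when the 'complete' count has grown between the two snapshots."""
    return count_deltas(before, after).get(("complete", "N"), 0) > 0


if __name__ == "__main__":
    # Hypothetical later snapshot, not real data from production.
    later = dict(BASELINE)
    later[("complete", "N")] += 100
    later[("ready", "N")] -= 100
    print(count_deltas(BASELINE, later))
    print("processing:", objects_are_processing(BASELINE, later))
```

If the 'ready' count grows while 'complete' stays flat, objects are being submitted but not processed, which is the case where the --debug run is worth doing.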
that sorted it out!
Status: REOPENED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard