Closed Bug 823817 Opened 13 years ago Closed 12 years ago

Fix checksum on webdev cluster

Categories

(Data & BI Services Team :: DB: MySQL, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: scabral, Assigned: scabral)

Details

[root@webdev1 ~]# /usr/local/bin/mysql-checksum.sh 12-20T19:29:02 Cannot checksum table b_personas.log: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 5950. 12-20T19:39:12 Cannot checksum table c_personas.log: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 5950. 12-20T19:47:55 Skipping table c_remora.eventlog because on the master it would be checksummed in one chunk but on these replicas it has too many rows: 42587 rows on webdev2.db.scl3.mozilla.com The current chunk size limit is 37034 rows (chunk size=37034 * chunk size limit=1).
And the rest of the errors/warnings: 12-20T20:15:07 Cannot checksum table cb_personas.log: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 5950. 12-20T20:29:11 Cannot checksum table personas.log: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 5950. 12-20T21:51:32 Skipping table remora-test.addon_recommendations because it has problems on these replicas: Table remora-test.addon_recommendations does not exist on replica webdev2.db.scl3.mozilla.com This can break replication. If you understand the risks, specify --no-check-slave-tables to disable this check. (and tons more for every other remora-test table)
Assignee: server-ops-database → scabral
We can index the log tables on date: mysql> alter table b_personas.log add index(date); Query OK, 972634 rows affected (1.91 sec) Records: 972634 Duplicates: 0 Warnings: 0 mysql> alter table cb_personas.log add index(date); Query OK, 863195 rows affected (1.65 sec) Records: 863195 Duplicates: 0 Warnings: 0 mysql> alter table c_personas.log add index(date); Query OK, 1237805 rows affected (2.92 sec) Records: 1237805 Duplicates: 0 Warnings: 0 mysql> alter table personas.log add index(date); Query OK, 367374 rows affected (0.72 sec) Records: 367374 Duplicates: 0 Warnings: 0 mysql> alter table c_junk.log add index(date); Query OK, 1237805 rows affected (2.91 sec) Records: 1237805 Duplicates: 0 Warnings: 0 Checksumming all tables called "log" works out great now.
Trying all the tables called "eventlog" I get: 04-08T16:50:56 Skipping table remora-test.eventlog because it has problems on these replicas: Table remora-test.eventlog does not exist on replica webdev2.db.scl3.mozilla.com This can break replication. If you understand the risks, specify --no-check-slave-tables to disable this check. And in fact there's no remora-test database at all on webdev2. It was last written to Nov 30th, so I'm archiving it: [root@webdev1 archive]# pwd /root/archive [root@webdev1 archive]# mysqldump remora-test > remora-test.sql [root@webdev1 archive]# ls -rlth total 312K -rw-r--r-- 1 root root 306K Apr 8 16:54 remora-test.sql [root@webdev1 archive]# gzip remora-test.sql [root@webdev1 archive]#
mysql> drop database if exists `remora-test`; Query OK, 86 rows affected (3.32 sec)
eventlog doesn't seem to have problems any more, so I'm running the full checksum now to see if there's any output.
[root@webdev1 bin]# ./mysql-checksum.sh echo 04-08T19:50:07 Skipping table remora.stats_collections because on the master it would be checksummed in one chunk but on these replicas it has too many rows: 118034 rows on webdev2.db.scl3.mozilla.com The current chunk size limit is 108814 rows (chunk size=108814 * chunk size limit=1). [root@webdev1 bin]# That's the only one left.....
Trying with --chunk-size-limit=100 as there are plenty of indexes, including a primary key for id.
04-10T13:00:00 Skipping table c_remora.translations because on the master it would be checksummed in one chunk but on these replicas it has too many rows: 5524558 rows on webdev2.db.scl3.mozilla.com The current chunk size limit is 4852500 rows (chunk size=48525 * chunk size limit=100). That didn't work out. Trying again, with 250 as the chunk-size-limit.
04-17T13:24:03 Skipping table c_remora.update_counts because on the master it would be checksummed in one chunk but on these replicas it has too many rows: 22169140 rows on webdev2.db.scl3.mozilla.com The current chunk size limit is 19206500 rows (chunk size=76826 * chunk size limit=250). Trying just the update_counts table with a chunk_size_limit of 500...
[root@webdev1 bin]# time ./mysql-checksum-test.sh real 60m9.234s user 0m1.223s sys 0m0.215s [root@webdev1 bin]# grep chunk mysql-checksum-test.sh --chunk-size-limit=500 \ the chunk-size-limit 500 worked out, so I put that in puppet.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Data & BI Services Team
You need to log in before you can comment on or make changes to this bug.