Closed Bug 713238 Opened 14 years ago Closed 14 years ago

please repair build_properties table on tm-b01-master01.mozilla.org/buildbot database

Categories

(Data & BI Services Team :: DB: MySQL, task)

task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: mpressman)

Attachments

(1 file)

We're seeing a bunch of masters failing to update this database, well after the load issues tracked in bug 712830 have been resolved. That bug talks about repairing one of the tables, and I think there's another table in a bad state. We're seeing errors like:

Traceback (most recent call last):
  File "/builds/buildbot/tests1-macosx/buildbotcustom/bin/update_from_files.py", line 300, in <module>
    updated = updateFromFiles(session, options.master, options.name, builders, last_time, options.times)
  File "/builds/buildbot/tests1-macosx/buildbotcustom/bin/update_from_files.py", line 219, in updateFromFiles
    db_build.updateFromBBBuild(session, build)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/buildbotcustom/status/db/model.py", line 483, in updateFromBBBuild
    mysteps = dict((s.name, s) for s in self.steps)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/attributes.py", line 163, in __get__
    instance_dict(instance))
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/attributes.py", line 383, in get
    value = callable_(passive=passive)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/strategies.py", line 646, in __call__
    result = q.all()
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/query.py", line 1492, in all
    return list(self)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/query.py", line 1603, in __iter__
    self.session._autoflush()
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 843, in _autoflush
    self.flush()
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 1359, in flush
    self._flush(objects)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 1440, in _flush
    flush_context.execute()
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/unitofwork.py", line 299, in execute
    rec.execute(self)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/unitofwork.py", line 401, in execute
    self.dependency_processor.process_saves(uow, states)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/dependency.py", line 1018, in process_saves
    secondary_update, secondary_delete)
  File "/builds/buildbot/tests1-macosx/lib/python2.6/site-packages/sqlalchemy/orm/dependency.py", line 1039, in _run_crud
    result.rowcount)
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'build_properties' expected to delete 22 row(s); Only 0 were matched.

Can we repair this table? Filing as blocker because it's currently holding the tree closed.
Assignee: server-ops → server-ops-database
Component: Server Operations → Server Operations: Database
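For anyone reading the traceback above without SQLAlchemy background: StaleDataError is raised when the number of rows a flush's DELETE or UPDATE actually matched differs from the number the session expected to touch. The sketch below imitates that kind of row-count check with the standard-library sqlite3 module. The two-column table layout, the StaleRowCount exception, and the delete_expected helper are all hypothetical illustrations, not buildbotcustom or SQLAlchemy code.

```python
import sqlite3


class StaleRowCount(Exception):
    """Hypothetical stand-in for sqlalchemy.orm.exc.StaleDataError."""


def delete_expected(conn, build_id, expected):
    """Delete a build's property rows and verify how many rows matched."""
    cur = conn.execute(
        "DELETE FROM build_properties WHERE build_id = ?", (build_id,)
    )
    if cur.rowcount != expected:
        raise StaleRowCount(
            "DELETE statement on table 'build_properties' expected to delete "
            "%d row(s); Only %d were matched." % (expected, cur.rowcount)
        )
    return cur.rowcount


# Assumed schema for illustration only; the real table layout is not shown in this bug.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build_properties (build_id INTEGER, property_id INTEGER)")
conn.executemany(
    "INSERT INTO build_properties VALUES (?, ?)", [(1, n) for n in range(22)]
)

# Healthy case: the 22 rows the caller believes exist are really there.
deleted = delete_expected(conn, 1, 22)

# Damaged case: the caller still expects 22 rows, but none can be found.
try:
    delete_expected(conn, 1, 22)
    stale_message = None
except StaleRowCount as exc:
    stale_message = str(exc)
```

In the second call the rows are already gone, which mirrors the situation in this bug: the application's view of the table and what the server could actually find had diverged.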
Assignee: server-ops-database → rbryce
from irc:
08:15:42 < bhearsum> sheeri-afk, rbryce: i just filed https://bugzilla.mozilla.org/show_bug.cgi?id=713238 as a blocker
...
08:54:43 < joduinn-coffee> rbryce: as oncall, have you found someone to work on bug#713238 ?
08:54:55 < joduinn-coffee> ...its still unassigned, and its still blocking tree-reopening
08:55:43 < rbryce> joduinn-coffee: i have paged the dba, no response as of yet.
08:56:28 < joduinn-coffee> rbryce: when was that?
08:57:15 < rbryce> 20 minutes ago via text. And phone calls within the last 5
...
09:07:46 < dumitru> I talked to Matt Pressman. he's looking in a minute
09:08:07 < bhearsum> thanks
09:09:36 < joduinn-coffee> dumitru: thanks
...
09:15:10 < zandr> joduinn-coffee: dba at work, please stand by.
Assignee: rbryce → mpressman
[09:41] < mpressman> | half way through full myisam table repair on buildbot
mpressman says it's all done
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Things are looking good so far - the jobs that were failing to complete before have been successfully inserting data for the past 15min or so. Also, for posterity: a full db repair was done, not just one on the build_properties table. Thanks for your quick response and hard work here, all - it's much appreciated.
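For posterity, a minimal sketch of how a "repair every MyISAM table" pass could be scripted. It only builds the REPAIR TABLE statements; it does not connect to a server, and it is not a record of the commands actually run here. The repair_statements helper, the engine values, and the `builds` table name are assumptions for illustration; in practice the (schema, table, engine) tuples would come from something like information_schema.TABLES.

```python
def repair_statements(tables):
    """Return one REPAIR TABLE statement per MyISAM table.

    `tables` is an iterable of (schema, table, engine) tuples, for example
    rows selected from information_schema.TABLES.
    """
    statements = []
    for schema, table, engine in tables:
        # REPAIR TABLE applies to MyISAM tables; skip other engines such as InnoDB.
        if engine and engine.lower() == "myisam":
            statements.append("REPAIR TABLE `%s`.`%s`;" % (schema, table))
    return statements


# Hypothetical inventory: the engines and the `builds` table are assumptions.
inventory = [
    ("buildbot", "build_properties", "MyISAM"),
    ("buildbot", "builds", "MyISAM"),
    ("graphs_mozilla_org_new", "test_run_values", "InnoDB"),
]
statements = repair_statements(inventory)
for statement in statements:
    print(statement)
```

Generating the statements first and reviewing them before running anything keeps the repair auditable, which matters when the tree is closed and several people are watching the same database.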
mpressman also adjusted the innodb_buffer_pool_size:
15:56 < mpressman> one thing I noticed was that we have the innodb_buffer_pool set extremely high, yet there is only one table that is benefitting graphs_mozilla_org_new.test_run_values
15:56 < dumitru> mpressman: maybe we can turn that lower
15:56 < dumitru> if that impacts the performance
15:56 < mpressman> the vast majority of the tables are using the myisam storage engine and increasing the key_buffer_size would really help
15:56 < mpressman> dumitru: I think it really would
15:56 < dumitru> mpressman: dude do whatever you have to to improve the performance :D
15:56 < dumitru> bhearsum: ^
15:57 < mpressman> it would require a restart though since the innodb_buffer_pool can only be set at run time
15:57 < mpressman> err start up
15:58 < bhearsum> if you think that will help - sounds good to me
15:58 < dumitru> bhearsum: when can we restart mysqld?
15:58 < mpressman> fwiw, the optimization freed up a ton of space
15:58 < dumitru> awesome heh
15:59 < bhearsum> dumitru: anytime now is fine
16:00 < dumitru> mpressman: ^
16:00 < dumitru> yay
16:00 < dumitru> SHIP IT
16:00 < mpressman> restart complete
16:00 < dumitru> awesome
16:00 < bhearsum> nice
16:00 < dumitru> thanks mpressman
Here is a list of what was run and which tuning options were modified.
- I ran a repair and index sort on all MyISAM tables on tm-b01-master01.
- As the majority of the tables, by count and by size, are MyISAM, I increased the key_buffer_size from the previously set 1GB to its max of 4GB; the my.cnf had it set at 128MB.
- I additionally decreased the innodb_buffer_pool_size from 6GB to 2GB.
- I increased max_tmp_tables to 128 and set tmp_table_size to 64MB, as I noticed queries having to go to disk.
- Since quite a few more reads are hitting the master, I enabled low_priority_updates; this gives SELECT queries priority over writes, instead of reads blocking and queries stacking up.
- I also increased the join_buffer_size, as there are a couple of very heavy queries from graphs_mozilla_org_new that benefited quite a bit from this.
- I set the query_cache_size to 16MB. We are benefiting from the query cache, as hit rates are decent, but this size was determined from only a two-hour window and should be re-examined after several days of use; we do have sufficient memory to play with this small number, and de-allocation will be minimal at only 16MB.
- I updated the puppet my.cnf file to ensure the options are set in case of a restart, although it truly only affects the innodb_buffer_pool_size.
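When the query cache size is re-examined after several days, the hit rate can be estimated from two server status counters. This is a sketch under the usual convention that Com_select counts only SELECTs that missed the cache, so total SELECT traffic is Qcache_hits + Com_select. The query_cache_hit_rate helper and the sample counter values are made up for illustration; real values would come from SHOW GLOBAL STATUS.

```python
def query_cache_hit_rate(status):
    """Estimate the query cache hit rate from SHOW GLOBAL STATUS counters.

    `status` maps counter names to values (strings or ints).
    """
    hits = int(status["Qcache_hits"])
    selects = int(status["Com_select"])  # SELECTs that were not served from the cache
    total = hits + selects
    return hits / float(total) if total else 0.0


# Made-up sample counters, not measurements from tm-b01-master01.
sample = {"Qcache_hits": "300", "Com_select": "700"}
rate = query_cache_hit_rate(sample)
print("query cache hit rate: %.1f%%" % (rate * 100))
```

Comparing this figure over a few days, rather than a two-hour window, gives a better basis for deciding whether 16MB is the right size.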
This is a rough overview of the size difference in databases on tm-b01-master01
Product: mozilla.org → Data & BI Services Team