Closed
Bug 470550
Opened 17 years ago
Closed 16 years ago
Database replication died in the C01 cluster due to overfilling sphinx tables
Categories
(support.mozilla.org :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
0.8.1
People
(Reporter: justdave, Assigned: nkoth)
References
Details
(Whiteboard: sumo_only)
Attachments
(1 file, 1 obsolete file)
Replication on all of the database slaves died this morning with this error:
"Error 'The table 'se_words' is full' on query. Default database: 'support_mozilla_org'."
Since this is apparently only cache data for sphinx, the problem was resolved by truncating the table on all three slaves (and then on the master).
We need to look into how this is set up to prevent this from happening again.
As far as I can tell, max_heap_table_size was set the same on the slaves as on the master. However, the table status showed a larger max_data_size for that table on the master than on the slaves, and changing max_heap_table_size on the slaves didn't seem to affect that max_data_size at all.
Updated•17 years ago (Assignee)
Assignee: nobody → nelson
Target Milestone: --- → 0.8.1
Comment 1•17 years ago (Assignee)
AFAIK, the max_data_size of a table is set at creation, based on the max_heap_table_size setting at the time. Once the table is created, setting max_heap_table_size won't have an effect on tables that have already been created.
I could try to remove the dependency on a heap (in-memory) table by using a real table. Any comments on using a real table?
Comment 2•17 years ago (Reporter)
Memory tables get lost any time the server restarts, both on the master and the slaves. Any dependency on that table already existing, regardless of whether replication is in use, should be removed if you expect the app to survive a database server reboot. The table should be created when you need it, and dropped as soon as you're finished with it. If it needs to last longer than a minute or two, you're increasing the risk of a restart hosing it.
Leaving it sitting around for quick access to something is, of course, doable, as long as you check that it actually exists first and recreate it if it doesn't, and you're only accessing it on the master database. Slaves (especially when we have more than one behind a load balancer and can rotate them in and out of service to deal with load) are quite likely to crash the replication thread if any update is made to an in-memory table that was created before the slave last restarted.
Likewise, updating other tables based on things you put in an in-memory table (like selecting out of the in-memory table to fill another one) is not recommended in a replicated environment at all, unless you're doing the "create it, populate it, do what you needed, then immediately drop it" thing. Even that's risky, but it's less of a risk (how likely is it that a server would get rebooted right during the few seconds it takes you to do that?).
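The "create it when needed, populate it, use it, then immediately drop it" pattern described above might look something like the following minimal sketch. It uses Python's built-in sqlite3 module purely as a runnable stand-in for MySQL. The helper names `with_scratch_table` and `table_exists`, the (word, hits) column layout of `se_words`, and the sample rows are all hypothetical and are not taken from the actual application or patch.

```python
import sqlite3


def with_scratch_table(conn, words):
    """Create se_words if needed, fill it, read it back, then drop it.

    Assumption: the (word, hits) column layout is invented for illustration.
    """
    cur = conn.cursor()
    # Never assume the table survived a restart: (re)create it on demand.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS se_words (word TEXT PRIMARY KEY, hits INTEGER)"
    )
    try:
        # Start from a clean slate in case an earlier run left rows behind.
        cur.execute("DELETE FROM se_words")
        cur.executemany("INSERT INTO se_words (word, hits) VALUES (?, ?)", words)
        cur.execute("SELECT word, hits FROM se_words ORDER BY word")
        result = cur.fetchall()
    finally:
        # Drop the table as soon as the work is done so nothing relies on it later.
        cur.execute("DROP TABLE IF EXISTS se_words")
    conn.commit()
    return result


def table_exists(conn, name):
    """Return True if a table with this name exists (SQLite-specific lookup)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None
```

Calling `with_scratch_table` twice in a row works regardless of whether the table existed beforehand, which is the property that matters after a server restart. On a real MySQL master the existence check and the statements would differ in detail, so treat this only as an outline of the control flow.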
Comment 3•16 years ago (Assignee)
We should convert se_words to a real table. Let's do this with the 0.8.1 push.
My tests on my development server show that this table will be somewhat over 6 MB (not too big), but the large number of updates on the table means that a first-time indexing run increases from 230 seconds to about 320 seconds. About 150 seconds of that is attributable to selects that have to be run anyway; 150 seconds is also the time taken for subsequent indexing runs when no pages have changed.
These times are not a big deal for a batch mode operation run at most once a day.
Attachment #355606 -
Flags: review?(laura)
Updated•16 years ago
Attachment #355606 -
Attachment is patch: true
Attachment #355606 -
Attachment mime type: application/octet-stream → text/plain
Updated•16 years ago
Attachment #355606 -
Attachment is patch: false
Attachment #355606 -
Flags: review?(laura) → review+
Updated•16 years ago (Assignee)
Attachment #355606 -
Flags: review?(justdave)
Comment 4•16 years ago (Assignee)
Comment on attachment 355606 [details]
use a real table for se_words
Will running this kind of drop table followed by create table cause any problem with replication? I have absolutely no idea, so asking you just in case.
Updated•16 years ago (Reporter)
Attachment #355606 -
Flags: review?(justdave) → review-
Comment 5•16 years ago (Reporter)
Comment on attachment 355606 [details]
use a real table for se_words
I would use "DROP TABLE IF EXISTS `se_words`;" because if you try to drop it without the conditional and it doesn't exist, it'll break the same way.
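To illustrate the difference between the two forms of the statement, here is a small sketch. It runs against an in-memory SQLite database through Python's sqlite3 module, which is only a stand-in for the MySQL servers discussed in this bug; the (word, hits) column layout is an assumption made for the example. MySQL likewise reports an error for an unconditional DROP TABLE on a missing table, though the error it raises is different from SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database server

# An unconditional drop of a table that does not exist raises an error.
try:
    conn.execute("DROP TABLE se_words")
    plain_drop_failed = False
except sqlite3.OperationalError:
    plain_drop_failed = True

# The conditional form is a harmless no-op when the table is missing.
conn.execute("DROP TABLE IF EXISTS se_words")

# Recreate se_words as an ordinary table (hypothetical column layout).
create_sql = "CREATE TABLE se_words (word TEXT PRIMARY KEY, hits INTEGER)"
conn.execute(create_sql)

# Running the same drop-then-create pair again is safe,
# whether or not the table already exists.
conn.execute("DROP TABLE IF EXISTS se_words")
conn.execute(create_sql)

table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'se_words'"
).fetchone()[0]
```

Because the conditional drop succeeds in both states, the same script can be replayed on a server where the table was lost in a restart and on one where it still exists.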
Comment 6•16 years ago (Assignee)
Is this OK, Dave?
Attachment #355606 -
Attachment is obsolete: true
Attachment #356203 -
Flags: review?(justdave)
Updated•16 years ago (Assignee)
Attachment #356203 -
Attachment mime type: application/octet-stream → text/plain
Comment 7•16 years ago (Reporter)
Comment on attachment 356203 [details]
replace any existing se_words table with real table
Yep, looks good to me.
Attachment #356203 -
Flags: review?(justdave) → review+
Updated•16 years ago
Keywords: push-needed
Whiteboard: sumo_only