Bug 577754 · Opened 14 years ago · Closed 14 years ago

Updating bugs_fulltext for a 3.6 upgrade is really, really slow

Categories

(Bugzilla :: Installation & Upgrading, defect)

Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: Bugzilla 3.6

People

(Reporter: mkanat, Assigned: mkanat)

Details

Attachments

(1 file)

For some reason, it's WAY faster to truncate the entire bugs_fulltext table and repopulate it than to do a DELETE/INSERT of just the affected rows when we update the attachment comments. I'm going to figure out why and put a fix into 3.6 and above.
BTW, if anybody wants *really* fast bugs_fulltext updating, wrap the bugs_fulltext stuff in Install::DB in DISABLE KEYS and ENABLE KEYS. Then do a myisamchk -q -p afterward, which is super fast. (Of course, you have to shut down MySQL to do it.)
  Oh, although that last recommendation may not actually work--I'm not entirely sure. I'm having some problems on my local bmo clone that are unrelated to this bug.
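For reference, a minimal sketch of what the DISABLE KEYS trick could look like, assuming bugs_fulltext is a MyISAM table (the default for it in this era); the population step in the middle stands in for the actual Install::DB code:

-- Hypothetical sketch only: suspend non-unique index maintenance,
-- do the bulk population, then rebuild the indexes in one pass.
ALTER TABLE bugs_fulltext DISABLE KEYS;
-- ... run the bugs_fulltext population from Install::DB here ...
ALTER TABLE bugs_fulltext ENABLE KEYS;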
Under MSSQL, adding two indexes helps as well; they may be useful for other databases too:

For the first index: longdescs can easily have millions of rows when you have a few hundred thousand bugs, and the schema defines no index on isprivate at all, resulting in a potentially massive table scan.

CREATE UNIQUE INDEX custom_longdescs1_idx ON longdescs
(
	[isprivate] ASC,
	[comment_id] ASC,
	[bug_id] ASC
)

CREATE INDEX custom_longdescs2_idx ON longdescs
(
	[bug_id] ASC
)

Note that there is an existing index on bug_id in the schema, but it also includes bug_when; at least on MSSQL, the optimizer prefers the index with only bug_id during population.

Granted, this doesn't help during the insert itself; comment 1 points out the best solution for performance while inserting. Alternatively, under MSSQL you could just disable the fulltext index and re-enable it afterward, which results in a full repopulation of the fulltext index. For example, an instance with 200k bugs took about 5.5 minutes to populate versus about 20-30 minutes before.
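For those on MSSQL, a minimal sketch of that disable/re-enable approach, assuming a fulltext index is defined directly on bugs_fulltext:

-- Hypothetical sketch only: suspend fulltext maintenance during the
-- bulk population, then re-enable, which kicks off the full
-- repopulation (crawl) mentioned above.
ALTER FULLTEXT INDEX ON bugs_fulltext DISABLE;
-- ... populate bugs_fulltext here ...
ALTER FULLTEXT INDEX ON bugs_fulltext ENABLE;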
(In reply to comment #3)
> Granted this doesn't help during insert

To correct myself: it does help on the select side of the query, and combining it with the method in comment 1 speeds things up that much more.
Summary: Updating bugs_fulltext is really, really slow → Updating bugs_fulltext for a 3.6 upgrade is really, really slow
  It's true that there's no isprivate index--good point. It's not the primary time consumer under MySQL (the primary time consumer is rebuilding the fulltext index), but it's still something that could possibly help--though probably not under MySQL, given its restrictions on index usage.
Attached patch v1 (Splinter Review)
Okay, this makes the updates *considerably* faster, in my testing. It should be even better than the TRUNCATE TABLE hack that justdave and I worked out, because it doesn't have to copy as much data.

I tried adding an index on isprivate, and it didn't make any difference in my tests. MySQL never used it, and there was no benchmark difference. (However, it did take quite a long time to add the index to the longdescs table.)
Assignee: installation → mkanat
Status: NEW → ASSIGNED
Attachment #456654 - Flags: review+
BTW, for those curious, the other methods I tested were:

* TRUNCATE TABLE before calling _populate_bugs_fulltext() and always populating the whole table (sketched below).
* Wrapping _populate_bugs_fulltext in DISABLE KEYS and ENABLE KEYS.

Both of those solutions were equally fast, and faster than what's currently in Install::DB. (Not too surprising, when you think about the fact that the time is being taken up by building the indexes, which actually happens the same way with both those methods.)
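To make the first method concrete, here's a hedged sketch of the TRUNCATE-and-repopulate approach. The bugs_fulltext column list matches the standard 3.6 schema, but the population query is illustrative only; the real one lives in _populate_bugs_fulltext() and may differ in detail:

-- Illustrative only: empty the table, then rebuild it from bugs/longdescs.
-- GROUP_CONCAT silently ignores NULLs, so the IF() below filters private
-- comments out of comments_noprivate.
TRUNCATE TABLE bugs_fulltext;
INSERT INTO bugs_fulltext (bug_id, short_desc, comments, comments_noprivate)
SELECT bugs.bug_id, bugs.short_desc,
       GROUP_CONCAT(longdescs.thetext SEPARATOR '\n'),
       GROUP_CONCAT(IF(longdescs.isprivate = 0, longdescs.thetext, NULL)
                    SEPARATOR '\n')
  FROM bugs
  JOIN longdescs ON longdescs.bug_id = bugs.bug_id
 GROUP BY bugs.bug_id, bugs.short_desc;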

However, it's even faster to just do the REPLACE INTO, because that makes everything into a single statement, and copies over less data. So it's much like doing a new INSERT in terms of index-building performance, but with half the data (in bmo's case).
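A hedged sketch of the REPLACE INTO variant: the same populate query as above, but as a single statement restricted to the affected rows. The WHERE clause and bug list here are hypothetical stand-ins; the actual patch derives the set from the attachment comments being updated:

-- Illustrative only: REPLACE overwrites existing rows in place, so no
-- separate DELETE is needed and only the touched bugs are copied.
REPLACE INTO bugs_fulltext (bug_id, short_desc, comments, comments_noprivate)
SELECT bugs.bug_id, bugs.short_desc,
       GROUP_CONCAT(longdescs.thetext SEPARATOR '\n'),
       GROUP_CONCAT(IF(longdescs.isprivate = 0, longdescs.thetext, NULL)
                    SEPARATOR '\n')
  FROM bugs
  JOIN longdescs ON longdescs.bug_id = bugs.bug_id
 WHERE bugs.bug_id IN (1, 2, 3)  -- hypothetical: only bugs whose comments changed
 GROUP BY bugs.bug_id, bugs.short_desc;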

I still have to test this against a full upgrade, locally, but I've tested it against subsets of the longdescs data from bmo and it seems to have worked just fine.
  Okay, I checked and verified that this works and helps quite a bit. I replaced 50,000 rows in the table in 1 minute and 30 seconds on my local machine.

  One caution, though: you need to make sure that your innodb_buffer_pool is large enough, or you're in for a VERY slow upgrade. For example, when I tried to add 200,000 rows, it was WAY slower. I watched the InnoDB stats plummet when it ran out of innodb_buffer_pool space for the data it was reading (reads went from 10,000+ per second to around 40).
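If you want to check your setting before attempting a large upgrade, this works on any MySQL of this era (the value is in bytes; raising it means editing my.cnf and restarting mysqld):

-- Shows the currently configured InnoDB buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';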
Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/trunk/
modified Bugzilla/Install/DB.pm
Committed revision 7323.

Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/4.0/
modified Bugzilla/Install/DB.pm
Committed revision 7304.

Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/3.6/
modified Bugzilla/Install/DB.pm
Committed revision 7129.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Flags: approval4.0+
Flags: approval3.6+
Flags: approval+
Remind me when I get back from traveling and I can make a test run of this on the bugzilla staging server, too.