Closed Bug 425135 Opened 14 years ago Closed 13 years ago

Optimize slow SUMO database queries

Categories

(support.mozilla.org :: Knowledge Base Software, task)

task
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: morgamic, Assigned: laura)

References

Details

(Whiteboard: tiki_test)

Attachments

(10 files, 2 obsolete files)

2.98 KB, text/plain
Details
26.32 KB, text/plain
Details
4.22 KB, patch
nkoth
: review+
Details | Diff | Splinter Review
755 bytes, patch
nkoth
: review+
Details | Diff | Splinter Review
3.43 KB, text/plain
Details
1.64 KB, patch
Details | Diff | Splinter Review
1.15 KB, patch
nkoth
: review+
Details | Diff | Splinter Review
5.91 KB, text/plain
Details
699 bytes, text/plain
nkoth
: review+
Details
3.59 KB, patch
nkoth
: review+
Details | Diff | Splinter Review
Using the SUMO production database dump we need to analyze logged slow queries and fix any missing or incorrectly set indexes/keys.

This is likely the cause of bugs like bug 395271.
Priority: -- → P1
Target Milestone: --- → 0.6
Set of slow queries to analyze.
Laura, could you look at the list of slow queries and see if we can get some easy wins here?
Assignee: nobody → laura
Laura, any update on this?
No longer blocks: 395271
Depends on: 424676
One thing I noticed pretty quickly is that all the frequently run queries use prepared statements.  As such, MySQL won't be using the query cache for any of these queries.  (It's a stupid bug with the query cache in older versions of MySQL.)

In addition, all the database work is done using ADODB which is famous for being about the slowest possible alternative.

A big picture approach would be to write a drop in replacement db driver that uses PDO with prepared stmt emulation (and/or also build in external caching via memcache, depending on how much we need), and/or to upgrade to a newer version of MySQL (what does tikiwiki support?) and/or also build in external caching via memcache, depending on how much we need.

There is also specific query optimization that can be done which will require a little more targeted thought.  I'm really suspicious about that top query
( select count(*) from `users_objectpermissions` where `groupName` IN (?) and `objectId` = ? and `objectType` = ? and `permName` = ? LIMIT ?,? )
whic gets run by lib/userslib.php::object_has_permission.  I need to understand the app logic slightly better but it seems crazy that it's getting run 10,000 times a minute with the number of users I assume are on SUMO.  More investigation is next.



(In reply to comment #4)
> upgrade to a newer version of MySQL (what does tikiwiki support?)

TikiWiki claims support with 5.x for Tiki 1.9.3+. We're on Tiki 1.10.

http://doc.tikiwiki.org/tiki-index.php?page=Requirements+and+Setup#_Database
I printed stack traces each time object_has_permission is called and loaded my local home page. It ran 27 times.

25 times it came from categlib.php:517. That code loops through each possible permission for a category (there's five of them) and calls it once to check if the user has that permission. It does that 5 times (twice for the wiki page, once for the wiki page's category, once for the staging version of the home page, and once for the forums for some reason).

It came from categlib.php:1318 and categlib.php:1319 once each.

If we make the code query once for those five permissions, we'd reduce the number of calls to 7.
get_object_categories_perms will now ask about all the permissions for the user/category combo at once rather than each one individually. Instead of the 27 calls I mentioned earlier, there are now 2 calls to object_has_permission and 5 to get_object_permissions_for_user.

After further testing, this problem only seems to affect wiki pages. Other pages must be using other means to get the permissions data.

On my install, the serialized results before and after this patch are equivalent. Nelson, can you check to make sure what I've done here is valid?
Attachment #311976 - Flags: review?(nelson)
Attachment #311976 - Flags: review?(nelson) → review+
We are not using the individual wiki page permissions for sumo. Perhaps we can simply bypass the object_has_permission call since we know that the result will be "no perm".
(In reply to comment #5)

Yes. Most people have been using MySQL 5. I actually intentionally downgraded to by in sync with SUMO.

> (In reply to comment #4)
> > upgrade to a newer version of MySQL (what does tikiwiki support?)
> 
> TikiWiki claims support with 5.x for Tiki 1.9.3+. We're on Tiki 1.10.
> 
> http://doc.tikiwiki.org/tiki-index.php?page=Requirements+and+Setup#_Database
> 

(In reply to comment #8)
> We are not using the individual wiki page permissions for sumo. Perhaps we can
> simply bypass the object_has_permission call since we know that the result will
> be "no perm".
> 

The quickest way is to short circuit the query then, and always return false.  This will reduce the number of database queries by 10k per minute, so I'm happy ;)
BTW Nelson, I'm assuming there's not a way to disable checking from the admin interface?
The calls to object_has_permission are/were all for categories, not the actual wiki page. In comment 6, I'm talking about calls higher up in the chain, for example in get_object_categories_perms.
(In reply to comment #12)
> The calls to object_has_permission are/were all for categories, not the actual
> wiki page. In comment 6, I'm talking about calls higher up in the chain, for
> example in get_object_categories_perms.
> 

Yes. I know. I was referring in comment #11 to the calls to that top query that comes from object_has_one_perm, rather than object_has_perm. Those can be reduced as well, since they are not needed for wiki pages.
Status: NEW → ASSIGNED
Other than the need to take into account  $prefs['feature_category_reinforce'], this looks correct. I am fixing this up now.

(In reply to comment #7)
> Created an attachment (id=311976) [details]
> Make get_object_categories_perms use get_object_permissions_for_user
> 
> get_object_categories_perms will now ask about all the permissions for the
> user/category combo at once rather than each one individually. Instead of the
> 27 calls I mentioned earlier, there are now 2 calls to object_has_permission
> and 5 to get_object_permissions_for_user.
> 
> After further testing, this problem only seems to affect wiki pages. Other
> pages must be using other means to get the permissions data.
> 
> On my install, the serialized results before and after this patch are
> equivalent. Nelson, can you check to make sure what I've done here is valid?
> 

(In reply to comment #14)
> Other than the need to take into account  $prefs['feature_category_reinforce'],
> this looks correct. I am fixing this up now.

It does take that into account. Whether it does so correctly, I'm not sure...
an amended version of attachment 311976 [details] [diff] [review] has been committed to SVN
Depends on: 425487
posix_getpwuid is called twice in tikisetup.class.php, once for the current group and once for the current user. It gets the user and group names and uses these in various error messages that can come up, giving you tips on how to get Tiki running. These calls take 25% of the request time and we're set up correctly, so comment out this feature.

(Is posix_getpwuid really that expensive?)
Attachment #312203 - Flags: review?
I can't find the data that shows this 25%. Its completely negligible afaict. We probably don't need this patch. (In reply to comment #17)

> Created an attachment (id=312203) [details]
> remove calls to posix_getpwuid
> 
> posix_getpwuid is called twice in tikisetup.class.php, once for the current
> group and once for the current user. It gets the user and group names and uses
> these in various error messages that can come up, giving you tips on how to get
> Tiki running. These calls take 25% of the request time and we're set up
> correctly, so comment out this feature.
> 
> (Is posix_getpwuid really that expensive?)
> 

cachegrind.32183._en-US_kb_Managing_profiles

Path is 

tiki-index.php
(main)
require_once: tiki-setup.php
require_once: error_reporting.php
TikiSetup::check
php::posix_getpwuid
I have added a patch that caches the url transformation regexes (more
improvements are coming re bug 425485). This totally eliminates the number 7
top query in that list above.

Also, bug 425521 gets rid of the majority of top query number 2.
looks like an isolated case. But since it is useless anyway since we are all setup, we can comment it out. :)

(In reply to comment #19)
> cachegrind.32183._en-US_kb_Managing_profiles
> 
> Path is 
> 
> tiki-index.php
> (main)
> require_once: tiki-setup.php
> require_once: error_reporting.php
> TikiSetup::check
> php::posix_getpwuid
> 

Attachment #312203 - Flags: review? → review+
Attachment #312322 - Attachment is obsolete: true
Attachment #312335 - Flags: review?(nelson)
Cache the results of get_object_permissions_for_user in memory for a particular set of passed parameters. This means that the 5 calls in comment 7 will only hit the database once.
Attachment #312635 - Flags: review?(nelson)
Attachment #312635 - Flags: review?(nelson) → review+
Comment on attachment 312635 [details] [diff] [review]
cache get_object_permissions_for_user results

committed to r11769
Attached patch Fix for one slow query (obsolete) — Splinter Review
Before patching this you'll need to run:
alter table users_permissions add unique(permName);

I can't figure out why it doesn't see permName as a valid key (it's primary), but this resolves the issue anyway.
Attachment #316026 - Flags: review?(nelson)
Nelson's comments from email:
From what I remember, permName can be blank (unless I am confusing this db
table with another) due to some quirk. Perhaps we can make the query as
you have changed it (without where `permName` > '') but do a check for the
condition in PHP after it is returned.
Fixed one query in r12313.  SQL to run is at
http://wiki.mozilla.org/Support/Database_Changes
Have a patch to reduce the number of queries per request (by 20 or 30 I guess) over at https://bugzilla.mozilla.org/show_bug.cgi?id=425134
This caused a regression in bug 429793 -- was r12313 tested on support-stage?  Why was this not spotted?
Somehow I managed not to commit the complete patch as linked above (which didn't have the problem as I had removed the query clause / code path that was erroring out).  Still not sure how I managed this.  The patch as reviewed (and as I still had installed on my machine here) worked fine.  
Also, r12313 was not tested on support stage because we pushed r12363 to production to solve that pageRank issue in a hurry as per https://bugzilla.mozilla.org/show_bug.cgi?id=429529
yeah, we pushed r12363 in a great hurry to fix that critical issue, and I still had testing r12313  on my todo list when that r12363 pageRank issue happened.
New slow query:
select a.*, tf.*, max(b.`commentDate`) as `lastPost` from `tiki_comments` a left join `tiki_comments` b on b.`parentId`=a.`threadId` right join `tiki_forums` tf on tf.`forumId` where a.`objectType` = 'forum' and a.`parentId`=0  and a.`object`=1 group by a.`threadId` order by `lastPost` desc

Working on it.
Mark - can you post the common queries again?
http://xb95.pastebin.mozilla.org/424340

Those might not all be SUMO, but I *think* they are.  (Also, the figure of 550 queries per second is about 75% SUMO.  I removed the other couple queries from the view.)
A new one:
select a.`threadId`,a.`object`,a.`objectType`,a.`parentId`,     a.`userName`,a.`commentDate`,a.`hits`,a.`type`,a.`points`,     a.`votes`,a.`average`,a.`title`,a.`data`,a.`hash`,a.`user_ip`,     a.`summary`,a.`smiley`,a.`message_id`,a.`in_reply_to`,a.`comment_rating`, IFNULL(max(b.`commentDate`), a.`commentDate`)  as `lastPost`,     count(b.`threadId`) as `replies`, (a.`type` = 's') AS sortorder from `tiki_comments` a left join `tiki_comments` b  on b.`parentId`=a.`threadId` where a.`object`='1' and a.`objectType` = 'forum' and a.`parentId` = 0   group by a.`threadId`,a.`object`,a.`objectType`,a.`parentId`,a.`userName`,a.`commentDate`,a.`hits`,a.`type`,a.`points`,a.`votes`,a.`average`,a.`title`,a.`data`,a.`hash`,a.`user_ip`,a.`summary`,a.`smiley`,a.`message_id`,a.`in_reply_to`,a.`comment_rating` order by sortorder DESC, `hits` asc, `threadId` LIMIT 180,20;
The query should be able to rewritten with subselects like http://oremj.pastebin.mozilla.org/458495 that way it can use indexes and as far as I can tell does exactly the same thing.

In my tests the new query is at least 10x faster than the one in Comment 41.
I'm not seeing any improvement on my system for the rewritten queries, though I may be doing it wrong or may not have a DB with the right indexes.

Laura, can you look into this? I'm quite tempted by the 10x faster comment.
We'll continue with this in 0.7.
Severity: major → normal
Priority: P1 → --
Target Milestone: 0.6 → 0.7
If someone is going to re-write the commentslib query, can you include in the query a subquery to get info about the latest post? This is currently being loaded separately (one query per thread!). I'm also going to need this information to add some additional queries like "Threads where the last poster is a contributor"
No longer depends on: 425487
No longer depends on: 424676
Depends on: 445344
Still need to fix: select DISTINCT(a.`threadId`),a.`object`,a.`objectType`,a.`parentId`,    
                  a.`userName`,a.`commentDate`,a.`hits`,a.`type`,a.`points`,    
                  a.`votes`,a.`average`,a.`title`,a.`data`,a.`hash`,a.`user_ip`,    
                  a.`summary`,a.`smiley`,a.`message_id`,a.`in_reply_to`,a.`comment_rating`,
                 (SELECT IFNULL(max(b.`commentDate`), a.`commentDate`)  as `lastPost` 
                 FROM `tiki_comments` b where `parentId`=a.`threadId`) AS `lastPost`,
                 (SELECT count(b.`threadId`) as replies from `tiki_comments` b where `parentId`=a.`threadId`) AS `replies`,
                 (a.`type` = 's') AS sortorder
                 from
                 `tiki_comments` a left join `tiki_comments` b  on b.`parentId`=a.`threadId` where a.`object`='1' and a.`objectType` = 'forum' and a.`parentId` = 0    order by sortorder DESC, `lastPost` desc, `threadId` LIMIT 0,20

Queries like that are starting to take a really long time.
Okay...

lib/rankings/ranklib.php in forums_ranking_last_topics has this query:

select a.*, tf.*, max(b.`commentDate`) as `lastPost`
from `tiki_comments` a
    left join `tiki_comments` b
        on b.`parentId`=a.`threadId`
    right join `tiki_forums` tf
        on tf.`forumId`
where  a.`objectType` = 'forum'
             and a.`parentId`=0
             and a.`object`=1
group by a.`threadId`
order by `lastPost` desc;

This query is returning *SEVENTY MEGABYTES* of data.  No wonder it's taking lots of seconds to execute!  That's kinda bad, actually.

There are two things wrong with this query:

1) It joins the tiki_forums table, but the ON clause is invalid?  Try "ON tf.forumID = a.object".

2) Constrain the time range!  The point of the query is to get the most recent 10 posted in threads, right?  Compare:

ORIGINAL: 69,003,521 bytes in 47.48 seconds
MODIFIED: 64,730 bytes in 0.21 seconds

The change?  The query now looks like this:

select a.*, tf.*, max(b.`commentDate`) as `lastPost`
from `tiki_comments` a
    left join `tiki_comments` b
        on b.`parentId`=a.`threadId`
    right join `tiki_forums` tf
        on tf.`forumId` = a.`object`
where  a.`objectType` = 'forum'
             and a.`parentId`=0
             and a.`object`=1
/* -----> */ and b.commentDate > UNIX_TIMESTAMP() - 7200 /* <------- */
group by a.`threadId`
order by `lastPost` desc;

Third from the last line - the one with the clearly marked comments.  I chose two hours, but you can make it more or less.  The longer the period, the more likely we're going to always find enough rows to fulfill the request, but the longer the query will take.  In fact... let's do this empirically... I just selected out the last 10 days of replies (parentID > 0) and verified that there is NO hour in which there is less than 5 replies - and in the past few days, we've never gone below 20ish.

Two hours seems like it will work fine, given this information.  But I'd be fine with four hours too.  The only reason you'd want longer is to cover for quiet periods (we don't have any) or if you have lots of forums that users can't see (we don't use forum security on the Firefox forum which is 99.9% of our traffic).

Anyway, to test this change, I went and ran the query both ways, with a limit on the result, and only selected out the threadID, forumID, and lastPost.  The results match exactly from the old query and the new query.  They are functionally equivalent in their output.

Any more information you need, let me know.  Maybe I'll take a look at that DISTINCT query now...
Target Milestone: 0.7 → 0.8
It has been a week shy of two months since this ticket was updated.

I was unable to wait (constant load problems on the two slaves running SUMO), so I hid top-level forum posts that were more than two weeks old.  They are still available via direct linking, they just no longer appear on the list that you can scroll back to.

This fixes the DISTINCT query below by giving it less data to scan.  "Fixes" is a term used loosely here.

This action is easily reversed at any point.  I picked a new/unused parentId as a flag that we can use to revert this within a few minutes.  (Presumably after a fix is deployed for this issue.)  Anyway, this is what I did:

mysql> update tiki_comments set parentId = 999999999 where object = 1 and objectType = 'forum' and parentId = 0 and commentDate < unix_timestamp() - 86400*14;
Query OK, 64021 rows affected (3 min 54.05 sec)
Rows matched: 64021  Changed: 64021  Warnings: 0

Reverting is simply setting the parentId back to 0 where it's currently 999999999.

Again, this is a temporary "fix" that is simply pushing off the problem.  It would be great if we could get some webdev love and attention on this problem.  Thank you!
Blocks: 460954
Scheduling this for next release, somehow it ended up in a much later one...will give it some love, thanks Mark.
Severity: normal → critical
Target Milestone: 0.8 → 0.7.2
Laura, do you have an ETA on that release?
0.7.2 freezes Tuesday Nov 4 and should be pushed Thursday Nov 6.
I thought I had committed a fix for this back in Sep 5. This is in r10872/r10873. On that day, I must have had a comment to this bug that was timed out of bugzilla or something. Anyway, there is an admin configurable parameter that can be set in http://support.mozilla.com/tiki-admin.php?locale=en-US&page=forums. I've reduced it from 720 hours (what is was before to 336 hours now).

Mark, can you try reverting?



(In reply to comment #48)
> It has been a week shy of two months since this ticket was updated.
> 
> I was unable to wait (constant load problems on the two slaves running SUMO),
> so I hid top-level forum posts that were more than two weeks old.  They are
> still available via direct linking, they just no longer appear on the list that
> you can scroll back to.
>
(In reply to comment #52)
> I thought I had committed a fix for this back in Sep 5. This is in


I mean r18072/r18073...

> r10872/r10873. On that day, I must have had a comment to this bug that was
> timed out of bugzilla or something. Anyway, there is an admin configurable
> parameter that can be set in
> http://support.mozilla.com/tiki-admin.php?locale=en-US&page=forums. I've
> reduced it from 720 hours (what is was before to 336 hours now).
> 
> Mark, can you try reverting?
> 
> 
> 
> (In reply to comment #48)
> > It has been a week shy of two months since this ticket was updated.
> > 
> > I was unable to wait (constant load problems on the two slaves running SUMO),
> > so I hid top-level forum posts that were more than two weeks old.  They are
> > still available via direct linking, they just no longer appear on the list that
> > you can scroll back to.
> >
Mark, Let me know if you find the time not restricted or something. Could I have missed a query?


(In reply to comment #46)
> Still need to fix: select
> DISTINCT(a.`threadId`),a.`object`,a.`objectType`,a.`parentId`,    
>                   a.`userName`,a.`commentDate`,a.`hits`,a.`type`,a.`points`,    
>                  
> a.`votes`,a.`average`,a.`title`,a.`data`,a.`hash`,a.`user_ip`,    
>                  
> a.`summary`,a.`smiley`,a.`message_id`,a.`in_reply_to`,a.`comment_rating`,
>                  (SELECT IFNULL(max(b.`commentDate`), a.`commentDate`)  as
> `lastPost` 
>                  FROM `tiki_comments` b where `parentId`=a.`threadId`) AS
> `lastPost`,
>                  (SELECT count(b.`threadId`) as replies from `tiki_comments` b
> where `parentId`=a.`threadId`) AS `replies`,
>                  (a.`type` = 's') AS sortorder
>                  from
>                  `tiki_comments` a left join `tiki_comments` b  on
> b.`parentId`=a.`threadId` where a.`object`='1' and a.`objectType` = 'forum' and
> a.`parentId` = 0    order by sortorder DESC, `lastPost` desc, `threadId` LIMIT
> 0,20
> 
> Queries like that are starting to take a really long time.

This one is caused by the order by.  In local testing, running the query takes 1m 47s.  Without the order by it takes 0.01s.   

(The 'Using filesort' in
+----+--------------------+-------+------+-----------------------+------------+---------+-----------------+-------+----------------------------------------------+
| id | select_type        | table | type | possible_keys         | key        | key_len | ref             | rows  | Extra                                        |
+----+--------------------+-------+------+-----------------------+------------+---------+-----------------+-------+----------------------------------------------+
|  1 | PRIMARY            | a     | ref  | no_repeats,objectType | no_repeats |       5 | const           | 93875 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | b     | ref  | no_repeats            | no_repeats |       5 | sumo.a.threadId |  1709 | Using index                                  |
|  3 | DEPENDENT SUBQUERY | b     | ref  | no_repeats            | no_repeats |       5 | sumo.a.threadId | 17092 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | b     | ref  | no_repeats            | no_repeats |       5 | sumo.a.threadId | 17092 | Using where; Using index                     |
+----+--------------------+-------+------+-----------------------+------------+---------+-----------------+-------+----------------------------------------------+
is the giveaway.)


According to http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html, since this query uses a mix of ASC and DESC, it can't use keys.  If I cut the threadId sort (ASC) it cuts execution time to 33s.  Worthwhile?
Re Mark's query:
Nelson's change has been in production since October 20.  Can someone in server-ops (Jeremy?) revert Mark's change and check we've solved that problem?
Attachment #347567 - Flags: review?(nelson) → review+
Patch in comment 57 committed in r19657.  That's it for 0.7.2.

Reassigning to oremj for comments on:
1.  comment 56 above

2. whether we can tune our mysql install as referenced in the bottom section of 
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html, specifically
" If you want to increase ORDER BY speed, check whether you can get MySQL to use indexes rather than an extra sorting phase. If this is not possible, you can try the following strategies:
"    * Increase the size of the sort_buffer_size variable.
    * Increase the size of the read_rnd_buffer_size variable.
...
    * Change tmpdir to point to a dedicated filesystem with large amounts of free space. Also, this option accepts several paths that are used in round-robin fashion, so you can use this feature to spread the load across several directories. Paths should be separated by colon characters (“:”) on Unix and semicolon characters (“;”) on Windows, NetWare, and OS/2. The paths should be for directories in filesystems that are located on different physical disks, not different partitions on the same disk."

What do you think?
Assignee: laura → oremj
Severity: critical → major
Target Milestone: 0.7.2 → 0.7.3
Which file is Mark's change local change in and which query are we trying to optimize by tweaking the MySQL config?
1.  Mark ran 
mysql> update tiki_comments set parentId = 999999999 where object = 1 and
objectType = 'forum' and parentId = 0 and commentDate < unix_timestamp() -
86400*14;
Query OK, 64021 rows affected (3 min 54.05 sec)
Rows matched: 64021  Changed: 64021  Warnings: 0
in production.
Revert it by running
update tiki_comments set parentId = 0 where parentId = 999999999;

2.  The slow query you pasted in comment 46.  If you read comment 55, you'll see that it's the order by clause that's making that slow.  I've sped it up some, but to speed up the filesort I'm suggesting we try the tuning strategies listed in comment 58.  (Wow, that's confusing, grab me in IRC if it doesn't make sense.)
I'm probably overlooking something, but what is the join doing in that query?
         LEFT JOIN `tiki_comments` b
           ON b.`parentId` = a.`threadId`
Note: patch from comment 57 in prod branch r19661
(In reply to comment #61)
> I'm probably overlooking something, but what is the join doing in that query?
>          LEFT JOIN `tiki_comments` b
>            ON b.`parentId` = a.`threadId`

Pulling in the replies so it can display the number of replies to a thread, I think.
The reply count should be covered by 

                  (SELECT COUNT(b.`threadId`)  AS replies
                   FROM   `tiki_comments` b
                   WHERE  `parentId` = a.`threadId`) AS `replies`,
Two comments because I just started looking at this bug.  Comment 46:

The way it works is it seems to be joining a 200K row table to itself making a massive table, then calculating lastPost for each row based on a subquery of this table, then sorting based on lastPost.  Not to mention having to make sure there are no repeats on threadId.  It's painfully expensive.

A couple ways to make it faster:
Why are we pulling a.`data`? This query pulls the first 20 threads for the frontpage of the forums, we don't need to show data.  It just adds tons of information to return.

Also, if we break the query into two parts it should run a lot faster.  The first pulls the threadIds of the threads and since it sorts on commentDate (which we also index on) it should run pretty fast: 

SELECT DISTINCT (IF(a.parentId = 0, a.threadId, a.parentId)) as threadId, a.`type` = 's' as sortorder FROM tiki_comments a WHERE a.object=1 and a.objectType = 'forum' ORDER BY sortorder DESC, a.commentDate DESC LIMIT 0,20 

The second query would pull full data for each of the above results with the threadId as XXXX

select a.`threadId`,a.`object`,a.`objectType`,a.`parentId`,
                  a.`userName`,a.`commentDate`,a.`hits`,a.`type`,a.`points`,
a.`votes`,a.`average`,a.`title`,a.`hash`,a.`user_ip`, a.`summary`,a.`smiley`,a.`message_id`,a.`in_reply_to`,a.`comment_rating`,
                 (SELECT IFNULL(max(b.`commentDate`), a.`commentDate`)  as
`lastPost` 
                 FROM `tiki_comments` b where `parentId`=a.`threadId`) AS
`lastPost`,
                 (SELECT count(b.`threadId`) as replies from `tiki_comments` b
where `parentId`=a.`threadId`) AS `replies`,
                 (a.`type` = 's') AS sortorder
                 from
                 `tiki_comments` a where a.threadId = XXXXX and
a.`parentId` = 0 ORDER BY sortorder DESC, `lastPost` DESC


I'm sure that there's a way to do this with a JOIN but my SQL-fu is weak.
mysql> update tiki_comments set parentId = 0 where parentId = 999999999;
Query OK, 68189 rows affected (3 min 45.16 sec)
Rows matched: 68189  Changed: 68189  Warnings: 0
I think Mark went through and optimized the databases pretty well.  Here are the current values:

sort_buffer=6M
read_rnd_buffer_size=4M
tmpdir=/tmp

I'm not sure if increasing the sort_buffer and read_rnd_buffer_size would be worth it or not.
Assignee: oremj → laura
Target Milestone: 0.7.3 → 0.8
Laura, when will the query from comment #65 be fixed?  Should we create a separate bug for that?
Right now the queries like the following are killing the database: 
select DISTINCT(a.`threadId`),a.`object`,a.`objectType`,a.`parentId`,    
                  a.`userName`,a.`commentDate`,a.`hits`,a.`type`,a.`points`,    
                  a.`votes`,a.`average`,a.`title`,a.`data`,a.`hash`,a.`user_ip`,    
                  a.`summary`,a.`smiley`,a.`message_id`,a.`in_reply_to`,a.`comment_rating`,
                 (SELECT IFNULL(max(b.`commentDate`), a.`commentDate`)  as `lastPost` 
                 FROM `tiki_comments` b where `parentId`=a.`threadId`) AS `lastPost`,
                 (SELECT count(b.`threadId`) as replies from `tiki_comments` b where `parentId`=a.`threadId`) AS `replies`,
                 (a.`type` = 's') AS sortorder
                 from
                 `tiki_comments` a left join `tiki_comments` b  on b.`parentId`=a.`threadId` where a.`object`='1' and a.`objectType` = 'forum' and a.`parentId` = 0    order by sortorder DESC, `userName` desc LIMIT 680,20
For the time being had to redo what mark did in comment #48.
mysql> update tiki_comments set parentId = 999999999 where object = 1 and
    -> objectType = 'forum' and parentId = 0 and commentDate < unix_timestamp() -
    -> 86400*14;
Query OK, 75574 rows affected (3 min 33.51 sec)
Rows matched: 75574  Changed: 75574  Warnings: 0
(In reply to comment #70)
> For the time being had to redo what mark did in comment #48.

(In reply to comment #48)
> so I hid top-level forum posts that were more than two weeks old.  They are
> still available via direct linking, they just no longer appear on the list
> that you can scroll back to.

Older forum threads are no longer available via direct-linking.  You get the error, "This forum is flat and doesn't allow replies to other replies".

See the Firefox Support Contributor forum on this:
http://support.mozilla.com/tiki-view_forum_thread.php?forumId=3&comments_parentId=221840
(In reply to comment #73)
> That's bug 460954.

Sorry.  Bug 460954 is marked "Depends on: 425135" and links back here to comment 48 for "A description of the change that caused this" so I thought I should comment here.  I've copied my comment to bug 460954.
Local benchmarks.

Before conversion, old query takes 5 minutes to run.  
After conversion, each new query takes ~0.01 seconds to run.
Attachment #352122 - Flags: review?(nelson)
I think that's the last outstanding query in here, and this bug is getting long and difficult to follow.  Once that's committed I'm going to close this out: for any new slow queries, please file individual bugs!
Duplicate of this bug: 424676
Duplicate of this bug: 453936
Attachment #316026 - Attachment is obsolete: true
Attachment #316026 - Flags: review?(nelson)
Attachment #352122 - Flags: review?(nelson) → review+
Patch in trunk r20677, prod branch r20860.

Closing this; if you find new slow queries please open a new bug for clarity.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
that should be prod branch r20680, sorry for typo
The forums posts mentioned in http://support.mozilla.com/tiki-view_forum_thread.php?forumId=3&comments_parentId=221840 are all now working for me; Verified FIXED.
Status: RESOLVED → VERIFIED
(In reply to comment #81)
> The forums posts mentioned in
> http://support.mozilla.com/tiki-view_forum_thread.php?forumId=3&comments_parentId=221840
> are all now working for me; Verified FIXED.

...which was a comment intended for bug 460954, sorry.
Whiteboard: tiki_triage
Whiteboard: tiki_triage → tiki_test
You need to log in before you can comment on or make changes to this bug.