Closed Bug 491324 Opened 15 years ago Closed 15 years ago

Slow queries

Categories

(Websites Graveyard :: spreadfirefox.com, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: oremj, Assigned: oremj)

References

Details

I'm seeing queries like the following taking longer than 2 minutes to run:

SELECT   DISTINCT (t.tid),
                  t.*,
                  parent
FROM     term_data t
         INNER JOIN term_hierarchy h
           ON t.tid = h.tid
WHERE    (t.tid NOT IN (SELECT x.tid
               FROM   (SELECT DISTINCT (ogt.tid)
                       FROM   og_term ogt
                              INNER JOIN og_uid ogu
                                ON ogt.nid = ogu.nid
                                   AND ogu.uid != 260166
                                   AND ogt.PUBLIC < 2) x
               WHERE  x.tid NOT IN (SELECT DISTINCT (ogt.tid)
                                    FROM   og_term ogt
                                           INNER JOIN og_uid ogu
                                             ON ogt.nid = ogu.nid
                                                AND (ogt.PUBLIC > 1
                                                      OR ogu.uid = 260166)))
 AND t.tid NOT IN (SELECT DISTINCT (ogt.tid)
                   FROM   og_term ogt
                          INNER JOIN og_uid ogu
                            ON ogt.nid = ogu.nid
                               AND ogu.uid = 260166
                   WHERE  ogu.is_active = 0))
AND (t.vid = 3)
ORDER BY weight,
         name
Explain ran on that query:
+----+--------------------+------------+--------+---------------+---------+---------+----------------------+------+----------------------------------------------+
| id | select_type        | table      | type   | possible_keys | key     | key_len | ref                  | rows | Extra                                        |
+----+--------------------+------------+--------+---------------+---------+---------+----------------------+------+----------------------------------------------+
|  1 | PRIMARY            | t          | ALL    | PRIMARY,vid   | NULL    | NULL    | NULL                 |  453 | Using where; Using temporary; Using filesort | 
|  1 | PRIMARY            | h          | ref    | PRIMARY,tid   | tid     | 4       | sfx_v2.t.tid         |    1 |                                              | 
|  5 | DEPENDENT SUBQUERY | ogt        | ref    | PRIMARY       | PRIMARY | 4       | func                 |    5 | Using where; Using index; Using temporary    | 
|  5 | DEPENDENT SUBQUERY | ogu        | eq_ref | PRIMARY       | PRIMARY | 8       | sfx_v2.ogt.nid,const |    1 | Using where                                  | 
|  2 | DEPENDENT SUBQUERY | <derived3> | ALL    | NULL          | NULL    | NULL    | NULL                 |  584 | Using where                                  | 
|  4 | DEPENDENT SUBQUERY | ogt        | ref    | PRIMARY       | PRIMARY | 4       | func                 |    5 | Using where; Using temporary                 | 
|  4 | DEPENDENT SUBQUERY | ogu        | ref    | PRIMARY       | PRIMARY | 4       | sfx_v2.ogt.nid       |  857 | Using where; Using index                     | 
|  3 | DERIVED            | ogt        | ALL    | NULL          | NULL    | NULL    | NULL                 |  590 | Using where; Using temporary                 | 
|  3 | DERIVED            | ogu        | ref    | PRIMARY       | PRIMARY | 4       | sfx_v2.ogt.nid       |  857 | Using where; Using index                     | 
+----+--------------------+------------+--------+---------------+---------+---------+----------------------+------+----------------------------------------------+
sigh, this comes from the og_forum module.

CC'ing Paul Booker, he maintains this module on drupal.org.  

Paul, what is this query doing?  what can we do to fix it?  I loaded up some forums on the production site and they loaded fine.  Is this a cron query maybe?
It appears that these slow queries may be why MDC is failing; MDC is making database requests that are timing out because apparently the database is overworked.
(In reply to comment #3)
> It appears that these slow queries may be why MDC is failing; MDC is making
> database requests that are timing out because apparently the database is
> overworked.

oremj, can you verify this is the case?  If so, is there an infra change we can make to sandbox SFx until we get the proper code changes figured out?

Thanks
That does appear to be the case. I am seeing those og queries stack up.
What can we do to get this to stop impacting MDC asap?
(In reply to comment #2)
> sigh, this comes from the og_forum module.
> 
> CC'ing Paul Booker, he maintains this module on drupal.org.  

Just for the record i have been the co-maintainer of this module since August last year fixing bugs and testing patches. I have also upgraded this module to D6 for the migration of spreadfirefox to D6 

> 
> Paul, what is this query doing? 

To the best of my knowledge this query has something to do with allowing regular forums to work side by side with og forums.

 what can we do to fix it?  I loaded up some
> forums on the production site and they loaded fine.  

I have had a quick look into the problem and it looks to be related to the hook 
og_forum_db_rewrite_sql(). Posts that satisfy the condition ($restrict && $user->uid != 0 && !$all_public) will trigger an SQL rewrite.

I'll need some time to investigate as i have not read / touched this code at all since i have been working on the project.

This module has been running on spreadfirefox since the launch of SFX2 so has this problem existed since launch or is it a recent development ?

Best, Paul

Is this a cron query
> maybe?
I'll work on this tomorrow as a priority
Assignee: nobody → paul
Noticed that if i join a new group and then click on the link to create a new discussion topic ..

http://www.spreadfirefox.com/node/add/forum?gids[]=382

.. it times out but if we leave the page for another minute or so and refresh the pages the page then loads very fast which seems to suggest the query finishes loading and is cached (thats nice).

Also this URI is associated with $restrict and not $strict2 in the og.forum module code ..

elseif ((arg(0) == 'node' && arg(1) == 'add' && arg(2) == 'forum') && ($user->uid != 1) && !(user_access('administer forums'))) {
      $restrict = TRUE;
    }

So it looks as though the problem is largely if not entirely with ...

$return['where'] = 't.tid NOT IN (SELECT x.tid FROM (SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND ogu.uid != ' .$user->uid. ' AND ogt.public < 2) x
      WHERE x.tid NOT IN (SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND (ogt.public > 1 OR ogu.uid = ' .$user->uid. '))) 
      AND t.tid NOT IN (SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND ogu.uid = ' .$user->uid. ' WHERE ogu.is_active = 0)';

.. which i must admit i find difficult to even look at let alone understand :-(

I have asked Ryan who is the lead project maintainer of OG Forum and who wrote the module to help me (us) out here.

@Alex

It would be great if i could get a hold of the spreadfirefox database before the weekend so that i can see what happens when you comment this SQL out. It's probably worth having a go yourself if you have a local workstation running spreadfirefox


Best, Paul
Update:

No reply from Ryan , hoping for a reply later today.

The code that is causing the problems exists to restrict what forums should be visible to a user when creating a discussion post in a site wide context (ie sitewide forums & subscribed group forums). So a possible temporary solution if we have no sitewide forums? is to comment out the offending code and take out "Discussion topic" from the  "Create content" menu.


Paul
Update:

Currently exploring rewriting the subqueries as joins for optimization benefits
http://dev.mysql.com/doc/refman/5.1/en/rewriting-subqueries.html
Struggling with a couple of things at the moment  ...

1. Not clear to me at all how we convert the subquery parts into $return['join'] for the hook_db_rewrite_sql()  

2. Also not clear to me at all what 'x' is.


Here is the SQL that needs to refactored which i have formatted a little to investigate its structure

$return['where'] = 't.tid NOT IN (
                                 SELECT x.tid FROM (
                                                   SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND ogu.uid != ' .$user->uid. ' AND ogt.public < 2             ) 
                                                   x
                                                     WHERE x.tid NOT IN (
                                                                        SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND (ogt.public > 1 OR ogu.uid = ' .$user->uid. ')            )        
                                 ) 
                                 AND t.tid NOT IN (
                                                  SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND ogu.uid = ' .$user->uid. ' WHERE ogu.is_active = 0        )'

I'll continue with this after lunch ...

Best, Paul
@Alex

Can i take you up on you offer to have a look at this bug if i get stuck. 

Here is a further update ... 

An observation for later reference. The 3 SQL's below are used to filter out tid records ..

SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND ogu.uid != ' .$user->uid. ' AND ogt.public < 2                        SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND ogu.uid = ' .$user->uid. ' WHERE ogu.is_active = 0 
SELECT DISTINCT(ogt.tid) FROM {og_term} ogt INNER JOIN {og_uid} ogu ON ogt.nid = ogu.nid AND (ogt.public > 1 OR ogu.uid = ' .$user->uid. ')  


Thu query that we are modifying / rewriting is ..

SELECT t.tid, t.*, parent FROM {term_data} t INNER JOIN {term_hierarchy} h ON t.tid = h.tid WHERE t.vid = %d ORDER BY weight, name


We can get close to building the SQL we need more simply along the lines of ..

SELECT t.tid, t.*, parent FROM {term_data} t LEFT JOIN {term_hierarchy} h ON t.tid = h.tid LEFT JOIN {og_term} ogt ON t.tid = ogt.tid LEFT JOIN {og_uid} ogu ON ogt.nid = ogu.nid   WHERE t.vid = %d  AND ogu.uid = %d ORDER BY weight, name 


However at the moment i can't seem to translate this into a working hook_db_rewrite_sql() . My best effort looks to be ..

$return['join'] = "LEFT JOIN {og_term} ogt ON t.tid = ogt.tid LEFT JOIN {og_uid} ogu ON ogt.nid = ogu.nid";
$return['where'] = "ogu.uid = %d";

If this can be made to work it should then be possible to add the other factors that we see in the 3 SQL's like AND (ogt.public > 1 OR ogu.uid = ' .$user->uid. ') , ..
Assignee: paul → nobody
I commented on this yesterday, but I guess it didn't submit, sorry Paul.

I looked at this yesterday a bit.  I'm not sure of a solution yet, but those nasty queries are definitely the problem.

Last night, while I was trying to create an event on stage, the form kept timing out.  I think it had something to do with these slow queries.  So, I'm not sure this is limited to forum posts only.

I'm not sure I understand why the hook_db_rewrite_sql() is needed.  I think it has something to do with the projects a user is allowed to submit new content to.  I like the idea of bypassing this check altogether, by allowing users to post to ANY project and/or changing the workflow for creating new content.

There are a TON of settings on admin/og/og_forum, do any of these help us?  Like, 'update old groups' or 'auto publicize all groups'?  I don't understand a lot of those settings yet.

I'll look at this more today.  Paul, are there other modules that provide similar functionality to og_forum?
Assignee: nobody → buchanae
Something which i will look into tomorrow .. 
http://drupal.org/node/462808


(In reply to comment #15)
> I commented on this yesterday, but I guess it didn't submit, sorry Paul.

No problem Alex
> 
> I looked at this yesterday a bit.  I'm not sure of a solution yet, but those
> nasty queries are definitely the problem.
> 
> Last night, while I was trying to create an event on stage, the form kept
> timing out.  I think it had something to do with these slow queries.  So, I'm
> not sure this is limited to forum posts only.

Interesting. I would guess his will turn out to be a problem with the events module.
> 
> I'm not sure I understand why the hook_db_rewrite_sql() is needed.  

The code that is causing the problems exists to restrict what forums should be
visible to a user when creating a discussion post in a site wide context (ie
sitewide forums & subscribed group forums).

I think it
> has something to do with the projects a user is allowed to submit new content
> to.  I like the idea of bypassing this check altogether, by allowing users to
> post to ANY project and/or changing the workflow for creating new content.
> 
> There are a TON of settings on admin/og/og_forum, do any of these help us? 
> Like, 'update old groups' or 'auto publicize all groups'?  I don't understand a
> lot of those settings yet.

These don't help resolve the problem

> 
> I'll look at this more today.  Paul, are there other modules that provide
> similar functionality to og_forum?

OG Forum is a great module it just has this scaling problem which needs to be resolved. I think well get past this.
If we only show site wide forums outside of groups and continue to show only group forums inside groups then it looks as though we can simplify og_forum_db_rewrite_sql()  as follows ... 

function og_forum_db_rewrite_sql($query, $primary_table = 'n', $primary_field = 'nid') {
 
  if ($primary_field == 'tid') {
    //Only do this on forum nodes
    if(!$_GET['gids']) {
    $return['join'] = "LEFT JOIN {og_term} ogt ON t.tid = ogt.tid";
	$return['where'] = "ogt.nid IS NULL";
	$return['distinct'] = TRUE;
    return $return;
    }
  }
} 

.. feel free to reassign this back to me.

Best, Paul
Slight improvement ..

/**
 * Implementation of hook_db_rewrite_sql().
 *
*/
function og_forum_db_rewrite_sql($query, $primary_table = 'n', $primary_field = 'nid') {
  if ($primary_field == 'tid') {
    if (!$_GET['gids']) {
      $return['join'] = "LEFT JOIN {og_term} ogt ON t.tid = ogt.tid";
      $return['where'] = "ogt.nid IS NULL";
      if ($og_forum_container = variable_get('forum_default_container', FALSE)) {
	    $return['where'] .= " AND t.tid != $og_forum_container";
	  }
      $return['distinct'] = TRUE;
      return $return;
    }
  }
}// function og_forum_db_rewrite_sql()
Target Milestone: --- → 3.0.1
I was able to stop these queries by tweaking some settings on /admin/og/og_forum

1) go to /admin/og/og_forum
2) enable 'Automatic forum publicity'
3) enable 'Make all forums public'
4) save the form
5) Click the 'Publicize old groups' button
Nice job Alex
An update for the record,

My solution in comment 19 has the effect of making all forums on the site public.  Nearly all forums on the site are private, so this seems touchy.

We're not sure how the forums became private in the first place.  Most likely when the og_forum module was added, the privacy feature was unknown/not discussed.  It looks like new forums default to private.

I propose we do away with private forums altogether, at least until we can do it well, but first we need ask questions such as...

* are any private on purpose? how many?
* do users need private forums? why?
* do we want to support private forums?
* etc

In the meantime, I'll see if I can patch this query with something more lightweight.
Target Milestone: 3.0.1 → 3.0.2
OS: Mac OS X → All
This query brought down the database this morning:
SELECT   DISTINCT (t.tid),
                  t.*,
                  parent
FROM     term_data t
         INNER JOIN term_hierarchy h
           ON t.tid = h.tid
WHERE    (t.tid NOT IN (SELECT x.tid
                        FROM   (SELECT DISTINCT (ogt.tid)
                                FROM   og_term ogt
                                       INNER JOIN og_uid ogu
                                         ON ogt.nid = ogu.nid
                                            AND ogu.uid != 127519
                                            AND ogt.PUBLIC < 2) x
                        WHERE  x.tid NOT IN (SELECT DISTINCT (ogt.tid)
                                             FROM   og_term ogt
                                                    INNER JOIN og_uid ogu
                                                      ON ogt.nid = ogu.nid
                                                         AND (ogt.PUBLIC > 1
                                                               OR ogu.uid = 127519)))
          AND t.tid NOT IN (SELECT DISTINCT (ogt.tid)
                            FROM   og_term ogt
                                   INNER JOIN og_uid ogu
                                     ON ogt.nid = ogu.nid
                                        AND ogu.uid = 127519
                            WHERE  ogu.is_active = 0))
         AND (t.vid = 3)
ORDER BY weight,
         name;


Were the changes in comment 19 implemented in production?
Hey oremj,

The idea in comment #19 has not been implemented, per comment #21.  We'll be laying out a plan for how to deal with comment #21 in tomorrow's meeting.

I'll keep trying to find a more immediate temporary solution.

Sorry this brought the DB down :(
Was there any problem with the solution proposed at #c18 ? This did look to be a good solution at the time.

Just some notes about OG modules that may be useful for future discussion ..

Organic groups:
Enables users to create and manage their own 'groups' .
Group nodes are nodes that attract other nodes to them 

Organic groups access control:
Enables access control for private posts and private groups.
So access control on all group nodes is something that is introduced on top of the basic functionality supplied by "Organic groups" 

OG Forum
Creates forums within organic groups.
So this module allows you to have forums that are under the control of a group manager and which are affiliated with a group. This module also provides a "Forum publicity administration" section which allows you to specify how the publicity of forums can be managed. 

Also the "Forum publicity administration" works well or is not being used out in the field as i have not had any complaints about this functionality since effectively taking over as project maintainer last August.

Best, Paul
Hardware: x86 → All
Target Milestone: 3.0.2 → 3.0.3
I made the changes from comment #19.

oremj, could you help monitor whether these queries are still occuring?

Thanks.
assigning to oremj so he sees this bug (hopefully) for comment #25
Assignee: buchanae → jeremy.orem+bugs
Is that change in production yet?  Would need to know which server to look at to tell if it's still happening.
(In reply to comment #28)
> Is that change in production yet?  Would need to know which server to look at
> to tell if it's still happening.

yes
per IRC, justdave confirmed the last time this occurred was July 30th, which was when I applied the fix (comment #25)
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: Websites → Websites Graveyard
You need to log in before you can comment on or make changes to this bug.