Permission checking in queries not optimal

RESOLVED FIXED in Bugzilla 2.18

Status

()

P1
normal
RESOLVED FIXED
17 years ago
6 years ago

People

(Reporter: dkl, Assigned: dkl)

Tracking

({topperf})

2.17
Bugzilla 2.18
topperf
Dependency tree / graph

Details

Attachments

(2 attachments, 12 obsolete attachments)

v12
7.72 KB, patch
Details | Diff | Splinter Review
702 bytes, text/plain
myk
: review+
Details
(Assignee)

Description

17 years ago
I have found in practice that buglist.cgi will return results from a large query much 
faster when they SelectVisible() wrapper is not used for permission checking. I 
feel the performance hit is due to the large amount of outer joins that are being 
used. To show the difference you can simple comment out the SelectVisible line in 
buglist.cgi and re-run your query. The PostgreSQL branch was basically unusable 
without this removal with a large amount of bugs (50,000+). So in the PostgreSQL 
branch and now the Groups branch, I have altered to CanSeeBug to both take a 
single bug and a list of bugs returning the list of bugs that can be seen by the 
current user. This only adds one or two extra SQL queries (one for PostgreSQL, 
two for Groups) to each buglist.cgi hit. For details you can check out the code from 
either the Bugzilla_PgSql_Branch or the Bugzilla_Groups_Branch. I would be 
interested in seeing how this performs with 100,000+ bugs such as 
bugzilla.mozilla.org. Also if anyone is better at reading EXPLAIN output it would be 
helpful to compare the two.
dkl: can you give the sql gnereated by a query for all bugs asa has commented
on, both before and after this change?

If theres a large speedup on bmo, and the patch is simple, then maybe we could
get this in for 2.16? (unless the group stuff is landing before then, of course)

Post the explain output, too, I guess.
(Assignee)

Comment 2

17 years ago
I imagine this specific request would be difficult since I do not have a copy of the 
current mozilla database to run against my new code. I can simulate something by 
picking someone here at redhat and running a query against our database on bugs 
they have commented on which I assume will serve the same purpose. 
Priority: -- → P2
Target Milestone: --- → Bugzilla 2.16
Not a release blocker, no patch.  -> 2.18

I'm assuming we'll be able to fix this a better way when we redo the groups
stuff anyway (which is targetted for 2.18 already).  Setting dependency.
Depends on: 68022
Target Milestone: Bugzilla 2.16 → Bugzilla 2.18
Joel: comments on this bug now that you've rewritten groups?

Gerv

Comment 5

17 years ago
In light of the SQL discussions that have been going up and back now, I am going
to suggest that this bug be closed unless dkl jumps in and says otherwise.  The
postgresql bug 98304 is home to the changes to the query structure that make
these queries efficient across multiple dbs without requiring many outer joins.


Updated

17 years ago
Blocks: 172772

Comment 6

17 years ago
Changed name from:
Using SelectVisible() for permission checking not optimal.
To:
Permission checking in queries not optimal across DBs

From IRC discussion....

This bug is mutating to cover the changes to the Search.pm query to make it
closer to cross-db compatible or, at least, to unblock bug 172272

Assignee: endico → dkl
Blocks: 98304
OS: Linux → All
Priority: P2 → P1
Hardware: Other → All
Summary: Using SelectVisible() for permission checking not optimal. → Permission checking in queries not optimal across DBs
Version: 2.15 → 2.17
(Assignee)

Comment 7

17 years ago
Created attachment 102755 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v1)
(Assignee)

Comment 8

17 years ago
Created attachment 102758 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v2)

Oops, fixed perl warning and also left out other non-group related HAVING
stuff. Better patch.
(Assignee)

Updated

17 years ago
Attachment #102755 - Attachment is obsolete: true
Comment on attachment 102758 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v2)

You need to check for isbless != 0. You also don't need the DISTINCT. For the
empty grouplist case, use NOT NULL instead of NOT IN (....). I think.

You could also only select buggroups, and hope that the db can make that more
optimal, esp for teh common case where the user isn't in any bug group.

Bring on Bugzilla::User! :)

I suspect that this is faster, anyway - is it?
Attachment #102758 - Flags: review-
(Assignee)

Comment 10

17 years ago
Created attachment 102765 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v3)

New patch with bbaetz's changes
Attachment #102758 - Attachment is obsolete: true

Comment 11

17 years ago
Actually, when selecting the group list, you need to use isbless = 0, not
isbless != 0 or else only admins will be able to see bugs in queries.


After you do the ...
 @grouplist = (0) if !@grouplist; 

does the ...

+    if (@grouplist) {
+        $query .= " AND bug_group_map.group_id NOT IN (" . join(',',
@grouplist) . ") ";
+    } else {
+        $query .= " AND bug_group_map.group_id NOT NULL ";
+    }
+        

still work?




Status: NEW → ASSIGNED
No, that won;t work, what you do is leave @grouplist as empty, ie lose the (0) line.

Actually, won't you want IS NULL, rather than IS NOT NULL, so that you select
unprotected bugs?

Comment 13

17 years ago
Both of bbaetz's comment 12 notes are correct. 

Slightly off-topic,  what is an unprotected bug with OR-groups?
(Assignee)

Comment 14

17 years ago
Created attachment 102807 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v4)

@grouplist = (0) was due to lack of sleep. Fixed.
Attachment #102765 - Attachment is obsolete: true
(Assignee)

Comment 15

17 years ago
Reminder to myself per discussion with Joel on IRC

<dkl> joel_desk: the highlighting is only needed really when doing multibug editing
<dkl> so that you know that you are adding a private comment to a public bug or not
<dkl> we could make it where only multi bug suffers the slowdown somehow since
it is not used as often
<joel_desk> If that assumption is correct, the best option would be leave the
query alone, but have buglist do a query-per-bug only in the multi-bug-edit mode.
--- Kowalski|sleep is now known as Kowalski
<dkl> joel_desk: that is how I currently do it in the Oracle version except only
when in multi edit
<joel_desk> In which case, we may as well call an IsBugPublic function directly
from the template.
<joel_desk> Let's move that to the template in case someone has strong feelings
on the subject.
<dkl> joel_desk: or in buglist and just step one at a time through the resulting
bug array
<dkl> okay
<joel_desk> It becomes easy for a site to manage their own tradeoff.
<dkl> yeah that would be better
<dkl> I will generate a new patch
<dkl> i guess is doesnt really matter if we call it IsBugPublic() or IsBugPrivate()
<dkl> probably private better since the template can then check for a true value
<dkl> instead of !IsBugPublic()

Comment 16

17 years ago
Further notes on v4:

 You REALLY need a DISTINCT when populating grouplist()

 The queries generated when searching for a user  on the CC list, having
commented, reporting, or assigned are way slower than they need to be.  This
results from....

1) The reported and assigned EACH do their own get-the-user, then join with
profiles, then do the string compare.  Really, the entire property should get
the user maching the property and then avoid repeatedly joining the profiles in
for each table.
In other words, rather than....
SELECT stuff FROM bugs LEFT JOIN cc cc_ ON cc_.bugid = bugs.id LEFT JOIN
profiles map_cc ON cc_.who = profiles.id ...... WHERE map_cc.login =
'exactname@exactdomain.com'
We should first look up the userid specified in the form and then use matching
the numeric userid as part of the ON clause when the cc_ table is joined.
This is absolutely worthwhile for exact matches and cases where there are a
small number of userids matching the form.  It may even pay when there is a huge
number of matches to the form's userid, but there could be a limit to the number
of elements in an IN() clause.  (is there?)
So, for matches on "is" dkl@redhat.com or "contains dkl", it should certainly
work this way.  For cases like "doesn't match" dkl, is probably still has to do
the old join.

Even there, it may be possible to use...
SELECT stuff FROM bugs LEFT JOIN profiles AS profiles_leftbox ON
profiles_leftbox.login_name NOT LIKE "dkl" LEFT JOIN longdescs AS commented_ ON
commented_.who = profiles_leftbox.id AND commented_.bugid = bugs.id LEFT JOIN cc
AS cc_ ON cc_.who = profiles_leftbox.id AND cc_.bug_id = bugs.id ...morestuff
WHERE cc_.who IS NOT NULL OR commented_.who IS NOT NULL .. morestuff



Note that OR group checking is probably better, since we then wouldn't need the
COUNT stuff.

Comment 18

17 years ago
Actually, the issue in comment 16 is completely independent of AND vs OR groups
and is now being tracked under bug 175425.

The AND-vs-OR logic neutralizes out with the patch being worked here which does
eliminate the counting.
(Assignee)

Updated

17 years ago
Attachment #102807 - Attachment is obsolete: true
Attachment #102807 - Flags: review-
(Assignee)

Comment 19

17 years ago
Created attachment 103641 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v5)

Fixes:

1. Added DISTINCT back to SQL query that populates @grouplist per Joel's
request.
2. Added IsBugPrivate() to globals to be called from list template. Only
invoked 
when multi-edit is chosen do to being slight slower.
3. Fixed report.cgi to work with new query in Bugzilla::Search.
4. Removed 3 dummy fields in buglist.cgi.

Comment 20

17 years ago
One nit before this goes in....

1) We should supress the call to IsPrivate() if the use is not logged in (or
else make IsPrivate() return false immediately when called by a non-logged in user)
(Assignee)

Comment 21

17 years ago
WRT comment 20, this happens with the latest patch I attached.

<tr class="bz_[% bug.severity %] bz_[% bug.priority %] [%+ "bz_secure" IF
(dotweak && IsBugPrivate(bug.id)) %]">

dotweak in buglist.cgi requires login by default.

Updated

17 years ago
Attachment #103641 - Flags: review-

Comment 22

17 years ago
Comment on attachment 103641 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v5)

I'm seeing some non-public bugs that are showing up when the user is logged
out.

This comes from the ...
+    if (@grouplist) {
+	 $query .= " AND bug_group_map.group_id NOT IN (" . join(',',
@grouplist) . ") ";
+    } else {
+	 $query .= " AND bug_group_map.group_id IS NULL ";
+    }

Which should be 

+    if (@grouplist) {
+	 $query .= " AND bug_group_map.group_id NOT IN (" . join(',',
@grouplist) . ") ";
+    } 

To cause the join to grab ANY record that exists.  The only time we appear to
have a NULL in that field is if the JOIN can find nothing.




Also, we forgot to permit the QA contact to see bugs.  Better make sure that
the qa_contact is not zero when we do that or all bugs without QA contacts will
become visible to logged-out users.
(Assignee)

Comment 23

17 years ago
I have made the suggested change and still get bugs listed in the search results
that shouldnt be there for a logged out user.

$query = "SELECT DISTINCT " . join(', ', @fields) .
             " FROM $suppstring" .
             " LEFT JOIN bug_group_map " .
             " ON bug_group_map.bug_id = bugs.bug_id ";

    if (@grouplist) {
        $query .= " AND bug_group_map.group_id NOT IN (" . join(',', @grouplist)
. ") ";
    }

    $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
              " WHERE " . join(' AND ', (@wherepart, @andlist)) .
              " AND ((bug_group_map.group_id IS NULL) " .
              "    OR (bugs.reporter_accessible = 1 AND bugs.reporter =
$::userid) " .
              "    OR (bugs.cclist_accessible = 1 AND cc.who = $::userid) " .
              "    OR (bugs.assigned_to = $::userid) OR (bugs.qa_contact =
$::userid)) " .
              " GROUP BY bugs.bug_id";

Comment 24

17 years ago
I think you need to make the following conditional on the user being logged in....


              "    OR (bugs.reporter_accessible = 1 AND bugs.reporter =
$::userid) " .
              "    OR (bugs.cclist_accessible = 1 AND cc.who = $::userid) " .
              "    OR (bugs.assigned_to = $::userid) OR (bugs.qa_contact =
$::userid)) " .

Comment 25

17 years ago
To follow up....  I confirmed this....

quetly_check_login leaves $::userid = 0 if the user is not logged in.
qa_contact would be 0 if it is never set.

That should be the culprit.


(Assignee)

Comment 26

17 years ago
Created attachment 104061 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v6)

okay, made suggested changes. New cut.
Attachment #103641 - Attachment is obsolete: true
(Assignee)

Comment 27

17 years ago
Created attachment 104149 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v7)

Fix the SQL that gets the @grouplist to be more DB independent. Had problem
with foo JOIN bar ON (some condition) with a certain other db. Changed it to be


SELECT something FROM foo, bar WHERE foo.id = bar.id AND some condition.

Which does the same thing and is more compatible.
Attachment #104061 - Attachment is obsolete: true

Updated

17 years ago
Attachment #104149 - Flags: review+

Comment 28

17 years ago
Comment on attachment 104149 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v7)

r=joel

Updated

17 years ago
Blocks: 147275
(Assignee)

Comment 29

17 years ago
Created attachment 104242 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v8)

More efficient checking for private bugs in buglist.cgi. Calls to
IsBugPrivate() no longer needed.
Attachment #104149 - Attachment is obsolete: true
(Assignee)

Comment 30

17 years ago
Created attachment 104244 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v9)

Oops. Check for @bugidlist before performing check so as not generate a SQL
error if @bugidlist empty.
Attachment #104242 - Attachment is obsolete: true

Comment 31

17 years ago
Comment on attachment 104244 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v9)

Excellent - you did manage to get it back into the hash.
r=joel
Attachment #104244 - Flags: review+
Comment on attachment 104244 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v9)

>+# Check for bug privacy and set $bug->{id}->{isprivate} = 1 if private 
>+# to 1 or more groups
>+my %privatebugs;

The comment's wrong: that's not what you are setting.

>+if (@bugidlist) {
>+    SendSQL("SELECT DISTINCT bugs.bug_id FROM bugs, bug_group_map " .
>+            "WHERE bugs.bug_id = bug_group_map.bug_id " .
>+            "AND bugs.bug_id IN (" . join(',',@bugidlist) . ")");
>+    while (MoreSQLData()) {
>+        my ($id) = FetchSQLData();
>+        $privatebugs{$id} = 1;
>+    }
>+    foreach my $bug (@bugs) {
>+        if ($privatebugs{$bug->{id}}) {
>+            $bug->{isingroups} = 1;
>+        }
>+    }
>+}

This would be all so much easier if @bugs was actually a hash, %bugs. No need
for a separate list of IDs; no need for two loops. Is that too big a change?
The above section would look like this:
+if (%bugs) {
+    SendSQL("SELECT DISTINCT bugs.bug_id FROM bugs, bug_group_map " .
+	     "WHERE bugs.bug_id = bug_group_map.bug_id " .
+	     "AND bugs.bug_id IN (" . join(',', keys %bugs) . ")");
+    while (MoreSQLData()) {
+	 my ($id) = FetchSQLData();
+	 $bugs->{$id}{isingroups} = 1;
+    }

In fact, couldn't this be combined with the actual getting of the bug data
further up? I am reviewing this without access to the whole file, so I can't
see.

> ################################################################################
> # Template Variable Definition
>Index: report.cgi
>===================================================================
>RCS file: /cvsroot/mozilla/webtools/bugzilla/report.cgi,v
>retrieving revision 1.4
>diff -u -u -r1.4 report.cgi

Please don't mess with report.cgi, as I'm in the process of rewriting it (bug
173005). Are these changes necessary at the same time as everything else, or
can they be done in a later patch? (Also, I can't quite see what you are doing
- whay are you reading the id back from the database? If $id has always to be
selected, then you shouldn't force the caller to add it to the field list - you
should add it inside Search.pm.) 

>Index: Bugzilla/Search.pm
>===================================================================
>+   
>+    # Grab a list of group ids that user belongs to
>+    my @grouplist;
>+    my $query = "SELECT DISTINCT group_id FROM user_group_map, groups " .
>+                " WHERE user_group_map.group_id = groups.id " . 
>+                " AND groups.isbuggroup = 1 AND user_group_map.isbless = 0 " . 
>+                " AND user_group_map.user_id = $::userid";
>+    &::SendSQL($query);
>+    while (&::MoreSQLData()) {
>+        my ($id) = &::FetchSQLData();
>+        push (@grouplist, $id);
>+    }

The user's groups are, or should be, cached in $vars->{'user'}. No need to get
them again.

>+
>+    $query = "SELECT DISTINCT " . join(', ', @fields) .
>+             " FROM $suppstring" .
>+             " LEFT JOIN bug_group_map " .
>+             " ON bug_group_map.bug_id = bugs.bug_id ";
>+
>+    if (@grouplist) {
>+        $query .= " AND bug_group_map.group_id NOT IN (" . join(',', @grouplist) . ") ";
>+    } 
>+        
>+    $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
>+              " WHERE " . join(' AND ', (@wherepart, @andlist)) .
>+              " AND ((bug_group_map.group_id IS NULL) ";

This will fail if @wherepart and @andlist are both empty, which can happen, I
believe.

This is great - I'm sure this will speed things up immensely.

Gerv
Attachment #104244 - Flags: review-
(Assignee)

Comment 33

17 years ago
Created attachment 104515 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v10)

Comments about patch to follow.
Attachment #104244 - Attachment is obsolete: true
(Assignee)

Comment 34

17 years ago
WRT to Gerv's comments:

>The comment's wrong: that's not what you are setting.

Fixed.

>+    foreach my $bug (@bugs) {
>+        if ($privatebugs{$bug->{id}}) {
>+            $bug->{isingroups} = 1;
>+        }
>+    }
>+}

>This would be all so much easier if @bugs was actually a hash, %bugs. No need
>for a separate list of IDs; no need for two loops. Is that too big a change?

This would require changes to the table.html.tmpl template to properly iterate
through each record so it might be fine just to leave it as is.

> ################################################################################
> # Template Variable Definition
>Index: report.cgi
>===================================================================
>RCS file: /cvsroot/mozilla/webtools/bugzilla/report.cgi,v
>retrieving revision 1.4
>diff -u -u -r1.4 report.cgi

Please don't mess with report.cgi, as I'm in the process of rewriting it (bug
173005). Are these changes necessary at the same time as everything else, or
can they be done in a later patch? (Also, I can't quite see what you are doing
- whay are you reading the id back from the database? If $id has always to be
selected, then you shouldn't force the caller to add it to the field list - you
should add it inside Search.pm.) 

>+    # Grab a list of group ids that user belongs to
>+    my @grouplist;
>+    my $query = "SELECT DISTINCT group_id FROM user_group_map, groups " .
>+                " WHERE user_group_map.group_id = groups.id " . 
>+                " AND groups.isbuggroup = 1 AND user_group_map.isbless = 0 " . 
>+                " AND user_group_map.user_id = $::userid";
>+    &::SendSQL($query);
>+    while (&::MoreSQLData()) {
>+        my ($id) = &::FetchSQLData();
>+        push (@grouplist, $id);
>+    }

>The user's groups are, or should be, cached in $vars->{'user'}. No need to get
>them again.

This was actually a good idea. I had to alter GetUserInfo() in CGI.pl to pull in
group ids as well as names. Then just checked for the existence of
$::vars->{user}{groupids} in Search.pm and used those instead. Cut down one one
more unnecessary SQL query.

>+    $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
>+              " WHERE " . join(' AND ', (@wherepart, @andlist)) .
>+              " AND ((bug_group_map.group_id IS NULL) ";

>This will fail if @wherepart and @andlist are both empty, which can happen, I
>believe.

This was already like this before so was not introduced with this patch. Should
this be filed as a separate bug?

I have removed my changes from report.cgi since Gerv stated it was being
rewritten anyway. But note that it is broken when using this patch. Either
changes need to be made to report.cgi to add bugs.bug_id as the leading field
when calling Bugzilla::Search or Bugzilla::Search needs to add it if not there
since it is using a DISTINCT in the beginning of the main query. Comments?

Comment 35

17 years ago

If we want to avoid changing report.cgi, we should append the id to the list of
fields requested by the caller instead of prepending it.  However, that is
pretty hokey.  It would be much better to just fix report.cgi
(Assignee)

Comment 36

17 years ago
Created attachment 104666 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v10)

Fixed problem with $::vars->{user}{groupids} check. Also changed to look for
$::vars->{user}{userid} instead of $::userid since that may go away in the
future.
Attachment #104515 - Attachment is obsolete: true
> I have removed my changes from report.cgi since Gerv stated it was being
> rewritten anyway. But note that it is broken when using this patch. Either
> changes need to be made to report.cgi to add bugs.bug_id as the leading field
> when calling Bugzilla::Search or Bugzilla::Search needs to add it if not there
> since it is using a DISTINCT in the beginning of the main query. Comments?

If the workings of Bugzilla::Search require bugs.bug_id to be the first field
(why? DISTINCT doesn't have a special relationship with the first field named)
then it should make sure that this is true itself.

There should be no onus on a user of the search API to request particular
columns; if Bugzilla::Search needs particular columns, it should request them,
and make the fact that it has done so transparent to the caller (so that if the
caller asks for fields x, y and z, the results which return have fields x, y and
z as the first three fields.)

Gerv
(Assignee)

Comment 38

17 years ago
Created attachment 104687 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v11)

Fixed error problem with @{$::vars->{user}{groupids}} if not defined.
(Assignee)

Updated

17 years ago
Attachment #104666 - Attachment is obsolete: true

Comment 39

17 years ago
Assuming that any columns not retrieved are discarded anyway, we should APPEND
the bug_id to the list of fields requested if the caller doesn't ask for it. 
This will unbreak reports without hacking it up.
Comment on attachment 104687 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v11)

>+    $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
>+              " WHERE " . join(' AND ', (@wherepart, @andlist)) .
>+              " AND ((bug_group_map.group_id IS NULL) ";

You don't want to be unconditionally creating a row for every person on the cc
list...

You lost the ' AND cc.who = $::userid' from the join condition, so that you get
exactly one row per bug. Plus, that LEFT JOIN is only needed if you're logged
in.

Also, make the cc AS ccMap, just to avoid possible name conflicts with joins
from the search code, and also as a form of documentation.

You also may need to change the corresponding WHERE thing for the ccs to add a
AND ccMap.who IS NOT NULL, in order to work arround
http://lists.mysql.com/cgi-ez/ezmlm-cgi?9:mss:11417 You could also use IS NOT
NULL in place of checking for equality with $::userid, rather tha in addition
to, I think; if you do both, add a comment pointing to that url, and mention
that the bug was fixed in 3.23.44.

Oh, for a SUBSELECT ;)
Attachment #104687 - Flags: review-
You can also drop teh DISTINCT; its implied by the GROUP by
although myk noticed times increasing when the distinct was dropped. No clue why...
Anyway, even with teh JOIN change, the patch with this bug was still about 0.5
seconds slower when searching for all Bugzilla+Webtools bugs 10 times. That was
out of 1 min 20 second, though...

It may be better if the user isn't logged in, mind you.
No longer blocks: 176570
OK, new fun stuff

GROUP BY is 10-15% slower than DISTINCT on mysql (Its also slower on pg, but a
query which takes 0.8 seconds before a vaccum takes 12 after, probably because I
have a bogus cost setting somewhere which I don't feel like chasing down)

I'm guessing, (based partly on pg's more detailed EXPLAIN results) that the
reason is that GROUP has to sort every single possible column into a temp data
structure (since they could be used for HAVING/aggregate), whilst DISTINCT can
just sort the returned columns. Anyway, let me hack this patch to handle that...
OK, I'm going to attach a patch. Since so much of buglist.cgi's time is spent
loading TT+friends, I hacked up a quick script to generate the sql, then call
SendSQL on that 1000 times. The query I used the query term 'product=foo'

w/o being logged in:
prepatch, this took 6.1 seconds
postpatch, this to 3.8

when logged in:
prepatch - 6.8 seconds
postpatch - 4.7

I only have 40 bugs in total, so those numbers aren't really meaningful.

However, on a randomly generated db with 50000 bugs (spread roughly evenly over
10 products), 20000 users, 10 groups, 100000 entries in the cc table, 100
entires in user_group_map and 50 entries in bug_group_map, running 100 times for
product = '1 existing product':

w/o being logged in:

prepatch - 48 seconds
post-patch - 18 seconds

Being logged in:

prepatch - 48 seconds
post-patch - 28 seconds

Searching on two products:

not logged in:

pre - 67 seconds
post - 33 seconds

logged in:

prepatch - 67 seconds
post patch - 55 seconds

The vast speed difference when not being logged in is mainly due to avoiding the
unneeded table joins. Re-adding the GROUP BY adds 1-2 seconds to the post patch
time; IOW its about 5%. This is close to the variance, though, so take that
number with a grain of salt.

Numbers are average of 2-3 runs, variance about +/- 0.3 seconds for the first
set, and +/- 2-3 sec for the large db, and I discarded the first run for each query.

Unless the optimiser barfs, this improvement is relative to the number of rows
found in the query; its not going to be noticable on one of the monster
comment/historical stuff. Still, it won't hurt :)

In short; I think we really want this for bmo's update; patch coming after I fix
the reporting stuff.

While we're at it, I'll point out that the current Bugzilla::Search stuff leaves
out allowing the qa contact to see the bug..... Those numbers were run with the
new stuff disabling it too, but its fixed in the patch.

myk, given the above numbers I'm not sure why you weren't seeing any difference
when we ran the queries the other day. If you don't want it immediately because
of perceived risk, we need to fix teh qa stuff regardless.
Keywords: topperf
Those numbers were all for selecting the bug_id only. If I additionally select
assigned_to (which involves an extra join), then I get:

not logged in:

pre - 79
post - 63 

logged in:

pre - 70
post - 76

Removing distinct 'fixes' this. The numbers are really bad if you want reporter
and assigned_to, - they're worse by a  *factor* of 3.

Consider the query to find all bugs not in a group:

SELECT bugs.bug_id, map_assigned_to.login_name FROM profiles AS map_assigned_to,
bugs LEFT JOIN bug_group_map  ON bug_group_map.bug_id = bugs.bug_id  WHERE
bugs.assigned_to = map_assigned_to.userid AND ((bug_group_map.group_id IS NULL));

without distinct, 2.8 seonds. With, 18.5 seconds. No, thats not a typo. Using
GROUP BY in this case is 5.8 seconds, so I'm going to switch to that. The
numbers in my previous coment roughly stay the same, and the last case where we
are worse off becomes 60 s.

The real fix for this is probably to split up the 'give me columns' stuff from
the search fu, but that would not only be a pain with the current architechture,
its probably something which would be better done using subselects. (In any
event, doing so is probably a blocker for custfields)

As (another) aside, if I change the product code to do the id lookup in
Bugzilla::Search rather than a join, stuff doesn't seem to change much; the
optimiser is obvioulsy managing to pull that out, at least
Created attachment 105576 [details] [diff] [review]
v12

As previously discussed. The sumary for this bug probably needs to be changed;
its not about cross db support, and none of the other patches have done that
either.

myk, can I get you to run some comparison tests on bmo-based data, please?
Attachment #104687 - Attachment is obsolete: true

Comment 48

17 years ago
A few initial comments....

In some of the earlier work dkl and I did, we found that the IN() clause will
fail if there are more than a few thousand items in it.  I dont remember if this
effected Pg or MySql or both. Buglist.cgi needs a way of more incrementally
running the query to build the privacy hash.  A good way could be to just let it
go through the buglist 1000 entries at a time.

For logged out users, we know in advance that none of the bugs will be
restricted, so we should skip that entire query.

isingroups will become a misnomer when we get to OR groups.  How about reversing
the sense of it (only one template seems to use it) and making it "ispublic"?

This does look like it is on the right track, tough I am not certain what the
impact on Pg compatibility will be.  

No longer blocks: 147275
bbaetz- got that test script handy?
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-anonymous-before.sql > /dev/null; done

real    0m51.317s
user    0m5.721s
sys     0m2.666s
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-anonymous-after.sql > /dev/null; done

real    0m50.216s
user    0m5.691s
sys     0m2.652s
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-myk-before.sql > /dev/null; done

real    0m51.709s
user    0m5.699s
sys     0m2.627s
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-myk-after.sql > /dev/null; done

real    0m50.987s
user    0m5.742s
sys     0m2.678s

Doh. myk - I'll attach it now. What sort of query were you running? As I said,
the improvement is per row retreived, so make sure that you return a few hundred
rows to notice th emost difference.

dkl - we only use IN for groups. Noone is going to have several thousand groups,
and if anyone ever does we could move to a subselect with NOT EXISTS, which
would be faster in that case anyway.
Created attachment 105644 [details]
test script

Change the user id and groups stuff as appropriate. Put the script whereever
(change the $::db_name setting, too), and run from the root bugzilla dir.

Try it after also removing the assigned_to search, too.
Ok, here's what I get with the test script:

version on b.m.o: ~28.16
tip before 114696: ~17.55
tip after 114696: ~13.73
tip after 114696 and removing duplicate condition: ~13.69

I don't fully understand these numbers, since the version on b.m.o should be
faster than the others, but there it is.
Comment on attachment 105644 [details]
test script

Ok, I'm satisfied.  Testing shows significant performance improvements, and as
far as I can tell it doesn't break anything.  r=myk
Attachment #105644 - Flags: review+
Checked in. The cross-db support has a fair way to go, please file a separate
bug for those
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED
Summary: Permission checking in queries not optimal across DBs → Permission checking in queries not optimal

Comment 57

14 years ago
User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 
1.0.3705)
Build Identifier: bugzilla/2.18.1

I use the bugzilla 2.18.1 and have a productA
       The Group Access Controls groupA: Default/Default
                                 groupB: Default/Default
                                 groupC: Default/Default
                                 groupD: Default/Default 
       The groupA->user(user1,user2)
           groupB->user(user3,user4)
           groupC->user(user5,user6)     
           groupD->user(user7,user8)
     

Reproducible: Always

Steps to Reproduce:
1. when user1 new a bug 1,select the groupA,groupB,groupC .

Actual Results:  
1.the bug 1 only user1 can see.


Expected Results:  
1.The users of groupA,groupB and groupC can see the bug 1,but the users of 
groupD can not see the bug.

when user1 new a bug 2 ,select the groupA. the result only the groupA can see 
the bug.It is right.

who can tall me ,How can i do?
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.