Closed Bug 97469 Opened 23 years ago Closed 23 years ago

Assignee/QA/Reporter/CC don't get email on restricted bugs

Categories

(Bugzilla :: Email Notifications, defect, P2)

2.13
defect

Tracking

()

RESOLVED FIXED
Bugzilla 2.16

People

(Reporter: justdave, Assigned: bbaetz)

References

Details

Attachments

(3 files, 7 obsolete files)

This is an offshoot from bug 39816, which made it possible for the Assignee, QA, Reporter, and CC list to see a bug even if they were not a member of the group the bug was restricted to. However, processmail currently will not send email to users who are not members of the groups the bug is in. Processmail needs to check for the role bits and act appropriately.
They should receive email if the settings say they should before OR after the change, so they know they're no longer going to receive it.
If they were able to receive mail before, but not after, should they just get a mail with the groupset change, and some text explaining the change? Or the full email that anyone else would get?
That's a good question. The easiest way to do it would be for them to recieve a "normal" message. This would also allow the person restricting the bug to explain why and have it be seen by the people who will no longer see it. Of course, that means the person doing the restriction will have to be careful not to put an sensitive/confidential information in the comment until after it's been restricted.
Bug #99599 filed on extra text letting them know they won't receive e-mail. Bug #99600 filed on making sure confidential information is not divulged in a comment.
Blocks: 99599, 99600
Taking - I have a patch for this bug and bug 95034 (and showdependancy*.cgi while I was at it). I came up with this last week, but I now have net access. This doesn't fix the dependant bugs, or the stuff mentioned in my comment. I get the feeling that they should wait for the group rewrite. This moves the security checks which were in ValidateBugID into a function SelectVisible in globals.pl, which takes an SQL query and modifies it to only list bugs which a person with the given userid (and optionally groupset; all of the places which use this have the groupset available, so theres no point in a separate query). I also got rid of the duplicate code for checking permissions when marking a bug as a dupe. I've done this using a LEFT JOIN for the cc stuff, and so theres actually only one query being run. I'd appreciate a couple of sets of eyes on that code. Some questions: 1. Is there some subtle reason that the previous code got all the people on the cc list, and then iterated through them, rather than letting the database do it using its index? If so, my approach won't work. 2. Will the SQL (ie the additional JOIN) slow down queries? I would hope that mysql would optimise the query so that it would only do the join if there was a bug which the user couldn't see unless they were on the cc list. Can someone who knows more about databases confirm that? buglist.cgi does an unconditional LEFT JOIN for the qa contacts, so I hope not. 3. Should SelectVisible send the query, rather than returning the modified string? 4. I had to turn off the DB taint mode in processmail. The problem is that DBI's taint mode not only checks for tainted query strings (which is good), but it also considers the output of the query to be tainted. Thats not so good. This didn't affect us before, because we just compared the result of the bitmask. My new code takes the userid and groupset from one query, and then tries to use them in another query, and so this hits the taint checks. I think that this mode is too restrictive - if we don't trust the results of the database, then bugzill has other problems :). I'll change that if people insist, though, by running the userid and groupset through a null regexp. 5. This is my first non-test perl program; perl coding stuff appreciated. I've tested this on my own installation, but I've probably missed something. Comments?
Assignee: jake → bbaetz
Attached patch patch, v0.9 (obsolete) — Splinter Review
Answers: 1. as far as i can tell (and i wrote the code), iterating over the cc query results instead of putting "AND cc.who = $whoid" in the query was a mistake on my part. The proposal you suggest is clearly the better (simpler, more performant) solution. 2. I looked at the query only briefly. JOINs can slow down query performance, but in this case all fields (bugs.bug_id, cc.bug_id, cc.who) are indexed, so query performance shouldn't be a problem. Assignee, you could compare the queries in mysql using the SQL query "EXPLAIN [rest of query]" when testing out your changes to see if mysql expects a significant difference (and post your results). Reviewers, test for performance if you think it may matter. Note that as currently constructed, this query results in multiple records for a single bug if the bug is not accessible to cc list users (because in that case there is no criterion limiting the LEFT JOIN to matching records and all records on the cc list match). 3. Not sure, this needs more thorough review. Probably it works either way. 4. This sounds bizarre. We should find out the rationale behind this DBI taint behavior. Nevertheless, we shouldn't run anything through a fake detainting process. Either don't use taint or really detaint it (which for IDs is as simple as making sure it is a positive integer). 5. Can't comment more until I review.
2) I tried that, but I didn't really understand the output well enough to work out which was faster. 3) It works each way, and the way I have it makes the buglist.cgi code easier to follow. Everyone else is now SendSQL(SelectVisible(...)); 4) "Its documented that way"
Status: NEW → ASSIGNED
*** Bug 100905 has been marked as a duplicate of this bug. ***
I searched the source for people checking the groupset, and I've now modified duplicates.cgi and xml.cgi to use the new method. New patch coming.
Attached patch patch, v1.0 (obsolete) — Splinter Review
Attachment #49952 - Attachment is obsolete: true
Comment on attachment 50437 [details] [diff] [review] patch, v1.0 >- SendSQL("SELECT component, bug_severity, op_sys, target_milestone, short_desc, groupset, bug_status, resolution" . >+ SendSQL("SELECT component, bug_severity, op_sys, target_milestone, short_desc, bug_status, resolution" . > " FROM bugs WHERE bug_id = $id"); >- my ($component, $severity, $op_sys, $milestone, $summary, $groupset, $bug_status, $resolution) = FetchSQLData(); >- next unless $groupset == 0; >+ my ($component, $severity, $op_sys, $milestone, $summary, $bug_status, $resolution) = FetchSQLData(); >+ next unless CanSeeBug($id, $::userid, $::usergroup); Hmmm... the number of queries being performed here has doubled. I wonder if this is going to be a problem. >+ # Assumes that 'bugs' is mentioned as a table name. You should >+ # also make sure that all the column names are qualified, especially >+ # bugs.bug_id! Is this really necessary? Some of the queries in this patch do not have qualified column names. >+ $query =~ s/ WHERE /$replace/i; Although unlikely, it is possible for a newline or tab to precede or succeed the WHERE keyword in an SQL statement. The expression should match \sWHERE\s. Also, you should replace "SELECT " with "SELECT DISTINCT " to prevent multiple records being returned when a bug has multiple users on the cc list. I checked all the queries being run through SelectVisible, and none of them look like they would have problems with DISTINCT. Tested buglist.cgi, process_bug.cgi, processmail, and showdependencytree.cgi. Bug.pm, duplicates.cgi, and showdependencygraph.cgi should also be tested, perhaps by the second-reviewer, but make the changes above and you have my r=myk for the first review.
> Hmmm... the number of queries being performed here has doubled. I wonder > if this is going to be a problem. I fixed this by changing the ordering a bit: SendSQL(SelectVisible("SELECT component, bug_severity, op_sys, target_milestone, short_desc, bug_status, resolution" . " FROM bugs WHERE bugs.bug_id = $id", $::userid, $::usergroup)); next unless MoreSQLData(); my ($component, $severity, $op_sys, $milestone, $summary, $bug_status, $resolution) = FetchSQLData(); This isn't really performance critical code - the query is run once per bug#, for a start. [comment on qualifying comment names] > Is this really necessary? Some of the queries in this patch do not have > qualified column names. I changed this to refer only to bug_id, now. There were a couple of places (including duplicates.cgi for the above change) which did "SELECT bug_id from bugs where ...", and had to be changed. Re distinct, didn't we decide on IRC that we should just make the left join conditional on bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = $userid? Or was that justdave? I couldn't find a place where it mattered though in the current code, since places like buglist.cgi uses GROUP BY. My new patch (coming up) does that, rather than distinct. I fixed the WHERE thing, too, and also changed the bug title attribute stuff to not do another query if the bug group is 0 (we want to get some info, so SelectVisible isn't appropriate here, but we may as well get the groupset while we're at it). I also fixed long_list.cgi to use SelectVisible rather than just checking the groupset - my original grep missed that, because it calls sendsql on a variable, not on the string directly.
Attached patch updated patch (obsolete) — Splinter Review
Attachment #50437 - Attachment is obsolete: true
> next unless MoreSQLData(); > my ($component, $severity, $op_sys, $milestone, $summary, $bug_status, >$resolution) = FetchSQLData(); > >This isn't really performance critical code - the query is run once per bug#, >for a start. There are problems with running MoreSQLData() without running FetchSQLData(). I'm not sure if the problems occur when there is no data. Still, this needs to be tested carefully, examined by someone who knows/remembers what the problems are, or reverted to double queries if there are no performance implications. >Re distinct, didn't we decide on IRC that we should just make the left join >conditional on bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = >$userid? Or was that justdave? I couldn't find a place where it mattered though >in the current code, since places like buglist.cgi uses GROUP BY. My new patch >(coming up) does that, rather than distinct. I think it is a bug that buglist.cgi uses GROUP BY rather than DISTINCT. The purpose of GROUP BY is to group bugs for aggregate calculations, while the purpose of DISTINCT is to eliminate duplicate results from the result set. buglist.cgi does not do aggregate calculations, so it should be using DISTINCT, and so should this code.
Comments: 1. Please move the "# More warning suppression silliness." comments to the top of the blocks in showdependency*.cgi so it's more obvious you're adding to it. 2. DBTaint needs to get turned back on. The fact that we need taint-out to get taint-in is perhaps unfortunate, but not tragic. Try adding the following code to globals.pl: sub detaint($$) { my ($taintedstr, $regexp) = @_; $taintedstr =~ /^($regexp)$/; return $1; } sub detaint_natural($) { return detaint(@_, "\\d+"); } then you can use detaint_natural for your detainting needs. 3. I would like to see duplicates.cgi updated so a query doesn't occur for each bug. As far as I can tell, it was done this way because of a security constraint that you've solved. If it isn't fixed with this patch, a supplemental bug should be filed, but I'm wary to do this, as these things tend to never get done. Myk, I'm not exactly sure where and why you want DISTINCT. The current patch at least seems to get no benefit from it. GROUP BY I suspect is an artifact of a time before the votes cache. Those are all the issues I see from my initial review. I have not yet tested this.
Priority: -- → P2
Target Milestone: --- → Bugzilla 2.16
I'll make these changes locally, but I need to talk to gerv about what the duplicates.cgi code is trying to do before I can fix that.
> There are problems with running MoreSQLData() without running FetchSQLData(). This is because MoreSQLData() retrives the query information and instead of returning it, stores it in @::fetchahead [this is so it can be returned on the next call to FetchSQLData()]. The solution is to add |undef @::fetchahead;| to the SendSQL() routine [when we call SendSQL(), we are throwing away our @::fetchahead value].
The case in question was a loop with: next unless MoreSQLData(); Thats safe, isn't it?
Comment on attachment 51298 [details] [diff] [review] updated patch >+ next unless MoreSQLData(); >+ my ($component, $severity, $op_sys, $milestone, $summary, $bug_status, $resolution) = FetchSQLData(); As used above it is safe... if there's no SQLData then it won't set @::fetchahead, if there is, then FetchSQLData() is called on the next line which will clear the @::fetchahead array. [It's pretty much the same concept as |while(MoreSQLData()) {|]
New patch coming up. 3. duplicates.cgi has to be the way it is so that sorting works. SELECT ... WHERE bug_id in ($list) doesn't return the results in the same order as the list
Oh, two more comments. a) ./processmail rescanall doesn't work because of the taint stuff. I've fixed this in the patch. Does b.m.o run processmail rescanall via cron every so often? If so, this may be the cause of the bugmails with multiple changes that was briefly mentioned on irc last night. b) Running a query through SelectVisible will detaint it, so we lose taint-in as well. Is there a way to mark a regexp as not changing the taint status of a string?
Attached patch this should be everything... (obsolete) — Splinter Review
Attachment #51298 - Attachment is obsolete: true
Where is this detainting? I can only see a search and replace, which as far as I know won't remove tainting. You changed the detaint_string comment to detaint_*. However the comment really only applied to detaint_string - you only use that when you're sure the string is OK, whereas the other detaint_* routines are for when you're not sure.
I thought that that would detaint a string, but rereading the camel book, thats not correct. Re the comment, running a number through detaint_number assumes that that number is safe. You don't want to take a bug# from the user, and then detaint it so that you can do "select * from bugs where bug_id = $foo". From that perspective I think that the comment is correct.
That's a possible interpretation, but the major use of tainting is to avoid passing special characters through, not to check its a valid bug number. I'm sure that's not the intent of the comment.
I'm undone the comment change locally. myk, matty - can I get official review to check this in then, please?
You still have the "detaint_*" comment in there. That only applies to "detaint_string", not all of the detaint_* routines. Also, Matty didn't use the detaint routines I quoted to him in IRC, which may be easier to use (or not, but I want other opinions first. :) The routines that made it into this patch depend on your assigning the return value to another variable. If you're going to do that, you might as well detaint by hand instead of using the routine. :) I was suggesting something like this: sub detaint_string(\$) { my ($stringref) = @_; $$stringref =~ /^(.*)$/; $$stringref = $1; } The \ in the prototype forces the routine to use a reference to the passed variable instead of only taking the value from it, eliminating the need for a return variable. Thus you can simply say this: detaint_string($mystring); and "$mystring" would get detainted. The same concept would apply to the other detaint routines. The existing routines in this patch require you to do this: $mystring = detaint_string($mystring); compare to: $mystring =~ /^(.*)$/; $mystring = $1; same amount of typing for both. Why bother with the routine? :)
As I mentioned on IRC, I didn't use those because you weren't sure what happened when you have a \$ in detaint_natura calling another \$ in detaint. Basically, I consider the difference between the two rather negligible. The benefit of detaint_natural is not that it is shorter, the benefit is that it is more readable and you don't have to remember exactly how you should detaint a natural. AFAICS detaint_string isn't using detaint anyway so the example isn't entirely relevant but I can see your viewpoint from it.
Perhaps I'm showing a little bit of ignorance, but why detaint_natural()? detaint_integer() seems more logical to me. Also, what happens if you do: my $var = "2; DROP DATABASE bugs"; detaint_natural($var); SendSQL("SELECT * FROM bugs WHERE bug_id = $var"); I don't see any functionality in detaint_natural() (or detain() for that matter) for what happens if the variable doesn't match the regular expression. The hope is it never becomes in issue, but the point in using taint mode is catching it if it does become an issue. Obviously this isn't the case with detaint_string as its only objective is to blindy detaint the string (which means you have to be really sure that it can't be altered by the user). (FWIW, I do like the idea of including \ in the prototype).
It's not detaint_integer because it doesn't detaint all integers - it detaints all naturals. AFAIK either solution sets the result to be undefined if it doesn't succeed. Checking this is up to the client, if it is considered a problem.
I believe if the regexp doesn't match, you wind up with an undefined variable, which in theory means you have to check the value after detainting it. On the other hand, if you're using the reference version, you could build it into the return value... sub detaint_natural(\$) { my ($ref) = @_; $$ref =~ /^(\d+)$/; $$ref = $1; return (defined($$ref)); } Then you can do something like : if (!detaint_natural($my_var)) { Error("We had a bad value here"); exit; }# Do something with $my_var here.
So if I've followed everything correctly so far, this is what we want: sub detaint(\$$) { my ($$taintedstr, $regexp) = @_; $$taintedstr =~ /^($regexp)$/; $$taintedstr = $1; return defined($$taintedstr); } sub detaint_natural(\$) { my ($ref) = @_; $$ref =~ /^(\d+)$/; $$ref = $1; return (defined($$ref)); } sub detaint_string(\$) { my ($ref) = @_; $$ref =~ /^(.*)$/; $$ref = $1; return (defined($$ref)); }
almost.... >sub detaint(\$$) { > my ($$taintedstr, $regexp) = @_; ^^^^^^^^^^^^ You'll want to drop a $ there. remember, what's coming in is the reference, you deref it everywhere else.
Comment on attachment 51426 [details] [diff] [review] this should be everything... >Index: CGI.pl > my $usergroupset = $::usergroupset || 0; > my $userid = $::userid || 0; ... >+ return if CanSeeBug($id, $userid, $usergroupset); You should delete the two local variables and just use $::userid and $::usergroupset here, since that's the way you do it in other places in the scripts and since the extra code isn't necessary. >Index: duplicates.cgi ... >+ " FROM bugs WHERE bugs.bug_id = $id", $::userid, $::usergroup)); $::usergroup -> $::usergroupset
Attachment #51426 - Flags: review-
OK, I'm going to attach a new patch now. This fixes up the detaint stuff to the approval of people on #mozwebtools. It also fixes the fact that mysql requires aliases for columns to be locked if the original column is. Looking for review...
Attached patch new patch (obsolete) — Splinter Review
Attachment #51426 - Attachment is obsolete: true
Comment on attachment 52635 [details] [diff] [review] new patch looks good, r=myk
Attachment #52635 - Flags: review+
Comment on attachment 52635 [details] [diff] [review] new patch working on reviewing this. visual inspection looks good, but I was going to install it on landfill first to play with it before calling it reviewed, and I haven'y had a chance to do that yet.
You have fixed bug #101560, so you may want to take that. Did you fix the problem about getting email if you were in the groupset beforehand? I can't see any solution. Not a problem if not. This looks OK. I honestly can't bring myself to review this whopper entirely again, but I skimmed it. I'll let Dave test this and then it's probably ready to go if he's happy with it.
Attached patch Updated to CVS. (obsolete) — Splinter Review
No, I didn't fix that problem. that one is more involved, and would probably bets be done by making processmail a separate sub - we generate all the data I'd need, but then don't pass it to processmail. I'm going to uplaod a new version of the diff to account for the tab changes - matty's is wrong: - # Make sure the bug exists in the database. - MoreSQLData() + FetchOneColumn(); || DisplayError("Bug #$id does not exist.") Note the extra ; Apart from this, I think that our patches are identical (its a bit hard to tell, since we're using different versions of diff, and so the actual patches are slightly different). A diff -w of my previous patch and the one I will attach only has timestamp/rcs version differences, so I presume that it still has r=myk.
Attached patch patch to current cvs (obsolete) — Splinter Review
Attachment #52635 - Attachment is obsolete: true
Attachment #53466 - Attachment is obsolete: true
Blocks: 101560
Blocks: 95024
Comment on attachment 53469 [details] [diff] [review] patch to current cvs visual inspection looks good, so replacing first review on here. will add second review after it's had playtime on landfill. It's now live at http://landfill.tequilarista.org/bz97469/
Attachment #53469 - Flags: review+
Comment on attachment 53469 [details] [diff] [review] patch to current cvs >-sub detaint_string { >- my ($str) = @_; >- $str =~ m/^(.*)$/s; >- $str = $1; >+sub trick_taint { >+ $_[0] =~ /^(.*)$/; >+ $_[0] = $1; >+ return (defined($_[0])); > } OK, this is wrong. trick_taint needs the /s modifier. Without it, the editparams.cgi stuff is wrong. From talks with justdave on irc, the m// isn't needed, so I'll leave it out. New patch coming up. I don't think that detaint_number needs the /s.
Attachment #53469 - Attachment is obsolete: true
Attachment #53469 - Flags: review-
Attached patch regexp fooSplinter Review
Comment on attachment 53512 [details] [diff] [review] regexp foo same as the last patch, with one bug fix, confirmed. r=myk
Attachment #53512 - Flags: review+
bug 97471 might affect this.... better now than later.
dave: possibly... I just want to get this patch in. Fixing that bug would mean both a schema change, and possibly moving the current checks we do on marking duplictes to some global location. (Although I don't think we need a warning, since the action is explicit, rather than implicit like the duplicate stuff is)
OK, this is really icky. I created a user that wasn't in any groups and CC'd them on one of two secure bugs. I then ran a query, and both bogs appeared in the buglist. When clicking on the second bug (the one I wasn't CC'd on) it told me I didn't have permission to see the bug. So I took it to the command line and narrowed it down to: mysql> select bugs.bug_id, selectVisible_cc.who from bugs LEFT JOIN cc selectVisible_cc ON bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 5 WHERE selectVisible_cc.who = 5; +--------+------+ | bug_id | who | +--------+------+ | 1 | NULL | | 2 | NULL | ... | 25 | NULL | | 26 | 5 | | 27 | NULL | ... | 88 | NULL | +--------+------+ 88 rows in set (0.01 sec) mysql> select bugs.bug_id, selectVisible_cc.who from bugs LEFT JOIN cc selectVisible_cc ON bugs.bug_id = selectVisible_cc.bug_id WHERE selectVisible_cc.who = 5; +--------+------+ | bug_id | who | +--------+------+ | 26 | 5 | +--------+------+ 1 row in set (0.01 sec) Apparanetly, in versions older than 3.23.31 of MySQL having the same condition in a LEFT JOIN as a WHERE just plain doesn't work. I don't know if this has always been a bug in MySQL or if it was introduced at some point, but I know it happens on my version which is 3.23.33 http://www.mysql.com/documentation/mysql/bychapter/manual_News.html#News-3.23.31
...except that that release note is for .31, and jake is using .33 - so thats not it. It works for me, and I'm using .36. I can, however, confirm this on landfill. So, suggestions?
OK, I can, sort of, reproduce this at home: This is really really really wierd, and I need SQL help: I have two bugs in a group, but 1 and bug 4. user 1 (the admin) is the reporter of both, but user 2 (not in this group) is only cc'd on bug 1. mysql> SELECT bugs.bug_id FROM bugs, profiles map_assigned_to, profiles map_reporter LEFT JOIN cc selectVisible_cc ON bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2 WHERE ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0') GROUP BY bugs.bug_id; +--------+ | bug_id | +--------+ | 1 | | 4 | +--------+ 2 rows in set (0.01 sec) This is wrong. Removing _either_ the profiles as map_assigned_to gives: mysql> SELECT bugs.bug_id FROM bugs, profiles map_reporter LEFT JOIN cc selectVisible_cc ON bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2 WHERE ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0') GROUP BY bugs.bug_id; +--------+ | bug_id | +--------+ | 1 | +--------+ 1 row in set (0.00 sec) Alternately, removing the bit which allows the reporter to see the bug also works: mysql> SELECT bugs.bug_id FROM bugs, profiles map_reporter LEFT JOIN cc selectVisible_cc ON bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2 WHERE ((bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0') GROUP BY bugs.bug_id; +--------+ | bug_id | +--------+ | 1 | +--------+ 1 row in set (0.00 sec) (Removing the cc bit gives 0 results, as expected) So, I thought that this may be something to do with the order of the join, but: SELECT bugs.bug_id FROM bugs LEFT JOIN cc selectVisible_cc ON bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2, profiles map_assigned_to, profiles map_reporter WHERE ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0') GROUP BY bugs.bug_id; +--------+ | bug_id | +--------+ | 1 | | 4 | +--------+ 2 rows in set (0.00 sec) So, at this point I have absolutly no idea what is wrong. Oh, heres something even better. If I restrict this with bugs.bug_id=4, I get an empty result! (which is obviously why actually viewing the bug fails): mysql> SELECT bugs.bug_id FROM bugs LEFT JOIN cc selectVisible_cc ON (bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2), profiles map_assigned_to, profiles map_reporter WHERE ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.bug_id=4) GROUP BY bugs.bug_id; Empty set (0.00 sec)
OK, can someone explain this to me, please: mysql> SELECT bugs.bug_id, selectVisible_cc.* FROM bugs LEFT JOIN cc selectVisible_cc ON (bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2), profiles map_assigned_to, profiles map_reporter WHERE ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0'); +--------+--------+------+ | bug_id | bug_id | who | +--------+--------+------+ | 1 | 1 | 2 | | 4 | NULL | NULL | | 1 | 1 | 2 | | 4 | NULL | NULL | | 1 | 1 | 2 | | 4 | NULL | NULL | | 1 | 1 | 2 | | 4 | NULL | NULL | +--------+--------+------+ 8 rows in set (0.00 sec) mysql> SELECT bugs.bug_id, selectVisible_cc.* FROM bugs LEFT JOIN cc selectVisible_cc ON (bugs.bug_id = selectVisible_cc.bug_id AND selectVisible_cc.who = 2), profiles map_assigned_to WHERE ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0'); +--------+--------+------+ | bug_id | bug_id | who | +--------+--------+------+ | 1 | 1 | 2 | | 1 | 1 | 2 | +--------+--------+------+ 2 rows in set (0.00 sec) I'm not even using those tables!
Heres the EXPLAIN output, in case it matters: mysql> EXPLAIN SELECT bugs.bug_id, selectVisible_cc.* FROM bugs LEFT JOIN cc -> selectVisible_cc ON (bugs.bug_id = selectVisible_cc.bug_id AND -> selectVisible_cc.who = 2), profiles map_assigned_to, profiles map_reporter WHERE -> ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible -> = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0'); +------------------+--------+---------------+---------+---------+-------------------+------+-------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +------------------+--------+---------------+---------+---------+-------------------+------+-------------------------+ | bugs | ALL | reporter | NULL | NULL | NULL | 5 | where used | | map_assigned_to | index | NULL | PRIMARY | 3 | NULL | 2 | Using index | | map_reporter | index | NULL | PRIMARY | 3 | NULL | 2 | Using index | | selectVisible_cc | eq_ref | bug_id,who | bug_id | 6 | bugs.bug_id,const | 1 | where used; Using index | +------------------+--------+---------------+---------+---------+-------------------+------+-------------------------+ 4 rows in set (0.00 sec) mysql> EXPLAIN SELECT bugs.bug_id, selectVisible_cc.* FROM bugs LEFT JOIN cc -> selectVisible_cc ON (bugs.bug_id = selectVisible_cc.bug_id AND -> selectVisible_cc.who = 2), profiles map_assigned_to WHERE -> ((bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible -> = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0'); +------------------+-------+---------------+---------+---------+------+------+-------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +------------------+-------+---------------+---------+---------+------+------+-------------+ | bugs | ALL | reporter | NULL | NULL | NULL | 5 | where used | | selectVisible_cc | range | bug_id,who | who | 3 | NULL | 2 | where used | | map_assigned_to | index | NULL | PRIMARY | 3 | NULL | 2 | Using index | +------------------+-------+---------------+---------+---------+------+------+-------------+ 3 rows in set (0.01 sec) There are differences, but I'm not sure why, or even if they matter.
Previous comment looks like another bug in MySQL, similiar to the one from "Additional Comments From Jake 2001-10-16 08:25". I seem to remember problems with just "throwing in" extra tables on MySQL queries, but I haven't used MySQL at work for over 1.5 years (not counting my local Bugzilla installation which I don't mess with too much). It would be interesting to see what PostgreSQL or another database would do with a similar set of tables and the same SQL.
Okay, here's a better explanation of what happens when you simply "throw in" extra tables in the FROM clause of a query: basically, each extra table you throw in will multiply the results WITHOUT the extra table by the number of rows in the extra table. If you think about it, this makes sense because with SQL, you are including every row in every table listed in the FROM clause, then narrowing down the results in the WHERE clause by creating constraints either between tables or on certain pieces of data in rows from a table. When you add a whole other table to the WHERE clause and don't put any constraints on it, all of its rows match every other row that the rest of the query would have contained, thus multiplying the original results by the number of rows in the new table (which has no constraints). My *GUESS* is that MySQL (and/or its query optimizer) gets really confused when you add unconstrained tables to the FROM clause, and you get results you should not have gotten if all of the tables were constrained properly. Personally, I would call this a bug in MySQL and/or its query optimizer since PostgreSQL (on a different database, but using a similar query with a LEFT JOIN and an extra table) only multiplies the original dataset by the number of rows in the new table; it does NOT fail meet the conditions of the original query without the extra table like MySQL does. Hope that helps. :^)
Can't...stop...commenting! SPOCK! FWIW, I'm using MySQL version 3.23.36. Also, I think the problem with the last set of SQL queries that you gave is that you are duplicating a constraint (selectVisible_cc.who = 2) in the ON clause of the LEFT JOIN and in the boolean expression in the WHERE clause. I'd guess that by doing this you will get unpredictable results, and I don't think the SQL makes sense like that anyway. Try this SQL (add a DISTINCT to remove duplicates): SELECT bugs.bug_id ,selectVisible_cc.* FROM bugs LEFT JOIN cc selectVisible_cc ON (bugs.bug_id = selectVisible_cc.bug_id) ,profiles map_assigned_to WHERE ( (bugs.reporter_accessible = 1 AND bugs.reporter = 2) OR (bugs.cclist_accessible = 1 AND selectVisible_cc.who = 2)) AND (bugs.groupset != '0'); Note that you will still get twice the number of results expected because of the "profiles map_assigned_to" table included in the FROM clause (which has two rows, apparently, in your database), at least until you add the DISTINCT keyword.
ddk: right, I'll double the number of responses, but the constraints should still only match the same rows as it otherwise would. I decided not to use DISTINCT because I wanted my sql wrapper to give the exact same answers as without wrapping. See comments in this bug. In any event, I'm goint to come up with a test script, and then file a mysql bug report. However, why do we use these bogus tables in the first place? bug_list.cgi should just add them in as needed. If we do that, then the bug goes away, and I can just add a big warning in the comments. myk? justdave?
Attached file mysql bug?
OK, I've attached a test script. This isn't quite the same bug, since we don't need the extra table. However, removing any of the "not null" constraints, or the index, gives different results, so I'm reasonably certain that this is a bug. Can someone who knows more SQL confirm this?
The mysql people have confirmed that this is a bug. Adding a not null qualifier helps on landfill, and on my version of mysql, so I'm going to go with that. I'll upload an updated patch later today.
Comment on attachment 54138 [details] [diff] [review] work arround mysql bug It works. r=myk.
Attachment #54138 - Flags: review+
Comment on attachment 54138 [details] [diff] [review] work arround mysql bug r= justdave I've had this on bugzilla.syndicomm.com for the last two days, and once the workaround for ANOTHER mysql bug from bug 99519 is put in, it works for me, too.
Attachment #54138 - Flags: review+
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
*** Bug 108979 has been marked as a duplicate of this bug. ***
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: