Closed Bug 96101 Opened 23 years ago Closed 23 years ago

Bugzilla query pondering time is best measured with a calendar

Categories

(Bugzilla :: Query/Bug List, defect, P1)

Other
Other
defect

Tracking

()

RESOLVED DUPLICATE of bug 57350

People

(Reporter: SkewerMZ, Assigned: endico)

References

()

Details

(Keywords: perf)

Attachments

(1 file)

Lately, this query takes very long to load. As I'm writing this Bugzilla hasn't returned the results for this query after five minutes from the time the "Please stand by..." text was displayed. It could be an issue with the search engine, but it seems more like the server is overdue for an upgrade. My connection speed is 56kbps, but this is irrelevant as I have received NO data from the server since the query (no modem activity at all, in fact).
The problem is a query for "all bugs I'm involved in that have changed the last day". This is a very common query. I don't think it's good for the server if such queries time out and thus a lot of people submit the same query several times until they finally get a result. This query was quite usable some time ago. While other queries are faster now, a query for a email with more than one checkbox checked is quite unusable at the moment. Compare this queries: 1) Both "Assigned to" and "Added comment" checked: http://bugzilla.mozilla.org/buglist.cgi?email1=c%40c07.de&emailtype1=exact&emailassigned_to1=1&emaillongdesc1=1&changedin=1 2) Only "Assigned to" checked: http://bugzilla.mozilla.org/buglist.cgi?email1=c%40c07.de&emailtype1=exact&emailassigned_to1=1&changedin=1 3) Only "Added comment" checked: http://bugzilla.mozilla.org/buglist.cgi?email1=c%40c07.de&emailtype1=exact&emaillongdesc1=1&changedin=1 1 takes *much* more time than the sum of 2 and 3.
This is a bugzilla performance issue. "Clarence"'s comment has a good testcase. Maybe someone can have a look how mysql executes these queries, and where things go wrong?
Assignee: scbrown → endico
Component: Server Operations → Query/Bug List
Keywords: perf
OS: Windows 98 → other
Product: mozilla.org → Bugzilla
QA Contact: endico → matty
Hardware: PC → Other
Version: other → unspecified
From the test cases above, the following SQL is generated: Test case #1: (the slow one) SELECT bugs.bug_id, bugs.groupset, unix_timestamp(bugs.creation_ts), unix_timestamp(bugs.delta_ts), substring(bugs.bug_severity, 1, 3), substring(bugs.priority, 1, 3), map_assigned_to.login_name, substring(bugs.bug_status,1,4), substring(bugs.resolution,1,4), substring(bugs.version, 1, 5), bugs.target_milestone, bugs.short_desc FROM bugs, profiles map_assigned_to, profiles map_reporter LEFT JOIN profiles map_qa_contact ON bugs.qa_contact = map_qa_contact.userid, longdescs longdescs_, profiles longdescnames_ WHERE bugs.assigned_to = map_assigned_to.userid AND bugs.reporter = map_reporter.userid AND bugs.groupset & 192 = bugs.groupset AND longdescs_.bug_id = bugs.bug_id AND longdescs_.who = longdescnames_.userid AND bugs.assigned_to = map_assigned_to.userid AND (map_assigned_to.login_name = 'c@c07.de' OR longdescnames_.login_name = 'c@c07.de') AND ((to_days(now()) - to_days(bugs.delta_ts)) < '2') GROUP BY bugs.bug_id ORDER BY bugs.bug_id Test case #2: SELECT bugs.bug_id, bugs.groupset, unix_timestamp(bugs.creation_ts), unix_timestamp(bugs.delta_ts), substring(bugs.bug_severity, 1, 3), substring(bugs.priority, 1, 3), map_assigned_to.login_name, substring(bugs.bug_status,1,4), substring(bugs.resolution,1,4), substring(bugs.version, 1, 5), bugs.target_milestone, bugs.short_desc FROM bugs, profiles map_assigned_to, profiles map_reporter LEFT JOIN profiles map_qa_contact ON bugs.qa_contact = map_qa_contact.userid WHERE bugs.assigned_to = map_assigned_to.userid AND bugs.reporter = map_reporter.userid AND bugs.groupset & 192 = bugs.groupset AND bugs.assigned_to = map_assigned_to.userid AND (map_assigned_to.login_name = 'c@c07.de') AND ((to_days(now()) - to_days(bugs.delta_ts)) < '2') GROUP BY bugs.bug_id ORDER BY bugs.bug_id Test case #3: SELECT bugs.bug_id, bugs.groupset, unix_timestamp(bugs.creation_ts), unix_timestamp(bugs.delta_ts), substring(bugs.bug_severity, 1, 3), substring(bugs.priority, 1, 3), map_assigned_to.login_name, substring(bugs.bug_status,1,4), substring(bugs.resolution,1,4), substring(bugs.version, 1, 5), bugs.target_milestone, bugs.short_desc FROM bugs, profiles map_assigned_to, profiles map_reporter LEFT JOIN profiles map_qa_contact ON bugs.qa_contact = map_qa_contact.userid, longdescs longdescs_, profiles longdescnames_ WHERE bugs.assigned_to = map_assigned_to.userid AND bugs.reporter = map_reporter.userid AND bugs.groupset & 192 = bugs.groupset AND longdescs_.bug_id = bugs.bug_id AND longdescs_.who = longdescnames_.userid AND (longdescnames_.login_name = 'c@c07.de') AND ((to_days(now()) - to_days(bugs.delta_ts)) < '2') GROUP BY bugs.bug_id ORDER BY bugs.bug_id
Priority: -- → P1
Target Milestone: --- → Bugzilla 2.16
*** Bug 105624 has been marked as a duplicate of this bug. ***
The above attachment shows the results on SQL EXPLAIN. You can see with the first three there is a definite problem on the first one, with the longdescs_ table. However, the fourth one is 'assigned to' and 'reporter' on, which has the same problem, except with the bugs table. This all seems to be caused by the OR condition. Removing the OR and just having one condition makes the problem disappear, even though you would be obviously pulling in too many records. We might be hitting a MySQL bug here, or we might be confounding optimisations and there might be little we can do. I'm not too sure. The SQL all looks OK, except for the bugs/assigned_to join which happens twice in the first query, but which should be and appears to be harmless (the problem persisted without it). dkl, does PostgreSQL have an equivalent to the MySQL EXPLAIN command? If so, does it show a similar problem? myk, you're the EXPLAIN wizard, does this mean anything to you?
Other thing that confounds the mechanism and results in NULLs is substring lookups on just one field. These use INSTR(LOWER(X), field), both functions cause the problem. However the speed there isn't so slow.
Do we know the query always had this problem? Perhaps old versions of MySQL did this better, or its just because we have a lot more bugs.
*** Bug 112892 has been marked as a duplicate of this bug. ***
I simplified the query, but it didn't help: explain select bugs.bug_id FROM bugs INNER JOIN longdescs USING(bug_id) WHERE (bugs.assigned_to=86 OR longdescs.who=86); still looks at every entry in the longdescs table. Adding an index(bug_id, who) on longdescs made it use the index, but it still didn't help. myk, can you run the above query, then twice more with each part of the OR on bmo?
dkl: could you try the equivalent on the bugzilla database in pgsql? (BTW, the 86 is my userid on landfill's installation - myk, can you use asa's id instead?)
Actually, see: http://www.mysql.com/documentation/mysql/bychapter/manual_Tutorial.html#Searching_on_two_keys Where the docs say that they "haven't had time" to do optimisations on this. They recommend either using union queries (which requries 4.0.0), or temporary tables (which probably aren't faster unless you're very careful to get the order correct, which would be difficult for autogenerated queries). Unless someone has a smart idea on how to speed this up somehow, I think we should FUTURE/CANTFIX this.
Here is the explain output of the sql query in PgSQL although I am not exactly sure how to read it yet: bugs=# explain select bugs.bug_id FROM bugs INNER JOIN longdescs USING(bug_id) WHERE (bugs.assigned_to=86 OR longdescs.who=86); NOTICE: QUERY PLAN: Hash Join (cost=14231.92..22504.11 rows=22 width=16) -> Seq Scan on bugs (cost=0.00..4689.02 rows=41102 width=8) -> Hash (cost=14219.14..14219.14 rows=5114 width=8) -> Seq Scan on longdescs (cost=0.00..14219.14 rows=5114 width=8) EXPLAIN
How many bugs, and longdesc entries are there? You have run vaccuum analyze on the tables, right? I _think_ that what its saying is that its going to do a sequential scan on the longdescs, then hash the results to merge with the bugs output It doesn't look like its using any indices at all. What if you try each constraint separately? Maybe postgresql has the same limitation mysql does.
After a vaccuum: <dkl> bugs=# explain select bugs.bug_id FROM bugs INNER JOIN longdescs USING(bug_id) WHERE (bugs.assigned_to=86 OR longdescs.who=86); <dkl> NOTICE: QUERY PLAN: <dkl> Hash Join (cost=12568.41..17157.56 rows=13 width=16) <dkl> -> Seq Scan on bugs (cost=0.00..4325.06 rows=4706 width=8) <dkl> -> Hash (cost=12560.93..12560.93 rows=2990 width=8) <dkl> -> Index Scan using longdescs_index on longdescs (cost=0.00..12560.93 rows=2990 width=8) and dkl said that the spearate queres combined is about the same time as the single query. myk - can you run timings for us, please? (use a user with some bugs assigned, though, rather than the generic #86)
On a test installation running mysql with b.m.o. data from several months ago loaded into it the simple query using my ID instead of 86 returns 2992 records in 8 minutes, 41.55 seconds.
Each query separately takes less than a second to run.
So we have 2 secs vs 521.55, or just under 26,000% difference, assuming I did the maths right. Anyone got any ideas on how we can work arround this?
I was thinking that we could use a resultset abstraction in query.cgi. The idea of this is that the results don't necessarily come as-is from the database, but can have some intermediate processing. This would allow union and subselect emulation and could be used to send this query through as two different database queries. This should be release noted for 2.16 if we can't fix it.
>I was thinking that we could use a resultset abstraction in query.cgi. Bingo, and the templatization patch for buglist.cgi does this (bug 103778).
OK, I have no idea about SQL. But the problem (to the best of my understanding) is a result of OR-connectives (like: my address appears as reporter OR comment OR CC = all bugs I am involved, a very important query I'd like to run). It seems to be pretty easy to run three separate queries. So the idea would be to do just that, store the three results and then merge them. Another idea (which might not work due to data structure which I don't know) would be to allow a query for an address to appear anywhere in the bug. Finally, if it will not be possible to solve this bug here, the checkboxes should be replace by radio buttons, so people don't even expect it to work. pi
Depends on: 103778
Yeah, but thats Hard. I tried changing the OR to NOT AND using DeMorgan's rule, but thats the same result from EXPLAIN. I wonder if thats what mysql does internally....
Whiteboard: [blocker will fix]
bbaetz says in IRC that this won't be fixed by bug 103778, but Myk's comment 20 seems to imply that it will be... Myk? If this indeed won't be fixed by that patch, please remove the will fix comment from the status whiteboard.
Whiteboard: [blocker will fix]
I played with this a bit, but ->2.18, and removing bogus dependancy on the blocker. However, I suspect that this bug will only be fixed by a mysql upgrade or changing databases. Doing the OR ourself and trying to merge results is just impossible - thats the database's job, and no real query only consists of OR terms, so we can't even special case that. If it wasn't for the bit in the manual admitting that OR is unoptimised, I'd file a mysql bug report.
No longer depends on: 103778
Target Milestone: Bugzilla 2.16 → Bugzilla 2.18
I understand the issue a bit better now. Firstly, although OR is unoptimised for all situations here, if you just choose assignee OR reporter, it would only use the bugs table, and you dont need to pull in an auxiliary table like CCs and comments. Sure enough, this operation is not too slow. Secondly, CCs OR comments are probably even worse as thats two auxilary tables that aren't joined against each other. I'm not sure whether this would cause problems even after OR is optimised.
Well, whilst cc and comments aren't directly joined against each other, that shouldn't matter. If the index was perfect, then the db should only examine the matching rows, and since thats all you want anyway..... If we're selecting bugs LEFT JOIN longdescs, then you would get one row for each comment in the entire db if no index was used, and even with the index if I have commented multiple times in the same bug, then I will get multiple rows, one for each comment, and then the ORDER BY clause we use removes the duplicates. You can't avoid that, although I guess it is possible for the db to notice that we only want unique values, and just skip over them. Not too sure about that one though. The real problem is that no index is used.
I'm still seeing this problem occur differently at different times of day (possibly at peak hours?). Right now I can't get my usual updated bug query (not the one in the URL) to load, but usually it works. This further leads me to believe there is a server overload issue involved. In that case it could only be solved by making Bugzilla more efficient or upgrading the servers.
Asa had to touch all bugs which has a TM or mozilla1.1 or mozilla1.2, when he renamed them to mozilla.11alpha and mozilla1.2alpha. Its possible that you were just hit by the mysql table lock.
This is becoming a serious problem. For the past few days I haven't even been able to run the "My bugs" query! I should note that the mozilla.org server problems are getting worse, bug pages are taking longer to load lately.
It takes about 20 seconds, but the "My Bugs" link does return 100 bugs for me.
I've had times where it takes considerably longer, and rarely does it beat the reasonable timeout on my proxy.
Can you paste the query you are using?
Yeah, that has issues - mysql does really poorly with searching for both longdescs OR ccs. And I don't think we have a way to rewrite this query to work easily.
Earlier today I was having trouble getting Quicksearch to work. Is this going to be assigned soon? As it is I have to do my updated bugs query part-by-part.
Severity: major → critical
Earlier I submitted a bookmarked query (I call it "all my bugs" - it returns dupes and unconfirmeds and others) and received no results after waiting more than 20 minutes. I exited that tab and started the same query again several minutes later. That was about an hour ago. Still no results. <http://bugzilla.mozilla.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=VERIFIED&bug_status=CLOSED&email1=mrmazda%40atlantic.net&emailtype1=substring&emailreporter1=1&email2=&emai>
Bugzilla has been pondering my query continuously for over two hours now.
Your sqp query will be killed off after 6 minutes. I'm not sure why you don't get an error message in that case. This query returns almost immediate though for me. Could be the time of day. Also, exact matches on emails are faster than substring searches, since you can use an index for exact matches.
Bugzilla was apparently broken earlier today. I asked one platform owner and he said it was down due to database issues. Works like it used to now. Must have been I encountered a reason why it had to go down.
ATM Bugzilla is still misbehaving for me. Mrmazda's query works for me, but mine are just as bad as ever.
mrmazda's query doesn't need to match against the "bug status" field, since he wants all bugs regardless of status, and can use an exact address search. With those two optimizations, the query runs much faster: http://bugzilla.mozilla.org/buglist.cgi?email1=mrmazda%40atlantic.net&emailtype1=exact&emailreporter1=1 The query in the URL field of this bug has the same problem as in bug 57350, recently reopened, so I'm marking this one a duplicate of that one. *** This bug has been marked as a duplicate of 57350 ***
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → DUPLICATE
clearing target milestone on invalid/duplicate/worksforme/wontfix bugs so they'll show up as untriaged if they get reopened.
Target Milestone: Bugzilla 2.18 → ---
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: