With the fix for bug 251567, the fulltext index is no longer used in fulltext searches, resulting in significant performance degradation to the point where queries time out on b.m.o. See for example the following query: https://bugzilla.mozilla.org/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__open__&product=&content=addons&debug=1 This query times out on b.m.o, but if I modify the query string, removing the clause added in the fix for bug 251567, the query takes ~0.05 seconds to run. This is too much degradation for the benefits the bug fix provides. We should back it out.
Looking at bug 251567, I'd imagine that the slow SQL is: (lower(bugs.short_desc) regexp '(^|[^a-z0-9])silly($|[^a-z0-9])' AND lower(bugs.short_desc) regexp '(^|[^a-z0-9])bug($|[^a-z0-9])')) Note that using lower() bypasses the index, and using the regex when we should be using LIKE is also silly.
> Note that using lower() bypasses the index, and using the regex when we > should be using LIKE is also silly. Perhaps, but neither change speeds up the query, since the slowness is due to MySQL not using the fulltext index to search the longdescs table. Instead, it greps every record for occurrences of the term, which is expensive.
Created attachment 179772 [details] [diff] [review] patch v1: reverts fix for bug 251567 Dave, here's a patch that undoes the damage. Bug 251567 is legitimate, but its cure is worse than the disease. We need to find some other way of fixing the problem it describes which doesn't cause common queries to time out because we aren't using the fulltext index that we implemented for the express purpose of handling these queries.
Assignee: query-and-buglist → myk
Status: NEW → ASSIGNED
What if the summary text was stored in the longdescs table, with the bug_when set to NULL, and queries involving the longdescs table modified to say "AND bug_when NOT NULL" where this row would be explicitly not wanted? The default when searching on "a comment" could then be (in effect) "the summary or a comment". I can think of few circumstances where someone would explicitly need to search on comments without matching the summary.
bah. I screwed with this for the last hour or so, and can't find any way to come up with a query that actually uses both indexes short of running the entire query once for the longdescs and once for the short_desc and using a UNION between them. And that would suck. Putting a mirror of the short_desc into the longdescs table sounds like the most feasible option if we really need to retain accuracy, but that really sounds like an "oy, what a hack!" type solution. Joel, any bright ideas?
(In reply to comment #5) > Putting a mirror of the short_desc into the longdescs table sounds [...] > like an "oy, what a hack!" type solution. Yes, but perhaps longer-term it wouldn't be a mirror but the only copy, with an id linking the bugs and longdescs tables... ... or would that make it impossible to optimise summary-only searches?
The patch on comment 3 has now been applied to b.m.o. At first glance, it doesn't seem to have helped any.
Whiteboard: [applied to b.m.o]
> At first glance, it doesn't seem to have helped any. The query in the description of this bug currently returns immediately instead of timing out, which is a pretty significant improvement.
Ping? Since there's no perfect solution at the moment, could we get my patch in or decide that we're not going to take it? I'm ok either way, but it'd be good to make a definitive decision about it, at least for now (we could always reopen the decision if more information comes to light in the future).
Comment on attachment 179772 [details] [diff] [review] patch v1: reverts fix for bug 251567 No sense leaving these things around forever.
This patch is included as part of the patch from bug 287170. *** This bug has been marked as a duplicate of 287170 ***
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → DUPLICATE
Whiteboard: [applied to b.m.o]
You need to log in before you can comment on or make changes to this bug.