Open Bug 824372 Opened 12 years ago Updated 12 years ago

"Comment contains all/any of the words" should use the bugs_fulltext table and be the default operator

Categories

(Bugzilla :: Query/Bug List, enhancement)

enhancement
Not set
normal

Tracking

()

People

(Reporter: LpSolit, Unassigned)

Details

Currently, the default operator for the comment field is "contains all of the strings", which looks for a substring in the longdescs table. That is terribly slow because there is no index on the thetext column. Instead, we should display "contains all of the words" by default and make it use the bugs_fulltext table to benefit from the fulltext index.

Currently, this operator also looks at the longdescs table, with this ugly regexp:

    thetext REGEXP '(^|[^[:alnum:]])foo($|[^[:alnum:]])'

From what I can read at http://dev.mysql.com/doc/refman/5.5/en/regexp.html#operator_regexp, REGEXP is not multi-byte safe anyway: "The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets."

The disadvantage of fulltext searches is that they do not do substring matches, so if you look for "bug" but a comment contains "bugs", it won't find it.
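To make the substring versus whole-word difference concrete, here is a minimal Python sketch. It is not Bugzilla code: the function names are hypothetical, and the POSIX class [[:alnum:]] from the regexp above is approximated with an ASCII-only character class, which is an assumption.

```python
import re

def contains_all_strings(text, terms):
    """Substring semantics, like "contains all of the strings"."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)

def contains_all_words(text, terms):
    """Whole-word semantics, mirroring the REGEXP pattern quoted above.

    [0-9A-Za-z] stands in for [[:alnum:]] (ASCII-only approximation).
    """
    return all(
        re.search(
            r"(^|[^0-9A-Za-z])" + re.escape(term) + r"($|[^0-9A-Za-z])",
            text,
            re.IGNORECASE,
        )
        is not None
        for term in terms
    )

comment = "There are two bugs in the parser"
print(contains_all_strings(comment, ["bug"]))  # substring match finds "bugs"
print(contains_all_words(comment, ["bug"]))    # whole-word match does not
```

Any whole-word approach, whether this regexp or a fulltext index, treats "bug" and "bugs" as different terms, which is the trade-off described above.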
I agree that our method for searching comments needs to be improved. However, I have concerns about the restrictions and differences that using fulltext would impose, in particular stop words and ft_min_word_len. http://dev.mysql.com/doc/refman/5.1/en/fulltext-restrictions.html http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html http://stackoverflow.com/q/609935/953
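As a rough illustration of that concern, the sketch below simulates which query terms would survive a minimum word length and a stop-word list. It is only a model, not MySQL's tokenizer: the function name is hypothetical, the stop-word set is a small illustrative sample rather than MySQL's actual list, and the minimum length of 4 is assumed to match the MyISAM default for ft_min_word_len.

```python
import re

# Illustrative sample only; MySQL ships its own, much longer stop-word list.
SAMPLE_STOPWORDS = {"the", "about", "in", "and", "with"}

def searchable_terms(query, min_len=4, stopwords=SAMPLE_STOPWORDS):
    """Return the query words a fulltext index would plausibly still match.

    Words shorter than min_len or present in stopwords are dropped,
    so they can never contribute to a match.
    """
    words = re.findall(r"[0-9A-Za-z_]+", query.lower())
    return [w for w in words if len(w) >= min_len and w not in stopwords]

print(searchable_terms("crash in the css parser"))
print(searchable_terms("css bug"))
```

Under these assumptions, a query such as "css bug" is left with no searchable terms at all, while the current substring search would still find matching comments.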