Closed Bug 287170 Opened 19 years ago Closed 18 years ago

Need to be able to do fulltext searches on bugs.short_desc even with most tables using InnoDB

Categories

(Bugzilla :: Creating/Changing Bugs, enhancement)

2.19.2
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 3.0

People

(Reporter: mkanat, Assigned: mkanat)

References

Details

Attachments

(1 file, 1 obsolete file)

We need to have a MyISAM table for fulltext searches, but InnoDB tables for
transactions.

bugs.short_desc is one of our only fulltext fields, so it needs its own table.

I was thinking that it would become bugs_fulltext.summary. It could also stay
short_desc, because that might make things easier.
Bug 96431 is a better way to handle this problem. Then we can remove the
fulltext index.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → WONTFIX
OK, maybe we do want to do it this way, because we still do need the fulltext
index for the specific search.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: Move bugs.short_desc to its own table → Need to be able to do fulltext searches on bugs.short_desc even with most tables using InnoDB
FWIW, there is a text-search plug-in for MySQL. Although this would be tough to integrate in a database abstract manner.

http://sphinxsearch.com/
Okay, I have an idea of how to do this.

Let's modify fulltext search so that it doesn't use a fulltext index on short_desc. Instead, it does this:

1) It splits the input list by spaces and does a short_desc LIKE '%word%' search
   each word.

2) For every word that matches in the summary, we add (some number) to the
   relevance value. I'm thinking of making the "(some number)" 1/$total_words.
   That is, if you typed in three words, two words matching in the summary 
   would add .66 to the relevance. That makes a lot of sense with how we
   currently calculate "relevance" for longdescs.

Attached patch v1 (obsolete) — Splinter Review
Okay, here it is. I went from doing a fulltext match to doing a LIKE match on short_desc.

I was sorely tempted to change the way that fulltext actually *works* during this process, but in the interest of "the patch of least change" I haven't done it. This should return essentially the same results that fulltext search returned before, with perhaps one or two additions (because now we're doing a real substring search in the summary, not just a word search).

This should also fix bug 347721.
Assignee: create-and-change → mkanat
Status: REOPENED → ASSIGNED
Attachment #232707 - Flags: review?(myk)
Target Milestone: --- → Bugzilla 3.0
Blocks: 347864
Comment on attachment 232707 [details] [diff] [review]
v1

>              # Create search terms to add to the SELECT and WHERE clauses.
>              # $term1 searches comments.
>-             # $term2 searches summaries, which contributes to the relevance
>-             # ranking in SELECT but doesn't limit which bugs get retrieved.
>              my $term1 = $dbh->sql_fulltext_search("${table}.thetext", $v);
>-             my $term2 = $dbh->sql_fulltext_search("bugs.short_desc", $v);
>+             # $term2 searches summaries.
>+             my $term2 = _split_words_into_like("bugs.short_desc", $v);
>+             # For the WHERE clause
>+             my $where_term2 = _split_words_into_like('bugs.short_desc', $v, 1);

Nit: it might be more readable to call the last two variables $term2_select and $term2_where.

Nit: use parentheses consistently (i.e. double-parentheses around "bugs.short_desc" in the second _split_words_into_like call).

Nit: also, factoring out the code into _split_words_into_like is a good idea, but the logical branch based on the third $for_where argument involves relatively little code, and I think the code would be more readable if that branch occurred here rather than in that function.

In other words, have _split_words_into_like return an array of words, then process that array into the select and where clauses here, something like:

             my @words = _split_words_into_like("bugs.short_desc", $v)
             my $term2_where = join(' AND ', @words);
             my $factor = SUMMARY_RELEVANCE_FACTOR;
             my @words = map("CASE WHEN $_ THEN $factor ELSE 0 END", @words);
             my $term2_select = join(' + ', @words);


>+        # XXX Perhaps this should be OR, but to duplicate the behavior of 
>+        # bug 251567 I'm making it AND for now.

Seems to me like it should be OR too.
Attachment #232707 - Flags: review?(myk) → review+
Flags: approval?
Flags: approval? → approval+
Okay, I fixed all the nits. I'll attach a patch that shows what was actually checked in.

Checking in Bugzilla/Search.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Search.pm,v  <--  Search.pm
new revision: 1.137; previous revision: 1.136
done
Checking in Bugzilla/DB/Schema.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Schema.pm,v  <--  Schema.pm
new revision: 1.64; previous revision: 1.63
done
Checking in Bugzilla/Install/DB.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Install/DB.pm,v  <--  DB.pm
new revision: 1.10; previous revision: 1.9
don
Status: ASSIGNED → RESOLVED
Closed: 19 years ago18 years ago
Resolution: --- → FIXED
Attachment #232707 - Attachment is obsolete: true
Blocks: 350237
*** Bug 289207 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: