Last Comment Bug 1307485 - Bugzilla::Elastic::Search
: Bugzilla::Elastic::Search
Status: NEW
:
Product: bugzilla.mozilla.org
Classification: Other
Component: Search (show other bugs)
: Production
: Unspecified Unspecified
P2 normal (vote)
: ---
Assigned To: Dylan Hardison [:dylan]
:
:
Mentors:
Depends on: 1307478 1330768
Blocks: 1184823
  Show dependency treegraph
 
Reported: 2016-10-04 08:33 PDT by Dylan Hardison [:dylan]
Modified: 2017-02-27 15:38 PST (History)
1 user (show)
See Also:
Due Date:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
1307485_1.patch (40.68 KB, patch)
2017-01-12 14:53 PST, Dylan Hardison [:dylan]
dkl: review-
Details | Diff | Splinter Review
1307485_3.patch (50.31 KB, patch)
2017-02-16 19:01 PST, Dylan Hardison [:dylan]
no flags Details | Diff | Splinter Review
1307485_4.patch (51.36 KB, patch)
2017-02-16 20:18 PST, Dylan Hardison [:dylan]
dkl: review-
Details | Diff | Splinter Review

Description User image Dylan Hardison [:dylan] 2016-10-04 08:33:03 PDT
This is the part of the elasticsearch code that pretends to be a Bugzilla::Search and does the complicated translation between our internal search structure and ES queries. This should be easier to review and deploy.
Comment 1 User image Dylan Hardison [:dylan] 2017-01-12 14:53:29 PST
Created attachment 8826352 [details] [diff] [review]
1307485_1.patch
Comment 2 User image David Lawrence [:dkl] 2017-02-10 12:58:53 PST
Comment on attachment 8826352 [details] [diff] [review]
1307485_1.patch

Review of attachment 8826352 [details] [diff] [review]:
-----------------------------------------------------------------

I would also like to see Simple bug search and possible duplicate detection use elasticsearch if possible. Not necessarily a blocker but seem like they would be a good fit.

Also simple searches that return results for DB search are not returning results for Elastic. Such as:

Component: (is equal to any of the strings) Administration
Product: (is equal to any of the strings) bugzilla.mozilla.org 
Resolution: (is equal to any of the strings) FIXED

I get 0 results for elasticsearch. We need to make sure all searches that fall in the searchable criteria for elastic returns the same results as the DB would (> 500 in this case). bulk_index.pl was up to date.

::: Bugzilla/Elastic/Search.pm
@@ +172,5 @@
> +}
> +
> +sub _describe {
> +    my ($thing) = @_;
> +    

remove all whitespace

@@ +234,5 @@
> +    };
> +
> +    my $func = $class_to_func->{ref $thing} or die "nothing for $thing\n";
> +
> +    return $func->($thing);

could you just call _describe($thing) here? duplicate code.

::: Bugzilla/WebService/Elastic.pm
@@ +33,5 @@
> +
> +sub suggest_users {
> +    my ($self, $params) = @_;
> +
> +    Bugzilla->switch_to_shadow_db();

Remove this since we are not accessing the DB.

Also we should check to see that elastic_nodes is set in Params() otherwise user autocomplete 
will not work if Elastic is down. We should fall back to Bugzilla::WebService::User::get if
Elastic is not available. You are doing similar in buglist.cgi.

@@ +46,5 @@
> +    # if ($params->{limit}) {
> +    #     ThrowCodeError('param_must_be_numeric',
> +    #                    { function => 'Elastic.suggest_users', param => 'limit' })
> +    #       unless detaint_natural($params->{limit});
> +    # }

remove this section. does elasticsearch have a way to limit the amount of results? we do not want people trying to retrieve all users.

@@ +67,5 @@
> +__END__
> +
> +=head1 NAME
> +
> +Bugzilla::Webservice::Elastic - Elasticsearch-backed info

Fix POD docs to match what is in this module.

::: buglist.cgi
@@ +780,5 @@
> +    $fulltext = 1;
> +}
> +
> +$vars->{'search_description'} = $search->search_description;
> +$vars->{elastic} = $elastic;

not used in any template from what i can tell
Comment 3 User image Dylan Hardison [:dylan] 2017-02-16 19:01:34 PST
Created attachment 8838362 [details] [diff] [review]
1307485_3.patch

Everything except the docs, I'll work on that. You'll need to delete the bugzilla ES database and reindex.
Comment 4 User image Dylan Hardison [:dylan] 2017-02-16 20:18:27 PST
Created attachment 8838377 [details] [diff] [review]
1307485_4.patch

glob pointed out that quicksearch=cheese was matching way more than it should -- this was because my ngram implementation was incorrect. Instead of fixing this, I've removed ngrams and turned substring searches into wildcard searches on the non-analyzed .eq fields. 

This is probably going to be slower (than the non-working ngrams ;)) but we can work at this. Maybe people won't mind if we treat substring searches on product/component as prefix searches? 

we should also encourage the use of allwords/anywords.
Comment 5 User image David Lawrence [:dkl] 2017-02-27 14:59:00 PST
Comment on attachment 8838377 [details] [diff] [review]
1307485_4.patch

Review of attachment 8838377 [details] [diff] [review]:
-----------------------------------------------------------------

Looking better but atill does not proper handle resolution == '---' which means bugs with no resolution (open)

Elastic:

Status: (is equal to any of the strings) NEW () Component: (is equal to any of the strings) WebService () Product: (is equal to any of the strings) Bugzilla () Resolution: (is equal to any of the strings) --- () 
Zarro Boogs found.

elastic=0:

Status: (is equal to any of the strings) NEW ( bugs.bug_status IN ('NEW') ) Component: (is equal to any of the strings) WebService ( bugs.component_id IN (605) ) Product: (is equal to any of the strings) Bugzilla ( bugs.product_id IN (19) ) Resolution: (is equal to any of the strings) --- ( bugs.resolution IN ('') ) 
45 bugs found.

::: Bugzilla/Elastic/Indexer.pm
@@ +44,3 @@
>                  analysis => {
> +                    filter => {
> +                        asciifolding_original => { 

whitespace

::: Bugzilla/WebService/Elastic.pm
@@ +44,5 @@
> +    ThrowUserError('user_access_by_match_denied')
> +      unless Bugzilla->user->id;
> +
> +    trick_taint($params->{match});
> +    my $results = Bugzilla->elastic->suggest_users($params->{match} . "");

Should we add a fallback here to Bugzilla::WebService::User::get in case elastic is not configured or user search fails inside and eval()? We do similar for buglist.cgi.
If ElasticSearch is unreachable, we would no longer have user auto complete which I guess would not be the end of the world.

@@ +61,5 @@
> +__END__
> +
> +=head1 NAME
> +
> +Bugzilla::Webservice::Elastic - Elasticsearch-backed info

Still need to either remove POD or update to match this module and its functions.

::: template/en/default/list/list.html.tmpl
@@ +304,5 @@
> +[% END %]
> +
> +[% BLOCK elastic_disable %]
> +<br>
> +<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try without ElasticSearch</a>

Would rather see:

<strong>Note:</strong>Using ElasticSearch.
<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try same search without</a>.
Comment 6 User image Dylan Hardison [:dylan] 2017-02-27 15:04:53 PST
(In reply to David Lawrence [:dkl] from comment #5)
> Comment on attachment 8838377 [details] [diff] [review]
> 1307485_4.patch
> 
> Review of attachment 8838377 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looking better but atill does not proper handle resolution == '---' which
> means bugs with no resolution (open)
> 
> Elastic:
> 
> Status: (is equal to any of the strings) NEW () Component: (is equal to any
> of the strings) WebService () Product: (is equal to any of the strings)
> Bugzilla () Resolution: (is equal to any of the strings) --- () 
> Zarro Boogs found.

Aha, I'll have to do that in the tree-pruning code.

> elastic=0:
> 
> Status: (is equal to any of the strings) NEW ( bugs.bug_status IN ('NEW') )
> Component: (is equal to any of the strings) WebService ( bugs.component_id
> IN (605) ) Product: (is equal to any of the strings) Bugzilla (
> bugs.product_id IN (19) ) Resolution: (is equal to any of the strings) --- (
> bugs.resolution IN ('') ) 
> 45 bugs found.
> 
> ::: Bugzilla/Elastic/Indexer.pm
> @@ +44,3 @@
> >                  analysis => {
> > +                    filter => {
> > +                        asciifolding_original => { 
> 
> whitespace
> 
> ::: Bugzilla/WebService/Elastic.pm
> @@ +44,5 @@
> > +    ThrowUserError('user_access_by_match_denied')
> > +      unless Bugzilla->user->id;
> > +
> > +    trick_taint($params->{match});
> > +    my $results = Bugzilla->elastic->suggest_users($params->{match} . "");
> 
> Should we add a fallback here to Bugzilla::WebService::User::get in case
> elastic is not configured or user search fails inside and eval()? We do
> similar for buglist.cgi.
> If ElasticSearch is unreachable, we would no longer have user auto complete
> which I guess would not be the end of the world.

The code works exactly that way. Actually, it does more -- if ES is down it just uses the regular matching.

https://gist.github.com/dylanwh/5aa8fb27fab2a6dd0bd565939a144ca8

> @@ +61,5 @@
> > +__END__
> > +
> > +=head1 NAME
> > +
> > +Bugzilla::Webservice::Elastic - Elasticsearch-backed info
> 
> Still need to either remove POD or update to match this module and its
> functions.
> 
> ::: template/en/default/list/list.html.tmpl
> @@ +304,5 @@
> > +[% END %]
> > +
> > +[% BLOCK elastic_disable %]
> > +<br>
> > +<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try without ElasticSearch</a>
> 
> Would rather see:
> 
> <strong>Note:</strong>Using ElasticSearch.
> <a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try same
> search without</a>.
That works for me.
Comment 7 User image David Lawrence [:dkl] 2017-02-27 15:38:14 PST
(In reply to Dylan Hardison [:dylan] from comment #6)
> > Should we add a fallback here to Bugzilla::WebService::User::get in case
> > elastic is not configured or user search fails inside and eval()? We do
> > similar for buglist.cgi.
> > If ElasticSearch is unreachable, we would no longer have user auto complete
> > which I guess would not be the end of the world.
> 
> The code works exactly that way. Actually, it does more -- if ES is down it
> just uses the regular matching.
> 
> https://gist.github.com/dylanwh/5aa8fb27fab2a6dd0bd565939a144ca8

Sorry bout that. Was looking for an eval() like you have in buglist.cgi and didnt see it.

dkl

Note You need to log in before you can comment on or make changes to this bug.