Add code to run a subset of buglist.cgi search queries against the ES backend

RESOLVED FIXED

Status

()

bugzilla.mozilla.org
Search
P2
normal
RESOLVED FIXED
11 months ago
3 months ago

People

(Reporter: dylan, Assigned: dylan)

Tracking

(Depends on: 1 bug, Blocks: 2 bugs)

Production
Dependency tree / graph

Details

Attachments

(1 attachment, 3 obsolete attachments)

(Assignee)

Description

11 months ago
This is the part of the elasticsearch code that pretends to be a Bugzilla::Search and does the complicated translation between our internal search structure and ES queries. This should be easier to review and deploy.
(Assignee)

Updated

11 months ago
Depends on: 1307478
(Assignee)

Comment 1

7 months ago
Created attachment 8826352 [details] [diff] [review]
1307485_1.patch
Assignee: nobody → dylan
Attachment #8826352 - Flags: review?(dkl)
(Assignee)

Updated

7 months ago
Priority: -- → P2

Comment 2

6 months ago
Comment on attachment 8826352 [details] [diff] [review]
1307485_1.patch

Review of attachment 8826352 [details] [diff] [review]:
-----------------------------------------------------------------

I would also like to see Simple bug search and possible duplicate detection use elasticsearch if possible. Not necessarily a blocker but seem like they would be a good fit.

Also simple searches that return results for DB search are not returning results for Elastic. Such as:

Component: (is equal to any of the strings) Administration
Product: (is equal to any of the strings) bugzilla.mozilla.org 
Resolution: (is equal to any of the strings) FIXED

I get 0 results for elasticsearch. We need to make sure all searches that fall in the searchable criteria for elastic returns the same results as the DB would (> 500 in this case). bulk_index.pl was up to date.

::: Bugzilla/Elastic/Search.pm
@@ +172,5 @@
> +}
> +
> +sub _describe {
> +    my ($thing) = @_;
> +    

remove all whitespace

@@ +234,5 @@
> +    };
> +
> +    my $func = $class_to_func->{ref $thing} or die "nothing for $thing\n";
> +
> +    return $func->($thing);

could you just call _describe($thing) here? duplicate code.

::: Bugzilla/WebService/Elastic.pm
@@ +33,5 @@
> +
> +sub suggest_users {
> +    my ($self, $params) = @_;
> +
> +    Bugzilla->switch_to_shadow_db();

Remove this since we are not accessing the DB.

Also we should check to see that elastic_nodes is set in Params() otherwise user autocomplete 
will not work if Elastic is down. We should fall back to Bugzilla::WebService::User::get if
Elastic is not available. You are doing similar in buglist.cgi.

@@ +46,5 @@
> +    # if ($params->{limit}) {
> +    #     ThrowCodeError('param_must_be_numeric',
> +    #                    { function => 'Elastic.suggest_users', param => 'limit' })
> +    #       unless detaint_natural($params->{limit});
> +    # }

remove this section. does elasticsearch have a way to limit the amount of results? we do not want people trying to retrieve all users.

@@ +67,5 @@
> +__END__
> +
> +=head1 NAME
> +
> +Bugzilla::Webservice::Elastic - Elasticsearch-backed info

Fix POD docs to match what is in this module.

::: buglist.cgi
@@ +780,5 @@
> +    $fulltext = 1;
> +}
> +
> +$vars->{'search_description'} = $search->search_description;
> +$vars->{elastic} = $elastic;

not used in any template from what i can tell
Attachment #8826352 - Flags: review?(dkl) → review-
(Assignee)

Comment 3

6 months ago
Created attachment 8838362 [details] [diff] [review]
1307485_3.patch

Everything except the docs, I'll work on that. You'll need to delete the bugzilla ES database and reindex.
Attachment #8826352 - Attachment is obsolete: true
Attachment #8838362 - Flags: review?(dkl)
(Assignee)

Comment 4

6 months ago
Created attachment 8838377 [details] [diff] [review]
1307485_4.patch

glob pointed out that quicksearch=cheese was matching way more than it should -- this was because my ngram implementation was incorrect. Instead of fixing this, I've removed ngrams and turned substring searches into wildcard searches on the non-analyzed .eq fields. 

This is probably going to be slower (than the non-working ngrams ;)) but we can work at this. Maybe people won't mind if we treat substring searches on product/component as prefix searches? 

we should also encourage the use of allwords/anywords.
Attachment #8838362 - Attachment is obsolete: true
Attachment #8838377 - Flags: review?(dkl)
Attachment #8838362 - Flags: review?(dkl)

Updated

6 months ago
Depends on: 1330768

Comment 5

6 months ago
Comment on attachment 8838377 [details] [diff] [review]
1307485_4.patch

Review of attachment 8838377 [details] [diff] [review]:
-----------------------------------------------------------------

Looking better but atill does not proper handle resolution == '---' which means bugs with no resolution (open)

Elastic:

Status: (is equal to any of the strings) NEW () Component: (is equal to any of the strings) WebService () Product: (is equal to any of the strings) Bugzilla () Resolution: (is equal to any of the strings) --- () 
Zarro Boogs found.

elastic=0:

Status: (is equal to any of the strings) NEW ( bugs.bug_status IN ('NEW') ) Component: (is equal to any of the strings) WebService ( bugs.component_id IN (605) ) Product: (is equal to any of the strings) Bugzilla ( bugs.product_id IN (19) ) Resolution: (is equal to any of the strings) --- ( bugs.resolution IN ('') ) 
45 bugs found.

::: Bugzilla/Elastic/Indexer.pm
@@ +44,3 @@
>                  analysis => {
> +                    filter => {
> +                        asciifolding_original => { 

whitespace

::: Bugzilla/WebService/Elastic.pm
@@ +44,5 @@
> +    ThrowUserError('user_access_by_match_denied')
> +      unless Bugzilla->user->id;
> +
> +    trick_taint($params->{match});
> +    my $results = Bugzilla->elastic->suggest_users($params->{match} . "");

Should we add a fallback here to Bugzilla::WebService::User::get in case elastic is not configured or user search fails inside and eval()? We do similar for buglist.cgi.
If ElasticSearch is unreachable, we would no longer have user auto complete which I guess would not be the end of the world.

@@ +61,5 @@
> +__END__
> +
> +=head1 NAME
> +
> +Bugzilla::Webservice::Elastic - Elasticsearch-backed info

Still need to either remove POD or update to match this module and its functions.

::: template/en/default/list/list.html.tmpl
@@ +304,5 @@
> +[% END %]
> +
> +[% BLOCK elastic_disable %]
> +<br>
> +<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try without ElasticSearch</a>

Would rather see:

<strong>Note:</strong>Using ElasticSearch.
<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try same search without</a>.
Attachment #8838377 - Flags: review?(dkl) → review-
(Assignee)

Comment 6

6 months ago
(In reply to David Lawrence [:dkl] from comment #5)
> Comment on attachment 8838377 [details] [diff] [review]
> 1307485_4.patch
> 
> Review of attachment 8838377 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looking better but atill does not proper handle resolution == '---' which
> means bugs with no resolution (open)
> 
> Elastic:
> 
> Status: (is equal to any of the strings) NEW () Component: (is equal to any
> of the strings) WebService () Product: (is equal to any of the strings)
> Bugzilla () Resolution: (is equal to any of the strings) --- () 
> Zarro Boogs found.

Aha, I'll have to do that in the tree-pruning code.

> elastic=0:
> 
> Status: (is equal to any of the strings) NEW ( bugs.bug_status IN ('NEW') )
> Component: (is equal to any of the strings) WebService ( bugs.component_id
> IN (605) ) Product: (is equal to any of the strings) Bugzilla (
> bugs.product_id IN (19) ) Resolution: (is equal to any of the strings) --- (
> bugs.resolution IN ('') ) 
> 45 bugs found.
> 
> ::: Bugzilla/Elastic/Indexer.pm
> @@ +44,3 @@
> >                  analysis => {
> > +                    filter => {
> > +                        asciifolding_original => { 
> 
> whitespace
> 
> ::: Bugzilla/WebService/Elastic.pm
> @@ +44,5 @@
> > +    ThrowUserError('user_access_by_match_denied')
> > +      unless Bugzilla->user->id;
> > +
> > +    trick_taint($params->{match});
> > +    my $results = Bugzilla->elastic->suggest_users($params->{match} . "");
> 
> Should we add a fallback here to Bugzilla::WebService::User::get in case
> elastic is not configured or user search fails inside and eval()? We do
> similar for buglist.cgi.
> If ElasticSearch is unreachable, we would no longer have user auto complete
> which I guess would not be the end of the world.

The code works exactly that way. Actually, it does more -- if ES is down it just uses the regular matching.

https://gist.github.com/dylanwh/5aa8fb27fab2a6dd0bd565939a144ca8

> @@ +61,5 @@
> > +__END__
> > +
> > +=head1 NAME
> > +
> > +Bugzilla::Webservice::Elastic - Elasticsearch-backed info
> 
> Still need to either remove POD or update to match this module and its
> functions.
> 
> ::: template/en/default/list/list.html.tmpl
> @@ +304,5 @@
> > +[% END %]
> > +
> > +[% BLOCK elastic_disable %]
> > +<br>
> > +<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try without ElasticSearch</a>
> 
> Would rather see:
> 
> <strong>Note:</strong>Using ElasticSearch.
> <a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try same
> search without</a>.
That works for me.

Comment 7

6 months ago
(In reply to Dylan Hardison [:dylan] from comment #6)
> > Should we add a fallback here to Bugzilla::WebService::User::get in case
> > elastic is not configured or user search fails inside and eval()? We do
> > similar for buglist.cgi.
> > If ElasticSearch is unreachable, we would no longer have user auto complete
> > which I guess would not be the end of the world.
> 
> The code works exactly that way. Actually, it does more -- if ES is down it
> just uses the regular matching.
> 
> https://gist.github.com/dylanwh/5aa8fb27fab2a6dd0bd565939a144ca8

Sorry bout that. Was looking for an eval() like you have in buglist.cgi and didnt see it.

dkl
(Assignee)

Comment 8

6 months ago
Created attachment 8843450 [details] [diff] [review]
1307485_6.patch

Note: this adds reporter to the index.
Attachment #8838377 - Attachment is obsolete: true
Attachment #8843450 - Flags: review?(dkl)
(Assignee)

Updated

5 months ago
Duplicate of this bug: 1250689
Comment on attachment 8843450 [details] [diff] [review]
1307485_6.patch

Review of attachment 8843450 [details] [diff] [review]:
-----------------------------------------------------------------

r=dkl
Attachment #8843450 - Flags: review?(dkl) → review+
(Assignee)

Updated

5 months ago
Summary: Bugzilla::Elastic::Search → Add search to run a subset of buglist.cgi search queries against the ES backend
(Assignee)

Updated

5 months ago
Summary: Add search to run a subset of buglist.cgi search queries against the ES backend → Add code to run a subset of buglist.cgi search queries against the ES backend
(Assignee)

Comment 11

5 months ago
To git@github.com:mozilla-bteam/bmo.git
   1ef7b44..9c26c01  master -> master
Status: NEW → RESOLVED
Last Resolved: 5 months ago
Resolution: --- → FIXED
Depends on: 1347016
No longer depends on: 1347016
(Assignee)

Comment 12

5 months ago
Backing this out while I investigate bug 1343533
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Updated

5 months ago
Blocks: 1348379
(Assignee)

Updated

5 months ago
Depends on: 1348380
(Assignee)

Comment 13

5 months ago
To github.com:mozilla-bteam/bmo.git
   4cc54c4c9..1ad85fd04  master -> master

This will go out assuming bug 1348380 can go out.
(Assignee)

Comment 14

5 months ago
To github.com:mozilla-bteam/bmo.git
   e53de8c34..b921e3142  master -> master

Next week, with bug 1348380
(Assignee)

Updated

4 months ago
Depends on: 1359224
(Assignee)

Comment 15

3 months ago
To github.com:mozilla-bteam/bmo.git
   0ea91b298..ffe838a92  master -> master
Status: REOPENED → RESOLVED
Last Resolved: 5 months ago3 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.