
Status


bugzilla.mozilla.org
Search
--
enhancement
2 years ago
16 days ago

People

(Reporter: dylan, Assigned: dylan)

Tracking

(Depends on: 2 bugs, {bmo-goal})

Production
bmo-goal

Details

Attachments

(5 obsolete attachments)

(Assignee)

Description

2 years ago
Tracking bug for work to make quicksearch faster via ElasticSearch.
most of this looks sane to me.

>  Write translation layer to translate quicksearches into ES queries.
>  Modifications in the form of additional hooks to the BMO and
>  TrackingFlags extensions.

don't use hooks - update the search code directly.

> Some consideration to paging will need to be thought of.

pagination is a non-goal of this work.
(In reply to Byron Jones ‹:glob› from comment #1)
> >  Write translation layer to translate quicksearches into ES queries.
> >  Modifications in the form of additional hooks to the BMO and
> >  TrackingFlags extensions.
> 
> don't use hooks - update the search code directly.

Ugh. Wish this was happening after the master merge. Anything we add to core now is that much more possible conflict to resolve. Wish extension hooks did not come with the overhead we try to avoid as they make this type of thing much easier :(

dkl
(Assignee)

Comment 3

2 years ago
(In reply to Byron Jones ‹:glob› from comment #1)
> most of this looks sane to me.
> 
> >  Write translation layer to translate quicksearches into ES queries.
> >  Modifications in the form of additional hooks to the BMO and
> >  TrackingFlags extensions.
> 
> don't use hooks - update the search code directly.
Treat those extensions as if they were core? Okay.

> > Some consideration to paging will need to be thought of.
> 
> pagination is a non-goal of this work.

Well, the simple case is a max limit on results. I'll do that.
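A minimal sketch of the "max limit on results" idea, assuming the cap is applied through the `size` field of the Elasticsearch request body. The function name and the value of `MAX_RESULTS` are hypothetical and not taken from any patch on this bug.

```python
MAX_RESULTS = 500  # hypothetical cap; the real limit is not stated in this bug


def limited_search_body(query, requested=None, max_results=MAX_RESULTS):
    """Build an ES request body whose result count never exceeds max_results."""
    if requested is None:
        size = max_results
    else:
        # Clamp the caller's request into the range [0, max_results].
        size = max(0, min(requested, max_results))
    return {"query": query, "size": size}


body = limited_search_body({"match_all": {}}, requested=10000)
```

A plain cap like this avoids real pagination: callers get at most `max_results` hits and no cursor to fetch more.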
(Assignee)

Comment 4

2 years ago
(In reply to David Lawrence [:dkl] from comment #2)
> (In reply to Byron Jones ‹:glob› from comment #1)
> > >  Write translation layer to translate quicksearches into ES queries.
> > >  Modifications in the form of additional hooks to the BMO and
> > >  TrackingFlags extensions.
> > 
> > don't use hooks - update the search code directly.
> 
> Ugh. Wish this was happening after the master merge. Anything we add to core
> now is that much more possible conflict to resolve. Wish extension hooks did
> not come with the overhead we try to avoid as they make this type of thing
> much easier :(

There are only a few places in the code where I will be making changes, buglist.cgi and some related areas.
Mostly the code will be in the form of new modules (Bugzilla::Search::Elastic::*, and a PushConnector).
(Assignee)

Updated

2 years ago
User Story: (updated)

Comment 5

2 years ago
This got derailed due to security work, so downgrading this from a goal to a "big" item until we catch up.
Keywords: bmo-goal → bmo-big
(Assignee)

Comment 6

a year ago
Progress on this has been good, actually!

Comments are now stored as parent/child relations to the bugs. After reading and re-reading the Elasticsearch book (the O'Reilly one?), I believe this is the best fit for our data.
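To illustrate the parent/child layout, here is a rough sketch using the `_parent` mapping syntax of Elasticsearch releases before 6.0 (later releases replaced it with a join field). The type names `bug` and `comment` and the field names are assumptions for illustration; the actual mapping in the patches may differ.

```python
def build_mapping():
    """Hypothetical pre-6.0 mapping: each 'comment' document is a child of a 'bug'."""
    return {
        "mappings": {
            "bug": {"properties": {"short_desc": {"type": "string"}}},
            "comment": {
                "_parent": {"type": "bug"},  # links every comment to its parent bug
                "properties": {"body": {"type": "string"}},
            },
        }
    }


def bugs_with_comment_matching(text):
    """Query body that returns bugs having at least one child comment matching text."""
    return {
        "query": {
            "has_child": {
                "type": "comment",
                "query": {"match": {"body": text}},
            }
        }
    }
```

One appeal of this layout is that a new or edited comment can be indexed on its own, without reindexing the parent bug document.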

Meanwhile, adding tracking flags has caused the spectre of the out of memory killer to rear its head.
Using $user->address was leaking memory too (filed a bug for the UserProfile as well).

In addition to getting bulk loading right, I have been cooking up an API to get "what has changed" information out of Bugzilla. For non-bug objects, I'm using the audit tables. Bug objects themselves provide enough information.

What still remains an open question: there could be huge performance impacts from users changing their names (and thus invalidating many, many Elasticsearch records). :(
(Assignee)

Updated

a year ago
Keywords: bmo-big → bmo-goal
(Assignee)

Updated

a year ago
Depends on: 1250688
(Assignee)

Comment 7

a year ago
Created attachment 8726942 [details] [diff] [review]
WIP.patch

Note this doesn't pass sanity tests, or have boilerplate. It's been severely gutted and re-architected a few times now.

However, the bulk_index.pl script works. It can work incrementally (but you have to comment out the ->create_mapping() call),
and it is pretty fast.

Meanwhile, it is able to parse the most common types of quicksearches.

The plan is for it to raise exceptions when asked for fields that don't exist in ES -- it doesn't do that yet.

You can index your bugzilla db with the indexer and use search.pl to generate ES-compatible queries. But mostly this is just up for general feedback.
Attachment #8726942 - Flags: feedback?(dkl)
(Assignee)

Updated

a year ago
User Story: (updated)
(Assignee)

Comment 8

a year ago
I got around to making my "find changed bugs" logic use bugs_activity. 
Considering this is going to be run against (a slave) every 20s, I'd like a DBA to look it over.


SELECT DISTINCT bug_id
          FROM bugs_activity
          JOIN fielddefs on fieldid = fielddefs.id
        WHERE UNIX_TIMESTAMP(bug_when) > $mtime
          AND fielddefs.name IN ("keywords", "short_desc", "product", "component", 
                                 "cf_crash_signature", "alias", "status_whiteboard", "bug_status", "resolution")

How horrible is this going to be? What can I do to make it better, if anything?
Flags: needinfo?(mpressman)
(Assignee)

Comment 9

a year ago
It looks like the following is faster by a lot:

SELECT DISTINCT bug_id
          FROM bugs_activity
          JOIN fielddefs on fieldid = fielddefs.id
        WHERE bug_when > FROM_UNIXTIME($mtime)
          AND fielddefs.name IN ("keywords", "short_desc", "product", "component", 
                                 "cf_crash_signature", "alias", "status_whiteboard", "bug_status", "resolution")
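
A likely reason for the difference: wrapping bug_when in UNIX_TIMESTAMP() stops the database from using an index on that column, while comparing the bare column against a converted constant lets it do an index range scan. The sketch below shows the same effect with SQLite instead of MySQL, because it runs without a server. The simplified table, the index name, and the `query_plan` helper are assumptions for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Simplified stand-in for bugs_activity, with an index on bug_when (assumed).
conn.execute("CREATE TABLE bugs_activity (bug_id INTEGER, fieldid INTEGER, bug_when TEXT)")
conn.execute("CREATE INDEX bugs_activity_bug_when_idx ON bugs_activity (bug_when)")

mtime = 1457000000  # arbitrary example epoch timestamp

# Function applied to the column: every row must be converted and compared.
wrapped = ("SELECT DISTINCT bug_id FROM bugs_activity "
           "WHERE CAST(strftime('%s', bug_when) AS INTEGER) > ?")
# Function applied to the constant: the bare column can be matched via the index.
bare = ("SELECT DISTINCT bug_id FROM bugs_activity "
        "WHERE bug_when > datetime(?, 'unixepoch')")

wrapped_plan = query_plan(conn, wrapped, (mtime,))
bare_plan = query_plan(conn, bare, (mtime,))
print("wrapped:", wrapped_plan)
print("bare:   ", bare_plan)
```

Only the second plan should mention the index. Running MySQL's EXPLAIN on the two queries above would be the way to confirm the same behaviour on the real schema.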
(Assignee)

Comment 10

a year ago
Created attachment 8727511 [details] [diff] [review]
1184823_3.patch

More stuff. This has the js/field user auto-completion backed by a native REST wrapper around Elasticsearch. It's not quite as fast, but still pretty fast.

Currently adding anonymized search queries to t/015_*.t (not in this patch, because it could be sensitive)

Based on the corpus, the operators I need to support still are:

1) anywords (used by the keyword syntax, !foo)
2) notsubstring
3) notequals

Probably more, but the not* ones present a particular problem with translating them to ES. If I can't figure it out today, they will be dropped from initial support. We'll still be able to handle a large number of searches this way.
Attachment #8726942 - Attachment is obsolete: true
Attachment #8726942 - Flags: feedback?(dkl)
Attachment #8727511 - Flags: feedback?(dkl)
(Assignee)

Comment 11

a year ago
Created attachment 8735691 [details] [diff] [review]
1184823_4.patch

last semi-working copy (in the middle of refactoring, so this is a snapshot 'cause I wanted to share)
(Assignee)

Comment 12

a year ago
Created attachment 8738025 [details] [diff] [review]
1184823_5.patch
Attachment #8727511 - Attachment is obsolete: true
Attachment #8735691 - Attachment is obsolete: true
Attachment #8727511 - Flags: feedback?(dkl)
Flags: needinfo?(mpressman)
Attachment #8738025 - Flags: review?(dkl)
Comment on attachment 8738025 [details] [diff] [review]
1184823_5.patch

Review of attachment 8738025 [details] [diff] [review]:
-----------------------------------------------------------------

Remove references to Alive.pm (debugging) and spin new patch. 
Thanks
Attachment #8738025 - Flags: review?(dkl) → review-
(Assignee)

Comment 14

a year ago
Created attachment 8739091 [details] [diff] [review]
1184823_6.patch
Attachment #8738025 - Attachment is obsolete: true
Attachment #8739091 - Flags: review?(dkl)
(Assignee)

Updated

10 months ago
Blocks: 1274418

Comment 15

10 months ago
Will I still be able to search for exact strings, including strings that contain punctuation? (I ask because it's important to me in Bugzilla, and because many search systems don't support it.)
(Assignee)

Comment 16

10 months ago
(In reply to Jesse Ruderman from comment #15)
> Will I still be able to search for exact strings, including strings that
> contain punctuation? (I ask because it's important to me in Bugzilla, and
> because many search systems don't support it.)

Yes, with caveats (for comments this isn't allowed, for space considerations).
It is also possible to force a fallback onto the sql-based system.
(A fallback also occurs if a field that is not indexed is searched for, or if an operator that is not supported is used.)
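
The fallback rule could be sketched roughly as follows. The field and operator sets, the exception class, and the function names are all hypothetical; the real translation layer lives in the Perl modules from the patches, and this only illustrates the decision.

```python
# Hypothetical sets; the real lists of indexed fields and supported operators may differ.
INDEXED_FIELDS = {"short_desc", "product", "component", "keywords", "bug_status", "resolution"}
SUPPORTED_OPERATORS = {"equals", "substring"}


class UnsupportedQuery(Exception):
    """Raised when a criterion cannot be expressed as an ES query."""


def to_es_clause(field, operator, value):
    """Translate one (field, operator, value) criterion into an ES clause."""
    if field not in INDEXED_FIELDS:
        raise UnsupportedQuery("field not indexed: " + field)
    if operator not in SUPPORTED_OPERATORS:
        raise UnsupportedQuery("operator not supported: " + operator)
    if operator == "equals":
        return {"term": {field: value}}
    # substring: naive wildcard match; a real version must escape '*' and '?' in value
    return {"wildcard": {field: "*" + value + "*"}}


def plan_search(criteria):
    """Return ('elasticsearch', body), or ('sql', None) when any criterion is unsupported."""
    try:
        clauses = [to_es_clause(*criterion) for criterion in criteria]
    except UnsupportedQuery:
        return ("sql", None)
    return ("elasticsearch", {"query": {"bool": {"must": clauses}}})
```

For example, `plan_search([("product", "notsubstring", "Core")])` returns `("sql", None)` here, because `notsubstring` is not in the assumed operator set.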

If you provide me a list of your searches, I'll test them against my system and tell you the result. I used a corpus of 4000 common quicksearches to figure out what queries to focus on.
Flags: needinfo?(jruderman)
(Assignee)

Updated

9 months ago
Flags: needinfo?(jruderman)
(Assignee)

Updated

6 months ago
Depends on: 1307478
(Assignee)

Updated

6 months ago
Depends on: 1307485
(Assignee)

Updated

6 months ago
Attachment #8739091 - Flags: review?(dkl)

Updated

3 months ago
Depends on: 1316660
(Assignee)

Updated

3 months ago
No longer depends on: 1250688

Updated

3 months ago
No longer depends on: 1307478

Updated

3 months ago
No longer blocks: 1274418
Depends on: 1274418
(Assignee)

Updated

3 months ago
Attachment #8739091 - Attachment is obsolete: true
(Assignee)

Updated

3 months ago
User Story: (updated)
Depends on: 1347016
Severity: normal → enhancement