Status


--
enhancement
RESOLVED WONTFIX
4 years ago
a month ago

People

(Reporter: dylan, Assigned: dylan)

Tracking

({bmo-goal})

Production
bmo-goal

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 obsolete attachments)

Tracking bug for work to make quicksearch faster via ElasticSearch.
most of this looks sane to me.

>  Write translation layer to translate quicksearches into ES queries.
>  Modifications in the form of additional hooks to the BMO and
>  TrackingFlags extensions.

don't use hooks - update the search code directly.

> Some consideration to paging will need to be thought of.

pagination is a non-goal of this work.
(In reply to Byron Jones ‹:glob› from comment #1)
> >  Write translation layer to translate quicksearches into ES queries.
> >  Modifications in the form of additional hooks to the BMO and
> >  TrackingFlags extensions.
> 
> don't use hooks - update the search code directly.

Ugh. Wish this was happening after the master merge. Anything we add to core now is that much more possible conflict to resolve. Wish extension hooks did not come with the overhead we try to avoid as they make this type of thing much easier :(

dkl
(Assignee)

Comment 3

4 years ago
(In reply to Byron Jones ‹:glob› from comment #1)
> most of this looks sane to me.
> 
> >  Write translation layer to translate quicksearches into ES queries.
> >  Modifications in the form of additional hooks to the BMO and
> >  TrackingFlags extensions.
> 
> don't use hooks - update the search code directly.
Treat those extensions as if they were core? Okay.

> > Some consideration to paging will need to be thought of.
> 
> pagination is a non-goal of this work.

Well, the simple case is a max limit on results. I'll do that.
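As a rough illustration of the max-limit approach, the sketch below caps the number of hits an Elasticsearch search body may request. `MAX_RESULTS`, `with_result_cap`, and the sample query are hypothetical names for illustration and are not taken from any patch on this bug; `size` is the standard Elasticsearch search-body parameter.

```python
MAX_RESULTS = 500  # hypothetical cap; the real limit is not stated in this bug


def with_result_cap(query, requested=None, max_results=MAX_RESULTS):
    """Return an ES search body whose `size` never exceeds max_results."""
    if requested is None or requested <= 0:
        # No usable limit supplied: fall back to the cap itself.
        size = max_results
    else:
        size = min(requested, max_results)
    return {"query": query, "size": size}


body = with_result_cap({"match": {"short_desc": "crash"}}, requested=10000)
print(body["size"])
```

Capping in one place keeps the limit independent of how the quicksearch was translated, which is about as far as this goes without real pagination.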
(Assignee)

Comment 4

4 years ago
(In reply to David Lawrence [:dkl] from comment #2)
> (In reply to Byron Jones ‹:glob› from comment #1)
> > >  Write translation layer to translate quicksearches into ES queries.
> > >  Modifications in the form of additional hooks to the BMO and
> > >  TrackingFlags extensions.
> > 
> > don't use hooks - update the search code directly.
> 
> Ugh. Wish this was happening after the master merge. Anything we add to core
> now is that much more possible conflict to resolve. Wish extension hooks did
> not come with the overhead we try to avoid as they make this type of thing
> much easier :(

There are only a few places in the code where I will be making changes, buglist.cgi and some related areas.
Mostly the code will be in the form of new modules (Bugzilla::Search::Elastic::*, and a PushConnector).
(Assignee)

Updated

4 years ago
User Story: (updated)
This got derailed due to security work, so downgrading this from a goal to a "big" item until we catch up.
Keywords: bmo-goal → bmo-big
(Assignee)

Comment 6

3 years ago
Progress on this has been good, actually!

Comments are now stored as children of their bugs in a parent/child relation. After reading and re-reading the Elasticsearch book (the O'Reilly one?), I believe this is the best fit for our data.
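A minimal sketch of what such a parent/child layout can look like, assuming an Elasticsearch 2.x-style `_parent` mapping (later Elasticsearch versions replace this with a `join` field). The type names `bug` and `comment`, the field names, and both helper functions are assumptions for illustration, not the mapping from the patch.

```python
def build_mappings():
    """Index mappings where each comment document is a child of a bug."""
    return {
        "mappings": {
            "bug": {
                "properties": {
                    "bug_id": {"type": "integer"},
                    "short_desc": {"type": "string"},
                }
            },
            "comment": {
                # Declares `bug` as the parent type of `comment`.
                "_parent": {"type": "bug"},
                "properties": {
                    "body": {"type": "string"},
                },
            },
        }
    }


def bugs_with_comment_matching(text):
    """Query body returning parent bugs that have a matching child comment."""
    return {
        "query": {
            "has_child": {
                "type": "comment",
                "query": {"match": {"body": text}},
            }
        }
    }


print(bugs_with_comment_matching("crash")["query"]["has_child"]["type"])
```

With this layout a new comment can be indexed on its own, without reindexing the parent bug document.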

Meanwhile, adding tracking flags has caused the spectre of the out-of-memory killer to rear its head.
Using $user->address was leaking memory too (filed a bug for the UserProfile as well).

In addition to getting bulk loading right, I have been cooking up an API to get "what has changed" information out of Bugzilla. For non-bug objects, I'm using the audit tables. Bug objects themselves provide enough information.

What still remains an open question: there could be a huge performance impact from users changing their names (and thus invalidating many, many Elasticsearch records). :(
(Assignee)

Updated

3 years ago
Keywords: bmo-big → bmo-goal
(Assignee)

Updated

3 years ago
Depends on: 1250688
(Assignee)

Comment 7

3 years ago
Posted patch WIP.patch (obsolete) — Splinter Review
Note this doesn't pass sanity tests or have boilerplate. It's been severely gutted and re-architected a few times now.

However, the bulk_index.pl script works. It can work incrementally (but you have to comment out the ->create_mapping() call), and it is pretty fast.

Meanwhile, it is able to parse the most common types of quicksearches.

The plan is for it to raise exceptions when asked for fields that don't exist in ES; it doesn't do that yet.

You can index your Bugzilla DB with the indexer and use search.pl to generate ES-compatible queries. But mostly this is just up for general feedback.
Attachment #8726942 - Flags: feedback?(dkl)
(Assignee)

Updated

3 years ago
User Story: (updated)
(Assignee)

Comment 8

3 years ago
I got around to making my "find changed bugs" logic use bugs_activity. 
Considering this is going to be run against (a slave) every 20s, I'd like a DBA to look it over.


SELECT DISTINCT bug_id
  FROM bugs_activity
  JOIN fielddefs ON fieldid = fielddefs.id
 WHERE UNIX_TIMESTAMP(bug_when) > $mtime
   AND fielddefs.name IN ('keywords', 'short_desc', 'product', 'component',
                          'cf_crash_signature', 'alias', 'status_whiteboard',
                          'bug_status', 'resolution')

How horrible is this going to be? What can I do to make it better, if anything?
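One thing that may matter for a query of this shape: the predicate applies a function to the `bug_when` column, which typically prevents an index on that column from being used, whereas comparing the bare column against a precomputed timestamp does not. The sketch below shows the effect with SQLite as a stand-in (an assumption: the real database here is MySQL, whose optimizer and plan output differ). The index name and the simplified single-table schema are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bugs_activity (bug_id INTEGER, fieldid INTEGER, bug_when TEXT);
    CREATE INDEX bugs_activity_when_idx ON bugs_activity (bug_when);
""")


def plan(sql, params):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


mtime = 1457000000  # example Unix timestamp
cutoff = datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Function wrapped around the column: every row has to be converted first.
wrapped = plan(
    "SELECT DISTINCT bug_id FROM bugs_activity "
    "WHERE CAST(strftime('%s', bug_when) AS INTEGER) > ?",
    (mtime,),
)

# Bare column compared with a precomputed value: the index is usable.
bare = plan(
    "SELECT DISTINCT bug_id FROM bugs_activity WHERE bug_when > ?",
    (cutoff,),
)

print(wrapped)
print(bare)
```

Passing the cutoff as a bound parameter also avoids interpolating `$mtime` into the SQL string.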
Flags: needinfo?(mpressman)
(Assignee)

Comment 9

3 years ago
It looks like the following is faster by a lot:

SELECT DISTINCT bug_id
  FROM bugs_activity
  JOIN fielddefs ON fieldid = fielddefs.id
 WHERE bug_when > FROM_UNIXTIME($mtime)
   AND fielddefs.name IN ('keywords', 'short_desc', 'product', 'component',
                          'cf_crash_signature', 'alias', 'status_whiteboard',
                          'bug_status', 'resolution')
(Assignee)

Comment 10

3 years ago
Posted patch 1184823_3.patch (obsolete) — Splinter Review
More stuff. This has the js/field user auto-completion backed by a native REST wrapper around Elasticsearch. It's not quite as fast, but still pretty fast.

Currently adding anonymized search queries to t/015_*.t (not in this patch, because it could be sensitive)

Based on the corpus, the operators I still need to support are:

1) anywords (used by the keyword syntax, !foo)
2) notsubstring
3) notequals

Probably more, but the not* ones present a particular problem with translating them to ES. If I can't figure it out today, they will be dropped from initial support. We'll still be able to handle a large number of searches this way.
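For illustration only, here is one way these three operators could be expressed in the Elasticsearch query DSL, using `bool`/`must_not` for the negated forms. This is a sketch under assumptions, not the translation used in the patch: `translate` and `escape_wildcard` are hypothetical helpers, and the `term` and `wildcard` clauses assume the field is indexed as a single unanalyzed token.

```python
import re


def escape_wildcard(value):
    """Backslash-escape the characters that are special in a wildcard query."""
    return re.sub(r"([\\*?])", r"\\\1", value)


def translate(field, operator, value):
    """Translate one (field, operator, value) clause into an ES query dict."""
    if operator == "anywords":
        # Match documents containing any of the whitespace-separated words.
        return {"match": {field: {"query": value, "operator": "or"}}}
    if operator == "notequals":
        return {"bool": {"must_not": [{"term": {field: value}}]}}
    if operator == "notsubstring":
        pattern = "*" + escape_wildcard(value) + "*"
        return {"bool": {"must_not": [{"wildcard": {field: pattern}}]}}
    raise ValueError("unsupported operator: " + operator)


print(translate("product", "notequals", "Core"))
```

Note that a `must_not` clause also matches documents in which the field is missing entirely, so the negated forms need checking against whatever the SQL search returns for empty values.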
Attachment #8726942 - Attachment is obsolete: true
Attachment #8726942 - Flags: feedback?(dkl)
Attachment #8727511 - Flags: feedback?(dkl)
(Assignee)

Comment 11

3 years ago
Posted patch 1184823_4.patch (obsolete) — Splinter Review
Last semi-working copy. I'm in the middle of refactoring, so this is a snapshot because I wanted to share.
(Assignee)

Comment 12

3 years ago
Posted patch 1184823_5.patch (obsolete) — Splinter Review
Attachment #8727511 - Attachment is obsolete: true
Attachment #8735691 - Attachment is obsolete: true
Attachment #8727511 - Flags: feedback?(dkl)
Flags: needinfo?(mpressman)
Attachment #8738025 - Flags: review?(dkl)
Comment on attachment 8738025
1184823_5.patch

Review of attachment 8738025:
-----------------------------------------------------------------

Remove references to Alive.pm (debugging) and spin a new patch.
Thanks
Attachment #8738025 - Flags: review?(dkl) → review-
(Assignee)

Comment 14

3 years ago
Posted patch 1184823_6.patch (obsolete) — Splinter Review
Attachment #8738025 - Attachment is obsolete: true
Attachment #8739091 - Flags: review?(dkl)
(Assignee)

Updated

3 years ago
Blocks: 1274418

Comment 15

3 years ago
Will I still be able to search for exact strings, including strings that contain punctuation? (I ask because it's important to me in Bugzilla, and because many search systems don't support it.)
(Assignee)

Comment 16

3 years ago
(In reply to Jesse Ruderman from comment #15)
> Will I still be able to search for exact strings, including strings that
> contain punctuation? (I ask because it's important to me in Bugzilla, and
> because many search systems don't support it.)

Yes, with caveats: exact-string search isn't available for comments, for space reasons.
It is also possible to force a fallback onto the SQL-based system.
(A fallback also occurs if a field that is not indexed is searched for, or if an operator that is not supported is used.)
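The fallback rule can be sketched roughly as follows. Everything here is hypothetical: the `UnsupportedSearch` exception, the `run_search` function, and the contents of `INDEXED_FIELDS` and `SUPPORTED_OPERATORS` are assumptions chosen for illustration (the field list borrows the names from the change-detection query earlier in this bug), and the two backends are passed in as plain callables.

```python
class UnsupportedSearch(Exception):
    """Raised when a clause cannot be answered from the Elasticsearch index."""


# Assumed sets; the real indexed fields and operators are not listed in this bug.
INDEXED_FIELDS = {
    "keywords", "short_desc", "product", "component", "cf_crash_signature",
    "alias", "status_whiteboard", "bug_status", "resolution",
}
SUPPORTED_OPERATORS = {"equals", "substring", "allwords"}


def check_clause(field, operator):
    if field not in INDEXED_FIELDS:
        raise UnsupportedSearch("field not indexed: " + field)
    if operator not in SUPPORTED_OPERATORS:
        raise UnsupportedSearch("operator not supported: " + operator)


def run_search(clauses, es_search, sql_search, force_sql=False):
    """Use Elasticsearch when every clause is supported, otherwise SQL."""
    if force_sql:
        return "sql", sql_search(clauses)
    try:
        for field, operator, _value in clauses:
            check_clause(field, operator)
        return "elastic", es_search(clauses)
    except UnsupportedSearch:
        return "sql", sql_search(clauses)


engine, _ = run_search(
    [("longdesc", "substring", "crash")],
    es_search=lambda clauses: [],
    sql_search=lambda clauses: [],
)
print(engine)
```

Validating every clause before calling Elasticsearch means a search is answered entirely by one backend, never by a mix of the two.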

If you provide me a list of your searches, I'll test them against my system and tell you the result. I used a corpus of 4000 common quicksearches to figure out what queries to focus on.
Flags: needinfo?(jruderman)
(Assignee)

Updated

3 years ago
Flags: needinfo?(jruderman)
(Assignee)

Updated

3 years ago
Depends on: 1307478
(Assignee)

Updated

3 years ago
Depends on: 1307485
(Assignee)

Updated

3 years ago
Attachment #8739091 - Flags: review?(dkl)
Depends on: 1316660
(Assignee)

Updated

2 years ago
No longer depends on: 1250688
No longer depends on: 1307478
No longer blocks: 1274418
Depends on: 1274418
(Assignee)

Updated

2 years ago
Attachment #8739091 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
User Story: (updated)

As per dylan, Elasticsearch is no more.

Status: NEW → RESOLVED
Last Resolved: a month ago
Resolution: --- → WONTFIX