Closed Bug 1091643 Opened 7 years ago Closed 7 years ago

Switch bugscache queries from full-text search to LIKE

Categories

(Tree Management :: Treeherder, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(1 file, 1 obsolete file)

54 bytes, text/x-github-pull-request
mdoglio: review+
In bug 1057359, it was found that for the failures in bug 1078237 (and likely some of the other missing-suggestions issues covered in the dependent bugs), the correct search term was being used (compared to TBPL), but the bugscache API was returning zero results.

For example...

Cleaned error line:
TEST-UNEXPECTED-FAIL | test_switch_frame.py TestSwitchFrame.test_should_be_able_to_carry_on_working_if_the_frame_is_deleted_from_under_us | AssertionError: 0 != 1

Extracted test name (after 100 char truncation):
test_switch_frame.py TestSwitchFrame.test_should_be_able_to_carry_on_working_if_the_frame_is_deleted

Search:
https://treeherder.mozilla.org/api/bugscache/?search=test_switch_frame.py TestSwitchFrame.test_should_be_able_to_carry_on_working_if_the_frame_is_deleted

I believe this is because we've truncated the test name after the word "deleted", whereas the bug summary in the DB is:
Intermittent test_switch_frame.py TestSwitchFrame.test_should_be_able_to_carry_on_working_if_the_frame_is_deleted_from_under_us | AssertionError: 0 != 1

...which means the full-text search doesn't find a match.

I also believe that the full-text search stopwords are interfering with results in other cases.

tl;dr: I believe we should be using LIKE rather than full-text search (at least for the WHERE clause; having a relevance score returned is still useful).
Summary: Switch bugscache queries from fulltext search to LIKE → Switch bugscache queries from full-text search to LIKE
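To illustrate the mismatch described above, here is a minimal Python sketch. It assumes a deliberately simplified model of word-based matching in which underscores count as word characters; MySQL's real full-text behaviour (search modes, stopwords, minimum word length) is more involved. The helper names are hypothetical and not taken from Treeherder.

```python
import re

# Bug summary and truncated search term, both quoted from this bug.
SUMMARY = (
    "Intermittent test_switch_frame.py TestSwitchFrame."
    "test_should_be_able_to_carry_on_working_if_the_frame_is_deleted_from_under_us"
    " | AssertionError: 0 != 1"
)
SEARCH = (
    "test_switch_frame.py TestSwitchFrame."
    "test_should_be_able_to_carry_on_working_if_the_frame_is_deleted"
)


def words(text):
    """Split text into lower-cased words (assumption: '_' is a word character)."""
    return set(re.findall(r"\w+", text.lower()))


def unmatched_words(search, summary):
    """Return the search words that are not whole words of the summary."""
    return words(search) - words(summary)


def substring_match(search, summary):
    """LIKE '%term%' style check: is the search term a substring of the summary?"""
    return search.lower() in summary.lower()


print(unmatched_words(SEARCH, SUMMARY))
print(substring_match(SEARCH, SUMMARY))
```

In this model the truncated test name is the one search word with no whole-word counterpart in the summary, yet the full search term is still a plain substring of it, which is the case LIKE handles.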
Attached file WIP (obsolete) —
This WIP doesn't work and I'm having trouble figuring out if I should be doing something differently, or if it's a bug in datasource.

The current patch fails with:
ValueError: unsupported format character ''' (0x27) at index 327

And using "%%" instead of "%" returns no matches.

Any ideas? (Travis run at https://travis-ci.org/mozilla/treeherder-service/builds/39501100 )
Attachment #8514308 - Flags: feedback?(mdoglio)
(In reply to Ed Morley [:edmorley] from comment #1)
> And using "%%" instead of "%" returns no matches.

(Slightly clearer:)
And whilst using "%%" instead of "%" fixes the exception, it returns no bug matches for any of the search terms, including the previously passing ones.
Attachment #8514308 - Attachment mime type: text/plain → text/x-github-pull-request
Did you try to pass "%yourstring%" as a placeholder?
Attached file Final PR
(I pasted the wrong URL)

Travis run with '%':
https://travis-ci.org/mozilla/treeherder-service/builds/39501100

Travis run with '%%':
https://travis-ci.org/mozilla/treeherder-service/builds/39503251
Attachment #8514326 - Flags: feedback?(mdoglio)
Attachment #8514308 - Attachment is obsolete: true
Attachment #8514308 - Flags: feedback?(mdoglio)
(Sorry our messages passed each other)

Using:
AND `summary` LIKE `%test%`

Gives:
ValueError: unsupported format character 't' (0x74) at index 319

And:
AND `summary` LIKE `%%test%%`

Gives:
TypeError: not all arguments converted during string formatting
Blocks: 1072377
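For anyone hitting the same exceptions: both can be reproduced with plain Python %-formatting, on the assumption that the query layer applies `template % args` to the SQL text before sending it. The table name and the `try_format` helper below are hypothetical; this is only a sketch of the failure mode, not the datasource code.

```python
def try_format(template, args):
    """Apply %-formatting and return either the result or the error as text."""
    try:
        return template % args
    except (ValueError, TypeError) as exc:
        return "{}: {}".format(type(exc).__name__, exc)


# One argument is supplied, as a driver would do for a parameterised query.
args = ("ignored",)

# A bare '%' followed by 't' is parsed as an (invalid) format specifier.
single = "SELECT id FROM bugscache WHERE summary LIKE '%test%'"

# '%%' becomes a literal '%', so nothing consumes the supplied argument.
doubled = "SELECT id FROM bugscache WHERE summary LIKE '%%test%%'"

print(try_format(single, args))
print(try_format(doubled, args))
```

The first case fails because `%t` is not a valid conversion; the second fails because the doubled percent signs are literals, leaving the argument unused.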
Comment on attachment 8514326 [details] [review]
Final PR

So this works now :-)

Few notes:
1) This switches from full-text search to LIKE for the WHERE clause, so we get exact-substring behaviour, even for search terms that were truncated (this still isn't the exact same means of searching as happens with TBPL's use of Bugzilla quicksearch, but it's more predictable, so I think we should just use it).
2) When using LIKE, we need to escape the search term, otherwise it won't be treated literally. I couldn't find any better way of doing this than escaping it manually.
3) Adds tests for the failing case in comment 0 and for the escaping of the search term. I think we'll need to overhaul the way we do these tests soon, but I'm punting that to another bug, since this issue is causing the sheriffs day-to-day pain.
Attachment #8514326 - Attachment description: WIP → Final PR
Attachment #8514326 - Flags: feedback?(mdoglio) → review?(mdoglio)
Attachment #8514326 - Flags: review?(mdoglio) → review+
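A minimal sketch of the escaping idea from note 2, combined with passing the whole "%term%" pattern as a bound parameter. Assumptions: "=" is used here as the custom escape symbol purely for illustration, the `bugs` table and the helper names are hypothetical, and sqlite3 (with "?" placeholders) stands in for the real MySQL backend so the example is self-contained.

```python
import sqlite3

ESCAPE = "="  # hypothetical escape symbol; must match the ESCAPE clause below


def escape_like(term, escape=ESCAPE):
    """Escape LIKE wildcards so the term is matched literally."""
    # The escape symbol itself must be doubled first, before it is
    # introduced in front of '%' and '_'.
    return (
        term.replace(escape, escape * 2)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def search(conn, term):
    """Return summaries containing term as a literal substring."""
    pattern = "%" + escape_like(term) + "%"
    rows = conn.execute(
        "SELECT summary FROM bugs WHERE summary LIKE ? ESCAPE '='",
        (pattern,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bugs (summary TEXT)")
conn.executemany(
    "INSERT INTO bugs VALUES (?)",
    [
        ("Intermittent test_foo.py | 100% wrong",),
        ("Intermittent testXfoo.py | 1000 wrong",),
    ],
)

print(search(conn, "test_foo"))
print(search(conn, "100%"))
```

Without the escaping, the underscore in "test_foo" would act as a single-character wildcard and also match "testXfoo"; with it, only the literal match is returned.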
Commits pushed to master at https://github.com/mozilla/treeherder-service

https://github.com/mozilla/treeherder-service/commit/65f8a66dbd9b61633360285392d5d014fb5421f0
Bug 1091643 - Switch bugscache queries from full-text search to LIKE

Since full-text searches miss cases where the search term was truncated,
but is still a substring of the bug summary.

When using LIKE, any underscores or percent signs will be treated as
wildcards, so we need to escape them, as well as the escape symbol
itself, so the search term is handled literally. A custom escape
symbol is used since the default of backslash leads to less readable
code, given it has to be double escaped.

https://github.com/mozilla/treeherder-service/commit/b2ffa4d61fbd161a6d264c99658e31fc531a6a84
Bug 1091643 - Add test for truncated test name bug searches

https://github.com/mozilla/treeherder-service/commit/93e0c98999f4fbe3e0baecf68fe626f49ddae771
Bug 1091643 - Add tests for SQL LIKE escaping of search_term

Test that we are treating the search term literally in the LIKE
statement, and so have correctly escaped any underscores, percent signs
or escape symbols.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Duplicate of this bug: 1076963