Open Bug 157952 Opened 22 years ago Updated 11 years ago

searching on "words" in comments takes so long it times out

Categories

(Bugzilla :: Query/Bug List, defect, P2)

2.15
defect

Tracking

()

People

(Reporter: myk, Unassigned)

References

Details

Searching for words in comments takes so long it times out.  Compare the
following two queries, the first for two words and the second for two
substrings.  The first query, for words, times out (takes longer than 180
seconds to run on b.m.o), while the second query returns results.  Both queries
should return results.

http://bugzilla.mozilla.org/buglist.cgi?email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailassigned_to2=1&emailreporter2=1&emailqa_contact2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&short_desc=&short_desc_type=substring&long_desc=my.yahoo+printing&long_desc_type=allwords&bug_file_loc=&bug_file_loc_type=substring&status_whiteboard=&status_whiteboard_type=substring&keywords
http://bugzilla.mozilla.org/buglist.cgi?email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailassigned_to2=1&emailreporter2=1&emailqa_contact2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&short_desc=&short_desc_type=substring&long_desc=my.yahoo+printing&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=substring&status_whiteboard=&status_whiteboard_type=substring&ke

The SQL generated by the first query looks like this:

(lower(longdescs_.thetext) regexp '(^|[^a-z0-9])my\\.yahoo($|[^a-z0-9])' AND
lower(longdescs_.thetext) regexp '(^|[^a-z0-9])printing($|[^a-z0-9])')

Isn't there any way to optimize this besides implementing full-text searches
(the direction of bug 145588)?  This bug may also be related to bug 137011.
Nope. We need full text indexing for this. postgres can do it....

However, note that the size of such an index is likely to be very very large (at
one table entry per word in the db), esp if you allow substrings to be searched.
*** Bug 170832 has been marked as a duplicate of this bug. ***
has anyone noticed that in both bug 170832 and this bug, the character
period '.' is incorrectly escaped as '\\.' instead of '\.' in its
regexp equivelent?

Unfortunately, I cannot test if this is what causing the problem due
to another bugzilla bug. I tried "a comment match the regexp all\.js"
(equivelent to "a comment matches all the words of 'all.js'") and
find that 'all\.js' is again escaped to 'all\\.js'; needless to say
I got a software error.
*** Bug 170832 has been marked as a duplicate of this bug. ***
*** Bug 181115 has been marked as a duplicate of this bug. ***
Assignee: endico → nobody
This is still true on b.m.o today, 2.19.1, even though we have fulltext indexes.
At the least, I've been waiting about 60 or 70 seconds now for the query to
return, and it's not done yet. :-(

I'll look into it, at some point.
Assignee: nobody → mkanat
Summary: searching for words in comments takes so long it times out → searching on "words" in comments takes so long it times out
Are you using the fulltext index?  What's the query (add &debug=1 to the URL to
display the query itself)?
The query is the first link in comment 0.
The search uses a regexp, which is very nonoptimal, instead of the fulltext
index.  This can be fixed very easily.
Priority: -- → P2
Target Milestone: --- → Bugzilla 2.20
Target Milestone: Bugzilla 2.20 → Bugzilla 2.22
QA Contact: mattyt-bugzilla → default-qa
Target Milestone: Bugzilla 2.22 → Bugzilla 3.0
The Bugzilla 3.0 branch is now locked to security bugs and dataloss fixes only. This bug doesn't fit into one of these two categories and is retargetted to 3.2 as part of a mass-change. To catch bugmails related to this mass-change, use lts081207 in your email client filter.
Target Milestone: Bugzilla 3.0 → Bugzilla 3.2
Bugzilla 3.2 is restricted to security bugs only. Mass-retargetting to 3.6.
Target Milestone: Bugzilla 3.2 → Bugzilla 3.6
Assignee: mkanat → query-and-buglist
Depends on: 579568
Bugzilla 3.6 is now restricted to security fixes only, and this bug got no traction for several months. We will retarget this bug once it has a patch ready for checkin.
Target Milestone: Bugzilla 3.6 → ---
Assignee: query-and-buglist → mkanat
Target Milestone: --- → Bugzilla 4.2
Does this bug block 4.2?
Yeah, I think I can fix this by having "words" use the FT engine, for 4.2.
Flags: blocking4.2+
Is this something that could be fixed in a short period time once work was started or will this take some major changes? If it is just slower but working for the time being, I vote we move this to 5.0 and not be a blocker for now.

dkl
Not a regression, and 6 months without any activity, so not a blocker.
Flags: blocking4.2+ → blocking4.2-
Assignee: mkanat → query-and-buglist
Bugzilla 4.2 is now restricted to security fixes only.
Target Milestone: Bugzilla 4.2 → ---
You need to log in before you can comment on or make changes to this bug.