Closed Bug 65190 Opened 24 years ago Closed 23 years ago

query form: add comparison type "all words as substrings"

Categories

(Bugzilla :: Bugzilla-General, enhancement)

enhancement
Not set
normal

Tracking

()

VERIFIED FIXED
Bugzilla 2.14

People

(Reporter: afranke, Assigned: Chris.Yeh)

References

()

Details

Attachments

(5 files)

Both "(case-insensitive) substring" and "all words" comparison types are not optimal solutions if you want to query for several words: >> The problem with the Simple one ["(case-insensitive) substring"] >> is that, if people put in a few words relevant to their bug, it's >> reasonably unlikely that those words will appear in that exact order >> in the summary. Could we do "all words" instead? > >i've never found "all words" to be useful, because it won't let you type >"bookmarklet" and get hits for both "bookmarklet" and "bookmarklets". C.Begle's SimpleSearch <http://www.mozilla.org/quality/help/simplesearch.html> solves this problem by generating boolean charts on the fly. The javascript for this page is located at <http://www.mozilla.org/quality/help/bugreport.js>. (A more structured and further developed version can be found in bug 61561.) It would greatly simplify things if you had an "all words as substrings" option in all comparison type dropdowns on the default query page, too. Probably it should even be the default. Similarly, there could be an "any words as substrings" option, too.
Good idea ... I would like that feature, too.
I don't know what I'm doing, but it seems to work... :)
Whiteboard: 2.12
Gerv, if you like, you can clean up that patch...
Keywords: patch, review
I don't think we should have "any words as substrings" It has the potential to produce loads and loads of wrong results, if people use short words which happen to appear lots in longer words. I can't see much use for "none of the words as (case-insens.) substring" either - I think we are in danger of bloating that drop-down even more. "All words as substrings" is a fine idea, though. Re: the patches: GetByWordList1 is a bad name for a function. What makes it different from GetByWordList? Explain that a bit in the name. Why not "foreach $word ..." rather than foreach $w... $word = $w? You need to remove the old code you left in comments. Have you tested this patch yourself? Gerv
Re: "any", "no": I can drop them, although I think they can be useful: - "any" can be useful if combined with other restrictions if you have alternative names for one thing (like thumb/slider or address/location/url) - "no" can be useful if you have many things you want to exclude (don't have an example at hand) I added them just to have parity with "all words", "any words" and "no words". I think the functionality for "any" and "no" could stay in the backend (buglist.cgi) and just be removed from the frontend (query.cgi). And that's only one line per option you want to get rid off. Not sure if that needs a new patch for query.cgi, but I can make one if you tell me which ones to drop exactly. The first part of the changes in query.cgi is for the Summary/StatusWhiteboard/URL/Description dropdowns, the second part for the boolean charts feature. > - I think we are in danger of bloating that drop-down even more. That's true, but then let's ask if we need all the other existing options, too. > GetByWordList1 is a bad name for a function. What makes it different from > GetByWordList? Explain that a bit in the name. Well, I just wanted to make it work. So I copied existing code from GetByWordList (that is used for the ordinary "all words" case) and tried to make it produce the same output the ordinary "substring" would do (for every word). I'll post a new patch shortly. The main question I have is: Are the names for the options ok? "allsubstr" etc.? Or should it be "allwordssubstr" or "allwordsassubstrings" or something else? This probably can't be changed easily in the future... > Have you tested this patch yourself? Yes, sort of. I tried 5 or so different combinations, and they seemed to work. They included different options (all/any), the ordinary "Summary" field vs. boolean charts, one vs. multiple words for the value, and plain [a-zA-z] words vs. a sequence of some special chars. The problem is that I don't know enough perl to understand what I'm doing. Yesterday I didn't even know that @ was a naming convention for lists ...
OK, let's compromise. Remove "no words" from the front-end, in all instances :-) Do you think we need all the existing options? allwordssubstr looks good. Try and be as consistent as possible with the current names. And GetByWordListSubstr(). Don't worry about not knowing Perl - that's exactly the way I started hacking on Bugzilla ;-) Gerv
Ok, I have attached version 2 for both diffs. Assuming the buglist.cgi script was correct before the patch, it should be provably correct with the latest patch applied. buglist.cgi: 1. Is it true that the following lines are equivalent? (word.toLowerCase) $word =~ tr/A-Z/a-z/; $word = lc($word); 2. Is lc(SqlQuote($v)) is the same as SqlQuote(lc($v))? (commutativity) 3. What does metaquote mean in SqlQuote(metaquote($v)) ? BTW, I learned that changing "allsubstr" to "allwordssubstr" can have unexpected consequences: suddenly "allwords" matched instead of "allwordssubstr", so I had to place the new code _before_ the old one. query.cgi: I have removed the "no words as substrings" option. 1. What about the wording of the options? I have left out "case-insensitive" on the hardcoded Summary/Description/URL/StatusWhiteboard dropdown because it would increase the width of the selectbox a lot, however I have included it in the boolean charts option descriptions. 1a. What about grammar & spelling? Is the "s" at the end correct? 2. What about the order of the options in the selectbox? Currently the order of "case-sensitive substring" and "case-insensitive substring" in the main part of the query form is different from the oder in the boolean charts dropdown, probably because case-insensitive substring was the default. Should this be changed? Should the "wordssubstr" options occur before the plain "words" options or after them, or should they even be at the top? Please review the latest patches, I will do some testing now and report back.
Testing looks fine. If you want to check it in as-is, you have my blessing :)
search for subject:plugin and description that "doesn't have any of these words"|"contains none of the following" : real, flash, shockwave search for description that "contains one of the following" : crash, hang, slows down, freezes notice that i use , to express string separation, i'm not sure it's possible to ask buglist to excluse the substring 'slows down' and exclude the substring 'hang', but that's how i might use it.
This is potentially interesting, but is not a 2.12 candidate.
Whiteboard: 2.12
timeless: I'm not sure if I'm understanding what you're saying here: > search for subject:plugin and description that > "doesn't have any of these words"|"contains none of the following" > : real, flash, shockwave Is this a request to put the "no words as substrings" option back in (that Gerv wanted to be taken out)? If so, where do you want it: in both the main part of the query form and the boolean charts, or in the boolean charts only? > search for description that "contains one of the following" > : crash, hang, slows down, freezes > notice that i use , to express string separation, i'm not sure it's possible > to ask buglist to excluse the substring 'slows down' and exclude the substring > 'hang', but that's how i might use it. Currently the input is split into words at non-empty sequences of (space or comma): /[\s,]+/ . This is the same splitting mechanism used for "all words", "any words" etc. If you want that to be changed, please submit a separate RFE. [ Note that the QuickSearch tool described in bug 61561 supports quoting. There it's possible to type 'desc:crash,desc:hang,desc:"slows down",desc:freezes' to express your query. ]
> Is this a request to put the "no words as substrings" option back in > (that Gerv wanted to be taken out)? yes. but i'm not familiar w/ how this query system works so i'm not going to press any of my comments.
Tara: Why is this not a candidate? According to Dave, this is the "better fix" for bug 66184 which was nominated for 2.12 by Gerv. Nominating for 222.12 in the hope that is evaluated before the end of this century... ;)
Whiteboard: 222.12
If you are concerned that people may have different opinions about the UI part (query.cgi), then what about just checking in the backend part (buglist.cgi)? - The backend patch doesn't touch existing functionality, it just adds some. - I think I can prove that the added functionality is correct: For treating of the input value as multiple words, it uses exactly the same mechanisms that are used for the "allwords" case. And building the SQL fragment for each word uses exactly the same mechanism that is used for the "substr" case. The frontend part of this patch is trivial and could be release noted.
If you look at the timestamps, you'll notice Tara removed the 2.12 AFTER bug 66184 was closed. She probably would have removed that one, too, if it had still been open. Not to say it's a bad patch (looks like a nice one to me) just we're attempting to get 2.12 out the door ReallySoonNow and trying to avoid the attack of creeping featuritis which keeps sneaking up on us :)
Whiteboard: 222.12 → 2.14
So the problem is that this fix is a feature whereas bug 66184 is not? Then let's reopen bug 66184 and fix that one ;)
dunno... Tara's calling the shots here... Tara?
I'm not sure if Tara is looking at any bugs without 2.12 in the status whiteboard... Well, the thing is this: if you want to release today, then I can accept that it's too late. Or if you release a 2.14 one week after 2.12 with all the patches included that didn't make it into 2.12, that's fine, too. But looking at what happened in the past there can be more than half a year between two releases of Bugzilla, there can be really long times when the installation at mozilla.org is not upgraded. Now I do care less about all the other bugzilla installations out there that could use this feature (maybe except of my own one, but that's already patched). So just tell me (assuming the patch doesn't have any problems) that mozilla.org's bugzilla will have this before the end of February, and I will happily shut up :)
I think the lack of responses has proven my theory that bugs without 2.12 in the status whiteboard are off the radar. So I'm putting 2.12 back in order to get a response from Tara. Here are the reasons for accepting at least the backend support (buglist.cgi): - The "all words as substrings" is more useful than both "all words" and "(case-insensitive) substring" in most cases. - Currently it can only be realized with JS. The change to buglist.cgi makes it possible to build non-JS based query forms with this feature. - I believe the patch is provably correct. - The backend patch is the major change here. The UI change (query.cgi) is trivial and can be release noted if controversial (e.g. with a link to the patch attached here: http://bugzilla.mozilla.org/showattachment.cgi?attach_id=23354 ). Every bugzilla maintainer should be able to apply it by hand.
Whiteboard: 2.14 → 2.12 (was: 2.14)
This is my whinemail for the week Feb. 12th - Feb. 18th. This bug has a patch that is waiting for review and checkin.
Okay, Andreas has convinced me. Accepting it back on 2.12 list.
Whiteboard: 2.12 (was: 2.14) → 2.12
moving to real milestones...
Whiteboard: 2.12
Target Milestone: --- → Bugzilla 2.12
reassigning...
Assignee: tara → cyeh
moving to 2.14 after discussion with Andreas.
Target Milestone: Bugzilla 2.12 → Bugzilla 2.14
query.cgi patch v2 r= justdave buglist.cgi patch v1 w/spec chars and patch v2 both have my r=, but I need to know which one to check in. Can we get some people to make some more comments and settle this argument? :-) Do we or don't we want "none of these words as substrings" on the string match type popup?
For buglist.cgi, it's definitely version 2 (attachment 23353 [details] [diff] [review]) that should go in (I think it's even provably correct, under the assumption that "allwords" and "substr" are correct). For query.cgi, my suggestion would be to check in version 2 (attachment 23354 [details] [diff] [review]) as-is, update b.m.o to the tip, and see if anyone complains. Then query.cgi can easily be adapted according to what the majority wants. I can't see any other way to get the broad usability testing that is needed to optimize / fine-tune the order of options in the menus. People need to be able to try them out first. You can try out these patches on http://bugzilla.mathweb.org/query.cgi , but of course that doesn't have the huge mass of bugs that is available on mozilla.org.
haven't seen any other comment, v2 it is for both. checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Moving to Bugzilla product
Component: Bugzilla → Bugzilla-General
Product: Webtools → Bugzilla
Version: other → unspecified
Verified fixed. Tested on b.m.o.
Status: RESOLVED → VERIFIED
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: