Closed Bug 422026 Opened 17 years ago Closed 8 years ago

bookmark keywords with spaces don't work (trigger search behavior rather than the keyword)

Categories

(Firefox :: Address Bar, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox48 --- affected

People

(Reporter: bodyvisual, Assigned: cbartley)

References

Details

(Keywords: helpwanted, regression)

Attachments

(1 file, 6 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b4) Gecko/2008030714 Firefox/3.0b4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b4) Gecko/2008030714 Firefox/3.0b4

When typing in the keyword for a bookmark that contains a space, a search for the term is executed. Expected result: browser navigates to URL of bookmarked keyword.

Reproducible: Always

Steps to Reproduce:
1. Bookmark a page (i.e. google.com)
2. Edit book mark properties, assign keyword 'teh googz'
3. In location bar, enter 'teh googz' and hit enter.
Actual Results:  
Taken to Google search page with suggestion for 'Did you mean: the good'

Expected Results:  
Taken to google.com

In Firefox 2, a keyword with a space works fine. Possible fix: include keywords in the list of autocompletions.

It's worth mentioning that keyworded quick searches (i.e. keyword: yt, url: youtube.com/results?search_query=%s) still work perfectly.
Note: the same behavior is experienced on OS X 10.4.
Confirmed on latest trunk with Windows XP. Regression range is 
http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&date=explicit&mindate=2007-06-26+20%3A00&maxdate=2007-06-27+20%3A00 
Bug 378553 ?
Blocks: 378553
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regression
Version: unspecified → Trunk
Yeah, I noticed this as another regression caused by bug 378553, but I wasn't sure it was worth fixing.
Summary: Keywords with spaces default to search → bookmark keywords with spaces don't work (trigger search behavior rather than the keyword)
OS: Windows XP → All
Hardware: PC → All
If it's a straightforward fix then I suggest this *is* worth fixing, particularly because it's a regression over Firefox 2 (and this functionality has existed in Firefox since time immemorial).

It's not a major feature, obviously, and I suspect that people who used this in Firefox <=2 will just quietly work around it. But it'd be nice if they didn't have to.

Further to the end of comment 0, smart keywords that themselves contain spaces *don't* work: my life's ambition, to use the smart keyword "wtf is" for Wikipedia (so that entering "wtf is lolcats" brings up the article on lolcats), is as yet unfulfilled.

And that makes the cute kittens cry.
Flags: blocking-firefox3?
Greg K Nicholson's use case *is* compelling. Gavin, you mentioned that you spotted the regression; is the fix complex?

I don't think this is a blocker (right again, Greg!) but we should probably clean it up.
Flags: wanted-firefox3+
Flags: blocking-firefox3?
Flags: blocking-firefox3-
Any chance to get this in for Firefox 3.1?
(In reply to comment #8)
> Any chance to get this in for Firefox 3.1?
> 

setting the wanted ? flag so that your question can get answered.
Flags: wanted-firefox3.1?
This problem does have a big impact on some of us--I'm considering going back to Firefox 2 because of it. I have many keywords with internal spaces in them, and am very frustrated that all no longer work. I've started deleting internal spaces in keywords, but then end up with duplicates: "w gh" and "wg h" both become "wgh" (and Firefox does not alert to duplicate keywords).
Firefox 2 has it. Its a very good feature. i mean the human readable keywords feature. i have "abc timesheet", "xyz timesheet", "xyz".. else i would have to live with xyz, xyztimesheet.

Please fix this.
I have been (and still am) looking at this bug and seeing if I can fix it, and while testing it I noticed (using 3.0.1) that it doesn't work even if the bookmark has no spaces in it - that is, if you bookmark google.com calling it "BookmarkTest", and then type "BookmarkTest" in the address bar and press enter, it searches for "BookmarkTest" (instead of bringing you to google.com).

I'd like to try to fix it, and I'm working on it now - but I don't consider myself reliable by any means, so don't count on me.
This bug was caused by the code splitting the keyword at the first ' ' before checking whether that keyword was a bookmark's keyword.  My patch reverses that, thus fixing the bug (unless I messed something up).

P. S. My earlier comment about the bookmarks not working even when there was no space was my mistake.
Attachment #336220 - Flags: review?
Attachment #336220 - Flags: review? → review?(gavin.sharp)
Attachment #336220 - Attachment description: This patch, as far as I know, fixes the bug described. → This patch, as far as I know, fixes the bug described. (Update: It breaks something else. Don't use it.)
Attachment #336220 - Attachment is obsolete: true
Attachment #336220 - Flags: review?(gavin.sharp)
Comment on attachment 336220 [details] [diff] [review]
This patch, as far as I know, fixes the bug described.  (Update: It breaks something else.  Don't use it.)

I just realized, my earlier patch breaks something else.  Don't use it.
Attached patch This patch should work. (obsolete) — Splinter Review
OK - I have another patch, I'm pretty sure this one will work.  (I'm only pretty sure because I can never be completely sure about anything.)

I should probably mention that this will work for bookmarks and bookmarked searches no matter how many spaces are in the keyword and parameter, but the part that comes after it in the code (var searchservice...) still assumes no spaces in keyword.  I don't know enough about what that code does.

I'll also note that it will now do the following: If you have a search bookmark called, say, "Go To Target's Site" for Google I'm feeling Lucky Search, and a regular bookmark called "Go To Target's Site and shop" (for, say, the site of the store Target), and you type "Go To Target's Site and shop for clothes", it will take you to the bookmark "Go To Target's Site and shop" and as far as I know ignore the "for clothes".
Attachment #336271 - Flags: review?(gavin.sharp)
Flags: blocking-firefox3.1?
Would like to see this fixed for 3.1, but not holding the release for it, blocking-.
Flags: wanted-firefox3.1?
Flags: wanted-firefox3.1+
Flags: blocking-firefox3.1?
Flags: blocking-firefox3.1-
Target Milestone: --- → Firefox 3.1
Target Milestone: Firefox 3.1 → Firefox 3.5
Target Milestone: Firefox 3.5 → ---
Flags: wanted-firefox3.5+ → wanted-firefox3.6+
Taking this bug.
Assignee: nobody → cbartley
Attached patch support keywords with spaces, v1 (obsolete) — Splinter Review
This patch supports keywords with spaces.  Note that it will faithfully retain spaces in both the keyword and parameter portions of the string, so for example, the keyword "wtf is" is distinct from the keyword "wtf  is" (two spaces).  I'm not sure if this is right, nor am I sure what the behavior was back when this worked before.  Preserving spaces in the parameter portion might make more sense, but even there it seems like we might want to remove leading and trailing spaces.
Attachment #336271 - Attachment is obsolete: true
Attachment #400897 - Flags: review?
Attachment #336271 - Flags: review?(gavin.sharp)
Attachment #400897 - Flags: review? → review?(gavin.sharp)
Attachment #400897 - Flags: review?(gavin.sharp) → review?(mak77)
Comment on attachment 400897 [details] [diff] [review]
support keywords with spaces, v1

>diff -r 4313fb6cb7a3 browser/base/content/browser.js

I'm not a browser peer, so you'll need a further approval from rflint, dao or gavin, but i can give a first review

>+function parseURLForKeyword(aURL) {

this should probably be parseStringForKeywordData(aString)
we are not parsing an url string, nor a nsIURI, and we are getting more than a keyword

>+  // Search the URL backwards for runs of one or more spaces.  The entire
>+  // substring before the first space of the run is tested to see if it is a
>+  // keyword.  If it is, then the parameter is the entire substring after the
>+  // keyword, including the spaces.  If the leading substring is not a keyword,
>+  // then we continue searching backwards until we find a keyword, or we hit
>+  // the beginning of the string.  Note that we can terminate the search at
>+  // i == 1, since we can't have an empty keyword.

I had some thought about RTL, but looks like both keywords and the locationbar are ltr. I tried the patch setting intl.uidirection.en = rtl and it was working, but i'm unsure if my ltr-mind is hiddin me something. Anyway, asking Ehsan or someone that usually uses an RTL browser could be a wise thing.

>+  for (var i = aURL.length - 1; i >= 1; i--) {
>+    var chPrev = aURL.charAt(i - 1);
>+    var ch = aURL.charAt(i);
>+    if (chPrev != ' ' && ch == ' ') {
>+      var keyword = aURL.substring(0, i);


I'm not sure why you walk this char by char, javascript has awesome string methods
i think you should be able to use something like

let token = aString;
while((let index = token.lastIndexOf(" ")) != -1) {
  token = token.substr(0, index);
  if (token is a valid keyword and we are nice guys)
    return [token, param, aString.slice(index + 1)];
}
return [null, null, null];
Attachment #400897 - Flags: review?(mak77) → review-
(In reply to comment #20)
> (From update of attachment 400897 [details] [diff] [review])
> >diff -r 4313fb6cb7a3 browser/base/content/browser.js
> 
> I'm not a browser peer, so you'll need a further approval from rflint, dao or
> gavin, but i can give a first review
> 
> >+function parseURLForKeyword(aURL) {
> 
> this should probably be parseStringForKeywordData(aString)
> we are not parsing an url string, nor a nsIURI, and we are getting more than a
> keyword


Do we have terminology for strings typed into the location bar, which might be URLs or might be something else that we can turn into a URL?  The existing getShortcutOrURI() function just calls it a URL even before it's checked to see if it contains a keyword, which is why I did the same thing here.

I think "parseLocationForKeywordData" or something like that might be even better than "parseStringForKeywordData".


> >+  // Search the URL backwards for runs of one or more spaces.  The entire
> >+  // substring before the first space of the run is tested to see if it is a
> >+  // keyword.  If it is, then the parameter is the entire substring after the
> >+  // keyword, including the spaces.  If the leading substring is not a keyword,
> >+  // then we continue searching backwards until we find a keyword, or we hit
> >+  // the beginning of the string.  Note that we can terminate the search at
> >+  // i == 1, since we can't have an empty keyword.
> 
> I had some thought about RTL, but looks like both keywords and the locationbar
> are ltr. I tried the patch setting intl.uidirection.en = rtl and it was
> working, but i'm unsure if my ltr-mind is hiddin me something. Anyway, asking
> Ehsan or someone that usually uses an RTL browser could be a wise thing.


Good question.  I know zippo about RTL...


> >+  for (var i = aURL.length - 1; i >= 1; i--) {
> >+    var chPrev = aURL.charAt(i - 1);
> >+    var ch = aURL.charAt(i);
> >+    if (chPrev != ' ' && ch == ' ') {
> >+      var keyword = aURL.substring(0, i);
> 
> 
> I'm not sure why you walk this char by char, javascript has awesome string
> methods
> i think you should be able to use something like
> 
> let token = aString;
> while((let index = token.lastIndexOf(" ")) != -1) {
>   token = token.substr(0, index);
>   if (token is a valid keyword and we are nice guys)
>     return [token, param, aString.slice(index + 1)];
> }
> return [null, null, null];


I've written an awful lot of parsing code in my time, and I've come to the conclusion that walking character-by-character is sometimes simpler and easier to understand.  In this case I was particularly concerned about how spaces were handled, and I thought coding it this way would make space-handling about as clear as it could be.

That said, I don't think I've made the right call on space handling.  In particular, I'm thinking I ought to trim the keyword phrase and then reduce internal runs of whitespace each to a single space, so

  "wtf is"
  "wtf  is"
  " wtf is"
  "wtf is "
  "  wtf    is  "

would all resolve to "wtf is" (one internal space).

On the other hand the bookmark system is not canonicalizing keywords like this, so if we were to do that, we should probably do it in both places.

Then there's the question of what to do with spaces in the parameter.  I think it makes sense to preserve spaces there and let the target webapp handle them however it pleases.  So if the user enters

  "wtf.is..firefox" (imagine "."s are spaces)

we'd extract ".firefox" as the parameter (skipping the first space after the end of the keyword phrase).  The theory being that most services won't care about the space one way or other, but you can imagine some where embedded spaces are significant and meaningful, for example a service that expects a regular expression rather than a traditional list of words style query.
(In reply to comment #21)
> (In reply to comment #20)
> Do we have terminology for strings typed into the location bar, which might be
> URLs or might be something else that we can turn into a URL?  The existing
> getShortcutOrURI() function just calls it a URL even before it's checked to see
> if it contains a keyword, which is why I did the same thing here.

well the method says getShortcutOrURI because it's trying to distinguish between a valid shortcut (a search, a keyword) or a real address, and directly receives the data from the locationbar, while your method can be used to extract keyword data from a generic string, regardless the origin.
I don't think there is some common term for the data in the location bar. Reading parseLocationFor, i'd expect to not have to pass anything to the method, it should just get the location value by itself.

> I've written an awful lot of parsing code in my time, and I've come to the
> conclusion that walking character-by-character is sometimes simpler and easier
> to understand.  In this case I was particularly concerned about how spaces were
> handled, and I thought coding it this way would make space-handling about as
> clear as it could be.

I don't see lot of difference in code clarity, but using js string methods could be faster, can't place a bet on that though.

> That said, I don't think I've made the right call on space handling.  In
> particular, I'm thinking I ought to trim the keyword phrase and then reduce
> internal runs of whitespace each to a single space, so
> 
>   "wtf is"
>   "wtf  is"
>   " wtf is"
>   "wtf is "
>   "  wtf    is  "
> 
> would all resolve to "wtf is" (one internal space).
> 
> On the other hand the bookmark system is not canonicalizing keywords like this,
> so if we were to do that, we should probably do it in both places.

if you search for "keyword space" in bugzilla you can find different bugs about it. I agree that if we strip spaces here, we should do the same when saving the keyword in bookmarks panels. otherwise a user could add a "foo " keyword and never be able to use it.

> Then there's the question of what to do with spaces in the parameter.  I think
> it makes sense to preserve spaces

i agree
Attached patch support keywords with spaces, v2 (obsolete) — Splinter Review
This patch implements more robust whitespace handling, including keyword canonicalization.  This is spelled out in detail in comments in the code.  Also renamed the parse function to parseStringForKeywordData() per Marco's suggestion.

Major questions:

1. Is the whitespace handling as spelled out "correct"?
2. Does the code actually do what the comments say it does?
3. Given 1 and 2, should we introduce keyword canonicalization in keyword creation code in this patch, or can it wait to be fixed later?  (I'm assuming that if we want to canonicalize keywords in one place, then we want to do it in both.)
Attachment #400897 - Attachment is obsolete: true
Attachment #403576 - Flags: review?
Attachment #403576 - Flags: review? → review?(gavin.sharp)
i think that we want to canonicalize in both points in this patch, should not be hard at all, just have to fix editBookmarkOverlay.js. Mostly for the case above: user entering a keyword with a space (by error) won't ever be able to use it, and maybe won't be clear to him why he can't use it.

btw, sorry, i still find my lastIndexOf tokenize approach more readable than char by char, but i suppose everyone finds his approaches more readable or he would not have written them like that, so... :)
Attached patch support keywords with spaces, v3 (obsolete) — Splinter Review
Changes from the v2 patch:

browser.js
  Removed the keyword canonicalization code since that's now handled fully by
  nsNavBookmarks.cpp.  Also updated the comment block.

nsNavBookmarks.h
nsNavBookmarks.cpp
  Factored keyword canonicalization out into a helper function and added 
  whitespace canonicalization to it -- previously canonicalization was just
  conversion to lowercase.

Keyword canonicalization consists of removing trailing and leading whitespace
  and replacing any internal whitespace runs each with a single ASCII 32 space.

Known Issues

  * No special allowance is made for Unicode whitespace.  I'm not sure what that
    would mean anyway. 

  * If anyone out there is using keywords with leading or trailing spaces, 
    they will stop working with this change.  On the other hand, the most 
    likely existing usage of spaces in keywords is for multi-word keywords, and
    those wouldn't work anyway.

  * We hit the database every time we check for a keyword; that used to mean
    once per location string at most.  Now it could be multiple times as we
    check each possibility in a multi-word phrase.  For example:
    
      google someone set up us the bomb
      
    could involve 5 or 6 hits on the db.

  * The naive way the parse loop works involves a string allocation for each
    character.  Even a single hit to the database should completely swamp any
    overhead from the parse operation.
Attachment #403576 - Attachment is obsolete: true
Attachment #403576 - Flags: review?(gavin.sharp)
Attachment #405569 - Flags: review?(mak77)
Attachment #405569 - Flags: review?(gavin.sharp)
(In reply to comment #25)
>   Removed the keyword canonicalization code since that's now handled fully by
>   nsNavBookmarks.cpp.  Also updated the comment block.

this means what i see in the keyword field in properties dialog is different from what is saved? that looks like a usability hit.
also this looks like putting a frontend decision in backend API, i think usually callers should take care of that. and we are getting rid of user choose casing... what if i want to create a doSearch keyword?
(In reply to comment #26)
> (In reply to comment #25)
> >   Removed the keyword canonicalization code since that's now handled fully by
> >   nsNavBookmarks.cpp.  Also updated the comment block.
> 
> this means what i see in the keyword field in properties dialog is different
> from what is saved? that looks like a usability hit.

If you look at existing keywords, what you see in the keyword field is the same as what's in the database.

If you enter a new keyword, it could get changed when it's saved.  However, this can already happen!  Try entering a keyword with uppercase letters in it, save it, and re-open it.

In the case of the "Add a keyword for this search..." dialog, you can actually see the keyword change just before the dialog disappears if you watch closely.
(In reply to comment #27)
> also this looks like putting a frontend decision in backend API, i think
> usually callers should take care of that. and we are getting rid of user choose
> casing... what if i want to create a doSearch keyword?

This is a legitimate question to ask.  However, let me stress that the lowercasing step was already present in the backend code -- I didn't add this functionality.  I simply added whitespace compression to the canonicalization step that was already being done.

I kind of get the feeling that our keyword support has always been under-specified so it may benefit from some more focused attention.  That's beyond the scope of this bug, however.

It might be worthwhile to open a more general keyword bug and start some discussion about how we'd like it to work.
yeah i'm not saying everything is your fault, i know keywords support sucks. And if you take a look at the db schema you will also see that at schema level it sucks even more.

I suppose it was lowercased to avoid user saving KeYword and being unable to use keYWord. And with spaces we are doing something similar to avoid entering keyword with spaces and being unable to use them.
Ok, i'm fine with it but you should at least annotate that in the idl.
Attachment #405569 - Flags: review?(mak77)
Comment on attachment 405569 [details] [diff] [review]
support keywords with spaces, v3

>diff -r 21ca51ccd2a9 browser/base/content/browser.js
>--- a/browser/base/content/browser.js	Tue Oct 06 11:47:46 2009 -0700
>+++ b/browser/base/content/browser.js	Fri Oct 09 14:03:37 2009 -0700
>@@ -1911,16 +1911,64 @@ function loadURI(uri, referrer, postData
>     if (allowThirdPartyFixup) {
>       flags = nsIWebNavigation.LOAD_FLAGS_ALLOW_THIRD_PARTY_FIXUP;
>     }
>     gBrowser.loadURIWithFlags(uri, flags, referrer, null, postData);
>   } catch (e) {
>   }
> }
> 
>+function parseStringForKeywordData(s) {
>+
>+  // The string |s| may consist of a keyword and a keyword-parameter.  For the
>+  // purposes of this algorithm, we hold:
>+  // * the only whitespace in the string will be spaces
>+  // * the keyword must be at least one (non-space) character long
>+  // * the keyword may actually be a keyword phrase with each word separated
>+  //   by one or more spaces
>+  // * the keyword will never end with trailing whitespace
>+  // * the significance of other whitespace in or before the keyword is left to
>+  //   the bookmark system
>+  // * the parameter may be completely empty
>+  // * the parameter will be separated from the keyword by a single space
>+  // * a keyword will *not* be matched unless there is a trailing space, even if
>+  //   the parameter is completely empty
>+  // * whitespace inside the parameter is completely significant -- the ultimate
>+  //   consumer of the parameter may not care, but we'll pass it along
>+  //   faithfully preserved, just in case
>+  // * in the case where multiple keywords are possible, we want to identify the
>+  //   longest one
>+
>+  // The algorithm walks the string backwards, looking for the first space after
>+  // each word.  At each of these spaces we check to see if the part of the
>+  // string before the space constitutes a valid keyword.  If it does, then the 
>+  // parameter for that string is the part of the string *after* the separator
>+  // space.
>+
>+  // The algorithm is pretty straightforward.  Two things to keep in mind:
>+  // First, there must always be a space after the last word of a keyword, even
>+  // if the parameter itself is the empty string.  Second, since a keyword must
>+  // be at least one (non-space) character long, we have a simpler terminating
>+  // case than we otherwise might.

this is an awesome comment, but looks a bit verbose, and browser.js is a quite big file... i'll let gavin evaluate this.

>diff -r 21ca51ccd2a9 toolkit/components/places/src/nsNavBookmarks.cpp
>--- a/toolkit/components/places/src/nsNavBookmarks.cpp	Tue Oct 06 11:47:46 2009 -0700
>+++ b/toolkit/components/places/src/nsNavBookmarks.cpp	Fri Oct 09 14:03:37 2009 -0700
>@@ -2809,24 +2809,34 @@ nsNavBookmarks::SetItemIndex(PRInt64 aIt
> 
>   ENUMERATE_OBSERVERS(mCanNotify, mCacheObservers, mObservers, nsINavBookmarkObserver,
>                       OnItemMoved(aItemId, parent, oldIndex, parent,
>                                   aNewIndex, itemType))
> 
>   return NS_OK;
> }
> 
>+void nsNavBookmarks::CanonicalizeKeyword(nsString& aKeyword)

please
NS_HIDDEN_(void) // static
nsNavBookmarks::...

>+{
>+  // Strip preceding and trailing whitespace.

whitespaceS maybe? it will strip more then one if they exist

>+  // Convert uppercase characters to lowercase.

this comment is redundant, if you want to explain all that this method is doing you could add a javadoc in the header file.
but ToLowerCase method name is self documenting.

>+  // TODO: Does this need to be generalized for Unicode?

please file a bug and put bug number here... but actually i don't think this issue is blocking, i'm not even sure why a user would want to put a special whitespace as part of the keyword, they are intended for other uses. Btw it should be fixed in compressWhitespace or with a new helper.

>diff -r 21ca51ccd2a9 toolkit/components/places/src/nsNavBookmarks.h
>--- a/toolkit/components/places/src/nsNavBookmarks.h	Tue Oct 06 11:47:46 2009 -0700
>+++ b/toolkit/components/places/src/nsNavBookmarks.h	Fri Oct 09 14:03:37 2009 -0700
>@@ -146,16 +146,18 @@ private:
>    *
>    * @throws If folder does not exist.
>    */
>   nsresult FolderCount(PRInt64 aFolderId, PRInt32 *aFolderCount);
> 
>   nsresult GetFolderType(PRInt64 aFolder, nsACString &aType);
> 
>   nsresult GetLastChildId(PRInt64 aFolder, PRInt64* aItemId);
>+
>+  static void CanonicalizeKeyword(nsString& aKeyword);

static NS_HIDDEN_(void) please
Attached patch support keywords with spaces, v4 (obsolete) — Splinter Review
(In reply to comment #31)
> (From update of attachment 405569 [details] [diff] [review])
> >diff -r 21ca51ccd2a9 browser/base/content/browser.js
> >--- a/browser/base/content/browser.js	Tue Oct 06 11:47:46 2009 -0700
> >+++ b/browser/base/content/browser.js	Fri Oct 09 14:03:37 2009 -0700
> >@@ -1911,16 +1911,64 @@ function loadURI(uri, referrer, postData
> >     if (allowThirdPartyFixup) {
> >       flags = nsIWebNavigation.LOAD_FLAGS_ALLOW_THIRD_PARTY_FIXUP;
> >     }
> >     gBrowser.loadURIWithFlags(uri, flags, referrer, null, postData);
> >   } catch (e) {
> >   }
> > }
> > 
> >+function parseStringForKeywordData(s) {
> >+
> >+  // The string |s| may consist of a keyword and a keyword-parameter.  For the
> >+  // purposes of this algorithm, we hold:
> >+  // * the only whitespace in the string will be spaces
> >+  // * the keyword must be at least one (non-space) character long
> >+  // * the keyword may actually be a keyword phrase with each word separated
> >+  //   by one or more spaces
> >+  // * the keyword will never end with trailing whitespace
> >+  // * the significance of other whitespace in or before the keyword is left to
> >+  //   the bookmark system
> >+  // * the parameter may be completely empty
> >+  // * the parameter will be separated from the keyword by a single space
> >+  // * a keyword will *not* be matched unless there is a trailing space, even if
> >+  //   the parameter is completely empty
> >+  // * whitespace inside the parameter is completely significant -- the ultimate
> >+  //   consumer of the parameter may not care, but we'll pass it along
> >+  //   faithfully preserved, just in case
> >+  // * in the case where multiple keywords are possible, we want to identify the
> >+  //   longest one
> >+
> >+  // The algorithm walks the string backwards, looking for the first space after
> >+  // each word.  At each of these spaces we check to see if the part of the
> >+  // string before the space constitutes a valid keyword.  If it does, then the 
> >+  // parameter for that string is the part of the string *after* the separator
> >+  // space.
> >+
> >+  // The algorithm is pretty straightforward.  Two things to keep in mind:
> >+  // First, there must always be a space after the last word of a keyword, even
> >+  // if the parameter itself is the empty string.  Second, since a keyword must
> >+  // be at least one (non-space) character long, we have a simpler terminating
> >+  // case than we otherwise might.
> 
> this is an awesome comment, but looks a bit verbose, and browser.js is a quite
> big file... i'll let gavin evaluate this.


Noted.


> >diff -r 21ca51ccd2a9 toolkit/components/places/src/nsNavBookmarks.cpp
> >--- a/toolkit/components/places/src/nsNavBookmarks.cpp	Tue Oct 06 11:47:46 2009 -0700
> >+++ b/toolkit/components/places/src/nsNavBookmarks.cpp	Fri Oct 09 14:03:37 2009 -0700
> >@@ -2809,24 +2809,34 @@ nsNavBookmarks::SetItemIndex(PRInt64 aIt
> > 
> >   ENUMERATE_OBSERVERS(mCanNotify, mCacheObservers, mObservers, nsINavBookmarkObserver,
> >                       OnItemMoved(aItemId, parent, oldIndex, parent,
> >                                   aNewIndex, itemType))
> > 
> >   return NS_OK;
> > }
> > 
> >+void nsNavBookmarks::CanonicalizeKeyword(nsString& aKeyword)
> 
> please
> NS_HIDDEN_(void) // static
> nsNavBookmarks::...


Done.


> >+{
> >+  // Strip preceding and trailing whitespace.
> 
> whitespaceS maybe? it will strip more then one if they exist


"Whitespace" generally means "a run of one or more whitespace characters".


> >+  // Convert uppercase characters to lowercase.
> 
> this comment is redundant, if you want to explain all that this method is doing
> you could add a javadoc in the header file.
> but ToLowerCase method name is self documenting.


Moved comments to JavaDoc-style comment in nsNavBookmarks.h.


> >+  // TODO: Does this need to be generalized for Unicode?
> 
> please file a bug and put bug number here... but actually i don't think this
> issue is blocking, i'm not even sure why a user would want to put a special
> whitespace as part of the keyword, they are intended for other uses. Btw it
> should be fixed in compressWhitespace or with a new helper.


Just removed the comment.


> 
> >diff -r 21ca51ccd2a9 toolkit/components/places/src/nsNavBookmarks.h
> >--- a/toolkit/components/places/src/nsNavBookmarks.h	Tue Oct 06 11:47:46 2009 -0700
> >+++ b/toolkit/components/places/src/nsNavBookmarks.h	Fri Oct 09 14:03:37 2009 -0700
> >@@ -146,16 +146,18 @@ private:
> >    *
> >    * @throws If folder does not exist.
> >    */
> >   nsresult FolderCount(PRInt64 aFolderId, PRInt32 *aFolderCount);
> > 
> >   nsresult GetFolderType(PRInt64 aFolder, nsACString &aType);
> > 
> >   nsresult GetLastChildId(PRInt64 aFolder, PRInt64* aItemId);
> >+
> >+  static void CanonicalizeKeyword(nsString& aKeyword);
> 
> static NS_HIDDEN_(void) please


Done.
Attachment #405569 - Attachment is obsolete: true
Attachment #407132 - Flags: review?(gavin.sharp)
Attachment #405569 - Flags: review?(gavin.sharp)
Comment on attachment 407132 [details] [diff] [review]
support keywords with spaces, v4

I was going to suggest moving parseStringForKeywordData into getShortcutOrURI to avoid further polluting the global scope, but since it's not that large, you might as well just inline it, I think.

The comment does seem slightly repetitive (the last paragraph repeats two points mentioned in the earlier list), perhaps combine those by just highlighting the important points in the list (or putting them at the top)?

A browser chrome test that just called getShortcutOrURI with various example strings and checked its return values would be useful, and should be pretty easy to write.

r=me with those addressed.
Attachment #407132 - Flags: review?(gavin.sharp) → review+
Comment on attachment 407132 [details] [diff] [review]
support keywords with spaces, v4

>diff -r 21ca51ccd2a9 toolkit/components/places/src/nsNavBookmarks.cpp
>--- a/toolkit/components/places/src/nsNavBookmarks.cpp	Tue Oct 06 11:47:46 2009 -0700
>+++ b/toolkit/components/places/src/nsNavBookmarks.cpp	Mon Oct 19 15:11:39 2009 -0700
>@@ -2809,24 +2809,31 @@ nsNavBookmarks::SetItemIndex(PRInt64 aIt
> 
>   ENUMERATE_OBSERVERS(mCanNotify, mCacheObservers, mObservers, nsINavBookmarkObserver,
>                       OnItemMoved(aItemId, parent, oldIndex, parent,
>                                   aNewIndex, itemType))
> 
>   return NS_OK;
> }
> 
>+NS_HIDDEN_(void) // static
>+nsNavBookmarks::CanonicalizeKeyword(nsString& aKeyword)
>+{
>+  aKeyword.CompressWhitespace(PR_TRUE, PR_TRUE);
>+  ToLowerCase(aKeyword);
>+}
>+

hm, now that i look again at this, i think it does not need to be part of the nsNavBookmarks object. you could just add a mozilla::places namespace in the top of nsNavBookmarks.cpp and define this as a static global util.
clearly adding a using mozilla::places on top.

reasons to have this as a static method of nsNavbookmarks?
(In reply to comment #33)
> (From update of attachment 407132 [details] [diff] [review])
> I was going to suggest moving parseStringForKeywordData into getShortcutOrURI
> to avoid further polluting the global scope, but since it's not that large, you
> might as well just inline it, I think.
> 
> The comment does seem slightly repetitive (the last paragraph repeats two
> points mentioned in the earlier list), perhaps combine those by just
> highlighting the important points in the list (or putting them at the top)?


I can edit the comment down to something more concise.  The verbosity was mostly for yours and Marco's benefit. :-)


> A browser chrome test that just called getShortcutOrURI with various example
> strings and checked its return values would be useful, and should be pretty
> easy to write.


Do we have tests that add bookmarks that I can use as an example?

Alternatively, I can factor the parsing out from the shortcut lookup so it can be tested directly in an XPCShell test.  That, of course, requires that parsing should be handled by a stand-alone function that's accessible to the test.


> 
> r=me with those addressed.
(In reply to comment #34)
> (From update of attachment 407132 [details] [diff] [review])
> >diff -r 21ca51ccd2a9 toolkit/components/places/src/nsNavBookmarks.cpp
> >--- a/toolkit/components/places/src/nsNavBookmarks.cpp	Tue Oct 06 11:47:46 2009 -0700
> >+++ b/toolkit/components/places/src/nsNavBookmarks.cpp	Mon Oct 19 15:11:39 2009 -0700
> >@@ -2809,24 +2809,31 @@ nsNavBookmarks::SetItemIndex(PRInt64 aIt
> > 
> >   ENUMERATE_OBSERVERS(mCanNotify, mCacheObservers, mObservers, nsINavBookmarkObserver,
> >                       OnItemMoved(aItemId, parent, oldIndex, parent,
> >                                   aNewIndex, itemType))
> > 
> >   return NS_OK;
> > }
> > 
> >+NS_HIDDEN_(void) // static
> >+nsNavBookmarks::CanonicalizeKeyword(nsString& aKeyword)
> >+{
> >+  aKeyword.CompressWhitespace(PR_TRUE, PR_TRUE);
> >+  ToLowerCase(aKeyword);
> >+}
> >+
> 
> hm, now that i look again at this, i think it does not need to be part of the
> nsNavBookmarks object. you could just add a mozilla::places namespace in the
> top of nsNavBookmarks.cpp and define this as a static global util.
> clearly adding a using mozilla::places on top.


> reasons to have this as a static method of nsNavbookmarks?


I put it in nsNavBookmarks, because it's only called by two methods, both of which are methods of that class.  My general rule is to put utility methods in the smallest scope that works.  That boils down to a choice between class scope and file scope (I could just make CanonicalizeKeyword a file static).  Since file organization in C++ is somewhat arbitrary, I figure class scope makes more sense.

I don't know that there's any compelling reason to prefer one approach over the other in this case.  If you want me to change it, that's cool, but I'd like to at least hear a rationale as to why your approach is better :-).
(In reply to comment #35)
> Do we have tests that add bookmarks that I can use as an example?
> 
> Alternatively, I can factor the parsing out from the shortcut lookup so it can
> be tested directly in an XPCShell test.  That, of course, requires that parsing
> should be handled by a stand-alone function that's accessible to the test.

You don't need to add bookmarks, or factor stuff out for an xpcshell test -  just write a browser chrome test that calls getShortcutOrURI directly. A more comprehensive test for the entire code path (i.e. enter text in the url bar, check that the right page is loaded) would be great too, but much more involved, so no need to block on it.
no rationales from me, just that we are trying to cleanup that one-hundred-hands-did-touch-me code. but you're free to proceed with either class or file scope.
about tests that add bookmarks, Places has about 150 tests to take ideas from, but Gavin is right, the scope of this bug is not to test all paths involved in adding bookmarks.
But if you want to try with that, browser_bookmarksProperties.js could be an hook.
Attachment #407132 - Attachment is obsolete: true
Attachment #419978 - Flags: review?(gavin.sharp)
Attachment #419978 - Flags: feedback?(mak77)
Comment on attachment 419978 [details] [diff] [review]
support keywords with spaces, v5

Review of attachment 419978 [details] [diff] [review]:
-----------------------------------------------------------------

My feedback is still the same here:
- NS_HIDDEN_() is useless and should be removed
- CanonicalizeKeyword should be a static method in the anonymous namespace of nsNavBookmarks.cpp
- the test should also have an example with 3 words, not just 2
- finally (but nit) I guess if we could rewrite that for loop as a Generator returning possible keywords

f+ on the approach though
Attachment #419978 - Flags: feedback?(mak77) → feedback+
Comment on attachment 419978 [details] [diff] [review]
support keywords with spaces, v5

I apologize for the completely unreasonable delay in getting to this.

Two comments:
- it might be nice to move the getEngineByAlias check into parseStringForKeywordData as well (so that they both parse keywords the same way). What do you think?
- as mentioned in my earlier review, avoid adding parseStringForKeywordData to the global scope - I'd put it inside of getShortcutOrURI.
Attachment #419978 - Flags: review?(gavin.sharp) → feedback+
I can reproduce this bug on Mozilla/5.0 (Windows NT 6.3; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0

Is this something that is still being investigated or should this be closed as wont fix?
Flags: needinfo?(mak77)
Flags: needinfo?(gavin.sharp)
honestly I feel like it's a lot of added complication for the benefit of Firefox 2 compatibility. That was a good value in 2008, likely not in 2016.

Considered the telemetry data about how many users have keywords and the fact we dropped the Ubiquity concept of natural language actions in the urlbar, I don't think we have much value into adding this today.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(mak77)
Flags: needinfo?(gavin.sharp)
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: