Closed Bug 261608 Opened 20 years ago Closed 20 years ago

Google search (default) in location bar with : or . in terms fails

Categories

(Core :: DOM: Navigation, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(1 file, 6 obsolete files)

Type 'chewbacca site:starwars.com' into the location bar.  Assuming the user
doesn't have 'chewbacca' defined as a keyword, the expected result is to visit:

http://www.google.com/search?&q=chewbacca%20site:starwars.com&sourceid=mozilla-search

However, the characters ":" and "." fool Firefox into thinking that the stuff
entered is a URL, and the user gets the message:

The URL is not valid and cannot be loaded.

This becomes a problem if I often visit a long URL that's the top result on
Google within a specific site.  For example, as I go through Firefox nightlies I
often need to download a special CA and personal certificate in order to use
parts of MIT's website.  Within Google I can access the webpage by typing
"certificates site:mit.edu" in Google.  (I could also type site:mit.edu
certificates", but I wouldn't expect that to work because it should be
interpreted as a URL within the nonexistent site: protocol.)  Firefox, however,
restricts me to using the workaround "google certificates site:mit.edu" within
the location.  In addition to being extra typing, it requires a second mouse
click to get where I want to go.

Incidentally, this is annoying enough that I've done some searching to find the
problem.  The problem is here:

http://lxr.mozilla.org/aviarybranch/source/docshell/base/nsDefaultURIFixup.cpp#679

I've cced vlad in case he thinks this is something he might be able to handle
reasonably quickly.
This bug was surprisingly easy for me to fix, given that my knowledge of C++ is
only maybe an hour's worth of learning past JavaScript.  I spent most of the
time just tracking through the code to verify that I'd found the function that
contained the faulty logic.

For the solution, instead of outright rejecting all strings that contain a ":"
or ".", the code simply searches to determine whether the suspect characters
are before or after the first space.  If they're after the first space or
aren't in the string, the code continues to evaluate the URL as a keyword
search.  I also added some examples of what this would enable to the function's
documentation.

In my testing of a Linux trunk build with this fix I haven't found any
difference in behavior with both normal keyword searches and with the special
type that this bug allows.  Now, who to review this?  (Ideally I'd like to see
this fixed in 1.0 because it's personal dogfood as I described above, so I need
someone who'll respond relatively quickly.)
Assignee: bugs → jwalden+bmo
Status: NEW → ASSIGNED
Component: Location Bar and Autocomplete → XPCOM
Product: Firefox → Browser
Version: unspecified → Trunk
Comment on attachment 160213 [details] [diff] [review]
Exec keyword search iff first space-separated substring doesn't contain [.:]

This patch fixes a bug that causes some useful strings that can be used in
default searches to fail in Firefox.  The fix is pretty simple, and as long as
during a review you don't see any holes in my logic it should be pretty much
good to go.  I've tested on Linux, and it seems to work well.
Attachment #160213 - Flags: review?(dougt)
Component: XPCOM → Embedding: Docshell
Comment on attachment 160213 [details] [diff] [review]
Exec keyword search iff first space-separated substring doesn't contain [.:]

Ben, can you review this and hopefully approve it for aviary checkin?  This is
something that would be really helpful for me in day-to-day use.
Attachment #160213 - Flags: review?(dougt) → review?(bugs)
Comment on attachment 160213 [details] [diff] [review]
Exec keyword search iff first space-separated substring doesn't contain [.:]

I just noticed this breaks "?mozilla".	Patch coming up ASAP...
Attachment #160213 - Attachment is obsolete: true
Attachment #160213 - Flags: review?(bugs)
I've checked every string mentioned in the documentation for the bug and
received the results I expected (by comparing them with an aviary branch
build's results).  This patch should work correctly.
This patch is relatively trivial in that the only *real* change occurs in one
line.  I've tested the usage cases outlined in docs, and they function properly.
 This is also something I could really use as I switch between nightlies and
erase and create new profiles.  It should be safe for the aviary branch, because
the function I change is only called once in code, and that's for the specific
situation I tracked here.
Flags: blocking-aviary1.0?
Comment on attachment 160315 [details] [diff] [review]
Fixes the case '?mozilla', which was broken w/prev. patch

Ben, can you review this?  It enables using the Location Bar in Firefox for a
search like 'firefox site:mozilla.org'.  I've tested the usage cases listed in
source (and added a few new ones as well), and I fixed the bug in the previous
version.  This is also something that would be a big help to me day-to-day as I
erase and create new profiles for nightly builds.
Attachment #160315 - Flags: review?(bugs)
Flags: blocking-aviary1.0? → blocking-aviary1.0+
I don't review changes to that code. 
Flags: blocking-aviary1.0+ → blocking-aviary1.0-
Comment on attachment 160315 [details] [diff] [review]
Fixes the case '?mozilla', which was broken w/prev. patch

Okay, let's try someone else then.  Still hoping to get this before 1.0...
Attachment #160315 - Flags: review?(bugs) → review?(caillon)
*** Bug 262605 has been marked as a duplicate of this bug. ***
(In reply to comment #11)
> This is bug 79655.

It would seem so, although some of the problems described there seem to have
been worked around in the urlbar handling logic of Firefox or have been fixed in
the meantime.  This bug is far newer, and *usually* it would be marked as a dup
of that one.

However, because the patch is here, it makes more sense to keep both open, get
r+sr+a (this would need sr now that I think of it), and then mark both fixed
(assuming all of Seamonkey's issues get fixed by this -- that bug might need to
be morphed to cover Seamonkey-specific urlbar handling logic if the problems
still exist).  This prevents bugspam from dup notification, reposting the patch,
and asking for a review there.
I agree, whish is why I didn't dupe it. Just thought I'd let you know.

This patch looks to me like it addresses the issues of that bug in one way, but
how does it compare to attachment 75174 [details] [diff] [review]?
Comment on attachment 160315 [details] [diff] [review]
Fixes the case '?mozilla', which was broken w/prev. patch

Are you sure you attached the right patch?  Looks like "?mozilla" is still
broken here.

Anyway, use kNotFound instead of -1
Attachment #160315 - Flags: review?(caillon) → review-
Attached patch The real updated patch (obsolete) — Splinter Review
(In reply to comment #14)
> Are you sure you attached the right patch?

Gah!  Curses!  How did that happen!?

> Anyway, use kNotFound instead of -1

Done, including in one spot I'd previously not touched but was within the area
I was tweaking.
Attachment #160315 - Attachment is obsolete: true
Attachment #161108 - Flags: review?(caillon)
Comment on attachment 161108 [details] [diff] [review]
The real updated patch

>Index: mozilla/docshell/base/nsDefaultURIFixup.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/docshell/base/nsDefaultURIFixup.cpp,v
>retrieving revision 1.40
>diff -u -8 -r1.40 nsDefaultURIFixup.cpp
>--- mozilla/docshell/base/nsDefaultURIFixup.cpp	9 Sep 2004 14:27:52 -0000	1.40
>+++ mozilla/docshell/base/nsDefaultURIFixup.cpp	5 Oct 2004 02:49:55 -0000
>@@ -696,33 +696,43 @@
> nsresult nsDefaultURIFixup::KeywordURIFixup(const nsACString & aURIString, 
>                                             nsIURI** aURI)
> {
>     // These are keyword formatted strings
>     // "what is mozilla"
>     // "what is mozilla?"
>     // "?mozilla"
>     // "?What is mozilla"
>+    // "?docshell site:mozilla.org" - first space-separated string contains no colons or dots
>+    // "docshell site:mozilla.org"
> 
>     // These are not keyword formatted strings
>-    // "www.blah.com" - anything with a dot in it 
>-    // "nonQualifiedHost:80" - anything with a colon in it
>+    // "www.blah.com" - anything with the first space-separated substring containing a dot
>+    // "www.blah.com stuff"
>+    // "?www.blah.com"
>+    // "nonQualifiedHost:80" - anything with the first space-separated substring containing a colon
>+    // "nonQualifiedHost:80 args"
>+    // "?site:intranetSite"

It's unclear to me why the above line is not a keyword search.	I don't think
that we would do anything with treating "?site:nonQualifedHost" as a
non-keyword anyway (the load will probably throw an error).  However, you are
not mandating the characters to be "site" so someone could enter something like
"?Magic: The Gathering" which would get treated as a non-keyword using your
algorithm.  I'd argue that anything starting with a ? should get treated as a
keyword search, though I could perhaps be persuaded to treat "?www.mozilla.org"
as a non-keyword.

>     // "nonQualifiedHost?"
>     // "nonQualifiedHost?args"
>     // "nonQualifiedHost?some args"
> 
>-    if(aURIString.FindChar('.') == -1 && aURIString.FindChar(':') == -1)
>+    PRInt32 dotLoc   = aURIString.FindChar('.');
>+    PRInt32 colonLoc = aURIString.FindChar(':');
>+    PRInt32 spaceLoc = aURIString.FindChar(' ');
>+
>+    if(((dotLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < dotLoc))) &&
>+       ((colonLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < colonLoc))))
>     {
>         PRInt32 qMarkLoc = aURIString.FindChar('?');
>-        PRInt32 spaceLoc = aURIString.FindChar(' ');
> 
>         PRBool keyword = PR_FALSE;
>         if(qMarkLoc == 0)
>         keyword = PR_TRUE;
>-        else if((spaceLoc > 0) && ((qMarkLoc == -1) || (spaceLoc < qMarkLoc)))
>+        else if((spaceLoc > 0) && ((qMarkLoc == kNotFound) || (spaceLoc < qMarkLoc)))
>         keyword = PR_TRUE;
> 
>         if(keyword)

Hm, the boolean and two ifs can be combined into one larger if statement.
Attachment #161108 - Flags: review?(caillon) → review-
Attached patch Patch, the next iteration (obsolete) — Splinter Review
(In reply to comment #16)
> >+	// "?site:intranetSite"
> 
> It's unclear to me why the above line is not a keyword search. I don't think
> that we would do anything with treating "?site:nonQualifedHost" as a
> non-keyword anyway (the load will probably throw an error).  However, you are

> not mandating the characters to be "site"

I didn't/wouldn't mandate the characters to be site or any such string, because
Google also has at least intitle, inurl, filetype, cache, link, related, and
define.  It's also not worth bloating code to fix only a specific set of cases.
 As for not doing anything with "?site:nonQualifiedHost", I was trying to
modify the original algorithm as little as possible to allow a few specific
safe usages, and this just got banned with the modifications I made; the
addition to the list was to make that fact clear.

> so someone could enter something like
> "?Magic: The Gathering" which would get treated as a non-keyword using your
> algorithm.  I'd argue that anything starting with a ? should get treated as a

> keyword search, though I could perhaps be persuaded to treat
"?www.mozilla.org"
> as a non-keyword.

Google redirects to the URL if you do an I'm Feeling Lucky search for a URL, so
I don't see a real reason not to treat it as a keyword.

Incidentally, do you know why are searches beginning with "?" considered
keywords?  Some forgotten historical precedent, perhaps?

> >	    PRBool keyword = PR_FALSE;
> >	    if(qMarkLoc == 0)
> >	    keyword = PR_TRUE;
> >-	    else if((spaceLoc > 0) && ((qMarkLoc == -1) || (spaceLoc <
qMarkLoc)))
> >+	    else if((spaceLoc > 0) && ((qMarkLoc == kNotFound) || (spaceLoc <
qMarkLoc)))
> >	    keyword = PR_TRUE;
> > 
> >	    if(keyword)
> 
> Hm, the boolean and two ifs can be combined into one larger if statement.

Done, although the if() gets really ugly (probably why it was split up
initially) in my opinion (which is, admittedly, the opinion of one who isn't
really familiar with C++ yet, and so might not be the best judge of ugliness). 
Also, in making "?" at beginning always trigger keyword search, there's some
repetition of (qMarkLoc == 0) in both if()s, but I don't know of a good way to
move qMarkLoc into a single if() in a way that doesn't lower the coding clarity
or increase the probable compiled code size.
Attachment #161108 - Attachment is obsolete: true
Attachment #161327 - Flags: review?(caillon)
Comment on attachment 161327 [details] [diff] [review]
Patch, the next iteration

>@@ -696,34 +697,36 @@
> nsresult nsDefaultURIFixup::KeywordURIFixup(const nsACString & aURIString, 
>                                             nsIURI** aURI)
> {
>     // These are keyword formatted strings
>     // "what is mozilla"
>     // "what is mozilla?"
>-    // "?mozilla"
>-    // "?What is mozilla"
>+    // "docshell site:mozilla.org" - has no dot/colon in the first space-separated substring
>+    // "?mozilla" - anything that begins with a question mark
>+    // "?site:mozilla.org docshell"
> 
>     // These are not keyword formatted strings
>-    // "www.blah.com" - anything with a dot in it 
>-    // "nonQualifiedHost:80" - anything with a colon in it
>+    // "www.blah.com" - first space-separated substring contains a dot, doesn't start with "?"
>+    // "www.blah.com stuff"
>+    // "nonQualifiedHost:80" - first space-separated substring contains a colon, doesn't start with "?"
>+    // "nonQualifiedHost:80 args"
>     // "nonQualifiedHost?"
>     // "nonQualifiedHost?args"
>     // "nonQualifiedHost?some args"
> 
>-    if(aURIString.FindChar('.') == -1 && aURIString.FindChar(':') == -1)
>+    PRInt32 dotLoc   = aURIString.FindChar('.');
>+    PRInt32 colonLoc = aURIString.FindChar(':');
>+    PRInt32 spaceLoc = aURIString.FindChar(' ');
>+    PRInt32 qMarkLoc = aURIString.FindChar('?');
>+
>+    if((((dotLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < dotLoc))) &&
>+        ((colonLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < colonLoc)))) ||
>+       qMarkLoc == 0)


Hm, I think this is right.  Please do testing against all the cases above,
though.  Also note, you are over-bracing everything.  The above would be
re-written as:

  if ((dotLoc == kNotFound || (spaceLoc > 0 && spaceLoc < dotLoc)) &&
      (colonLoc == kNotFound || (spaceLoc > 0 && spaceLoc < colonLoc)) ||
      qMarkLoc == 0)


>+        if(qMarkLoc == 0 ||
>+           ((spaceLoc > 0) && ((qMarkLoc == kNotFound) || (spaceLoc < qMarkLoc))))


And, if you add the above if, everything would combine into:

  if ((dotLoc == kNotFound || (spaceLoc > 0 && spaceLoc < dotLoc)) &&
      (colonLoc == kNotFound || (spaceLoc > 0 && spaceLoc < colonLoc)) &&
      (spaceLoc > 0 && (qMarkLoc == kNotFound || spaceLoc < qMarkLoc)) ||
      qMarkLoc == 0)

There probably is a way to whittle this down further, but I have been looking
at this too long and have other things to do.  Chopping off braces helps make
this much more readable though.
Attachment #161327 - Flags: review?(caillon) → review+
Attached patch Tweaked version of prev. patch (obsolete) — Splinter Review
I retested all the testcases again, and they all work properly.

I tested the latter set of testcases with a valid nonQualifiedHost, so when
each permutation is typed into Google the first result is a completely
different site (because an invalid nonQualifiedHost is later passed on as a
Google search).
Attachment #161327 - Attachment is obsolete: true
Comment on attachment 163656 [details] [diff] [review]
Tweaked version of prev. patch

Carrying over r+ from previous patch, asking for sr...
Attachment #163656 - Flags: superreview?(darin)
Attachment #163656 - Flags: review+
Comment on attachment 163656 [details] [diff] [review]
Tweaked version of prev. patch

>Index: mozilla/docshell/base/nsDefaultURIFixup.cpp

>+        nsCAutoString keywordSpec("keyword:");
>+        char *utf8Spec = ToNewCString(aURIString); // aURIString is UTF-8
>+        if(utf8Spec)
>+        {
>+            char* escapedUTF8Spec = nsEscape(utf8Spec, url_Path);
>+            if(escapedUTF8Spec) 
>             {
>+                keywordSpec.Append(escapedUTF8Spec);
>+                NS_NewURI(aURI, keywordSpec.get(), nsnull);
>+                nsMemory::Free(escapedUTF8Spec);
>+            } // escapedUTF8Spec
>+            nsMemory::Free(utf8Spec);
>+        } // utf8Spec
>+    } // dotLoc

This could all be simplified down to:

  nsCAutoString keywordSpec("keyword:");
  keywordSpec.Append(aURIString);
  NS_NewURI(aURI, keywordSpec);

The implementation of NS_NewURI will perform the escaping itself,
so there is no need to escape here.  All you have to do is make
sure that the string passed to NS_NewURI is UTF-8, which in this
it is.
Attachment #163656 - Flags: superreview?(darin) → superreview-
This patch changes the indicated code as requested above.  I removed the "//
dotLoc" comment after the closing brace because it's not needed for such a
short code block.  Hopefully, that's it now.

Aah, the sound of a smaller codebase...
Attachment #163656 - Attachment is obsolete: true
Attachment #163657 - Attachment is obsolete: true
Comment on attachment 167370 [details] [diff] [review]
Addresses sr comments

Moving forward the r+, asking for sr again...
Attachment #167370 - Flags: superreview?(darin)
Attachment #167370 - Flags: review+
Comment on attachment 167370 [details] [diff] [review]
Addresses sr comments

sr=darin
Attachment #167370 - Flags: superreview?(darin) → superreview+
Checking in docshell/base/nsDefaultURIFixup.cpp;
/cvsroot/mozilla/docshell/base/nsDefaultURIFixup.cpp,v  <--  nsDefaultURIFixup.cpp
new revision: 1.42; previous revision: 1.41
done
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
*** Bug 273157 has been marked as a duplicate of this bug. ***
*** Bug 79655 has been marked as a duplicate of this bug. ***
*** Bug 293764 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: