Closed Bug 982428 Opened 10 years ago Closed 10 years ago

Interpret urls with bogus protocols ('site:mozilla.org firefox', 'define:serendipity') as keyword search queries rather than showing an error page

Categories

(Firefox :: Address Bar, defect)

defect
Not set
normal
Points:
8

Tracking

()

VERIFIED FIXED
Firefox 33
Iteration:
33.2
Tracking Status
firefox33 --- verified
firefox34 --- verified
relnote-firefox --- 33+

People

(Reporter: vtsatskin, Assigned: Gijs)

References

Details

Attachments

(1 file, 8 obsolete files)

The rationale here is that users will probably not have the URI scheme model when interacting with our awesome bar. Thus showing an "invalid address" or "malformed URI" error will cause confusion and frustration when their intentions were to search the web.

Instead of showing the either error page, we should just direct them to an internet search.
Flags: firefox-backlog?
Note: this is particularly useful for Google-related commands such as mathematical equations, define:word, etc.  Right now, these work as inputs in Chrome's URL bar but not ours.
Flags: firefox-backlog? → firefox-backlog+
Whiteboard: p=8
See Also: → 693808
Whiteboard: p=8 → p=8 [qa?]
I've added bug 982428 as a dup of this bug, because parsing non-URLs as searches should translate to utilizing Google-specific queries, such as "define:cat" and "site:mozilla.org free" if Google is the user's search provider.
Assignee: nobody → gijskruitbosch+bugs
Component: General → Location Bar
OS: Mac OS X → All
Hardware: x86 → All
Marco, can you add this to the current iteration?
Status: NEW → ASSIGNED
Flags: needinfo?(mmucci)
Whiteboard: p=8 [qa?] → p=8 [qa+]
Added to Iteration 33.1
Flags: needinfo?(mmucci)
Whiteboard: p=8 [qa+] → p=8 s=33.1 [qa+]
(In reply to Valentin Tsatskin [:vt] from comment #0)
> The rationale here is that users will probably not have the URI scheme model
> when interacting with our awesome bar. Thus showing an "invalid address" or
> "malformed URI" error will cause confusion and frustration when their
> intentions were to search the web.
> 
> Instead of showing the either error page, we should just direct them to an
> internet search.

Just so we're on the same page, if I use "site:foo", I get neither of these errors, I get "The address wasn't understood" which isn't nearly as unfriendly as what you quoted.

What input are you using that triggers "invalid address" and/or "malformed URI"?
Flags: needinfo?(vtsatskin)
So here's what I'm currently looking at... Boris, does this look non-crazy to you? The IsFrame() check in nsDocShell.h is basically a hacky version of 'is this a toplevel load' - not 100% sure if there's a better way to check for that. One thing I noticed in my testing is that this doesn't seem to work well with session restore: I still get a network error page with 'site:foo' instead of the google search page. Maybe that's something about my testing environment, or unrelated session store breakage, but it'd be good to sort that out...
Attachment #8442528 - Flags: feedback?(bzbarsky)
Comment on attachment 8442528 [details] [diff] [review]
fix up 'valid' URIs that we can't handle, f?bz

>+        aFixupFlags & FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP) {

Parens around the bitwise and to be clear about precedence, please.

>+            if (!handlerExists) {
>+                CreateSearch(uriString, aPostData, aURI);

This is leaking the URI we created so far.  You want NS_RELEASE(*aURI) here before doing the CreateSearch thing.

Do we want to just key this new behavior off FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP, or add a new flag?  I expect the only user of FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP in practice is the URL bar?

>+    return CreateSearch(aKeyword, aPostData, aURI);

Why aKeyword, not keyword?

>+                responseType.Assign(mozKeywordSearch.get());

Drop the .get() bit.

>+                responseType.AssignLiteral("");

No need: it's already empty.

The docshell change I'm not convinced about.  I can see doing third-party fixup for a load from the URL bar, but if a web page does a location set on another web page, I don't think we want to be doing third-party fixup...
Attachment #8442528 - Flags: feedback?(bzbarsky) → feedback+
(In reply to Boris Zbarsky [:bz] from comment #10)
> Comment on attachment 8442528 [details] [diff] [review]
> fix up 'valid' URIs that we can't handle, f?bz
> 
> >+        aFixupFlags & FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP) {
> 
> Parens around the bitwise and to be clear about precedence, please.
> 
> >+            if (!handlerExists) {
> >+                CreateSearch(uriString, aPostData, aURI);
> 
> This is leaking the URI we created so far.  You want NS_RELEASE(*aURI) here
> before doing the CreateSearch thing.

Interesting. I was under the impression that nsDefaultURIFixup does a lot of this type of overwriting... Now that I'm more awake, I'll have another look.

> Do we want to just key this new behavior off
> FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP, or add a new flag?  I expect the only user
> of FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP in practice is the URL bar?

This is hard to check because it gets passed in as a load flag. Here's users I found beside the obvious browser.js / utilityOverlay.js ones:

http://mxr.mozilla.org/mozilla-central/source/caps/src/nsScriptSecurityManager.cpp?mark=834,836#804

http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsFrameLoader.cpp#558

http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabChild.cpp#1442

http://mxr.mozilla.org/mozilla-central/source/browser/components/sessionstore/src/ContentRestore.jsm#201

> 
> >+    return CreateSearch(aKeyword, aPostData, aURI);
> 
> Why aKeyword, not keyword?

Oversight, thanks for catchiing this. :-)

> The docshell change I'm not convinced about.  I can see doing third-party
> fixup for a load from the URL bar, but if a web page does a location set on
> another web page, I don't think we want to be doing third-party fixup...

This makes sense. I assumed that this flag wouldn't be passed in the cases you describe anyway... Furthermore, if you pass something that is parsable as a URL, the most we'll do is strip \r\n and some other bogus unescaped stuff out, mess with view-source and/or half-typed schemes / file links / backslashes, and then we actually return early because the block at http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDefaultURIFixup.cpp#257 will create a URL and the one at http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDefaultURIFixup.cpp#266 will return with that.

As it is, I'm not sure how best to distinguish these two cases - we could add an extra flag if we think that'd be enough of a distinction. Also not super sure why this code is even resetting the existing flag - why is it even possible we end up in that case? Bonsai (why, hello!) points to bug 263213, from which this isn't clear to me - that is, seems like something that was necessary then but shouldn't be necessary right now.

Then again, depending on cautious behaviour of nsDefaultURIFixup from the nsDocShell side seems like a bad move as well...
Also, are there tests for this code? A casual search did not reveal any, which is mildly worrying... I will try and add one for this specific case, at least.
I decided to add a new flag. I've also removed the INTERNAL equivalent to the fixup_scheme_typos flag because it was unused. This works, except that session restore still thinks that the url in use is the original one. This doesn't happen in the regular keyword search (e.g. just typing 'foo bar' in the location bar) so there must be something wrong with my patch, but I'm not sure what it is. :-( .
Attachment #8442528 - Attachment is obsolete: true
This patch uses the flag everywhere (including mobile and metro...).
> I was under the impression that nsDefaultURIFixup does a lot of this type of
> overwriting...

If you look carefully, you'll see that once *aURI is set, fixup either returns immediately or works with *aURI, not aURI.

> Here's users I found beside the obvious browser.js / utilityOverlay.js ones:

Hmm.  The frameloader owner is the only slightly fishy case, but that's coming from an attribute on the mozbrowser, not from web content, so should be ok.

The other uses look ok.

> I assumed that this flag wouldn't be passed in the cases you describe anyway...

Oh, I see.  This is removing the flag, not adding it.  And this is in the string-accepting function, which is never invoked by web content.  OK, that makes some sense, I think.  

On the other hand, note that the whole point of bug 263213 was that http://some-invalid-host-name typed in the URL bar should NOT do a search for "some-invalid-host-name".  That is, we want to treat the user typing "some-invalid-host-name" and "http://some-invalid-host-name" in the URL bar differently.  It's not clear to me whether this bug is trying to change that behavior; I think it's pretty clear that if a user types "http://some-invalid-host-name" they are not in fact trying to search the web.

> Also, are there tests for this code?

Probably not.  It doesn't help that it has multiple knobs that are set inside and outside of Gecko so it's not even clear where the tests should live and what they should test...  Adding some tests would be pretty awesome.
(In reply to Boris Zbarsky [:bz] from comment #15)
> > I assumed that this flag wouldn't be passed in the cases you describe anyway...
> 
> Oh, I see.  This is removing the flag, not adding it.  And this is in the
> string-accepting function, which is never invoked by web content.  OK, that
> makes some sense, I think.  

Right. The issue is that it's removing it, which meant my code wasn't getting hit if I reused the flag in question.

> On the other hand, note that the whole point of bug 263213 was that
> http://some-invalid-host-name typed in the URL bar should NOT do a search
> for "some-invalid-host-name".  That is, we want to treat the user typing
> "some-invalid-host-name" and "http://some-invalid-host-name" in the URL bar
> differently.  It's not clear to me whether this bug is trying to change that
> behavior; I think it's pretty clear that if a user types
> "http://some-invalid-host-name" they are not in fact trying to search the
> web.

I'm not trying to change that, although changing a related case (typing "some-invalid-hostname" without a protocol prefix) is bug 693808's job. The current change I'm introducing (if I wrote it correctly) is very specific: it needs to be an external protocol for which we don't believe there's a handler (which basically guarantees that creating a channel will fail later on in the docshell code, causing the error page to show).

I went with the additional flag in the end anyway, as it seems safer considering there are so few tests. That said, now we're sending very similar flags in very many cases, which makes me wonder if I should be using the existing SCHEME_TYPOS flag... even if this isn't really a scheme typo. Maybe passing that flag could imply the new flag either way? That goes back to scary territory, in a certain sense - in another, I'm updating all the callers right now, which is also silly.

Needinfo for this: in your capacity as module peer/owner for docshell, can you recommend a course of action here wrt new flags / reusing flags / changing the impact of existing flags / renaming flags? (the impact of the flag addition is visible in the latest patches) :-)

> > Also, are there tests for this code?
> 
> Probably not.  It doesn't help that it has multiple knobs that are set
> inside and outside of Gecko so it's not even clear where the tests should
> live and what they should test...  Adding some tests would be pretty awesome.

I actually found a unit test for the urifixup case after asking this. I'll update that, but I'm considering adding a browser test for the actual functionality testing as well.

I found out the session restore issue is orthogonal, see bug 1027637.
Flags: needinfo?(bzbarsky)
I realized that really, splitting off the CreateSearch bit was just breaking e10s. D'oh. Calling a search 'keywordtouri' is kind of confusing, but alright. :-)
Attachment #8442780 - Attachment is obsolete: true
> which meant my code wasn't getting hit if I reused the flag in question.

Ah, because we do manage to produce a valid URI, I see.

I think making the scheme typos flag imply fixup for schemes we can't handle at all is pretty reasonable, actually.  It's a recent addition, and I can't think of cases offhand where we'd want to convert "ttp:" to "http:" but not convert unknown schemes into a search.
Flags: needinfo?(bzbarsky)
I think this is ready for review. Try: https://tbpl.mozilla.org/?tree=Try&rev=82baf7c6d977
Attachment #8442951 - Flags: review?(bzbarsky)
Attachment #8442818 - Attachment is obsolete: true
Attachment #8442782 - Attachment is obsolete: true
(In reply to :Gijs Kruitbosch from comment #8)
> (In reply to Valentin Tsatskin [:vt] from comment #0)
> > The rationale here is that users will probably not have the URI scheme model
> > when interacting with our awesome bar. Thus showing an "invalid address" or
> > "malformed URI" error will cause confusion and frustration when their
> > intentions were to search the web.
> > 
> > Instead of showing the either error page, we should just direct them to an
> > internet search.
> 
> Just so we're on the same page, if I use "site:foo", I get neither of these
> errors, I get "The address wasn't understood" which isn't nearly as
> unfriendly as what you quoted.
> 
> What input are you using that triggers "invalid address" and/or "malformed
> URI"?

I used the wrong name for "The address wasn't understood" (unknownProtocolFound)[1]; when I said "invalid address" I meant unknownProtocolFound.

As for malformedURI [2], I'm having trouble finding the example I had before which would trigger it that could be a search term. So I think just the unknownProtocolFound changes are good enough.

[1]:http://mxr.mozilla.org/mozilla-central/source/browser/locales/en-US/chrome/overrides/appstrings.properties#8
[2]:http://mxr.mozilla.org/mozilla-central/source/browser/locales/en-US/chrome/overrides/appstrings.properties#5
Flags: needinfo?(vtsatskin)
Comment on attachment 8442951 [details] [diff] [review]
allow fixing up URIs with schemes that we can't handle,

Review of attachment 8442951 [details] [diff] [review]:
-----------------------------------------------------------------

::: docshell/base/nsDefaultURIFixup.cpp
@@ +434,5 @@
> +
> +            defaultEngine->GetSubmission(NS_ConvertUTF8toUTF16(keyword),
> +                                         responseType,
> +                                         NS_LITERAL_STRING("keyword"),
> +                                         getter_AddRefs(submission));

To be clear, this hunk is not necessary for this patch, but it was cleanup I did while I was there, and so I left it in the patch.

::: docshell/base/nsDocShell.cpp
@@ -1571,5 @@
>      if (aLoadFlags & LOAD_FLAGS_ALLOW_THIRD_PARTY_FIXUP)
>          flags |= INTERNAL_LOAD_FLAGS_ALLOW_THIRD_PARTY_FIXUP;
>  
> -    if (aLoadFlags & LOAD_FLAGS_FIXUP_SCHEME_TYPOS)
> -        flags |= INTERNAL_LOAD_FLAGS_FIXUP_SCHEME_TYPOS;

Ditto here + removing the INTERNAL_... flag from the idl
Comment on attachment 8442951 [details] [diff] [review]
allow fixing up URIs with schemes that we can't handle,

r=me if you add a comment in the IDL about 0x80 being a free bit.
The new xpcshell test went orange on android only because it has an existing default search provider when running xpcshell tests, while desktop does not (?). I've updated the test to deal with that (hopefully), and repushed to try: https://tbpl.mozilla.org/?tree=Try&rev=ebca7ffc4a3c . This also addresses the nit from the review.
Attachment #8442951 - Attachment is obsolete: true
Attachment #8442951 - Flags: review?(bzbarsky)
Comment on attachment 8443061 [details] [diff] [review]
allow fixing up URIs with schemes that we can't handle,

Carrying over r+
Attachment #8443061 - Flags: review+
Updating summary per comment #20 and discussion on IRC.
Summary: Do not display "invalid address" or "Malformed URI" network error and instead interpret as a search result → Interpret urls with bogus protocols ('site:mozilla.org firefox', 'define:serendipity') as keyword search queries rather than showing an error page
... and considering there's an automated test here, and we ended up doing non-scary work for this, marking qa- instead of qa+
Whiteboard: p=8 s=33.1 [qa+] → p=8 s=33.1 [qa-]
Does this patch obey keyword.enabled? There's no explicit check for it like there is for FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP.
Comment on attachment 8443061 [details] [diff] [review]
allow fixing up URIs with schemes that we can't handle,

Boriss, I'm not fixing e.g. "60/5" in this bug, so you may or may not want to undo that duplicate. As I tried to make clear when I needinfo'd vt about this (who then seemed to agree that the case I've focused on is the main issue), for bugs like this, from an engineering perspective, it would be really helpful if there were clear steps to reproduce. Clearly, not every error page should be converted to a web search ("Firefox can't find the server for www.google.com" would be a fun one, for instance). I'm not really sure where UX feels the line is, and there are definitely cases that I'm not aware of (e.g. the 60/5 which was only mentioned in the now-duped-bug).

This bug only fixes the "thisisnotarealprotocol:stuff" case. It would probably make sense to file separate bugs if there are other cases where you think we should always do a keyword search, and to give concrete STR and expected/actual results there.

(In reply to dagger.bugzilla from comment #27)
> Does this patch obey keyword.enabled? There's no explicit check for it like
> there is for FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP.

Ugh. I forgot we have prefs for every little thing we do. IMO, we should just remove keyword.enabled, it's a footgun type pref. Anyway, different debate.

I have a fixed version locally that does check the pref, but I'm more worried about how the test continues to fail on android right now, so I'm not putting up a new patch until I've figured that out.
Attachment #8443061 - Flags: review+
Depends on: 1028091
Now with pref check. Test disabled on Android because of bug 1028091. Android seems to already have a prompt implemented for cases like this, with a Search button next to it, which opens the app store to search for something to handle the unknown protocol. Maybe we can do better than that, but I think we should followup that (it could be that we should just fix bug 1028091, or that we need android-specific code in docshell to cope with the lying external protocol service). Try: https://tbpl.mozilla.org/?tree=Try&rev=b961cb3a026c (third time lucky, I hope...)
Attachment #8443378 - Flags: review?(bzbarsky)
Attachment #8443061 - Attachment is obsolete: true
This time with the pref set in the test so it doesn't orange all over desktop...
Attachment #8444419 - Flags: review?(bzbarsky)
Attachment #8443378 - Attachment is obsolete: true
Attachment #8443378 - Flags: review?(bzbarsky)
Iteration: --- → 33.2
Points: --- → 8
QA Whiteboard: [qa-]
Whiteboard: p=8 s=33.1 [qa-] →
Comment on attachment 8444419 [details] [diff] [review]
allow fixing up URIs with schemes that we can't handle,

You don't want to add the bool var cache multiple times.
Attachment #8444419 - Flags: review?(bzbarsky) → review-
(In reply to Boris Zbarsky [:bz] from comment #33)
> Comment on attachment 8444419 [details] [diff] [review]
> allow fixing up URIs with schemes that we can't handle,
> 
> You don't want to add the bool var cache multiple times.

Ugh. This is where my trusting existing code over my intuition / earlier experience when writing C++ is a bad thing, clearly:

http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDefaultURIFixup.cpp#206

:-(

(I'll fix both up to use a guard)
Attachment #8444419 - Attachment is obsolete: true
Thank you!  That existing code is basically a memory leak!
Comment on attachment 8445128 [details] [diff] [review]
allow fixing up URIs with schemes that we can't handle,

r=me
Attachment #8445128 - Flags: review?(bzbarsky) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/930f107f460a
Whiteboard: → [fixed-in-inbound]
https://hg.mozilla.org/mozilla-central/rev/930f107f460a
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 33
(In reply to :Gijs Kruitbosch from comment #29)
> Comment on attachment 8443061 [details] [diff] [review]
> allow fixing up URIs with schemes that we can't handle,
> 
> Boriss, I'm not fixing e.g. "60/5" in this bug, so you may or may not want
> to undo that duplicate. As I tried to make clear when I needinfo'd vt about
> this (who then seemed to agree that the case I've focused on is the main
> issue), for bugs like this, from an engineering perspective, it would be
> really helpful if there were clear steps to reproduce. Clearly, not every
> error page should be converted to a web search ("Firefox can't find the
> server for www.google.com" would be a fun one, for instance). I'm not really
> sure where UX feels the line is, and there are definitely cases that I'm not
> aware of (e.g. the 60/5 which was only mentioned in the now-duped-bug).
> 
> This bug only fixes the "thisisnotarealprotocol:stuff" case. It would
> probably make sense to file separate bugs if there are other cases where you
> think we should always do a keyword search, and to give concrete STR and
> expected/actual results there.

This time with needinfo, as I'm marking this verified...
Status: RESOLVED → VERIFIED
Flags: needinfo?(jboriss)
Whiteboard: [fixed-in-inbound]
(In reply to :Gijs Kruitbosch from comment #40)
> (In reply to :Gijs Kruitbosch from comment #29)
> > Comment on attachment 8443061 [details] [diff] [review]
> > allow fixing up URIs with schemes that we can't handle,
> > 
> > Boriss, I'm not fixing e.g. "60/5" in this bug, so you may or may not want
> > to undo that duplicate. As I tried to make clear when I needinfo'd vt about
> > this (who then seemed to agree that the case I've focused on is the main
> > issue), for bugs like this, from an engineering perspective, it would be
> > really helpful if there were clear steps to reproduce. Clearly, not every
> > error page should be converted to a web search ("Firefox can't find the
> > server for www.google.com" would be a fun one, for instance). I'm not really
> > sure where UX feels the line is, and there are definitely cases that I'm not
> > aware of (e.g. the 60/5 which was only mentioned in the now-duped-bug).
> > 
> > This bug only fixes the "thisisnotarealprotocol:stuff" case. It would
> > probably make sense to file separate bugs if there are other cases where you
> > think we should always do a keyword search, and to give concrete STR and
> > expected/actual results there.
> 
> This time with needinfo, as I'm marking this verified...

Hey Gijs, could I possibly grab a build from you with patch applied?
Flags: needinfo?(jboriss)
(In reply to Jennifer Morrow [:Boriss] (Firefox UX) from comment #41)
> Hey Gijs, could I possibly grab a build from you with patch applied?

Hi, well, it's landed on Nightly a while ago, so any recent Nightly (last few days) should do!
Just like bug 693808.

Added in the 33 release notes with the wording "Improved search experience through the location bar"
Hoping to have a blog post to point to.
Depends on: 1047600
No longer depends on: 1047600
See Also: → 1591032
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: