Closed Bug 904731 Opened 11 years ago Closed 11 years ago

[browser] text entered into search bar containing dot (.) should not be treated as url address if it has whitespaces

Categories

(Firefox OS Graveyard :: Gaia::Browser, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aryx, Unassigned)

Details

(Whiteboard: [sprintready])

Attachments

(1 file)

Unagi with Boot2Gecko 1.1.0.0-prerelease 20130812041203

When entering text with a dot and whitespaces into the address bar, it gets treated as url after pressing enter, but it should trigger a search.

Example:
Type "3. oktober 2013" and press Enter.
Attached file PR to master
This patch fix below items
1. Space between any words should also be keyword search.
2. Update utilities_test.js, 'www.blah.com foo' and 'a?some b' should not be url.
Attachment #790742 - Flags: review?(dale)
So according to http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDefaultURIFixup.cpp#783 'a?some b' is not a keyword formatted string

This patch is good but I am hesitant to commit it until we find out what firefox uses to handle this case, as we ported that function from firefox core and any mistakes in it can mean completely in accessible urls
Whiteboard: [sprintready]
Needinfoing as I am having trouble finding the logic in which this is handled by firefox desktop (finding androids behaviour may also be useful)
Flags: needinfo?(gduan)
Flags: needinfo?(blassey.bugs)
Thanks Brad,

https://github.com/mozilla/mozilla-central/blob/master/mobile/android/base/util/StringUtils.java#L28

it seems that isSearchQuery will return false if the string is "abc. d" and then call openUrlAndFinish. However, almost all browsers take it as search query, even firefox browser. Please currect me if wrong.

I will try to track chromium's behavior later.
Flags: needinfo?(gduan)
It definitely does a keyword search for me, I am happy with the patch and would like us to match desktops behaviour, but I am not comfortable landing until we understand desktops behaviour as getting this wrong means inaccessible urls
When we originally implemented this, we thought about using html5 input=url to do the validation, I forgot why we couldnt at the time (will cc ben to see if he has better memory) but it looks like it works for this case (<input type="url" value="http://abc. d" /> is an invalid url)

If there are no reasons not to, this seems like a more robust solution
Flags: needinfo?(bfrancis)
Just to clarify, we couldnt use input type=url for the field because we need to enter search terms, however we can use 

    function isSearchQuery(term) { 
      var input = document.createElement('input'); 
      input.setAttribute('type', 'url'); 
      input.setAttribute('value', term); 
      return !input.validity.valid;
    }

This I wouldnt be as worried about the robustness, as if it does have any false negatives then they would need to be fixed in gecko (this example code would need to strip out schemas from the term)
Flags: needinfo?(bfrancis)
(In reply to Dale Harvey (:daleharvey) from comment #8)
> Just to clarify, we couldnt use input type=url for the field because we need
> to enter search terms

Yep, as discussed in IRC.

>, however we can use 
> 
>     function isSearchQuery(term) { 
>       var input = document.createElement('input'); 
>       input.setAttribute('type', 'url'); 
>       input.setAttribute('value', term); 
>       return !input.validity.valid;
>     }

Although it's a bit of a hack, that does seem preferable to trying to maintain a regular expression. I'd be interested to instrument this to see if it improves performance. Would it help if we didn't create the input element on every key press but instead kept one in the Browser object and just reset the value?
It requires at least a scheme to validate, so we may need to check the existence of scheme and then manually add a "http://" on it.

However, it's just like a black box, I have less idea what it has checked. It fails the test in below case on my Aurora.

http://codepen.io/anon/pen/cblqE
I should put the string I tested.
"http://abc.? d"
It being a black box is a good thing, it means we arent hardcoding duplicated url logic into the browser app, what is and isnt a url vs a keyword search has been problematic on desktop for years and in the browser app it continues to be so so if we can get rid of that code, all the better.

Then if we are absolutely certain "http://abc.? d" isnt a valid url (which I would naively agree), we can file a bug against the form validation
A Pivotal Tracker story has been created for this Bug: http://www.pivotaltracker.com/story/show/55559368
Hi Dale, I updated the patch for below reasons, please kindly check again.

1. Since form require basic scheme check, I manually add scheme as 'http://' if it doesn't have.
2. And after testing, I found below cases with unexpected result, so I still add some regex validation, we can close it once fixed behind.
 bugs: 'data','a?some b','?mozilla','?site:mozilla.org docshell'
3. I move a test case 'foo:about' to 'isNotUrl => true', since I think port should be a number, unless it's another case.
Flags: needinfo?(dale)
Comment on attachment 790742 [details]
PR to master

Apologies, I commented on github but didnt mark it in a review, we need to be able to open data uri's which will require another special case, thats what that test was for

I am not sure we should keep the special cases for the firefox bugs as opposed to just filing bugs against gecko and waiting for them to be fixed

The rest looks good, thanks
Attachment #790742 - Flags: review?(dale) → review-
Flags: needinfo?(dale)
Comment on attachment 790742 [details]
PR to master

Hi Dale, 
I just updated and add data:uri case. Please kindly check again.
Attachment #790742 - Flags: review- → review?(dale)
Comment on attachment 790742 [details]
PR to master

Sorry for going round on this, but its important to get right, few issues:

1. This fails for plain keyword searches, 'http://foo' is a valid url, the 'plain string' regex exception has a ? in there which stops it matching

2. Data uri's start with data:, so the check should probably be a strict check for that

3. Not as important, but wondering why we are doing |(regex.exec(str) || [])[0])| instead of |regex.test(str)|

4. I definitely want us to file and fix the "?abc" and "a? b" cases in platform and remove the special cases for them, this is just going to leave dead special case code in the browser that we will forget to remove, without the special cases we still fix for a lot of cases, the rest can wait for platform
Attachment #790742 - Flags: review?(dale) → review-
Comment on attachment 790742 [details]
PR to master

Sorry I wasnt clear previously

when you enter 'foo' as a plain string, we prepend http://foo, which is a valid url and visit that url, plain strings should definitely be keyword searches (whereas if a use entered http://foo they should visit the url, as they do)

Will also need added to tests
Attachment #790742 - Flags: review?(dale) → review-
Comment on attachment 790742 [details]
PR to master

as discussed offline, the patch already added 'data' and 'http://foo' in tests and return correct result. Please let me know if anything I miss.
Attachment #790742 - Flags: review- → review?(dale)
Filed https://bugzilla.mozilla.org/show_bug.cgi?id=910034 as a follow up for the platform bug
Comment on attachment 790742 [details]
PR to master

Thanks for this, looking good now
Attachment #790742 - Flags: review?(dale) → review+
https://github.com/mozilla-b2g/gaia/commit/e1fb3fcc12f7517b7e76fde492cdbe9db700ce92
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: