Closed Bug 1038248 Opened 10 years ago Closed 10 years ago

Figure out how many urls with bogus protocols made from URL bar are going to Google rather than error pages.

Categories

(Firefox :: General, defect)

x86
All
defect
Not set
normal
Points:
2

Tracking

()

VERIFIED FIXED
Firefox 33
Iteration:
33.3
Tracking Status
firefox33 --- verified

People

(Reporter: bwinton, Assigned: bwinton)

References

Details

Attachments

(1 file, 2 obsolete files)

We would also like to tag or count Google-specific strings (eg site: define: and math).  This may be tricky, because of things like "webcal:" which (should) redirect to your calendar provider.  We could add an FHR count for urlbar searches 
which match the regular expression "^[a-zA-Z]: ", if that's allowed.
Flags: firefox-backlog+
Attached patch bug1038248.diff (obsolete) — Splinter Review
Hey Mike,

I'm not entirely sure that this is the right way to do this, but come bug me tomorrow, and I'll explain my reasoning…

Thanks!
Assignee: nobody → bwinton
Attachment #8455811 - Flags: review?(mconley)
Added to Iteration 33.3
Status: NEW → ASSIGNED
Iteration: --- → 33.3
QA Whiteboard: [qa?]
To test this, make a couple of searches from the urlbar.
Some should start with "define:" or "site:", others shouldn't.
In about:telemetry » Simple Measurements » UITelemetry there should be a "search" section, with a "urlbar" number equal to the total number of urlbar searches, and a "urlbar-keyword" number equal to the total number of urlbar searches that started with a keyword.
QA Whiteboard: [qa?] → [qa+]
Comment on attachment 8455811 [details] [diff] [review]
bug1038248.diff

Is this patch a --reverse? It seems to remove things instead of adding things...

Anyhow, I flipped it in my head.  :) I can't speak to whether or not recordSearchInHealthReport and the keyword-search handler thing run for all search cases, but if they do, then structurally this looks like the right idea.

Your regex, however, seems to match anything starting with characters followed by a colon - so people searching for http://www.facebook.com, for example, will hit this. I assume you wanted to put some whitespace after the colon?
Attachment #8455811 - Flags: review?(mconley) → review-
Attached patch bug1038248.diff (obsolete) — Splinter Review
(In reply to Mike Conley (:mconley) from comment #4)
> Is this patch a --reverse? It seems to remove things instead of adding
> things...

I am not good with hg histedit rebase non-mq stuff…
Turns out rebasing and "hg export -r bugNNNNNNN > bugNNNNNNN.diff" was more what I wanted.

> Anyhow, I flipped it in my head.  :) I can't speak to whether or not
> recordSearchInHealthReport and the keyword-search handler thing run for all
> search cases, but if they do, then structurally this looks like the right
> idea.

They do.  At least, all the search cases I could find, and all the ones listed in the bug.  ;)  Furthermore, it mirrors the calls to search reporting in FHR, and I trust those folks to have caught all the cases we're interested in.

> Your regex, however, seems to match anything starting with characters
> followed by a colon - so people searching for http://www.facebook.com, for
> example, will hit this. I assume you wanted to put some whitespace after the
> colon?

I didn't, because "site:latte.ca quit" is a site-specific search that we want to track.

for this run, I've filtered out things that have a : followed by a / or a \…
That gives us:
  ftp://id.com test = false
  facebook test = false
(That second one was typed in as "http://facebook test", btw.  Something other than me stripped the protocol.)
  site:latte.ca = true
  http:\\facebook test = false
  "http://facebook.com" = false
all of which seem like reasonable results to me.
Attachment #8455811 - Attachment is obsolete: true
Attachment #8457706 - Flags: review?(mconley)
Oh, there's one more slightly tricky bit I should mention.

The "query" parameter is only populated when we're running a urlbar search (and it gets "win.gURLBar.value"), so this code will only track urls with bogus protocols, and not searches for http://facebook.com from, say, the searchbar.
Comment on attachment 8457706 [details] [diff] [review]
bug1038248.diff

Review of attachment 8457706 [details] [diff] [review]:
-----------------------------------------------------------------

One last small (but critical) problem, and then I think this is good to go.

::: browser/components/nsBrowserGlue.js
@@ +327,5 @@
>          // be consolidated if there is will. We need the observer in
>          // nsBrowserGlue to prevent double counting.
> +        let win = this.getMostRecentBrowserWindow();
> +        BrowserUITelemetry.countSearchEvent("urlbar", win.gURLBar.value);
> +#ifdef MOZ_SERVICES_HEALTHREPORT

Ah, I think you're going to want to move the other end of this ifdef to before the "break", otherwise we fall through to the browser-search-engine-modified case! Aren't switch statements lovely? ;)
Attachment #8457706 - Flags: review?(mconley) → review-
Attached patch bug1038248.diffSplinter Review
Changed, although I will note that the previous behaviour was to fall through, so I'm going to f? gps, to see if that was intentional or accidental.

Thanks for the review!
Attachment #8457706 - Attachment is obsolete: true
Attachment #8458046 - Flags: review?(mconley)
Attachment #8458046 - Flags: feedback?(gps)
Comment on attachment 8458046 [details] [diff] [review]
bug1038248.diff

Review of attachment 8458046 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Removing feedback? from gps since bwinton and I cleared up the ifdef thing IRL.
Attachment #8458046 - Flags: review?(mconley)
Attachment #8458046 - Flags: review+
Attachment #8458046 - Flags: feedback?(gps)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/0bf3196d00f3
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 33
Hi Florin, can a QA contact be assigned for verification of this bug.
Flags: needinfo?(florin.mezei)
Flags: needinfo?(florin.mezei)
QA Contact: florin.mezei
I've verified this on Win 7 x64, Ubuntu 13.04 x64, and Mac OS X 10.9.4 using the latest Nightly build (BuildID: 20140718030202).

Verified as working as expected:
- simple keyword based search - counted under "urlbar" and "urlbar-keyword"
allintitle:elections
allinurl:politics
allinbody:schools
site:nytimes.com
site:wikipedia.org
link:google.com
related:time.com
info:google.com
cache:washington.edu
define:carrot
- simple non-keyword search - counted under "urlbar"
facebook
- non-search strings - NOT counted
ftp://ftp.mozilla.org/pub/mozilla.org/firefox
http:\\facebook
http://facebook.com
- special strings - NOT counted
data:text/html,<body>hi<body>
data:data:text/html,<body>hi<body>
- simple keyword based search in small search bar at top - counted under "searchbar"
allintitle:elections
allinurl:politics
allinbody:schools
- simple keyword based search in large search bar in New Tab page - counted under "newtab"
link:google.com
related:time.com
info:google.com

Special cases - potential issues:
1. Strings with keyword based structure are still counted as normal keyword based searches - e.g. "zxc:asd" is not a keyword, but will be counted as a keyword (expected based on implementation, but not sure if intended).
2. More complex keyword based searches are only counted under "urlbar", and NOT also under "urlbar-keyword" - e.g. "pandas -site:wikipedia.org" or "olympics site:nbc.com".
3. Keyword searches are counted also when there's a space following the ":" (these in theory are NOT keyword searches) - e.g. "site: nytimes.com" is counted same as "site:nytimes.com", but only the latter is a keyword search ("site: nytimes.com" is not a keyword search as far as I know).
4. On Ubuntu 13.04 x64, entering "info:google.com" in the URL bar will open the "Launch Application" dialog which says: "This link needs to be opened with an application" and "Help" is presented as the only application available. This behavior is specific to Ubuntu only, as the string triggers a Google search as expected on Windows and Mac.

Please review the special cases above and let me know whether they are OK, or I should reopen the bug, or I should log separate issues for any of them.
Flags: needinfo?(bwinton)
1. Intended.  I assume there will be few enough of them that they won't skew the results too heavily.  (Also, I'm not sure what the full list of keywords are…  DuckDuckGo, for instance, has a bunch listed at https://duck.co/help/results/syntax ;)

2. Intended, since those aren't bogus protocols.

3. Hmmm…  I threw that in so that we would catch things like "define: serendipity", but are those really broken protocols?  I think I might say "yes", but could be argued out of it if other people feel strongly the other way.

4. Interesting, but unrelated to this bug I think.  (webcal:jfldjsdfsjkl on Mac does the same thing.)

So, I would say they were all okay.  Thanks!
Flags: needinfo?(bwinton)
Marking as verified, since based on comment 14 the current implementation is OK. Thanks Blake!
Status: RESOLVED → VERIFIED
QA Whiteboard: [qa+] → [qa!]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: