Closed Bug 1193892 Opened 9 years ago Closed 9 years ago

Manually test search counts

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kparlante, Assigned: sphilp)

References

Details

(Whiteboard: [unifiedTelemetry][data-validation])

Attachments

(1 file)

Search from all possible sources, using a variety of engines, and see what appears on the client and server in v2/v4.

Hijacking may account for some differences, so testing hijacking is also helpful. Many searches that were formerly reported as "google" are now categorized as "other" due to hijacking. Bing and Yahoo are less likely to be subject to hijacking, so their counts are expected (by jjensen) to be closer.

Context:
We've been comparing search counts in v4 and v2 data, and have some discrepancies. We want to be confident that we're generating the right data from the client.

Meeting notes: https://etherpad.mozilla.org/executive-rollup-v2-v4

Data validation analysis:

- rollup comparison between v2 and v4 (looking at nightly data), search is at the bottom: https://metrics.mozilla.com/protected/prototypes/rollup-comparison/

- Sam's analysis:
https://gist.github.com/SamPenrose/01e2127fe983efbbbfb0
https://plot.ly/221/~mozilla/ (scatter plot of paired v2/v4 searches)
https://plot.ly/223/~mozilla/ (per client difference per day as histogram)
Blocks: 1173438
Priority: -- → P1
Bug 1191681 has a local validation tool that you can use to quickly compare v2/v4 search counts:
https://gist.github.com/georgf/1b0831a6b81b6c9fe240

Note the caveats in bug 1191681, comment 14.
We can easily extend this with additional data dumping if needed.
tl;dr: Executed several searches and everything looks okay. I checked V4 manually as well as V2/V4 via the script georg posted. Numbers are identical. I’ll continue to look at other test cases to see if there’s some corner case that explains what’s going on. I did not yet try any sort of hijacking.

Tested on nightly 43.0a1 (2015-08-17)

V4 Property = payload.keyedHistograms.SEARCH_COUNTS

General steps:
Open nightly, starting a new session
Perform n searches for x,y,z search engines
Shutdown and restart nightly (to force telemetry data to send)
Use the following script to compare V2/V4 quickly: https://gist.github.com/georgf/1b0831a6b81b6c9fe240
Manually examine the property values for any quirks if they are flagged by the tool
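
The comparison step can be pictured as a key-by-key diff of two count maps. This is only a minimal sketch and not the linked gist: `diffSearchCounts` and the flat `{ "engine.source": count }` input shape are assumptions made for illustration, and the real v2 and v4 payloads would first need to be flattened into that shape.

```javascript
// Hypothetical helper: compare two flat maps of "engine.source" -> count
// and return every key whose v2 and v4 counts differ.
function diffSearchCounts(v2Counts, v4Counts) {
  const keys = new Set([...Object.keys(v2Counts), ...Object.keys(v4Counts)]);
  const mismatches = [];
  for (const key of [...keys].sort()) {
    const v2 = v2Counts[key] || 0; // a missing key is treated as zero searches
    const v4 = v4Counts[key] || 0;
    if (v2 !== v4) {
      mismatches.push({ key, v2, v4 });
    }
  }
  return mismatches;
}

// Made-up example data, not taken from any real ping.
const exampleV2 = { "google.searchbar": 10, "yahoo.searchbar": 10 };
const exampleV4 = { "google.searchbar": 10, "yahoo.searchbar": 9 };
console.log(diffSearchCounts(exampleV2, exampleV4));
```

An empty result means the two maps agree for every key, which matches the "numbers are identical" outcome reported above.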

Google
***Perform 10 searches with Google, after telemetry has started
- confirmed payload.keyedHistograms.SEARCH_COUNTS."google.searchbar" shows sum of 10

***Perform 10 searches with Google, before telemetry has started
- confirmed payload.keyedHistograms.SEARCH_COUNTS."google.searchbar" shows sum of 10

***Perform 10 searches with Google set as the default, then change the default and perform 10 more by selecting Google from the non-default search provider list in the searchbar
- confirmed payload.keyedHistograms.SEARCH_COUNTS."google.searchbar" shows sum of 20 for saved-session
- confirmed payload.keyedHistograms.SEARCH_COUNTS."google.searchbar" shows sum of 10 for shutdown
- confirmed payload.keyedHistograms.SEARCH_COUNTS."google.searchbar" shows sum of 10 for environment-change
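
The three sums in the last case are consistent with each other if the saved-session ping covers the whole session while the environment-change and shutdown pings each cover one part of it (20 = 10 + 10). A minimal sketch of that check follows; it assumes each ping exposes `payload.keyedHistograms.SEARCH_COUNTS[key].sum` as in the property path above, and `searchSum`, `subsessionsMatchSession`, and the sample pings are hypothetical names and data made up for illustration.

```javascript
// Hypothetical helper: read the sum for one SEARCH_COUNTS key, or 0 if absent.
function searchSum(ping, key) {
  const keyed = (ping.payload && ping.payload.keyedHistograms) || {};
  const histogram = (keyed.SEARCH_COUNTS || {})[key];
  return histogram ? histogram.sum : 0;
}

// Check that the partial pings add up to the whole-session ping.
function subsessionsMatchSession(savedSession, subsessionPings, key) {
  const total = subsessionPings.reduce((acc, ping) => acc + searchSum(ping, key), 0);
  return total === searchSum(savedSession, key);
}

// Made-up pings mirroring the numbers reported above.
const fakePing = (sum) => ({
  payload: { keyedHistograms: { SEARCH_COUNTS: { "google.searchbar": { sum } } } },
});
const savedSessionPing = fakePing(20);
const environmentChangePing = fakePing(10);
const shutdownPing = fakePing(10);

console.log(
  subsessionsMatchSession(
    savedSessionPing,
    [environmentChangePing, shutdownPing],
    "google.searchbar"
  )
); // logs true
```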

Yahoo
***Perform 10 searches with Yahoo, after telemetry has started
- confirmed payload.keyedHistograms.SEARCH_COUNTS."yahoo.searchbar" shows sum of 10

***Perform 10 searches with Yahoo, before telemetry has started
- confirmed payload.keyedHistograms.SEARCH_COUNTS."yahoo.searchbar" shows sum of 10

***Perform 10 searches with Yahoo set as the default, then change the default and perform 10 more by selecting Yahoo from the non-default search provider list in the searchbar
- confirmed payload.keyedHistograms.SEARCH_COUNTS."yahoo.searchbar" shows sum of 20 for saved-session
- confirmed payload.keyedHistograms.SEARCH_COUNTS."yahoo.searchbar" shows sum of 10 for shutdown
- confirmed payload.keyedHistograms.SEARCH_COUNTS."yahoo.searchbar" shows sum of 10 for environment-change

Bing
***Perform 10 searches with Bing, after telemetry has started
- confirmed payload.keyedHistograms.SEARCH_COUNTS."bing.searchbar" shows sum of 10

***Perform 10 searches with Bing, before telemetry has started
- confirmed payload.keyedHistograms.SEARCH_COUNTS."bing.searchbar" shows sum of 10

***Perform 10 searches with Bing set as the default, then change the default and perform 10 more by selecting Bing from the non-default search provider list in the searchbar
- confirmed payload.keyedHistograms.SEARCH_COUNTS."bing.searchbar" shows sum of 20 for saved-session
- confirmed payload.keyedHistograms.SEARCH_COUNTS."bing.searchbar" shows sum of 10 for shutdown
- confirmed payload.keyedHistograms.SEARCH_COUNTS."bing.searchbar" shows sum of 10 for environment-change

***

That's the gist of it. I ran some tests on the other providers and the results were identical.

Also a couple of things stood out that are not really related:

Changing environment settings (like the default search provider) before telemetry has started does not seem to trigger an environment-change event. Instead, once telemetry starts, it only shows SEARCH_COUNTS for the new default search provider, in one ping (10 in both shutdown and saved-session, as opposed to 10 saved-session, 5 shutdown, 5 environment-change).

In the "UITelemetry" field for "oneoff" searches, hitting return on a default search seems to track as "google.unknown", whereas clicking a non-default search provider shows it as "google.oneoff":

"search-oneoff": {
  "google.unknown": {
    "key": {
      "current": 5
    }
  },
  "google.oneoff": {
    "mouse": {
      "current": 5
    }
  }
}
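
To see how many searches an object like the one above represents in total, the counters can be summed per engine regardless of the suffix ("unknown", "oneoff") and the input type ("key", "mouse"). This is a rough sketch under the assumption that every entry has the `{ inputType: { current: n } }` shape shown above; `oneoffTotalsByEngine` is a hypothetical helper, not part of UITelemetry.

```javascript
// Hypothetical helper: sum the "current" counters per engine, ignoring the
// suffix after the engine name and the input type.
function oneoffTotalsByEngine(searchOneoff) {
  const totals = {};
  for (const [name, inputs] of Object.entries(searchOneoff)) {
    const engine = name.split(".")[0];
    for (const counters of Object.values(inputs)) {
      totals[engine] = (totals[engine] || 0) + (counters.current || 0);
    }
  }
  return totals;
}

// The "search-oneoff" object from the comment above.
const searchOneoffSample = {
  "google.unknown": { key: { current: 5 } },
  "google.oneoff": { mouse: { current: 5 } },
};
console.log(oneoffTotalsByEngine(searchOneoffSample)); // logs { google: 10 }
```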
Thanks Stuart, some questions to be sure here.

(In reply to Stuart Philp :sphilp from comment #2)
> General steps:
> Open nightly, starting a new session
> Perform n searches for x,y,z search engines
> Shutdown and restart nightly (to force telemetry data to send)
> Use the following script to compare V2/V4 quickly:
> https://gist.github.com/georgf/1b0831a6b81b6c9fe240
> Manually examine the property values for any quirks if they are flagged by
> the tool

Did you only use the searchbar?
We have some different search sources, e.g.:
* abouthome
* contextmenu
* searchbar
* urlbar
* ... others?

Per [0] they should all be counted equivalently in FHR, Telemetry & UITelemetry, but that's just the theory.

> Changing environment settings (like default search provider) before
> telemetry has started, does not seem to track an environment-change event,
> rather once telemetry starts it only shows SEARCH_COUNTS for the new default
> search provider in one ping (10 shutdown and saved-sessions, as opposed to
> 10 saved-session, 5 shutdown, 5 environment-change)

Not triggering an environment-change event before Telemetry start is intentional.
It should still track the search counts correctly though, even before the delayed Telemetry start - does it?
> 
> In the "UITelemetry" field for "oneoff" searches, hitting return on a
> default search seems to track as "google.unknown", whereas clicking a
> non-default search provider shows it as "google.oneoff"
> 
> "search-oneoff": {
>   "google.unknown": {
>     "key": {
>       "current": 5
>     }
>   },
>   "google.oneoff": {
>     "mouse": {
>       "current": 5
>     }
>   }
> }

Good find, that is a bug.

One-off is for clicking the icons for other searches in the searchbar popup?
Do we just count those all as "other" in FHR/Telemetry? Or do they show up as "searchbar" or something?

0: https://dxr.mozilla.org/mozilla-central/rev/90d9b7c391d38ae118865bd87b5d011feee6dded/browser/base/content/browser.js#3676
(In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> We have some different search sources, e.g.:
> * abouthome
> * contextmenu
> * searchbar
> * urlbar
> * ... others?

"abouthome"
"contextmenu"
"newtab"
"searchbar"
"urlbar"
... other
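
The SEARCH_COUNTS keys quoted in this bug have the form engine.source (for example google.searchbar), so per-source totals can be derived by splitting each key. A small sketch, assuming the source is whatever follows the last dot and that each entry carries a `sum` field; `countsBySource` and the sample data are made up for illustration.

```javascript
// Hypothetical helper: total the histogram sums per search source.
function countsBySource(searchCounts) {
  const totals = {};
  for (const [key, histogram] of Object.entries(searchCounts)) {
    const dot = key.lastIndexOf(".");
    // Keys without a dot are grouped under "unknown" (an assumption).
    const source = dot === -1 ? "unknown" : key.slice(dot + 1);
    totals[source] = (totals[source] || 0) + histogram.sum;
  }
  return totals;
}

// Made-up example data.
const exampleSearchCounts = {
  "google.searchbar": { sum: 10 },
  "yahoo.urlbar": { sum: 3 },
  "bing.searchbar": { sum: 2 },
};
console.log(countsBySource(exampleSearchCounts));
```

Splitting on the last dot rather than the first keeps an engine identifier that itself contains a dot in one piece.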
I also started a scratchpad to generate searches, which led to digging into the code a bit. There appear to be two functions that initiate search count recording:

browser.js::BrowserSearch.recordSearchInTelemetry and browser.js::BrowserSearch.recordSearchInHealthReport

The former only saves V4, the latter both V2 and V4. It looks like the latter is used in most cases, except for "keyword-search" events, which you can see in nsBrowserGlue.js:415

Aside from that (at least at the level I'm running searches at) they seem identical over larger runs (10000+).
Searches before telemetry has started do track fine, and I did try each of the types (some more than others) and they all added up correctly. One-offs are counted as {{engine}}.searchbar in the histogram.
(In reply to Stuart Philp :sphilp from comment #5)
> I also started a scratchpad to generate searches, which led to digging into
> the code a bit. There appear to be two functions that initiate search count
> recording:
> 
> browser.js::BrowserSearch.recordSearchInTelemetry and
> browser.js::BrowserSearch.recordSearchInHealthReport
> 
> The former only saves V4, the latter both V2 and V4. It looks like the
> latter is used in most cases, except for "keyword-search" events, which you
> can see in nsBrowserGlue.js:415

While this is odd, this is equivalent to calling recordSearchInHealthReport:
https://dxr.mozilla.org/mozilla-central/rev/90d9b7c391d38ae118865bd87b5d011feee6dded/browser/components/nsBrowserGlue.js#415

It first calls recordSearchInTelemetry / v4, then explicitly records the search in FHR / v2, so this should count equivalently in both v2 & v4.
It's only odd that it doesn't use the existing helper.
(In reply to Stuart Philp :sphilp from comment #2)
> In the "UITelemetry" field for "oneoff" searches, hitting return on a
> default search seems to track as "google.unknown", whereas clicking a
> non-default search provider shows it as "google.oneoff"
> 
> "search-oneoff": {
>   "google.unknown": {
>     "key": {
>       "current": 5
>     }
>   },
>   "google.oneoff": {
>     "mouse": {
>       "current": 5
>     }
>   }
> }

Filed bug 1195733.
(In reply to Georg Fritzsche [:gfritzsche] from comment #7)
> It first calls recordSearchInTelemetry / v4, then explicitly records the
> search in FHR / v2, so this should count equivalently in both v2 & v4.
> It's only odd that it doesn't use the existing helper.

Ah yup, I see that now. 

Okay so my scratchpad script is https://gist.github.com/stuartphilp/d48d4f13b97bbfc3fdc7

With larger numbers (change maxSearches to 1000+) I can occasionally get some inconsistencies with the V2/V4 comparison script; see the attached screenshot. This may be due to the script not executing entirely correctly, but it could also be that failed or rapid requests are not saved.
Is this over one session or multiple ones?

I've poked around locally and found:
* to get mismatches, I need to increase the number to be so high that I get beachballs / slow script dialogs (e.g. 10000)
* now when I try to shut down, FHR causes AsyncShutdown timeouts (probably too many async storage tasks queued?)
* that timeout happens before Telemetry saves its pings on shutdown, so I lose the Telemetry data
* if I trigger Telemetry storing a ping before shutdown (e.g. by disabling an addon/plugin), then the Telemetry data looks fine, but I might lose the FHR data

The script is blocking the main thread the whole time, which can create other issues of its own; waiting a little (at least 1ms) between the calls would probably be much saner.
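
One way to space the calls out is to await a short timer between iterations so the event loop can run other work. This is a sketch only and not a change to the linked scratchpad script: `runSearchesThrottled` and the `doSearch` callback are hypothetical stand-ins for whatever the script does to issue one search.

```javascript
// Resolve after roughly `ms` milliseconds, yielding to the event loop meanwhile.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical driver: call doSearch(i) maxSearches times, pausing between calls.
async function runSearchesThrottled(doSearch, maxSearches, delayMs = 1) {
  for (let i = 0; i < maxSearches; i++) {
    doSearch(i);
    await sleep(delayMs); // lets pending storage and UI tasks run
  }
  return maxSearches;
}

// Example with a counter standing in for a real search.
let issuedSearches = 0;
runSearchesThrottled(() => { issuedSearches++; }, 5).then((total) => {
  console.log(`issued ${issuedSearches} of ${total} searches`);
});
```

The trade-off is run time: with a 1ms delay, 10000 searches take at least ten seconds, but the browser stays responsive while they run.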
Whiteboard: [unifiedTelemetry] → [unifiedTelemetry][data-validation]
closing - as per Msr Philp "normal use it was fine and any inconsistency I saw seemed to be due to my script and not the counts"
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard