Closed
Bug 1426170
Opened 8 years ago
Closed 8 years ago
Add engine level search counts to clients_daily
Categories
(Data Platform and Tools :: General, enhancement, P2)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: harter, Assigned: mreid)
References
Details
We want to be able to analyze search_counts in clients_daily.
After discussing with Dave in Austin, I suggest we add columns for total-SAP search_counts for each of the major engines (Google, Bing, and Yahoo). This deviates from our original plan of adding `engine` as a key to clients_daily. Adding `engine` as a key to clients_daily would duplicate data for several fields (e.g. URIs).
This will unblock BD's current analysis needs. Eventually, we'd like to add a nested search_counts structure that covers all engines.
To be clear, let's add five new columns to clients_daily: search_total, search_google, search_bing, search_yahoo, and search_google_nocodes. Google searches should include all searches with engine in ('google', 'google-2018').
We should only include search_counts whose `source` is in the whitelist maintained in SEARCH_SOURCE_WHITELIST: https://github.com/mozilla/python_mozetl/blob/master/mozetl/constants.py#L5
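A rough sketch of how these columns might be computed for one client-day, in plain Python rather than the actual ETL code. The whitelist values below are placeholders (the real list lives in mozetl/constants.py), the row layout with `engine`, `source`, and `count` keys is an assumption, and so are the exact engine names 'bing' and 'yahoo'. It also assumes search_total counts whitelisted searches across all engines. search_google_nocodes is left out because its definition is not spelled out here.

```python
# Placeholder values only; the authoritative list is SEARCH_SOURCE_WHITELIST
# in mozetl/constants.py.
SEARCH_SOURCE_WHITELIST = {"searchbar", "urlbar", "abouthome"}

# Engine names that count as Google, as stated in this bug.
GOOGLE_ENGINES = {"google", "google-2018"}


def engine_columns(search_counts, whitelist=SEARCH_SOURCE_WHITELIST):
    """Sum one client-day's search counts into per-engine columns.

    `search_counts` is assumed to be an iterable of dicts with the keys
    'engine', 'source', and 'count' (a hypothetical row layout).
    """
    cols = {
        "search_total": 0,
        "search_google": 0,
        "search_bing": 0,
        "search_yahoo": 0,
    }
    for row in search_counts:
        # Skip searches whose source is not whitelisted.
        if row["source"] not in whitelist:
            continue
        count = row["count"]
        # Assumption: the total covers every engine, not only the big three.
        cols["search_total"] += count
        engine = row["engine"]
        if engine in GOOGLE_ENGINES:
            cols["search_google"] += count
        elif engine == "bing":  # assumed engine name
            cols["search_bing"] += count
        elif engine == "yahoo":  # assumed engine name
            cols["search_yahoo"] += count
    return cols
```

Because each client-day still produces a single row, fields such as URIs are not repeated per engine.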
The goal is to have this implemented by the end of January 2018.
Comment 1 (Reporter)•8 years ago
This is no longer necessary. I talked with arana today and it sounds like BD does not need the additional metrics included in clients_daily. Instead, it sounds like it would be more useful to add client_id as a column to search_aggregates. This comes with the advantage of including all search providers without having duplicate data (as noted above).
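For illustration only, the alternative could look roughly like the sketch below: counts are keyed by (client_id, engine), so every provider is kept and no other per-client fields need to be repeated. The row layout and the function name are hypothetical and do not reflect the actual search_aggregates schema.

```python
from collections import Counter


def per_client_engine_counts(rows):
    """Sum search counts per (client_id, engine) pair.

    `rows` is assumed to be an iterable of dicts with the keys
    'client_id', 'engine', and 'count' (a hypothetical layout).
    """
    totals = Counter()
    for row in rows:
        totals[(row["client_id"], row["engine"])] += row["count"]
    return dict(totals)
```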
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Updated•3 years ago
Component: Datasets: General → General