Closed Bug 1535717 Opened 6 years ago Closed 6 years ago

Add SPOC Fill telemetry event to FX68

Categories

(Firefox :: New Tab Page, enhancement, P1)

67 Branch
enhancement

VERIFIED FIXED
Firefox 68
Iteration:
68.4 - Apr 29 - May 12
Tracking Status
firefox67 --- wontfix
firefox68 --- verified

People

(Reporter: kdemtchouk, Assigned: nanj, Mentored)

References

Details

(Keywords: github-merged)

Attachments

(3 files)

Nan, Nick and I will put something on the cal to discuss this with you next week!

Implement a new AS telemetry event that records whether or not a SPOC was loaded. If it was not loaded, include the reason why.

Fields:
-impression_id
-page, any of about:home, about:newtab, about:welcome etc.
-source, aka section ID.
-preferences
-experiment information
-0/1 (SPOC was loaded or not)

Possible reasons:
-Did not meet SPOC targeting requirements

-Would exceed SPOC frequency cap
--By Day
--By Campaign / Lifetime

-User opted out of SPOCs
-User opted out of Pocket Recs
-Custom Homepage
-Not en-US (or appropriate language)
-Not US (or appropriate locale)

Blocks: 1512725
Iteration: --- → 68.2 - Apr 1 - 14
Priority: -- → P1
No longer blocks: 1512725

Nan - is this work happening in 1530740?

Flags: needinfo?(najiang)

(In reply to Jessilyn Davis from comment #1)

> Nan - is this work happening in 1530740?

This one is different from bug 1530740 (already merged on GitHub). I will pick it up in this iteration or the next, once I wrap up the remote CFR messages for 68.

Flags: needinfo?(najiang)

Awesome - Thanks, Nan! \o/

Assignee: nobody → najiang
Blocks: 1542867

Requires a data schema change on both the server side (Danny & Kirill) and the client side (Nan & potentially Scott).

Goal: knock it out this week.

Iteration: 68.2 - Apr 1 - 14 → 68.3 - Apr 15 - 28

For clarification, here is a brief description of the SPOCS life cycle.

The life cycle starts with downloading content from the Pocket endpoint, which currently happens once every 30 minutes. The downloaded SPOCS set then goes through the following filters:

  • Check against the frequency caps; a SPOC is dropped if it is beyond the impression cap.
  • Check whether the user has already blocked it.
  • Calculate the score based on the user's profile; a SPOC is dropped if its score is below the minimum score specified by the server.
  • Dedupe SPOCS at the campaign level; only one SPOC is chosen for each campaign.

All SPOCS that pass the filters above, sorted by score in descending order, form the candidate SPOCS pool. The SPOC(s) to display are chosen from this pool when the user opens a newtab page.

When the user opens a newtab:

  • Probability selection. (no effect currently, could be controlled remotely)
  • Check the position availability, serve SPOC(s) accordingly.

If any SPOC was displayed, another round of frequency capping is applied to the candidate SPOCS pool, so some items in the pool might get dropped.
If the user blocks a SPOC, it is also dropped from the pool.

This repeats for every newtab until a new content set is downloaded from the endpoint.
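
The filter chain described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Activity Stream implementation: the `Spoc` fields, the `filter_spocs` helper, and its parameters are all hypothetical. It also assumes that "beyond the impression cap" means the recorded impressions have reached the cap, and that the highest-scoring SPOC is the one kept for each campaign; neither detail is stated in this bug. The reason strings follow the names used in the example events in this bug.

```python
from dataclasses import dataclass


@dataclass
class Spoc:
    # All field names are hypothetical, chosen for illustration only.
    id: int
    campaign_id: int
    score: float
    impressions: int  # impressions already recorded for this SPOC


def filter_spocs(spocs, blocked_ids, impression_cap, min_score):
    """Return (candidate_pool, dropped); dropped maps SPOC id -> reason."""
    dropped = {}
    survivors = []
    for spoc in spocs:
        # Assumption: a SPOC is over the cap once impressions reach it.
        if spoc.impressions >= impression_cap:
            dropped[spoc.id] = "frequency_cap"
        elif spoc.id in blocked_ids:
            dropped[spoc.id] = "blocked_by_user"
        elif spoc.score < min_score:
            dropped[spoc.id] = "below_min_score"
        else:
            survivors.append(spoc)

    # Sort by score, highest first, then keep one SPOC per campaign.
    # Assumption: the highest-scoring SPOC wins within a campaign.
    survivors.sort(key=lambda s: s.score, reverse=True)
    pool = []
    seen_campaigns = set()
    for spoc in survivors:
        if spoc.campaign_id in seen_campaigns:
            dropped[spoc.id] = "campaign_duplicate"
        else:
            seen_campaigns.add(spoc.campaign_id)
            pool.append(spoc)
    return pool, dropped
```

Returning the dropped SPOCs together with a reason per id is what would let a telemetry event report why each SPOC was not shown.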

Thanks, Nan! Here are some example events for a couple test scenarios.

Common fields:
-client_id (the same client_id as in assa_impression_stats_daily)
-spoc_calculation_id (a unique identifier for all events fired during the course of one calculation/"life cycle" as you describe it above)
-tile_id
-displayed (boolean for whether SPOC was displayed)
-reason (can store this as a string or as an int, but need a unique value for each bullet point above)
-full_recalc (boolean for whether the full SPOC calculation was performed or SPOCs were just picked from the candidate SPOCs pool)

Fire this event for every new tab open.

--Scenario A (Initial calculation, No SPOCs Displayed)

client_id|spoc_calculation_id|tile_id|displayed|reason|full_recalc

999......|1888...............|1......|0........|frequency_cap|1
999......|1888...............|2......|0........|blocked_by_user|1
999......|1888...............|3......|0........|below_min_score|1
999......|1888...............|4......|0........|campaign_duplicate|1
999......|1888...............|5......|0........|probability_selection|1

--Scenario B (Initial calculation, 1 SPOC Displayed)

client_id|spoc_calculation_id|tile_id|displayed|reason|full_recalc

999......|1889...............|1......|0........|frequency_cap|1
999......|1889...............|2......|0........|blocked_by_user|1
999......|1889...............|3......|0........|below_min_score|1
999......|1889...............|4......|0........|campaign_duplicate|1
999......|1889...............|5......|1........|NULL|1

--Scenario C (SPOC 5 is blocked and calculation is re-started)

client_id|spoc_calculation_id|tile_id|displayed|reason|full_recalc

999......|1890...............|5......|0........|blocked_by_user|0

--Scenario D (SPOC 5 is seen and calculation is re-started, then SPOC 5 cannot be shown due to frequency cap)

client_id|spoc_calculation_id|tile_id|displayed|reason|full_recalc

999......|1891...............|5......|0........|frequency_cap|0

Would it be possible to do something like that?

Yes, we can structure it as described above.

I'd recommend adding a new database table for this ping, since the current impression_stats table wasn't designed for this payload.

Kirill, what do you think?

Works for me! Thanks, Nan!

Schema proposal for this metric:

name|type|description
impression_id|string|a guid for each profile
locale|string|locale string
version|string|browser version
addon_version|string|version of Discovery Stream
shield_id|string|a semicolon-separated string storing a list of Shield Study IDs
release_channel|string|nightly, aurora, beta, release
spoc_fills|array|an array of entries whose structure is described below

Spoc_fill structure:

name|type|description
id|integer|a.k.a. tile_id
displayed|integer|0: not displayed; 1: displayed
reason|string|any of frequency_cap, blocked_by_user, below_min_score, campaign_duplicate, probability_selection
full_recalc|integer|0: non-recalculation; 1: recalculation

Other fields will be automatically populated by the server:

name|type|description
submission_timestamp|timestamp|
country_code|string|
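
As a rough illustration of the proposed schema, the sketch below assembles a payload with the client-side fields listed above. The helper names (`make_spoc_fill`, `make_ping`) and all sample values are hypothetical. Using `None` as the reason for a displayed SPOC is also an assumption, since the proposal does not say what value to send in that case.

```python
import json

# Reason strings listed in the schema proposal.
VALID_REASONS = {
    "frequency_cap",
    "blocked_by_user",
    "below_min_score",
    "campaign_duplicate",
    "probability_selection",
}


def make_spoc_fill(tile_id, displayed, full_recalc, reason=None):
    """Build one spoc_fills entry; a reason is required only when not displayed."""
    if displayed not in (0, 1) or full_recalc not in (0, 1):
        raise ValueError("displayed and full_recalc must be 0 or 1")
    if displayed == 0 and reason not in VALID_REASONS:
        raise ValueError(f"unknown reason: {reason!r}")
    return {
        "id": tile_id,
        "displayed": displayed,
        "reason": reason,
        "full_recalc": full_recalc,
    }


def make_ping(impression_id, locale, version, addon_version,
              shield_ids, release_channel, spoc_fills):
    """Assemble the client-side part of the payload as a plain dict."""
    return {
        "impression_id": impression_id,
        "locale": locale,
        "version": version,
        "addon_version": addon_version,
        # The proposal stores Shield Study IDs as one semicolon-separated string.
        "shield_id": ";".join(shield_ids),
        "release_channel": release_channel,
        "spoc_fills": list(spoc_fills),
    }


# Hypothetical sample values, for illustration only.
ping = make_ping(
    "{example-guid}", "en-US", "68.0a1", "example-addon-version",
    ["study-a", "study-b"], "nightly",
    [make_spoc_fill(1, 0, 1, "frequency_cap"), make_spoc_fill(5, 1, 1)],
)
print(json.dumps(ping, indent=2))
```

The server-populated fields (submission_timestamp, country_code) are deliberately left out, since the client would not send them.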

Looks good to me! Great work, Nan!

Hey Kenny,

Can you do a data review for this one? Here is the document for this ping.

Thanks!

Attachment #9060412 - Flags: data-review?(kenny)
Comment on attachment 9060412 [details]
data_review_request_spocs_fill.txt

data-review+

1) Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes: https://github.com/mozilla/activity-stream/pull/4928/files#diff-f92d07e23fd9de1eb2f4ad96bd041287

2) Is there a control mechanism that allows the user to turn the data collection on and off?
Yes, the user can opt-out the data collection by either disabling the telemetry of Activity Stream or disabling the Firefox telemetry as a whole.

3) If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes. Kirill Demtchouk will be monitoring these metrics.

4) Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 2

5) Is the data collection request for default-on or default-off?
Default-on

6) Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
No, all identifiers used are pre-existing.

7) Is the data collection covered by the existing Firefox privacy notice?
Yes

8) Does there need to be a check-in in the future to determine whether to renew the data?
No

9) Does the data collection use a third-party collection tool?
No
Attachment #9060412 - Flags: data-review?(kenny) → data-review+
Iteration: 68.3 - Apr 15 - 28 → 68.4 - Apr 29 - May 12
Blocks: 1548388
Status: NEW → RESOLVED
Closed: 6 years ago
Keywords: github-merged
Resolution: --- → FIXED
Target Milestone: --- → Firefox 68

I have verified this issue with the latest Firefox Nightly (68.0a1 Build ID - 20190503041749) installed, on Windows 10 x64, Arch Linux and Mac 10.14.4. Now, the following ping is also displayed in the "Browser Console":

"TELEMETRY PING (STRUCTURED INGESTION): {"locale":"en-US","client_id":"n/a","version":"68.0a1","release_channel":"nightly","addon_version":"20190503041749","user_prefs":255,"spoc_fills":[{"id":36830,"reason":"n/a","displayed":1,"full_recalc":0},{"id":29287,"reason":"n/a","displayed":1,"full_recalc":0}],"impression_id":"{fc8cf477-da80-8c40-ba47-81bc7fff6e63}","session_id":"n/a"}".

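For anyone re-checking this by hand, a small script along these lines could pull the JSON out of the Browser Console line and list which SPOCs were displayed. It is only a sketch: `parse_ping` and `displayed_ids` are hypothetical helper names, and it assumes the console line has the form of a fixed prefix, a colon and a space, then the JSON payload, as in the ping quoted above.

```python
import json

# The console line quoted above, without the surrounding quotation marks.
LOG_LINE = (
    'TELEMETRY PING (STRUCTURED INGESTION): '
    '{"locale":"en-US","client_id":"n/a","version":"68.0a1",'
    '"release_channel":"nightly","addon_version":"20190503041749",'
    '"user_prefs":255,"spoc_fills":['
    '{"id":36830,"reason":"n/a","displayed":1,"full_recalc":0},'
    '{"id":29287,"reason":"n/a","displayed":1,"full_recalc":0}],'
    '"impression_id":"{fc8cf477-da80-8c40-ba47-81bc7fff6e63}",'
    '"session_id":"n/a"}'
)


def parse_ping(line):
    """Split off the text before the first ': ' and decode the JSON payload."""
    _prefix, separator, payload = line.partition(": ")
    if not separator:
        raise ValueError("no ': ' separator found in log line")
    return json.loads(payload)


def displayed_ids(ping):
    """Return the ids of spoc_fills entries marked as displayed."""
    return [fill["id"] for fill in ping["spoc_fills"] if fill["displayed"] == 1]


ping = parse_ping(LOG_LINE)
print(displayed_ids(ping))  # [36830, 29287]
```
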
Status: RESOLVED → VERIFIED
Blocks: 1548952
Status: VERIFIED → RESOLVED
Closed: 6 years ago → 6 years ago
Status: RESOLVED → VERIFIED
Component: Activity Streams: Newtab → New Tab Page