Closed Bug 1634755 Opened 5 years ago Closed 3 months ago

Implement a hash_id in About:Welcome telemetry to reflect content

Categories

(Firefox :: Messaging System, enhancement, P2)

RESOLVED INVALID

People

(Reporter: shong, Unassigned)

References

Details

Summary:

The message_id in about:welcome experiments is currently generated by joining the experiment slug and the branch name, and using the result as the message identifier.

ex:

experiment_id: SIMPLIFIED_ABOUT_WELCOME_AW_PULL_FACTOR_PRIVACY_1
branch_id: VARIANT_1
message_id: SIMPLIFIED_ABOUT_WELCOME_AW_PULL_FACTOR_PRIVACY_1_VARIANT_1

This is problematic for a number of reasons. The message_id should reflect the content of what the user saw, regardless of which experiment they were in.

The most obvious example of how this can fail is the following case:

  • We want to deploy an experiment with two treatment branches:
    {branch: treatment1, about_welcome_params: A, B, C}
    {branch: treatment2, about_welcome_params: X, Y, Z}

During deployment, due to human error such as a copy-and-paste mistake, we accidentally give both branches the same parameters, and users in both branches end up seeing the same message:
{branch: treatment1, about_welcome_params: A, B, C}
{branch: treatment2, about_welcome_params: A, B, C}

When this happens, we want to be able to look at the impression rates of about:welcome, see that both branches are seeing the same message, and realize that something went wrong.

With the current definition of message_id, however, we'll see users in one branch getting impressions from exp_treatment1 and users in the other branch getting impressions from exp_treatment2. It will appear to us that they saw different messages, when in fact the content was exactly the same (this is bad).
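
The failure mode above can be sketched in a few lines of Python. This is an illustration only: `current_message_id` is a hypothetical helper that mimics the slug-plus-branch scheme described in this bug, not the actual Firefox implementation, and the experiment slug `exp` is a placeholder.

```python
def current_message_id(experiment_slug: str, branch: str) -> str:
    """Mimic the scheme described above: join experiment slug and branch name."""
    return f"{experiment_slug}_{branch}"


# Both branches were (accidentally) deployed with identical parameters.
branches = [
    {"branch": "treatment1", "about_welcome_params": ["A", "B", "C"]},
    {"branch": "treatment2", "about_welcome_params": ["A", "B", "C"]},  # copy-paste error
]

ids = [current_message_id("exp", b["branch"]) for b in branches]

# The content is identical, yet the reported identifiers differ,
# so telemetry alone cannot reveal the deployment mistake.
same_content = branches[0]["about_welcome_params"] == branches[1]["about_welcome_params"]
ids_differ = ids[0] != ids[1]
print(ids, same_content, ids_differ)
```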


How message_id should be used:
Message IDs need to be a 1 to 1 mapping with the message content, independent of any experiment conditions. Meaning:

  • Message IDs for the same content should be consistent across experiments (as well as for users not in experiments). That is, if users in any branch of any experiment, or outside an experiment, see the same message content, they should get the same message ID.
  • Message IDs must be unique to the content. That is, if the content of the message changes, the message ID should be different than before.

Proposed Fix: Hash ID

For about:welcome messages, define the message_id as a hash of the input parameters for the message. Since the content, behavior, and identity of the about:welcome message are now defined by parameters:

ex:

         "value": {
            "cards": [
              {
                "id": "TRAILHEAD_CARD_12",
                "content": {
                  "icon": "pledge",
                  "text": {
                    "string_id": "onboarding-personal-data-promise-text"
                  },
                  "title": {
                    "string_id": "onboarding-personal-data-promise-title"
                  },
                  "primary_button": {
                    "label": {
                      "string_id": "onboarding-personal-data-promise-button"
                    },
                    "action": {
                      "data": {
                        "args": "https://www.mozilla.org/firefox/privacy/",
                        "where": "tabshifted"
                      },
                      "type": "OPEN_URL"
                    }
                  }
                }
              },

use those parameters (hashed in some way to compress the size) as the identifier for the content that was delivered. That should fulfill the message_id requirements above (it will always be a 1 to 1 mapping, and it will be consistent over time and across experiments).
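
One possible way to derive such an identifier is sketched below in Python. Everything here is an assumption rather than something specified in this bug: the helper name `content_hash_id`, the choice of SHA-256, the canonical JSON serialization with sorted keys, and the 16-character truncation. Note that a truncated hash is only 1 to 1 up to hash collisions, which become more likely as the identifier gets shorter.

```python
import hashlib
import json


def content_hash_id(params, length=16):
    """Return a short, deterministic identifier for a message's parameters.

    Keys are sorted and whitespace is removed so that two parameter sets
    with the same content always serialize to the same bytes.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digest[:length]  # truncation trades collision resistance for size


card = {
    "id": "TRAILHEAD_CARD_12",
    "content": {"icon": "pledge"},
}
# Same content, different key order: should hash identically.
reordered = {
    "content": {"icon": "pledge"},
    "id": "TRAILHEAD_CARD_12",
}
# Different content: should hash differently.
changed = {
    "id": "TRAILHEAD_CARD_12",
    "content": {"icon": "other"},
}

print(content_hash_id(card) == content_hash_id(reordered))
print(content_hash_id(card) == content_hash_id(changed))
```

Sorting keys before hashing matters because JSON object key order carries no meaning; without it, two branches with identical content could still produce different identifiers.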

Blocks: 1630456
Blocks: x-man
Type: defect → enhancement
Iteration: --- → 78.1 - May 4 - May 17
Priority: -- → P2
Iteration: 78.1 - May 4 - May 17 → 78.2 - May 18 - May 31

To capture conversation with Kate Hudson on May 21st 2020:

The goal of this hash_id is to differentiate between messages with the same message_id but different parameters.

So for example, if we were previously running a CFR called CRYPTOMINERS_PROTECTION defined as:

{
  "weight": 100,
  "content": ...,
  "trigger": ...,
  "priority": ...,
  "template": "cfr_doorhanger",
  "frequency": ...,
  "targeting": "pageLoad >= 4",
  "id": "CRYPTOMINERS_PROTECTION",
  "last_modified": ...,
  "_status": ...,
  "provider": ...
}

and we changed the targeting to be pageLoad >= 10, I want to be able to distinguish when users are exposed to the CFR with the old targeting parameter, versus when the user is exposed to the CFR with the new targeting parameter. The message id will stay the same, and currently there's no way for me to recognize this.

So outside of an experiment context, I want to be able to distinguish between messages whose content / target / etc (the parameters of the message) are different, because message_id is not currently capturing this.

Within an experiment context, I want to be able to identify exactly which messages a user was exposed to (including messages with the same message_id but different content / parameters).
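
The CFR case can be sketched as follows, again in Python and under stated assumptions: `hash_id` is a hypothetical helper, only the fields whose values are given above are included (the elided ones are left out rather than guessed), and skipping `last_modified` and `_status` is an assumed design choice so that bookkeeping changes do not alter the hash.

```python
import hashlib
import json


def hash_id(message, ignored_keys=("last_modified", "_status")):
    """Hash the message parameters, skipping assumed bookkeeping fields."""
    stable = {k: v for k, v in message.items() if k not in ignored_keys}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


old = {
    "id": "CRYPTOMINERS_PROTECTION",
    "template": "cfr_doorhanger",
    "weight": 100,
    "targeting": "pageLoad >= 4",
}
new = dict(old, targeting="pageLoad >= 10")  # only the targeting changes

# Reporting both values lets an analyst tell the two versions apart
# even though the message id is unchanged.
old_event = (old["id"], hash_id(old))
new_event = (new["id"], hash_id(new))
print(old_event[0] == new_event[0], old_event[1] != new_event[1])
```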

Iteration: 78.2 - May 18 - May 31 → 79.1 - June 1 - June 14
Iteration: 79.1 - June 1 - June 14 → 79.2 - June 15 - June 28
Iteration: 79.2 - June 15 - June 28 → 80.1 - June 29 - July 12
Iteration: 80.1 - June 29 - July 12 → 80.2 - July 13 - July 26
Iteration: 80.2 - July 13 - July 26 → ---
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → INVALID