Closed Bug 1848201 Opened 2 years ago Closed 1 year ago

Introduce API to allow setting an experimentation ID

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: csadilek, Assigned: travis_)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

[mozilla/glean] Bug 1848201 - Introduce API to allow setting an experimentation ID (#2606) • 1 year ago • BMO Github Automation • 42 bytes, text/x-github-pull-request
Data Collection Request • 1 year ago • Travis Long [:travis_] • 2.71 KB, text/plain • chutten: data-review+
[mozilla/glean] Bug 1848201 (v2) - Add an API to set an experimentation ID (#2615) • 1 year ago • BMO Github Automation • 42 bytes, text/x-github-pull-request

Christian Sadilek [:csadilek] (Reporter)

Description • 2 years ago • Edited

We're in the process of integrating Nimbus into many more of our applications, following the new Nimbus on the Web architecture, which relies on server-side integration and a request to Cirrus. This request carries a unique ID and targeting context, and returns a set of active features. The unique ID cannot be Glean's client_id, because it is not currently exposed to consuming applications. It is also not desirable to expose Glean's client_id due to data integrity concerns, and to prevent accidental misuse.

Applications therefore need to generate (or derive) a new unique ID used for experimentation/enrolment, or rely on an existing one such as the Firefox Account ID. Defining this ID in our applications is desired anyway as it allows for more advanced use cases such as running a single experiment across multiple applications.

This then prompts the requirement to include the new experimentation ID in all recorded Glean events. Otherwise, partitioning of data (or experiment analysis more generally) becomes impossible. In our Firefox clients, which rely on client-side Nimbus integration, the Nimbus SDK calls Glean.setExperimentActive(experiment) to achieve this connection. However, in this new integration scenario the clients are unaware of experiment details.

In our discussions, we concluded that adding some minimal new API surface to the Glean SDK would be ideal. If we left it up to each individual client development team to "manually" add this ID to all existing and future events, we would very likely end up with diverging names and implementation gaps, which would negatively impact data quality and therefore impede experimentation and analysis.

The API discussed so far is a single call, e.g. glean.set_experimentation_id(id), to be used on the client.
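To illustrate the idea, here is a minimal sketch of how one call could attach the ID to every recorded event. `SketchGlean`, `record_event`, and the event layout are hypothetical stand-ins invented for this example; they are not the Glean SDK, and only the name `set_experimentation_id` comes from the discussion above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchGlean:
    """Hypothetical stand-in for a telemetry client; NOT the real Glean SDK."""

    experimentation_id: Optional[str] = None
    recorded: list = field(default_factory=list)

    def set_experimentation_id(self, experimentation_id: str) -> None:
        # Called once by the application, e.g. during startup.
        # The value is held in memory only in this sketch.
        self.experimentation_id = experimentation_id

    def record_event(self, name: str, extra: Optional[dict] = None) -> dict:
        # The ID is attached under one fixed key for every event, so
        # individual teams cannot diverge on naming or forget to add it.
        event = {
            "name": name,
            "extra": dict(extra or {}),
            "experimentation_id": self.experimentation_id,
        }
        self.recorded.append(event)
        return event


glean = SketchGlean()
glean.set_experimentation_id("example-id-123")
event = glean.record_event("button_clicked", {"label": "save"})
```

Because the key is chosen in one place, existing and future events pick it up without per-event changes in the application.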

We will have a follow-up discussion to verify whether we need a server-side API, but felt that divergence is less of a concern there because it is much easier to fix and roll out changes. NB: Since server-side logic for recording events will likely run in the context of a user session, we can't rely on a "global" call to glean.set_experimentation_id. The client-side solution seems simple enough to address our biggest concerns, and we will discuss and file a follow-up enhancement for the server, if needed.
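The server-side concern can be sketched as follows: if several user sessions are handled by one process, a process-wide ID would leak between them, so the ID has to be scoped to the request or session. This is only one possible approach, shown with Python's standard `contextvars` module; `record_event` and `handle_request` are hypothetical names, and nothing here reflects a decided server-side design.

```python
import contextvars

# Holds the ID for the current request/session context only (hypothetical design).
_experimentation_id = contextvars.ContextVar("experimentation_id", default=None)


def record_event(name: str) -> dict:
    # Reads the ID of whichever session is currently being served.
    return {"name": name, "experimentation_id": _experimentation_id.get()}


def handle_request(session_experimentation_id: str, event_name: str) -> dict:
    # Scope the ID to this request, then restore the previous value.
    token = _experimentation_id.set(session_experimentation_id)
    try:
        return record_event(event_name)
    finally:
        _experimentation_id.reset(token)


first = handle_request("user-a", "page_view")
second = handle_request("user-b", "page_view")
outside = record_event("startup")
```

Each request sees only its own ID, and events recorded outside any request carry no ID at all.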

Travis Long [:travis_] (Assignee)

Updated • 2 years ago

Assignee: nobody → tlong

Priority: -- → P1

Travis Long [:travis_] (Assignee)

Comment 1 • 2 years ago

Christian, I have a couple of follow-up questions regarding this.

First, what is the expected persistence of this information? Will the application set this with every execution or is it expected that Glean will persist this information once it is set?

Secondly, what are the expectations around the format of the identifier? Will this always be a UUID or does there need to be more flexibility for other forms of identifiers?

Flags: needinfo?(csadilek)

Christian Sadilek [:csadilek] (Reporter)

Comment 2 • 2 years ago

> First, what is the expected persistence of this information? Will the application set this with every execution or is it expected that Glean will persist this information once it is set?

Outside of any pending pings, I don't see a need for Glean to store this ID separately. I think client applications should set this ID as part of Glean initialization on startup / on load.

> Secondly, what are the expectations around the format of the identifier? Will this always be a UUID or does there need to be more flexibility for other forms of identifiers?

I think we should keep this more flexible. We have use cases for experiments running across multiple applications where we'll perhaps use (or derive) an ID from the Firefox Account. Would it be acceptable for this ID to just be a String? Looks like Cirrus defines it as a String too.
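As a rough illustration of a string ID derived from an existing account identifier, the sketch below hashes the account ID together with a namespace. The use of SHA-256, the namespace value, and the function name `derive_experimentation_id` are assumptions made for this example; the bug does not specify how such an ID would be derived.

```python
import hashlib


def derive_experimentation_id(account_id: str, namespace: str = "experimentation") -> str:
    """Hypothetical derivation: a stable, opaque string computed from an account ID."""
    # Mixing in a namespace keeps the result distinct from other hashes of the same account ID.
    material = f"{namespace}:{account_id}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


experimentation_id = derive_experimentation_id("account-123")
```

The same account yields the same string in every application that uses the same derivation, which is what a cross-application experiment would need; a plain string type accommodates this as well as UUIDs.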

Please let me know if you disagree or have any concerns. Happy to discuss more!

Flags: needinfo?(csadilek)

Travis Long [:travis_] (Assignee)

Comment 3 • 2 years ago

Thanks Christian! I think that answers my questions and should be everything I need to know as I'm working on the implementation of this.

Travis Long [:travis_] (Assignee)

Updated • 1 year ago

Blocks: 1850323

BMO Github Automation

Comment 4 • 1 year ago

Attached file: [mozilla/glean] Bug 1848201 - Introduce API to allow setting an experimentation ID (#2606)

Travis Long [:travis_] (Assignee)

Updated • 1 year ago

Blocks: 1850479

Travis Long [:travis_] (Assignee)

Comment 5 • 1 year ago

Attached file: Data Collection Request

Attachment #9350600 - Flags: data-review?(chutten)

Chris H-C :chutten

Comment 6 • 1 year ago

Comment on attachment 9350600: Data Collection Request

PRELIMINARY NOTES:

As an identifier, will this be sent with other identifiers? Will this bridge Cat3+ data with identifiers that aren't allowed to be used to reach that data?

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection can be controlled through the product's preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Travis Long is responsible.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical.

Is the data collection request for default-on or default-off?

Default on for all channels.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes.

Does the data collection use a third-party collection tool?

No.

Result: datareview+

Attachment #9350600 - Flags: data-review?(chutten) → data-review+

BMO Github Automation

Comment 7 • 1 year ago

Attached file: [mozilla/glean] Bug 1848201 (v2) - Add an API to set an experimentation ID (#2615)

Travis Long [:travis_] (Assignee)

Updated • 1 year ago

Status: NEW → RESOLVED

Closed: 1 year ago

Resolution: --- → FIXED


Categories

(Data Platform and Tools :: Glean: SDK, enhancement, P1)


Security

(public)
