Closed Bug 1737923 Opened 3 years ago Closed 3 years ago

Add quick suggest probe that records fallbacks from Merino to remote settings

Categories

(Firefox :: Address Bar, task, P1)

task
Points:
3

Tracking

()

VERIFIED FIXED
96 Branch
Iteration:
96.1 - Nov 1 - Nov 14
Tracking Status
firefox94 + verified
firefox95 --- verified
firefox96 --- verified

People

(Reporter: adw, Assigned: adw)

References

Details

Attachments

(2 files)

Basically, the desired measure is the fraction of Merino requests that fall back: i.e., (# of Merino fallbacks) / (# of Merino requests). I believe this can be achieved by adding a keyed scalar probe that has keys (success, fallback), with counter values.

We will want to know the number of times Merino was successful and the number of times it fell back.

There is a discussion around how to implement it. Ideally we would let the Merino request finish and have the probe capture the time, though the fallback to remote settings will happen at a set time.

mythmon says "fallback" means when we request suggestions from Merino for a query but end up using remote settings for that query.

Corey, I'm going to need a tighter definition of "fallback" in order to implement this. CC'ing mythmon too. There are several cases where we would fall back from Merino to remote settings:

  1. The Merino request times out
  2. There's a network error connecting to Merino, e.g., the user's internet is down or the Merino server is down
  3. The Merino request completes successfully but the server returns an HTTP error, e.g., the server is misconfigured or buggy
  4. The Merino request completes successfully without an HTTP error but it doesn't return a suggestion (i.e., there's no matching suggestion) -- but in this case, does "fallback" depend on whether remote settings returns a suggestion?
  5. The Merino request completes successfully without an HTTP error and returns a suggestion but the suggestion's score is lower than the remote settings suggestion score

One other question: Bug 1737928 adds the timeout mechanism, and as part of that I'm adding a keyed scalar probe to record the number of timeouts. I guess if "fallback" just means timeouts, then that would be enough? But it doesn't record the number of successes, which we would also want? If fallback does not mean timeouts, then would a separate timeout probe still be useful?

Flags: needinfo?(cdowhygelund)
Iteration: 95.2 - Oct 18 - Oct 31 → 96.1 - Nov 1 - Nov 14
Depends on: 1737928

Drew, would it be too difficult to add keys corresponding to these different cases? This would allow for more granular analyses.

However, if this is not possible, I believe 1-3 should be lumped together as "fallback" insofar as telemetry is concerned. Cases 4 and 5 aren't really fallbacks, as they don't correspond to a Merino failure but rather to the nature of the suggestions returned.

To answer your second question, having the # of successes is important for comparison purposes. I believe it would be more useful to have all of this information in a single probe, with different keys:

  • success
  • timeout
  • network_error
  • http_error

In this case a separate timeout probe is redundant.
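
For illustration, the fallback fraction could be derived from the four keys roughly as follows. This is only a sketch: the function name and the counts are made up, and it assumes that timeout, network_error and http_error are all counted as fallbacks, as proposed above.

```python
# Keys that count as a fallback from Merino to remote settings (cases 1-3).
FALLBACK_KEYS = ("timeout", "network_error", "http_error")


def fallback_fraction(counts):
    """Return fallbacks / total requests, or None when nothing was recorded."""
    fallbacks = sum(counts.get(key, 0) for key in FALLBACK_KEYS)
    total = fallbacks + counts.get("success", 0)
    return fallbacks / total if total else None


# Made-up counts, for illustration only.
example = {"success": 90, "timeout": 6, "network_error": 3, "http_error": 1}
print(fallback_fraction(example))  # 0.1
```

Keeping success in the same probe is what makes the denominator available without joining against a second probe.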

Flags: needinfo?(cdowhygelund)

That sounds good, we can do that, thanks Corey. I'll go with the four keys you suggested.

This adds a new categorical histogram called FX_URLBAR_MERINO_RESPONSE. There
are four categories per the discussion in the bug:

0: success
1: timeout
2: network_error
3: http_error

Only one value is recorded per fetch, so for example if Merino times out but
then later finishes successfully, we only record the timeout.
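
The "one value per fetch" rule can be sketched like this. It is a simplified Python model, not the actual patch: `FetchRecorder` and the `Counter` standing in for the histogram are hypothetical names, and only the category labels come from the list above.

```python
from collections import Counter
from enum import IntEnum


class MerinoResponse(IntEnum):
    """Category labels, numbered as in the histogram description."""
    SUCCESS = 0
    TIMEOUT = 1
    NETWORK_ERROR = 2
    HTTP_ERROR = 3


class FetchRecorder:
    """Records at most one category for a single fetch (hypothetical helper)."""

    def __init__(self, histogram):
        self.histogram = histogram
        self.recorded = False

    def record(self, category):
        # Ignore anything after the first recorded outcome for this fetch.
        if self.recorded:
            return False
        self.histogram[category] += 1
        self.recorded = True
        return True


histogram = Counter()
recorder = FetchRecorder(histogram)
recorder.record(MerinoResponse.TIMEOUT)  # the timeout fires first
recorder.record(MerinoResponse.SUCCESS)  # a late success is ignored
```

After these two calls the stand-in histogram holds a single timeout and no success, which matches the behavior described above.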

Depends on D129772

Attached file request.md

Data review request for the FX_URLBAR_MERINO_RESPONSE categorical histogram

Attachment #9249825 - Flags: data-review?(cdowhygelund)
Attachment #9249825 - Flags: data-review?(cdowhygelund) → data-review+

request.md

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes, it will be available with other telemetry on DTMO.

Is there a control mechanism that allows the user to turn the data collection on and off?

Clients may use the Firefox telemetry opt-out mechanism.

If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Drew Willcoxon, Contextual Services team.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction data

Is the data collection request for default-on or default-off?

Default on for all channels.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes

Does the data collection use a third-party collection tool?

No


Result: datareview+

Pushed by dwillcoxon@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ef51860f3668
Add a telemetry histogram for recording Merino response categories. r=nanj
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 96 Branch

@Drew, I have started to create test cases for Merino and also wanted to cover this new histogram. In order to cover and verify this bug, I have the following scenarios:

0: success

  • Enable Merino and trigger a Sponsored/Non-Sponsored result. The value of "0" column increases.

1: timeout

  • Enable Merino and change the "browser.urlbar.merino.timeoutMs" pref to "1". After triggering a Sponsored/Non-Sponsored, the value of "1" column increases.

2: network_error

  • Enable Merino then disable the internet connection. After triggering a Sponsored/Non-Sponsored, the value of "2" column increases.

3: http_error

  • Enable Merino then change the "browser.urlbar.merino.endpointURL" pref to an invalid endpoint. After triggering a Sponsored/Non-Sponsored, the value of "3" column increases.

Can you please let me know if these are the correct scenarios in order to verify this histogram? Are there any other scenarios that we should cover for this?
I have noticed on the histogram that there is a "4" column in some cases, but I never saw a value for it (see this screenshot). Is there another scenario for this, and when is it triggered?

Flags: needinfo?(adw)

Thanks Cosmin.

(In reply to Cosmin Muntean [:cmuntean], Ecosystem QA from comment #9)

@Drew, I have started to create test cases for Merino and also wanted to cover this new histogram. In order to cover and verify this bug, I have the following scenarios:

0: success

  • Enable Merino and trigger a Sponsored/Non-Sponsored result. The value of "0" column increases.

Yes

1: timeout

  • Enable Merino and change the "browser.urlbar.merino.timeoutMs" pref to "1". After triggering a Sponsored/Non-Sponsored, the value of "1" column increases.

Yes

2: network_error

  • Enable Merino then disable the internet connection. After triggering a Sponsored/Non-Sponsored, the value of "2" column increases.

This one might be hard to trigger. Your STR here are good, but the problem is the timeout might happen before the network error. If you set browser.urlbar.merino.timeoutMs to a very large value like 30000 (30 seconds) then use your STR, that should work.

3: http_error

  • Enable Merino then change the "browser.urlbar.merino.endpointURL" pref to an invalid endpoint. After triggering a Sponsored/Non-Sponsored, the value of "3" column increases.

That would trigger network_error, not http_error. In order to test this, you need the Merino server to return a non-200 (non-success) response. In other words you would need to modify the server somehow, or you could set up any local web server that returns a 500 response and set its URL to browser.urlbar.merino.endpointURL. TBH I'm not sure it's worth your time to do that, so IMO you can skip verifying this value.
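
For anyone who does want to try the local-server route, a minimal sketch using only the Python standard library might look like the following. The handler, the helper names and port 8080 are assumptions for illustration; nothing here comes from the Merino server itself.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysFailHandler(BaseHTTPRequestHandler):
    """Answers every GET with HTTP 500, whatever the path or query."""

    def do_GET(self):
        body = b"simulated server error"
        self.send_response(500)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_server(port=0):
    """Start the server on 127.0.0.1 in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), AlwaysFailHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def status_of(url):
    """Return the HTTP status code for a GET on url, bypassing any system proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Calling `start_server(8080)` from a Python session that stays open and setting browser.urlbar.merino.endpointURL to `http://127.0.0.1:8080/` should then give the client a real HTTP response with a non-success status, rather than a connection failure.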

Are there any other scenarios that we should cover for this?

That's it, thanks.

I have noticed on the Histogram that there is a fourth "4" column in some cases, but I never saw a value for it

There isn't a 4th value, only 0-3, so you can ignore it. I think the extra value is just a consequence of how histograms are implemented, but I'm not sure.

Flags: qe-verify+
Flags: needinfo?(adw)
Flags: in-testsuite+

Comment on attachment 9249525 [details]
Bug 1737923 - Add a telemetry histogram for recording Merino response categories.

Beta/Release Uplift Approval Request

  • User impact if declined: We need this for the Firefox Suggest preferences redesign targeting 95/94.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Please see comment 9 and 10
  • List of other uplifts needed: Please see uplift spreadsheet: https://docs.google.com/spreadsheets/d/1LavihS-VOPFYEyum7mrx6FKXmuQeHi9xQHfGNSxjnoY/edit?usp=sharing
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This only adds some new telemetry related to Merino client integration, which is disabled for all users and will only be enabled in a future Merino rollout.
  • String changes made/needed:
Attachment #9249525 - Flags: approval-mozilla-beta?

Comment on attachment 9249525 [details]
Bug 1737923 - Add a telemetry histogram for recording Merino response categories.

Approved for 95.0b5.

Attachment #9249525 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

3: http_error

  • Enable Merino then change the "browser.urlbar.merino.endpointURL" pref to an invalid endpoint. After triggering a Sponsored/Non-Sponsored, the value of "3" column increases.

That would trigger network_error, not http_error. In order to test this, you need the Merino server to return a non-200 (non-success) response. In other words you would need to modify the server somehow, or you could set up any local web server that returns a 500 response and set its URL to browser.urlbar.merino.endpointURL. TBH I'm not sure it's worth your time to do that, so IMO you can skip verifying this value.

@Drew, I have managed to trigger the "3: http_error" by setting the "browser.urlbar.merino.endpointURL" to "https://stage.merino.nonprod.cloudops.mozgcp.net/api/v1/suggest1". I have only added the number "1" at the end of the endpoint. Indeed, if I change the endpoint to something invalid like "https://www.test.com", the "http_error" is not triggered.
Is it OK if we use this scenario to verify this?

Flags: needinfo?(adw)

Oh, good idea, yeah that's good. Adding a 1 at the end causes the response to be a 404, which triggers http_error. Thanks Cosmin!
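
To summarize how the scenarios in this bug map onto the four categories, here is a rough decision sketch. The function and its parameters are hypothetical, the real client code may be structured differently, and treating any 2xx status as success is an assumption (the comments above only say "non-200 (non-success)").

```python
def classify_merino_outcome(timed_out, connection_failed, http_status):
    """Map one fetch outcome to a category label (illustrative only)."""
    if timed_out:
        # The timeout can fire before a network error surfaces, so it wins.
        return "timeout"
    if connection_failed:
        # No HTTP response at all, e.g. no internet or an unreachable host.
        return "network_error"
    if http_status is None or not 200 <= http_status < 300:
        # The server answered, but with a non-success status such as 404 or 500.
        return "http_error"
    return "success"


print(classify_merino_outcome(False, False, 404))  # http_error
```

Under this model the ".../suggest1" endpoint is a case where the server answers with a 404, while an endpoint that never returns an HTTP response at all falls under network_error.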

Flags: needinfo?(adw)

We have verified this bug on the latest Nightly 96.0a1 build (Build ID: 20211109190508) and the latest Beta 95.0b5 (Build ID: 20211109194756) on Windows 10 x64, macOS 10.15.7 and Ubuntu 20.04 x64.

Status: RESOLVED → VERIFIED
Flags: qe-verify+

[Tracking Requested - why for this release]: We need this for the Firefox Suggest preferences redesign targeting 95/94.

Comment on attachment 9249525 [details]
Bug 1737923 - Add a telemetry histogram for recording Merino response categories.

Beta/Release Uplift Approval Request

  • User impact if declined: We need this for the Firefox Suggest preferences redesign targeting 95/94.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Please see comment 9 and 10
  • List of other uplifts needed: Please see uplift spreadsheet: https://docs.google.com/spreadsheets/d/1LavihS-VOPFYEyum7mrx6FKXmuQeHi9xQHfGNSxjnoY/edit?usp=sharing
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This only adds some new telemetry related to Merino client integration, which is disabled for all users and will only be enabled in a future Merino rollout.
  • String changes made/needed:
Attachment #9249525 - Flags: approval-mozilla-release?
Flags: qe-verify+

We have verified this bug on Firefox 94.0.2 try build on Windows 10 x64, macOS 10.15.7 and Ubuntu 20.04 x64.

Comment on attachment 9249525 [details]
Bug 1737923 - Add a telemetry histogram for recording Merino response categories.

Approved for 94.0.2.

Attachment #9249525 - Flags: approval-mozilla-release? → approval-mozilla-release+

We have verified this bug on Firefox 94.0.2 candidate build (Build ID: 20211117154346) on Windows 10 x64, macOS 10.15.7 and Ubuntu 20.04 x64.

Flags: qe-verify+