Measure the edit distance when selecting a search suggestion

NEW
Unassigned

Status

()

defect
P3
normal
2 years ago
2 years ago

People

(Reporter: past, Unassigned)

Tracking

(Blocks 1 bug)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox54 affected)

Details

(Whiteboard: [fxsearch])

Attachments

(1 attachment)

When the user selects a search suggestion, we want to measure the edit distance (Levenshtein distance) between the selected search suggestion and the typed string. We want the same measurement for the search bar as well.
This could be implemented using an exponential histogram.

Updated

2 years ago
Assignee: nobody → adw
Status: NEW → ASSIGNED
Priority: -- → P1
This is an interesting metric, and I want to make sure we choose an optimal binning that maximizes the interpretability of the collected data.

I've been thinking about what the edit distance would look like for search suggestions. Levenshtein distance is at least the difference in string lengths, and at most the length of the longer string. Some example use cases for search suggestions:

- suggestion is a spelling-corrected version of the typed query (distance is <= 5? in absolute terms)
- suggestion is the completion of a longer phrase from a shorter typed prefix (distance is larger (10-100?), depends on length of suggestion string)

This could be captured in an exponential histogram, which has more granular buckets for smaller values. We'd want to make sure 0 is its own bucket, since this is an interesting special case.

An alternative scheme would be to normalize the edit distance by the length of the longer string, and report the value as a percentage. This lets us use a linear histogram with potentially more granularity. However, then we lose the absolute value of the metric.

Ideally, we'd record the pair of measurements (query length, edit distance from suggestion). For example, we could key the edit distance histogram by binned query length (short/medium/long?). However, this might be overkill.

Any thoughts?
If we'd record the pair of measurements, would we need both this and bug 1334611?
(In reply to Panos Astithas [:past] from comment #3)
> If we'd record the pair of measurements, would we need both this and bug
> 1334611?

Yes, since the measurements here would only apply to cases where a search suggestion was selected.

That's why I think that if we go with this keyed approach, at least for now, we don't need to worry about getting too granular with the query lengths. I was thinking of this mainly as a way to add more context to the edit distance metric.
Comment on attachment 8848707 [details]
Bug 1334630 - Measure the edit distance when selecting a search suggestion.

Dave, could you please give me feedback on this patch?  In particular I need help choosing bucket sizes, etc.  I'll describe what it does:

* Adds a new FX_SEARCH_SUGGESTION_EDIT_DISTANCE histogram:

  "FX_SEARCH_SUGGESTION_EDIT_DISTANCE": {
    "expires_in_version": "58",
    "kind": "exponential",
    "keyed": true,
    "high": 200,
    "n_buckets": 20,
    ...
  }

Are the high and n_buckets values OK?  200 seems way high but also a not-too-unreasonable upper bound.

According to the telemetry docs and my experience testing this patch, we'll get a default low value of 1 with this, which means that we will get a separate zero bucket (which would also capture negative values if we had any, but we shouldn't of course), as you suggested.

* I chose some keys based on the last paragraph in your comment 2.  The keys are:

    "6" => queries of length  (0,  6]
   "12" => queries of length  (6, 12]
   "24" => queries of length (12, 24]
   "48" => queries of length (24, 48]
  ">48" => queries of length > 48

Is this too many keys, the right mix, etc.?
Attachment #8848707 - Flags: feedback?(dzeber)
Priority: P1 → P3
Comment on attachment 8848707 [details]
Bug 1334630 - Measure the edit distance when selecting a search suggestion.

For the keys, how about we do:

"short" => queries of length  (0,  3]
"medium" => queries of length  (3, 10]
"long" => queries of length > 10

I think three buckets for query length is sufficient for now, and we can revisit this if we decide this metric should become more central to evaluating search suggestions. I'm especially interested in separating out the case where a search suggestion completes a short prefix of a few characters from cases involving longer typed queries (eg. a spelling correction). I think 3 is a good cutoff for the prefix completion case, and the second threshold of 10 is somewhat more arbitrary.

For the edit distance buckets, a "high" of 200 is fine. Edit distances larger than this are likely an edge case, as this means that the suggestion modifies/adds/removes 200 characters relative to the typed query.

What are the bucket ranges when setting "n_buckets": 20? I suggest choosing an "n_buckets" large enough that values of 1 to 5 each get assigned to separate buckets, and the remaining bucket ranges will be granular enough for our purposes.
Attachment #8848707 - Flags: feedback?(dzeber) → feedback+
Unassigning myself from non-P1 search bugs for now so I have more time for Photon bugs.
Assignee: adw → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.