Closed Bug 1806777 Opened 3 years ago Closed 3 years ago

Document Glean events n_words count limits

Categories

(Firefox :: Address Bar, defect)

Firefox 110
Desktop
Unspecified
defect

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox-esr102 --- unaffected
firefox108 --- unaffected
firefox109 --- unaffected
firefox110 --- affected

People

(Reporter: aflorinescu, Unassigned)

References

Details

Found in

  • 110.0a1

Affected versions

  • 110.0a1 (2022-12-20)

Tested platforms

  • Windows 10

Preconditions

  • browser.urlbar.searchEngagementTelemetry.enabled set to true
  • devtools.chrome.enabled set to true

Steps to reproduce

  1. Open the Browser Console and execute: Services.fog.testResetFOG();
  2. Create a long multi-word search query delimited by spaces: count count count ... The query should contain 55 instances of "count" separated by single spaces, 329 characters in total. (Use https://charactercounttool.com/ to count them.)
  3. Open a new tab, paste the search query string from step 2, and press Enter (engagement)
    3'. Open a new tab, paste the search query string, and disengage the address bar by focusing the New Tab Page (abandonment)
    3". Open a new tab, paste the search query string, and wait 2 seconds (impression)
  4. In the browser console execute: Glean.urlbar.engagement.testGetValue()
    4'. In the browser console execute: Glean.urlbar.abandonment.testGetValue()
    4". In the browser console execute: Glean.urlbar.impression.testGetValue()
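
The query from step 2 can also be generated in the Browser Console instead of being typed by hand. A minimal sketch (the variable names are only illustrative):

```javascript
// Build the step 2 query: 55 copies of "count" joined by single spaces.
const words = Array(55).fill("count");
const query = words.join(" ");

// 55 words * 5 letters + 54 separating spaces = 329 characters.
console.log(query.length);            // 329
console.log(query.split(" ").length); // 55
```

The resulting string can then be copied into the address bar for steps 3, 3' and 3".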

Expected result
4. n_chars: "329" n_words: "55"
4'. n_chars: "329" n_words: "55"
4". n_chars: "329" n_words: "55"

Actual result
4. n_chars: "329" n_words: "43"
4'. n_chars: "329" n_words: "43"
4". n_chars: "329" n_words: "43"

Regression range

  • New feature, N/A

Additional notes

  • We suspect that a similar issue affects n_results, but for the time being we don't know how to validate the number of search results shown in the address bar when browser.urlbar.maxRichResults is set to a number larger than what the address bar displays.

Thank you very much for your report, Adrian!

This seems to be due to the string length limit.
https://searchfox.org/mozilla-central/rev/cef96316b3643720769dec96542604c3209f1877/browser/components/urlbar/UrlbarController.sys.mjs#1204-1216
We record n_chars as-is, but n_words is counted after the string has been truncated to the length limit (255 characters).
So,

let string = "";
for (let i = 0; i < 55; i++) {
  string += "count ";
}
string.substring(0, 255).split(" ").length

= 43.
So this is expected. However, I need to add this case to the QA document. Thanks!
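
To make the effect easier to see for other query lengths, here is a simplified model of the behaviour described above. It is only a sketch: `MAX_LENGTH`, `countCharsAndWords` and `makeQuery` are hypothetical names, and the real code in UrlbarController.sys.mjs may differ in its details.

```javascript
// Assumption: 255 is the string length limit mentioned in this comment.
const MAX_LENGTH = 255;

// Hypothetical helper: n_chars uses the full string, n_words is counted
// only on the first MAX_LENGTH characters.
function countCharsAndWords(searchString) {
  return {
    n_chars: String(searchString.length),
    n_words: String(searchString.substring(0, MAX_LENGTH).split(" ").length),
  };
}

// Hypothetical helper: n copies of "count" separated by single spaces.
const makeQuery = n => Array(n).fill("count").join(" ");

console.log(countCharsAndWords(makeQuery(42)));  // n_chars "251", n_words "42"
console.log(countCharsAndWords(makeQuery(55)));  // n_chars "329", n_words "43"
console.log(countCharsAndWords(makeQuery(100))); // n_chars "599", n_words "43"
```

With this test word, any query of 43 or more words is reported with n_words "43", because the first 255 characters hold 42 complete words plus the fragment "cou".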

(In reply to Daisuke Akatsuka (:daisuke) from comment #1)

I'm still trying to make sense of what the point of logging a truncated count would be. Maybe for n_words, once we hit the limit, we could signal this with a different, non-numeric value? As it stands, there is no difference between exactly 43 and 100 words in the address bar, and I'm not sure whether downstream analysis of the data would account for the fact that n_words is limited to 43.

(In reply to Adrian Florinescu [:aflorinescu] from comment #2)

This was done to avoid hurting performance: processing very long text would be expensive.

We do not care about precision for those very long strings.
Our focus is on optimizing the user workflow, that is, giving the best results we can with the smallest amount of input. If the user has to type 20+ words to get what they want, it means our results are terrible and we must optimize them. The difference between 20 and 100 words is uninteresting. We should document this limit in the yaml and the source docs so that Data Science is aware of it.

metrics.yaml was updated already and it states "For performance reasons a maximum of 255 characters are considered when splitting."
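
Because n_chars is recorded without truncation (per comment #1), an analysis could use it to flag events where n_words may undercount. A minimal sketch, assuming the event extras arrive as strings as in the testGetValue() output above; `SPLIT_LIMIT` and `mayUndercountWords` are hypothetical names, not part of Glean or the urlbar code:

```javascript
// Assumption: the 255-character limit quoted from metrics.yaml.
const SPLIT_LIMIT = 255;

// Hypothetical helper: returns true when the query was longer than the
// limit, so n_words was computed on a truncated string and may be too low.
function mayUndercountWords(extra) {
  return Number(extra.n_chars) > SPLIT_LIMIT;
}

console.log(mayUndercountWords({ n_chars: "329", n_words: "43" })); // true
console.log(mayUndercountWords({ n_chars: "120", n_words: "20" })); // false
```

A flagged event does not necessarily have a wrong n_words (the cut may fall inside the last word), so the flag only marks the value as a lower bound.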

I don't think there's anything else to do here.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME
Summary: Glean events n_words count is incorrect → Document Glean events n_words count limits