Closed Bug 1845779 Opened 2 years ago Closed 2 years ago

Figure out why use counters seem low

Categories

(Core :: DOM: Core & HTML, task)

RESOLVED FIXED
118 Branch
Tracking Status
firefox118 --- fixed

People

(Reporter: bgrins, Assigned: emilio)

Attachments

(1 file)

I was comparing https://chromestatus.com/metrics/css/timeline/popularity/105 (showing percentage of page loads that use overflow) with https://mozilla.github.io/usecounters/index.html#kind=page&group=CSS, and our number is way different from what Chrome shows (~30% for us and ~90% for them). I suspect Chrome's number is closer to the truth given supporting evidence from HTTP Archive like https://docs.google.com/spreadsheets/d/1OU8ahxC5oYU8VRryQs9BzHToaXcOntVlh6KUHjm15G4/edit#gid=383111007.

After chatting with Emilio we speculated that maybe:

  1. properties are not being recorded in some cases
  2. we're sending telemetry for pages we don't care about (i.e. about:blank)

Filing this bug to track investigation into what's going on

Ok, so locally in about:telemetry I see the Display use counter at 40, but 202 TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED. That looks bad...
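As a rough illustration of the comparison above: assuming the page-level share is simply the counter value divided by TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED (an assumption about how the ratio is formed, not something confirmed in this bug), the local numbers work out as follows. The function name is hypothetical.

```python
def use_counter_share(counter_hits: int, top_level_docs_destroyed: int) -> float:
    """Fraction of top-level documents that hit a given use counter.

    Assumption: the denominator is TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED.
    """
    if top_level_docs_destroyed <= 0:
        raise ValueError("need at least one destroyed top-level document")
    return counter_hits / top_level_docs_destroyed


# Local values quoted above: Display counter at 40, 202 documents destroyed.
share = use_counter_share(40, 202)
print(f"{share:.1%}")  # prints "19.8%"
```

Nearly every real page sets display, so a share this low points at the denominator rather than at the counter itself.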

This doesn't seem specific to CSS use counters tho. E.g., look at our offscreencanvas telemetry vs. Chrome's. It seems also off by a somewhat similar factor...

Brian, can you look up if we have a use counter that goes over ~50% per toplevel-loads? Or if we have other similar counters that also show that issue?

Flags: needinfo?(bgrinstead)

Don't yet know how to get this answer. This query returns rows with 2 columns for all use counters (one document and one page), but still need to figure out how to report that as a percent of loads.

SELECT
  submission_timestamp,
  payload.processes.content.histograms AS histograms
FROM
  telemetry.main_use_counter_1pct
WHERE
  submission_timestamp > '2023-07-25'
  AND submission_timestamp < '2023-07-28'
LIMIT 10

Per the chat, this seems not specific to CSS use counters.

For the record the last big change to the setup here was bug 1656114.

Will try to dig.

Flags: needinfo?(emilio)
See Also: → 1656114

Worth noting also that

  • The Measurement Dashboard and Use Counters-specific dash both report similarly-low ratios, and both agree with SQL-calculated values.
  • Beta vs Release are also about the same

This suggests we're either undercounting pages that use use-counter-instrumented features, or overcounting top-level pages, or both, at a ratio of nearly 1:2 (pages that appear to instrument correctly : pages that don't seem to be working).
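A minimal sketch of how that ratio would dilute a counter, assuming the extra top-level entries never set the counter and that real pages are otherwise counted correctly. Both are assumptions, and the function name is hypothetical.

```python
def observed_share(true_share: float, real_pages: float, empty_reports: float) -> float:
    """Share a dashboard would show when empty reports inflate the denominator.

    Assumption: empty reports add to the page count but never set the counter.
    """
    total = real_pages + empty_reports
    if total <= 0:
        raise ValueError("need at least one reported page")
    return true_share * real_pages / total


# One correctly instrumented page for every two that are not (the 1:2 ratio),
# with a true share of 90% (Chrome's figure for overflow from the description).
print(round(observed_share(0.90, real_pages=1, empty_reports=2), 2))  # prints 0.3
```

Under these assumptions a true share of about 90% would show up as about 30%, which is in line with the gap described at the top of the bug.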

So I added some extra logging to the use counter data. Every time I open a new tab on a debug build I see:

[Parent 23396: Main Thread]: D/UseCounters Expect page use counters: WindowContext 36 -> 36
[Parent 23396: Main Thread]: D/UseCounters  > top-level now waiting on 1
[Child 23502: Main Thread]: D/UseCounters Sending page use counters: from WindowContext 36 [about:blank]
[Parent 23396: Main Thread]: D/UseCounters Accumulate page use counters: WindowContext 36 -> 36
[Parent 23396: Main Thread]: D/UseCounters Stop expecting page use counters: -> WindowContext 36
[Parent 23396: Main Thread]: D/UseCounters  > reporting [about:blank]
[Parent 23396: Main Thread]: D/UseCounters  > page use counter data was received, but was empty

That seems problematic and would explain what we're seeing.
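For anyone wanting to quantify this from a longer log, here is a small sketch that counts empty reports per URL. It assumes the line format of the local logging shown above (which is not part of a shipped build) and that each "reporting" line is followed by its "empty" line before the next report; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as: "... D/UseCounters  > reporting [about:blank]"
REPORTING = re.compile(r"D/UseCounters\s+> reporting \[(?P<url>[^\]]*)\]")
EMPTY_MARKER = "page use counter data was received, but was empty"


def count_empty_reports(log_lines):
    """Count, per URL, the reports that carried no use counter data."""
    empty = Counter()
    last_url = None
    for line in log_lines:
        match = REPORTING.search(line)
        if match:
            last_url = match.group("url")
        elif EMPTY_MARKER in line and last_url is not None:
            empty[last_url] += 1
            last_url = None
    return empty


sample = [
    "[Parent 23396: Main Thread]: D/UseCounters  > reporting [about:blank]",
    "[Parent 23396: Main Thread]: D/UseCounters  > page use counter data was received, but was empty",
]
print(count_empty_reports(sample))  # prints Counter({'about:blank': 1})
```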

Not specific to CSS counters given that.

Component: CSS Parsing and Computation → DOM: Core & HTML
Assignee: nobody → emilio

The wrong code goes back to bug 968923 :/. Probably before fission we didn't create so many about:blank docs tho.

Flags: needinfo?(bgrinstead)
See Also: → 968923

Our counters are super low because we report lots of empty entries for
content-inaccessible about:blank documents. It seems just opening a new
tab is enough to trigger this.

While at it I found that we also report about:preferences / about:config
since they use the system principal. Also block those.
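
The intent of the change can be sketched as a simple predicate. This is an illustration only, not the actual Gecko code: the PageInfo type, its fields, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    # Hypothetical fields; the real patch inspects the document itself.
    url: str
    is_system_principal: bool
    is_content_inaccessible_about_blank: bool


def should_report_use_counters(page: PageInfo) -> bool:
    """Skip pages whose (empty) reports would only dilute the counters."""
    if page.is_content_inaccessible_about_blank:
        return False  # e.g. the about:blank created when opening a new tab
    if page.is_system_principal:
        return False  # e.g. about:preferences / about:config
    return True


pages = [
    PageInfo("about:blank", False, True),
    PageInfo("about:config", True, False),
    PageInfo("https://example.com/", False, False),
]
print([p.url for p in pages if should_report_use_counters(p)])
# prints ['https://example.com/']
```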

Flags: needinfo?(emilio)
Summary: Figure out why CSS use counters seem low → Figure out why use counters seem low
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/88ec6682eab1 Don't report use counter data for content-inaccessible about:blank documents. r=nika
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 118 Branch
Status: RESOLVED → REOPENED
Flags: needinfo?(emilio)
Resolution: FIXED → ---
Target Milestone: 118 Branch → ---
Attachment #9346274 - Attachment description: Bug 1845779 - Don't report use counter data for content-inaccessible about:blank documents. r=nika,edgar → Bug 1845779 - Don't report use counter data for content-inaccessible about:blank documents. r=nika
Flags: needinfo?(emilio)
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3cee98b2330b Don't report use counter data for content-inaccessible about:blank documents. r=nika
Status: REOPENED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 118 Branch

:emilio or :peterv Can you confirm for me that it's appropriate to delete use counter data prior to 2023-08-16 and Fx118, based on when this bug was resolved?

For context, my plan is to only drop data prior to 2023-08-16, and not put any effort into adding functionality to continuously drop data prior to Fx118, but getting confirmation for both just in case the plan changes before :chutten is ready to replace the table entirely.

Flags: needinfo?(peterv)
Flags: needinfo?(emilio)

Do you have context for why we want to delete the data? Having use counter values be lower than they should be doesn't make them completely useless. E.g., a use counter from before the patch that would report 0.5% would still be more useful than being completely in the dark; you just need to be aware that some stuff is potentially underrepresented.

Flags: needinfo?(emilio) → needinfo?(dthorn)

use counters have been split out of main pings to a separate table: https://groups.google.com/a/mozilla.org/g/fx-data-dev/c/pOFifCjNH7c

As part of this move, I am working on backfilling the new tables (main_use_counter_v4 and main_v5). Once data is migrated to the new tables, main_v4 will have its data retention reduced to 30 days, so that we don't have to pay to store the history twice and so don't need to process data deletion requests for it, both of which are expensive. Rather than backfill the use counter data to the new table, and store it indefinitely, and need to process data deletion requests for it, it would be much cheaper to drop data prior to the fix.

> Having use counter values be lower than they should be doesn't make them completely useless.

I was under the impression, as are most of our data engineers, that it's not just a matter of being lower than they should. It's that the values are arbitrarily and unmeasurably lower, without consistency between clients, and as such any analysis of the data would be limited at best, and more likely misleading than helpful.

Flags: needinfo?(dthorn) → needinfo?(emilio)

> It's that the values are arbitrarily and unmeasurably lower, without consistency between clients.

It's not arbitrarily lower. While it is true that how low they are might depend on browsing patterns, it's not just random. For the kind of stuff we use use counters for (is this used little enough that we can remove it, for example), it seems they'd still be useful.

However, I guess now that Firefox 118 is on release, that probably gives us enough data to work with that we could just drop the older data if it's easier, I suppose.

Flags: needinfo?(emilio)

> It's not arbitrarily lower.

Thank you for the clarification.

> that probably gives us enough data to work with

I will move forward with dropping the data then. Thank you.

Flags: needinfo?(peterv)
Regressions: 1859945