Closed Bug 1852098 Opened 11 months ago Closed 8 months ago

Collect `UseCounters` using Glean

Categories

(Core :: DOM: Core & HTML, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
122 Branch
Tracking Status
firefox121 --- fixed
firefox122 --- fixed

People

(Reporter: nalexander, Assigned: chutten)

References

(Regressed 1 open bug)

Details

Attachments

(15 files)

4.48 KB, text/plain
royang
: data-review+
Details
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

Per Winston Churchill, "Never let a good crisis go to waste." The crisis here is that our UseCounter data is unreliable for some window (hopefully) ending with Bug 1845779. The opportunity here is to get the new UseCounter data into Glean telemetry so that we can transition a decent chunk of instrumentation away from Legacy telemetry.

The main advantages that I am aware of are:

  • UseCounter data is large, bloating Legacy main pings
  • UseCounter data is "wide", i.e., produces many columns, bloating main ping tables

A new Glean UseCounter ping would allow to eventually streamline Legacy main pings and reduce main ping table width.

I investigated this a little and found that there's a delicate interplay between build-time generation steps and run-time consumption where particular Legacy probes are indexed by number. See https://searchfox.org/mozilla-central/rev/61aca7cda4b217245d9af5aac5bf49bf0d5ecfe3/dom/ipc/WindowGlobalParent.cpp#1106-1120.

It would be helpful for the experts to confirm that this approach is possible, or to suggest alternate approaches that can be GIFTTed. :chutten?

Flags: needinfo?(chutten)

I'm not 100% sure I grasp the significance here, but I think it's "Today, WindowGlobalParents all store a shadow structure of use counters, indexed by a numeric (as in supporting arithmetic) id. How can we do this in FOG which supplies identifiers instead?" which sorta-kinda points to "Isn't it convenient that this enum maintains the same order as this contiguous block" as a thing to either emulate or avoid... but I digress.

Assuming that it remains important that use counters only be reported as the document is destroyed, this implies that the Document remains in need of a structure to store its use counters plus some way to map from its structure to Glean metric instances (specifically mozilla::glean::impl::RateMetric instances, but that's just detail). This mapping necessarily must be provided at the latest by codegen time since it isn't guaranteed to be (nor should it be) stable between builds.

So maybe some sort of GleanUseCounterMapping.h that has /* static */ void IncrementUseCounter(mozilla::UseCounter aCounter) { switch(aUseCounter) { case eUseCounter_custom_DocumentExecCommandContentReadOnly: mozilla::glean::use_counters_page::document_exec_command_content_read_only.addToNumerator(1); mozilla::glean::use_counters_doc::document_exec_command_content_read_only.addToNumerator(1); break; ... }}.

The method by which we create the metrics in the use_counters.{doc|page} categories is still up in the air, of course. For now we could presumably force the generation of a checked-in metrics.yaml whenever a use counter is added or removed plus a mach command to autogen it. But it'll have telemetry_mirror: USE_COUNTER2_<appropriate stuff goes here>_{DOCUMENT|PAGE} and then GIFFT'll take care of the mirroring.

Does that help? Or did I answer the wrong questions?

Flags: needinfo?(chutten)

(In reply to Chris H-C :chutten from comment #2)

I'm not 100% sure I grasp the significance here, but I think it's "Today, WindowGlobalParents all store a shadow structure of use counters, indexed by a numeric (as in supporting arithmetic) id. How can we do this in FOG which supplies identifiers instead?" which sorta-kinda points to "Isn't it convenient that this enum maintains the same order as this contiguous block" as a thing to either emulate or avoid... but I digress.

Yes, that's exactly correct. I saw the general scheme (index -> histogram ID) and wanted to get the experts (you!) to look and suggest how we could either do something similar or propose an alternative.

Assuming that it remains important that use counters only be reported as the document is destroyed, this implies that the Document remains in need of a structure to store its use counters plus some way to map from its structure to Glean metric instances (specifically mozilla::glean::impl::RateMetric instances, but that's just detail). This mapping necessarily must be provided at the latest by codegen time since it isn't guaranteed to be (nor should it be) stable between builds.

So maybe some sort of GleanUseCounterMapping.h that has /* static */ void IncrementUseCounter(mozilla::UseCounter aCounter) { switch(aUseCounter) { case eUseCounter_custom_DocumentExecCommandContentReadOnly: mozilla::glean::use_counters_page::document_exec_command_content_read_only.addToNumerator(1); mozilla::glean::use_counters_doc::document_exec_command_content_read_only.addToNumerator(1); break; ... }}.

The method by which we create the metrics in the use_counters.{doc|page} categories is still up in the air, of course. For now we could presumably force the generation of a checked-in metrics.yaml whenever a use counter is added or removed plus a mach command to autogen it. But it'll have telemetry_mirror: USE_COUNTER2_<appropriate stuff goes here>_{DOCUMENT|PAGE} and then GIFFT'll take care of the mirroring.

Do we not have an in-tree example where we generate content that would otherwise be hand-written in a metrics.yaml?

Does that help? Or did I answer the wrong questions?

Nope, this is what I was after. Some generated code that maps (index -> Glean metric) as a giant case statement is a fine approach, and slightly more robust than requiring a contiguous block with stable ordering.

Alessio: as we discussed, this seems low hanging fruit at a particularly valuable point in time. Can your team jump on it?

Flags: needinfo?(alessio.placitelli)

(In reply to Nick Alexander :nalexander [he/him] from comment #4)

Alessio: as we discussed, this seems low hanging fruit at a particularly valuable point in time. Can your team jump on it?

Yes, our team can. Will leave the ni? on me to come back with more information after talking to the team.

(In reply to Nick Alexander :nalexander [he/him] from comment #3)

The method by which we create the metrics in the use_counters.{doc|page} categories is still up in the air, of course. For now we could presumably force the generation of a checked-in metrics.yaml whenever a use counter is added or removed plus a mach command to autogen it. But it'll have telemetry_mirror: USE_COUNTER2_<appropriate stuff goes here>_{DOCUMENT|PAGE} and then GIFFT'll take care of the mirroring.

Do we not have an in-tree example where we generate content that would otherwise be hand-written in a metrics.yaml?

We do not. The closest thing we have is the codegen regression suite (because it's easier to tell what a template change means when you see what it does to a whole file (see for example when I split the codegen into header and impl)). It runs codegen against the tree and compares it against the most recent known "good" version. If there's a mismatch, that's a testfail. Running the test with UPDATE_EXPECT=1 updates the in-tree version to match whatever's being generated.

A similar model could work for UseCounters where we write a test that generates the metrics.yaml output(s) for UseCounters and compares it(them) against the in-tree version(s), failing on mismatch. That'll keep us from landing/removing a UseCounter without updating the metrics definition.

That still leaves "how we generate metrics.yaml(s) for Use Counters" as an exercise for the reader, mind. I presume it's at least possible since we did it for Histograms. We have some build scripts stuff in stuff in toolkit/components/glean/build_scripts/ that might be useful, but use-counter-specific stuff should probably live near dom/base/UseCounters.conf since that's where the metrics.yaml(s) will go.

Of course, a lot of this goes away if/when we get probe-scraper push-mode working. Glean metrics for Use Counters will lose some tool integration niceties like codesearch and definition file line references in Glean Dictionary (unless we find some way to advertise the patterns to search and the lines of e.g. UseCounters.conf to link to) since those are powered by metrics files being stored in the repo. But we ought to be able to live without them.

Talked to Chris, who will take this within H2 after the current webidl/PingCentre engineering work.

Assignee: nobody → chutten
Flags: needinfo?(alessio.placitelli)

I've been doing some reading (of the docs and the code (no, I didn't look at all of that, I perused dom/ and t/c/t and traced back some codegen python scripts) and have (re-)learned things.

  • We have two kinds of Use Counter: Normal and Worker. Each of those have sub-kinds for which we require separate counts:
    • Normal use counters have sub-kinds for Document (any frame anywhere) and Page (top-level content only).
    • Worker use counters have sub-kinds for Dedicated, Shared, and Service workers.
  • The Normal kind of Use Counter has a few different families of Use Counters that all come together in an enum. There are families of use counters for:
    • [UseCounter]-annotated and UseCounters.conf-configured DOM attributes and methods (which get codegen'd by dom/base/usecounters.py to UseCounterList.h)
    • deprecated operations manually defined in nsDeprecatedOperationList.h
    • each and every CSS property, lists of which are sourced from servo/components/style/properties via GenerateServoCSSPropList.py
    • and some CSS properties which we consider "unknown" to us but we nevertheless count (counted_unknown_properties.py)
  • The Worker kind of Use Counter doesn't have a document, so none of the CSS use counters and few of the API ones make sense. But some [UseCounter]-annotated Web IDL methods/attributes are configured in UseCountersWorker.conf (at time of writing it's mostly console.*) and get use counters for each of the three sub-kinds in its own enum.
  • These two enums are the indices into storage that documents and workers keep about themselves so that, on dtor, they can report the list of these uses that need to be counted.
  • To make these counts useful it is important to have a count of the total number of opportunities these counts had opportunity to be used. (This is because use counters are, despite the name, more useful not as counts but as rates). For legacy reasons these denominators are Histograms with kind: count (instead of the more efficient Scalar with kind: uint):
  • The mapping from enum value to Use Counter Histogram Id is conducted via arithmetic: the order of the use counters in their enums matches the order of the use counters' histogram ids in its enum.

The plan that's taking shape in my brain at the moment is:

  • Leverage the same codegen machinery that DOM uses to generate enum class UseCounter and UseCounterWorker to generate metrics definitions for each use counter as a rate metric and each denominator as a counter metric into one or more metrics.yaml files in dom/base/. By generating them into the source dir, that'll keep us from having to build any magic machinery to replicate this in the pipeline. I expect these to play like Cargo.lock - if you update something relevant the codegen will update the file fully for you and devs would just need to include it in their patchsets.
  • Augment UseCounter.h with functions that can take a UseCounter or UseCounterWorker and accumulate to the correctly-mapped metrics. Call them and increment appropriate denominators during page, document, and worker destruction.
    • See whether changes to metrics in other parts of the code cause large portions of DOM to be rebuilt. If so, consider splitting GleanMetrics.h like StaticPrefList_*.h either as a part of this or as a follow-up.
  • Get Data Review for use counters as a concept. New use counters presently do not require data review because they all are being collected from the same products for the same reasons and are treated in the same way. We shouldn't change that for any additions in the future. Glean being opinionated about ensuring there's Data Reviews for everything means we'll need something to link to.
  • Don't send these rate metrics or the denominator counter metrics in the "metrics" ping. Instead construct a "use-counters" ping.
  • Schedule the "use-counters" ping to be submitted every (uh, I dunno, say) 48h using update-timer. I don't know which component to integrate this against and will reach out to DOM Peers to ask.
    • I also don't know if update-timer is available everywhere we ship Gecko. If it's only available in Desktop then my backup plan for non-update-timer-having Gecko-using projects is to submit "use-counters" pings on every application startup until and unless a cross-session long-interval timer solution presents itself for that project.
  • Move the Use Counters documentation to dom/docs and update it for the new situation. (At the very least mention something about workers in there and link to the Glean Dictionary)
  • Conduct system validation (do all of the different kinds of use counters show up? Does the data smell correct compared to Telemetry-sent or Google-reported data? What does the data from Fenix look like?)
  • Develop an ETL for turning this into a useful public dataset. I'm thinking a table with columns received_day, product_name, geo.country, geo.subdivision, use_counter_name, rate or something like that. I'll be following the process.
  • Throw together a hideous but functional use of the public dataset and link to it on https://telemetry.mozilla.org (perhaps not dissimilar to https://mozilla.github.io/usecounters/index.html ). The hideousness is important to encourage someone else (who knows how to make things pretty) to make a proper one for, say, data.firefox.com. Or MDN.
  • Remove the Telemetry impl of Use Counters from Firefox Desktop. File a ticket for removing it from the pipeline, too (eventually).
  • Send an email to fx-data-dev@, firefox-dev@, dev-platform@ about all this. Post a message to #dom:mozilla.org, #fx-desktop-dev:mozilla.org, and Slack#firefox.
  • Rest.

I intend to conduct most discussions in Phabricator over patchsets of proposed code changes, but this is a good opportunity for early feedback. So let's give it a go:

Emilio: what say you? Is my understanding correct? Is my plan flawed? (I know it's definitely incomplete). Would the proposed changes fit well into how DOM does use counters? Who else should I ask (as well or instead)?

Severity: -- → N/A
Status: NEW → ASSIGNED
Flags: needinfo?(emilio)
Priority: -- → P2
Summary: Collect `UseCounters` using Glean/GIFTT → Collect `UseCounters` using Glean

The parts of the plan that touch stuff I'm familiar with seem reasonable at a glance. Another interesting bit that happens and I don't see mentioned is that page use counters aren't reported on document destruction, but in the parent process. But that should not change the setup significantly afaict, since AIUI I think you still want to keep something similar to the current code that keeps the use counters bitfield.

Flags: needinfo?(emilio)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #9)

...AIUI I think you still want to keep something similar to the current code that keeps the use counters bitfield.

That is correct. If we wanted to count how often in the documents they use all features then we wouldn't need it (we could instead tell the Glean SDK about each and every use of each countable feature), but if we're wanting to count the number of docs that use things at least once, then storage external-to-Glean like the bitfield is needed until the doc has finished doing stuff that might trigger the use of a feature.

In a conversation on #build:mozilla.org, :glandium tells me that the plan I had for a generated-during-build, srcdir-resident, checked-in metrics.yaml is not only not currently done but is expressly forbidden. Which kinda sucks, since that was going to solve several overlapping problems of usability, scrutability, and visibility... but if it's forbidden it'll be for good reasons.

I'll see about rustling up an alternative method for accomplishing this. Maybe push-mode probe-scraper for mozilla-central is more possible these days. Or maybe a manual step is acceptable while we're working our way through the problem.

Further conversations suggest push-mode probe-scraper for mozilla-central is out of scope.

Instead I'll aim for a mach command (say mach gen-usecounter-metrics) to generate the use counter metrics.yaml(s). We could rely on the build process to error out if a dev adds a use counter without running mach gen-usecounter-metrics (because the map from UseCounter to metric will presume the presence of a specifically-named metric that won't exist). If upon testing those ergonomics don't suit (they might, but then again "C++ error message" doesn't have the best reputation for comprehensibility), we could run gen-usecounter-metrics as part of the build and compare it to the one in srcdir and fail the build with an ergonomic error if they don't match.

Is uploading this file as an artifact to taskcluster instead feasible? That'd be much better. We used to get backed out all the time for forgetting to regenerate the property database...

It could be in the future. Right now, though, probe-scraper looks for these files inside the repo, so it'd require some feature work to get the pipeline to look for things in something that isn't version control.

I think "the build doesn't work" with or without an ergonomic message about what to do about it ought to keep this from getting far enough that backouts are needed. Feedback will be provided on the developer's machine, for sure... unless the developer neglects to build before submitting a patch. In which case maybe we'll need to look into a commit hook.

DECISION: I'm going to record use counters as counter metrics, not rate metrics as an optimization. rate metrics will report even if the numerator is 0, which we expect for the vast majority of all the counters. Normally, this would be an unnecessary optimization because 1) The actual impact of using rate vs counter is small per metric, 2) Analysis ergonomics are more important than bytes on the wire or in the db. However, in the case of Use Counters (1) is magnified significantly by the volume of use counters, and (2) is less impactful because we expect ETL to munge the data into derived datasets for common analyses anyway (ie, analysis ergonomics will be addressed by ETL).

Of course, this decision's only based on what I know, so if there's something I don't know or dismissed about how this data is used or ideally represented, I'm happy to review and reverse it.

Let's get this conversation started.

Attachment #9362621 - Flags: data-review?(tlong)

(In reply to Chris H-C :chutten from comment #15)

DECISION: I'm going to record use counters as counter metrics, not rate metrics as an optimization. rate metrics will report even if the numerator is 0, which we expect for the vast majority of all the counters. Normally, this would be an unnecessary optimization because 1) The actual impact of using rate vs counter is small per metric, 2) Analysis ergonomics are more important than bytes on the wire or in the db. However, in the case of Use Counters (1) is magnified significantly by the volume of use counters, and (2) is less impactful because we expect ETL to munge the data into derived datasets for common analyses anyway (ie, analysis ergonomics will be addressed by ETL).

Of course, this decision's only based on what I know, so if there's something I don't know or dismissed about how this data is used or ideally represented, I'm happy to review and reverse it.

I think I'm fine with this, as long as this can be used to power the use-cases we need to support: the current https://mozilla.github.io/usecounters/index.html and things such as this https://chromestatus.com/metrics/feature/timeline/popularity/3581

Note that in this case we'll need to document that "missing from the ping" implies "0".

Comment on attachment 9362621 [details]
data collection review request

(load balancing)

Attachment #9362621 - Flags: data-review?(tlong) → data-review?(royang)

Comment on attachment 9362621 [details]
data collection review request

DATA COLLECTION REVIEW RESPONSE:

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, through the Glean Dictionary.

  1. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. Data upload can be controlled in the product's preferences. For Firefox Desktop, that's in about:preferences#privacy.

  1. If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Emilio and the DOM Core team want to permanently monitor this data.

  1. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical data

  1. Is the data collection request for default-on or default-off?

Default on for all channels.

  1. Does the instrumentation include the addition of any new identifiers?

No

  1. Is the data collection covered by the existing Firefox privacy notice?

Yes

  1. Does the data collection use a third-party collection tool?

No

Result: data-review+

Attachment #9362621 - Flags: data-review?(royang) → data-review+

DECISION: Going to start with a default ping scheduling of once per app session, aiming for AppShutdownConfirmed (so we still have network). It's more often than I'd like but still less often than the "main" ping. The most important things are that it'll be easy to understand, easy to implement, easy to maintain, and portable across products (fully contained inside dom/base. Emilio suggests nsContentUtils. ).

Takes use counter definitions and generates a use_counter_metrics.yaml with
counter metric definitions for each use counter and denominator.

Total use counters at time of writing: 2271 (excludes denominators)

Depends on D193245

Depends on D193247

Also, remove vestigial canRecordExtended push and pop
since we're in the neighbourhood.

Depends on D193250

Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6e8d54a51fb7
Appease python linters r=emilio
https://hg.mozilla.org/integration/autoland/rev/16b0698a239d
mach command gen-use-counter-metrics r=emilio,firefox-build-system-reviewers,ahochheiden
https://hg.mozilla.org/integration/autoland/rev/8bed08cf2a68
Move and update Use Counter documentation r=emilio
https://hg.mozilla.org/integration/autoland/rev/256984effaa2
Raise Glean JS PHF table size to 1k r=janerik
https://hg.mozilla.org/integration/autoland/rev/05ba23510b93
Generate mapping from Use Counter to use counter metrics r=emilio
https://hg.mozilla.org/integration/autoland/rev/5c174b0ee1ab
Increment Glean use counter metrics and submit the use-counters ping during shutdown r=emilio
https://hg.mozilla.org/integration/autoland/rev/2caabd4bfe86
Test new Glean-based use counters r=emilio

Backed out for causing build bustages in UseCounterMetrics.cpp.

  • Backout link
  • Push with failures
  • Failure Log
  • Failure line: /builds/worker/workspace/obj-build/dom/base/UseCounterMetrics.cpp:98:34: error: no member named 'js_late_weekday' in namespace 'mozilla::glean::use_counter_page'
Flags: needinfo?(chutten)

Blast. This is because https://bugzilla.mozilla.org/show_bug.cgi?id=1862922#c18 hit autoland before I did. I get to either wait for it to hit central or find some other way of generating the necessary metrics...

Flags: needinfo?(chutten)
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/920a798dbb8a
Appease python linters r=emilio
https://hg.mozilla.org/integration/autoland/rev/9edef145d1c5
mach command gen-use-counter-metrics r=emilio,firefox-build-system-reviewers,ahochheiden
https://hg.mozilla.org/integration/autoland/rev/220e9ee63ec3
Move and update Use Counter documentation r=emilio
https://hg.mozilla.org/integration/autoland/rev/15e0b0bd3bfb
Raise Glean JS PHF table size to 1k r=janerik
https://hg.mozilla.org/integration/autoland/rev/e594c7eebb91
Generate mapping from Use Counter to use counter metrics r=emilio
https://hg.mozilla.org/integration/autoland/rev/d9c7c84c82df
Increment Glean use counter metrics and submit the use-counters ping during shutdown r=emilio
https://hg.mozilla.org/integration/autoland/rev/3b87419a9eea
Test new Glean-based use counters r=emilio

Ah, non-unified builds need to pass, too. This required some header rejiggering, but now we're clear. However, we're now in soft freeze so this'll have to wait.

Flags: needinfo?(chutten)
Attachment #9364726 - Flags: approval-mozilla-beta?

Takes use counter definitions and generates a use_counter_metrics.yaml with
counter metric definitions for each use counter and denominator.

Total use counters at time of writing: 2271 (excludes denominators)

Original Revision: https://phabricator.services.mozilla.com/D193246
Depends on D193245

Depends on D194225

Attachment #9364728 - Flags: approval-mozilla-beta?
Attachment #9364729 - Flags: approval-mozilla-beta?
Attachment #9364730 - Flags: approval-mozilla-beta?
Attachment #9364732 - Flags: approval-mozilla-beta?
Attachment #9364734 - Flags: approval-mozilla-beta?

Also, remove vestigial canRecordExtended push and pop
since we're in the neighbourhood.

Original Revision: https://phabricator.services.mozilla.com/D193491
Depends on D193250

Depends on D194231

Attachment #9364735 - Flags: approval-mozilla-beta?

Uplift Approval Request

  • Fix verified in Nightly: yes
  • User impact if declined: Current Use Counter telemetry is unreliable, so failure to uplift would remove some capability to measure browser issues.
  • Code covered by automated testing: yes
  • Explanation of risk level: Most of the code is new and not user facing
  • Needs manual QE test: no
  • Is Android affected?: yes
  • Steps to reproduce for manual QE testing: n/a
  • String changes made/needed: no
  • Risk associated with taking this patch: Low
Pushed by pmcmanis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d9904c352643
Appease python linters r=emilio
https://hg.mozilla.org/integration/autoland/rev/d0a97916fda2
mach command gen-use-counter-metrics r=emilio,firefox-build-system-reviewers,ahochheiden
https://hg.mozilla.org/integration/autoland/rev/71e48679c251
Move and update Use Counter documentation r=emilio
https://hg.mozilla.org/integration/autoland/rev/fdbcb0c8a71e
Raise Glean JS PHF table size to 1k r=janerik
https://hg.mozilla.org/integration/autoland/rev/67fb8e7deeed
Generate mapping from Use Counter to use counter metrics r=emilio
https://hg.mozilla.org/integration/autoland/rev/d1d1111b95d5
Increment Glean use counter metrics and submit the use-counters ping during shutdown r=emilio
https://hg.mozilla.org/integration/autoland/rev/238d07f2df95
Test new Glean-based use counters r=emilio
Regressions: 1866479

Comment on attachment 9364726 [details]
Bug 1852098 - Appease python linters

Approved for 121.0b4.

Attachment #9364726 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9364728 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9364729 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9364730 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9364732 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9364734 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9364735 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Regressions: 1866783
Regressions: 1867658

Given the regressions should we back out from beta so we have more time to address them?

Flags: needinfo?(chutten)

I don't have a sense for the severity of the regressions, but my feelings make me think that a few dozen kb on an empty process doesn't warrant a backout, especially since we'll be getting it back when we remove the Telemetry-based ones. My set-point for what's a big deal might need changing, though. What's your experience telling you?

Flags: needinfo?(chutten) → needinfo?(emilio)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #46)

Given the regressions should we back out from beta so we have more time to address them?

FWIW, I would recommend to take the temporary regression right now, as the short-term plan is to deprecate the legacy stuff. Stand by for details in in bug 1866834.

4% of an empty process seems pretty big, but Andrew has a better sense for that kind of stuff.

In any case, taking the regression temporarily seems reasonable if we're confident that removing the old ones would recover the regression. Can we at least push a try run with that removed to confirm that's the case?

Flags: needinfo?(emilio)
Flags: needinfo?(continuation)
Flags: needinfo?(chutten)

Well, it's 4% of only the unclassified heap on an empty process. It's 0.4% of the sum of unclassified, JS, and explicit. I've pushed a try run for the related installer size regression (bug 1866479), but neglected to run the awsy suites (until just now. Thank you for the reminder).

Oh, I should mention that I did notice us reclaiming a similar absolute amount of memory from explicit heap in local testing, but my local n is small and my local machine might not be as stable a test bed as try, so the range was wide.

Flags: needinfo?(chutten)

I commented in bug 1866783.

Flags: needinfo?(continuation)
Blocks: 1877273
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: