Closed Bug 1603185 Opened 10 months ago Closed 4 months ago

We want a telemetry for optimizing content process creation for typical sites.

Categories

(Core :: Performance, enhancement, P3)

Tracking

()

RESOLVED FIXED
mozilla78
Fission Milestone M6a
Tracking Status
firefox78 --- fixed

People

(Reporter: sefeng, Assigned: barret)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

The proposed implementation is to collect the number of unique origins per loaded tab. With this number, we can get the average and the distribution of the number of site origins in a typical tab. We can then use that to optimize content process creation for typical sites.

Olli, do you agree with the proposed implementation?
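To make the proposed measurement concrete, here is a minimal sketch in Python of counting unique origins for one tab. It is an illustration only, not the Gecko implementation: `origin_of`, `unique_origin_count`, and the example URLs are hypothetical, and the notion of "origin" is simplified to scheme plus host and port, without default-port normalization.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return a simplified origin (scheme://host[:port]) for a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}"


def unique_origin_count(document_urls) -> int:
    """Count distinct origins among all documents (top level and frames) in one tab."""
    return len({origin_of(u) for u in document_urls})


# Hypothetical tab: a top-level page plus three frames.
tab_urls = [
    "https://news.example.com/story",
    "https://news.example.com/sidebar",
    "https://ads.example.net/frame",
    "https://video.example.org/embed?id=1",
]
count = unique_origin_count(tab_urls)
```

One such value would be recorded per loaded tab, and the average and distribution would come from aggregating those values.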

Flags: needinfo?(bugs)

How would we use the number for optimizing anything?
(We know that we must optimize process creation, but we need to do that anyhow.)

Flags: needinfo?(bugs)

Say we know that typical pages use 2 processes; then we can try to always have 2 free processes ready before a page load? Just my 2 cents, guessing.
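A rough sketch of that idea in Python, assuming we already have per-page origin counts from the telemetry. The function names and the sample values are made up for illustration, and using the median as the "typical" value is my assumption, not something decided in this bug.

```python
import statistics


def spare_target(origins_per_page_samples) -> int:
    """Pick how many idle processes to keep, using the median page as 'typical'."""
    return round(statistics.median(origins_per_page_samples))


def processes_to_launch(target: int, idle_now: int) -> int:
    """Number of processes to start so that `target` idle ones are available."""
    return max(0, target - idle_now)


# Made-up samples of unique origins per page.
samples = [1, 2, 2, 3, 2, 1, 5]
target = spare_target(samples)
to_start = processes_to_launch(target, idle_now=0)
```

A higher percentile than the median could be used instead if pages with many origins matter more than the cost of idle processes.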

^ What do you think Olli?

Flags: needinfo?(bugs)

ah, that. Yeah, makes sense.
And this is about collecting number of unique sites per tab - sites which aren't being used in other tabs, right?

Flags: needinfo?(bugs)
Priority: -- → P3

Initially I thought we wanted to include the sites that are currently being used in other tabs.

However, after thinking about it a bit more, I agree that we should only count the sites that aren't being used in other tabs; this tells us how many new processes are going to be created for typical sites.
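For clarity, the difference between the two counts can be sketched as a set difference. This is plain Python for illustration only; `new_sites_needed` and the site names are hypothetical.

```python
def new_sites_needed(tab_sites, sites_in_other_tabs) -> set:
    """Sites used by this tab that no other open tab is already using."""
    return set(tab_sites) - set(sites_in_other_tabs)


# Hypothetical state: two sites are already loaded in other tabs.
already_open = {"example.com", "cdn.example.net"}
this_tab = ["example.com", "ads.example.org", "video.example.io"]

total_sites = len(set(this_tab))                            # per-tab count
delta_sites = len(new_sites_needed(this_tab, already_open))  # new-process count
```

Here the per-tab count is 3 while the delta is 2, because `example.com` is already in use elsewhere.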

Do you agree Randell?

Flags: needinfo?(rjesup)

The number of origins per-page is useful for determining the number of processes to prestart at startup and how many to have when there are "few" tabs open. The number that aren't already open will be lower, which will be too low for low-tab/startup cases, and too high for add-one-more-tab-to-a-bunch cases. I want information about the state of the web as we hit it, which can be used to inform various decisions and tradeoffs. The delta-new-processes number may be useful in looking at tuning an algorithm for how many to keep in reserve, though it wasn't what I was looking for here.

The alternative would be to key it so we collect the distribution of number of new origins needed vs. number of tabs loaded (not number of tabs, note) or versus number of content processes running.
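As a sketch of what "keying" could look like, the snippet below keeps one distribution of new-origin counts per bucket of loaded tabs. It is plain Python, not the Telemetry API; the class, the bucket boundaries, and the recorded values are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def bucket_label(loaded_tabs: int) -> str:
    """Map a loaded-tab count to a coarse key (hypothetical bucket boundaries)."""
    if loaded_tabs <= 1:
        return "1"
    if loaded_tabs <= 4:
        return "2-4"
    if loaded_tabs <= 9:
        return "5-9"
    return "10+"


class KeyedDistribution:
    """One histogram of new-origin counts per loaded-tab bucket."""

    def __init__(self):
        self._data = defaultdict(Counter)

    def record(self, loaded_tabs: int, new_origins: int) -> None:
        self._data[bucket_label(loaded_tabs)][new_origins] += 1

    def snapshot(self) -> dict:
        return {key: dict(hist) for key, hist in self._data.items()}


dist = KeyedDistribution()
for tabs, new_origins in [(1, 3), (1, 3), (3, 1), (12, 0)]:
    dist.record(tabs, new_origins)
result = dist.snapshot()
```

Keeping the number of keys small keeps each per-key distribution well populated.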

Flags: needinfo?(rjesup)

I am not convinced that we should use the number of site origins per-page....

I mean, I get the point that if we only count the delta-new-process, it may vary depending on the current state of the browser. My argument is: isn't that more realistic and accurate? I think it's more representative.

    The number that aren't already open will be lower, which will be too low for low-tab/startup cases

Well, I think this is only true if the majority of users open a bunch of other tabs first and then open sites with a high number of origins, so that the data misses the minority of users who open such sites at startup. If that is the case, do we really care? And if it is not the case, meaning the majority of users open these high-origin-count sites at startup, then the delta-new-process number is able to capture that as well.

    too high for add-one-more-tab-to-a-bunch cases

Yeah, same as the previous one: I just feel that if this is an edge case, then we probably don't care. And if it's not an edge case, then the delta-new-process number can capture this as well.

I hope that's clear. Let me know what you think, Randell.

Flags: needinfo?(rjesup)

My goal was to get data that will help a) guide our work on process startup and caching including the model/algorithm; b) help us understand the web our users are seeing (like for example if over the next year the average origins/site goes from 1.5 to 2).

For example, we may want to prestart more origins when someone has 1 tab open than if they have 20; knowing typical origins used (and their distribution) helps set the startup/1-tab end, and the total origins (which we have) and delta-origins-per-new-load would help design how much we tail that off as number of active tabs (or origins) increases. If there's too big a tail of high-origins pages (like CNN) in the data, we may not be able to tail off the prestart #'s.
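A minimal sketch of the kind of tail-off described here, in Python. The linear shape and every parameter (`typical_origins`, `floor`, `taper_tabs`) are placeholders I made up; the real values and curve would come from the telemetry.

```python
def prestart_count(active_tabs: int, typical_origins: int = 2,
                   floor: int = 1, taper_tabs: int = 20) -> int:
    """Linearly reduce the number of prestarted processes as tabs increase.

    At startup or with one tab, prestart `typical_origins` processes; at
    `taper_tabs` or more tabs, prestart only `floor`.
    """
    if active_tabs <= 1:
        return typical_origins
    if active_tabs >= taper_tabs:
        return floor
    fraction = (active_tabs - 1) / (taper_tabs - 1)
    return max(floor, round(typical_origins - fraction * (typical_origins - floor)))


# Prestart counts for 1..25 tabs when the typical page uses 4 origins.
schedule = [prestart_count(n, typical_origins=4) for n in range(1, 26)]
```

If the data shows a large tail of high-origin pages, `floor` would have to stay close to `typical_origins`, which is the "may not be able to tail off" case.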

Perhaps we could get data to help with a) by getting a distribution of unique origins vs #of tabs, but I don't think we can do that sort of 2d distribution in Telemetry.

Flags: needinfo?(rjesup)
Fission Milestone: --- → M6

Reopening because I closed this bug by mistake. We do want this telemetry to know how to tune Fission's iframe process management.

We should try to collect this information even if users don't have Fission enabled (because we don't have many Fission users yet).

Assignee: nobody → brennie
Status: RESOLVED → REOPENED
Fission Milestone: M6 → M6a
Resolution: WONTFIX → ---
Attached file telemetry.txt
Attachment #9138955 - Flags: data-review?(cdowhygelund)
Attachment #9138955 - Flags: data-review?(cdowhygelund) → data-review?(chutten)
Comment on attachment 9138955 [details]
telemetry.txt

DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes. This collection is Telemetry so is documented in its definitions file [Histograms.json](https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/telemetry/Histograms.json) and the [Probe Dictionary](https://telemetry.mozilla.org/probe-dictionary/).

    Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Barret Rennie is responsible.

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical. (the user's interaction isn't really encoded in the number of origins, so this seems more like a Cat1 to me.)

    Is the data collection request for default-on or default-off?

Default on for all channels.

    Does the instrumentation include the addition of any new identifiers?

No.

    Is the data collection covered by the existing Firefox privacy notice?

Yes.

    Does there need to be a check-in in the future to determine whether to renew the data?

No. This collection is permanent.

---
Result: datareview+
Attachment #9138955 - Flags: data-review?(chutten) → data-review+
Depends on: 1634764
Pushed by brennie@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/13c6ff0fdb29
Collect per tab unique site origin telemetry r=Dexter,Gijs,nika
Regressions: 1640080
Status: REOPENED → RESOLVED
Closed: 6 months ago → 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78