Closed Bug 1526072 Opened Last year Closed 10 months ago

Record whether a build was a clobber and cpu utilization in build metrics

Categories

(Firefox Build System :: General, enhancement)

Tracking

(firefox68 fixed)

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: chmanchester, Assigned: chmanchester)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 1 obsolete file)

The build driver has a function[1] that runs on every build and decides whether a clobber is necessary. We should note its result as a part of build telemetry.

[1] https://searchfox.org/mozilla-central/rev/490ab7f9b84570573a49d7fa018673ce0d5ddf22/python/mozbuild/mozbuild/controller/building.py#1502

Rather than checking whether the build itself decided to clobber, it would be better to check whether the build started from a clobbered objdir (or no objdir). That would cover the cases where the tree was clobbered by the user rather than the build system.

There are several sorts of clobbers: a build that starts with no existing objdir, the first build following |./mach clobber|, and a build in which the build system itself decides it needs to clobber.
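A minimal sketch of the "did this build start from a clobbered objdir" check, to illustrate the idea rather than the code that landed. The function name is hypothetical, and it assumes that a missing or empty objdir is a good enough signal for all three sorts of clobber; the real build driver may use a different signal.

```python
import os
import tempfile


def started_from_clobbered_objdir(objdir):
    """Return True if the build is starting from a missing or empty objdir.

    Hypothetical helper: treats "no objdir" and "empty objdir" the same,
    on the assumption that either one means a full build is coming.
    """
    if not os.path.isdir(objdir):
        # No objdir at all: first build, or the directory was removed.
        return True
    # An objdir with no entries is treated like a freshly clobbered one.
    with os.scandir(objdir) as entries:
        return next(entries, None) is None
```

The check has to run before the build writes anything into the objdir, otherwise every build looks incremental.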

We want to measure clobbers because we want to have an idea of how much work a build intends to do (a "full" build here) and because we want to see clobbers go down as the build system improves. All these sorts of clobbers seem useful for these purposes.

Assignee: nobody → cmanchester

It looks like we're going to need to extend our schema to collect this. Connor, can you point me in the direction of the server side schema code that would need to be updated and any additional approvals we may need? Thanks

Flags: needinfo?(sheehan)

The schemas live in the mozilla-pipeline-schemas repo. The build system telemetry is specifically under {templates,schemas}/eng-workflow/build/. The templates directory is where you want to make your changes by hand; changes to the schemas directory should be generated using the build instructions in the README of that repo and added to your pull request. First you will want to add your changes to the Voluptuous schema in-tree at python/mozbuild/mozbuild/telemetry.py, and generate the JSON-schema version using export_telemetry_schema.py. You can then check that version in as build.1.schema.json. You will need to note any changes here and add them to build.1.parquetmr.txt by hand. There isn't an automated process for this, so you'll need to reference existing Parquet schemas or check the online docs (I'm happy to help if you need it). Then you can follow the build instructions, which will generate new files in the schemas directory to be added to your pull request.

If I recall correctly, we should be able to add new data points to the existing schema without bumping the version.

Bug 1291053 tracked the initial approvals from the data team for build telemetry collection. I'm not sure if we need to go through that whole process again to add a new data point though. :kmoir was the driver for that bug, she will probably know more.

Flags: needinfo?(sheehan)

Thanks Connor! It looks like all this will be easier to implement in a single patch stack, so I'll dupe bug 1526067 over to this.

Summary: Record whether a build was a clobber in build metrics → Record whether a build was a clobber and cpu utilization in build metrics
Duplicate of this bug: 1526067

Hello :chutten, I'm asking for your help here because you provided the initial data sign off for this project in bug 1291053. Do the additional collections described in comment 10 require additional data review? Thanks for your help!

Flags: needinfo?(chutten)

Yes, they should have a data review. Both are Category 1 data, are documented, and can be opted-out using the same mechanism as the rest so it shouldn't be a problem.

Flags: needinfo?(chutten)
Attachment #9051856 - Attachment is obsolete: true

Thank you, Chris. Here are my data review form responses. I have included responses from bug 1291053 where things have not changed.

What questions will you answer with this data?

How efficient is our build system?
What proportion of builds are incremental vs full builds?
How much do users trust our build tools?

Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

Establish baselines and measure how local builds are invoked so that we can better address the needs of developers and improve their local build experience.

What alternative methods did you consider to answer these questions? Why were they not sufficient?

We could have talked to people individually and watched them work, but this would have been very time-consuming and does not work well for our very distributed team.

Can current instrumentation answer these questions?

No

List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories on the Mozilla wiki.

The following measurements are added:
Optional('build_attrs', description='Attributes characterizing a build'): {
    Optional('cpu_percent', description='cpu utilization observed during a build'): int,
    Optional('clobber', description='true if the build was a clobber/full build'): bool,
},
These measurements are both category 1. Some clobber builds are user-initiated, which would be category 2.
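For illustration, a payload matching the two fields above could be assembled as follows. This is a sketch only: the helper name is hypothetical and it does not use the in-tree Voluptuous schema. It just shows that both fields are optional, that cpu_percent is an integer, and that clobber is a boolean.

```python
def make_build_attrs(cpu_percent=None, clobber=None):
    """Build a 'build_attrs' dict with the two optional fields.

    Hypothetical helper: fields that were not measured are omitted,
    which matches both keys being Optional in the schema above.
    """
    attrs = {}
    if cpu_percent is not None:
        # The schema declares an int, so round any float measurement.
        attrs['cpu_percent'] = int(round(cpu_percent))
    if clobber is not None:
        attrs['clobber'] = bool(clobber)
    return attrs
```

Under these assumptions, a clobber build observed at 62.7% utilization would be recorded as {'cpu_percent': 63, 'clobber': True}.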

How long will this data be collected? Choose one of the following:

I want this data to be collected for 6 months initially (potentially renewable).

What populations will you measure?

Firefox developers, both MoCo employees and contributors

Which release channels?

N/A; not measuring Firefox usage

Which countries?

Countries where there are Firefox developers who do not opt out of this data collection

Which locales?

Many

Any other filters? Please describe in detail below.

None

If this data collection is default on, what is the opt-out mechanism for users?

Users can opt out the next time they run mach

Please provide a general description of how you will analyze this data.

Analyze it using the tools in the general data ingestion pipeline

Where do you intend to share the results of your analysis?

Mozilla + community wide

Flags: needinfo?(chutten)

Preliminary note:

For future Data Collection Review requests please attach them to the bug and use the data-review? flag to mark them as needing review.

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes. This collection is documented with the rest of Build Telemetry here:
https://firefox-source-docs.mozilla.org/build/buildsystem/telemetry.html

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is opt-in on bootstrap in the first place, and can be opted out through the same mechanism as the rest of build telemetry.

If the request is for permanent data collection, is there someone who will monitor the data over time?

No. This collection is asked to expire after 6 months.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction.

Is the data collection request for default-on or default-off?

Default off.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

No, but it is instead covered by the existing Mozilla Privacy Policy and principles. (It's not a Firefox collection)

Does there need to be a check-in in the future to determine whether to renew the data?

Yes. :chmanchester is responsible for renewing or removing the collection before six months is up (so, around mid-September ish).


Result: datareview+

Flags: needinfo?(chutten)
Pushed by cmanchester@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c1d1576431d7
Avoid trailing spaces when generating the schema for build telemetry. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/a0eb0f43c928
Add cpu utilization and clobber fields to build telemetry schema. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/37942b0f911b
Record cpu utilization and clobber/full builds in build telemetry. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/d3d56eca307f
Add build attributes to documentation. r=nalexander

This needs a trivial update to appease tests. I will update and re-push shortly.

Flags: needinfo?(cmanchester)
Pushed by cmanchester@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/483189a652e1
Avoid trailing spaces when generating the schema for build telemetry. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/901753b5fe58
Add cpu utilization and clobber fields to build telemetry schema. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/827ad1eecbbb
Record cpu utilization and clobber/full builds in build telemetry. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/6319c1c6a6a4
Add build attributes to documentation. r=nalexander