Closed Bug 1634123 Opened 7 months ago Closed 7 months ago

job_resource_usage framework needs to be registered

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: sclements, Unassigned)

We see a lot of errors in our Treeherder/Perfherder logs because the job_resource_usage framework is not officially registered, even though we are parsing its PERFHERDER_DATA log lines (see https://searchfox.org/mozilla-central/source/testing/mozharness/mozharness/base/python.py#678).

We need to know if this framework should be registered, or if that is not desired, then the code needs to be changed so it will not report data to Perfherder.

Blocks: 1596347

Hi, can you tell me if this framework should be registered? This is something simple I can do on my end.

Flags: needinfo?(jlund)

302 to wlach. I'm unfamiliar with perfherder :)

Flags: needinfo?(jlund) → needinfo?(wlachance)

I'm not really so involved with perfherder these days, but IIRC :trink and :ekyle were ingesting this type of data into BigQuery and ActiveData. They might have thoughts on whether this is something we should continue to do. It might be the case that we want to continue ingesting this somewhere, even if not into perfherder itself.

Flags: needinfo?(mtrinkala)
Flags: needinfo?(klahnakoski)

My understanding is that job_resource_usage is not monitored by the perf sheriffs, so it is not worth ingesting. Also, it is quite large. Here is a list of the top frameworks by record count from ActiveData:

name                 record count
job_resource_usage   379,470,630
talos                373,085,357
raptor               276,946,540
build_metrics        127,982,908
vcs                  114,323,678
devtools              29,627,140

Flags: needinfo?(klahnakoski)

Might this information be useful in improving runtime of things?

Flags: needinfo?(jmaher)
Flags: needinfo?(catlee)
Flags: needinfo?(bhearsum)

What does it mean for the framework to be registered?

Flags: needinfo?(catlee)

It needs to be added to the performance_framework fixtures, used to populate the PerformanceFramework table in Treeherder: https://github.com/mozilla/treeherder/blob/master/treeherder/perf/fixtures/performance_framework.json
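
For anyone unfamiliar with fixtures, here is a rough sketch of what "registering" could look like. It assumes the linked file is a standard Django fixture (a JSON list of `model` / `pk` / `fields` records). The model label, the `name` and `enabled` fields, and the `pk` value 99 are assumptions for illustration only, so check the linked file for the real layout before copying anything.

```python
import json


def make_framework_fixture(pk, name, enabled=True):
    """Build one Django-fixture-style record for a performance framework.

    The model label and field names are assumed, not copied from Treeherder.
    """
    return {
        "pk": pk,
        "model": "perf.performanceframework",  # assumed app/model label
        "fields": {"name": name, "enabled": enabled},  # assumed field names
    }


# pk=99 is a placeholder; a real entry would use the next free primary key.
entry = make_framework_fixture(pk=99, name="job_resource_usage")
print(json.dumps([entry], indent=2))
```

If the data is not wanted in Perfherder, no such entry is added and the framework stays unregistered.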

When we parse data after ingesting jobs (tasks) from pulse guardian, we kick off log parsing in part to find PERFHERDER_DATA log lines, and then extract that data to add to various Perfherder tables. Before we store it, we verify that the framework exists in the framework table (we only get that detail after parsing the log lines).
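
A minimal sketch of that flow, not Treeherder's actual parser: it assumes each line carries a `PERFHERDER_DATA: ` prefix followed by a JSON object whose `framework.name` identifies the framework. `REGISTERED_FRAMEWORKS` is a hypothetical stand-in for the rows of the PerformanceFramework table, and both function names are made up for illustration.

```python
import json

PREFIX = "PERFHERDER_DATA: "

# Hypothetical stand-in for the contents of the PerformanceFramework table.
REGISTERED_FRAMEWORKS = {"talos", "raptor", "build_metrics"}


def extract_perf_blobs(log_lines):
    """Yield the decoded JSON object from every PERFHERDER_DATA log line."""
    for line in log_lines:
        idx = line.find(PREFIX)
        if idx == -1:
            continue  # not a performance data line
        try:
            blob = json.loads(line[idx + len(PREFIX):])
        except json.JSONDecodeError:
            continue  # malformed payload, skip it
        if isinstance(blob, dict):
            yield blob


def split_by_registration(log_lines, registered=REGISTERED_FRAMEWORKS):
    """Return (stored, rejected) framework names found in the log.

    The framework name is only known after the line has been parsed,
    which is why the check cannot happen any earlier.
    """
    stored, rejected = [], []
    for blob in extract_perf_blobs(log_lines):
        framework = blob.get("framework")
        name = framework.get("name") if isinstance(framework, dict) else None
        (stored if name in registered else rejected).append(name)
    return stored, rejected
```

With this shape, every job_resource_usage line is still fully parsed and then lands in the rejected list, which is the wasted work described here.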

So if this data doesn't need to be monitored, the PERFHERDER_DATA log lines should be removed in the file in my first comment so the parser doesn't waste time processing data that is basically thrown away (and spam the logs about it). From my conversation with Ionut, in bug 1596347, it doesn't sound like this data is needed by Perf sheriffs (and presumably this would have been added to the frameworks a while ago if it was).

(In reply to Sarah Clements [:sclements] from comment #7)

> So if this data doesn't need to be monitored, the PERFHERDER_DATA log lines should be removed in the file in my first comment so the parser doesn't waste time processing data that is basically thrown away (and spamming the logs about it). From my conversation with Ionut, in bug 1596347, it doesn't sound like this data is needed by Perf sheriffs (and presumably this would have been added to the frameworks a while ago if it was).

I think ActiveData and BigQuery both parse the logs for PERFHERDER_DATA, presumably they will also stop ingesting this information if the log lines are removed. Maybe that's ok though? It all depends on if anyone is using this information, that's why I needinfo'd Kyle and Trink, in the hopes that they would know.

Flags: needinfo?(wlachance)

Kyle commented :) But we can see what Trink (or anyone else) has to say about it.

I have not seen the data used in ActiveData for at least a few months. But I suggest we continue to ingest it into ActiveData and BigQuery: the data is small for these two datastores, and we keep it available for when a project needs it. We definitely cannot monitor it if we are not collecting it. I do not want to reestablish an ETL pipeline if/when this data is required again.

Treeherder can either ignore the performance_framework sooner in its pipeline, or add it to a known exception list to make the logs nicer.

All perfdata is ingested into BigQuery, so it is available for use, but I am not the end user of that data... asking around. As for cost, it is minimal in both processing and storage.

Flags: needinfo?(mtrinkala)
Flags: needinfo?(jmaher)

(In reply to Kyle Lahnakoski [:ekyle] from comment #10)

> I have not seen the data used in ActiveData for at least a few months. But, I suggest we continue to ingest it into ActiveData and BigQuery: The data is small, for these two datastores, and we keep it available for when a project needs it. We definitely can not monitor it if we are not collecting it. I do not want to reestablish an ETL pipeline if/when this data is required again.
>
> Treeherder can either ignore the performance_framework sooner in its pipeline, or add it to a known exception list to make the logs nicer.

Ok, I thought you were saying from your first comment that you didn't see the utility in ingesting it. I guess you meant that for Treeherder.

Treeherder can't ignore the framework until the PERFHERDER_DATA log lines are parsed, since that's where the framework is "discovered". I can of course add it to a known exceptions list so it won't be logged.
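
Roughly what I have in mind for the exceptions list, as a sketch only: the names `KNOWN_UNREGISTERED` and `should_store`, the logger name, and the message text are all hypothetical and not taken from the Treeherder code.

```python
import logging

logger = logging.getLogger("perf_ingest_sketch")  # hypothetical logger name

# Frameworks we know are unregistered on purpose; skipping them is not an error.
KNOWN_UNREGISTERED = frozenset({"job_resource_usage"})


def should_store(framework_name, registered):
    """Decide whether a parsed blob is stored, logging only unexpected misses."""
    if framework_name in registered:
        return True
    if framework_name not in KNOWN_UNREGISTERED:
        # Only frameworks we have never heard of are worth a log entry.
        logger.warning("Performance framework %s does not exist", framework_name)
    return False
```

The data is still dropped on the Treeherder side either way; the only change is that the expected case stops producing log noise.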

Flags: needinfo?(bhearsum)
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → WORKSFORME