Measuring Test Infrastructure

Status: RESOLVED FIXED

(Reporter: ekyle, Unassigned)

(Reporter)

Description

3 years ago
We would like to gather the measures that buildbot and Mozharness emit. This will let us measure the time spent on various activities (clobber, read-buildbot-config, create-virtualenv) and help us prioritize our time. It will also let us monitor these activities, so we can detect when their behaviour changes or when their duration exceeds some threshold.

Collecting this information would be easier and cleaner if Mozharness emitted structured logs. However, the StructuredLogger was designed around running suites and tests; it does not apply well to general-purpose structured logging, and I expect generalizing it would take some time. Adding structured logs would probably mean emitting another artifact.

We could just scan the text logs. The action_message [1] appears unique enough, and this approach requires no changes to the logging. The only problem I see is that we must know in advance what we are looking for: we cannot be confident we are collecting 'everything'. The StructuredLogger is good because it provides context that defines 'everything'.

I am leaning toward ingesting the text logs: it is the least amount of work and involves the fewest people to implement. Once the data is indexed, we will inevitably have questions it cannot answer; then we can decide whether more logging in Mozharness is required, and whether we should move to structured logging.
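As a rough illustration of the text-scanning approach: the marker format and timestamp layout below are assumptions for the sketch, not the exact mozharness output, and `action_durations` is a hypothetical helper. The idea is just to match action markers and take durations from the gaps between them.

```python
import re
from datetime import datetime

# Assumed line format (illustrative, not the real mozharness output):
#   "<HH:MM:SS> INFO - ##### Running <action> step."
ACTION_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2})\s+INFO - ##### Running (?P<action>[\w-]+) step\."
)

def action_durations(lines):
    """Scan raw log lines and return {action: seconds}, computed from the
    gap between consecutive action markers (the last action is open-ended)."""
    events = []
    for line in lines:
        m = ACTION_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%H:%M:%S")
            events.append((m.group("action"), ts))
    durations = {}
    for (name, start), (_, end) in zip(events, events[1:]):
        durations[name] = (end - start).total_seconds()
    return durations

log = [
    "00:00:01 INFO - ##### Running clobber step.",
    "00:00:05 INFO - some unrelated output",
    "00:00:31 INFO - ##### Running create-virtualenv step.",
    "00:02:11 INFO - ##### Running run-tests step.",
]
print(action_durations(log))
# → {'clobber': 30.0, 'create-virtualenv': 100.0}
```

This is also where the "we must know what we are looking for" weakness shows up: any action whose marker deviates from the assumed pattern is silently dropped.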

Suggestions on how to proceed? 


[1] https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/base/script.py#1813
(Reporter)

Updated

3 years ago
Summary: Measuring Infrastructure → Measuring Test Infrastructure

Comment 1

3 years ago
In bug 1208223 I mention that we have to post per action information to InfluxDB.
If I end up working on that, I'm happy to create a structured log for ActiveData as well.

However, I hope that I can generate this data just before exiting the script and just after the last action (there is a hook called PostScriptAction, which is nameless).
My concern with that approach is whether I would still be able to include the structured log as one more of the artifacts.
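A minimal sketch of the "collect per-action timings, then write one artifact after the last action" idea. The class, hook shape, and JSON layout here are illustrative assumptions; mozharness's actual PostScriptAction mechanism and artifact handling differ.

```python
import json
import time

class TimedRunner:
    """Toy runner that times each action and dumps a structured-log-style
    artifact at the end, mimicking a post-script hook."""

    def __init__(self):
        self.records = []

    def run_action(self, name, func):
        start = time.monotonic()
        func()
        elapsed = time.monotonic() - start
        self.records.append({"action": name, "duration": elapsed})

    def write_artifact(self, path):
        # One JSON object per line, so downstream tools can stream it.
        with open(path, "w") as f:
            for rec in self.records:
                f.write(json.dumps(rec) + "\n")

runner = TimedRunner()
runner.run_action("clobber", lambda: time.sleep(0.01))
runner.run_action("run-tests", lambda: time.sleep(0.01))
runner.write_artifact("actions.json")
```

Writing the file only at the end is what raises the concern above: the artifact must still be registered for upload after the last action has already run.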

Also note that the actions that run the tests are not uniform across jobs (e.g. run-tests vs. run-marionette-tests). This is something I also hope to fix at the same time, ideally finding a way to ensure all jobs use "run-tests" as the test-running action.

This work would probably take me 3-8 weeks depending on what I encounter, so I'm not sure you should wait for me if you want to make this a deliverable.

How does ActiveData determine which artifacts or structured logs are associated with a given job?

Comment 2

3 years ago
Also note that per-action information (timings) is coming through Buildbot's non-normalized Pulse stream; it is also available for you to use, just in case.
Comment 3

3 years ago
So a few thoughts:

1) I have no doubt that ingesting text logs is the fastest way to get this working
2) I think adapting structured logs to fit mozharness isn't that hard; define a couple of actions that correspond to the data we want to consume, and away you go. What's hard is doing all the work to convert the mozharness side.
3) Producing structured logs in mozharness has a number of other benefits. Not least that treeherder could consume them. This would make the error classification stuff much easier to integrate (at the moment I am looking at putting magic markers in the source to identify regions that have a corresponding error_summary).
4) For testing I wonder how much useful information there is in the non-test steps. I wouldn't be super-surprised if we find that 95% of the time for each job is "running test suite", which won't tell us much. I guess knowing about the overhead is useful if we want more chunks/parallelism, but there is overhead inside the test harnesses too e.g. the time taken to load manifests and other data. A more coherent logging story could be very helpful in identifying performance bottlenecks.
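A sketch of what "adapting structured logs to fit mozharness" might look like on the wire: a couple of new action types emitted as JSON lines, loosely in the style of mozlog. The field names (`action`, `time`, `name`, `status`) and the `step_start`/`step_end` action names are assumptions for illustration, not an existing mozlog schema.

```python
import json
import time

def log_event(stream, action, **fields):
    """Emit one structured log entry as a JSON line (field names are
    illustrative, loosely modelled on mozlog's JSON-lines format)."""
    entry = {"action": action, "time": int(time.time() * 1000)}
    entry.update(fields)
    stream.append(json.dumps(entry))

stream = []
log_event(stream, "step_start", name="create-virtualenv")
log_event(stream, "step_end", name="create-virtualenv", status="OK")

parsed = [json.loads(line) for line in stream]
print([e["action"] for e in parsed])
# → ['step_start', 'step_end']
```

Defining the action vocabulary is the easy half, as point 2 says; the cost is converting mozharness's existing free-text logging to emit these events.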

Comment 4

3 years ago
(In reply to James Graham [:jgraham] from comment #3)
> 2) I think adapting structured logs to fit mozharness isn't that hard;
> define a couple of actions that correspond to the data we want to consume,
> and away you go. What's hard is doing all the work to convert the mozharness
> side.

We could create an artifact which follows a structured log format without converting Mozharness completely to use structured logs.
(Reporter)

Comment 5

3 years ago
James,

> 3) Producing structured logs in mozharness has a number of other benefits.

Yes, another specific benefit of structured logs is being able to add log lines to Mozharness to measure new and interesting actions, while the downstream log processing need not change to report them.
I think changing how mozharness does logging is worth pursuing, but I don't want to block this on that.  I think ingesting raw test logs is probably the fastest way to go, and allows us the additional opportunity to better instrument the harnesses to emit information about their own overhead without much change to the data collection mechanism.
(Reporter)

Comment 7

2 years ago
All the test infrastructure data is in ActiveData, and has been for a while.  We have a few dashboards that use the data.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED