Split the log's mozharness step into smaller parts and emphasise those rather than buildbot steps

NEW
Unassigned

Status

4 years ago
a year ago

People

(Reporter: ahal, Unassigned)

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
I appreciate the rationale behind the structured log view in treeherder and agree that it could be useful. But I think in its current form, it is more of a nuisance than a help to anyone.

There are a few problems as I see it:

1) The sections are buildbot steps which 99.999999% of the time we don't care about. Buildbot failures do happen from time to time, but they are infrequent. And when they do happen, just looking at the raw log is easy enough that it doesn't really make sense to have a log view devoted to them.

Separating the log by mozharness steps would be a lot more useful, but even then, only the ateam/releng cares about that. I think we should optimize for developers and only show the log of the job that actually ran (along with failure annotations). Failures in buildbot/mozharness are easy enough to see in the raw log.

2) The logs aren't actually structured. The view basically just makes it easier to jump to specific sections of the log. But the log being displayed is still just the normal raw log; there isn't really anything structured about it (i.e. it's not possible to change the format on the fly).

3) Structured log is very confusing when we also start talking about actual structured logs from tests. I think developers will have trouble wrapping their heads around what the difference is between the two.


I'd like to propose the following:

* short term - rename "structured log" to "annotated log" to avoid confusion. Only include the mozharness buildbot step along with the failure annotations (at least by default). This is the only step developers (and most of the time even ateam/releng) care about.

* long term - integrate the actual structured logs from the test harness. For example, we could let developers choose what format to display the logs in, whether to show subtest results or not, etc. I think only showing the structured log results from the test harness by default is a good thing. Most devs really don't care about all the mozharness spew before the test logs.
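To make the long-term idea a bit more concrete, here is a rough sketch of rendering harness output with subtest results switched on or off. It assumes the harness emits one JSON object per line with `action`, `test`, `subtest` and `status` fields (modelled loosely on mozlog; the exact field names, the `render` and `parse_lines` helpers and the sample data are all assumptions for illustration, not anything treeherder ships).

```python
import json


def parse_lines(lines):
    """Parse newline-delimited JSON, skipping anything that is not a JSON object."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # e.g. mozharness output mixed into the same log
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def render(records, show_subtests=True):
    """Yield one display line per result, optionally hiding subtest results."""
    for rec in records:
        action = rec.get("action")
        if action == "test_status":
            if show_subtests:
                yield f"  {rec['status']} {rec['test']} | {rec['subtest']}"
        elif action == "test_end":
            yield f"{rec['status']} {rec['test']}"


# Hypothetical sample input: two structured records plus one raw text line.
SAMPLE = [
    '{"action": "test_status", "test": "test_a.html", "subtest": "first check", "status": "PASS"}',
    '{"action": "test_end", "test": "test_a.html", "status": "OK"}',
    "some mozharness spew",
]

for out in render(parse_lines(SAMPLE), show_subtests=False):
    print(out)
```

The point is only that once the data is structured, "show subtests or not" becomes a display flag rather than a different log.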
(Reporter)

Comment 1

4 years ago
I understand that buildbot isn't the only source of input, and that buildbot steps were likely shoe-horned into this model.

Comment 2

4 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #0)
> I appreciate the rationale behind the structured log view in treeherder and
> agree that it could be useful. But I think in its current form, it is more
> of a nuisance than a help to anyone.

Sadly it's still an improvement over TBPL! :-)

> 1) The sections are buildbot steps which 99.999999% of the time we don't
> care about. Buildbot failures do happen from time to time, but they are
> infrequent. And when they do happen just looking at the raw log is easy
> enough to do that it doesn't really make sense to have a log view devoted to
> them.

Disagree from a sheriffs POV.

> Separating the log by mozharness steps would be a lot more useful, but even
> then, only the ateam/releng cares about that.

Agree this would be useful.

> I think we should optimize for
> developers and only show the log of the job that actually ran (along with
> failure annotations). Failures in buildbot/mozharness are easy enough to see
> in the raw log.

I think this view actually already optimises for developers more than TBPL - in that at least we have something that's better than the all-or-nothing "full log" vs "failure log with excerpts". Not to say we shouldn't improve it, but I don't think it's a regression.

i.e. if the job fails, we already only show the failing steps (thereby excluding the buildbot ones). Is this not working?

> 2) The logs aren't actually structured. It's basically just making it easier
> to jump to specific sections of the log. But the log being displayed is
> still just the normal raw log, there isn't really anything structured about
> it (i.e. it's not possible to change the format on the fly).
> 3) Structured log is very confusing when we also start talking about actual
> structured logs from tests. I think developers will have trouble wrapping
> their heads around what the difference is between the two.

Agree, and in fact the UI no longer refers to them as structured.
Are you just referring to the artefact name? (in the URL) Agree this should be renamed too - though it may get removed by bug 1078450 anyway.

> I'd like to propose the following:
> 
> * short term - rename "structured log" to "annotated log" to avoid
> confusion. Only include the mozharness buildbot step along with the failure
> annotations (at least by default). This is the only step developers (and
> most of the time even ateam/releng) care about.

The new view will need to easily allow access to the other steps, or this is a no-go from a sheriffing POV.

Also, just to clarify - are you finding the current workflow problematic for green jobs or failing? I would guess that you are talking about green jobs, since for failing the log viewer already hides successful steps by default.
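For reference, the default being discussed here (hide successful steps when something failed, but keep every step reachable) could look roughly like the sketch below. The `Step` type, the `default_visible` helper and the `"success"`/`"failure"` result strings are hypothetical names used only for illustration; this is not the log viewer's actual code.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    result: str  # assumed to be "success" or "failure"


def default_visible(steps, show_all=False):
    """Return the steps to expand by default.

    Failed jobs show only their failing steps; green jobs show everything.
    show_all=True is the escape hatch for anyone who needs the other steps.
    """
    steps = list(steps)
    if show_all:
        return steps
    failed = [s for s in steps if s.result != "success"]
    return failed if failed else steps


steps = [Step("clobber", "success"), Step("run-tests", "failure")]
print([s.name for s in default_visible(steps)])
print([s.name for s in default_visible(steps, show_all=True)])
```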

> * long term - integrate the actual structured logs from the test harness.
> For example, we could let developers choose what format to display the logs
> in, whether to show subtest results or not, etc. I think only showing the
> structured log results from the test harness by default is a good thing.
> Most devs really don't care about all the mozharness spew before the test
> logs.

Agree, bug 1043739 is the meta for this.


Definitely think there are many improvements we can (and want to) make to the log viewer - just a case of splitting this bug up into a few parts, once we've hashed out a desired end-state.

Updated

4 years ago
Summary: Treeherder "structured log" view is misleading and not very useful when running from buildbot → Treeherder log viewer is misleading and not very useful when running from buildbot

Comment 3

4 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #1)
> I understand that buildbot isn't the only source of input, and that buildbot
> steps were likely shoe-horned into this model.

Just to add, the steps feature was designed to both be generic and with buildbot in mind (I've not been allowing sacrifices in the name of taskcluster etc). We always knew we'd need to split steps up more than just by buildbot step, it was just an easier first target :-)

Updated

4 years ago
Component: Treeherder → Treeherder: Log Viewer

Updated

4 years ago
Priority: -- → P3

Comment 4

4 years ago
This bug is about quite a few things, but I'm picking what seems to be the main issue (and some of the others are already filed) and morphing it to be about that. Let me know if this isn't correct and/or file separate bugs for any of the other problems (one issue per bug ideally :-)).
Summary: Treeherder log viewer is misleading and not very useful when running from buildbot → Log viewer should break up the mozharness step into smaller parts and emphasise those rather than buildbot steps

Updated

4 years ago
Summary: Log viewer should break up the mozharness step into smaller parts and emphasise those rather than buildbot steps → Split the log's mozharness step into smaller parts and emphasise those rather than buildbot steps

Updated

4 years ago
Priority: P3 → P4
This would be super useful for actually finding bits of the build log that didn't fail.

Comment 6

3 years ago
I'm interested in doing this. Let me make some changes to mozharness first to make it easier to extract the data...

Updated

3 years ago
Depends on: 1270265

Updated

3 years ago
Duplicate of this bug: 1084169
FYI we are considering just eliminating the build step metadata from treeherder entirely in bug 1258861, just keeping error lines (presumably in this case failed "build step" lines in logs would become error lines themselves).

Comment 9

3 years ago
inbound now prints "#### Finished %s step (%s)" messages at the end of every mozharness step. These are paired with "##### Running %s step." messages.

The thing in parens is "success" or "failure." I put this in so bug 1270276 can put metrics for success and failure in different buckets, since recording times of failed steps will add noise to data.
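As a rough illustration of how a parser might use these paired markers to split a log into mozharness steps, here is a minimal sketch. The regexes accept four or more `#` characters because the two messages above are quoted with different counts, and they ignore whatever prefix a log line may carry; the `split_steps` helper, the dict layout and the sample log are assumptions for illustration, not the parser from any attached patch.

```python
import re

# Marker patterns based on the messages quoted above; the leading part of
# each log line (timestamps etc.) is ignored by using search().
RUNNING = re.compile(r"#{4,} Running (?P<name>\S+) step\.")
FINISHED = re.compile(
    r"#{4,} Finished (?P<name>\S+) step \((?P<result>success|failure)\)"
)


def split_steps(lines):
    """Group log lines by mozharness step.

    Lines outside any Running/Finished pair are dropped in this sketch.
    A step with no matching Finished marker keeps result=None.
    """
    steps = []
    current = None
    for line in lines:
        started = RUNNING.search(line)
        if started:
            current = {"name": started.group("name"), "result": None, "lines": []}
            steps.append(current)
            continue
        finished = FINISHED.search(line)
        if finished and current is not None and current["name"] == finished.group("name"):
            current["result"] = finished.group("result")
            current = None
            continue
        if current is not None:
            current["lines"].append(line)
    return steps


# Hypothetical log excerpt.
LOG = [
    "buildbot preamble",
    "##### Running clobber step.",
    "removing objdir",
    "#### Finished clobber step (success)",
    "##### Running build step.",
    "error: something broke",
    "#### Finished build step (failure)",
]

for step in split_steps(LOG):
    print(step["name"], step["result"], len(step["lines"]))
```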
Created attachment 8750593 [details] [review]
[treeherder] indygreg:parse_mozharness > mozilla:master

Updated

2 years ago
Component: Treeherder: Log Viewer → Treeherder: Data Ingestion

Updated

a year ago
Component: Treeherder: Data Ingestion → Treeherder: Log Parsing & Classification
Marking as P-- so this comes up in the triage session.

Our thoughts from chatting today were:
* we should either improve the steps behaviour in a taskcluster world, or else remove the complexity entirely
* a factor for whether steps are actually useful is whether we decide to move ahead with the "exclude the things covered by structured logs from the raw text log" (since that would drastically shorten the text log anyway, making steps redundant)
Priority: P4 → --
(In reply to Ed Morley [:emorley] from comment #11)
> Our thoughts from chatting today were:
> * we should either improve the steps behaviour in a taskcluster world, or
> else remove the complexity entirely
> * a factor for whether steps are actually useful is whether we decide to
> move ahead with the "exclude the things covered by structured logs from the
> raw text log" (since that would drastically shorten the text log anyway,
> making steps redundant)

FWIW I thought it might be useful for profiling purposes to track the mozharness/taskcluster steps and their duration in treeherder's db so they could be queried by redash or similar (e.g. to determine if setup steps were getting worse over time), but in retrospect I'm not sure how compelling that would be. In any case, I believe this information may be in activedata, which should also be accessible via redash soon.
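To give a sense of the kind of query meant here, the sketch below averages step durations while keeping successful and failed runs in separate buckets, so failed steps don't add noise to the timing data. The `(name, result, seconds)` record shape and the `mean_durations` helper are assumptions for illustration only; they do not reflect treeherder's or activedata's schema.

```python
from collections import defaultdict
from statistics import mean


def mean_durations(records):
    """Average duration in seconds per (step name, result) bucket."""
    buckets = defaultdict(list)
    for name, result, seconds in records:
        buckets[(name, result)].append(seconds)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical measurements: (step name, result, duration in seconds).
RECORDS = [
    ("clobber", "success", 10.0),
    ("clobber", "success", 20.0),
    ("clobber", "failure", 1.0),
]

for (name, result), avg in sorted(mean_durations(RECORDS).items()):
    print(name, result, avg)
```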

I'd probably err on the side of just removing the complexity from the database/logviewer.