Closed Bug 1272352 Opened 9 years ago Closed 8 years ago

Taskcluster not providing blobber-uploaded files to treeherder

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: jgraham, Assigned: garndt)

References

Details

(Whiteboard: [feature])

Attachments

(1 file)

taskcluster-treeherder PR 29 8 years ago Greg Arndt [:garndt] 61 bytes, text/x-github-pull-request camd : review+

James Graham [:jgraham]

Reporter

Description

•

9 years ago

Test jobs upload files to blobber that are then used either by developers or other tools. For an example of these files, see the "Job Detail" panel for [1]. Of particular importance is the errorsummary file, which contains a machine-readable list of test failures and other errors, and is the basis for the autoclassification feature. Taskcluster jobs are currently not providing treeherder with the information required to locate these uploaded files, and as a result features including autoclassification do not work with taskcluster jobs. [1] https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&selectedJob=3870022

James Graham [:jgraham]

Reporter

Comment 1

•

9 years ago

The code for reading the blobber data is at https://github.com/mozilla/treeherder/blob/master/treeherder/etl/buildapi.py#L165

James Graham [:jgraham]

Reporter

Comment 2

•

9 years ago

And the input to that code is generated by the TinderboxPrintParser: https://github.com/mozilla/treeherder/blob/master/treeherder/log_parser/parsers.py#L278

James Graham [:jgraham]

Reporter

Comment 3

•

9 years ago

wlach pointed out that the TinderboxPrintParser stuff isn't actually used as input to that function, but is responsible for the links in the treeherder UI.

Greg Arndt [:garndt]

Assignee

Comment 4

•

9 years ago

So looking at the sample job, and others I found on treeherder, there is a set of artifact information that are uploaded for jobs. Is this only for some of the artifacts produced by a job or for any artifact? How are these artifacts submitted to Treeherder? Are they just artifacts in the job details piece of the "Job Info" message with a content_type of link? http://treeherder.readthedocs.io/submitting_data.html#job-artifacts-format If so, when posting the completion of a job to treeherder, we could get a list of all artifacts for a task run and post them within the job message.
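A rough sketch of the idea in the last sentence, for illustration only. It is not the taskcluster-treeherder code: the helper names, the shape of the `artifacts` list, the `url_prefix` parameter and the "artifact uploaded" title are all assumptions, and the `job_details` field names are taken from my reading of the "Job Info" example in the Treeherder docs linked above.

```python
def job_details_from_artifacts(artifacts, url_prefix):
    """Build one link-type job detail entry per artifact.

    Assumptions: `artifacts` is a list of dicts with a "name" key such as
    "public/test_info/resource-usage.json", and `url_prefix` is whatever
    base URL the artifacts for this task run are served from.
    """
    details = []
    for artifact in artifacts:
        name = artifact["name"]
        details.append({
            "title": "artifact uploaded",  # hypothetical label
            "value": name,                 # text shown for the link
            "content_type": "link",
            "url": url_prefix.rstrip("/") + "/" + name,
        })
    return details


def job_info_artifact(artifacts, url_prefix):
    """Wrap the details in a "Job Info" artifact for the job message."""
    return {
        "type": "json",
        "name": "Job Info",
        "blob": {
            "job_details": job_details_from_artifacts(artifacts, url_prefix),
        },
    }
```

Calling `job_info_artifact(...)` once per completed task run would give a single artifact that carries every uploaded file as a link.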

William Lachance (:wlach)

Comment 5

•

9 years ago

(In reply to Greg Arndt [:garndt] from comment #4)
> So looking at the sample job, and others I found on treeherder, there is a set of artifact information that are uploaded for jobs. Is this only for some of the artifacts produced by a job or for any artifact?

They aren't actually related to artifacts at all, they're part of the log_references property of jobs that you submit. See this link that jgraham posted in comment 2.

(To simplify things and reduce confusion, I think we should just kill the TinderboxPrints that print out that information, and put the extra log references in the job details panel... I'll file a bug for this later.)

James Graham [:jgraham]

Reporter

Comment 6

•

9 years ago

Note that it's not just extra log references, it's everything in the blobber_files property. At the moment treeherder does the needed magic to convert a _errorsummary file to work like a log_url. I don't particularly mind if taskcluster instead wants to put the errorsummary file in with the other logs, as long as we don't end up double-counting each file. And we also want to ensure that we get all the other non-log blobber files since they are used by developers.
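To make the two requirements concrete (treat the errorsummary file like a log, keep every other blobber file, and count nothing twice), here is a minimal sketch. It is not the code in buildapi.py: the function name, the `_errorsummary.log` suffix, the `errorsummary_json` log name and the `{filename: url}` input shape are assumptions for illustration.

```python
def split_blobber_files(blobber_files, existing_log_urls=()):
    """Split a {filename: url} mapping into log references and plain files.

    Files whose name ends in "_errorsummary.log" (assumed suffix) become
    log references; everything else is kept as a developer-facing file.
    URLs already listed in `existing_log_urls` are skipped so that no
    file is counted twice.
    """
    seen = set(existing_log_urls)
    log_references = []
    other_files = {}
    for name, url in sorted(blobber_files.items()):
        if url in seen:
            continue  # already submitted as a log elsewhere
        seen.add(url)
        if name.endswith("_errorsummary.log"):
            log_references.append({"name": "errorsummary_json", "url": url})
        else:
            other_files[name] = url
    return log_references, other_files
```

Whichever side ends up submitting the errorsummary file, passing the URLs it has already sent as `existing_log_urls` is one way to avoid the double-counting mentioned above.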

Selena Deckelmann :selenamarie :selena

Comment 7

•

9 years ago

Triaging as "feature" to complete in Q2. I realize that this is for parity with buildbot. For our project, it's feature work in that someone (ahem, garndt) is going to have to dig into the parsing and create something new to resolve it.

Priority: P1 → P3

Whiteboard: [feature]

Selena Deckelmann :selenamarie :selena

Comment 8

•

8 years ago

Greg - is this something that we can look at after the tc-treeherder service deployment? Could someone other than you have a look?

Flags: needinfo?(garndt)

Greg Arndt [:garndt]

Assignee

Comment 9

•

8 years ago

I made some changes to taskcluster-treeherder staging to show uploaded artifacts. Is this what we're after here? Click the job details panel for this job: https://treeherder.allizom.org/#/jobs?repo=mozilla-inbound&selectedJob=28662952

Armen [:armenzg]

Comment 10

•

8 years ago

It looks good to me. Passing NI to filer: http://people.mozilla.org/~armenzg/sattap/9e28eb50.png

Flags: needinfo?(garndt) → needinfo?(james)

Greg Arndt [:garndt]

Assignee

Comment 11

•

8 years ago

Right now the artifact name/path displayed is limited to 50 characters, but there will be a patch soon for treeherder that will increase this to 125. I know we have some artifact names+paths that are longer than 50 characters. Taskcluster artifact names are usually a path such as "public/test_info/resource-usage.json". Should these display that way, or do we just want "resource-usage.json" to appear? If having just the filename is ideal, there might be collisions in treeherder, as there are going to be uniqueness constraints on the displayed text: if two artifacts have the same name and we are only displaying the base file name rather than the full path, only the latest URL will be stored, regardless of whether one artifact is path/to/artifact/artifact.json and the other is different/path/artifact.json.

James Graham [:jgraham]

Reporter

Comment 12

•

8 years ago

It's not clear to me if this is the complete change, because I don't know if this puts the data through the same ingestion pipeline as buildbot artifacts. In particular if we don't go through the code at [1] then further changes are needed to make TC work with autoclassification, albeit possibly not changes that the TC team need to make. [1] https://github.com/mozilla/treeherder/blob/b040bb40455e2baac403813c882d143711262b49/treeherder/etl/buildapi.py#L169

Flags: needinfo?(james) → needinfo?(cdawson)

Cameron Dawson [:camd]

Comment 13

•

8 years ago

James-- It won't go through that code. The paths start being the same at ``store_jobs``. So, ``job_loader`` would need your modification for blobber files special handling. Or perhaps create an abstracted function they can both share.

Flags: needinfo?(cdawson)

Greg Arndt [:garndt]

Assignee

Comment 14

•

8 years ago

I have updated the PR to use the filename as the link text rather than the whole artifact path. There is a limit on the TH side of 125 characters for the linkText. Artifacts will now appear as: https://treeherder.allizom.org/#/jobs?repo=try&selectedJob=24293578 For artifacts that have the same file name (but live at different artifact paths), they will display with an incrementing number. This is because if treeherder encounters a link that has the same label ("artifact uploaded") and the same link text (such as the filename), then the most recent URL recorded will win, resulting in all of the links having the same URL. The incrementing number helps mitigate that.
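The naming scheme described above could look something like the following sketch. It is illustrative only and not the code from the PR: the function name and the " (1)" suffix format are assumptions, since the comment does not say how the incrementing number is rendered.

```python
import posixpath

MAX_LINK_TEXT = 125  # Treeherder-side linkText limit mentioned above


def link_texts(artifact_paths, max_len=MAX_LINK_TEXT):
    """Return one display text per artifact path, based on the filename.

    Repeated filenames get an incrementing numeric suffix (assumed format
    " (n)") so that two artifacts never share the same link text.
    """
    seen = {}
    texts = []
    for path in artifact_paths:
        base = posixpath.basename(path)
        count = seen.get(base, 0)
        seen[base] = count + 1
        suffix = "" if count == 0 else " ({})".format(count)
        # Truncate the filename rather than the suffix so the number survives.
        texts.append(base[: max_len - len(suffix)] + suffix)
    return texts
```

For example, `link_texts(["path/to/artifact/artifact.json", "different/path/artifact.json"])` gives `["artifact.json", "artifact.json (1)"]`, so each link keeps its own URL.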

Greg Arndt [:garndt]

Assignee

Comment 15

•

8 years ago

Attached file taskcluster-treeherder PR 29

Attachment #8770377 - Flags: review?(cdawson)

Cameron Dawson [:camd]

Updated

•

8 years ago

Attachment #8770377 - Flags: review?(cdawson) → review+

Greg Arndt [:garndt]

Assignee

Comment 16

•

8 years ago

This has been merged into prod and deployed.

Assignee: nobody → garndt

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

James Graham [:jgraham]

Reporter

Updated

•

8 years ago

Blocks: 1294149
