Closed Bug 1283845 Opened 8 years ago Closed 8 years ago

Pulse job data schema doesn't have a parse_status field

Categories

(Tree Management :: Treeherder: Data Ingestion, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: emorley, Unassigned)

For normal API job submission, log references can be given a parse_status of {'pending', 'parsed', 'failed'}. However, for Pulse data ingestion, there doesn't appear to be such a field:
https://github.com/mozilla/treeherder/blob/24a31fe78b8c70960dd9bf258c2b3f081b690f6a/schemas/pulse-job.yml#L322-L397

Instead job_loader.py uses:
  "parse_status": "parsed" if "steps" in logref else "pending"

This is definitely much cleaner and makes sense, however for the arewefastyet jobs (submitted via the Treeherder python client at the moment), we have a new use-case of "the log shouldn't be parsed, but we don't have any generated steps" (see bug 1283413).
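To make the gap concrete, here is a minimal sketch of that inference pulled out into a standalone function. The function name and the sample log references (including their "name" and "url" values) are made up for illustration; only the conditional expression itself comes from job_loader.py.

```python
def infer_parse_status(logref):
    """Mirror the job_loader.py expression quoted above."""
    # A log reference that carries "steps" is treated as already parsed;
    # anything else is queued for parsing.
    return "parsed" if "steps" in logref else "pending"


# Hypothetical log references, for illustration only.
log_with_steps = {
    "name": "example-log",
    "url": "https://example.com/log.txt",
    "steps": [{"name": "build"}],
}
log_without_steps = {
    "name": "example-log",
    "url": "https://example.com/log.txt",
}

print(infer_parse_status(log_with_steps))
print(infer_parse_status(log_without_steps))
```

With only these two outcomes, a submitter that has no steps cannot say "do not parse this log": omitting steps always yields 'pending'.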

As such, I wonder how best to satisfy this use case via Pulse data ingestion for the future, in case jobs like arewefastyet start using Pulse instead?
Flags: needinfo?(cdawson)
We could add a field to the YML where you can have an optional ``parseStatus`` as a peer of ``steps`` and be able to set it to "skip" or something like that.  That would be pretty straightforward.  We'd just have to change the logic you mentioned a bit, but that's easy enough.
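A rough sketch of how the changed logic might look, assuming the optional field is called ``parseStatus`` and that "skip" is the only value it accepts. Neither of those is settled, the function name is invented, and nothing like this exists in job_loader.py today.

```python
# Hypothetical: values a submitter could set explicitly via ``parseStatus``.
EXPLICIT_PARSE_STATUSES = {"skip"}


def resolve_parse_status(logref):
    """Prefer an explicit ``parseStatus``; otherwise infer from ``steps``."""
    explicit = logref.get("parseStatus")
    if explicit is not None:
        if explicit not in EXPLICIT_PARSE_STATUSES:
            raise ValueError("unsupported parseStatus: %r" % (explicit,))
        return explicit
    # Unchanged fallback: the inference job_loader.py already performs.
    return "parsed" if "steps" in logref else "pending"


print(resolve_parse_status({"parseStatus": "skip"}))
print(resolve_parse_status({"steps": []}))
print(resolve_parse_status({}))
```

Keeping the field optional means existing producers that never send ``parseStatus`` would get the same result as before.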

Would you want to take a crack at this?  Otherwise, I'm happy to.  :)

The only issue with making YML changes is that we'd need to notify and update all the consumers of it.  For now that's just Taskcluster, in the taskcluster-treeherder project.  But Greg and I were talking about how we really need to get the YML into some unified place where everyone can use the same one.
Flags: needinfo?(cdawson)
I'm not 100% sure whether the arewefastyet use-case is even a valid one - it seems like a workaround for some other missing feature perhaps.

The main reason I asked was in case this was something we felt we should support, and supporting it would require Pulse schema changes that we'd need to make now, since making them later would be hard.
This would be more appropriate as a job_detail entry instead of a log url, since we never want to parse it.  If you ever wanted to submit this via pulse, and need it to be a log url, you could submit the empty set of steps.
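As a sketch of the empty-steps workaround: the job_loader.py expression quoted earlier in this bug only checks whether the "steps" key is present, so an empty list is enough to get 'parsed'. The helper name and the log reference below (its "name" and "url" values in particular) are illustrative, not taken from a real submission.

```python
def parse_status_for(logref):
    # Same expression as quoted from job_loader.py: key presence decides.
    return "parsed" if "steps" in logref else "pending"


# Hypothetical log reference submitted with an empty set of steps.
logref_empty_steps = {
    "name": "example-log",
    "url": "https://example.com/log.txt",
    "steps": [],
}

print(parse_status_for(logref_empty_steps))
```

Because the check is on the key rather than on the list being non-empty, the log would not be queued for parsing.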
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX